Simple imputation methods versus direct likelihood analysis for missing item scores in multilevel educational data

Damazo T Kadengye; Wilfried Cools; Eva Ceulemans; Wim Van den Noortgate

doi:10.3758/s13428-011-0157-x

Behav Res Methods. 2012 Jun;44(2):516-31. doi: 10.3758/s13428-011-0157-x.

Authors

Damazo T Kadengye¹, Wilfried Cools, Eva Ceulemans, Wim Van den Noortgate

Affiliation

¹ Faculty of Psychology and Educational Sciences, Katholieke Universiteit Leuven, Etienne Sabbelaan 53, 8500, Kortrijk, Belgium. Trevor.Kadengye@kuleuven-kortrijk.be

PMID: 22002637

Abstract

Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when applied with traditional IRT models. Additionally, a nonimputation direct likelihood analysis has been shown to be an effective tool for handling missing observations in clustered data settings. This study investigates the performance of six simple imputation methods, which have been found to be useful in other IRT contexts, versus a direct likelihood analysis, in multilevel data from educational settings. Multilevel item response data were simulated on the basis of two empirical data sets, and some of the item scores were deleted, such that they were missing either completely at random (MCAR) or at random (MAR). An explanatory IRT model was used for modeling the complete, incomplete, and imputed data sets. We showed that direct likelihood analysis of the incomplete data sets produced unbiased parameter estimates that were comparable to those from a complete data analysis. Multiple-imputation approaches of the two-way mean and corrected item mean substitution methods displayed varying degrees of effectiveness in imputing data that in turn could produce unbiased parameter estimates. The simple random imputation, adjusted random imputation, item means substitution, and regression imputation methods seemed to be less effective in imputing missing item scores in multilevel data settings.
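
To make the delete-then-impute design concrete, the following is a minimal Python sketch and not the authors' code. It assumes binary (0/1) item scores stored as a persons-by-items list of lists, deletes scores completely at random, and fills each gap with the commonly cited two-way formula (person mean + item mean - overall mean, computed from the observed scores). The function names `delete_mcar` and `two_way_impute` are hypothetical, and the sketch performs a single deterministic imputation, whereas the abstract refers to multiple-imputation versions of the method.

```python
import random
from statistics import mean


def delete_mcar(scores, rate, rng):
    """Return a copy in which each score is set to None with probability `rate`.

    Deletion ignores every observed or unobserved value, which is what
    "missing completely at random" means. (Hypothetical helper.)
    """
    return [[None if rng.random() < rate else x for x in row] for row in scores]


def two_way_impute(scores):
    """Fill each missing cell with person mean + item mean - overall mean.

    All means use observed scores only; the result is rounded to 0 or 1.
    Assumption: this is a single deterministic imputation, not the
    multiple-imputation variant mentioned in the abstract.
    """
    observed = [x for row in scores for x in row if x is not None]
    overall = mean(observed)
    # Fall back to the overall mean when a person or item has no observed score.
    person_means = [
        mean([x for x in row if x is not None] or [overall]) for row in scores
    ]
    item_means = []
    for j in range(len(scores[0])):
        column = [row[j] for row in scores if row[j] is not None]
        item_means.append(mean(column) if column else overall)

    filled = []
    for i, row in enumerate(scores):
        new_row = []
        for j, x in enumerate(row):
            if x is None:
                two_way = person_means[i] + item_means[j] - overall
                x = 1 if two_way >= 0.5 else 0
            new_row.append(x)
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    rng = random.Random(2012)  # fixed seed so the deletion pattern is repeatable
    complete = [[rng.randint(0, 1) for _ in range(6)] for _ in range(8)]
    incomplete = delete_mcar(complete, rate=0.2, rng=rng)
    imputed = two_way_impute(incomplete)
    for row in imputed:
        print(row)
```

In a study of the kind described, the imputed matrix would then be passed to the analysis model; a direct likelihood analysis would instead fit the model to the incomplete matrix without filling any cells.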

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Cluster Analysis
Data Interpretation, Statistical
Education / statistics & numerical data*
Humans
Likelihood Functions
Probability
Sample Size
Schools / statistics & numerical data