Person Specific Parameter Heterogeneity in the 2PL IRT Model

Multivariate Behav Res. 2023 Jun 23:1-7. doi: 10.1080/00273171.2023.2224312. Online ahead of print.

Abstract

Following Kelderman and Molenaar's demonstration that a factor model with person specific factor loadings is almost indistinguishable from the standard factor model in terms of overall fit, we examined person specific measurement models in Item Response Theory. Person specific discrimination and difficulty parameters were created by adding random variation at the item-by-person level. When the 2PL IRT model was fit with standard algorithms, common diagnostic tools showed little evidence of person- or item-level misfit. The item difficulties were well estimated, but the item discriminations were noticeably underestimated. As found by Kelderman and Molenaar, factor scores were estimated with less than expected reliability due to the underlying heterogeneity. The person specific models considered here are essentially limiting cases of IRT models with multilevel, mixture, or differential item functioning structure. We conclude with some thoughts regarding real-world sources of heterogeneity that might go unacknowledged in common testing applications.

Keywords: Item response theory; differential item functioning; random effects.
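
The data-generating idea described in the abstract, adding random variation to the 2PL discrimination and difficulty parameters at the item-by-person level, can be illustrated with a minimal simulation sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the function name simulate_person_specific_2pl, additive normal perturbations, the standard deviations sd_a and sd_b, and the distributions used for the abilities and the item-level parameters. The response probability is the usual 2PL form, P(y_ij = 1) = 1 / (1 + exp(-a_ij (theta_i - b_ij))), with a_ij and b_ij varying over both persons i and items j.

```python
import math
import random


def simulate_person_specific_2pl(n_persons=200, n_items=10,
                                 sd_a=0.3, sd_b=0.3, seed=1):
    """Simulate binary responses from a 2PL model whose discrimination and
    difficulty vary at the item-by-person level (illustrative assumptions)."""
    rng = random.Random(seed)

    # Person abilities and item-level parameters (assumed distributions).
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
    a = [rng.uniform(0.8, 2.0) for _ in range(n_items)]  # discriminations
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]    # difficulties

    responses = []
    for i in range(n_persons):
        row = []
        for j in range(n_items):
            # Person specific parameters: item value plus item-by-person noise.
            # With a large sd_a, a_ij can become negative; this sketch does
            # not constrain it.
            a_ij = a[j] + rng.gauss(0.0, sd_a)
            b_ij = b[j] + rng.gauss(0.0, sd_b)
            p = 1.0 / (1.0 + math.exp(-a_ij * (theta[i] - b_ij)))
            row.append(1 if rng.random() < p else 0)
        responses.append(row)
    return theta, a, b, responses


theta, a, b, responses = simulate_person_specific_2pl()
```

Setting sd_a and sd_b to zero recovers the standard 2PL model, so fitting an ordinary 2PL to data generated with and without the perturbations is one way to explore the kind of comparison the abstract describes.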