It Might Not Make a Big DIF: Improved Differential Test Functioning Statistics That Account for Sampling Variability

R Philip Chalmers; Alyssa Counsell; David B Flora

doi:10.1177/0013164415584576

Educ Psychol Meas. 2016 Feb;76(1):114-140. doi: 10.1177/0013164415584576. Epub 2015 Jun 29.

Authors

R Philip Chalmers¹, Alyssa Counsell¹, David B Flora¹

Affiliation

¹ York University, Toronto, Ontario, Canada.

Abstract

Differential test functioning, or DTF, occurs when one or more items in a test demonstrate differential item functioning (DIF) and the aggregate of these effects is witnessed at the test level. In many applications, DTF can be more important than DIF when the overall effects of DIF at the test level can be quantified. However, optimal statistical methodology for detecting and understanding DTF has not been developed. This article proposes improved DTF statistics that properly account for sampling variability in item parameter estimates while avoiding the necessity of predicting provisional latent trait estimates to create two-step approximations. The properties of the DTF statistics were examined with two Monte Carlo simulation studies using dichotomous and polytomous item response theory (IRT) models. The simulation results revealed that the improved DTF statistics had optimal and consistent statistical properties, such as consistent Type I error rates. Next, an empirical analysis demonstrated the application of the proposed methodology. Applied settings where the DTF statistics can be beneficial are suggested and future DTF research areas are proposed.

Keywords: differential item functioning; differential test functioning; item response theory; multiple imputation.
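
The central idea in the abstract, carrying the sampling variability of item parameter estimates through to a test-level statistic by imputation, can be illustrated with a minimal Python sketch. It is not the authors' implementation or their exact statistics. It assumes a two-parameter logistic (2PL) model, defines a signed test-level difference as the weighted average gap between the reference and focal groups' expected test score curves, and assumes each group's parameter estimates come with their own covariance matrix. The function names, the parameter layout (all slopes followed by all difficulties), the standard normal weighting grid, and the percentile interval are all assumptions made for illustration.

```python
import numpy as np


def expected_test_score(theta, a, b):
    """Expected total score under a 2PL model at each value of theta."""
    # theta has shape (Q,); a and b have shape (J,); p has shape (Q, J).
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
    return p.sum(axis=1)


def signed_dtf(a_ref, b_ref, a_foc, b_foc, theta, weights):
    """Weighted average difference between the two expected score curves."""
    diff = (expected_test_score(theta, a_ref, b_ref)
            - expected_test_score(theta, a_foc, b_foc))
    return float(np.sum(weights * diff))


def imputed_dtf_interval(est_ref, cov_ref, est_foc, cov_foc,
                         n_draws=500, level=0.95, seed=0):
    """Point estimate and percentile interval for the signed difference.

    est_* are vectors laid out as [a_1..a_J, b_1..b_J] (an assumed layout);
    cov_* are the matching covariance matrices of those estimates.
    """
    rng = np.random.default_rng(seed)
    n_items = est_ref.size // 2

    # Assumed integration grid: standard normal weights on [-6, 6].
    theta = np.linspace(-6.0, 6.0, 121)
    weights = np.exp(-0.5 * theta ** 2)
    weights /= weights.sum()

    point = signed_dtf(est_ref[:n_items], est_ref[n_items:],
                       est_foc[:n_items], est_foc[n_items:], theta, weights)

    # Draw plausible parameter sets and recompute the statistic for each,
    # so the spread of the draws reflects parameter sampling variability.
    ref = rng.multivariate_normal(est_ref, cov_ref, size=n_draws)
    foc = rng.multivariate_normal(est_foc, cov_foc, size=n_draws)
    draws = np.array([
        signed_dtf(r[:n_items], r[n_items:], f[:n_items], f[n_items:],
                   theta, weights)
        for r, f in zip(ref, foc)
    ])

    alpha = 1.0 - level
    lower, upper = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return point, float(lower), float(upper)
```

Under these assumptions, an interval that excludes zero would suggest that item-level differences do not cancel out at the test level. Because the statistic is computed by integrating over a fixed latent trait grid, the sketch needs no provisional latent trait estimates for individual examinees.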