A Propensity Score Method for Investigating Differential Item Functioning in Performance Assessment

Educ Psychol Meas. 2020 Jun;80(3):476-498. doi: 10.1177/0013164419878861. Epub 2019 Oct 4.

Abstract

This study introduces a novel differential item functioning (DIF) method based on propensity score matching that addresses two challenges in analyzing performance assessment data: continuous task scores and the lack of a reliable internal variable to serve as a proxy for ability or aptitude. The proposed DIF method consists of two main stages. First, propensity score matching is used to remove group differences that exist before the test, ideally creating equivalent groups as in a randomized experimental study. Second, linear mixed effects models are fitted to the matched data set to perform the DIF analysis. We demonstrate this propensity DIF method using a high-stakes functional English language proficiency test. DIF due to education was investigated in the writing component, which consists of two continuously scored performance-based tasks. Although the proposed method is demonstrated in the context of language testing, it can be applied to other types of performance assessments.

Keywords: differential item functioning (DIF); mixed effects model; performance assessment; propensity score matching; validation; writing assessment.
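
Illustrative sketch (not the authors' implementation): the two-stage procedure described in the abstract can be outlined in a few lines of standard-library Python. Everything concrete here is an assumption made for illustration: the data are simulated, there is a single hypothetical background covariate, matching is greedy 1:1 nearest-neighbour with a caliper of 0.05, and the second stage is replaced by a simple within-person contrast (the group difference in the task 2 minus task 1 gap), which stands in for a group-by-task interaction in a random-intercept model. A real analysis would fit a linear mixed effects model with dedicated statistical software.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, groups, lr=0.5, epochs=500):
    """Stage 1a: one-covariate logistic regression fitted by gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, g in zip(xs, groups):
            err = sigmoid(b0 + b1 * x) - g
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def match_pairs(scores, groups, caliper=0.05):
    """Stage 1b: greedy 1:1 nearest-neighbour matching without replacement."""
    focal = [i for i, g in enumerate(groups) if g == 1]
    pool = [i for i, g in enumerate(groups) if g == 0]
    pairs = []
    for i in focal:
        if not pool:
            break
        j = min(pool, key=lambda k: abs(scores[k] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))   # (focal index, reference index)
            pool.remove(j)         # each reference person is used at most once
    return pairs


def dif_contrast(pairs, task1, task2):
    """Stage 2 stand-in: group difference in the within-person task gap."""
    focal_gap = [task2[i] - task1[i] for i, _ in pairs]
    ref_gap = [task2[j] - task1[j] for _, j in pairs]
    return sum(focal_gap) / len(pairs) - sum(ref_gap) / len(pairs)


def run_demo(seed=0, n=400, true_dif=0.5):
    """Simulate data with a known DIF effect on task 2, then run both stages."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical background covariate
    groups = [1 if rng.random() < sigmoid(0.8 * x) else 0 for x in xs]
    task1 = [5 + x + rng.gauss(0, 0.3) for x in xs]
    task2 = [5 + x + true_dif * g + rng.gauss(0, 0.3) for x, g in zip(xs, groups)]
    b0, b1 = fit_logistic(xs, groups)
    scores = [sigmoid(b0 + b1 * x) for x in xs]  # estimated propensity scores
    pairs = match_pairs(scores, groups)
    return scores, pairs, dif_contrast(pairs, task1, task2)


scores, pairs, dif = run_demo()
print(f"matched pairs: {len(pairs)}, estimated DIF contrast: {dif:.2f}")
```

Because the simulated DIF effect is known (`true_dif`), the printed contrast can be compared against it as a sanity check. Differencing the two task scores within each person removes the person-level intercept, which is why this contrast can serve as a rough substitute for the mixed model when every person has exactly two task scores.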