Evaluating Equating Methods for Varying Levels of Form Difference

Educ Psychol Meas. 2024 Jun;84(3):510-529. doi: 10.1177/00131644231176989. Epub 2023 Jun 8.

Abstract

Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude of a form difficulty difference on equating results under random group (RG) and common-item nonequivalent group (CINEG) designs. Specifically, this study evaluates the performance of six equating methods under a set of simulation conditions including varying levels of form difference. Results revealed that, under the RG design, mean equating was proven to be the most accurate method when there is no or small form difference, whereas equipercentile is the most accurate method when the difficulty difference is medium or large. Under the CINEG design, Tucker Linear was found to be the most accurate method when the difficulty difference is medium or small, and either chained equipercentile or frequency estimation is preferred with a large difficulty level. This study would provide practitioners with research evidence-based guidance in the choice of equating methods with varying levels of form difference. As the condition of no form difficulty difference is also included, this study would inform testing companies of appropriate equating methods when two forms are similar in difficulty level.

Keywords: equating; form difficulty.