Comparing the Two- and Three-Parameter Logistic Models via Likelihood Ratio Tests: A Commonly Misunderstood Problem

Appl Psychol Meas. 2015 Jul;39(5):335-348. doi: 10.1177/0146621614563326. Epub 2014 Dec 16.

Abstract

Selection of an appropriate item response model is critical in the measurement of latent examinee ability. The one-, two-, and three-parameter logistic (1PL, 2PL, and 3PL) models are nested, and as such can be compared using likelihood ratio (LR) tests. The null hypothesis in the LR test for selection between the 2PL and 3PL models sets the guessing parameters to their lower bound of 0. This violates one of the assumptions of the LR test and renders the usual χ² reference distribution inappropriate for the comparison. A review of the current literature revealed that this problem is not well understood in the educational measurement field. Ignoring this issue can lead to selection of an overly simplified model, with implications for the ability estimates. In this article, the use of the LR test for item response model selection is investigated, with the goal of providing practitioners with an appropriate method of selecting the most parsimonious model. The results of simulation studies indicate the nature of the problem, with inaccurate Type I error rates for cases where the inappropriate null distribution was used. An analysis of data from a statewide mathematics test showed differences pertinent to subsequent analyses.

Keywords: boundary issue; item response theory; likelihood ratio test; model selection.
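
The boundary issue described in the abstract can be illustrated with a toy analog. The sketch below is not the authors' procedure and does not fit any item response model. It assumes a much simpler setting: a normal mean constrained to be non-negative, tested against a null value of 0 that sits on the boundary, which mirrors a single guessing parameter fixed at its lower bound. For one parameter on the boundary, a standard result gives a 50:50 mixture of a point mass at 0 and a χ² distribution with 1 degree of freedom as the reference distribution. With one guessing parameter per item, as in the 2PL versus 3PL comparison, the mixture is more complex and is not reproduced here. All function names are hypothetical.

```python
import math
import random


def chi2_1_sf(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    # For Z ~ N(0, 1): P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0


def naive_p_value(lr_stat):
    """p-value from the usual chi-square(1) reference (ignores the boundary)."""
    return chi2_1_sf(lr_stat)


def boundary_p_value(lr_stat):
    """p-value from a 50:50 mixture of a point mass at 0 and chi-square(1).

    Assumption: exactly ONE parameter lies on the boundary under the null.
    """
    if lr_stat <= 0:
        return 1.0
    return 0.5 * chi2_1_sf(lr_stat)


def simulate_rejection_rates(n_reps=20000, n_obs=30, alpha=0.05, seed=1):
    """Estimate Type I error rates under both reference distributions.

    Toy model (an assumption, not an IRT model): x_i ~ N(mu, 1) with mu >= 0,
    testing H0: mu = 0. Data are generated under H0.
    """
    rng = random.Random(seed)
    naive_rejections = 0
    boundary_rejections = 0
    for _ in range(n_reps):
        xbar = sum(rng.gauss(0.0, 1.0) for _ in range(n_obs)) / n_obs
        mu_hat = max(0.0, xbar)        # constrained maximum likelihood estimate
        lr_stat = n_obs * mu_hat ** 2  # 2 * (loglik at mu_hat - loglik at 0)
        naive_rejections += naive_p_value(lr_stat) < alpha
        boundary_rejections += boundary_p_value(lr_stat) < alpha
    return naive_rejections / n_reps, boundary_rejections / n_reps


if __name__ == "__main__":
    naive_rate, boundary_rate = simulate_rejection_rates()
    print(f"naive chi-square(1) reference: {naive_rate:.3f}")
    print(f"50:50 mixture reference:       {boundary_rate:.3f}")
```

In this toy model the LR statistic is exactly 0 whenever the sample mean is negative, which happens half the time under the null. The naive χ² reference therefore rejects at about half the nominal rate, while the mixture reference rejects at about the nominal rate. A test that is too conservative in this way favors the simpler model, which is consistent with the abstract's warning about selecting an overly simplified model.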