Item-Score Reliability in Empirical-Data Sets and Its Relationship With Other Item Indices

Eva A O Zijlmans; Jesper Tijmstra; L Andries van der Ark; Klaas Sijtsma

doi:10.1177/0013164417728358

Educ Psychol Meas. 2018 Dec;78(6):998-1020. doi: 10.1177/0013164417728358. Epub 2017 Sep 27.

Authors

Eva A O Zijlmans¹, Jesper Tijmstra¹, L Andries van der Ark², Klaas Sijtsma¹

Affiliations

¹ Tilburg University, Tilburg, Netherlands.
² University of Amsterdam, Amsterdam, Netherlands.

Abstract

Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method $λ_{6}$, and method CA. The item-score reliability methods are compared with four well-known and widely accepted item indices: the item-rest correlation, the item-factor loading, the item scalability, and the item discrimination. Realistic values for item-score reliability in empirical-data sets are monitored to obtain an impression of the values to be expected in other empirical-data sets. The relation between the three item-score reliability methods and the four well-known item indices is investigated. Tentatively, a minimum value for the item-score reliability methods to be used in item analysis is recommended.

Keywords: Coefficient λ6; correction for attenuation; item discrimination; item scalability; item-factor loading; item-rest correlation; item-score reliability.
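
As an illustration of one of the four item indices named in the abstract, the item-rest correlation is the correlation between an item score and the sum of the scores on the remaining items. The following is a minimal sketch, not the authors' code: the function name `item_rest_correlations` and the small score matrix are hypothetical and made up for illustration, and the sketch assumes complete data with no constant item scores.

```python
import numpy as np


def item_rest_correlations(scores):
    """Return the item-rest correlation for each column of a persons-by-items matrix.

    Hypothetical helper for illustration; assumes no missing values and that
    neither an item nor its rest score is constant across persons.
    """
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)  # total score per person
    result = []
    for i in range(scores.shape[1]):
        rest = total - scores[:, i]  # rest score: total without item i
        result.append(np.corrcoef(scores[:, i], rest)[0, 1])
    return np.array(result)


# Made-up dichotomous scores: 6 persons (rows) by 4 items (columns).
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
])

r = item_rest_correlations(scores)
print(np.round(r, 3))
```

Removing the item from the total before correlating avoids the inflation that arises when an item is correlated with a sum that includes itself.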