On the Validity of Forced Choice Scores Derived From the Thurstonian Item Response Theory Model

Kate E Walton; Lina Cherkasova; Richard D Roberts

doi:10.1177/1073191119843585

Assessment. 2020 Jun;27(4):706-718. doi: 10.1177/1073191119843585. Epub 2019 Apr 21.

Authors

Kate E Walton¹, Lina Cherkasova², Richard D Roberts³

Affiliations

¹ ACT, Inc., Iowa City, IA, USA.
² St. John's University, Jamaica, NY, USA.
³ Research and Assessment Design (RAD): Science Solution, Philadelphia, PA, USA.

PMID: 31007043

Abstract

Forced choice (FC) measures may be a desirable alternative to single stimulus (SS) Likert items, which are easier to fake and can have associated response biases. However, classical methods of scoring FC measures lead to ipsative data, which have a number of psychometric problems. A Thurstonian item response theory (TIRT) model has been introduced as a way to overcome these issues, but few empirical validity studies have been conducted to ensure its effectiveness. Providing such evidence was the goal of the current three studies, which used FC measures of domains from popular personality frameworks, including the Big Five and HEXACO, with both statement and adjective item stems. We computed TIRT and ipsative scores and compared their validity estimates. Convergent and discriminant validity of the scores were evaluated by correlating them with SS scores, and test-criterion validity evidence was evaluated by examining their relationships with meaningful outcomes. In all three studies, there was evidence for the convergent and test-criterion validity of the TIRT scores, though at times this was on par with the validity of the ipsative scores. The discriminant validity of the TIRT scores was problematic and was often worse than that of the ipsative scores.
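
To illustrate why classically scored FC measures yield ipsative data, the following is a minimal sketch, not the scoring procedure used in the studies. It assumes a hypothetical instrument in which every block contains one statement per trait and the respondent ranks the statements within each block; the function names, the block layout, and the random responding are all assumptions made for illustration. Under that setup, every respondent's trait scores sum to the same constant, so the scores only carry within-person information.

```python
import random


def ipsative_scores(rankings):
    """Sum rank points per trait across blocks.

    rankings: list of blocks; each block is a list of rank points
    (0 = least like me ... k-1 = most like me), one entry per trait.
    """
    n_traits = len(rankings[0])
    return [sum(block[t] for block in rankings) for t in range(n_traits)]


def simulate_respondent(n_blocks, n_traits, rng):
    """Hypothetical respondent who ranks the statements in each block at random."""
    blocks = []
    for _ in range(n_blocks):
        ranks = list(range(n_traits))
        rng.shuffle(ranks)
        blocks.append(ranks)
    return blocks


rng = random.Random(0)
n_blocks, n_traits = 10, 5  # assumed design: 10 blocks, 5 traits per block

sample = [
    ipsative_scores(simulate_respondent(n_blocks, n_traits, rng))
    for _ in range(200)
]

# Every respondent's total is n_blocks * (0 + 1 + ... + (n_traits - 1)).
totals = {sum(scores) for scores in sample}
print(totals)
```

Because the total is fixed, a respondent cannot score high on every trait, which is one source of the psychometric problems the abstract refers to.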

Keywords: Big Five; HEXACO; Thurstonian item response theory; adjectives; forced choice; ipsative data.

MeSH terms

Deception
Humans
Motivation
Personality Disorders*
Personality*
Psychometrics
Reproducibility of Results