A tutorial on automatic post-stratification and weighting in conventional and regression-based norming of psychometric tests

Sebastian Gary; Wolfgang Lenhard; Alexandra Lenhard; David Herzberg

doi:10.3758/s13428-023-02207-0

Behav Res Methods. 2023 Aug 21. doi: 10.3758/s13428-023-02207-0. Online ahead of print.

Authors

Sebastian Gary¹, Wolfgang Lenhard², Alexandra Lenhard¹, David Herzberg³

Affiliations

¹ Test Development Center, Psychometrica, Dettelbach, Bavaria, Germany.
² Wolfgang Lenhard, Institute of Psychology, Julius-Maximilians-University of Würzburg, Bavaria, Germany. wolfgang.lenhard@uni-wuerzburg.de.
³ WPS, Torrance, CA, USA.

PMID: 37604962
DOI: 10.3758/s13428-023-02207-0

Abstract

Norm scores are an essential source of information in individual diagnostics. Given the scope of the decisions this information may entail, establishing high-quality, representative norms is of tremendous importance in test construction. Representativeness is difficult to establish, though, especially with limited resources and when multiple stratification variables and their joint probabilities come into play. Sample stratification requires knowing which stratum an individual belongs to prior to data collection, but the variables required for classifying the individual, such as socio-economic status or demographic characteristics, are often collected only as part of the survey or test data. Therefore, post-stratification techniques such as iterative proportional fitting (raking) aim to simulate the representativeness of normative samples and can thus enhance the overall quality of the norm scores. This tutorial describes the application of raking to normative samples, the calculation of weights, the application of these weights in percentile estimation, and the retrieval of continuous, regression-based norm models with the cNORM package on the R platform. We demonstrate this procedure using a large, non-representative dataset of vocabulary development in childhood and adolescence (N = 4542), using sex and ethnic background as stratification variables.
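
The raking step named in the abstract can be illustrated with a minimal sketch. It is written in plain Python for illustration and is not the cNORM implementation. The function names (`rake`, `weighted_share`, `weighted_percentile_rank`), the cell counts, and the population margins are all hypothetical, and the mid-rank definition of the weighted percentile is an assumption rather than the authors' stated method.

```python
from collections import defaultdict


def rake(sample, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting on case weights.

    sample  : list of dicts, one per case, holding the stratification variables
    margins : {variable: {category: population proportion}} (hypothetical targets)
    Returns one weight per case, scaled to a mean of 1.
    """
    weights = [1.0] * len(sample)
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for variable, targets in margins.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            observed = defaultdict(float)
            for weight, case in zip(weights, sample):
                observed[case[variable]] += weight
            # Rescale every case so this variable's margin matches its target.
            for i, case in enumerate(sample):
                category = case[variable]
                factor = targets[category] * total / observed[category]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        # Stop once a full pass over all variables barely changes the weights.
        if largest_adjustment < tol:
            break
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]


def weighted_share(sample, weights, variable, category):
    """Weighted proportion of cases falling into one category."""
    hit = sum(w for w, case in zip(weights, sample) if case[variable] == category)
    return hit / sum(weights)


def weighted_percentile_rank(scores, weights, score):
    """Mid-rank weighted percentile: weight below plus half the weight at `score`."""
    below = sum(w for s, w in zip(scores, weights) if s < score)
    equal = sum(w for s, w in zip(scores, weights) if s == score)
    return 100.0 * (below + 0.5 * equal) / sum(weights)


# Hypothetical, non-representative sample: cell counts by sex and group.
cells = {("female", "A"): 30, ("female", "B"): 10,
         ("male", "A"): 40, ("male", "B"): 20}
sample = [{"sex": s, "group": g} for (s, g), n in cells.items() for _ in range(n)]

# Hypothetical population margins; only marginals are needed, not joint cells.
margins = {"sex": {"female": 0.5, "male": 0.5},
           "group": {"A": 0.6, "B": 0.4}}

weights = rake(sample, margins)
```

After raking, `weighted_share(sample, weights, "sex", "female")` should match the target of 0.5 up to the convergence tolerance, and the resulting weights can be passed to `weighted_percentile_rank` together with the raw test scores. Raking only needs the marginal distributions of each stratification variable, which is why it remains usable when the joint probabilities are unknown.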

Keywords: Iterative proportional fitting; Post-stratification; Raking; Regression-based norming; Test construction.

Grants and funding

04 2022/3-16/Julius-Maximilians-University of Würzburg, Faculty of Human Sciences