Meta-Analysis of Interrater Reliability of Supervisory Performance Ratings: Effects of Appraisal Purpose, Scale Type, and Range Restriction

Jesús F Salgado; Silvia Moscoso

doi:10.3389/fpsyg.2019.02281

Front Psychol. 2019 Oct 18;10:2281. doi: 10.3389/fpsyg.2019.02281. eCollection 2019.

Authors

Jesús F Salgado¹, Silvia Moscoso¹

Affiliation

¹ Faculty of Labor Relations, University of Santiago de Compostela, Santiago de Compostela, Spain.

Abstract

Objectives: This reliability generalization study aimed to estimate the mean and variance of the interrater reliability coefficients (r_yy) of supervisory ratings of overall, task, contextual, and positive job performance. The moderating effects of appraisal purpose and scale type were examined. It was hypothesized that ratings collected for research purposes and ratings made on multi-item scales have higher r_yy. It was also examined whether r_yy was similar across the four performance dimensions.

Method: A database of 224 independent samples was created, and hierarchical sub-grouping meta-analyses were conducted.

Results: Appraisal purpose was a moderator of r_yy for the four performance dimensions. Scale type was a moderator of r_yy for overall and task performance ratings collected for research purposes. The findings also suggest that supervisors have less difficulty evaluating overall job performance than task, contextual, and positive performance. The best estimates of the observed r_yy for overall job performance are 0.61 for research-collected ratings and 0.45 for administrative-collected ratings.

Conclusions: (1) Appraisal purpose moderates r_yy, and researchers and practitioners should be aware of its effects before collecting ratings or using empirically derived interrater reliability distributions; (2) scale type seems to moderate r_yy only for ratings collected for research purposes; and (3) overall job performance is more reliably rated than task, contextual, and positive performance. Implications for research and practice are discussed.
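
Illustrative sketch (not part of the published record): the abstract describes estimating the mean and variance of r_yy within subgroups defined by a moderator such as appraisal purpose. The Python below shows one way such a sub-grouping summary could be computed. Sample-size weighting is assumed here because it is common in psychometric meta-analysis; the abstract does not state the weighting scheme the authors used. The function names and all input values are hypothetical and are not data from the study.

```python
from collections import defaultdict


def weighted_mean_and_variance(coefficients, sample_sizes):
    """Return the sample-size-weighted mean and variance of the coefficients.

    Assumption: each coefficient is weighted by its sample size N.
    """
    total_n = sum(sample_sizes)
    mean = sum(n * r for r, n in zip(coefficients, sample_sizes)) / total_n
    variance = sum(
        n * (r - mean) ** 2 for r, n in zip(coefficients, sample_sizes)
    ) / total_n
    return mean, variance


def subgroup_summary(samples):
    """Group (moderator_level, r_yy, n) records and summarize each subgroup."""
    groups = defaultdict(list)
    for level, r_yy, n in samples:
        groups[level].append((r_yy, n))
    return {
        level: weighted_mean_and_variance(
            [r for r, _ in records], [n for _, n in records]
        )
        for level, records in groups.items()
    }


# Hypothetical inputs for illustration only: (appraisal purpose, r_yy, N).
samples = [
    ("research", 0.50, 100),
    ("research", 0.70, 300),
    ("administrative", 0.30, 200),
    ("administrative", 0.50, 200),
]

summary = subgroup_summary(samples)
for level, (mean, variance) in summary.items():
    print(f"{level}: mean r_yy = {mean:.3f}, variance = {variance:.4f}")
```

A hierarchical analysis would repeat the same grouping on a second moderator (for example, scale type) within each appraisal-purpose subgroup.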

Keywords: appraisal purpose; interrater reliability; meta-analysis; range restriction; scale type; supervisory performance ratings.

Publication types

Systematic Review