Weighted kappa for multiple raters

Percept Mot Skills. 2008 Dec;107(3):837-48. doi: 10.2466/pms.107.3.837-848.

Abstract

Five procedures for calculating the probability (p value) of weighted kappa with multiple raters under the null hypothesis of independence are described and compared in terms of accuracy, ease of use, generality, and limitations. The procedures are (1) exact variance, (2) resampling contingency, (3) intraclass correlation, (4) randomized block, and (5) resampling block. Although each has strengths and limitations, the resampling contingency procedure is shown to be the most versatile and accurate of the five, provided the number of raters is not too large. It permits any weighting scheme, accommodates both symmetrical and asymmetrical weights, is suitable for both weighted and unweighted kappa, and makes no assumptions about either the data distribution or the probability distribution.
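
To make the resampling idea concrete, the following is a minimal sketch of a permutation test for a weighted agreement statistic under the null hypothesis of independent raters. It is not the authors' procedure: the abstract does not give their multi-rater kappa or their resampling scheme, so this sketch substitutes the mean of pairwise Cohen weighted kappas and shuffles each rater's ratings independently across subjects. The function names, the |i - j|^power disagreement weights, and the example data are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def weight_matrix(k, power=2):
    """Disagreement weights |i - j|**power (assumed scheme; any k x k matrix works)."""
    idx = np.arange(k)
    return np.abs(idx[:, None] - idx[None, :]).astype(float) ** power


def weighted_kappa(a, b, w):
    """Cohen's weighted kappa for two raters, using disagreement weights w.

    Assumes at least two categories are in use, so the expected
    disagreement in the denominator is nonzero.
    """
    k = w.shape[0]
    obs = np.zeros((k, k))
    np.add.at(obs, (a, b), 1.0)          # cross-classification counts
    obs /= obs.sum()                     # observed proportions
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))  # independence model
    return 1.0 - (w * obs).sum() / (w * exp).sum()


def mean_pairwise_kappa(ratings, w):
    """Stand-in multi-rater statistic: average kappa over all rater pairs."""
    pairs = combinations(range(ratings.shape[1]), 2)
    return float(np.mean([weighted_kappa(ratings[:, i], ratings[:, j], w)
                          for i, j in pairs]))


def resampling_p_value(ratings, w, n_resamples=2000, seed=0):
    """Upper-tail Monte Carlo p value under independence of raters.

    ratings: (subjects x raters) integer array with categories 0..k-1.
    Each resample shuffles every rater's column independently, which
    preserves each rater's marginal distribution but breaks agreement.
    """
    rng = np.random.default_rng(seed)
    observed = mean_pairwise_kappa(ratings, w)
    hits = 0
    for _ in range(n_resamples):
        shuffled = np.column_stack(
            [rng.permutation(ratings[:, j]) for j in range(ratings.shape[1])]
        )
        if mean_pairwise_kappa(shuffled, w) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p value strictly positive.
    return observed, (hits + 1) / (n_resamples + 1)


# Hypothetical data: 8 subjects, 3 raters, 3 ordered categories.
example = np.array([[0, 0, 1], [1, 1, 1], [2, 2, 1], [0, 1, 0],
                    [2, 2, 2], [1, 1, 2], [0, 0, 0], [2, 1, 2]])
kappa, p = resampling_p_value(example, weight_matrix(3), n_resamples=500)
print(kappa, p)
```

Because the weight matrix is passed in as an argument, the same sketch covers unweighted kappa (0 on the diagonal, 1 elsewhere) and asymmetric weights, although with asymmetric weights the result depends on the order of the raters within each pair.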

MeSH terms

  • Humans
  • Models, Statistical*
  • Observer Variation
  • Psychology / statistics & numerical data