Assessing the inter-rater agreement for ordinal data through weighted indexes

Donata Marasini; Piero Quatto; Enrico Ripamonti

doi:10.1177/0962280214529560

Stat Methods Med Res. 2016 Dec;25(6):2611-2633. doi: 10.1177/0962280214529560. Epub 2014 Apr 16.

Authors

Donata Marasini¹, Piero Quatto¹, Enrico Ripamonti²

Affiliations

¹ Statistical Section, Department of Economics, Management and Statistics, University of Milan-Bicocca, Milano, Italy.
² Statistical Section, Department of Economics, Management and Statistics, University of Milan-Bicocca, Milano, Italy. Email: enrico.ripamonti@unimib.it

PMID: 24740999
DOI: 10.1177/0962280214529560

Abstract

Assessing inter-rater agreement for ordinal variables is an important issue in both statistical theory and biomedical applications. Typically, this problem has been addressed with Cohen's weighted kappa, a modification of the original kappa statistic, which was proposed for nominal variables in the case of two observers. Fleiss (1971) put forth a generalization of kappa to the case of multiple observers, but both Cohen's and Fleiss' kappa can behave paradoxically, which makes their magnitude difficult to interpret. In this paper, a modification of Fleiss' kappa that is not affected by these paradoxes is proposed and subsequently generalized to the case of ordinal variables. Monte Carlo simulations are used both to test statistical hypotheses and to calculate percentile and bootstrap-t confidence intervals based on this statistic. The asymptotic normality of the proposed statistic is demonstrated. Our results are applied to the classical dataset of Holmquist et al. (1967) on the classification, by multiple observers, of carcinoma in situ of the uterine cervix. Finally, we generalize the use of s* to a bivariate case.
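
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of the standard Fleiss' kappa for multiple observers, together with a percentile bootstrap interval obtained by resampling subjects. It is not the modified statistic proposed in the paper, whose formula the abstract does not give. The function names, the resampling scheme, and the toy ratings are assumptions made purely for illustration.

```python
import random


def fleiss_kappa(counts):
    """Standard Fleiss' kappa (not the paper's modified index).

    counts[i][j] is the number of raters who put subject i in category j;
    every subject must be rated by the same number of raters (at least 2).
    Returns (kappa, observed_agreement); kappa is NaN if chance agreement is 1.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    pairs = n_raters * (n_raters - 1)
    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / pairs for row in counts
    ) / n_subjects
    # Chance agreement computed from the pooled category proportions.
    total = n_subjects * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        return float("nan"), p_obs
    return (p_obs - p_exp) / (1.0 - p_exp), p_obs


def percentile_ci(counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for kappa (assumed scheme: resample subjects)."""
    rng = random.Random(seed)
    n = len(counts)
    kappas = []
    for _ in range(n_boot):
        resample = [counts[rng.randrange(n)] for _ in range(n)]
        kappa = fleiss_kappa(resample)[0]
        if kappa == kappa:  # skip degenerate resamples where kappa is NaN
            kappas.append(kappa)
    if not kappas:
        raise ValueError("every bootstrap resample was degenerate")
    kappas.sort()
    m = len(kappas)
    lower = kappas[int(alpha / 2 * m)]
    upper = kappas[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper


# Hypothetical toy data: 3 raters, 2 categories, ratings concentrated in one category.
skewed = [[3, 0]] * 9 + [[2, 1]]
kappa, p_obs = fleiss_kappa(skewed)
print("observed agreement:", round(p_obs, 3), "kappa:", round(kappa, 3))

# Hypothetical toy data: 3 raters, 3 categories, mixed levels of agreement.
mixed = [[3, 0, 0], [0, 3, 0], [0, 0, 3], [2, 1, 0], [1, 1, 1], [0, 2, 1]]
print("percentile interval:", percentile_ci(mixed, n_boot=500))
```

In the skewed toy data the raters agree on almost every subject, yet kappa comes out close to zero because chance agreement is also very high. This is one form of the paradoxical behavior the abstract refers to.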

Keywords: Fleiss’ kappa; inter-rater agreement; multiple observers; ordinal variables; weighted indexes.

MeSH terms

Carcinoma in Situ / classification
Carcinoma in Situ / diagnosis
Carcinoma in Situ / pathology
Female
Humans
Monte Carlo Method
Observer Variation*
Reproducibility of Results
Uterine Cervical Neoplasms / classification
Uterine Cervical Neoplasms / diagnosis
Uterine Cervical Neoplasms / pathology