Exact one-sided confidence limits for Cohen's kappa as a measurement of agreement

Stat Methods Med Res. 2017 Apr;26(2):615-632. doi: 10.1177/0962280214552881. Epub 2014 Oct 6.

Abstract

Cohen's kappa coefficient, κ, is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative items. In this paper, we focus on interval estimation of κ in the case of two raters and binary items. So far, only asymptotic and bootstrap intervals are available for κ due to its complexity. However, there is no guarantee that such intervals will capture κ with the desired nominal level 1 − α. In other words, the statistical inferences based on these intervals are not reliable. We apply the Buehler method to obtain exact confidence intervals based on four widely used asymptotic intervals: three Wald-type confidence intervals and one interval constructed from a profile variance. These exact intervals are compared with regard to coverage probability and length for small to medium sample sizes. The exact intervals based on the Garner interval and the Lee and Tu interval are generally recommended for use in practice due to good performance in both coverage probability and length.
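
For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of κ for two raters and binary items, together with a simple one-sided Wald-type lower limit. It is an illustration only and is not the Buehler-based exact procedure of the paper, nor any of the four asymptotic intervals it studies. The function names, the 2×2 cell labels a, b, c, d, and the choice of the simple large-sample standard error sqrt(p_o(1 − p_o) / (n(1 − p_e)²)) are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def agreement_summary(a, b, c, d):
    """Return (n, p_o, p_e, kappa) for a 2x2 agreement table.

    Hypothetical cell labels (an assumption of this sketch):
      a = both raters say "yes", d = both raters say "no",
      b = rater 1 "yes" / rater 2 "no", c = rater 1 "no" / rater 2 "yes".
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("empty table")
    p_o = (a + d) / n                      # observed agreement
    p1 = (a + b) / n                       # rater 1 "yes" rate
    p2 = (a + c) / n                       # rater 2 "yes" rate
    p_e = p1 * p2 + (1 - p1) * (1 - p2)    # agreement expected by chance
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_o - p_e) / (1 - p_e)
    return n, p_o, p_e, kappa


def wald_lower_limit(a, b, c, d, alpha=0.05):
    """One-sided Wald-type lower limit for kappa at nominal level 1 - alpha.

    Uses a simple large-sample standard error (an assumption of this
    sketch); as an asymptotic limit it carries no coverage guarantee.
    """
    n, p_o, p_e, kappa = agreement_summary(a, b, c, d)
    se = sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    z = NormalDist().inv_cdf(1 - alpha)    # standard normal quantile
    return max(-1.0, kappa - z * se)       # kappa cannot be below -1


if __name__ == "__main__":
    table = (20, 5, 10, 15)                # made-up counts for illustration
    print("kappa:", agreement_summary(*table)[3])
    print("lower limit:", wald_lower_limit(*table))
```

Because a limit of this kind relies on a normal approximation, its actual coverage in small samples can fall below 1 − α, which is the shortcoming the exact intervals in the paper are meant to address.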

Keywords: Buehler method; coverage probability; exact confidence interval; expected length; order.

MeSH terms

  • Biostatistics / methods*
  • Clinical Trials as Topic / statistics & numerical data
  • Confidence Intervals*
  • Humans
  • Models, Statistical
  • Observer Variation
  • Sample Size