A computer program for assessing interexaminer agreement when multiple ratings are made on a single subject

Psychiatry Res. 1997 Aug 29;72(1):65-8. doi: 10.1016/s0165-1781(97)00093-0.

Abstract

This report describes a computer program for applying a new statistical method for determining levels of agreement, or reliability, when multiple examiners evaluate a single subject. The statistics computed include the following: an overall level of agreement, expressed as a percentage, that takes into account all possible levels of partial agreement; the same statistical approach applied to derive a separate level of agreement between every examiner and every other examiner; and tests of the extent to which a given examiner's rating (say, a symptom score of three on a five-category ordinal rating scale) deviates from the group or overall average rating. These deviation scores are interpreted as standard Z statistics. Finally, both statistical and clinical criteria are provided for evaluating levels of interexaminer agreement.
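The abstract does not give the exact formulas, so the following is only an illustrative sketch of the kind of computation described: pairwise agreement with partial credit for near-misses on an ordinal scale, an overall agreement percentage averaged over all examiner pairs, and each examiner's deviation from the group mean expressed as a Z statistic. The partial-credit weighting (linear in ordinal distance) and the sample data are assumptions, not the paper's actual method.

```python
from statistics import mean, stdev

# Hypothetical ratings from five examiners scoring one subject
# on a five-category ordinal scale (illustrative data only).
ratings = [3, 3, 4, 2, 3]
k = 5  # number of rating categories (assumed)

def pairwise_agreement(a, b, k):
    """Partial-credit agreement between two ratings: 1.0 if identical,
    decreasing linearly with ordinal distance (an assumed weighting)."""
    return 1.0 - abs(a - b) / (k - 1)

# Overall agreement: mean of all pairwise partial agreements, as a percentage.
pairs = [(i, j) for i in range(len(ratings)) for j in range(i + 1, len(ratings))]
overall = 100.0 * mean(pairwise_agreement(ratings[i], ratings[j], k)
                       for i, j in pairs)

# Each examiner's deviation from the group mean, expressed as a Z statistic.
m, s = mean(ratings), stdev(ratings)
z_scores = [(r - m) / s for r in ratings]

print(f"overall agreement: {overall:.1f}%")
print("Z scores:", [round(z, 2) for z in z_scores])
```

With these sample ratings, the third examiner's score of 4 yields the largest positive Z, flagging that rater as the one deviating most from the group average, which is the kind of outlier test the program performs.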

MeSH terms

  • Diagnosis, Computer-Assisted*
  • Humans
  • Neurocognitive Disorders / classification
  • Neurocognitive Disorders / diagnosis*
  • Neurocognitive Disorders / psychology
  • Neuropsychological Tests / statistics & numerical data*
  • Observer Variation
  • Psychometrics
  • Reproducibility of Results
  • Software*