EEG interpretation reliability and interpreter confidence: a large single-center study

Epilepsy Behav. 2014 Mar:32:102-7. doi: 10.1016/j.yebeh.2014.01.011. Epub 2014 Feb 13.

Abstract

The intrarater and interrater reliability (I&IR) of EEG interpretation has significant implications for the value of EEG as a diagnostic tool. We measured both the intrarater and the interrater reliability of EEG interpretation, based on the classification of complete EEGs into standard diagnostic categories, together with rater confidence in those interpretations, and we investigated sources of variance in EEG interpretations. During two distinct time intervals, six board-certified clinical neurophysiologists classified 300 EEGs into one or more of seven diagnostic categories and assigned a subjective confidence to their interpretations. Each EEG was read by three readers. Each reader interpreted 150 unique studies, and 50 studies were re-interpreted to generate intrarater data. A generalizability study assessed the contribution of subjects, readers, and the interaction between subjects and readers to interpretation variance. Five of the six readers had a median confidence of ≥99%, and the upper quartile of confidence values was 100% for all six readers. Intrarater Cohen's kappa (κc) ranged from 0.33 to 0.73 with an aggregated value of 0.59. Interrater Cohen's kappa ranged from 0.29 to 0.62 for the 15 reader pairs, with an aggregated Fleiss kappa of 0.44 for interrater agreement. Cohen's kappa was not significantly different across rater pairs (chi-square=17.3, df=14, p=0.24). Variance due to subjects (i.e., EEGs) was 65.3%, due to readers was 3.9%, and due to the interaction between readers and subjects was 30.8%. Experienced epileptologists have very high confidence in their EEG interpretations and low to moderate I&IR, a common paradox in clinical medicine. A necessary, but insufficient, condition to improve EEG interpretation accuracy is to increase intrarater and interrater reliability. This goal could be accomplished, for instance, with an automated online application integrated into a continuing medical education module that measures and reports EEG I&IR to individual users.
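
For readers unfamiliar with the agreement statistics named above, the following is a minimal sketch of how Cohen's kappa (two raters) and Fleiss' kappa (a fixed number of raters per item) are conventionally computed. It is not the authors' analysis code. It assumes, for simplicity, that each EEG receives exactly one category per reader, whereas the study allowed one or more of seven categories. The function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items.

    Assumption: one category label per item per rater.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both raters used a single category
    return (p_o - p_e) / (1.0 - p_e)


def fleiss_kappa(counts):
    """Fleiss' kappa from an items-by-categories count table.

    counts[i][j] is the number of raters who put item i in category j.
    Every item must be rated by the same number of raters (at least 2),
    although the raters need not be the same individuals for each item.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])
    # Overall proportion of all ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    # Per-item agreement: share of rater pairs that agree on that item.
    p_item = [(sum(c * c for c in row) - n_raters)
              / (n_raters * (n_raters - 1)) for row in counts]
    p_bar = sum(p_item) / n_items
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        return 1.0  # degenerate case: all ratings in a single category
    return (p_bar - p_e) / (1.0 - p_e)


# Toy illustration only (hypothetical labels, not study data).
reader_1 = ["normal", "normal", "abnormal", "abnormal"]
reader_2 = ["normal", "abnormal", "abnormal", "abnormal"]
pair_kappa = cohen_kappa(reader_1, reader_2)

# Four items, three raters each, two categories.
table = [[3, 0], [0, 3], [2, 1], [1, 2]]
group_kappa = fleiss_kappa(table)
```

On the toy data, the pairwise kappa is 0.5 and the three-rater Fleiss kappa is 1/3. Under the same single-label assumption, intrarater kappa can be sketched by passing one reader's first and second readings of the re-interpreted EEGs as the two label lists.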

Keywords: Confidence; EEG; Interrater reliability; Intrarater reliability.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Electroencephalography / methods*
  • Humans
  • Male
  • Observer Variation*
  • Reproducibility of Results
  • Seizures / diagnosis*
  • Seizures / etiology