Hypsarrhythmia assessment exhibits poor interrater reliability: a threat to clinical trial validity

Shaun A Hussain; Grace Kwong; John J Millichap; John R Mytinger; Nicole Ryan; Joyce H Matsumoto; Joyce Y Wu; Jason T Lerner; Raman Sankar

doi:10.1111/epi.12861

Epilepsia. 2015 Jan;56(1):77-81. doi: 10.1111/epi.12861. Epub 2014 Nov 10.

Affiliation

Shaun A Hussain: Division of Pediatric Neurology, Mattel Children's Hospital at UCLA, David Geffen School of Medicine, Los Angeles, California, U.S.A.

PMID: 25385396

Abstract

Objective: Hypsarrhythmia is the classic interictal electroencephalographic pattern associated with infantile spasms, and characterized by high voltage, disorganization, and multifocal independent epileptiform discharges. Given this seemingly simple definition, one might expect excellent interrater reliability (IRR) in the identification of this pattern. Alternatively, it may be argued that assessments of voltage and disorganization are fairly subjective, and thus quite challenging in borderline cases. We sought to test the IRR of hypsarrhythmia assessment in a systematic fashion.

Methods: Six blinded pediatric electroencephalographers from four centers reviewed 22 electroencephalography (EEG) samples from patients with infantile spasms. Each sample was 5 min in duration and included only wakefulness. Raters determined if each EEG was abnormal and if hypsarrhythmia was present/absent, and characterized relevant features: voltage, organization, epileptiform discharges, slowing, interictal attenuations, symmetry, and synchrony. In addition, raters indicated their level of confidence for each assessment. Multirater kappa statistics (κ) were calculated for the assessment of hypsarrhythmia and each feature.
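For readers unfamiliar with multirater agreement statistics, the following is a minimal sketch of one common choice, Fleiss' kappa. The abstract does not state which multirater kappa variant was used, so Fleiss' formulation is an assumption here, and the function name `fleiss_kappa` and the example counts are hypothetical illustrations, not study data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned sample i to category j.

    Assumption: every sample is rated by the same number of raters.
    """
    n_samples = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every sample must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category.
    p_category = [
        sum(row[j] for row in counts) / (n_samples * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per sample: fraction of rater pairs that agree.
    p_sample = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_sample) / n_samples
    p_expected = sum(p * p for p in p_category)  # agreement expected by chance
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 4 EEG samples, 6 raters, two categories
# (column 0 = "hypsarrhythmia present", column 1 = "absent").
example_counts = [
    [6, 0],  # unanimous
    [4, 2],
    [3, 3],  # evenly split
    [1, 5],
]
print(round(fleiss_kappa(example_counts), 3))
```

A value of 1 indicates perfect agreement, and a value near 0 indicates agreement no better than chance. Each feature reported in the study (voltage, organization, and so on) would get its own table of counts under this approach.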

Results: Although IRR was favorable in determining whether a study was normal or abnormal (κ=0.89), reliability was unfavorable for assessment of hypsarrhythmia (κ=0.40), modified hypsarrhythmia (κ=0.47), high voltage (κ=0.37), disorganization (κ=0.22), multifocal epileptiform discharges (κ=0.68), interictal voltage attenuations (κ=0.21), slowing (κ=0.20), asymmetry (κ=0.26), and asynchrony (κ=0.08). Despite generally unsatisfactory interrater agreement, raters consistently reported high confidence in assessments.

Significance: This study contradicts the view that hypsarrhythmia assessment is straightforward. Even small variability in the identification of hypsarrhythmia has potentially deleterious consequences for clinical care, as its presence or absence impacts decisions to pursue high-risk and high-cost therapies. These inconsistencies may similarly confound studies in which abolition of hypsarrhythmia is an outcome measure. There is a great need for practical, reliable, and unbiased measures of hypsarrhythmia.

Keywords: Electroencephalography; Hypsarrhythmia; Infantile spasms; Interrater reliability; West syndrome.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Child, Preschool
Clinical Trials as Topic / standards
Electroencephalography / statistics & numerical data*
Humans
Infant
Neurology / standards*
Observer Variation
Reproducibility of Results
Spasms, Infantile / diagnosis*
