Infant polysomnography: reliability. Collaborative Home Infant Monitoring Evaluation (CHIME) Steering Committee

Sleep. 1997 Jul;20(7):553-60.

Abstract

Infant polysomnography (IPSG) is an increasingly important procedure for studying infants with sleep and breathing disorders. Because analyses of IPSG data are subjective, an equally important issue is the reliability, or strength of agreement, among scorers (especially among experienced clinicians) of sleep parameters (SP) and sleep states (SS). One basic aspect of this problem was examined by proposing and testing the hypothesis that infant SP and SS ratings can be scored reliably at substantial levels of agreement, that is, kappa (κ) ≥ 0.61. Given the importance of IPSG reliability in the Collaborative Home Infant Monitoring Evaluation (CHIME) study, a reliability training and evaluation process was developed and implemented. Training on SP and SS scoring was based on CHIME criteria that modified and supplemented those of Anders, Emde, and Parmelee (10). The kappa statistic was adopted as the method for evaluating reliability between and among scorers. The scorers were three experienced investigators and four trainees. Interrater and intrarater reliabilities for SP codes and SS were calculated for 408 randomly selected 30-second epochs of nocturnal IPSG recorded at five CHIME clinical sites from enrolled subjects who were healthy full-term infants (n = 5), preterm infants (n = 4), infants with apnea of infancy (n = 2), and siblings of victims of sudden infant death syndrome (SIDS) (n = 4). IPSG data set 1 was scored by both the experienced investigators and the trained scorers and was used to assess initial interrater reliability. IPSG data set 2 was scored twice by the trained scorers and was used to reassess interrater reliability and to assess intrarater reliability. For data set 1, the κ values for SS ranged from 0.45 to 0.58, representing only a moderate level of agreement. Rater disagreements were therefore reviewed, and the scoring criteria were modified to clarify ambiguities. The κ values and confidence intervals (CIs) computed for data set 2 showed substantial interrater and intrarater agreement for the four trained scorers: for SS, κ = 0.68, and for SP, the κ values ranged from 0.62 to 0.76. Acceptance of the hypothesis supports the conclusion that IPSG is a reliable source of clinical and research data when scoring agreement is supported by significant κ values and CIs. Reliability can be maximized with strictly detailed scoring guidelines and training.
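For readers unfamiliar with the statistic, the κ values reported above can be computed directly from two scorers' epoch-by-epoch labels. The sketch below is a minimal illustration of Cohen's kappa for two raters, not the CHIME analysis code; the sleep-state labels and scorer sequences are hypothetical. The abstract's interpretive bands (0.41-0.60 moderate, ≥ 0.61 substantial) are consistent with the Landis and Koch benchmarks.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each rater's marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(rater_a)

    # Observed agreement: fraction of epochs with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of the two
    # raters' marginal probabilities for that label.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[lab] * freq_b.get(lab, 0) for lab in freq_a) / (n * n)

    if p_e == 1.0:  # degenerate case: both raters used a single label
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical sleep-state labels (AS = active sleep, QS = quiet sleep,
# W = wake) assigned by two scorers to the same 30-second epochs.
scorer_1 = ["AS", "AS", "QS", "QS", "QS", "W", "AS", "QS", "W", "QS"]
scorer_2 = ["AS", "QS", "QS", "QS", "W",  "W", "AS", "QS", "W", "QS"]

print(f"kappa = {cohens_kappa(scorer_1, scorer_2):.2f}")  # kappa = 0.68
```

Note that kappa corrects raw percent agreement for chance: the two hypothetical scorers agree on 80% of epochs, but because quiet sleep dominates both raters' marginals, the chance-corrected κ is only 0.68, just inside the substantial band.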

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Humans
  • Infant
  • Polysomnography*
  • Reproducibility of Results
  • Sudden Infant Death