Assessing inter-rater reliability (IRR) of Tanner staging and orchidometer use with boys: a study from PROS

J Pediatr Endocrinol Metab. 2009 Apr;22(4):291-9. doi: 10.1515/jpem.2009.22.4.291.

Abstract

Background: Few studies have systematically assessed the reliability of pubertal markers; most are limited by the small number of markers and the narrow range of ages studied.

Aim: To conduct a comprehensive examination of inter-rater reliability in the assessment of boys' sexual maturity.

Subjects: Eight pairs of practitioners independently rated 79 consecutive boys aged 8-14 years.

Methods: Two raters in each of eight practices independently rated boys presenting for physical examinations on key pubertal markers: pubic hair and genitalia (both on 5-point Tanner scales), testicular size (via palpation and comparison with a four-bead Prader orchidometer), and axillary hair (on a 3-point scale).

Results: Intraclass correlations assessing inter-rater reliability for pubertal markers ranged from 0.61 to 0.94 (all significant at p < 0.001). Inter-rater kappa values for signs of pubertal initiation ranged from 0.49 to 0.79.
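
For readers unfamiliar with the kappa statistic reported in the Results, the following is a minimal sketch of how chance-corrected agreement between two raters can be computed. It assumes Cohen's kappa for two raters and a binary "pubertal initiation present/absent" judgment; the abstract does not state which kappa variant was used, and the ratings below are hypothetical illustration values, not study data. The function name `cohen_kappa` is likewise an illustrative choice.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of subjects given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each rater's marginal proportions,
    # summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (NOT from the study): 1 = signs of pubertal
# initiation present, 0 = absent, for ten boys seen by both raters.
rater_a = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
rater_b = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]

kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 8 of 10 boys (observed agreement 0.80), chance agreement is 0.52, and kappa is therefore (0.80 - 0.52) / (1 - 0.52), or about 0.58.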

Conclusions: Practitioners are able to reliably stage key markers of male puberty and identify signs of pubertal initiation.

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adolescent
  • Child
  • Genitalia, Male / growth & development
  • Humans
  • Male
  • Middle Aged
  • Observer Variation
  • Puberty*
  • Reproducibility of Results
  • Sexual Maturation*
  • Testis / anatomy & histology