Inter-rater reliability, intra-rater reliability and internal consistency of the Brisbane Evidence-Based Language Test

Disabil Rehabil. 2022 Feb;44(4):637-645. doi: 10.1080/09638288.2020.1776774. Epub 2020 Jun 22.

Abstract

Purpose: To examine the inter-rater reliability, intra-rater reliability, internal consistency and practice effects associated with a new test, the Brisbane Evidence-Based Language Test.

Methods: Reliability estimates were obtained in a repeated-measures design through analysis of clinician video ratings of stroke participants completing the Brisbane Evidence-Based Language Test. Inter-rater reliability was determined by comparing 15 independent clinicians' scores of 15 randomly selected videos. Intra-rater reliability was determined by comparing two clinicians' scores of 35 videos when re-scored after a two-week interval.
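
Agreement among raters in a videos-by-clinicians design like this is commonly summarised with an intraclass correlation coefficient (ICC). The abstract does not state which ICC form was used, so the following is only a minimal sketch that assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the scores are hypothetical and are not taken from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per video (subject) and one column per
    rater. The choice of ICC form is an assumption, not the study's
    documented method.
    """
    n = len(scores)          # number of videos
    k = len(scores[0])       # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical total scores: 4 videos rated by 3 clinicians.
example_scores = [
    [250, 251, 249],
    [180, 182, 181],
    [120, 119, 121],
    [205, 204, 206],
]
print(round(icc_2_1(example_scores), 3))
```

The same function could be applied to intra-rater data by treating one clinician's first and second scoring sessions as the two columns.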

Results: Intraclass correlation coefficient (ICC) analysis demonstrated almost perfect inter-rater reliability (0.995; 95% confidence interval: 0.990-0.998), intra-rater reliability (0.994; 95% confidence interval: 0.989-0.997) and internal consistency (Cronbach's α = 0.940 (95% confidence interval: 0.920-1.0)). Almost perfect correlations (0.998; 95% confidence interval: 0.995-0.999) between face-to-face and video ratings were obtained.
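
For readers unfamiliar with the internal consistency statistic reported above, Cronbach's α compares the sum of the item variances with the variance of the total score: α = k/(k − 1) × (1 − Σ item variances / total-score variance), where k is the number of items. The sketch below is illustrative only; the function name `cronbach_alpha` and the item scores are hypothetical, and it does not reproduce the study's confidence interval calculation.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a participants-by-items score table.

    item_scores: list of rows, one row per participant and one column
    per test item. Sample variances (n - 1 denominator) are used
    throughout.
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores grouped by item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: 5 participants, 4 items scored 0-3.
example_items = [
    [3, 3, 2, 3],
    [2, 2, 2, 1],
    [1, 0, 1, 1],
    [3, 2, 3, 3],
    [0, 1, 0, 0],
]
print(round(cronbach_alpha(example_items), 3))
```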

Conclusion: The Brisbane Evidence-Based Language Test demonstrates almost perfect inter-rater reliability, intra-rater reliability and internal consistency. High correlation coefficients and narrow confidence intervals demonstrated minimal practice effects with scoring or influence of years of clinical experience on test scores. Almost perfect correlations between face-to-face and video scoring methods indicate these reliability estimates have direct application to everyday practice. The test is available from brisbanetest.org.

Implications for Rehabilitation

  • The Brisbane Evidence-Based Language Test is a new measure for the assessment of acquired language disorders.
  • The Brisbane Evidence-Based Language Test demonstrated almost perfect inter-rater reliability, intra-rater reliability and internal consistency.
  • High reliability estimates and narrow confidence intervals indicated that test ratings vary minimally when administered by clinicians of different experience levels, or different levels of familiarity with the new measure.
  • The test is a reliable measure of language performance for use in clinical practice and research.

Keywords: Aphasia; outcome measures; psychometric properties; reliability; stroke; test.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Language Tests
  • Language*
  • Observer Variation
  • Reproducibility of Results