Unsatisfactory reproducibility of interstitial inflammation scoring in allograft kidney biopsy

Sci Rep. 2023 May 1;13(1):7095. doi: 10.1038/s41598-023-33908-3.

Abstract

Interstitial inflammation scoring is incorporated into the Banff Classification of Renal Allograft Pathology and is essential for the diagnosis of T-cell-mediated rejection. However, its reproducibility, including inter-rater and intra-rater reliability, has not been carefully investigated. In this study, eight renal pathologists from different hospitals independently scored 45 kidney allograft biopsies with varying extents of interstitial inflammation. Inter-rater and intra-rater reliabilities were assessed with kappa statistics and conditional agreement probabilities. Individual pathologists' scoring patterns were examined with chi-squared tests and tests of proportions. The mean pairwise kappa values for inter-rater reliability were 0.27, 0.30, and 0.26 for the Banff i score, ti score, and i-IFTA, respectively. No rater pair performed consistently better or worse than the others on all three scorings. After the scores were dichotomized into two groups (none/mild and moderate/severe inflammation), the averaged conditional agreements ranged from 47.1% to 50.0%. The distributions of the scores differed among pathologists, and some pathologists persistently scored higher or lower than others. Given the important role of interstitial inflammation scoring in the diagnosis of T-cell-mediated rejection, transplant practitioners should be aware of the possible clinical implications of this far-from-optimal reproducibility.
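For readers unfamiliar with the agreement measures named in the abstract, the following is a minimal sketch of how a mean pairwise kappa and a conditional agreement on dichotomized scores could be computed. It rests on several assumptions: the abstract does not say whether the kappa was weighted, so unweighted Cohen's kappa is used here; conditional agreement is taken as the proportion of specific agreement, 2 x (cases both raters place in a category) / (sum of the two raters' counts for that category), which is one common definition and may differ from the authors'; the cut-off of 2 for "moderate/severe" is likewise assumed. The function names and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings):
    """Average kappa over all rater pairs; ratings maps rater -> list of scores."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(ratings[p], ratings[q]) for p, q in pairs) / len(pairs)


def dichotomize(scores, threshold=2):
    """Map ordinal scores 0-3 to 0 (none/mild) or 1 (moderate/severe).

    The threshold of 2 is an assumption made for this sketch.
    """
    return [int(s >= threshold) for s in scores]


def conditional_agreement(a, b, category=1):
    """Proportion of specific agreement on one category for a rater pair."""
    both = sum(x == category and y == category for x, y in zip(a, b))
    total = a.count(category) + b.count(category)
    return 2 * both / total if total else float("nan")


if __name__ == "__main__":
    # Hypothetical toy scores for three raters on eight biopsies (not study data).
    toy = {
        "rater_A": [0, 1, 2, 3, 0, 1, 2, 3],
        "rater_B": [0, 2, 2, 3, 1, 1, 3, 3],
        "rater_C": [1, 1, 3, 2, 0, 0, 2, 3],
    }
    print("mean pairwise kappa:", round(mean_pairwise_kappa(toy), 3))
    a, b = dichotomize(toy["rater_A"]), dichotomize(toy["rater_B"])
    print("conditional agreement (moderate/severe):", conditional_agreement(a, b))
```

With eight raters, as in the study, the same averaging would run over 28 rater pairs. A weighted kappa, which gives partial credit for near-misses on an ordinal scale, would generally yield different values from the unweighted form sketched here.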

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Allografts
  • Biopsy
  • Graft Rejection / pathology
  • Humans
  • Inflammation / pathology
  • Kidney / pathology
  • Kidney Transplantation*
  • Reproducibility of Results