Learning more from the inter-rater reliability of interstitial fibrosis assessment beyond just a statistic

Sci Rep. 2023 Aug 15;13(1):13260. doi: 10.1038/s41598-023-40221-6.

Abstract

Interstitial fibrosis assessment by renal pathologists lacks good agreement, and we aimed to investigate its hidden properties and infer the possible clinical impact. Fifty kidney biopsies were assessed by nine renal pathologists, and agreement was evaluated with intraclass correlation coefficients (ICCs) and kappa statistics. The probability that a pathologist's assessment would deviate far from the true value was derived from quadratic regression and multilayer perceptron nonlinear regression. Likely causes of variation in interstitial fibrosis assessment were investigated, and possible misclassification rates were inferred for previously reported large cohorts. Inter-rater reliabilities ranged from poor to good (ICCs 0.48 to 0.90), and pathologists' assessments agreed worst when the extent of interstitial fibrosis was moderate. An estimated 33.5% of pathologists' assessments were expected to deviate far from the true values. Variation in interstitial fibrosis assessment correlated with variation in interstitial inflammation assessment (r² = 32.1%). Taking IgA nephropathy as an example, the Oxford T score for interstitial fibrosis was expected to be misclassified in 21.9% of patients. This study demonstrates the complexity of the inter-rater reliability of interstitial fibrosis assessment; our proposed approaches uncovered previously unknown properties of pathologists' practice and inferred a possible clinical impact on patients.
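The abstract does not state which ICC model the authors used. Purely as an illustration, a common choice for a fully crossed design like this one (every pathologist rating every biopsy) is the two-way random-effects, absolute-agreement, single-rater ICC(2,1), which can be computed from ANOVA sums of squares:

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: array of shape (n_subjects, k_raters), e.g. 50 biopsies x 9
    pathologists. This is an illustrative sketch, not the paper's code.
    """
    Y = np.asarray(ratings, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    row_means = Y.mean(axis=1)  # per-subject (biopsy) means
    col_means = Y.mean(axis=0)  # per-rater (pathologist) means

    # ANOVA sums of squares for a two-way layout without replication
    ss_rows = k * ((row_means - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between raters
    ss_total = ((Y - grand) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols          # residual

    msr = ss_rows / (n - 1)            # mean square, subjects
    msc = ss_cols / (k - 1)            # mean square, raters
    mse = ss_error / ((n - 1) * (k - 1))

    # Shrout-Fleiss ICC(2,1): rater variance counts against agreement
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

For example, `icc2_1([[1, 1], [2, 2], [3, 3]])` returns 1.0 (perfect agreement), while a constant offset between two raters, as in `[[1, 2], [2, 3], [3, 4]]`, lowers the absolute-agreement ICC even though the raters rank the subjects identically.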

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Fibrosis
  • Glomerulonephritis, IGA* / pathology
  • Humans
  • Kidney* / pathology
  • Observer Variation
  • Reproducibility of Results