Learning more from the inter-rater reliability of interstitial fibrosis assessment beyond just a statistic

Sci Rep. 2023 Aug 15;13(1):13260. doi: 10.1038/s41598-023-40221-6.

Abstract

Interstitial fibrosis assessment by renal pathologists lacks good agreement, and we aimed to investigate its hidden properties and infer the possible clinical impact. Fifty kidney biopsies were assessed by nine renal pathologists, and agreement was evaluated with intraclass correlation coefficients (ICCs) and kappa statistics. The probability that a pathologist's assessment would deviate far from the true value was derived from quadratic regression and multilayer perceptron nonlinear regression. Likely causes of variation in interstitial fibrosis assessment were investigated, and possible misclassification rates were inferred for previously reported large cohorts. Inter-rater reliabilities ranged from poor to good (ICCs 0.48 to 0.90), and pathologists' assessments agreed worst when the extent of interstitial fibrosis was moderate. An estimated 33.5% of pathologists' assessments were expected to deviate far from the true values. Variation in interstitial fibrosis assessment correlated with variation in interstitial inflammation assessment (r² = 32.1%). Taking IgA nephropathy as an example, the Oxford T score for interstitial fibrosis was expected to be misclassified in 21.9% of patients. This study demonstrates the complexity of the inter-rater reliability of interstitial fibrosis assessment; our proposed approaches uncovered previously unknown properties of pathologists' practice and inferred a possible clinical impact on patients.
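The abstract does not state which ICC model the authors used. Purely as an illustration, a common choice for a fully crossed design like this one (every pathologist rating every biopsy) is the two-way random-effects, absolute-agreement, single-rater ICC(2,1), which can be computed from ANOVA sums of squares:

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: array of shape (n_subjects, k_raters), e.g. 50 biopsies x 9
    pathologists. This is an illustrative sketch, not the paper's code.
    """
    Y = np.asarray(ratings, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    row_means = Y.mean(axis=1)  # per-subject (biopsy) means
    col_means = Y.mean(axis=0)  # per-rater (pathologist) means

    # ANOVA sums of squares for a two-way layout without replication
    ss_rows = k * ((row_means - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between raters
    ss_total = ((Y - grand) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols          # residual

    msr = ss_rows / (n - 1)            # mean square, subjects
    msc = ss_cols / (k - 1)            # mean square, raters
    mse = ss_error / ((n - 1) * (k - 1))

    # Shrout-Fleiss ICC(2,1): rater variance counts against agreement
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

For example, `icc2_1([[1, 1], [2, 2], [3, 3]])` returns 1.0 (perfect agreement), while a constant offset between two raters, as in `[[1, 2], [2, 3], [3, 4]]`, lowers the absolute-agreement ICC even though the raters rank the subjects identically.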

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Fibrosis
  • Glomerulonephritis, IGA* / pathology
  • Humans
  • Kidney* / pathology
  • Observer Variation
  • Reproducibility of Results