Reliability of the Walch Classification for Characterization of Primary Glenohumeral Arthritis: A Systematic Review

J Am Acad Orthop Surg. 2024 May 14. doi: 10.5435/JAAOS-D-22-01086. Online ahead of print.

Abstract

Introduction: The Walch classification has been widely accepted and further developed as a method to characterize glenohumeral arthritis. However, many studies have reported low and inconsistent measures of the reliability of the Walch classification. The purpose of this study was to review the literature on the reliability of the Walch classification and characterize how imaging modality and classification modifications affect reliability.

Methods: A systematic review was conducted of publications that reported the reliability of the Walch classification through intraobserver and interobserver kappa values. A search conducted in January 2021 and repeated in July 2023 used the terms ["Imaging" OR "radiography" OR "CT" OR "MRI"] AND ["Walch classification"] AND ["Glenoid arthritis" OR "Shoulder arthritis"]. All clinical studies from database inception to July 2023 that evaluated the intraobserver and/or interobserver reliability of the Walch or modified Walch classification were included. Cadaveric studies and studies that involved subjects with previous arthroplasty, shoulder débridement, glenoid reaming, interposition arthroplasty, or a Latarjet or Bankart procedure were excluded. Articles were categorized by imaging modality and classification modification.

Results: Thirteen articles met all inclusion criteria. Three involved the evaluation of plain radiographs, 10 used CT, two used three-dimensional (3D) CT, and four used magnetic resonance imaging. Nine studies involved the original Walch classification system, five involved a simplified version, and four involved the modified Walch classification. Six studies examined the reliability of raters of varying experience levels, with none reporting consistent differences based on experience. Overall intraobserver kappa values for the Walch classifications ranged from 0.34 to 0.92, and interobserver kappa values ranged from 0.132 to 0.703. No consistent trends were observed in the effect of imaging modality or classification modification on reliability.

Discussion: The reliability of the Walch classification remains inconsistent, despite modification and imaging advances. Consideration of the limitations of the classification system is important when using it for treatment or prognostic purposes.