Interobserver variation in the assessment of appendiceal perforation

J Laparoendosc Adv Surg Tech A. 2009 Apr;19 Suppl 1:S15-8. doi: 10.1089/lap.2008.0095.supp.

Abstract

Background: Following an appendectomy, surgeons classify appendicitis, for treatment and billing purposes, into one of four categories: normal appendix, acute appendicitis, gangrenous appendicitis, and perforated appendicitis. Treatment of appendicitis is predicated upon this classification, which is made by visual inspection at the time of operation. Further, the classification often plays a role in the assessment of hospital outcomes. The currently accepted classification system is based solely upon intraoperative surgeon opinion and not on objective data. Inconsistent surgeon grading of the severity of appendicitis may have implications for both management and outcomes.

Objective: The aim of this study was to assess interobserver variation among surgeons in grading the inflammatory severity of acute appendicitis based on visual findings at operation.

Methods: This was a cross-sectional study. A total of 110 surgeons and surgical residents were randomly selected. Participants were shown intraoperative images of appendicitis and were asked to grade the severity of the appendicitis (i.e., normal, inflamed, gangrenous, or perforated). Demographic information regarding the type of practice, hospital setting, and the number of encounters with patients with acute appendicitis was collected. An intraclass correlation coefficient, represented by R, was calculated to assess interobserver reliability in grading the inflammatory severity of acute appendicitis. The two-way analysis of variance procedure for multivariate analysis was used for this calculation.
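As an illustration of how an intraclass correlation coefficient can be derived from a two-way analysis of variance, the sketch below computes the single-rater, absolute-agreement form (often written ICC(2,1)) from an images-by-raters table. The abstract does not state which ICC form the authors used, so the choice of form, the function name icc_two_way, the numeric coding of the four grades, and the example ratings are all assumptions made for illustration, not the study's method or data.

```python
# Hypothetical ordinal coding of the four categories (an assumption; the
# abstract does not say how grades were coded numerically).
GRADES = {"normal": 0, "inflamed": 1, "gangrenous": 2, "perforated": 3}


def icc_two_way(ratings):
    """Single-rater, absolute-agreement ICC from a two-way ANOVA.

    ratings: list of rows, one row per image, one column per rater.
    """
    n = len(ratings)      # number of images (subjects)
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into images, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-image mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (msr - mse) / denom


# Made-up example: 5 images graded by 3 raters (not data from the study).
example = [
    [GRADES["normal"], GRADES["normal"], GRADES["inflamed"]],
    [GRADES["inflamed"], GRADES["gangrenous"], GRADES["inflamed"]],
    [GRADES["gangrenous"], GRADES["perforated"], GRADES["gangrenous"]],
    [GRADES["perforated"], GRADES["gangrenous"], GRADES["perforated"]],
    [GRADES["inflamed"], GRADES["inflamed"], GRADES["perforated"]],
]
print(round(icc_two_way(example), 2))
```

For a perforated-versus-not comparison, the same function could be applied to ratings recoded as 1 (perforated) and 0 (not perforated). Values near 1 indicate that most of the variation comes from real differences between images, while values near 0 indicate that rater disagreement dominates.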

Results: The study group consisted of 110 surgeons: 62 practicing surgeons and 48 surgical trainees. Overall, 79% of the surgeons treated predominantly adults with appendicitis, 18% treated primarily children, and 3% treated both children and adults. Hospital practices included university hospitals (47%), community hospitals (33%), children's hospitals (14%), and others (6%). Overall, there was poor agreement among surgeons in assessing the severity of appendicitis. Among all attending surgeons, agreement on whether an image showed a perforated appendix was 27% (R4 = 0.27). Completion of a general surgery residency did improve interobserver agreement when compared with trainees.

Conclusion: There is poor agreement among surgeons in describing the severity of appendicitis. Treatment protocols based on more accurate assessment and categorization could potentially lead to more favorable, cost-effective outcomes. Further, studies determining efficacy in the diagnosis and treatment of appendicitis should account for observer variability. Future work must attempt to define critical objective assessment points, such as visible discontinuity of the appendix or fecal soilage, to ensure a better correlation of findings with prognosis.

MeSH terms

  • Adult
  • Appendicitis / classification
  • Appendicitis / diagnosis*
  • Child
  • Cross-Sectional Studies
  • General Surgery
  • Humans
  • Observer Variation