Assessment of an artificial intelligence aid for the detection of appendicular skeletal fractures in children and young adults by senior and junior radiologists

Toan Nguyen; Richard Maarek; Anne-Laure Hermann; Amina Kammoun; Antoine Marchi; Mohamed R Khelifi-Touhami; Mégane Collin; Aliénor Jaillard; Andrew J Kompel; Daichi Hayashi; Ali Guermazi; Hubert Ducou Le Pointe

doi:10.1007/s00247-022-05496-3

Pediatr Radiol. 2022 Oct;52(11):2215-2226. doi: 10.1007/s00247-022-05496-3. Epub 2022 Sep 28.

Affiliations

¹ Department of Pediatric Radiology, Armand Trousseau Hospital, 26 Av. du Dr Arnold Netter, 75012, Paris, France. toan.nguyen@aphp.fr.
² Department of Pediatric Radiology, Armand Trousseau Hospital, 26 Av. du Dr Arnold Netter, 75012, Paris, France.
³ Department of Radiology, Boston University School of Medicine, Boston, MA, USA.
⁴ Department of Radiology, Stony Brook University Renaissance School of Medicine, Stony Brook, NY, USA.
⁵ Department of Radiology, VA Boston Healthcare System, West Roxbury, MA, USA.

PMID: 36169667

Abstract

Background: As the number of conventional radiographic examinations in pediatric emergency departments increases, so, too, does the number of reading errors by radiologists.

Objective: The aim of this study is to investigate the ability of artificial intelligence (AI) to improve the detection of fractures by radiologists in children and young adults.

Materials and methods: A cohort of 300 anonymized radiographic examinations performed for the detection of appendicular fractures in patients aged 2 to 21 years was collected retrospectively. The ground truth for each examination was established by independent review by two radiologists with expertise in musculoskeletal imaging; discrepancies were resolved by consensus with a third radiologist. Half of the 300 examinations showed at least one fracture. The examinations were read by three senior pediatric radiologists and five radiology residents in the usual manner, and then immediately read again with the help of AI.

Results: The mean sensitivity for all groups was 73.3% (110/150) without AI; with AI it increased significantly, by 9.5 percentage points, to 82.8% (125/150) (P<0.001). For junior radiologists, sensitivity increased by 10.3 percentage points (P<0.001) and for senior radiologists by 8.2 percentage points (P=0.08). On average, there was no significant change in specificity (from 89.6% to 90.3% [+0.7 percentage points, P=0.28]); for junior radiologists, specificity increased from 86.2% to 87.6% (+1.4 percentage points, P=0.42) and for senior radiologists, it decreased from 95.1% to 94.9% (-0.2 percentage points, P=0.23). The stand-alone sensitivity and specificity of the AI were 91% and 90%, respectively.
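
As a reading aid, the arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the split of the 150 fracture-positive examinations into 110 detected and 40 missed is inferred from the "110/150" reported above. The changes are computed as absolute differences between the reported percentages (percentage points), not as relative increases.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of fracture-positive examinations correctly flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of fracture-negative examinations correctly cleared."""
    return true_neg / (true_neg + false_pos)


def point_change(before_pct: float, after_pct: float) -> float:
    """Absolute change in percentage points, rounded to one decimal."""
    return round(after_pct - before_pct, 1)


if __name__ == "__main__":
    # Assumption: 110 detected and 40 missed out of 150 positive examinations.
    print(f"sensitivity without AI: {sensitivity(110, 40):.1%}")

    # Reported mean percentages taken directly from the abstract.
    print("sensitivity change, all readers:", point_change(73.3, 82.8))
    print("specificity change, all readers:", point_change(89.6, 90.3))
    print("specificity change, juniors:", point_change(86.2, 87.6))
    print("specificity change, seniors:", point_change(95.1, 94.9))
```

Working from the reported percentages rather than from raw counts keeps the sketch independent of how the per-reader means were rounded in the abstract.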

Conclusion: With the help of AI, sensitivity for fracture detection increased by an average of about 10 percentage points, without a significant decrease in specificity, in a predominantly pediatric population.

Keywords: Artificial intelligence; Bone; Children; Diagnosis; Diagnostic accuracy; Fracture; Radiography; Radiology.

MeSH terms

Adolescent
Adult
Artificial Intelligence*
Child
Child, Preschool
Fractures, Bone* / diagnostic imaging
Humans
Radiography
Radiologists
Retrospective Studies
Young Adult