An Artificial Intelligence Model Based on ACR TI-RADS Characteristics for US Diagnosis of Thyroid Nodules

Radiology. 2022 Jun;303(3):613-619. doi: 10.1148/radiol.211455. Epub 2022 Mar 22.

Abstract

Background: US-based diagnosis of thyroid nodules is subjective and influenced by radiologists' experience levels.

Purpose: To develop an artificial intelligence model based on American College of Radiology Thyroid Imaging Reporting and Data System characteristics for diagnosing thyroid nodules and identifying nodule characteristics (hereafter, MTI-RADS) and to compare the performance of MTI-RADS, radiologists, and a model trained on benign and malignant status based on surgical histopathologic analysis (hereafter, MDiag).

Materials and Methods: In this retrospective study, 1588 surgically proven nodules from 636 consecutive patients (mean age, 49 years ± 14 [SD]; 485 women) were included. MTI-RADS and MDiag were trained on US images of 1345 nodules (January 2018 to December 2019). The performance of MTI-RADS was compared with that of MDiag and radiologists with different experience levels on the test data set (243 nodules, January 2019 to December 2019) with the DeLong method and McNemar test.

Results: The area under the receiver operating characteristic curve (AUC) and sensitivity of MTI-RADS were 0.91 and 83% (55 of 66 nodules), respectively, which were not significantly different from those of experienced radiologists (0.93 [P = .45] and 92% [61 of 66 nodules; P = .07]) and exceeded those of junior radiologists (0.78 [P < .001] and 70% [46 of 66 nodules; P = .04]). The specificity of MTI-RADS (87% [154 of 177 nodules]) was higher than that of both experienced and junior radiologists (80% [141 of 177 nodules; P = .02] and 75% [133 of 177 nodules; P = .001], respectively). The AUC of MTI-RADS was higher than that of MDiag (0.91 vs 0.84, respectively; P = .001). In the test set of 243 nodules, the consistency rates between MTI-RADS and the experienced group were higher than those between MTI-RADS and the junior group for composition (79% [n = 193] vs 73% [n = 178], respectively; P = .02), echogenicity (75% [n = 183] vs 68% [n = 166]; P = .04), shape (93% [n = 227] vs 88% [n = 215]; P = .04), and smooth or ill-defined margin (72% [n = 174] vs 63% [n = 152]; P = .002).

Conclusion: The area under the receiver operating characteristic curve (AUC) of an artificial intelligence model based on the American College of Radiology Thyroid Imaging Reporting and Data System (TI-RADS) was higher than that of a model trained on benign and malignant status based on surgical histopathologic analysis. The AUC and sensitivity of the model based on TI-RADS exceeded those of junior radiologists; the specificity of the model was higher than that of both experienced and junior radiologists.

© RSNA, 2022.
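The sensitivity and specificity percentages in the Results follow directly from the reported counts (66 malignant and 177 benign nodules in the 243-nodule test set). The short Python sketch below recomputes them as a sanity check. The variable and function names (`reported`, `rate_percent`, `summarize`) are illustrative and do not come from the study. The sketch does not attempt to reproduce the AUC values or the DeLong and McNemar P values, which would require per-nodule paired data that the abstract does not provide.

```python
# Counts taken from the abstract's test set: 66 malignant and 177 benign nodules.
MALIGNANT = 66
BENIGN = 177

# Hypothetical container: reader -> (correctly called malignant, correctly called benign).
reported = {
    "MTI-RADS": (55, 154),
    "experienced radiologists": (61, 141),
    "junior radiologists": (46, 133),
}


def rate_percent(hits: int, total: int) -> int:
    """Return hits / total as a percentage rounded to the nearest integer."""
    return round(100 * hits / total)


def summarize(counts: dict) -> dict:
    """Compute sensitivity and specificity (in percent) for each reader."""
    return {
        reader: {
            "sensitivity": rate_percent(true_pos, MALIGNANT),
            "specificity": rate_percent(true_neg, BENIGN),
        }
        for reader, (true_pos, true_neg) in counts.items()
    }


summary = summarize(reported)
for reader, metrics in summary.items():
    print(f"{reader}: sensitivity {metrics['sensitivity']}%, "
          f"specificity {metrics['specificity']}%")
```

The rounded values match the percentages stated in the abstract for all three readers. The same division by 243 reproduces the consistency rates reported for the nodule characteristics.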

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Female
  • Humans
  • Middle Aged
  • Retrospective Studies
  • Thyroid Nodule* / diagnostic imaging
  • Thyroid Nodule* / pathology
  • Ultrasonography / methods