Fully automatic coronary calcium scoring in non-ECG-gated low-dose chest CT: comparison with ECG-gated cardiac CT

Eur Radiol. 2023 Feb;33(2):1254-1265. doi: 10.1007/s00330-022-09117-3. Epub 2022 Sep 13.

Abstract

Objectives: To validate an artificial intelligence (AI)-based fully automatic coronary artery calcium (CAC) scoring system on non-electrocardiogram (ECG)-gated low-dose chest computed tomography (LDCT) using multi-institutional datasets with manual CAC scoring as the reference standard.

Methods: This retrospective study included 452 subjects from three academic institutions who underwent both ECG-gated calcium scoring computed tomography (CSCT) and LDCT scans. For all CSCT and LDCT scans, automatic CAC scoring (CAC_auto) was performed using AI-based software, and manual CAC scoring (CAC_man) was set as the reference standard. The reliability and agreement of CAC_auto with CAC_man were evaluated using intraclass correlation coefficients (ICCs) and Bland-Altman plots. The reliability between CAC_auto and CAC_man for CAC severity categories was analyzed using weighted kappa (κ) statistics.
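As a rough illustration of two of the statistics named in the Methods, the sketch below computes a Bland-Altman mean difference with 95% limits of agreement and a weighted kappa over severity categories. It is not the study's analysis code. The category cut-offs (0, 1-99, 100-399, ≥400), the linear weighting scheme, and all function names are assumptions made for the example, since the abstract does not state them. The ICC is left out because the abstract does not say which ICC form was used.

```python
from statistics import mean, stdev


def bland_altman(auto_scores, manual_scores):
    """Return (mean difference, half-width of the 95% limits of agreement)."""
    diffs = [a - m for a, m in zip(auto_scores, manual_scores)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # 1.96 x sample SD of the paired differences
    return bias, half_width


def severity_category(score):
    """Map a CAC score to a category index.

    ASSUMPTION: cut-offs 0, 1-99, 100-399, >=400 are illustrative only;
    the abstract does not list the categories used in the study.
    """
    if score == 0:
        return 0
    if score < 100:
        return 1
    if score < 400:
        return 2
    return 3


def weighted_kappa(cats_a, cats_b, n_categories):
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1).

    ASSUMPTION: linear weights; the abstract does not say whether linear
    or quadratic weights were used.
    """
    k = n_categories
    n = len(cats_a)
    # Joint proportion table: observed[i][j] = share of cases rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(cats_a, cats_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        return abs(i - j) / (k - 1)

    obs_dis = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp_dis = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp_dis == 0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("weighted kappa is undefined when expected disagreement is zero")
    return 1.0 - obs_dis / exp_dis
```

A typical use would be to map both score lists through `severity_category` and pass the resulting category lists to `weighted_kappa` with `n_categories=4`.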

Results: CAC_auto on CSCT and LDCT yielded high ICCs (0.998, 95% confidence interval (CI) 0.998-0.999 and 0.989, 95% CI 0.987-0.991, respectively) and a mean difference with 95% limits of agreement of 1.3 ± 37.1 and 0.8 ± 75.7, respectively. CAC_auto achieved excellent reliability for CAC severity (κ = 0.918-0.972) on CSCT and good to excellent but heterogeneous reliability among datasets (κ = 0.748-0.924) on LDCT.
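To make the reported agreement figures concrete, the short sketch below converts each "mean difference ± value" pair into an explicit lower and upper limit. It assumes that the value after ± is the half-width of the 95% limits of agreement, which is how the sentence reads but is not spelled out in the abstract. The function name is hypothetical.

```python
def limits_of_agreement(bias, half_width):
    """Return (lower, upper) limits, rounded to one decimal place.

    ASSUMPTION: half_width is the distance from the mean difference to
    each 95% limit of agreement.
    """
    return round(bias - half_width, 1), round(bias + half_width, 1)


csct_limits = limits_of_agreement(1.3, 37.1)   # values reported for CSCT
ldct_limits = limits_of_agreement(0.8, 75.7)   # values reported for LDCT
print(csct_limits, ldct_limits)
```

Under that assumption the interval for LDCT is roughly twice as wide as the one for CSCT, in line with the lower ICC reported for LDCT.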

Conclusions: The application of an AI-based automatic CAC scoring software to LDCT shows good to excellent reliability in CAC score and CAC severity categorization in multi-institutional datasets; however, the reliability varies among institutions.

Key points:

  • AI-based automatic CAC scoring on LDCT shows excellent reliability against manual CAC scoring in multi-institutional datasets.
  • The reliability for CAC score-based severity categorization varies among datasets.
  • Automatic scoring for LDCT shows a higher false-positive rate than automatic scoring for CSCT, and the most common causes of false positives are image noise and artifacts for both CSCT and LDCT.

Keywords: Artificial intelligence; Calcium; Coronary vessels; Thorax; Tomography, X-ray computed.

MeSH terms

  • Artificial Intelligence
  • Calcium* / analysis
  • Cardiac-Gated Imaging Techniques* / methods
  • Coronary Vessels* / diagnostic imaging
  • Datasets as Topic
  • Electrocardiography
  • Humans
  • Multicenter Studies as Topic
  • Reproducibility of Results
  • Retrospective Studies
  • Tomography, X-Ray Computed* / methods

Substances

  • Calcium