Can Additional Patient Information Improve the Diagnostic Performance of Deep Learning for the Interpretation of Knee Osteoarthritis Severity

Dong Hyun Kim; Kyong Joon Lee; Dongjun Choi; Jae Ik Lee; Han Gyeol Choi; Yong Seuk Lee

doi:10.3390/jcm9103341

J Clin Med. 2020 Oct 18;9(10):3341. doi: 10.3390/jcm9103341.

Authors

Dong Hyun Kim¹ ², Kyong Joon Lee³, Dongjun Choi³, Jae Ik Lee¹, Han Gyeol Choi¹, Yong Seuk Lee¹

Affiliations

¹ Department of Orthopedic Surgery, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seoul 03080, Korea.
² Department of Orthopaedic Surgery, Gwangmyeong 21st Century Hospital, Gyeonggi-do 14100, Korea.
³ Department of Radiology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seoul 03080, Korea.

Abstract

The study compares the diagnostic performance of deep learning (DL) with that of previous radiologist readings of the Kellgren-Lawrence (KL) grade and evaluates whether additional patient data can improve the diagnostic performance of DL. From March 2003 to February 2017, 3000 patients with 4366 knee AP radiographs were randomly selected. DL was trained in two stages: in the first stage with images only, and in the second stage with image data and clinical information. In the test set of image data, the areas under the receiver operating characteristic curve (AUCs) of the DL algorithm in diagnosing KL 0 to KL 4 were 0.91 (95% confidence interval (CI), 0.88-0.95), 0.80 (95% CI, 0.76-0.84), 0.69 (95% CI, 0.64-0.73), 0.86 (95% CI, 0.83-0.89), and 0.96 (95% CI, 0.94-0.98), respectively. In the test set with image data and additional patient information, the AUCs of the DL algorithm in diagnosing KL 0 to KL 4 were 0.97 (95% CI, 0.71-0.74), 0.85 (95% CI, 0.80-0.86), 0.75 (95% CI, 0.66-0.73), 0.86 (95% CI, 0.79-0.85), and 0.95 (95% CI, 0.91-0.97), respectively. Image data with additional patient information yielded a statistically significantly higher AUC than image data alone in diagnosing KL 0, 1, and 2 (p-values were 0.008, 0.020, and 0.027, respectively). The diagnostic performance of DL was comparable to that of the previous radiologist readings of the knee osteoarthritis KL grade. Additional patient information improved DL diagnosis in interpreting early knee osteoarthritis.
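
The per-grade AUCs reported above treat each KL grade as a separate binary problem (that grade versus all others). The abstract does not state how the confidence intervals were obtained, so the following is only a minimal sketch, assuming a one-vs-rest AUC computed from the Mann-Whitney statistic and a percentile bootstrap for the interval. The function names and the toy labels and scores are hypothetical and are not taken from the study.

```python
import random


def auc_one_vs_rest(labels, scores, positive):
    """AUC for grade `positive` versus all other grades (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, positive, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the one-vs-rest AUC (an assumption,
    not necessarily the method used in the paper)."""
    auc_one_vs_rest(labels, scores, positive)  # fail early on degenerate input
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        ss = [scores[i] for i in idx]
        try:
            stats.append(auc_one_vs_rest(ys, ss, positive))
        except ValueError:
            continue  # resample lacked one of the two classes; draw again
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical toy data: true KL grades and a model's score for "KL 0".
    kl_true = [0, 0, 1, 2, 0, 3, 4, 1, 0, 2]
    score_kl0 = [0.9, 0.7, 0.4, 0.2, 0.3, 0.1, 0.05, 0.6, 0.8, 0.35]
    auc = auc_one_vs_rest(kl_true, score_kl0, positive=0)
    lower, upper = bootstrap_ci(kl_true, score_kl0, positive=0)
    print(f"KL 0 vs rest: AUC={auc:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

Repeating the call with `positive` set to 1 through 4, each with that grade's own score column, would give one AUC and interval per KL grade in the same layout as the abstract.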

Keywords: deep learning; diagnosing; knee; osteoarthritis; performance.

Grants and funding

02-2020-040/Seoul National University Bundang Hospital