Deep learning for cephalometric landmark detection: systematic review and meta-analysis

Falk Schwendicke; Akhilanand Chaurasia; Lubaina Arsiwala; Jae-Hong Lee; Karim Elhennawy; Paul-Georg Jost-Brinkmann; Flavio Demarco; Joachim Krois

doi:10.1007/s00784-021-03990-w

Clin Oral Investig. 2021 Jul;25(7):4299-4309. doi: 10.1007/s00784-021-03990-w. Epub 2021 May 27.

Authors

Falk Schwendicke^{1,2}, Akhilanand Chaurasia^{3,4}, Lubaina Arsiwala⁵, Jae-Hong Lee^{3,6}, Karim Elhennawy⁷, Paul-Georg Jost-Brinkmann⁷, Flavio Demarco⁸, Joachim Krois^{5,3}

Affiliations

¹ Department of Oral Diagnostics, Digital Health and Health Services Research, Charité - Universitätsmedizin Berlin, Berlin, Germany. falk.schwendicke@charite.de.
² Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany. falk.schwendicke@charite.de.
³ Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany.
⁴ Department of Oral Medicine and Radiology, King George's Medical University, Lucknow, India.
⁵ Department of Oral Diagnostics, Digital Health and Health Services Research, Charité - Universitätsmedizin Berlin, Berlin, Germany.
⁶ Department of Periodontology, Daejeon Dental Hospital, Institute of Wonkwang Dental Research, Wonkwang University College of Dentistry, Daejeon, Korea.
⁷ Department of Orthodontics, Dentofacial Orthopedics and Pedodontics, Charité - Universitätsmedizin Berlin, Berlin, Germany.
⁸ Post-Graduate Program in Epidemiology, Federal University of Pelotas, Pelotas, Brazil.

Abstract

Objectives: Deep learning (DL) has been increasingly employed for automated landmark detection, e.g., for cephalometric purposes. We performed a systematic review and meta-analysis to assess the accuracy and underlying evidence for DL for cephalometric landmark detection on 2-D and 3-D radiographs.

Methods: Diagnostic accuracy studies published in 2015-2020 in Medline/Embase/IEEE/arXiv and employing DL for cephalometric landmark detection were identified and extracted by two independent reviewers. Random-effects meta-analysis, subgroup analysis, and meta-regression were performed, and study quality was assessed using QUADAS-2. The review was registered (PROSPERO no. 227498).
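For readers unfamiliar with random-effects pooling, the following is a minimal sketch of one common estimator (DerSimonian-Laird). It is an assumption for illustration only: the abstract does not state which estimator or software the authors used, and the function name `dersimonian_laird` and the input values are hypothetical, not data from the included studies.

```python
import math


def dersimonian_laird(effects, variances, z=1.96):
    """Pool per-study effects with the DerSimonian-Laird random-effects estimator.

    effects   -- per-study effect estimates (hypothetical inputs here)
    variances -- per-study sampling variances (squared standard errors)
    Returns (pooled_effect, (ci_low, ci_high), tau_squared).
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the between-study variance tau^2 (truncated at zero).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - z * se, pooled + z * se), tau2


if __name__ == "__main__":
    # Hypothetical per-study effects and variances, for illustration only.
    pooled, ci, tau2 = dersimonian_laird([-0.9, -0.2, -0.6], [0.10, 0.08, 0.12])
    print(f"pooled={pooled:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f}), tau2={tau2:.3f}")
```

Compared with a fixed-effect model, the added between-study variance widens the confidence interval when studies disagree, which suits a body of evidence that differs in datasets, landmark sets, and reference standards.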

Data: From 321 identified records, 19 studies (published 2017-2020) were included. All employed convolutional neural networks, mainly on 2-D lateral radiographs (n=15), used data from publicly available datasets (n=12), and tested the detection of a mean of 30 (SD: 25; range: 7-93) landmarks. The reference test was established by two experts (n=11), one expert (n=4), three experts (n=3), or a set of annotators (n=1). Risk of bias was high, and applicability concerns were detected for most studies, mainly regarding the data selection and reference test conduct. Landmark prediction error centered around a 2-mm error threshold (mean: -0.581 mm; 95% confidence interval: -1.264 to 0.102 mm). The proportion of landmarks detected within this 2-mm threshold was 0.799 (0.770 to 0.824).
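As an illustration of the threshold-based outcome reported above, the sketch below computes the proportion of landmarks whose predicted position lies within 2 mm of the reference annotation. The function names, the pixel spacing, and the coordinates are hypothetical and chosen only for illustration; individual studies may define and aggregate this metric differently.

```python
import math


def radial_errors_mm(predicted, reference, pixel_spacing_mm):
    """Euclidean distance between predicted and reference landmarks, in mm.

    predicted, reference -- sequences of (x, y) pixel coordinates, same order
    pixel_spacing_mm     -- assumed isotropic size of one pixel in millimetres
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return [math.dist(p, r) * pixel_spacing_mm for p, r in zip(predicted, reference)]


def proportion_within_threshold(errors_mm, threshold_mm=2.0):
    """Share of landmarks whose error does not exceed the threshold."""
    if not errors_mm:
        raise ValueError("no landmark errors supplied")
    return sum(e <= threshold_mm for e in errors_mm) / len(errors_mm)


if __name__ == "__main__":
    # Hypothetical landmark coordinates (pixels) and pixel spacing (mm).
    predicted = [(100, 100), (200, 150), (50, 80)]
    reference = [(103, 104), (200, 150), (80, 80)]
    errors = radial_errors_mm(predicted, reference, pixel_spacing_mm=0.1)
    print(errors, proportion_within_threshold(errors))
```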

Conclusions: DL shows relatively high accuracy for detecting landmarks on cephalometric imagery. The overall body of evidence is consistent but suffers from high risk of bias. Demonstrating robustness and generalizability of DL for landmark detection is needed.

Clinical significance: Existing DL models show consistent and largely high accuracy for automated detection of cephalometric landmarks. The majority of studies so far focused on 2-D imagery; data on 3-D imagery are sparse, but promising. Future studies should focus on demonstrating generalizability, robustness, and clinical usefulness of DL for this objective.

Keywords: Artificial intelligence; Convolutional neural networks; Evidence-based medicine; Meta-analysis; Orthodontics; Systematic review.

Publication types

Meta-Analysis
Review
Systematic Review

MeSH terms

Cephalometry
Deep Learning*
Radiography
Reproducibility of Results