Self-Supervised Learning Methods for Label-Efficient Dental Caries Classification

Aiham Taleb; Csaba Rohrer; Benjamin Bergner; Guilherme De Leon; Jonas Almeida Rodrigues; Falk Schwendicke; Christoph Lippert; Joachim Krois

doi:10.3390/diagnostics12051237

Diagnostics (Basel). 2022 May 16;12(5):1237. doi: 10.3390/diagnostics12051237.

Authors

Aiham Taleb¹, Csaba Rohrer², Benjamin Bergner¹, Guilherme De Leon³, Jonas Almeida Rodrigues⁴, Falk Schwendicke², Christoph Lippert¹,⁵, Joachim Krois²

Affiliations

¹ Digital Health & Machine Learning, Hasso Plattner Institute, University of Potsdam, 14469 Potsdam, Germany.
² Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany.
³ Contraste Radiologia Odontológica, Blumenau 89010-050, SC, Brazil.
⁴ Department of Surgery and Orthopedics, School of Dentistry, Universidade Federal do Rio Grande do Sul-UFRGS, Porto Alegre 90010-460, RS, Brazil.
⁵ Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Abstract

High annotation costs are a substantial bottleneck in applying deep learning architectures to clinically relevant use cases, substantiating the need for algorithms that learn from unlabeled data. In this work, we propose employing self-supervised methods. To that end, we pretrained models with three self-supervised algorithms on a large corpus of unlabeled dental images comprising 38K bitewing radiographs (BWRs). We then applied the learned neural network representations to tooth-level dental caries classification, for which we utilized labels extracted from electronic health records (EHRs). Finally, we established a holdout test set of 343 BWRs, annotated by three dental professionals and approved by a senior dentist, and used it to evaluate the fine-tuned caries classification models. Our experimental results demonstrate the gains obtained by pretraining models with self-supervised algorithms. These include improved caries classification performance (a 6 p.p. increase in sensitivity) and, most importantly, improved label efficiency: the resulting models can be fine-tuned using few labels (annotations). Our results show that using as few as 18 annotations can produce ≥45% sensitivity, which is comparable to human-level diagnostic performance. This study shows that self-supervision can provide gains in medical image analysis, particularly when obtaining labels is costly.
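The abstract reports its gains as sensitivity and as a difference in percentage points (p.p.). As a minimal sketch of how those two quantities are computed, the Python below works through a toy example. The function name `sensitivity`, the label convention (1 = carious, 0 = sound), and all label and prediction values are assumptions made for illustration; they are not data or code from the study.

```python
def sensitivity(y_true, y_pred):
    """Return TP / (TP + FN): the share of truly positive cases predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive examples")
    return tp / (tp + fn)


# Hypothetical tooth-level labels (1 = carious, 0 = sound); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
# Hypothetical predictions from a model trained from scratch.
baseline_pred = [1, 0, 0, 0, 0, 0, 1, 0, 1, 0]
# Hypothetical predictions from a model fine-tuned after pretraining.
pretrained_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

sens_baseline = sensitivity(y_true, baseline_pred)
sens_pretrained = sensitivity(y_true, pretrained_pred)

# A percentage-point gain is the absolute difference between two percentages.
gain_pp = (sens_pretrained - sens_baseline) * 100

print(f"baseline sensitivity:   {sens_baseline:.1%}")
print(f"pretrained sensitivity: {sens_pretrained:.1%}")
print(f"gain: {gain_pp:.1f} p.p.")
```

Because sensitivity ignores true negatives and false positives, it is usually read alongside specificity or a similar measure when comparing classifiers.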

Keywords: annotation efficient deep learning; data driven approaches; dental caries classification; representation learning; self-supervised learning; unsupervised methods.
