Unsupervised learning for large-scale corneal topography clustering

Pierre Zéboulon; Guillaume Debellemanière; Damien Gatinel

doi:10.1038/s41598-020-73902-7

Sci Rep. 2020 Oct 12;10(1):16973. doi: 10.1038/s41598-020-73902-7.

Authors

Pierre Zéboulon¹, Guillaume Debellemanière¹, Damien Gatinel²,³

Affiliations

¹ Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, 75019, Paris, France.
² Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, 75019, Paris, France. gatinel@gmail.com.
³ CEROC (Center of Expertise and Research in Optics for Clinicians), Paris, France. gatinel@gmail.com.

Abstract

Machine learning algorithms have recently shown their precision and potential in many use cases and fields of medicine. Most of the algorithms used are supervised and need a large quantity of labeled data to achieve high accuracy. Moreover, most applications of machine learning in medicine attempt to mimic or exceed human diagnostic capabilities, but little work has been done to show how these algorithms can help collect and pre-process large amounts of data. In this study, we show how unsupervised learning can extract and sort usable data from large unlabeled datasets with minimal human intervention. The digital examination tools used in clinical practice store such databases, which remain largely under-exploited. We applied unsupervised algorithms to corneal topography examinations, which remain the gold standard test for the diagnosis and follow-up of many corneal diseases and for refractive surgery screening. From an unlabeled database, we extracted 7019 usable examinations, which were automatically sorted into 3 common diagnoses (Normal, Keratoconus and History of Refractive Surgery) with an overall accuracy of 96.5%. Similar methods could be applied to any digital examination database to greatly speed up data collection and support the development of stronger supervised models.
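An overall accuracy for an unsupervised result presupposes some way of comparing cluster assignments with reference diagnoses. The abstract does not say how this comparison was made, so the following is only a minimal sketch of one common convention: each cluster is given the most frequent reference label among its members, and accuracy is the fraction of examinations whose cluster label matches their reference label. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def map_clusters_to_labels(cluster_ids, reference_labels):
    """Give each cluster the most common reference label among its members."""
    members = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, reference_labels):
        members[cluster_id].append(label)
    return {
        cluster_id: Counter(labels).most_common(1)[0][0]
        for cluster_id, labels in members.items()
    }


def clustering_accuracy(cluster_ids, reference_labels):
    """Fraction of items whose majority-vote cluster label matches the reference."""
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("cluster_ids and reference_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    mapping = map_clusters_to_labels(cluster_ids, reference_labels)
    correct = sum(
        mapping[cluster_id] == label
        for cluster_id, label in zip(cluster_ids, reference_labels)
    )
    return correct / len(cluster_ids)


# Hypothetical toy data: 10 examinations, 3 clusters, 3 reference diagnoses.
toy_clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
toy_labels = [
    "Normal", "Normal", "Keratoconus",
    "Keratoconus", "Keratoconus", "Keratoconus",
    "History of Refractive Surgery", "History of Refractive Surgery",
    "History of Refractive Surgery", "Normal",
]

toy_mapping = map_clusters_to_labels(toy_clusters, toy_labels)
toy_accuracy = clustering_accuracy(toy_clusters, toy_labels)
print(toy_mapping)
print(f"accuracy = {toy_accuracy:.2f}")
```

Majority-vote mapping allows two clusters to receive the same label. If a strict one-to-one correspondence between clusters and diagnoses is wanted, an assignment-based matching would be needed instead.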

MeSH terms

Algorithms*
Cluster Analysis
Corneal Diseases / diagnosis*
Corneal Topography / methods*
Data Collection
Databases, Factual*
Datasets as Topic
Humans
Information Storage and Retrieval
Keratoconus / diagnosis
Unsupervised Machine Learning*