Unsupervised learning for large-scale corneal topography clustering

Sci Rep. 2020 Oct 12;10(1):16973. doi: 10.1038/s41598-020-73902-7.

Abstract

Machine learning algorithms have recently shown their precision and potential in many use cases and fields of medicine. Most of the algorithms used are supervised and need a large quantity of labeled data to achieve high accuracy. In addition, most applications of machine learning in medicine attempt to mimic or exceed human diagnostic capabilities, and little work has been done to show how these algorithms can help collect and pre-process large amounts of data. In this study we show how unsupervised learning can extract and sort usable data from large unlabeled datasets with minimal human intervention. The digital examination tools used in clinical practice store such databases, which remain largely under-exploited. We applied unsupervised algorithms to corneal topography examinations; corneal topography remains the gold standard test for the diagnosis and follow-up of many corneal diseases and for refractive surgery screening. From an unlabeled database, we extracted 7019 usable examinations, which were automatically sorted into 3 common diagnoses (Normal, Keratoconus and History of Refractive Surgery) with an overall accuracy of 96.5%. Similar methods could be used on any form of digital examination database to greatly speed up data collection and support the development of stronger supervised models.
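
The abstract does not state how the overall accuracy was computed. One common way to score an unsupervised clustering against reference diagnoses is to map each cluster to its majority reference label and count the fraction of examinations that agree. The sketch below illustrates that idea under this assumption; it is not the authors' method. The function name `majority_vote_accuracy` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote_accuracy(cluster_ids, reference_labels):
    """Map each cluster to its most frequent reference label and
    return (mapping, overall accuracy). Hypothetical helper."""
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must not be empty")

    # Count reference labels inside each cluster.
    counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, reference_labels):
        counts_by_cluster[cluster][label] += 1

    # Each cluster is named after its majority label.
    mapping = {
        cluster: counts.most_common(1)[0][0]
        for cluster, counts in counts_by_cluster.items()
    }

    # An examination is "correct" if its cluster's name matches its label.
    correct = sum(
        mapping[cluster] == label
        for cluster, label in zip(cluster_ids, reference_labels)
    )
    return mapping, correct / len(cluster_ids)


# Toy data (invented for illustration, not taken from the study).
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
labels = [
    "Normal", "Normal", "Keratoconus",
    "Keratoconus", "Keratoconus", "Keratoconus",
    "History of Refractive Surgery", "History of Refractive Surgery",
    "History of Refractive Surgery", "Normal",
]

mapping, accuracy = majority_vote_accuracy(clusters, labels)
print(mapping)
print(f"overall accuracy: {accuracy:.1%}")
```

Scoring of this kind requires reference labels for at least a validation subset, so it measures the clustering after the fact and does not make the sorting step itself supervised.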

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Corneal Diseases / diagnosis*
  • Corneal Topography / methods*
  • Data Collection
  • Databases, Factual*
  • Datasets as Topic
  • Humans
  • Information Storage and Retrieval
  • Keratoconus / diagnosis
  • Unsupervised Machine Learning*