A multi-step approach for tongue image classification in patients with diabetes

Comput Biol Med. 2022 Oct:149:105935. doi: 10.1016/j.compbiomed.2022.105935. Epub 2022 Aug 13.

Abstract

Background: In China, diabetes is a common chronic disease with a high incidence, and it has become a severe public health problem. However, current diagnostic and treatment methods struggle to control the progression of diabetes. Traditional Chinese Medicine (TCM) has become an option for the treatment of diabetes because of its low cost, good curative effect, and good accessibility.

Objective: To achieve a fine-grained classification of the diabetic population based on tongue image data, provide a diagnostic basis for formulating individualized treatment plans for diabetes, ensure the accuracy and consistency of TCM diagnosis, and promote the objective and standardized development of TCM diagnosis.

Methods: We use the TFDA-1 tongue examination instrument to collect tongue images of the subjects. The Tongue Diagnosis Analysis System (TDAS) is used to extract TDAS features from the tongue images, and a Vector Quantized Variational Autoencoder (VQ-VAE) is used to extract VQ-VAE features from the same images. K-means is then applied to the VQ-VAE features to cluster the tongue images, and the TDAS features are used to describe the differences between clusters. A Vision Transformer (ViT) combined with Gradient-weighted Class Activation Mapping (Grad-CAM) is used to verify the clustering results and to localize diagnostic information.
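The clustering step described above can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. This is not the authors' implementation: the toy 2-D vectors stand in for VQ-VAE features, and the function names, the farthest-first initialization, and the choice of k = 2 are assumptions made only for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic initialization (an assumption for this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Alternate assignment and centroid update until labels stop changing."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical feature vectors: two well-separated groups of three points.
features = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
]
labels, centroids = kmeans(features, k=2)
print(labels)
```

In practice the feature vectors would be high-dimensional and a library implementation with multiple random restarts would usually be preferred; the sketch only shows the assign-then-update loop that defines the method.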

Results: Based on VQ-VAE features, K-means divides the diabetic population into 4 clusters with clear boundaries. The silhouette, Calinski-Harabasz, and Davies-Bouldin scores are 0.391, 673.256, and 0.809, respectively. Cluster 1 had the highest Tongue Body L (TB-L) and Tongue Coating L (TC-L) and the lowest Tongue Coating Angular second moment (TC-ASM), with a pale red tongue and white coating. Cluster 2 had the highest TC-b with a yellow tongue coating. Cluster 3 had the highest TB-a with a red tongue. Cluster 4 had the lowest TB-L, TC-L, and TB-b and the highest Per-all, with a purple tongue and the largest tongue coating area. ViT verifies the K-means clustering results: the highest Top-1 Classification Accuracy (CA) is 87.8%, and the average CA is 84.4%.
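For readers unfamiliar with the three internal validation scores reported above, the following plain-Python sketch shows how each is conventionally defined. It is an illustration under stated assumptions, not the authors' code: the toy points and labels are hypothetical, and the abstract does not say which software computed the published values. Higher silhouette and Calinski-Harabasz scores and lower Davies-Bouldin scores indicate better-separated clusters.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(members):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(members) for col in zip(*members)]


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return groups


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b), where a is the mean distance to the
    point's own cluster and b the mean distance to the nearest other one."""
    index_groups = {}
    for i, lab in enumerate(labels):
        index_groups.setdefault(lab, []).append(i)
    total = 0.0
    for i, p in enumerate(points):
        own = index_groups[labels[i]]
        if len(own) == 1:
            continue  # singleton clusters contribute a score of 0
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist(p, points[j]) for j in idx) / len(idx)
            for lab, idx in index_groups.items() if lab != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion, each divided
    by its degrees of freedom. Assumes within-cluster dispersion is nonzero."""
    groups = group_by_label(points, labels)
    n, k = len(points), len(groups)
    overall = mean_vector(points)
    between = sum(len(m) * dist(mean_vector(m), overall) ** 2 for m in groups.values())
    within = sum(dist(p, mean_vector(m)) ** 2 for m in groups.values() for p in m)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(points, labels):
    """Average, over clusters, of the worst-case ratio of summed
    within-cluster scatter to the distance between cluster centroids."""
    groups = list(group_by_label(points, labels).values())
    cents = [mean_vector(m) for m in groups]
    scatter = [sum(dist(p, c) for p in m) / len(m) for m, c in zip(groups, cents)]
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(len(groups)) if j != i
        )
        for i in range(len(groups))
    ]
    return sum(worst) / len(worst)


# Hypothetical data: a sensible partition versus an arbitrary one.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
good = [0, 0, 0, 1, 1, 1]
bad = [0, 1, 0, 1, 0, 1]
for name, labs in (("good", good), ("bad", bad)):
    print(name, silhouette(points, labs), calinski_harabasz(points, labs),
          davies_bouldin(points, labs))
```

scikit-learn provides equivalent functions in `sklearn.metrics` (`silhouette_score`, `calinski_harabasz_score`, `davies_bouldin_score`), which would normally be used instead of hand-written versions.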

Conclusions: The study organically combined unsupervised learning, self-supervised learning, and supervised learning and designed a complete diabetic tongue image classification method. This method does not rely on human intervention, makes decisions based entirely on tongue image data, and achieves state-of-the-art results. Our research will help TCM deeply participate in the individualized treatment of diabetes and provide new ideas for promoting the standardization of TCM diagnosis.

Keywords: Deep learning; Diabetes; K-means; Machine learning; Tongue image; Vector quantized variational autoencoder; Vision transformer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Diabetes Mellitus* / diagnostic imaging
  • Humans
  • Medicine, Chinese Traditional / methods
  • Neoplasm Grading
  • Tongue* / diagnostic imaging