Information maximization-based clustering of histopathology images using deep learning

PLOS Digit Health. 2023 Dec 8;2(12):e0000391. doi: 10.1371/journal.pdig.0000391. eCollection 2023 Dec.

Abstract

Pancreatic cancer is one of the most severe diseases, and it is very difficult to treat because the cancer cells that form in the pancreas intertwine with nearby blood vessels and connective tissue. As a result, surgical treatment is complicated and does not always lead to a cure. Histopathological examination is the usual approach to cancer diagnosis. However, the pancreas lies so deep inside the body that experts sometimes struggle to detect cancer in it. Computer-aided diagnosis can assist pathologists in this scenario by supporting their diagnostic decisions. In this research, we applied a deep learning-based approach to analyze histopathology images. We collected whole-slide images of KPC mice for this work. The pancreatic abnormalities observed in KPC mice show histological features similar to those seen in humans. We extracted random patches from the whole-slide images. A convolutional autoencoder framework was then used to embed these patches into an integrated latent space. Because our dataset is not annotated, we applied 'information maximization', a deep learning clustering technique, to group similar patches in an unsupervised manner. Moreover, uniform manifold approximation and projection (UMAP), a nonlinear dimension reduction technique, was used to visualize the embedded patches in a 2-dimensional space. Finally, we calculated several internal cluster validation metrics to determine the optimal number of clusters. Our work concentrated on patch-based anomaly detection in whole-slide histopathology images of KPC mice.
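
As a rough illustration of the 'information maximization' idea named in the abstract, one common formulation trains a classifier so that the mutual information between inputs and predicted cluster labels, I(X;Y) = H(Y) - H(Y|X), is as large as possible: clusters should be used evenly (high marginal entropy) while each patch is assigned confidently (low conditional entropy). The abstract does not state the exact objective used, so the sketch below is an assumption, not the authors' implementation; the function names `softmax`, `entropy`, and `mutual_information` are hypothetical, and random logits stand in for the outputs of a clustering head on top of the autoencoder embeddings.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax turning cluster logits into soft assignments."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) along the given axis."""
    return -(p * np.log(p + eps)).sum(axis=axis)


def mutual_information(probs):
    """Estimate I(X;Y) = H(Y) - H(Y|X) from a batch of soft assignments.

    probs has shape (n_patches, n_clusters); each row sums to 1.
    """
    marginal = probs.mean(axis=0)                 # estimated cluster usage p(y)
    h_marginal = entropy(marginal)                # H(Y): high when usage is balanced
    h_conditional = entropy(probs, axis=1).mean() # H(Y|X): low when rows are confident
    return h_marginal - h_conditional


# Hypothetical stand-in for clustering-head outputs on 8 patches, 3 clusters.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 3))
probs = softmax(logits)

# In training, the negative of this value would serve as the loss to minimize.
mi = mutual_information(probs)
print(f"estimated mutual information: {mi:.4f} nats")
```

Under this formulation the estimate is bounded above by the logarithm of the number of clusters, which is reached when every patch is assigned with full confidence and the clusters are equally populated.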

Grants and funding

This work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant-in-Aid for Scientific Research (C): Grant Number 21K12111. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.