Comparative analysis of high- and low-level deep learning approaches in microsatellite instability prediction

Jeonghyuk Park; Yul Ri Chung; Akinao Nose

doi:10.1038/s41598-022-16283-3

Sci Rep. 2022 Jul 18;12(1):12218. doi: 10.1038/s41598-022-16283-3.

Authors

Jeonghyuk Park^#¹, Yul Ri Chung^#², Akinao Nose³ ⁴

Affiliations

¹ Department of Physics, Graduate School of Science, The University of Tokyo, Tokyo, Japan. johnny.jhpark90@gmail.com.
² Pathology Center, Seegene Medical Foundation, Seoul, Korea.
³ Department of Physics, Graduate School of Science, The University of Tokyo, Tokyo, Japan.
⁴ Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan.

^# Contributed equally.

Abstract

Deep learning-based approaches in histopathology can be largely divided into two categories: a high-level approach using an end-to-end model and a low-level approach using feature extractors. Although the advantages and disadvantages of both approaches are empirically well known, there is no scientific basis for choosing a specific approach in research, and direct comparative analysis of the two approaches has rarely been performed. Using a dataset based on The Cancer Genome Atlas (TCGA), we compared these two approaches in microsatellite instability (MSI) prediction and analyzed morphological image features associated with MSI. Our high-level approach was based solely on EfficientNet, while our low-level approach relied on LightGBM and multiple deep learning models trained on publicly available multiclass tissue, nuclei, and gland datasets. We compared the performance of the two approaches and the image features each identified as important. Our high-level approach outperformed our low-level approach. In both approaches, debris, lymphocytes, and necrotic cells were revealed as important features of MSI, which is consistent with clinical knowledge. During qualitative analysis, we then identified weaknesses of our low-level approach and demonstrated that its performance can be improved by using different image features in a complementary way. We performed our study using open-access data, and we believe this study can serve as a useful basis for discovering imaging biomarkers for clinical application.
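To make the low-level idea concrete, the following is a minimal sketch of one way per-tile predictions from a feature extractor might be summarized into slide-level features before being passed to a gradient-boosted classifier such as LightGBM. The class list, the function name slide_features, and the choice of class fractions as features are illustrative assumptions, not the authors' pipeline, which the abstract does not specify in this detail.

```python
from collections import Counter

# Hypothetical tissue classes, chosen for illustration only. The abstract names
# debris and lymphocytes among the important features; the rest are assumptions.
TISSUE_CLASSES = ["tumor", "stroma", "lymphocytes", "debris", "mucus"]


def slide_features(tile_labels, classes=TISSUE_CLASSES):
    """Return the fraction of a slide's tiles assigned to each class.

    tile_labels: one predicted class name per tile, as a tile-level
    tissue classifier might produce (assumed input format).
    """
    if not tile_labels:
        # A slide with no tiles yields an all-zero feature vector.
        return [0.0] * len(classes)
    counts = Counter(tile_labels)
    total = len(tile_labels)
    return [counts.get(name, 0) / total for name in classes]


# Example: four tiles from one slide.
tiles = ["tumor", "tumor", "lymphocytes", "debris"]
features = slide_features(tiles)
```

A vector like this, one per slide, is the kind of tabular input a tree-based model accepts, which is what makes per-feature importance analysis straightforward in a low-level setup.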

MeSH terms

Deep Learning*
Humans
Microsatellite Instability
Microsatellite Repeats
Neoplasms* / genetics