Automated classification of protein subcellular localization in immunohistochemistry images to reveal biomarkers in colon cancer

BMC Bioinformatics. 2020 Sep 9;21(1):398. doi: 10.1186/s12859-020-03731-y.

Abstract

Background: Protein biomarkers play important roles in cancer diagnosis. Much effort has gone into measuring abnormal expression intensity in biological samples to identify cancer types and stages. However, changes in the subcellular location of proteins, which are also critical for understanding and detecting diseases, have rarely been studied.

Results: In this work, we developed a machine learning model to classify protein subcellular locations based on immunohistochemistry images of human colon tissues, and validated the ability of the model to detect subcellular location changes of biomarker proteins related to colon cancer. The model uses representative image patches as inputs and integrates feature engineering and deep learning methods. It achieves 92.69% accuracy in classifying new proteins. Two validation datasets of colon cancer biomarkers are employed, one derived from the published literature and one from the Human Protein Atlas database. The model identifies location changes for 81.82% and 65.66% of the biomarker proteins in these two datasets, respectively.
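The general idea of combining predefined and deep features over image patches can be illustrated with a minimal sketch. This is not the authors' pipeline: the patch-selection criterion (highest mean intensity), the three hand-crafted statistics, and the function names are all hypothetical, and a fixed random linear projection with ReLU stands in for a trained convolutional network so that the example depends only on NumPy.

```python
import numpy as np

def select_patches(image, patch_size=4, n_patches=3):
    """Pick the n_patches non-overlapping tiles with the highest mean intensity.

    The selection criterion is an assumption made for illustration only.
    """
    h, w = image.shape
    tiles = []
    for r in range(0, h - patch_size + 1, patch_size):
        for c in range(0, w - patch_size + 1, patch_size):
            tiles.append(image[r:r + patch_size, c:c + patch_size])
    tiles.sort(key=lambda t: t.mean(), reverse=True)
    return tiles[:n_patches]

def predefined_features(patch):
    """Hand-crafted descriptors: mean, standard deviation, intensity range."""
    return np.array([patch.mean(), patch.std(), patch.max() - patch.min()])

def deep_features_stub(patch, weights):
    """Stand-in for a CNN embedding: linear projection followed by ReLU."""
    return np.maximum(weights @ patch.ravel(), 0.0)

def combined_features(patch, weights):
    """Concatenate predefined and (stub) deep features into one vector."""
    return np.concatenate(
        [predefined_features(patch), deep_features_stub(patch, weights)]
    )

rng = np.random.default_rng(0)
image = rng.random((16, 16))            # synthetic grayscale image
weights = rng.standard_normal((8, 16))  # 8 "deep" features per 4x4 patch

patches = select_patches(image)
features = np.stack([combined_features(p, weights) for p in patches])
print(features.shape)  # (3, 11): 3 patches, 3 predefined + 8 stub features
```

In a real setting the combined vectors would be passed to a classifier that outputs a subcellular location label per protein; that step is omitted here because the abstract does not specify it.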

Conclusions: Our results demonstrate that using image patches and combining predefined and deep features can improve the performance of protein subcellular localization classification, and that our model can effectively detect biomarkers based on protein subcellular translocations. This study is anticipated to be useful for annotating proteins whose subcellular localization is unknown and for discovering new potential location biomarkers.

Keywords: Bioimage processing; Bioinformatics; Cancer biomarkers; Machine learning; Protein subcellular location.

MeSH terms

  • Biomarkers, Tumor / metabolism*
  • Colonic Neoplasms / metabolism
  • Colonic Neoplasms / pathology*
  • Databases, Protein
  • Humans
  • Immunohistochemistry
  • Machine Learning
  • Proteins / classification
  • Proteins / metabolism*

Substances

  • Biomarkers, Tumor
  • Proteins