Fully unsupervised deep mode of action learning for phenotyping high-content cellular images

Bioinformatics. 2021 Dec 7;37(23):4548-4555. doi: 10.1093/bioinformatics/btab497.

Abstract

Motivation: The identification and discovery of phenotypes from high-content screening images is a challenging task. Earlier works rely on image analysis pipelines to extract biological features, on supervised training methods, or on features generated by neural networks pretrained on non-cellular images. We introduce a novel unsupervised deep learning algorithm that clusters cellular images with a similar Mode-of-Action (MOA) together, using only the images' pixel intensity values as input, and that corrects for batch effects during training. Importantly, our method does not require the extraction of cell candidates and works directly on the entire images.
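For illustration only, the sketch below shows one round of unsupervised deep clustering on whole-image tensors: embeddings from a small convolutional encoder are clustered to obtain pseudo-labels, which then supervise the encoder. All class, function and parameter names here are hypothetical, and the actual architecture, losses and batch-effect correction of the published method differ; see the repository linked under Availability.

    # Hypothetical sketch (PyTorch + scikit-learn) of one deep-clustering round
    # on full-resolution images; not the published training procedure.
    import torch
    import torch.nn as nn
    from sklearn.cluster import KMeans

    class TinyEncoder(nn.Module):
        """Toy convolutional encoder mapping whole images to embeddings."""
        def __init__(self, in_channels=3, embed_dim=64):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_channels, 16, kernel_size=5, stride=2, padding=2),
                nn.ReLU(inplace=True),
                nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2),
                nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),  # accepts any input resolution
            )
            self.proj = nn.Linear(32, embed_dim)

        def forward(self, x):
            return self.proj(self.features(x).flatten(1))

    def pseudo_label_step(encoder, images, n_clusters=12):
        """Embed images, cluster the embeddings, return pseudo-labels."""
        encoder.eval()
        with torch.no_grad():
            emb = encoder(images).cpu().numpy()
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)
        return torch.as_tensor(labels, dtype=torch.long)

    def train_one_round(encoder, classifier, images, pseudo_labels, lr=1e-3):
        """Fit encoder + linear head against the current pseudo-labels."""
        encoder.train()
        params = list(encoder.parameters()) + list(classifier.parameters())
        opt = torch.optim.Adam(params, lr=lr)
        loss = nn.CrossEntropyLoss()(classifier(encoder(images)), pseudo_labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

    if __name__ == "__main__":
        imgs = torch.rand(32, 3, 256, 256)   # stand-in for screening images
        enc = TinyEncoder()
        head = nn.Linear(64, 12)             # 12 pseudo-classes, arbitrary
        y = pseudo_label_step(enc, imgs, n_clusters=12)
        print("round loss:", train_one_round(enc, head, imgs, y))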

Results: The method achieves competitive results on the labeled subset of the BBBC021 dataset, with an accuracy of 97.09% for correctly classifying the MOA by nearest-neighbor matching. Importantly, our approach can be trained on unannotated datasets, so it can discover novel MOAs and annotate unlabeled compounds. The ability to train end-to-end on full-resolution images makes our method easy to apply and allows it to further distinguish treatments by their effect on proliferation. A sketch of the nearest-neighbor matching idea is given below.
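As an illustration of nearest-neighbor MOA matching, the following sketch (hypothetical function names, synthetic data) scores each treatment profile by whether its nearest profile from a different compound shares the same MOA, a commonly used "not-same-compound" variant for BBBC021; the exact profiles and protocol used for the reported 97.09% are those described in the paper.

    # Hypothetical not-same-compound nearest-neighbor MOA matching on synthetic data.
    import numpy as np
    from scipy.spatial.distance import cdist

    def nsc_nn_accuracy(profiles, compounds, moas):
        """Fraction of profiles whose nearest other-compound profile shares the MOA."""
        profiles = np.asarray(profiles)
        dists = cdist(profiles, profiles, metric="cosine")
        correct = 0
        for i in range(len(profiles)):
            # exclude the treatment itself and all treatments of the same compound
            candidates = np.array(
                [j for j in range(len(profiles)) if compounds[j] != compounds[i]]
            )
            nearest = candidates[np.argmin(dists[i, candidates])]
            correct += int(moas[nearest] == moas[i])
        return correct / len(profiles)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        profiles = rng.normal(size=(8, 16))        # 8 synthetic treatment profiles
        compounds = ["c1", "c1", "c2", "c2", "c3", "c3", "c4", "c4"]
        moas =      ["A",  "A",  "A",  "A",  "B",  "B",  "B",  "B"]
        print(f"NSC-NN accuracy: {nsc_nn_accuracy(profiles, compounds, moas):.2%}")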

Availability and implementation: Our code is available at https://github.com/Novartis/UMM-Discovery.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Image Processing, Computer-Assisted / methods
  • Neural Networks, Computer*