Discriminative bag-of-cells for imaging-genomics

Pac Symp Biocomput. 2018:23:319-330.

Abstract

Connecting genotypes to image phenotypes is crucial for a comprehensive understanding of cancer. Learning such connections requires new machine learning approaches that better integrate imaging and genomic data. Here we propose a novel approach called Discriminative Bag-of-Cells (DBC) for predicting genomic markers from imaging features. DBC addresses the challenge of summarizing histopathological images by representing cells with learned discriminative types, or codewords. We also developed a reliable and efficient patch-based nuclear segmentation scheme using convolutional neural networks, from which nuclear and cellular features are extracted. Applying DBC to TCGA breast cancer samples to predict basal subtype status yielded a class-balanced accuracy of 70% on a separate test partition of 213 patients. As imaging and genomic data sets become increasingly available, we believe DBC will be a useful approach for screening histopathological images for genomic markers. Source code for nuclear segmentation and DBC is available at: https://github.com/bchidest/DBC.
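
The class-balanced accuracy reported in the abstract can be illustrated with a minimal sketch. It assumes the common definition of the metric, the unweighted mean of per-class recall, which the abstract does not spell out. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the paper or its repository.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the unweighted mean of per-class recall.

    Each class contributes equally, however many samples it has, so a
    classifier cannot score well simply by favoring the majority class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy labels (hypothetical, not data from the study): 1 = basal, 0 = non-basal.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy = {score:.3f}, plain accuracy = {plain:.3f}")
```

In this toy case the minority class is recalled half the time and the majority class three quarters of the time, so the balanced score is lower than the plain accuracy. When one subtype is much rarer than the others, this is the reason to report the balanced figure.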

MeSH terms

  • Biomarkers, Tumor / genetics
  • Breast Neoplasms / diagnostic imaging
  • Breast Neoplasms / genetics
  • Computational Biology / methods
  • Female
  • Genetic Association Studies
  • Genomics / statistics & numerical data*
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Machine Learning
  • Neoplasms / diagnostic imaging
  • Neoplasms / genetics
  • Neural Networks, Computer

Substances

  • Biomarkers, Tumor