Affinity network fusion and semi-supervised learning for cancer patient clustering

Methods. 2018 Aug 1:145:16-24. doi: 10.1016/j.ymeth.2018.05.020. Epub 2018 May 26.

Abstract

Defining subtypes of complex diseases such as cancer and stratifying patients who have the same disease but different subtypes for targeted treatment are important for personalized and precision medicine. Approaches that incorporate multi-omic data have an advantage over those that use only one data type for patient clustering and disease subtype discovery. However, multi-omic data are heterogeneous and noisy, which makes them challenging to integrate. In this paper, we present Affinity Network Fusion (ANF) to integrate multi-omic data for patient clustering. ANF first constructs a patient affinity network for each omic data type, and then calculates a fused network for spectral clustering. We applied ANF to a processed, harmonized cancer dataset of 2193 patients downloaded from the GDC data portal, and obtained promising results in clustering patients into the correct disease types. Moreover, we developed a semi-supervised model that combines ANF and a neural network for few-shot learning. In several cases, the model achieves greater than 90% accuracy on the test set while training on less than 1% of the data. This demonstrates the power of ANF in learning a good representation of patients, and shows the great potential of semi-supervised learning in cancer patient clustering.
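
The two-stage pipeline described above (one patient affinity network per omic data type, then a fused network passed to spectral clustering) can be illustrated with a minimal sketch. The abstract does not specify the kernel, the fusion rule, or the clustering details, so everything here is an assumption made for illustration: a Gaussian kernel with a median-distance bandwidth, fusion by averaging row-normalized affinities, a two-way spectral split on the normalized Laplacian, and synthetic data in place of real omic profiles. The function names `affinity_network`, `fuse_networks`, and `spectral_bipartition` are hypothetical and are not taken from the ANF software.

```python
import numpy as np

def affinity_network(X):
    """Dense Gaussian-kernel affinity matrix from a (patients x features) array."""
    # Pairwise squared Euclidean distances between patients.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    sigma2 = np.median(d2[d2 > 0])  # assumed bandwidth: median heuristic
    W = np.exp(-d2 / sigma2)
    np.fill_diagonal(W, 0.0)  # no self-affinity
    return W

def fuse_networks(networks, weights=None):
    """Assumed fusion rule: weighted mean of row-normalized affinities, symmetrized."""
    if weights is None:
        weights = np.full(len(networks), 1.0 / len(networks))
    fused = sum(w * (W / W.sum(axis=1, keepdims=True))
                for w, W in zip(weights, networks))
    return (fused + fused.T) / 2.0

def spectral_bipartition(W):
    """Two-way spectral clustering via the second eigenvector of the normalized Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)

# Synthetic stand-in for two omic data types measured on the same 40 patients,
# with 20 patients in each of two hypothetical disease types.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
view_a = rng.normal(size=(40, 10)) + 4.0 * truth[:, None]
view_b = rng.normal(size=(40, 8)) + 4.0 * truth[:, None]

fused = fuse_networks([affinity_network(view_a), affinity_network(view_b)])
labels = spectral_bipartition(fused)

# Cluster labels are arbitrary, so score against the truth up to a label swap.
accuracy = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"clustering accuracy on synthetic data: {accuracy:.2f}")
```

The fused matrix is kept dense here so that the graph stays connected and the second eigenvector is well defined; a sparse nearest-neighbor network would need extra care when the graph splits into disconnected components. Separating more than two groups would typically mean running k-means on several leading eigenvectors rather than thresholding one.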

Keywords: Affinity network fusion; Cancer subtype discovery; Multi-omic integration; Neural network; Patient clustering; Semi-supervised learning.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Antineoplastic Agents / therapeutic use*
  • Cluster Analysis
  • Computational Biology / methods*
  • Humans
  • Neoplasms / drug therapy*
  • Neural Networks, Computer*
  • Supervised Machine Learning*

Substances

  • Antineoplastic Agents