Method for Identifying Cancer-Related Genes Using Gene Similarity-Based Collaborative Filtering

J Comput Biol. 2019 Aug;26(8):875-881. doi: 10.1089/cmb.2018.0115. Epub 2019 May 22.

Abstract

This study aims to diagnose the stage of renal cell carcinoma and to predict the prognosis of breast cancer using RNA sequencing and microarray data, two representative types of gene expression data. To identify biomarkers for prediction, the top-N genes for each class, cancer or noncancer, are recommended by a collaborative filtering method based on three gene similarity coefficients. A machine learning classification model is then constructed using the union of the recommended genes as the final feature set. The optimal genetic markers are taken to be the gene set that yields the highest classification performance in the model. In experiments, the proposed method outperformed the same machine learning model trained on all gene features without feature selection. It also outperformed previously reported approaches that rely on correlation-based feature selection.
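The recommendation step described above can be illustrated with a minimal sketch of item-based collaborative filtering over genes. This is not the authors' implementation: the abstract does not name the three similarity coefficients or say how a gene is "rated" for a class, so the cosine similarity, the rating (absolute deviation of the class mean from the overall mean), and the function names `cosine_similarity`, `recommend_top_n`, and `union_features` are all assumptions chosen for illustration.

```python
import numpy as np


def cosine_similarity(X):
    """Gene-by-gene cosine similarity; X has shape (samples, genes).

    Assumption: the source does not name its three coefficients, so
    cosine similarity stands in for one of them here.
    """
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero genes
    Xn = X / norms
    return Xn.T @ Xn


def recommend_top_n(X, y, n, similarity=cosine_similarity):
    """Return, for each class label, the indices of the n top-scoring genes."""
    S = similarity(X)
    np.fill_diagonal(S, 0.0)  # a gene should not vote for itself
    denom = np.abs(S).sum(axis=1)
    denom[denom == 0] = 1.0
    overall = X.mean(axis=0)
    picks = {}
    for label in np.unique(y):
        # Hypothetical "rating" of each gene for this class: how far the
        # class mean expression sits from the overall mean.
        rating = np.abs(X[y == label].mean(axis=0) - overall)
        # Item-based CF prediction: similarity-weighted average of the
        # ratings of the other genes.
        score = (S @ rating) / denom
        picks[label] = np.argsort(score)[::-1][:n]
    return picks


def union_features(picks):
    """Union of the per-class recommendations, as sorted gene indices."""
    return sorted({int(g) for genes in picks.values() for g in genes})
```

Under these assumptions, `union_features(recommend_top_n(X, y, n))` would give the column indices to pass to a classifier. Repeating the call with other similarity functions and values of `n`, and keeping the set with the best validation score, would mirror the selection of the best-performing gene set described in the abstract.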

Keywords: cancer diagnosis; collaborative filtering; feature selection; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Breast Neoplasms / diagnosis
  • Breast Neoplasms / genetics*
  • Female
  • Genes, Neoplasm*
  • Humans
  • Machine Learning*
  • Models, Genetic*
  • RNA-Seq*

Substances

  • Biomarkers, Tumor