Discriminant Projection Shared Dictionary Learning for Classification of Tumors Using Gene Expression Data

IEEE/ACM Trans Comput Biol Bioinform. 2021 Jul-Aug;18(4):1464-1473. doi: 10.1109/TCBB.2019.2950209. Epub 2021 Aug 6.

Abstract

Because tumors occur in many subtypes, personalized treatment requires identifying the subtype of a tumor as accurately as possible. The development of DNA microarrays provides an opportunity to predict tumor classification. One strategy is to use gene expression profiling to extend current biological insights into the disease. However, most machine learning methods are prone to overfitting when classifying tumor gene expression profile data, which are characterized by high dimensionality, small sample sizes, and nonlinearity. As a newer machine learning method, dictionary learning has become an effective approach for gene expression profile classification. Here, a new method called discriminant projection shared dictionary learning (DPSDL) is proposed for classifying tumor subtypes using LINCS gene expression profile data. The method trains a shared dictionary and embeds the Fisher discriminant criterion to obtain a class-specific sub-dictionary and coding coefficients. At the same time, a projection matrix is trained to widen the distance between samples of different classes. Experimental results show that the proposed method achieves better classification performance on gene expression profiles than the other dictionary learning methods and machine learning methods it was compared against.
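
The two ingredients named in the abstract, a Fisher discriminant term on the coding coefficients and classification over a shared dictionary plus class-specific sub-dictionaries, can be illustrated with a minimal sketch. This is not the DPSDL algorithm itself: the abstract does not give the objective function or the update rules, so the function names (`fisher_scatter`, `classify_by_residual`), the ridge-regularized coding step (swapped in for whatever sparse coding the paper uses), and the smallest-residual decision rule are all assumptions made for illustration. The dictionaries are taken as given rather than learned, and the projection matrix is omitted.

```python
import numpy as np


def fisher_scatter(codes, labels):
    """Return (tr(S_W), tr(S_B)) for coding coefficients.

    codes  : array of shape (n_atoms, n_samples), one column per sample
    labels : array of shape (n_samples,), class label of each column
    A Fisher-style penalty would typically favor small tr(S_W) and
    large tr(S_B); how the paper weights them is not stated in the abstract.
    """
    codes = np.asarray(codes, dtype=float)
    labels = np.asarray(labels)
    global_mean = codes.mean(axis=1, keepdims=True)
    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        block = codes[:, labels == c]
        class_mean = block.mean(axis=1, keepdims=True)
        # Within-class scatter: spread of samples around their class mean.
        within += np.sum((block - class_mean) ** 2)
        # Between-class scatter: class means around the global mean,
        # weighted by the number of samples in the class.
        between += block.shape[1] * np.sum((class_mean - global_mean) ** 2)
    return float(within), float(between)


def classify_by_residual(x, shared, class_dicts, ridge=1e-3):
    """Assign x to the class whose [shared | class-specific] dictionary
    reconstructs it with the smallest residual.

    Assumption: ridge regression stands in for the sparse coding step.
    x           : array of shape (n_features,)
    shared      : array of shape (n_features, n_shared_atoms)
    class_dicts : list of arrays of shape (n_features, n_class_atoms)
    """
    residuals = []
    for class_dict in class_dicts:
        dictionary = np.hstack([shared, class_dict])
        gram = dictionary.T @ dictionary + ridge * np.eye(dictionary.shape[1])
        coef = np.linalg.solve(gram, dictionary.T @ x)
        residuals.append(np.linalg.norm(x - dictionary @ coef))
    return int(np.argmin(residuals))
```

As a usage example, calling `fisher_scatter` on coefficients whose columns are identical within each class gives a within-class trace of zero, and `classify_by_residual` returns the index of the sub-dictionary whose atoms span the class-specific part of the sample. The shared atoms are included in every class's dictionary, so they absorb structure common to all classes and leave the class-specific atoms to carry the discriminative part.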

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Genetic
  • Gene Expression Profiling
  • Humans
  • Machine Learning*
  • Neoplasms / classification*
  • Neoplasms / genetics
  • Transcriptome / genetics*