Multiclass cancer classification using a feature subset-based ensemble from microRNA expression profiles

Comput Biol Med. 2017 Jan 1:80:39-44. doi: 10.1016/j.compbiomed.2016.11.008. Epub 2016 Nov 21.

Abstract

Cancer classification has been a crucial topic of research in cancer treatment. Over the last decade, messenger RNA (mRNA) expression profiles have been widely used to classify different types of cancer. Since the discovery of a new class of small non-coding RNAs known as microRNAs (miRNAs), various studies have shown that miRNA expression patterns can also accurately classify human cancers. There is therefore great demand for machine learning approaches that accurately classify various types of cancer from miRNA expression data. In this article, we propose a feature subset-based ensemble method for classifying multiple cancers, in which each model is learned from a different projection of the original feature space. In our method, feature relevance and redundancy are considered to generate multiple feature subsets, a base classifier is learned from each independent miRNA subset, and the average posterior probability is used to combine the base classifiers. To test the performance of our method, we used bead-based and sequence-based miRNA expression datasets and conducted 10-fold and leave-one-out cross-validation. The experimental results show that the proposed method achieves higher prediction accuracy than popular ensemble methods. The Java program and source code of the proposed method and the datasets used in the experiments are freely available at https://sourceforge.net/projects/mirna-ensemble/.
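
The three steps named in the abstract (generating feature subsets from relevance and redundancy, learning one base classifier per subset, and averaging the posterior probabilities) can be illustrated with a minimal Python sketch. It is not the authors' Java implementation. The abstract does not state which relevance score, redundancy measure, or base classifier is used, so the sketch assumes a between-class to within-class variance ratio for relevance, absolute Pearson correlation for redundancy, and a Gaussian naive Bayes base classifier. All names (`make_subsets`, `SubsetEnsemble`, `max_corr`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance

def column(X, j):
    return [row[j] for row in X]

def relevance(X, y, j):
    # Assumed score: variance of the class means over mean within-class variance.
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row[j])
    between = pvariance([mean(v) for v in groups.values()])
    within = mean([pvariance(v) for v in groups.values()])
    return between / (within + 1e-9)

def abs_corr(a, b):
    # Assumed redundancy measure: absolute Pearson correlation.
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return abs(cov) / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def make_subsets(X, y, n_subsets, max_corr=0.9):
    # Visit features from most to least relevant. Each feature goes to the
    # smallest subset in which it is not redundant with any current member;
    # a feature that is redundant in every subset is dropped.
    ranked = sorted(range(len(X[0])), key=lambda j: relevance(X, y, j), reverse=True)
    subsets = [[] for _ in range(n_subsets)]
    for j in ranked:
        for s in sorted(subsets, key=len):
            if all(abs_corr(column(X, j), column(X, k)) < max_corr for k in s):
                s.append(j)
                break
    return subsets

class GaussianNB:
    # Stand-in base classifier; any model that outputs posteriors would do.
    def fit(self, X, y):
        self.stats = {}
        for label in sorted(set(y)):
            rows = [r for r, l in zip(X, y) if l == label]
            mus = [mean(c) for c in zip(*rows)]
            variances = [pvariance(c) + 1e-6 for c in zip(*rows)]
            self.stats[label] = (len(rows) / len(X), mus, variances)
        return self

    def posterior(self, x):
        logp = {}
        for label, (prior, mus, variances) in self.stats.items():
            logp[label] = math.log(prior) + sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, m, v in zip(x, mus, variances))
        top = max(logp.values())  # subtract the maximum for numerical stability
        expd = {c: math.exp(lp - top) for c, lp in logp.items()}
        total = sum(expd.values())
        return {c: e / total for c, e in expd.items()}

class SubsetEnsemble:
    def fit(self, X, y, n_subsets=2):
        self.subsets = [s for s in make_subsets(X, y, n_subsets) if s]
        self.models = [GaussianNB().fit([[r[j] for j in s] for r in X], y)
                       for s in self.subsets]
        return self

    def predict_proba(self, x):
        # Combine base classifiers by averaging their posterior probabilities.
        posts = [m.posterior([x[j] for j in s])
                 for m, s in zip(self.models, self.subsets)]
        return {c: mean([p[c] for p in posts]) for c in posts[0]}

    def predict(self, x):
        proba = self.predict_proba(x)
        return max(proba, key=proba.get)

# Toy data (invented for illustration): 9 samples, 4 features, 3 classes.
X = [[1.0, 5.0, 0.2, 3.0], [1.2, 5.1, 0.1, 3.3], [0.9, 4.8, 0.3, 2.9],
     [3.0, 5.0, 2.1, 3.1], [3.2, 4.9, 2.0, 2.8], [2.9, 5.2, 2.2, 3.2],
     [5.0, 1.0, 0.2, 3.0], [5.1, 1.2, 0.1, 3.1], [4.8, 0.9, 0.3, 2.7]]
y = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

ensemble = SubsetEnsemble().fit(X, y, n_subsets=2)
print(ensemble.subsets)
print(ensemble.predict([1.1, 5.0, 0.2, 3.0]))
```

Because the subsets are disjoint, each base classifier sees a different projection of the feature space, and averaging posteriors lets a confident model outweigh an uncertain one without a hard vote. On real miRNA data the number of subsets and the correlation threshold would need tuning, for instance with the 10-fold or leave-one-out cross-validation mentioned in the abstract.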

Keywords: Cancer classification; Data mining; Ensemble learning; miRNA expression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Mining / methods*
  • Databases, Genetic
  • Decision Trees
  • Female
  • Gene Expression Profiling / methods*
  • Humans
  • Machine Learning
  • Male
  • MicroRNAs / analysis*
  • MicroRNAs / genetics
  • MicroRNAs / metabolism
  • Neoplasms / classification*
  • Neoplasms / genetics*
  • Neoplasms / metabolism
  • Support Vector Machine

Substances

  • MicroRNAs