Biomarker discovery by feature ranking: Evaluation on a case study of embryonal tumors

Comput Biol Med. 2021 Jan:128:104143. doi: 10.1016/j.compbiomed.2020.104143. Epub 2020 Nov 28.

Abstract

The task of biomarker discovery is best translated to the machine learning task of feature ranking. Namely, the goal of biomarker discovery is to identify a set of potentially viable targets for addressing a given biological status. This is aligned with the definition of feature ranking and its goal: to produce a list of features ordered by their importance for the target concept. Feature ranking differs from feature selection (typically used for biomarker discovery) in that it retains viable biomarkers whose information is redundant or overlaps with that of other, often highly important, biomarkers, whereas feature selection does not. We propose to use a methodology for evaluating feature rankings to assess the quality of a given feature ranking and to discover the best cut-off point. We demonstrate the effectiveness of the proposed methodology on 10 datasets containing data about embryonal tumors. We evaluate two of the most commonly used feature ranking algorithms (Random forests and RReliefF) and, using the evaluation methodology, identify a set of viable biomarkers that have been confirmed to be related to cancer.
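
The idea of ranking features and then locating a cut-off can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' procedure: the toy data and the names gene_A, gene_B, noise_1 and noise_2 are hypothetical, absolute Pearson correlation stands in for Random forests and RReliefF, a leave-one-out nearest-centroid classifier stands in for the predictive model, and the cut-off rule (the largest k whose accuracy still matches the best accuracy) is one plausible choice, since the abstract does not state the criterion used.

```python
from math import sqrt

# Hypothetical toy data: rows are samples, columns are features.
# gene_A and gene_B carry overlapping (redundant) information about the class;
# noise_1 and noise_2 are unrelated to it.
FEATURES = ["gene_A", "gene_B", "noise_1", "noise_2"]
X = [
    [0.1, 0.2,  2.0,  2.0],
    [0.2, 0.1, -2.0,  2.0],
    [0.0, 0.1, -2.0, -2.0],
    [0.3, 0.2,  2.0, -2.0],
    [0.9, 1.0,  2.0,  2.0],
    [1.0, 0.9, -2.0,  2.0],
    [0.8, 0.9, -2.0, -2.0],
    [1.1, 1.0,  2.0, -2.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; a simple stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sqrt(sum((a - mx) ** 2 for a in xs))
    sy = sqrt(sum((b - my) ** 2 for b in ys))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def rank_features(X, y):
    """Return column indices ordered from most to least important."""
    scores = [abs_pearson([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a nearest-centroid classifier on `cols`."""
    correct = 0
    for i, row in enumerate(X):
        best_label, best_dist = None, None
        for label in sorted(set(y)):
            # Class centroid computed without the held-out sample i.
            members = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            dist = sum(
                (row[j] - sum(m[j] for m in members) / len(members)) ** 2
                for j in cols
            )
            if best_dist is None or dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def forward_curve(X, y, ranking):
    """Accuracy when using the top-1, top-2, ... features of the ranking."""
    return [loo_accuracy(X, y, ranking[:k]) for k in range(1, len(ranking) + 1)]


def choose_cutoff(curve, tol=0.0):
    """Largest k whose accuracy is within `tol` of the best accuracy."""
    best = max(curve)
    return max(k for k, acc in enumerate(curve, start=1) if acc >= best - tol)


ranking = rank_features(X, y)
curve = forward_curve(X, y, ranking)
cutoff = choose_cutoff(curve)
selected = [FEATURES[j] for j in ranking[:cutoff]]
print("ranking:", [FEATURES[j] for j in ranking])
print("accuracy by top-k:", curve)
print("cut-off:", cutoff, "selected:", selected)
```

In this toy setting both gene_A and gene_B fall above the cut-off even though either one alone already separates the classes. A selection method that optimizes for a minimal, non-redundant subset could plausibly keep only one of them, which is the contrast the abstract draws.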

Keywords: Biomedicine application; Feature ranking evaluation; Tumor data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers
  • Humans
  • Machine Learning
  • Neoplasms*
  • Neoplasms, Germ Cell and Embryonal*

Substances

  • Biomarkers