Selecting Multiple Biomarker Subsets with Similarly Effective Binary Classification Performances

J Vis Exp. 2018 Oct 11;(140):57738. doi: 10.3791/57738.

Abstract

Biomarker detection is one of the most important biomedical questions for high-throughput 'omics' researchers, and almost all existing biomarker detection algorithms generate a single biomarker subset that optimizes a performance measurement for a given dataset. However, a recent study demonstrated the existence of multiple biomarker subsets with similarly effective or even identical classification performances. This protocol presents a simple and straightforward methodology for detecting biomarker subsets whose binary classification performance is better than a user-defined cutoff. The protocol consists of data preparation and loading, baseline information summarization, parameter tuning, biomarker screening, result visualization and interpretation, biomarker gene annotation, and export of results and visualizations at publication quality. The proposed biomarker screening strategy is intuitive and demonstrates a general rule for developing biomarker detection algorithms. A user-friendly graphical user interface (GUI) was developed in the programming language Python, giving biomedical researchers direct access to their results. The source code and manual of kSolutionVis can be downloaded from http://www.healthinformaticslab.org/supp/resources.php.
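
The screening idea described above, keeping every biomarker subset whose binary classification performance beats a user-defined cutoff rather than only the single best subset, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the kSolutionVis implementation: the abstract does not specify the classifier, the performance measure, or the search strategy, so the nearest-centroid classifier, leave-one-out accuracy, exhaustive enumeration of fixed-size subsets, the function names `loo_accuracy` and `screen_subsets`, and the toy data are all hypothetical choices made for this example.

```python
from itertools import combinations


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Assumption: this classifier and metric are illustrative only; the
    source does not state which ones the protocol uses.
    """
    correct = 0
    for i in range(len(X)):
        # Build per-class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        # Squared Euclidean distance from the held-out sample to a centroid.
        def dist(label):
            return sum(
                (X[i][f] - sums[label][k] / counts[label]) ** 2
                for k, f in enumerate(features)
            )

        predicted = min(sums, key=dist)
        correct += predicted == y[i]
    return correct / len(X)


def screen_subsets(X, y, subset_size, cutoff):
    """Return every feature subset of the given size scoring above the cutoff."""
    hits = []
    for features in combinations(range(len(X[0])), subset_size):
        acc = loo_accuracy(X, y, features)
        if acc > cutoff:  # "better than a user-defined cutoff"
            hits.append((features, acc))
    # Best-performing subsets first; ties keep enumeration order.
    return sorted(hits, key=lambda hit: -hit[1])


# Hypothetical toy data: 6 samples x 4 features, two classes.
# Features 0 and 1 each separate the classes; features 2 and 3 do not.
X_demo = [
    [1.0, 5.0, 0.3, 2.0],
    [1.2, 5.1, 0.9, 1.0],
    [0.9, 4.8, 0.5, 3.0],
    [3.0, 1.0, 0.4, 2.5],
    [3.1, 1.2, 0.8, 1.5],
    [2.9, 0.9, 0.6, 2.0],
]
y_demo = [0, 0, 0, 1, 1, 1]

hits = screen_subsets(X_demo, y_demo, subset_size=1, cutoff=0.9)
for features, acc in hits:
    print(features, acc)
```

On this toy data, more than one single-feature subset clears the cutoff, which mirrors the abstract's point that several subsets can perform equally well. Exhaustive enumeration grows combinatorially with the number of features, so a real 'omics' dataset would need a pre-filtering or heuristic search step, which this sketch does not attempt.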

Publication types

  • Research Support, Non-U.S. Gov't
  • Video-Audio Media

MeSH terms

  • Algorithms*
  • Biomarkers / chemistry*
  • Humans
  • Programming Languages
  • Software

Substances

  • Biomarkers