Biomarker Signature Discovery from Mass Spectrometry Data

IEEE/ACM Trans Comput Biol Bioinform. 2014 Jul-Aug;11(4):766-72. doi: 10.1109/TCBB.2014.2318718.

Abstract

Mass spectrometry-based high-throughput proteomics is used for protein analysis and clinical diagnosis. Many machine learning methods have been used to construct classifiers from mass spectrometry data in order to discriminate between cancer stages. However, classifiers generated by machine learning techniques such as SVM typically lack biological interpretability. We present an innovative technique for the automated discovery of signatures optimized to characterize various cancer stages. We validate our signature discovery algorithm on one new colorectal cancer MALDI-TOF data set and two well-known ovarian cancer SELDI-TOF data sets. In all of these cases, our signature-based classifiers performed better than, or at least as well as, four benchmark machine learning algorithms, including SVM and KNN. Moreover, our optimized signatures automatically select smaller sets of key biomarkers than the black-box models generated by machine learning, and are much easier to interpret.
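The abstract does not specify the signature discovery algorithm, so the following is only an illustrative sketch of what an interpretable, signature-style classifier over a few spectral peaks could look like. It is not the authors' method. It assumes a hypothetical representation in which a signature is a short list of threshold rules on peak intensities, chosen greedily by single-rule accuracy and combined by majority vote. The names `best_rule_for_peak`, `discover_signature`, and `predict`, and the toy intensity values, are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical rule format: (peak index, intensity threshold, direction).
# direction = +1: intensity above the threshold votes for class 1.
# direction = -1: intensity below the threshold votes for class 1.
Rule = Tuple[int, float, int]


def best_rule_for_peak(X: List[List[float]], y: List[int], j: int) -> Tuple[float, Rule]:
    """Return (training accuracy, rule) for the best single-threshold rule on peak j."""
    values = sorted(set(row[j] for row in X))
    # Candidate thresholds are midpoints between consecutive distinct intensities.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best: Tuple[float, Rule] = (0.0, (j, values[0], 1))
    for t in candidates:
        for direction in (1, -1):
            preds = [1 if direction * (row[j] - t) > 0 else 0 for row in X]
            acc = sum(p == label for p, label in zip(preds, y)) / len(y)
            if acc > best[0]:
                best = (acc, (j, t, direction))
    return best


def discover_signature(X: List[List[float]], y: List[int], max_peaks: int = 3) -> List[Rule]:
    """Keep the max_peaks rules with the highest individual training accuracy."""
    scored = [best_rule_for_peak(X, y, j) for j in range(len(X[0]))]
    scored.sort(key=lambda s: s[0], reverse=True)  # stable: ties keep peak order
    return [rule for _, rule in scored[:max_peaks]]


def predict(signature: List[Rule], row: List[float]) -> int:
    """Classify as 1 when a strict majority of the signature's rules vote for class 1."""
    votes = sum(1 if d * (row[j] - t) > 0 else 0 for j, t, d in signature)
    return 1 if 2 * votes > len(signature) else 0


# Toy data (invented): rows are samples, columns are intensities at three peaks.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.8, 0.9],
    [0.9, 5.1, 0.4],
    [3.0, 2.0, 0.8],
    [3.2, 2.2, 0.3],
    [2.9, 1.9, 0.5],
]
y = [0, 0, 0, 1, 1, 1]

signature = discover_signature(X, y, max_peaks=2)
predictions = [predict(signature, row) for row in X]
```

In this toy setting the resulting signature reads as a pair of human-checkable statements about two peaks, which is the kind of interpretability the abstract contrasts with black-box classifiers. A real evaluation would use held-out data and the benchmark classifiers named in the abstract rather than training accuracy.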

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / analysis*
  • Databases, Factual
  • Humans
  • Neoplasms / chemistry*
  • Neoplasms / metabolism
  • Pattern Recognition, Automated / methods*
  • Proteomics / methods*
  • Reproducibility of Results
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods*

Substances

  • Biomarkers, Tumor