Ovarian cancer identification based on dimensionality reduction for high-throughput mass spectrometry data

Bioinformatics. 2005 May 15;21(10):2200-9. doi: 10.1093/bioinformatics/bti370. Epub 2005 Mar 22.

Abstract

Motivation: High-throughput and high-resolution mass spectrometry instruments are increasingly used for disease classification and therapeutic guidance. However, the analysis of the immense amount of data they produce poses considerable challenges. We have therefore developed a novel method for dimensionality reduction and tested it on a published high-resolution SELDI-TOF ovarian cancer dataset.

Results: We have developed a four-step data preprocessing strategy consisting of: (1) binning, (2) the Kolmogorov-Smirnov test, (3) restriction of the coefficient of variation and (4) wavelet analysis. Subsequently, support vector machines were used for classification. The developed method achieves an average sensitivity of 97.38% (sd = 0.0125) and an average specificity of 93.30% (sd = 0.0174) in 1000 independent k-fold cross-validations, where k = 2, ..., 10.
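The four preprocessing steps and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the bin width, the thresholds `ks_min` and `cv_max`, and the choice of a Haar wavelet that keeps only approximation coefficients are all assumptions made for illustration, since the abstract does not state these details. The support vector machine step is omitted because it needs a third-party library.

```python
import math
import statistics

def bin_spectrum(intensities, bin_size):
    """Step 1 (binning): average consecutive intensities in fixed-width bins."""
    return [statistics.fmean(intensities[i:i + bin_size])
            for i in range(0, len(intensities), bin_size)]

def ks_statistic(a, b):
    """Step 2: two-sample Kolmogorov-Smirnov statistic (largest ECDF gap)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def coefficient_of_variation(values):
    """Step 3: sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    return statistics.stdev(values) / mean if mean else math.inf

def select_features(cancer, control, ks_min=0.5, cv_max=0.5):
    """Keep bins that separate the groups (KS) and are stable within each (CV).

    ks_min and cv_max are hypothetical thresholds, not values from the paper.
    """
    keep = []
    for f in range(len(cancer[0])):
        a = [spectrum[f] for spectrum in cancer]
        b = [spectrum[f] for spectrum in control]
        if (ks_statistic(a, b) >= ks_min
                and coefficient_of_variation(a) <= cv_max
                and coefficient_of_variation(b) <= cv_max):
            keep.append(f)
    return keep

def haar_approximation(signal, levels):
    """Step 4 (assumed form): Haar wavelet approximation coefficients."""
    out = list(signal)
    for _ in range(levels):
        if len(out) < 2:
            break
        # Each level halves the length; an unpaired trailing value is dropped.
        out = [(out[k] + out[k + 1]) / math.sqrt(2)
               for k in range(0, len(out) - 1, 2)]
    return out

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP). 1 = cancer."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

In such a sketch, each spectrum would be binned first, `select_features` would pick the bins to retain, and `haar_approximation` would compress the retained values before they are passed to a classifier. Repeating the k-fold split many times and averaging the output of `sensitivity_specificity` over the folds would give summary figures of the kind reported above.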

Availability: The software is available to academic and non-commercial institutions.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Biomarkers, Tumor / analysis*
  • Diagnosis, Computer-Assisted / methods*
  • Female
  • Gene Expression Profiling / methods*
  • Humans
  • Neoplasm Proteins / analysis*
  • Ovarian Neoplasms / classification
  • Ovarian Neoplasms / diagnosis*
  • Ovarian Neoplasms / metabolism*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods*

Substances

  • Biomarkers, Tumor
  • Neoplasm Proteins