Swarm intelligence based wavelet coefficient feature selection for mass spectral classification: an application to proteomics data

Anal Chim Acta. 2009 Sep 28;651(1):15-23. doi: 10.1016/j.aca.2009.08.008. Epub 2009 Aug 15.

Abstract

This paper introduces the ant colony algorithm, a novel swarm-intelligence-based optimization method, to select appropriate wavelet coefficients from mass spectral data as a new feature selection method for ovarian cancer diagnostics. After determining suitable parameters for the ant colony algorithm (ACA) based search, we ran the feature search 100 times with the number of selected features fixed at 5. The results of this study show: (1) the classification accuracy based on the five selected wavelet coefficients can reach 100% for the training, validation, and independent testing sets; (2) the eight wavelet coefficients selected most often across the 100 runs provide 100% accuracy for the training set, 100% accuracy for the validation set, and 98.8% accuracy for the independent testing set, which suggests that the proposed feature selection method is robust and accurate; and (3) the mass spectral data corresponding to these eight wavelet coefficients can be located by inverse wavelet transformation, and the located mass spectral data still maintain high classification accuracies (100% for the training set, 97.6% for the validation set, and 98.8% for the testing set) while also providing physical and medical meaning for future studies of ovarian cancer mechanisms. Furthermore, the corresponding mass spectral data (potential biomarkers) are in good agreement with other studies that used the same sample set. Together, these results suggest that this feature extraction strategy will benefit the development of intelligent, real-time diagnosis and monitoring systems based on spectroscopy instrumentation.
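The search procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the pheromone update rule, the classifier, or the parameter values, so everything below is an assumption made for illustration: a nearest-centroid classifier stands in for the fitness measure, pheromone is deposited in proportion to validation accuracy, and the names `aco_select`, `sample_subset`, `most_popular`, `n_ants`, `n_iters`, and `rho` are hypothetical. Only the fixed subset size of 5, the 100 repeated runs, and the ranking of the eight most frequently selected coefficients come from the abstract.

```python
import random
from collections import Counter


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`.

    Assumption: a stand-in fitness; the paper's classifier is not stated here.
    """
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, y in zip(X_val, y_val):
        point = [x[f] for f in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y
    return correct / len(y_val)


def sample_subset(pheromone, k, rng):
    """Draw k distinct feature indices, each with probability proportional to pheromone."""
    candidates = list(range(len(pheromone)))
    chosen = []
    for _ in range(k):
        weights = [pheromone[c] for c in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return sorted(chosen)


def aco_select(fitness, n_features, k=5, n_ants=20, n_iters=30, rho=0.2, seed=0):
    """One ant-colony run: return the best k-feature subset found and its score."""
    rng = random.Random(seed)
    pheromone = [1.0] * n_features
    best_subset, best_score = None, -1.0
    for _ in range(n_iters):
        trails = [(fitness(s), s)
                  for s in (sample_subset(pheromone, k, rng) for _ in range(n_ants))]
        # Evaporation: every trail loses a fraction rho of its pheromone.
        pheromone = [(1 - rho) * p for p in pheromone]
        for score, subset in trails:
            # Deposit: features in better-scoring subsets gain more pheromone.
            for f in subset:
                pheromone[f] += score
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score


def most_popular(fitness, n_features, runs=100, top=8, **kwargs):
    """Repeat the search and rank features by how often they are selected."""
    counts = Counter()
    for run in range(runs):
        subset, _ = aco_select(fitness, n_features, seed=run, **kwargs)
        counts.update(subset)
    return [f for f, _ in counts.most_common(top)]
```

In use, `fitness` would be a closure over the wavelet coefficient matrices, for example `lambda s: nearest_centroid_accuracy(X_train, y_train, X_val, y_val, s)`, and the indices returned by `most_popular` would then be mapped back to the mass spectrum by the inverse wavelet transform.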

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Blood Proteins / chemistry
  • Female
  • Humans
  • Mass Spectrometry / methods*
  • Ovarian Neoplasms / diagnosis
  • Proteomics / methods*

Substances

  • Blood Proteins