Issues of processing and multiple testing of SELDI-TOF MS proteomic data

Merrill D Birkner; Alan E Hubbard; Mark J van der Laan; Christine F Skibola; Christine M Hegedus; Martyn T Smith

doi:10.2202/1544-6115.1198

Stat Appl Genet Mol Biol. 2006:5:Article11. doi: 10.2202/1544-6115.1198. Epub 2006 Apr 21.

Affiliation

Merrill D Birkner: Division of Biostatistics, School of Public Health, University of California, Berkeley, USA. mbirkner@stat.berkeley.edu

PMID: 16646865

Abstract

A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined technical repeats (2 per subject) of intensity versus m/z (mass/charge) of bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the type of data processing as well as experimental variability can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a sequence of processing and multiple testing techniques: 1) correction for background drift; 2) filtering using smooth regression with cross-validated bandwidth selection; 3) peak finding; and 4) correction for multiple testing (van der Laan et al. (2005)). The result is a list of proteins (indexed by m/z) whose average expression differs significantly among disease (or treatment, etc.) groups. The procedures are intended to provide a sensible and statistically driven algorithm, which we argue yields such a list. Provided there are no sources of unmeasured bias (such as confounding of experimental conditions with disease status), proteins found to be statistically significant using this technique have a low probability of being false positives.
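The four steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not specify the estimators, so every concrete choice here is an assumption: a running-minimum baseline for drift correction, a Gaussian-kernel Nadaraya-Watson smoother with leave-one-out cross-validation for the bandwidth, strict local maxima as peaks, and Holm's step-down procedure as a simple stand-in for the multiple testing methods of van der Laan et al. (2005). All function names, the synthetic spectrum, and the example p-values are hypothetical.

```python
import numpy as np

def subtract_baseline(y, window):
    """Crude drift correction (assumed): subtract a running minimum."""
    half = window // 2
    padded = np.pad(y, half, mode="edge")
    base = np.array([padded[i:i + window].min() for i in range(len(y))])
    return y - base

def nw_smooth(x, y, h, leave_one_out=False):
    """Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    if leave_one_out:
        np.fill_diagonal(w, 0.0)  # exclude each point from its own fit
    return (w @ y) / w.sum(axis=1)

def cv_bandwidth(x, y, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out squared error."""
    errors = [np.mean((y - nw_smooth(x, y, h, leave_one_out=True)) ** 2)
              for h in candidates]
    return candidates[int(np.argmin(errors))]

def find_peaks(y, min_height):
    """Indices of strict local maxima whose height exceeds min_height."""
    mid = y[1:-1]
    is_peak = (mid > y[:-2]) & (mid > y[2:]) & (mid > min_height)
    return np.flatnonzero(is_peak) + 1

def holm_reject(pvals, alpha=0.05):
    """Holm step-down procedure: mask of rejected hypotheses (controls FWER)."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(np.argsort(pvals)):
        if pvals[idx] > alpha / (m - rank):
            break  # stop at the first p-value that fails its threshold
        reject[idx] = True
    return reject

# Synthetic spectrum (hypothetical data): decaying drift, two peaks, noise.
rng = np.random.default_rng(0)
mz = np.linspace(2000.0, 2200.0, 201)
drift = 5.0 * np.exp(-(mz - 2000.0) / 100.0)
signal = (10.0 * np.exp(-0.5 * ((mz - 2080.0) / 3.0) ** 2)
          + 6.0 * np.exp(-0.5 * ((mz - 2150.0) / 3.0) ** 2))
raw = drift + signal + rng.normal(0.0, 0.3, mz.size)

corrected = subtract_baseline(raw, window=51)
h = cv_bandwidth(mz, corrected, candidates=[0.5, 1.0, 2.0, 4.0, 8.0])
smoothed = nw_smooth(mz, corrected, h)
peaks = find_peaks(smoothed, min_height=2.0)
print("bandwidth:", h, "peak m/z:", mz[peaks])

# Hypothetical per-peak p-values from a group comparison (AML vs. ALL).
print(holm_reject([0.001, 0.04, 0.03, 0.5]))
```

In a real analysis the p-values would come from a per-peak test comparing the two groups, with technical repeats handled at the subject level; the sketch only shows how the stages connect.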

Publication types

Evaluation Study
Research Support, N.I.H., Extramural

MeSH terms

Acute Disease
Algorithms
Bone Marrow Cells / metabolism
Child
Data Interpretation, Statistical
Humans
Leukemia, Myeloid / metabolism*
Neoplasm Proteins / metabolism*
Precursor Cell Lymphoblastic Leukemia-Lymphoma / metabolism*
Probability
Proteomics / methods*
Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods*

Substances

Neoplasm Proteins
