pseudoQC: A Regression-Based Simulation Software for Correction and Normalization of Complex Metabolomics and Proteomics Datasets

Shisheng Wang; Hao Yang

doi:10.1002/pmic.201900264

Proteomics. 2019 Oct;19(19):e1900264. doi: 10.1002/pmic.201900264. Epub 2019 Sep 18.

Authors

Shisheng Wang¹, Hao Yang¹

Affiliation

¹ West China-Washington Mitochondria and Metabolism Research Center, Key Lab of Transplant Engineering and Immunology, MOH, West China Hospital, Keyuan South Road, Hi-Tech Zone, Chengdu, 610041, China.

PMID: 31474000

Abstract

Unwanted and uncontrollable signal variations in MS-based metabolomics and proteomics datasets severely impair the accuracy of metabolite and protein profiling. Pooled quality control (QC) samples are therefore often employed in quality management, which is indispensable to the success of metabolomics and proteomics experiments, especially in high-throughput and long-term projects. However, data consistency and QC sample stability remain difficult to guarantee because of the complexity of experimental operations and differences between experimenters. Worse, many proteomics projects do not include QC samples in the initial experimental design. Herein, an interactive web-based software tool, pseudoQC, is presented that simulates QC sample data for actual metabolomics and proteomics datasets using four different machine learning-based regression methods. The simulated data are used to correct and normalize two published datasets, and the results suggest that nonlinear regression methods perform better than linear ones. The software provides a web-based graphical user interface and can be used by scientists without a bioinformatics background. pseudoQC is open source and freely available at https://www.omicsolution.org/wukong/pseudoQC/.

Keywords: machine learning; metabolomics; proteomics; pseudo-quality control; regression.
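
Illustrative sketch (not part of the published record): the abstract reports that nonlinear regression performed better than linear regression when QC-derived data were used for correction. The abstract does not name the four regression methods or describe how pseudoQC simulates QC samples, so the code below is not pseudoQC's algorithm. It is a generic toy example of QC-based drift correction, assuming made-up QC intensities with a curved drift over injection order, an ordinary least-squares line as the linear model, and a Gaussian kernel smoother as a stand-in nonlinear model. All names, numbers, and the bandwidth are hypothetical.

```python
import math
import statistics

# Hypothetical QC intensities for one feature; the drift over injection
# order is deliberately nonlinear (assumed data, not from the paper).
qc_order = [0, 4, 8, 12, 16, 20]
qc_intensity = [1000 * (1 - 0.02 * t + 0.0008 * t * t) for t in qc_order]


def fit_linear(x, y):
    """Fit an ordinary least-squares line and return a predictor."""
    x_mean, y_mean = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda t: intercept + slope * t


def fit_kernel(x, y, bandwidth=3.0):
    """Return a Gaussian kernel smoother (a simple nonlinear regressor)."""
    def predict(t):
        weights = [math.exp(-0.5 * ((t - xi) / bandwidth) ** 2) for xi in x]
        return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    return predict


def correct(order, intensity, predictor, reference):
    """Divide by the predicted drift, then rescale to a reference level."""
    return [v / predictor(t) * reference for t, v in zip(order, intensity)]


def cv_percent(values):
    """Coefficient of variation in percent."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


reference = statistics.median(qc_intensity)
linear_model = fit_linear(qc_order, qc_intensity)
kernel_model = fit_kernel(qc_order, qc_intensity)

cv_raw = cv_percent(qc_intensity)
cv_linear = cv_percent(correct(qc_order, qc_intensity, linear_model, reference))
cv_kernel = cv_percent(correct(qc_order, qc_intensity, kernel_model, reference))

print(f"QC CV raw: {cv_raw:.2f}%")
print(f"QC CV after linear correction: {cv_linear:.2f}%")
print(f"QC CV after kernel correction: {cv_kernel:.2f}%")
```

On this toy input the kernel smoother leaves less residual variation in the QC values than the straight line, because a line cannot follow a curved drift. The comparison is in-sample and noise-free, so it only illustrates the idea and says nothing about performance on real datasets.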

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Cell Line
Computational Biology / methods*
Entropy
Humans
Internet
Metabolome
Metabolomics / methods*
Metabolomics / statistics & numerical data
Proteome / metabolism
Proteomics / methods*
Proteomics / statistics & numerical data
Reproducibility of Results
Software*

Substances

Proteome