Protein-protein correlations based variable dimension expansion algorithm for high efficient serum biomarker discovery

Anal Chim Acta. 2020 Jul 4:1119:25-34. doi: 10.1016/j.aca.2020.04.013. Epub 2020 Apr 12.

Abstract

In this study, we constructed a highly specific and efficient serum biomarker discovery pipeline. As biomarker candidates, we used dysregulated proteins that were identified in primary tissue and are potentially secreted into the blood. Scheduled multiple reaction monitoring was used to accurately quantify and verify these candidates directly in serum, thus circumventing the effects of high-abundance proteins. We then generated new variables by assigning values to protein-protein correlations (PPC-VDE), which extends the dimensionality of the dataset and increases the specificity of disease classification. We successfully applied this pipeline to biomarker discovery for dilated cardiomyopathy and, with machine learning, achieved 88.6% accuracy in classifying dilated cardiomyopathy, ischemic cardiomyopathy and healthy controls. The pipeline is straightforward to apply to biomarker discovery across a broad range of clinical fields.
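
The abstract does not state how values are assigned to protein-protein correlations, so the following is only a minimal sketch of the general idea of dimension expansion, not the authors' algorithm. It assumes, purely for illustration, that each pair of quantified proteins contributes one extra variable equal to the log2 ratio of their abundances. The function name `expand_pairwise`, the protein labels `P1` to `P3`, the abundance values, and the `eps` offset are all hypothetical.

```python
from itertools import combinations
from math import log2


def expand_pairwise(sample, eps=1e-9):
    """Return the original variables plus one derived variable per protein pair.

    ASSUMPTION: the derived value is the log2 abundance ratio of the pair.
    The source does not specify the scoring rule used in PPC-VDE.
    """
    expanded = dict(sample)  # keep the original per-protein abundances
    for a, b in combinations(sorted(sample), 2):
        # eps avoids division by zero and log of zero for undetected proteins
        expanded[f"{a}/{b}"] = log2((sample[a] + eps) / (sample[b] + eps))
    return expanded


# Hypothetical serum abundances for one subject (arbitrary units)
sample = {"P1": 8.0, "P2": 2.0, "P3": 4.0}
features = expand_pairwise(sample)
```

Under this assumption, n quantified proteins yield n + n(n-1)/2 variables per subject, so a small verified panel can still give a classifier a much wider feature set to work with.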

Keywords: 5′-nucleotidase; Dilated cardiomyopathy; Heart failure; Machine learning; Scheduled multiple reaction monitoring.

MeSH terms

  • Algorithms*
  • Biomarkers / blood
  • Databases, Protein
  • Humans
  • Machine Learning
  • Protein Binding
  • Proteins / analysis*
  • Proteomics

Substances

  • Biomarkers
  • Proteins