PEDF, a pleiotropic WTC-LI biomarker: Machine learning biomarker identification and validation

PLoS Comput Biol. 2021 Jul 21;17(7):e1009144. doi: 10.1371/journal.pcbi.1009144. eCollection 2021 Jul.

Abstract

Biomarkers predict World Trade Center-Lung Injury (WTC-LI); however, multicollinearity remains unaddressed in the serum cytokine, chemokine, and high-throughput platform datasets used to phenotype WTC-disease. To address this concern, we applied automated, machine-learning-based pruning of high-dimensional data and validated the identified biomarkers. The parent cohort consisted of male, never-smoking firefighters with WTC-LI (FEV1, %Pred < lower limit of normal (LLN); n = 100) and controls (n = 127), all of whom had biomarkers assessed. Cases and controls (n = 15/group) underwent untargeted metabolomics; feature selection was then performed on metabolites, cytokines, chemokines, and clinical data. Cytokines, chemokines, and clinical biomarkers were validated in the non-overlapping parent cohort via binary logistic regression with 5-fold cross-validation. Random forests of metabolites (n = 580), clinical biomarkers (n = 5), and previously assayed cytokines and chemokines (n = 106) identified the top 5% of biomarkers important to class separation, which included pigment epithelium-derived factor (PEDF), macrophage-derived chemokine (MDC), systolic blood pressure, macrophage inflammatory protein-4 (MIP-4), growth-regulated oncogene protein (GRO), monocyte chemoattractant protein-1 (MCP-1), apolipoprotein-AII (Apo-AII), cell membrane metabolites (sphingolipids, phospholipids), and branched-chain amino acids. Models validated via confounder-adjusted (age on 9/11, BMI, exposure, and pre-9/11 FEV1, %Pred) binary logistic regression had an AUCROC of 0.90 (0.84-0.96). Decreased PEDF and MIP-4 and increased Apo-AII were associated with increased odds of WTC-LI. Increased GRO and MCP-1 with simultaneously decreased MDC were associated with decreased odds of WTC-LI. In conclusion, automated data pruning identified novel WTC-LI biomarkers; performance was validated in an independent cohort.
One biomarker, PEDF (an antiangiogenic agent), is a novel, predictive biomarker of particulate-matter-related lung disease. Other biomarkers (GRO, MCP-1, MDC, MIP-4) reveal immune cell involvement in WTC-LI pathogenesis. The findings of our automated biomarker identification warrant further investigation into these potential pharmacotherapy targets.
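
The validation step described in the abstract (binary logistic regression scored by 5-fold cross-validated AUCROC) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline. The function names, the two simulated features, the gradient-descent fitting, and the unstratified folds are all assumptions made for the example. The group sizes (100 cases, 127 controls) are borrowed from the abstract only to size the simulation. Confounder adjustment and the random forest feature selection are omitted.

```python
import math
import random


def predict(w, x):
    """Logistic model probability; w = [bias, w1, w2, ...]."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic log-loss (no regularization)."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def auroc(labels, scores):
    """AUROC via pairwise comparison of cases and controls; ties count 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auroc(X, y, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, pool the scores, then compute AUROC."""
    folds = kfold_indices(len(X), k, random.Random(seed))
    scores = [0.0] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            scores[i] = predict(w, X[i])
    return auroc(y, scores)


def synthetic_cohort(seed=0):
    """Simulated data only: one marker that is lower in cases, plus a pure-noise feature."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, 100), (0, 127)):  # group sizes taken from the abstract
        for _ in range(n):
            marker = rng.gauss(-1.0 if label == 1 else 0.0, 1.0)
            noise = rng.gauss(0.0, 1.0)
            X.append([marker, noise])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = synthetic_cohort()
    print("5-fold cross-validated AUROC:", round(cross_validated_auroc(X, y), 3))
    w = fit_logistic(X, y)
    print("Odds ratio per unit increase in marker:", round(math.exp(w[1]), 3))
```

In this simulation the fitted coefficient for the marker is negative, so its odds ratio is below 1. That is the same direction of association the abstract reports for PEDF and MIP-4: lower levels correspond to higher odds of being a case. A real analysis would more likely use an established statistics library, stratified folds, and the confounders named in the abstract.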

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Biomarkers / blood
  • Eye Proteins / blood*
  • Firefighters
  • Humans
  • Inhalation Exposure / statistics & numerical data
  • Longitudinal Studies
  • Lung Injury* / blood
  • Lung Injury* / diagnosis
  • Lung Injury* / epidemiology
  • Lung Injury* / etiology
  • Machine Learning*
  • Male
  • Middle Aged
  • Models, Statistical
  • Nerve Growth Factors / blood*
  • Occupational Diseases* / blood
  • Occupational Diseases* / epidemiology
  • Occupational Diseases* / etiology
  • Reproducibility of Results
  • Sensitivity and Specificity
  • September 11 Terrorist Attacks*
  • Serpins / blood*

Substances

  • Biomarkers
  • Eye Proteins
  • Nerve Growth Factors
  • Serpins
  • pigment epithelium-derived factor