DA-SRN: Omics data analysis based on the sample network optimization for complex diseases

Comput Biol Med. 2023 Sep:164:107252. doi: 10.1016/j.compbiomed.2023.107252. Epub 2023 Jul 8.

Abstract

Effective biomarker identification and accurate sample label prediction remain challenging for complex diseases. Patient similarity network (PSN) analysis is a powerful tool in disease omics data analysis. The topology of a PSN can reflect the discriminative ability of the feature space on which the sample network is built. In this study, a novel omics data analysis method based on the sample reference network (DA-SRN) is proposed to identify potential biomarkers and predict sample categories. DA-SRN defines the informative features and the sample reference network by optimizing the network structure with a genetic algorithm. It then labels the samples using a graph neural network, the reference network and the selected informative features. DA-SRN was validated by comparison with nine efficient omics data analysis methods on genomics, metabolomics and transcriptomics datasets. The comparison showed that it outperformed the other methods in area under the receiver operating characteristic curve (AUROC), sensitivity, specificity and area under the precision-recall curve (AUPRC) in most cases. In addition, the important metabolites identified by DA-SRN in the type 2 diabetes (T2D) metabolomics data were further examined. Pathway analysis revealed close relationships between the identified metabolites and critical metabolic pathways related to the occurrence and development of T2D. The experimental results illustrate that DA-SRN can extract valuable information from complex omics data by analyzing the relationships among samples, and is promising for biomarker identification and sample discrimination in complex diseases.
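To make concrete the idea that network topology reflects the discriminative ability of a feature space, the following is a minimal sketch and not the DA-SRN implementation. It assumes a k-nearest-neighbour sample network built on Euclidean distance and uses edge homophily (the fraction of edges joining same-label samples) as a stand-in structural score. The abstract does not specify the network construction or the objective optimized by the genetic algorithm, so the function names `knn_network` and `edge_homophily`, the choice of k, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def knn_network(X, k):
    """Symmetric boolean adjacency linking each sample to its k nearest neighbours.

    Assumption: Euclidean distance and kNN construction are illustrative choices.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbour
    n = X.shape[0]
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = True
    return adj | adj.T  # make the network undirected

def edge_homophily(adj, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    i, j = np.nonzero(np.triu(adj, k=1))  # count each undirected edge once
    if i.size == 0:
        return 0.0
    return float(np.mean(labels[i] == labels[j]))

# Synthetic data (hypothetical): one feature separates the classes, one is noise.
rng = np.random.default_rng(0)
labels = np.array([0] * 20 + [1] * 20)
informative = labels[:, None] * 10.0 + rng.normal(0.0, 0.5, size=(40, 1))
noise = rng.normal(0.0, 1.0, size=(40, 1))

adj_informative = knn_network(informative, k=3)
adj_noise = knn_network(noise, k=3)
h_informative = edge_homophily(adj_informative, labels)
h_noise = edge_homophily(adj_noise, labels)
print(f"homophily, informative feature: {h_informative:.2f}")
print(f"homophily, noise feature:       {h_noise:.2f}")
```

In a sketch like this, a search procedure such as a genetic algorithm could score candidate feature subsets by a structural measure of the resulting network; which measure DA-SRN actually uses is not stated in the abstract.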

Keywords: Biomarker identification; Complex diseases; Graph neural network; Omics data analysis; Sample network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers / analysis
  • Diabetes Mellitus, Type 2* / genetics
  • Genomics
  • Humans
  • Metabolomics / methods
  • Neural Networks, Computer

Substances

  • Biomarkers