ISMI-VAE: A deep learning model for classifying disease cells using gene expression and SNV data

Comput Biol Med. 2024 Jun:175:108485. doi: 10.1016/j.compbiomed.2024.108485. Epub 2024 Apr 16.

Abstract

Various studies have linked several diseases, including cancer and COVID-19, to single nucleotide variations (SNVs). Although single-cell RNA sequencing (scRNA-seq) can provide both SNV and gene expression data, few studies have integrated and analyzed these two modalities together. To address this gap, we introduce Interpretable Single-cell Multimodal Data Integration Based on Variational Autoencoder (ISMI-VAE). ISMI-VAE uses latent variable models tailored to the characteristics of SNV and gene expression data to overcome high noise levels, and uses deep learning techniques to integrate the multimodal information, map it to a low-dimensional space, and classify disease cells. In addition, ISMI-VAE introduces an attention mechanism that reflects feature importance and supports the analysis of genetic features that could potentially cause disease. Experimental results on three cancer data sets and one COVID-19 data set demonstrate that ISMI-VAE surpasses the baseline method in both effectiveness and interpretability and can effectively identify disease-causing gene features.
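
The abstract does not give architectural details, so the following is only a minimal illustrative sketch of the general pattern it describes: feature-level attention over concatenated gene expression and SNV inputs, a variational encoder that maps each cell to a low-dimensional latent space, and a classifier on the latent code. The layer shapes, the linear encoder, the log1p transform, the binary SNV encoding, and all names (init_params, forward, attn_logits, W_mu, W_logvar, W_cls) are assumptions made for illustration and are not taken from ISMI-VAE itself. Weights are random and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def init_params(n_expr, n_snv, n_latent, n_classes):
    # Hypothetical parameter layout; a real model would use deeper networks.
    n_feat = n_expr + n_snv
    return {
        "attn_logits": np.zeros(n_feat),  # one attention score per input feature
        "W_mu": rng.normal(0.0, 0.1, (n_feat, n_latent)),
        "W_logvar": rng.normal(0.0, 0.1, (n_feat, n_latent)),
        "W_cls": rng.normal(0.0, 0.1, (n_latent, n_classes)),
    }

def forward(params, expr, snv):
    # Assumption: expression counts are log-transformed, SNVs are 0/1 indicators.
    x = np.concatenate([np.log1p(expr), snv], axis=1)
    # Feature attention: weights sum to 1 and can be read as feature importance.
    attn = softmax(params["attn_logits"])
    x_att = x * attn
    # Variational encoder: mean and log-variance of the latent Gaussian.
    mu = x_att @ params["W_mu"]
    logvar = x_att @ params["W_logvar"]
    # Reparameterization trick: z = mu + sigma * eps.
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # Classifier on the latent code (for example, disease vs. healthy cell).
    probs = softmax(z @ params["W_cls"], axis=1)
    # KL divergence of q(z|x) from the standard normal prior, per cell.
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    return z, probs, attn, kl

# Toy data: 5 cells, 8 genes, 6 SNV sites (all values synthetic).
expr = rng.poisson(2.0, size=(5, 8)).astype(float)
snv = rng.integers(0, 2, size=(5, 6)).astype(float)
params = init_params(n_expr=8, n_snv=6, n_latent=3, n_classes=2)
z, probs, attn, kl = forward(params, expr, snv)
print(z.shape, probs.shape, attn.shape)
```

In a trained model of this kind, the learned attention weights would be inspected after fitting to rank genes and SNV sites by importance; here they are uniform because the logits are initialized to zero.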

Keywords: Multimodal classification; SNV; scRNA-seq.

MeSH terms

  • Betacoronavirus / genetics
  • COVID-19* / genetics
  • Coronavirus Infections / genetics
  • Deep Learning*
  • Humans
  • Neoplasms* / genetics
  • Pandemics
  • Pneumonia, Viral / genetics
  • Polymorphism, Single Nucleotide
  • SARS-CoV-2* / genetics
  • Single-Cell Analysis / methods