GAiN: An integrative tool utilizing generative adversarial neural networks for augmented gene expression analysis

Patterns (N Y). 2024 Jan 8;5(2):100910. doi: 10.1016/j.patter.2023.100910. eCollection 2024 Feb 9.

Abstract

Big genomic data and artificial intelligence (AI) are ushering in an era of precision medicine, providing opportunities to study previously under-represented subtypes and rare diseases rather than categorize them as variances. However, clinical researchers face challenges in accessing such novel technologies as well as reliable methods to study small datasets or subcohorts with unique phenotypes. To address this need, we developed an integrative approach, GAiN, to capture patterns of gene expression from small datasets on the basis of an ensemble of generative adversarial networks (GANs) while leveraging big population data. Where conventional biostatistical methods fail, GAiN reliably discovers differentially expressed genes (DEGs) and enriched pathways between two cohorts with limited numbers of samples (n = 10) when benchmarked against a gold standard. GAiN is freely available on GitHub. Thus, GAiN may serve as a crucial tool for gene expression analysis in scenarios with limited samples, as in the context of rare diseases, under-represented populations, or limited investigator resources.

Keywords: deep learning GANs; differential gene expression; gene expression analysis; generative modeling; high-throughput sequencing data; pathway enrichment; small sample sizes; structural gene expression patterns; synthetic RNA expression datasets.
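
Illustrative sketch (not part of GAiN): the abstract refers to discovering DEGs between two cohorts of about 10 samples each. To make that task concrete, the following shows one generic, conventional way such a comparison could be set up: a per-gene permutation test on the difference in means, followed by Benjamini-Hochberg adjustment. It does not reproduce GAiN's GAN ensemble or its use of population data. The function names (`permutation_pvalue`, `benjamini_hochberg`, `call_degs`), the dict-of-lists input layout, and the assumption that expression values are already on a log2 scale are all hypothetical choices made for this example.

```python
import random
from statistics import mean


def permutation_pvalue(a, b, n_perm=2000, rng=None):
    """Two-sided permutation p-value for a difference in group means."""
    rng = rng or random.Random(0)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign samples to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(cohort_a, cohort_b, alpha=0.05, seed=0):
    """Flag genes whose adjusted p-value is at most alpha.

    cohort_a / cohort_b: dict mapping gene name -> list of expression
    values, assumed (for this sketch) to be on a log2 scale, so that a
    difference in means can be read as a log2 fold change.
    """
    rng = random.Random(seed)
    genes = sorted(cohort_a)
    pvals = [permutation_pvalue(cohort_a[g], cohort_b[g], rng=rng) for g in genes]
    qvals = benjamini_hochberg(pvals)
    return {
        g: {"log2fc": mean(cohort_b[g]) - mean(cohort_a[g]), "q": q}
        for g, q in zip(genes, qvals)
        if q <= alpha
    }
```

With only 10 samples per cohort, a baseline of this kind has limited power once thousands of genes are tested at the same time, which is the small-sample setting the abstract describes.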