Identification of key candidate genes for IgA nephropathy using machine learning and statistics based bioinformatics models

Md Al Mehedi Hasan; Md Maniruzzaman; Jungpil Shin

doi:10.1038/s41598-022-18273-x

Sci Rep. 2022 Aug 17;12(1):13963. doi: 10.1038/s41598-022-18273-x.

Authors

Md Al Mehedi Hasan¹, Md Maniruzzaman¹ ², Jungpil Shin³

Affiliations

¹ School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, Fukushima, 965-8580, Japan.
² Statistics Discipline, Khulna University, Khulna, 9208, Bangladesh.
³ School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, Fukushima, 965-8580, Japan. jpshin@u-aizu.ac.jp.

Abstract

Immunoglobulin A nephropathy (IgAN) is a kidney disease caused by the accumulation of IgA deposits in the kidneys, which leads to inflammation and damage to kidney tissue. Bioinformatics approaches are widely used to predict novel candidate genes and pathways associated with IgAN. However, the molecular mechanisms underlying the development and progression of IgAN are still not clearly understood. Therefore, the present study aimed to identify key candidate genes for IgAN using machine learning (ML) and statistics-based bioinformatics models. First, differentially expressed genes (DEGs) were identified using limma, and enrichment analysis was then performed on the DEGs using DAVID. A protein-protein interaction (PPI) network was constructed using STRING, and Cytoscape was used to determine hub genes based on connectivity, as well as hub modules based on MCODE scores, together with their associated genes from the DEGs. Furthermore, ML-based algorithms, namely support vector machine (SVM), least absolute shrinkage and selection operator (LASSO), and partial least squares discriminant analysis (PLS-DA), were applied to identify the discriminative genes of IgAN from the DEGs. Finally, the key candidate genes (FOS, JUN, EGR1, FOSB, and DUSP1) were identified as the genes common to the selected hub genes, the hub module genes, and the discriminative genes from SVM, LASSO, and PLS-DA. These genes can be used for the diagnosis and treatment of IgAN.
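
The final step described in the abstract, taking the genes common to every selection method, can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `overlapping_genes`, the dictionary keys, and the `PLACEHOLDER_*` entries are hypothetical and stand in for the actual per-method gene lists, which the abstract does not give. Only the five key candidate genes are taken from the abstract.

```python
# Minimal sketch of the overlap step, assuming each method yields a set of gene symbols.

# The five key candidate genes reported in the abstract.
KEY_GENES = {"FOS", "JUN", "EGR1", "FOSB", "DUSP1"}


def overlapping_genes(gene_sets):
    """Return the genes present in every selection, sorted alphabetically."""
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Hypothetical selections: every "PLACEHOLDER_*" name is invented for
# illustration and is not a real gene or a result from the study.
selections = {
    "hub_genes": KEY_GENES | {"PLACEHOLDER_1", "PLACEHOLDER_2"},
    "hub_module_genes": KEY_GENES | {"PLACEHOLDER_2", "PLACEHOLDER_3"},
    "svm": KEY_GENES | {"PLACEHOLDER_1", "PLACEHOLDER_4"},
    "lasso": KEY_GENES | {"PLACEHOLDER_3"},
    "pls_da": KEY_GENES | {"PLACEHOLDER_4", "PLACEHOLDER_5"},
}

key_candidates = overlapping_genes(selections)
print(key_candidates)  # ['DUSP1', 'EGR1', 'FOS', 'FOSB', 'JUN']
```

A strict intersection keeps only genes supported by all five selections, so a gene missed by any single method is dropped; whether the study applied exactly this rule is an assumption based on the abstract's wording.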

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology*
Gene Expression Profiling
Glomerulonephritis, IGA* / genetics
Humans
Machine Learning