Prospects of evolution-based artificial intelligence models in genome-wide studies to stratify genetic risk variants in nonalcoholic fatty liver disease

Ann Med Surg (Lond). 2023 May 8;85(6):2743-2748. doi: 10.1097/MS9.0000000000000743. eCollection 2023 Jun.

Abstract

Genome-wide association studies (GWAS) have identified genetic traits and polymorphisms associated with the progression of nonalcoholic fatty liver disease (NAFLD). Patatin-like phospholipase domain-containing 3 (PNPLA3) and transmembrane 6 superfamily member 2 (TM6SF2) are the genes most commonly associated with NAFLD phenotypes. However, lesser-known genes such as LYPLAL1 and glucokinase regulator (GCKR) have been examined in fewer studies, and their associations are less well replicated. With the advent of artificial intelligence (AI) in clinical genetics, studies have used AI algorithms to identify phenotypes from electronic health records (EHRs) and have used convolutional neural networks to improve the accuracy of variant identification, predict the deleterious effects of variants, and conduct phenotype-to-genotype mapping. Natural language processing (NLP) and machine-learning (ML) algorithms are popular tools in GWAS and connect EHR phenotypes to genetic diagnoses using a combination of International Classification of Diseases (ICD)-based approaches. However, ML- and NLP-based models still have limitations, such as poor replicability in larger cohorts and underpowered sample sizes, which prevent the accurate prediction of genetic variants that may increase the risk of NAFLD and its progression to advanced-stage liver fibrosis. This may be largely due to a limited understanding of the clinical consequences of the majority of pathogenic variants. Although evolution-based AI models and evolutionary algorithms are relatively new concepts, combining current ICD-based NLP models with phylogenetic and evolutionary data can improve prediction accuracy and create valuable connections between variants and their pathogenicity in NAFLD. Such developments can improve risk stratification in clinical genetics and research while overcoming the limitations of GWAS that prevent community-wide interpretations.

Keywords: artificial intelligence; genetics; machine-learning; natural language processing; nonalcoholic fatty liver disease.