Genome-wide association study-based prediction of atrial fibrillation using artificial intelligence

Open Heart. 2022 Jan;9(1):e001898. doi: 10.1136/openhrt-2021-001898.

Abstract

Objective: We previously reported genetic loci associated with early-onset atrial fibrillation (AF) in a Korean population. We explored whether AF-associated single-nucleotide polymorphisms (SNPs) selected from a genome-wide association study (GWAS) of a large external cohort have predictive power for AF in a Korean population when analysed with a convolutional neural network (CNN).

Methods: This study included 6358 subjects (872 cases, 5486 controls) from the Korean population GWAS data. We extracted lists of SNPs at each p value threshold of the association statistics from three previously reported ethnicity-specific GWASs. The Korean GWAS data were divided into training (64%), validation (16%) and test (20%) sets, and stratified K-fold cross-validation was performed and repeated five times after data shuffling.
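The split described above can be sketched as follows. The abstract does not state how the 64%/16%/20% proportions were produced; this sketch assumes a 5-fold outer split (20% test) followed by an inner 80/20 split of the remainder (64% training, 16% validation), which is one way those figures can arise. The function names `stratified_folds` and `train_val_test_splits` are hypothetical, and only the case and control counts come from the abstract.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while preserving the class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)  # shuffle within each class before dealing out
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


def train_val_test_splits(labels, k=5, repeats=5, seed=0):
    """Yield (train, val, test) index lists for repeated stratified k-fold CV.

    Assumption: the held-out fold is the test set (about 20%), and the
    remaining 80% is split again 80/20 into training and validation.
    """
    for r in range(repeats):
        folds = stratified_folds(labels, k, seed + r)  # reshuffle per repeat
        for i in range(k):
            test = folds[i]
            rest = [idx for j, f in enumerate(folds) if j != i for idx in f]
            inner = stratified_folds([labels[idx] for idx in rest], k, seed + r)
            val = [rest[p] for p in inner[0]]
            train = [rest[p] for f in inner[1:] for p in f]
            yield train, val, test


# Class counts from the abstract: 872 cases (1) and 5486 controls (0).
labels = [1] * 872 + [0] * 5486
splits = list(train_val_test_splits(labels))
```

With these defaults, `splits` holds 25 partitions (5 folds × 5 repeats), each keeping the case-to-control ratio roughly constant across the three sets.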

Results: The CNN-GWAS predicted AF with an area under the curve (AUC) of 0.78±0.01 based on the Japanese GWAS, 0.79±0.01 based on the European GWAS, and 0.82±0.01 based on the multiethnic GWAS. Gradient-weighted class activation mapping assigned high saliency scores to AF-associated SNPs, and PITX2 obtained the highest saliency score. The CNN-GWAS showed no predictive power for AF when using the subset of SNPs with non-significant p values (AUC 0.56±0.01), despite the larger number of SNPs. The CNN-GWAS had no predictive power for odd-even registration numbers (AUC 0.51±0.01).
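As an illustration of how an AUC reported as mean ± spread over cross-validation folds can be computed, the sketch below uses the rank-based (Mann-Whitney) formulation of the AUC. It is an assumption that the ± values are standard deviations across folds; the abstract does not say. The helper names `roc_auc` and `summarize` are hypothetical, and the numbers in the example are made up for illustration and are not study data.

```python
from statistics import mean, stdev


def roc_auc(y_true, scores):
    """AUC as the probability that a random case scores above a random
    control, with ties counted as one half (Mann-Whitney formulation)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def summarize(aucs):
    """Format per-fold AUCs as mean±SD (sample standard deviation)."""
    return f"{mean(aucs):.2f}±{stdev(aucs):.2f}"


# Illustrative values only: four subjects, two cases and two controls.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# Illustrative per-fold AUCs, not taken from the study.
example_summary = summarize([0.81, 0.82, 0.83])
```

An AUC near 0.5, as reported for the non-significant SNP subset and the odd-even registration control, corresponds to scores that rank cases above controls no more often than chance.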

Conclusions: AF can be predicted from genetic information alone with moderate accuracy. The CNN-GWAS can be a robust and useful tool for detecting polygenic diseases by capturing the cumulative effects and genetic interactions of moderately associated but statistically significant SNPs.

Trial registration number: NCT02138695.

Keywords: atrial fibrillation; genetics; genome-wide association study.

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Atrial Fibrillation / diagnosis*
  • Atrial Fibrillation / epidemiology
  • Atrial Fibrillation / genetics
  • DNA / genetics*
  • Female
  • Genetic Predisposition to Disease*
  • Genome-Wide Association Study
  • Homeobox Protein PITX2
  • Homeodomain Proteins / genetics*
  • Homeodomain Proteins / metabolism
  • Humans
  • Male
  • Middle Aged
  • Morbidity / trends
  • Polymorphism, Single Nucleotide*
  • Republic of Korea / epidemiology
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism

Substances

  • Homeodomain Proteins
  • Transcription Factors
  • DNA

Associated data

  • ClinicalTrials.gov/NCT02138695