EpiMC: Detecting Epistatic Interactions Using Multiple Clusterings

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):243-254. doi: 10.1109/TCBB.2021.3080462. Epub 2022 Feb 3.

Abstract

Detecting interactions between single nucleotide polymorphisms (SNPs) is crucial for identifying susceptibility genes associated with complex human diseases in genome-wide association studies. Clustering-based approaches are widely used to reduce the search space and explore potential relationships between SNPs in epistasis analysis. However, these approaches all rely on a single measure to filter out nonsignificant SNP combinations, and combinations discarded under one measure may be significant from another perspective. In this paper, we propose a two-stage approach named EpiMC (Epistatic Interactions detection based on Multiple Clusterings) that employs multiple clusterings to obtain more precise candidate sets and to detect high-order interactions more comprehensively based on these sets. In the first stage, EpiMC uses a matrix factorization based multiple clusterings algorithm to generate multiple diverse clusterings, each of which divides all SNPs into different clusters. This stage aims to reduce the chance of filtering out potential candidates overlooked by a single clustering, and it groups associated SNPs together from different clustering perspectives. In the second stage, EpiMC considers both single-locus effects and interaction effects to select high-quality disease-associated SNPs, and then uses Jaccard similarity to obtain candidate sets. Finally, EpiMC performs an exhaustive search on the resulting small candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpiMC performs better in detecting high-order interactions than state-of-the-art solutions. On the Wellcome Trust Case Control Consortium (WTCCC) dataset, EpiMC detects several significant epistatic interactions associated with breast cancer (BC) and age-related macular degeneration (AMD), which further corroborates the effectiveness of EpiMC.
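
The abstract does not specify how Jaccard similarity is turned into candidate sets, so the following is only a minimal Python sketch of one plausible reading, not the authors' implementation. It assumes that clusters from two different clusterings are compared pairwise, that pairs whose Jaccard similarity reaches a threshold are merged into a candidate set, and that the exhaustive search then enumerates every k-SNP combination within each candidate set. The function names, the threshold of 0.5, the merge-by-union rule, and the SNP identifiers are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two SNP collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_sets(clustering_a, clustering_b, threshold=0.5):
    """Merge similar clusters from two clusterings into candidate sets.

    Assumption (not from the paper): a pair of clusters is kept when its
    Jaccard similarity is at least `threshold`, and the candidate set is
    the union of the two clusters.
    """
    candidates = []
    for cluster_a in clustering_a:
        for cluster_b in clustering_b:
            if jaccard(cluster_a, cluster_b) >= threshold:
                candidates.append(sorted(set(cluster_a) | set(cluster_b)))
    return candidates


def enumerate_interactions(candidate, order=2):
    """List every SNP combination of the given order in one candidate set.

    In a real pipeline each combination would be scored with an
    association test; scoring is omitted here.
    """
    return list(combinations(candidate, order))


# Two hypothetical clusterings of the same five SNPs.
clustering_a = [["rs1", "rs2", "rs3"], ["rs4", "rs5"]]
clustering_b = [["rs1", "rs2"], ["rs3", "rs4", "rs5"]]

for candidate in candidate_sets(clustering_a, clustering_b):
    print(candidate, enumerate_interactions(candidate, order=2))
```

Because each candidate set is small, the number of combinations to test stays manageable even for higher orders, which is the motivation the abstract gives for restricting the exhaustive search to these sets.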

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Epistasis, Genetic* / genetics
  • Genome-Wide Association Study*
  • Humans
  • Polymorphism, Single Nucleotide / genetics