A Gaussian process-based definition reveals new and bona fide genetic interactions compared to a multiplicative model in the Gram-negative Escherichia coli

Bioinformatics. 2020 Feb 1;36(3):880-889. doi: 10.1093/bioinformatics/btz673.

Abstract

Motivation: A digenic genetic interaction (GI) is observed when mutations in two genes within the same organism yield a phenotype that differs from the one expected given each mutation's individual effects. Multiplicative scoring is widely applied to define GIs and thereby reveal underlying gene functions, but it remains unclear whether it is the most suitable choice for scoring GIs in Escherichia coli. Here, we assess many different definitions, including the multiplicative model, for mapping functional links between genes and pathways in E. coli.
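To make the multiplicative definition concrete, the following Python sketch scores a gene pair as the observed double-mutant fitness minus the product of the two single-mutant fitnesses. The abstract does not state the formula, so this follows the convention commonly used in the GI literature; the function names, the fitness scale (wild type = 1.0) and the cutoff of 0.1 are illustrative assumptions, not values taken from the paper.

```python
def multiplicative_gi_score(w_a, w_b, w_ab):
    """Observed minus expected double-mutant fitness.

    Assumes fitness is measured relative to wild type (wild type = 1.0).
    Under the multiplicative model, the expected double-mutant fitness
    is the product of the two single-mutant fitnesses.
    """
    expected = w_a * w_b
    return w_ab - expected


def classify_gi(score, cutoff=0.1):
    """Label a score; the cutoff is a hypothetical value for illustration."""
    if score <= -cutoff:
        return "negative"  # double mutant is sicker than expected
    if score >= cutoff:
        return "positive"  # double mutant is fitter than expected
    return "neutral"       # no interaction at this cutoff


if __name__ == "__main__":
    # Made-up fitness values: (single A, single B, double AB)
    examples = [(0.9, 0.8, 0.30), (0.9, 0.8, 0.72), (0.9, 0.8, 0.90)]
    for w_a, w_b, w_ab in examples:
        score = multiplicative_gi_score(w_a, w_b, w_ab)
        print(f"{w_a} {w_b} {w_ab} -> {score:+.2f} {classify_gi(score)}")
```

Any alternative definition, such as one based on a learned model, would replace only the `expected` term; the comparison of observed against expected fitness stays the same.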

Results: Using our published E. coli GI datasets, we show computationally that a machine learning Gaussian process (GP)-based definition identifies functional associations among genes better than a multiplicative model, a finding we have confirmed experimentally on a set of gene pairs. Overall, the GP definition improves the detection of GIs, the biological interpretation of epistatic connectivity, and the quality of GI maps in E. coli and, potentially, other microbes.

Availability and implementation: The source code and parameters used to generate the machine learning models in the WEKA software are provided in the Supplementary information.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Epistasis, Genetic*
  • Escherichia coli / genetics*
  • Normal Distribution
  • Phenotype
  • Software