New approaches to predict the effect of co-occurring variants on protein characteristics

Am J Hum Genet. 2021 Aug 5;108(8):1502-1511. doi: 10.1016/j.ajhg.2021.06.011. Epub 2021 Jul 12.

Abstract

Predicting the effect of a mutated gene before the onset of symptoms of genetic diseases would greatly facilitate diagnosis and enable early intervention. There have been myriad attempts to predict the effects of single-nucleotide variants. However, these efforts do not scale to co-occurring variants. Furthermore, an increasing number of protein therapeutics contain co-occurring nucleotide variations, adding uncertainty during development about the safety and efficiency of these drugs. Co-occurring nucleotide variants can have synergistic, additive, or antagonistic effects on protein attributes, further complicating the task of outcome prediction. We tested four models based on the cooperative and antagonistic effects of co-occurring variants to predict pathogenicity and effectiveness of protein therapeutics. A total of 30 attributes, including amino acid and nucleotide features, as well as existing single-variant effect prediction tools, were considered on the basis of previous studies on single-nucleotide variants. Importantly, the effects of synonymous variants, often seen in protein therapeutics, were also included in our models. We used 12 datasets of people with monogenic diseases and controls with co-occurring genetic variants to evaluate our models, achieving accuracy comparable to that of prediction tools for single-nucleotide variants. More importantly, our framework is generalizable to new, well-curated datasets of monogenic diseases and new variant scoring tools. This approach helps address the challenging task of predicting the effect of co-occurring variants on pathogenicity and protein effectiveness and is applicable to a wide range of protein therapeutics and genetic diseases.
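
The distinction between synergistic, additive, and antagonistic effects can be illustrated with a minimal sketch. This is not one of the four models tested in the paper, which the abstract does not specify. It assumes each variant already has a single-variant score in [0, 1], and the function name `combine_scores` and the `interaction` parameter are hypothetical names introduced only for this illustration. A positive `interaction` pushes the combined score above the plain sum (synergistic), zero leaves the sum unchanged (additive), and a negative value pulls it below the sum (antagonistic).

```python
from itertools import combinations


def combine_scores(scores, interaction=0.0):
    """Combine per-variant scores into one score in [0, 1].

    Illustrative assumption, not the paper's method:
    combined = sum(scores) + interaction * sum of pairwise products,
    clipped to the range [0, 1].
    """
    additive = sum(scores)
    # Pairwise products stand in for variant-variant interactions.
    pairwise = sum(a * b for a, b in combinations(scores, 2))
    return max(0.0, min(1.0, additive + interaction * pairwise))


# Two hypothetical co-occurring variants with made-up scores.
variant_scores = [0.2, 0.3]
additive = combine_scores(variant_scores, interaction=0.0)
synergistic = combine_scores(variant_scores, interaction=1.0)
antagonistic = combine_scores(variant_scores, interaction=-1.0)
```

With a single variant there are no pairs, so the combined score reduces to that variant's own score regardless of `interaction`.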

Keywords: SNV; cooperative and antagonistic effects; datasets of monogenic diseases; effect of co-occurring genetic variants; pathogenicity and protein effectiveness; prediction tool; single nucleotide variant; synonymous variants.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Computational Biology / methods*
  • Disease / genetics*
  • Genome, Human*
  • Humans
  • Mutation*
  • Polymorphism, Single Nucleotide*
  • Proteome / analysis*
  • Proteome / metabolism

Substances

  • Proteome