Accurate proteome-wide missense variant effect prediction with AlphaMissense

Science. 2023 Sep 22;381(6664):eadg7492. doi: 10.1126/science.adg7492. Epub 2023 Sep 22.

Abstract

The vast majority of missense variants observed in the human genome are of unknown clinical significance. We present AlphaMissense, an adaptation of AlphaFold fine-tuned on human and primate variant population frequency databases to predict missense variant pathogenicity. By combining structural context and evolutionary conservation, our model achieves state-of-the-art results across a wide range of genetic and experimental benchmarks, all without explicitly training on such data. The average pathogenicity score of a gene is also predictive of its cell essentiality and can identify short essential genes that existing statistical approaches are underpowered to detect. As a resource to the community, we provide a database of predictions for all possible human single amino acid substitutions and classify 89% of missense variants as either likely benign or likely pathogenic.
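
The two summary statistics mentioned in the abstract, a per-gene average pathogenicity score and a three-way classification of variants, can be illustrated with a minimal sketch. This is not the authors' code. The cutoffs `BENIGN_MAX` and `PATHOGENIC_MIN`, the function names, and the toy records are all assumptions made for illustration; the abstract does not state the thresholds, so the values documented with the released predictions should be substituted.

```python
from collections import defaultdict
from statistics import mean

# Assumed cutoffs, for illustration only; the abstract does not state them.
BENIGN_MAX = 0.34
PATHOGENIC_MIN = 0.564


def classify(score, benign_max=BENIGN_MAX, pathogenic_min=PATHOGENIC_MIN):
    """Map a pathogenicity score in [0, 1] to one of three classes."""
    if score < benign_max:
        return "likely_benign"
    if score > pathogenic_min:
        return "likely_pathogenic"
    return "ambiguous"


def gene_mean_scores(records):
    """Average the per-variant scores of each gene.

    `records` is an iterable of (gene, variant, score) tuples.
    """
    by_gene = defaultdict(list)
    for gene, _variant, score in records:
        by_gene[gene].append(score)
    return {gene: mean(scores) for gene, scores in by_gene.items()}


def fraction_classified(records):
    """Fraction of variants that fall outside the ambiguous band."""
    labels = [classify(score) for _gene, _variant, score in records]
    return sum(label != "ambiguous" for label in labels) / len(labels)


# Made-up toy records (hypothetical genes, variants, and scores).
records = [
    ("GENE_A", "A10V", 0.10),
    ("GENE_A", "R25W", 0.90),
    ("GENE_B", "G5D", 0.45),
    ("GENE_B", "L7P", 0.95),
]

if __name__ == "__main__":
    for gene, avg in gene_mean_scores(records).items():
        print(gene, round(avg, 3))
    print("fraction classified:", fraction_classified(records))
```

Under these assumptions, a gene with a higher mean score would be a stronger candidate for essentiality, and `fraction_classified` is the analogue of the 89% figure computed over whatever set of variants is supplied.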

MeSH terms

  • Amino Acid Substitution* / genetics
  • Benchmarking
  • Conserved Sequence
  • Databases, Genetic
  • Disease* / genetics
  • Genome, Human
  • Humans
  • Machine Learning
  • Mutation, Missense*
  • Protein Conformation
  • Proteome* / genetics
  • Sequence Alignment* / methods

Substances

  • Proteome