UniProt genomic mapping for deciphering functional effects of missense variants

Hum Mutat. 2019 Jun;40(6):694-705. doi: 10.1002/humu.23738. Epub 2019 Apr 3.

Abstract

Understanding the association of genetic variation with its functional consequences in proteins is essential for interpreting genomic data and identifying causal variants in diseases. Integration of protein function knowledge with genome annotation can assist in rapidly comprehending genetic variation within complex biological processes. Here, we describe the mapping of UniProtKB human sequences and positional annotations, such as active sites, binding sites, and variants, to the human genome (GRCh38) and the release of a public genome track hub for genome browsers. To demonstrate the power of combining protein annotations with genome annotations for functional interpretation of variants, we present specific biological examples in disease-related genes and proteins. Computational comparisons of UniProtKB annotations and protein variants with ClinVar clinically annotated single nucleotide polymorphism (SNP) data show that 32% of UniProtKB variants colocate with 8% of ClinVar SNPs. The majority of colocated UniProtKB disease-associated variants (86%) map to 'pathogenic' ClinVar SNPs. UniProt and ClinVar are collaborating to provide a unified clinical variant annotation for genomic, protein, and clinical researchers. The genome track hubs, and related UniProtKB files, are downloadable from the UniProt FTP site and discoverable as public track hubs at the UCSC and Ensembl genome browsers.
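
As a rough illustration of what mapping a positional protein annotation to the genome involves, the sketch below converts a 1-based residue number into the three genomic positions of its codon, given the coding exon segments of a transcript. It is not the UniProt pipeline: the function name `codon_genomic_positions`, the exon coordinates, and the assumption of 1-based inclusive coordinates with no frameshifts or sequence mismatches are all hypothetical choices made for the example.

```python
def codon_genomic_positions(cds_exons, strand, residue):
    """Return the three 1-based genomic positions encoding a 1-based residue.

    cds_exons: list of (start, end) coding segments, 1-based inclusive,
               in genomic coordinates (hypothetical input format).
    strand:    "+" or "-".
    """
    # Walk the exons in the direction of transcription.
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    first = (residue - 1) * 3  # 0-based CDS offset of the codon's first base
    positions = []
    offset = 0  # number of CDS bases consumed by earlier exons
    for start, end in ordered:
        length = end - start + 1
        # Part of the codon (if any) that falls inside this exon.
        for cds_idx in range(max(first, offset), min(first + 3, offset + length)):
            within = cds_idx - offset
            positions.append(start + within if strand == "+" else end - within)
        offset += length
    if len(positions) != 3:
        raise ValueError("residue lies outside the coding sequence")
    return positions


# Hypothetical two-exon transcript; the second codon spans the exon junction.
exons = [(100, 104), (200, 210)]
print(codon_genomic_positions(exons, "+", 2))
print(codon_genomic_positions(exons, "-", 1))
```

A genomic SNP can then be treated as colocated with a protein variant when its position is one of the three positions returned for that residue. A codon split across an exon junction yields non-contiguous positions, which is why the function returns individual positions rather than a single interval.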

Keywords: UniProt database; genome mapping; missense variants.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Binding Sites
  • Chromosome Mapping / methods*
  • Databases, Genetic*
  • Databases, Protein
  • Genetic Predisposition to Disease
  • Humans
  • Molecular Sequence Annotation
  • Mutation, Missense*
  • Polymorphism, Single Nucleotide
  • Protein Binding
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism
  • Software
  • Web Browser

Substances

  • Proteins