Lost in translation: bioinformatic analysis of variations affecting the translation initiation codon in the human genome

Bioinformatics. 2018 Nov 15;34(22):3788-3794. doi: 10.1093/bioinformatics/bty453.

Abstract

Motivation: Translation is a key biological process controlled in eukaryotes by the initiation AUG codon. Variations affecting this codon may have pathological consequences by disturbing the correct initiation of translation. However, no systematic study has described these variations in the human genome. We also aimed to develop new tools for in silico prediction of the pathogenicity of gene variations affecting AUG codons because, to date, these gene defects have been wrongly classified as missense.

Results: Whole-exome analysis revealed a mean of 12 gene variations per person affecting initiation codons, mostly with high (>0.01) minor allele frequency (MAF). Moreover, analysis of Ensembl data (December 2017) revealed 11 261 genetic variations affecting the initiation AUG codon of 7205 genes. Most of these variations (99.5%) have low or unknown MAF, probably reflecting deleterious consequences. Only 62 variations had high MAF. In genetic variations with high MAF, the nearest alternative downstream AUG codon lay closer to the original initiation codon than in those with low MAF. In addition, the high-MAF group better maintained both the signal peptide and the reading frame. These differentiating elements could help to determine the pathogenicity of this kind of variation.
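
Two of the features named above, the distance to the nearest downstream AUG and whether the reading frame is preserved, can be illustrated with a minimal Python sketch. This is not the authors' Perl/R code; the function name `next_downstream_start`, its parameters, and the toy sequences are hypothetical, and the sketch assumes the input is a transcript sequence already carrying the variant, with the 0-based position of the annotated start codon known.

```python
from typing import Optional, Tuple


def next_downstream_start(transcript: str, start_index: int) -> Optional[Tuple[int, bool]]:
    """Locate the first ATG downstream of the annotated start codon.

    transcript  -- transcript sequence carrying the variant (DNA or RNA letters);
                   hypothetical input, not the format used by the published scripts
    start_index -- 0-based position of the first base of the annotated start codon

    Returns (distance_in_nt, in_frame), or None if no downstream ATG exists.
    """
    # Normalise case and treat RNA input (AUG) like DNA (ATG).
    seq = transcript.upper().replace("U", "T")
    # Search strictly after the first base of the original start codon.
    pos = seq.find("ATG", start_index + 1)
    if pos == -1:
        return None
    distance = pos - start_index
    # A distance divisible by 3 keeps the original reading frame.
    return distance, distance % 3 == 0


# Toy example: annotated start ATG at index 2 mutated to ATA.
# The next ATG begins 5 nt downstream, which shifts the frame.
shifted = next_downstream_start("GGATACCATGAAATGTAA", 2)   # (5, False)

# Toy example: the next ATG begins 6 nt downstream, so the frame is kept.
in_frame = next_downstream_start("ATAGCCATGAAA", 0)        # (6, True)
```

A short distance with a preserved frame would correspond to the pattern reported for the high-MAF group; evaluating signal peptide conservation would need additional protein-level analysis that this sketch does not attempt.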

Availability and implementation: Data and scripts in Perl and R are freely available at https://github.com/fanavarro/hemodonacion.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Codon
  • Codon, Initiator*
  • Computational Biology*
  • Genome, Human*
  • Humans
  • Protein Biosynthesis

Substances

  • Codon
  • Codon, Initiator