Tandem repeats ubiquitously flank and contribute to translation initiation sites

BMC Genom Data. 2022 Jul 27;23(1):59. doi: 10.1186/s12863-022-01075-5.

Abstract

Background: While the evolutionary divergence of cis-regulatory sequences impacts translation initiation sites (TISs), the implication of tandem repeats (TRs) in TIS selection remains largely elusive. Here, we employed the TIS homology concept to study a possible link between TRs of all core lengths and repeats with TISs.

Methods: Human, as reference sequence, and 83 other species were selected, and data was extracted on the entire protein-coding genes (n = 1,611,368) and transcripts (n = 2,730,515) annotated for those species from Ensembl 102. Following TIS identification, two different weighing vectors were employed to assign TIS homology, and the co-occurrence pattern of TISs with the upstream flanking TRs was studied in the selected species. The results were assessed in 10-fold cross-validation.

Results: On average, every TIS was flanked by 1.19 TRs of various categories within its 120 bp upstream sequence, per species. We detected statistically significant enrichment of non-homologous human TISs co-occurring with human-specific TRs. On the contrary, homologous human TISs co-occurred significantly with non-human-specific TRs. 2991 human genes had at least one transcript, TIS of which was flanked by a human-specific TR. Text mining of a number of the identified genes, such as CACNA1A, EIF5AL1, FOXK1, GABRB2, MYH2, SLC6A8, and TTN, yielded predominant expression and functions in the human brain and/or skeletal muscle.

Conclusion: We conclude that TRs ubiquitously flank and contribute to TIS selection at the trans-species level. Future functional analyses, such as a combination of genome editing strategies and in vitro protein synthesis may be employed to further investigate the impact of TRs on TIS selection.

Keywords: Genome-scale; Homology; TIS selection; Tandem repeat; Translation initiation site.

MeSH terms

  • Tandem Repeat Sequences*

Associated data

  • figshare/10.6084/m9.figshare.15405267