Predicting transfer RNA gene activity from sequence and genome context

Genome Res. 2020 Jan;30(1):85-94. doi: 10.1101/gr.256164.119. Epub 2019 Dec 19.

Abstract

Transfer RNA (tRNA) genes are among the most highly transcribed genes in the genome owing to their central role in protein synthesis. However, there is evidence for a broad range of gene expression across tRNA loci. This complexity, combined with difficulty in measuring transcript abundance and high sequence identity across transcripts, has severely limited our collective understanding of tRNA gene expression regulation and evolution. We establish sequence-based correlates to tRNA gene expression and develop a tRNA gene classification method that does not require, but benefits from, comparative genomic information and achieves accuracy comparable to molecular assays. We observe that guanine + cytosine (G + C) content and CpG density surrounding tRNA loci are exceptionally well correlated with tRNA gene activity, supporting a prominent regulatory role of the local genomic context in combination with internal sequence features. We use our tRNA gene activity predictions in conjunction with a comprehensive tRNA gene ortholog set spanning 29 placental mammals to estimate the evolutionary rate of functional changes among orthologs. Our method adds a new dimension to large-scale tRNA functional prediction and will help prioritize characterization of functional tRNA variants. Its simplicity and robustness should enable development of similar approaches for other clades, as well as exploration of functional diversification of members of large gene families.
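
To make the two context features named above concrete, the following is a minimal Python sketch of how G + C content and CpG density might be computed for the sequence flanking a tRNA locus. It is illustrative only and is not the authors' pipeline: the function names, the 0-based half-open coordinates, the flank size, and the observed/expected CpG ratio are all assumptions introduced here, not details taken from the abstract.

```python
# Illustrative sketch only: all names, the coordinate convention, and the
# flank size are assumptions, not the method described in the paper.

def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def cpg_density(seq: str) -> float:
    """CpG dinucleotides per dinucleotide position in the sequence."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    return seq.count("CG") / (len(seq) - 1)


def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def flanking_windows(chrom_seq: str, start: int, end: int, flank: int):
    """Return (upstream, downstream) flanks of a locus at [start, end).

    Coordinates are 0-based and half-open (an assumed convention);
    flanks are clipped at the ends of the sequence.
    """
    upstream = chrom_seq[max(0, start - flank):start]
    downstream = chrom_seq[end:end + flank]
    return upstream, downstream


if __name__ == "__main__":
    # Toy sequence with a hypothetical locus at positions 8..12.
    toy = "GCGCGCGCAAAATTTTATAT"
    up, down = flanking_windows(toy, 8, 12, 8)
    for label, window in (("upstream", up), ("downstream", down)):
        print(label, gc_content(window), cpg_density(window),
              cpg_observed_expected(window))
```

In practice such features would be computed over a chosen window around each annotated tRNA gene and then related to an activity label; the window size and the choice between raw CpG density and the observed/expected ratio are design decisions the abstract does not specify.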

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Computational Biology / methods
  • CpG Islands
  • DNA Methylation
  • Epigenesis, Genetic
  • Epigenomics / methods
  • Genome*
  • Genomics* / methods
  • Mammals
  • Mice
  • Phylogeny
  • RNA, Transfer* / genetics

Substances

  • RNA, Transfer