Discovery of regulatory motifs in 5' untranslated regions using interpretable multi-task learning models

Cell Syst. 2023 Dec 20;14(12):1103-1112.e6. doi: 10.1016/j.cels.2023.10.011. Epub 2023 Nov 27.

Abstract

The sequence of the 5' untranslated region (UTR) is known to affect mRNA translation rates, but the underlying regulatory grammar remains elusive. Here, we propose MTtrans, a multi-task translation rate predictor that learns common sequence patterns from datasets generated by different experimental techniques. The core premise is that motifs shared across datasets are more likely to be genuinely involved in translation control. MTtrans outperforms existing methods in both accuracy and the ability to capture motifs that transfer across species, highlighting its strength in identifying evolutionarily conserved sequence motifs. Our independent fluorescence-activated cell sorting coupled with deep sequencing (FACS-seq) experiment validates the impact of most motifs identified by MTtrans. Additionally, we introduce "GRU-rewiring," a technique to interpret the hidden states of gated recurrent units (GRUs). GRU-rewiring allows us to identify positions enriched in regulatory elements and to examine the local effects of 5' UTR mutations. MTtrans is a powerful tool for deciphering translation regulatory motifs.
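
The multi-task premise described in the abstract can be illustrated with a minimal sketch. This is not the MTtrans implementation: the abstract gives no architectural details beyond the use of recurrent units, so everything here is an assumption made for illustration. The class name `MultiTaskSketch`, the task names `task_a` and `task_b`, the single AUG-matching filter, and all weights are hypothetical, and a plain linear head per task stands in for the GRU layers. The sketch shows only the general idea of hard parameter sharing: every task scores a sequence with the same motif filters, and only a small task-specific head differs.

```python
BASES = "ACGU"


def one_hot(seq):
    """Encode an RNA/DNA sequence as rows of 4 floats (A, C, G, U); T is read as U."""
    rna = seq.upper().replace("T", "U")
    return [[1.0 if base == b else 0.0 for b in BASES] for base in rna]


def scan_filter(encoded, weights):
    """Slide one motif filter along the sequence and return its best (non-negative) match."""
    width = len(weights)
    best = 0.0
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * weights[i][j]
            for i in range(width)
            for j in range(4)
        )
        best = max(best, score)
    return best


class MultiTaskSketch:
    """Hypothetical model: motif filters shared by all tasks, one linear head per task."""

    def __init__(self, shared_filters, task_heads):
        self.shared_filters = shared_filters  # used by every task
        self.task_heads = task_heads          # task name -> (weights, bias)

    def features(self, seq):
        """Shared representation: one max-activation per motif filter."""
        encoded = one_hot(seq)
        return [scan_filter(encoded, f) for f in self.shared_filters]

    def predict(self, seq, task):
        """Task-specific score computed from the shared features."""
        weights, bias = self.task_heads[task]
        feats = self.features(seq)
        return sum(w * x for w, x in zip(weights, feats)) + bias


# Hypothetical filter that matches the trinucleotide AUG exactly.
aug_filter = [
    [1.0, 0.0, 0.0, 0.0],  # A
    [0.0, 0.0, 0.0, 1.0],  # U
    [0.0, 0.0, 1.0, 0.0],  # G
]

# Two made-up tasks standing in for datasets from different experimental techniques.
model = MultiTaskSketch(
    shared_filters=[aug_filter],
    task_heads={"task_a": ([-0.5], 1.0), "task_b": ([-0.25], 0.5)},
)

with_motif = model.features("GGCAUGCC")
without_motif = model.features("GGCCCGCC")
```

In a trained model of this kind the shared filters would be learned jointly from all datasets, so a filter that helps several tasks at once is a candidate for a motif common to them; here the filter and head weights are fixed by hand purely to show the data flow.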

Keywords: eukaryotic translation; explainable AI; motif discovery; multi-task learning; sequence modeling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 5' Untranslated Regions / genetics
  • Conserved Sequence
  • Regulatory Sequences, Nucleic Acid*

Substances

  • 5' Untranslated Regions