Expanding the repertoire of human tandem repeat RNA-binding proteins

PLoS One. 2023 Sep 20;18(9):e0290890. doi: 10.1371/journal.pone.0290890. eCollection 2023.

Abstract

Protein regions consisting of arrays of tandem repeats are known to bind other molecular partners, including nucleic acid molecules. Although interactions between repeat proteins and DNA have been widely explored, studies characterising tandem repeat RNA-binding proteins are lacking. We performed a large-scale analysis of human proteins to expand knowledge about tandem repeat proteins that have been experimentally reported to bind RNA. This work is timely because a full set of accurate structural models for the human proteome has been released, making the proteome amenable to repeat detection using structural methods. The main goal of our analysis was to build a comprehensive set of human RNA-binding proteins that contain repeats at the sequence or structure level. Our results showed that combining sequence and structural methods finds significantly more tandem repeat proteins than either method alone. We identified 219 tandem repeat proteins that bind RNA molecules and characterised the overlap between repeat regions and RNA-binding regions as a first step towards assessing their functional relationship. We observed differences between repeat regions predicted by sequence-based and structure-based methods in terms of their sequence composition, their functions and their protein domains.
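The two operations named in the abstract, combining predictions from two methods and measuring the overlap between repeat regions and RNA-binding regions, can be illustrated with a minimal sketch. This is not the authors' pipeline: the protein identifiers, the coordinates, the 1-based inclusive region convention and the function names `merge_regions` and `overlap_length` are all assumptions made for illustration.

```python
from typing import Iterable

# Assumption: a region is a 1-based, inclusive (start, end) pair of residue positions.
Region = tuple[int, int]


def merge_regions(regions: Iterable[Region]) -> list[Region]:
    """Merge overlapping or adjacent regions into a sorted, disjoint list."""
    merged: list[Region] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous region instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_length(a: Iterable[Region], b: Iterable[Region]) -> int:
    """Count residues covered by both region sets."""
    total = 0
    for a_start, a_end in merge_regions(a):
        for b_start, b_end in merge_regions(b):
            lo, hi = max(a_start, b_start), min(a_end, b_end)
            if lo <= hi:
                total += hi - lo + 1
    return total


# Hypothetical protein identifiers flagged by each kind of method.
sequence_hits = {"PROT_A", "PROT_B"}
structure_hits = {"PROT_B", "PROT_C"}
combined_hits = sequence_hits | structure_hits  # union of both methods

# Hypothetical regions for one protein.
repeat_regions = [(10, 50), (40, 80)]
rna_binding_regions = [(70, 100)]
shared_residues = overlap_length(repeat_regions, rna_binding_regions)
```

Merging each region set first keeps residues from being counted twice when predictions from different methods overlap within the same protein.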

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Knowledge*
  • Models, Structural
  • RNA / genetics
  • RNA-Binding Proteins* / genetics
  • Tandem Repeat Sequences / genetics

Substances

  • RNA-Binding Proteins
  • RNA

Grants and funding

This project has received funding from the European Union’s Horizon 2020 research and innovation staff exchange programme REFRACT under grant agreement No 823886. A.O., M.S.C. and M.G.B. are Ph.D. fellows, J.M. is a postdoctoral researcher, and N.P. is an adjunct researcher from Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET). The work was supported in part by funding from Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT) Grant #PICT-2020-SERIEA-00192 to N.P. The authors are also supported by core EMBL funding and declare that they have no competing interests. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding was received for this study.