Graph Theoretical Methods and Workflows for Searching and Annotation of RNA Tertiary Base Motifs and Substructures

Int J Mol Sci. 2021 Aug 9;22(16):8553. doi: 10.3390/ijms22168553.

Abstract

The increasing number and complexity of structures containing RNA chains in the Protein Data Bank (PDB) have created a need for automated structure annotation methods to replace or complement expert visual curation. This is especially true when searching for tertiary base motifs and substructures. Such base arrangements and motifs have diverse roles that range from contributing to structural stability to more direct involvement in the molecule's functions, such as forming the sites for ligand binding and catalytic activity. We review the utility of computational approaches for annotating RNA tertiary base motifs in a dataset of PDB structures, particularly graph theoretical algorithms that can either search for and annotate such base motifs or find and annotate clusters of hydrogen-bond-connected bases. We also demonstrate how such graph theoretical algorithms can be integrated into a workflow that allows for functional analysis and comparison of base arrangements and substructures, such as those involved in ligand binding. The capacity to carry out such automatic curation has led to the discovery of novel motifs, can give new context to known motifs, and enables the rapid compilation of RNA 3D motifs into a database.
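As a rough illustration of the cluster-finding idea described above, bases can be treated as graph nodes and base-base hydrogen bonds as edges, so that clusters of hydrogen-bond-connected bases correspond to connected components. The sketch below is not the reviewed tools' implementation; the function name `base_clusters`, the `min_size` threshold, and the residue labels are all hypothetical, and the hydrogen-bond pairs are assumed to have been detected beforehand from a PDB structure.

```python
from collections import defaultdict, deque


def base_clusters(hbond_pairs, min_size=3):
    """Group bases into clusters connected by hydrogen bonds.

    hbond_pairs: iterable of (base_a, base_b) labels (hypothetical input
    format); each pair is one base-base hydrogen-bond contact.
    min_size: assumed cutoff; smaller components are ignored.
    """
    # Build an undirected adjacency map: base -> set of bonded bases.
    adjacency = defaultdict(set)
    for a, b in hbond_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    clusters = []
    # Breadth-first search from every unvisited base finds one
    # connected component (one candidate base cluster) at a time.
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return clusters


# Made-up contacts: three mutually bonded bases plus one isolated pair.
example_pairs = [
    ("A:G10", "A:C25"),
    ("A:C25", "A:A40"),
    ("A:A40", "A:G10"),
    ("A:U5", "A:A60"),
]
clusters = base_clusters(example_pairs)
print(clusters)
```

With the default `min_size=3`, only the three-base component is reported and the isolated pair is dropped. Searching for a specific, predefined motif would instead call for a subgraph-matching step with labeled nodes and edges, which this sketch does not attempt.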

Keywords: 3D base motifs; RNA structure; RNA structure annotation; base clusters; base interaction networks; graph theory.

Publication types

  • Review

MeSH terms

  • Algorithms*
  • Databases, Nucleic Acid*
  • Molecular Sequence Annotation*
  • Nucleotide Motifs*
  • RNA / chemistry*
  • RNA / genetics
  • Software*
  • Workflow

Substances

  • RNA