RNAsolo: a repository of cleaned PDB-derived RNA 3D structures

Bioinformatics. 2022 Jul 11;38(14):3668-3670. doi: 10.1093/bioinformatics/btac386.

Abstract

Motivation: The development of algorithms dedicated to RNA three-dimensional (3D) structures increases the demand for training, testing and benchmarking data. A reliable source of such data derived from computational prediction is the RNA-Puzzles repository. In contrast, the largest resource of experimentally determined structures is the Protein Data Bank (PDB). However, files in this archive often contain other molecular data in addition to the RNA structure itself, and these data should be removed before the files are used by RNA-processing algorithms.

Results: RNAsolo is a self-updating database dedicated to RNA bioinformatics. It systematically collects experimentally determined RNA 3D structures stored in the PDB, removes non-RNA chains from them, and groups them into equivalence classes. With a single click, users can download various subsets of the data (clustered by resolution, source, data format, etc.) for further processing and analysis.
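
The idea of removing non-RNA chains can be illustrated with a minimal Python sketch. This is not the RNAsolo implementation, which the abstract does not describe; the function names `rna_chain_ids` and `keep_rna_chains` are hypothetical. The sketch assumes legacy PDB-format text, where the residue name occupies columns 18-20 and the chain identifier column 22 of each ATOM record, and it treats a chain as RNA if it contains at least one standard ribonucleotide (A, C, G or U). Modified residues stored as HETATM records are dropped, which a real pipeline would probably need to handle.

```python
# Illustrative sketch only; not RNAsolo's actual cleaning procedure.
# Assumption: input lines follow the legacy fixed-column PDB format.

RNA_RESIDUES = {"A", "C", "G", "U"}  # standard ribonucleotide names in PDB files


def rna_chain_ids(pdb_lines):
    """Return the set of chain IDs that contain at least one standard RNA residue."""
    chains = set()
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) > 21:
            res_name = line[17:20].strip()  # columns 18-20: residue name
            if res_name in RNA_RESIDUES:
                chains.add(line[21])        # column 22: chain identifier
    return chains


def keep_rna_chains(pdb_lines):
    """Keep only ATOM records that belong to chains classified as RNA."""
    lines = list(pdb_lines)
    rna = rna_chain_ids(lines)
    return [
        line for line in lines
        if line.startswith("ATOM") and len(line) > 21 and line[21] in rna
    ]
```

Applied to the lines of a PDB file containing an RNA-protein complex, `keep_rna_chains` would return only the ATOM records of the RNA chains; protein chains (e.g. residues named ALA) and DNA chains (residues named DA, DC, DG, DT) would be discarded.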

Availability and implementation: The repository is publicly available at https://rnasolo.cs.put.poznan.pl.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Databases, Protein
  • Nucleic Acid Conformation
  • RNA* / chemistry
  • Software*

Substances

  • RNA