RNAprofiling 2.0: Enhanced Cluster Analysis of Structural Ensembles

J Mol Biol. 2023 Jul 15;435(14):168047. doi: 10.1016/j.jmb.2023.168047. Epub 2023 Mar 17.

Abstract

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities and differences among the most informative (i.e., selected) profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences of up to 600 nucleotides, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.
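
The idea of features and profiles partitioning a sample can be illustrated with a minimal sketch. This is not the RNAprofiling implementation: the function names, the dot-bracket input format, the (i, j, length) helix identifier, and the fixed frequency cutoff `min_freq` are all assumptions made for illustration, and the abstract does not state how the tool itself selects features or profiles. The sketch only shows how structures that share the same set of frequent helices fall into the same group.

```python
from collections import Counter, defaultdict


def base_pairs(dot_bracket):
    """Return the set of 0-based (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def helices(pairs):
    """Group pairs into runs of stacked pairs (i, j), (i+1, j-1), ...

    Each helix is labeled (i, j, length), where (i, j) is its outermost pair.
    This labeling is a simplification chosen for the sketch.
    """
    found = set()
    for (i, j) in pairs:
        if (i - 1, j + 1) in pairs:
            continue  # (i, j) is stacked inside another pair, so not outermost
        length = 0
        while (i + length, j - length) in pairs:
            length += 1
        found.add((i, j, length))
    return found


def profiles(sample, min_freq=0.5):
    """Partition sampled structures by the set of frequent helices they contain.

    min_freq is a hypothetical fixed cutoff used only in this sketch.
    """
    n = len(sample)
    per_structure = [helices(base_pairs(s)) for s in sample]
    counts = Counter(h for hs in per_structure for h in hs)
    features = {h for h, c in counts.items() if c / n >= min_freq}
    partition = defaultdict(list)
    for s, hs in zip(sample, per_structure):
        partition[frozenset(hs & features)].append(s)
    return features, dict(partition)


# Toy sample of four structures for a hypothetical 12-nucleotide sequence.
sample = [
    "((((....))))",
    "((((....))))",
    "....((..))..",
    "((((....))))",
]
features, partition = profiles(sample)
```

In this toy sample the four-pair helix occurs in three of the four structures and becomes the only feature, so the sample splits into one profile containing it and one empty profile. A real Boltzmann sample would come from a sampling tool and would contain far more structures.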

Keywords: Boltzmann sampling; RNA secondary structure; decision tree; interactive visualization; thermodynamic optimization.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Base Pairing
  • Base Sequence
  • Cluster Analysis
  • Nucleic Acid Conformation
  • RNA* / chemistry
  • Sequence Analysis, RNA*

Substances

  • RNA