Comparative Study of Single-stranded Oligonucleotides Secondary Structure Prediction Tools

BMC Bioinformatics. 2023 Nov 8;24(1):422. doi: 10.1186/s12859-023-05532-5.

Abstract

Background: Single-stranded nucleic acids (ssNAs) have important biological roles and high biotechnological potential, linked to their ability to bind numerous molecular targets. This ability depends on the different spatial conformations they can adopt. The first level of ssNA spatial organisation is the base-pairing pattern, i.e. the secondary structure. Many computational tools have been developed to predict ssNA secondary structures, which makes choosing an appropriate tool difficult, and an up-to-date guide to the limits and applicability of current secondary structure prediction tools is missing. We therefore compared the performance of 9 freely available tools (mfold, RNAfold, CentroidFold, CONTRAfold, MC-Fold, LinearFold, UFold, SPOT-RNA, and MXfold2) on a dataset of 538 ssNAs with experimentally determined secondary structures.
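
As an illustration of what a "base-pairing pattern" looks like in practice, the sketch below converts a secondary structure written in dot-bracket notation into a set of paired positions. The abstract does not state which representation the study used, so dot-bracket notation, the function name `dot_bracket_to_pairs`, and the toy hairpin are assumptions made for this example only. Pseudoknots, which need additional bracket types, are deliberately not handled.

```python
def dot_bracket_to_pairs(structure: str) -> set[tuple[int, int]]:
    """Return the 0-based (i, j) base pairs encoded by a dot-bracket string.

    Assumption: only '(', ')' and '.' are used, i.e. no pseudoknots.
    """
    open_positions: list[int] = []   # stack of unmatched '(' positions
    pairs: set[tuple[int, int]] = set()
    for position, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {position}")
            pairs.add((open_positions.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {position}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return pairs


# Hypothetical 12-nucleotide hairpin: a 4-base-pair stem closing a 4-nucleotide loop.
hairpin = "((((....))))"
hairpin_pairs = dot_bracket_to_pairs(hairpin)
print(sorted(hairpin_pairs))
```

Working with a set of pairs rather than the raw string makes it straightforward to compare two structures pair by pair.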

Results: The minimum free energy-based tools mfold and RNAfold, and some of the tools based on artificial intelligence, namely CONTRAfold and MXfold2, gave the best results, with [Formula: see text] of exact predictions, whereas MC-Fold appeared to be the worst-performing tool, with only [Formula: see text] of exact predictions. Among the tools tested, UFold and SPOT-RNA are the only options for pseudoknot prediction. Including 5-10 suboptimal solutions in the analysis of the mfold and RNAfold results further improved the performance of these tools. Nevertheless, problems remained in predicting particular motifs, such as multi-way junctions and mini-dumbbells, and ssNAs whose structure was determined in complex with a protein. In addition, our benchmark shows that ssDNA secondary structure prediction still requires further effort.
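
The effect of including suboptimal solutions can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes that a prediction counts as "exact" when its dot-bracket string is identical to the experimental one, and that each tool returns a ranked list of candidate structures. The function name `exact_rate` and the toy data are hypothetical.

```python
def exact_rate(ranked_predictions: list[list[str]],
               references: list[str],
               k: int = 1) -> float:
    """Fraction of sequences whose reference structure appears among the
    top-k ranked candidates (k=1 means only the optimal prediction counts).

    Assumption: "exact" means string identity in dot-bracket notation.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("need one candidate list per reference structure")
    if not references:
        raise ValueError("empty benchmark")
    hits = sum(
        reference in candidates[:k]
        for candidates, reference in zip(ranked_predictions, references)
    )
    return hits / len(references)


# Toy benchmark of two sequences (hypothetical structures, not study data).
toy_references = ["((((....))))", "..(((...)))."]
toy_predictions = [
    ["((((....))))", "(((......)))"],   # optimal candidate is already exact
    ["...((...))..", "..(((...)))."],   # only the 2nd candidate is exact
]

print(exact_rate(toy_predictions, toy_references, k=1))
print(exact_rate(toy_predictions, toy_references, k=2))
```

In this toy case, widening the analysis from the optimal structure to the two best candidates recovers the second reference structure, which mirrors the kind of improvement reported for mfold and RNAfold.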

Conclusions: Overall, mfold, RNAfold, and MXfold2 currently seem to be the best choices for ssNA secondary structure prediction, although they still show limitations linked to specific structural motifs. Nevertheless, current trends suggest that artificial intelligence has high potential to overcome these remaining issues; for example, the recently developed UFold and SPOT-RNA have a high success rate in predicting pseudoknots.

Keywords: Benchmark; Prediction; Secondary structure; Single-stranded oligonucleotides.

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Entropy
  • Nucleic Acid Conformation
  • Oligonucleotides*
  • RNA / chemistry

Substances

  • Oligonucleotides
  • RNA