RNA structural segmentation

Pac Symp Biocomput. 2010:57-68. doi: 10.1142/9789814295291_0008.

Abstract

We describe several dynamic programming segmentation algorithms to segment RNA secondary and tertiary structures into distinct domains. For this purpose, we consider fitness functions that variously depend on (i) base pairing probabilities in the Boltzmann low energy ensemble of structures, (ii) contact maps inferred from 3-dimensional structures, and (iii) Voronoi tessellation computed from 3-dimensional structures. Segmentation algorithms include a direct dynamic programming method, previously discovered by Bellman and by Finkelstein and Roytberg, as well as two novel algorithms - a parametric algorithm to compute the optimal segmentation into k classes, for each value k, and an algorithm that simultaneously computes the optimal segmentation of all subsegments. Since many non-coding RNA gene finders scan the genome by a moving window method, reporting high-scoring windows, we apply structural segmentation to determine the most likely 5' and 3' boundaries of precursor microRNAs. When tested on all precursor microRNAs of length at most 100 nt from the Rfam database, benchmarking studies indicate that segmentation determines the 5' boundary with discrepancy (absolute value of difference between predicted and real boundaries) having mean -0.640 (stdev 15.196) and the 3' boundary with discrepancy having mean -0.266 (stdev. 17.415). This yields a sensitivity of 0.911 and positive predictive value of 0.906 for determination of exact boundaries of precursor microRNAs within a window of approximately 900 nt. Additionally, by comparing the manual segmentation of Jaeger et al. with our optimal structural segmentation of 16S and 16S-like rRNA of E. coli, rat mitochondria, Halobacterium volcanii, and Chlamydomonas reinhardii chloroplast into 4 segments, we establish the usefulness of (automated) structural segmentation in decomposing large RNA structures into distinct domains.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Animals
  • Computational Biology
  • Computer Simulation
  • MicroRNAs / chemistry*
  • MicroRNAs / genetics
  • Models, Molecular
  • Nucleic Acid Conformation*
  • Rats
  • Riboswitch / genetics

Substances

  • MicroRNAs
  • Riboswitch