Dihedral-based segment identification and classification of biopolymers II: polynucleotides

J Chem Inf Model. 2014 Jan 27;54(1):278-88. doi: 10.1021/ci400542n. Epub 2014 Jan 10.

Abstract

In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers I: Proteins. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400541d), we introduce a new algorithm for classifying biopolymeric structures based on main-chain dihedral angles. The DISICL algorithm (short for DIhedral-based Segment Identification and CLassification) classifies segments of structures containing two central residues. Here, we introduce the DISICL library for polynucleotides, which is based on the dihedral angles ε, ζ, and χ for the two central residues of a four-nucleotide segment of a single strand. Seventeen distinct structural classes are defined for nucleotide structures, some of which, to our knowledge, were not described previously in other structure classification algorithms. In particular, DISICL also classifies noncanonical single-stranded structural elements. DISICL is applied to databases of DNA and RNA structures containing 80,000 and 180,000 segments, respectively. The classifications according to DISICL are compared to those of another popular classification scheme in terms of the number of classified nucleotides, average occurrence and length of structural elements, and pairwise matches of the classifications. While the detailed classification of DISICL adds sensitivity to a structure analysis, it can be readily reduced to eight simplified classes providing a more general overview of the secondary structure in polynucleotides.
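
The general idea of classifying a segment from the dihedral angles of its two central residues can be illustrated with a minimal Python sketch. It is not the DISICL implementation: the function names (`dihedral`, `in_range`, `classify_segment`), the class labels, and every angle range in `EXAMPLE_LIBRARY` are hypothetical placeholders chosen for illustration, and the actual library definitions are given in the paper, not here. The sketch assumes each class is described by one allowed range per angle, in the order (ε, ζ, χ) for the first central residue followed by (ε, ζ, χ) for the second.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    b1 = [p1[i] - p0[i] for i in range(3)]
    b2 = [p2[i] - p1[i] for i in range(3)]
    b3 = [p3[i] - p2[i] for i in range(3)]

    def cross(u, v):
        return [u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0]]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    n1 = cross(b1, b2)
    n2 = cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def in_range(angle, lo, hi):
    """True if angle lies in [lo, hi]; lo > hi means the range wraps at +/-180."""
    if lo <= hi:
        return lo <= angle <= hi
    return angle >= lo or angle <= hi

# Hypothetical library: class name -> six (lo, hi) ranges in degrees,
# ordered (eps1, zeta1, chi1, eps2, zeta2, chi2). Values are NOT from DISICL.
EXAMPLE_LIBRARY = {
    "class_A": [(-180, -120), (-100, -40), (-180, -130)] * 2,
    "class_B": [(150, -150), (-120, -60), (-130, -80)] * 2,
}

def classify_segment(angles, library=EXAMPLE_LIBRARY, unclassified="UC"):
    """Return the first class whose six ranges all contain the given angles."""
    for name, ranges in library.items():
        if all(in_range(a, lo, hi) for a, (lo, hi) in zip(angles, ranges)):
            return name
    return unclassified

# Usage: six angles for the two central residues of one segment.
segment_angles = [-150.0, -70.0, -160.0, -140.0, -60.0, -150.0]
label = classify_segment(segment_angles)
```

In a sketch like this, a segment whose angles fall outside every defined region receives the placeholder label `"UC"`; how real regions are drawn, and how overlaps between them are resolved, is a design decision that this example does not attempt to reproduce.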

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biopolymers / chemistry*
  • Biopolymers / classification*
  • Computational Biology
  • Computer Simulation
  • Databases, Nucleic Acid
  • Models, Molecular*
  • Multiprotein Complexes / chemistry
  • Multiprotein Complexes / classification
  • Nucleic Acid Conformation*
  • Polynucleotides / chemistry*
  • Polynucleotides / classification*
  • Software

Substances

  • Biopolymers
  • Multiprotein Complexes
  • Polynucleotides