NucPosPred: Predicting species-specific genomic nucleosome positioning via four different modes of general PseKNC

J Theor Biol. 2018 Aug 7:450:15-21. doi: 10.1016/j.jtbi.2018.04.025. Epub 2018 Apr 18.

Abstract

The nucleosome is the basic structural unit of chromatin in eukaryotic cells and plays essential roles in regulating many biological processes, such as DNA transcription, replication and repair, and RNA splicing. Because of this importance, the factors that determine nucleosome positioning within genomes should be investigated. High-resolution nucleosome-positioning maps are now available for organisms including Saccharomyces cerevisiae, Drosophila melanogaster and Caenorhabditis elegans, enabling nucleosome positioning to be identified with computational tools. Here, we describe a novel predictor called NucPosPred, which was specifically designed for large-scale identification of nucleosome positioning in C. elegans and D. melanogaster genomes. NucPosPred was optimized separately for each species across four types of DNA sequence feature extraction and two classification algorithms (gradient-boosting decision tree and support vector machine). The overall accuracy obtained with NucPosPred was 92.29% for C. elegans and 88.26% for D. melanogaster, outperforming previous methods and demonstrating the potential for species-specific prediction of nucleosome positioning. For the convenience of most experimental scientists, a web server for NucPosPred is available at http://121.42.167.206/NucPosPred/index.jsp.
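
As a rough illustration of sequence-based feature extraction, the sketch below computes a normalized k-mer (nucleotide composition) vector for a DNA sequence. The abstract does not specify the four PseKNC modes or their parameters, so the function name `kmer_composition`, the choice of k = 2, and the handling of ambiguous bases are all assumptions made for illustration and not the authors' implementation. A vector like this could then be passed to a classifier such as a gradient-boosting decision tree or a support vector machine.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_composition(seq: str, k: int = 2) -> list[float]:
    """Return normalized k-mer frequencies in lexicographic order.

    Hypothetical helper for illustration only; it is not taken from NucPosPred.
    """
    seq = seq.upper()
    # Enumerate all 4**k possible k-mers so every sequence maps to a
    # vector of the same length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Example: dinucleotide composition of a short toy sequence.
features = kmer_composition("ACGTACGT", k=2)
```

For k = 2 the vector has 16 entries, one per dinucleotide, and its entries sum to 1 whenever at least one unambiguous window is present.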

Keywords: GBDT; KNN; Nucleosome positioning; Nucleotide composition; SVM.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence
  • Caenorhabditis elegans / genetics
  • Chromatin Assembly and Disassembly
  • Computational Biology / methods*
  • Drosophila melanogaster / genetics
  • Genome*
  • Nucleosomes / metabolism*
  • Species Specificity

Substances

  • Nucleosomes