PRAP: an ab initio software package for automated genome-wide analysis of DNA repeats for prokaryotes

Bioinformatics. 2013 Nov 1;29(21):2683-9. doi: 10.1093/bioinformatics/btt482. Epub 2013 Aug 19.

Abstract

Motivation: Prokaryotic genome annotation has focused mainly on identifying all genes and their protein functions. However, <30% of the prokaryotic genomes submitted to GenBank contain partial repeat features of specific types, and none of the genomes contain complete repeat annotations. Deciphering all repeats in DNA sequences is an important and open task in genome annotation and bioinformatics. Hence, there is an immediate need for a tool capable of identifying full-spectrum repeats in the whole genome.

Results: We report the PRAP (Prokaryotic Repeats Annotation Program) software package, which automates the analysis of repeats in both finished and draft genomes. It is aimed at identifying full-spectrum repeats at the scale of the prokaryotic genome. Compared with the major existing repeat-finding tools, PRAP exhibits competitive or better results. The results are consistent with manually curated and experimental data. Repeats can be identified and grouped into families to define their relevant types. The final output is parsed into the European Molecular Biology Laboratory (EMBL)/GenBank feature table format for reading and displaying in Artemis, where it can be combined or compared with other genome data. It is currently the most complete repeat finder for prokaryotes and is a valuable tool for genome annotation.
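
To illustrate the kind of output described above, the following minimal Python sketch writes a repeat as EMBL-style feature table lines that a viewer such as Artemis can load. It is not PRAP's code, and the abstract does not specify PRAP's exact output: the function name `embl_repeat_feature`, its parameters, and the choice of the `repeat_region` key with an `/rpt_family` qualifier are assumptions made for illustration, following the general INSDC feature table layout (feature key starting in column 6, location and qualifiers starting in column 22).

```python
def embl_repeat_feature(start, end, strand="+", family=None):
    """Return EMBL feature-table lines for one repeat (hypothetical helper).

    start, end : 1-based inclusive coordinates of the repeat
    strand     : "+" or "-"; "-" is written as complement(...)
    family     : optional repeat family label, emitted as /rpt_family
    """
    location = f"{start}..{end}"
    if strand == "-":
        location = f"complement({location})"

    # "FT" + 3 spaces, then the feature key padded to 16 characters,
    # so that the location begins in column 22.
    lines = [f"FT   {'repeat_region':<16}{location}"]

    if family is not None:
        # Qualifier lines: "FT" + 19 spaces, qualifier begins in column 22.
        lines.append("FT" + " " * 19 + f'/rpt_family="{family}"')
    return lines


if __name__ == "__main__":
    # Example values are made up for demonstration only.
    for line in embl_repeat_feature(1200, 1350, strand="-", family="family_1"):
        print(line)
```

Grouping repeats into families, as the abstract describes, would then amount to giving every member of a family the same `family` label before writing the table.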

Availability: https://sites.google.com/site/prapsoftware/

Contact: hsuehc@ntu.edu.tw.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Archaeal / chemistry*
  • DNA, Bacterial / chemistry*
  • Databases, Nucleic Acid
  • Genome, Archaeal
  • Genome, Bacterial
  • Genomics / methods
  • Repetitive Sequences, Nucleic Acid*
  • Sequence Analysis, DNA / methods
  • Software*

Substances

  • DNA, Archaeal
  • DNA, Bacterial