Gene identification programs in bread wheat: a comparison study

Nucleosides Nucleotides Nucleic Acids. 2013;32(10):529-54. doi: 10.1080/15257770.2013.832773.

Abstract

Seven web-based ab initio gene prediction programs (AUGUSTUS, BGF, Fgenesh, Fgenesh+, GeneID, Genemark.hmm, and HMMgene) were assessed for prediction accuracy on protein-coding sequences of bread wheat. At both the nucleotide and exon levels, Fgenesh+ was the most accurate program, followed by BGF and then Fgenesh. At the gene level, by contrast, Fgenesh performed best, predicting more than 75% of all genes exactly. Fgenesh+, BGF, and Fgenesh also yielded the highest percentages of correctly predicted exons and therefore appear more likely to give trustworthy results, whereas GeneID and HMMgene would be expected to produce a higher percentage of false negatives. For initial exons, the 3' boundary was, overall, recognized accurately significantly more often than the 5' boundary; the reverse held for terminal exons. Lastly, the performance of HMMgene and Genemark.hmm was, overall, independent of GC content, while the other programs appeared slightly more sensitive when GC-poor sequences were used. Overall, our results indicate that gene finders still need further improvement before they can deliver highly accurate predictions.
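The abstract compares accuracy at the nucleotide, exon, and gene levels without stating the formulas used. As an illustration only, the sketch below assumes the measures commonly used in gene-finder benchmarks: sensitivity = TP / (TP + FN) and specificity = TP / (TP + FP), with an exon counted as correct only when both boundaries match and a gene counted as correct only when every exon matches. The function names and the toy coordinates are hypothetical and are not taken from the study.

```python
def nucleotide_set(exons):
    """Expand inclusive (start, end) exon intervals into a set of coding positions."""
    return {pos for start, end in exons for pos in range(start, end + 1)}


def sn_sp(tp, fn, fp):
    """Sensitivity and specificity as assumed here: TP/(TP+FN) and TP/(TP+FP)."""
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    return sn, sp


def evaluate(annotated, predicted):
    """Compare one predicted gene structure with its annotation at three levels."""
    # Nucleotide level: each coding position is scored individually.
    ann_nt, pred_nt = nucleotide_set(annotated), nucleotide_set(predicted)
    nt = sn_sp(len(ann_nt & pred_nt), len(ann_nt - pred_nt), len(pred_nt - ann_nt))

    # Exon level: an exon is a true positive only if both boundaries match.
    ann_ex, pred_ex = set(annotated), set(predicted)
    ex = sn_sp(len(ann_ex & pred_ex), len(ann_ex - pred_ex), len(pred_ex - ann_ex))

    # Gene level: the whole exon structure must be identical.
    return {"nucleotide": nt, "exon": ex, "gene_exact": ann_ex == pred_ex}


# Toy example (hypothetical coordinates): the second exon's 3' boundary is off.
annotated = [(1, 100), (201, 300), (401, 500)]
predicted = [(1, 100), (201, 290), (401, 500)]
result = evaluate(annotated, predicted)
print(result)
```

In this toy case a single misplaced boundary barely lowers the nucleotide-level scores but removes one exon from the exact-match count and makes the gene-level prediction incorrect, which is one reason rankings can differ between levels.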

Publication types

  • Comparative Study

MeSH terms

  • Base Composition / genetics
  • Bread*
  • Computational Biology / methods*
  • Exons / genetics
  • Genes, Plant / genetics*
  • Internet
  • Oligodeoxyribonucleotides / genetics
  • Polymerase Chain Reaction
  • Software*
  • Triticum / genetics*

Substances

  • Oligodeoxyribonucleotides