TaF: a web platform for taxonomic profile-based fungal gene prediction

Genes Genomics. 2019 Mar;41(3):337-342. doi: 10.1007/s13258-018-0766-1. Epub 2018 Nov 19.

Abstract

Introduction: The accurate prediction and annotation of gene structures from the genome sequence of an organism enable genome-wide functional analyses that provide insight into its biological properties.

Objectives: We recently developed a highly accurate gene prediction pipeline and web platform for filamentous fungi called TaF. TaF is a homology-based gene predictor that employs large-scale taxonomic profiling to identify close relatives of the query genome.

Methods: The TaF pipeline consists of four processing steps: (1) taxonomic profiling to search for close relatives of the query, (2) generation of hints for determining exon-intron boundaries from orthologous protein sequence data of the profiled species, (3) gene prediction by a combination of ab initio and evidence-based prediction methods, and (4) homology search for gene models.
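
The abstract does not describe how TaF performs the taxonomic profiling in step (1). As a toy illustration of the general idea only (ranking candidate reference species by similarity to the query so that hints come from close relatives), the sketch below scores references by Jaccard similarity of their k-mer sets. The k-mer approach, the function names, and the parameters `k` and `top_n` are assumptions made for illustration, not TaF's method.

```python
def kmers(seq, k=4):
    """Return the set of all length-k substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_relatives(query, references, k=4, top_n=2):
    """Rank reference sequences by k-mer similarity to the query.

    `references` maps a (hypothetical) species name to a sequence.
    Returns the top_n (name, score) pairs, highest score first;
    ties are broken alphabetically by name.
    """
    q = kmers(query, k)
    scored = [(name, jaccard(q, kmers(seq, k)))
              for name, seq in references.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]
```

In a real pipeline the comparison would run over whole genomes or marker genes rather than short strings; the point here is only that the references selected in step (1) determine which protein data feed the hint generation in step (2).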

Results: TaF generates extrinsic evidence that suggests possible exon-intron boundaries based on orthologous protein sequence data, thereby reducing false-positive gene structure predictions that arise from data on distantly related orthologs. In particular, the gene prediction method using taxonomic profiling shows very high accuracy, including high sensitivity and specificity for gene models. This suggests a new approach to homology-based gene prediction for newly sequenced or uncharacterized fungal genomes, with the potential to improve the quality of gene prediction.
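
The abstract does not state how sensitivity and specificity were computed. A minimal sketch, assuming the convention common in gene prediction benchmarking, where sensitivity is TP / (TP + FN) and specificity is TP / (TP + FP), and assuming a predicted gene model counts as correct only if it matches a reference model exactly. The tuple representation of a gene model and the function name are hypothetical.

```python
def gene_level_accuracy(predicted, reference):
    """Return (sensitivity, specificity) at the gene-model level.

    Each gene model is a hashable value, here assumed to be
    (contig, strand, ((exon_start, exon_end), ...)).
    A prediction is a true positive only on an exact match.
    """
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)   # predicted and in the reference
    fp = len(predicted - reference)   # predicted but not in the reference
    fn = len(reference - predicted)   # in the reference but missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity
```

Under this convention, specificity is the fraction of predicted models that are correct, so it falls as false-positive gene structures increase.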

Conclusion: TaF will be a useful tool for fungal genome-wide analyses, including the identification of targeted genes associated with a trait, transcriptome profiling, comparative genomics, and evolutionary analysis.

Keywords: Ab initio; Exon–intron boundary; Filamentous fungal genome; Homology-based gene prediction; Taxonomic profile; Web platform.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Fungi / classification
  • Fungi / genetics
  • Genes, Fungal*
  • Phylogeny*
  • Sequence Analysis, DNA / methods*
  • Software*