CEPiNS: Conserved Exon Prediction in Novel Species

Bioinformation. 2013;9(4):210-1. doi: 10.6026/97320630009210. Epub 2013 Feb 21.

Abstract

Exon structure is relatively well conserved among orthologs in several large clades of species (e.g. Mammalia, Diptera, Lepidoptera) across evolutionary distances of up to 80 million years. Thus, it should be straightforward to predict the exon structures in novel species based upon the known exon structures of species whose genomes have been sequenced and well assembled. Being able to predict the exon boundaries in the genes of novel species is important given the rapidly growing number of transcriptome sequencing projects. CEPiNS is a new pipeline that mines the exon boundaries of predicted gene sets from model species and then uses this information to identify the exon boundaries in a novel species through codon-based alignment. The pipeline uses the freeware SPIDEY, an exon boundary prediction tool, and BLAST (BLASTN, BLASTP, TBLASTX), both of which are part of NCBI's toolkit. CEPiNS provides an important tool for analyzing the transcriptomes of novel species.

Keywords: Bioinformatics Software; Evolutionary and Comparative genomics; Exon prediction; Gene structure; Model species; Novel species; Transcriptomics.
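
The core idea in the abstract, carrying known exon boundaries from a model species over to a novel species through an alignment, can be illustrated with a minimal sketch. This is not the CEPiNS implementation and does not call SPIDEY or BLAST; it assumes a gapped pairwise alignment between a model-species coding sequence and a novel-species transcript has already been produced by some aligner. The function name `project_exon_boundaries` and its parameters are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def project_exon_boundaries(model_aln, novel_aln, model_exon_lengths):
    """Project internal exon boundaries from a model CDS onto a novel sequence.

    model_aln, novel_aln: equal-length aligned strings, with '-' marking gaps.
    model_exon_lengths: exon lengths (in bases) of the ungapped model CDS.
    Returns, for each internal boundary, the number of novel-sequence bases
    that precede it (a position in ungapped novel coordinates).
    """
    if len(model_aln) != len(novel_aln):
        raise ValueError("aligned sequences must have the same length")
    if sum(model_exon_lengths) != len(model_aln.replace("-", "")):
        raise ValueError("exon lengths do not sum to the model CDS length")

    # Cumulative exon ends in model coordinates; the last one is the CDS end,
    # not a boundary between two exons, so it is dropped.
    boundaries = set(list(accumulate(model_exon_lengths))[:-1])

    projected = []
    model_pos = 0  # model bases consumed so far
    novel_pos = 0  # novel bases consumed so far
    for m, n in zip(model_aln, novel_aln):
        if m != "-":
            model_pos += 1
        if n != "-":
            novel_pos += 1
        # Record the novel coordinate in the column where a model exon ends.
        if m != "-" and model_pos in boundaries:
            projected.append(novel_pos)
    return projected


# Hypothetical toy alignment: the novel transcript lacks one codon of exon 2.
model = "ATGGCCAAATTT"
novel = "ATG---AAATTT"
print(project_exon_boundaries(model, novel, [3, 6, 3]))
```

In this toy case the model boundaries after bases 3 and 9 map to novel positions 3 and 6, because the three-base deletion shifts everything downstream. A real pipeline would also need to handle boundaries that fall inside alignment gaps, reading-frame checks, and partial transcripts, none of which this sketch attempts.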