Integrated Modeling of Structural Genes Using MCuNovo

Methods Mol Biol. 2019:1858:45-57. doi: 10.1007/978-1-4939-8775-7_5.

Abstract

Accurate modeling of protein-coding genes from genome and cDNA data is a prerequisite for functional studies. Various programs, such as MAKER, Cufflinks, Oases, and Trinity, have been developed, each with advantages and drawbacks. Manually integrating the different models for a single gene is cumbersome, and doing so for the 14,000-18,000 genes of a typical holometabolous insect is a daunting task. We developed methods to evaluate the output of MAKER, Cufflinks, Oases, and Trinity and to select the best models to constitute the MCOT1.0 set for Manduca sexta, a biochemical model insect. To apply these methods to other organisms, we improved the algorithm (designated MCuNovo Gene Selector) and automated the data processing. In this chapter, we describe the background of the algorithm's development and how to prepare and run the program.
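
The general idea of choosing one model per gene from several programs' outputs can be illustrated with a minimal Python sketch. This is not the MCuNovo Gene Selector algorithm, whose evaluation criteria are not given in this abstract. The `Candidate` record, the `locus` identifier, and the `placeholder_score` function (which simply prefers the longest predicted protein) are hypothetical stand-ins chosen only to make the example runnable.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    """One gene model from one program (hypothetical record layout)."""
    locus: str    # assumed identifier shared by competing models of one gene
    source: str   # program that produced the model, e.g. "MAKER"
    protein: str  # predicted protein sequence


def placeholder_score(candidate: Candidate) -> int:
    # Assumption for illustration only: a longer predicted protein scores
    # higher. The real selection criteria are not described in the abstract.
    return len(candidate.protein)


def select_best(candidates):
    """Group candidate models by locus and keep the top-scoring one."""
    by_locus = defaultdict(list)
    for candidate in candidates:
        by_locus[candidate.locus].append(candidate)
    # max() keeps the first candidate encountered when scores tie.
    return {
        locus: max(models, key=placeholder_score)
        for locus, models in by_locus.items()
    }


# Made-up example data, not taken from the Manduca sexta analysis.
candidates = [
    Candidate("locus1", "MAKER", "MKT"),
    Candidate("locus1", "Trinity", "MKTAYLL"),
    Candidate("locus2", "Cufflinks", "MSS"),
    Candidate("locus2", "Oases", "MS"),
]
best = select_best(candidates)
```

Keeping the scoring function separate from the grouping step means a different evaluation criterion could be swapped in without changing how the candidates are organized.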

Keywords: Arthropod; Gene modeling; Genomics; Insect; Python; Transcriptome.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Animals
  • Genome, Insect
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing / instrumentation
  • High-Throughput Nucleotide Sequencing / methods*
  • Insect Proteins / genetics*
  • Manduca / genetics*
  • Models, Statistical*
  • Sequence Analysis, DNA / methods*

Substances

  • Insect Proteins