Recognition and analysis of protein-coding genes in severe acute respiratory syndrome associated coronavirus

Bioinformatics. 2004 May 1;20(7):1074-80. doi: 10.1093/bioinformatics/bth041. Epub 2004 Feb 5.

Abstract

Motivation: The recent outbreak of severe acute respiratory syndrome (SARS), caused by the SARS coronavirus (SARS-CoV), has necessitated an in-depth molecular understanding of the virus in order to identify new drug targets. The availability of complete genome sequences for several strains of the SARS virus makes it possible to identify protein-coding genes and define their functions. A computational approach to identifying protein-coding genes and their putative functions will help in designing experimental protocols.

Results: This paper presents a novel analysis of the SARS genome using GeneDecipher, a gene prediction method developed in our laboratory. Each of the 18 newly sequenced SARS-CoV genomes has been analyzed using GeneDecipher. In addition to polyprotein 1ab(1), polyprotein 1a and the four genes coding for the major structural proteins spike (S), small envelope (E), membrane (M) and nucleocapsid (N), six to eight additional proteins have been predicted, depending upon the strain analyzed. Their lengths range between 61 and 274 amino acids. Our method also suggests that polyprotein 1ab, polyprotein 1a, S, M and N are proteins of viral origin and that the others are of prokaryotic origin. Putative functions for all predicted protein-coding genes have been suggested using conserved peptides present in their open reading frames.
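As a rough illustration of the open reading frames mentioned above, the following minimal sketch enumerates candidate ORFs on the forward strand of a genome sequence. It is not GeneDecipher and does not reproduce its method: it is a naive ATG-to-stop scan. The function name `find_orfs` and the `min_aa` parameter are hypothetical, and the default threshold of 61 simply echoes the shortest predicted protein length reported in the abstract; it is not a filter the authors are known to have used.

```python
# Naive ORF enumeration on the forward strand (illustrative only;
# this is NOT the GeneDecipher algorithm described in the paper).

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(genome, min_aa=61):
    """Return sorted (start, end, aa_length) tuples for ATG..stop ORFs.

    start is the 0-based index of the ATG, end is the index just past
    the stop codon, and aa_length counts codons from ATG up to (but not
    including) the stop. min_aa is an assumed, hypothetical threshold.
    """
    genome = genome.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(genome) - 2, 3):
            codon = genome[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                aa_len = (i - start) // 3
                if aa_len >= min_aa:
                    orfs.append((start, i + 3, aa_len))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence, not real SARS-CoV data: one 6-codon ORF in frame 2.
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
    print(find_orfs(toy, min_aa=3))
```

A scan of this kind only lists candidates; deciding which candidates are genuine protein-coding genes is the step a dedicated gene predictor addresses.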

Availability: Detailed results of the GeneDecipher analysis of all 18 SARS-CoV genomes are available at http://www.igib.res.in/sarsanalysis.html

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms
  • Gene Expression Profiling / methods*
  • Genetic Testing / methods
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods
  • Sequence Analysis / methods
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, Protein / methods*
  • Severe acute respiratory syndrome-related coronavirus / genetics*
  • Software*
  • Viral Proteins / genetics*

Substances

  • Viral Proteins