Analysis of a Bacillus subtilis genome fragment using a co-operative computer system prototype

Gene. 1995 Nov 7;165(1):GC37-51. doi: 10.1016/0378-1119(95)00636-k.

Abstract

Analysis of the huge volume of data generated by large-scale sequencing projects requires the construction of new, sophisticated computer systems. These systems should be able to manage the biological data as well as the results of their analysis. They should also help the user choose the most appropriate methods and chain them together to solve a global analysis task. In this paper we present the prototype of a software system that provides an environment for the analysis of large-scale sequence data. As a first step, this environment has been tested within the Bacillus subtilis genome sequencing project. The system integrates both descriptive knowledge of the entities involved (genes, regulatory signals and the like) and methodological knowledge comprising an extensible set of analytical methods. A knowledge representation based on two existing object-oriented models is used to implement this integrated system. In addition, the prototype provides a user interface suited both to displaying the results generated by several methods simultaneously and to interacting with the objects. We present the analysis of a B. subtilis genome fragment that was present in data libraries but not annotated. Annotating the genes in this fragment allowed us to combine the results of several methods for predicting coding sequences, and to characterize the fragment as comprising a cryptic phage, the skin element. Comparison between the annotation of the skin element and that of a standard region of the chromosome indicated that local features of the nucleotide sequence could discriminate between phage and non-phage DNA sequences.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacillus subtilis / genetics*
  • Base Sequence
  • Genome, Bacterial*
  • Molecular Sequence Data
  • Sequence Analysis*
  • Software

Associated data

  • GENBANK/M19299
  • PIR/P30339
  • PIR/S24451
  • PIR/S26033
  • PIR/S38900
  • PIR/S41177
  • SWISSPROT/P13772
  • SWISSPROT/P23789
  • SWISSPROT/P26835
  • SWISSPROT/P30339
  • SWISSPROT/P33228
  • SWISSPROT/P35892
  • SWISSPROT/P37213
  • SWISSPROT/P37309
  • SWISSPROT/P37944
  • SWISSPROT/P38800
  • SWISSPROT/P39780
  • SWISSPROT/P39781
  • SWISSPROT/P39782
  • SWISSPROT/P39786
  • SWISSPROT/P39797
  • SWISSPROT/Q00828
  • SWISSPROT/Q07683