On the complexity of sparse exon assembly

J Comput Biol. 2006 Jun;13(5):1013-27. doi: 10.1089/cmb.2006.13.1013.

Abstract

Gene structure prediction is one of the most important problems in computational molecular biology. It involves two steps: the first is finding the evidence (e.g., predicting splice sites) and the second is interpreting the evidence, that is, trying to determine the whole gene structure by assembling its pieces. In this paper, we suggest a combinatorial solution to the second step, which is also referred to as the "Exon Assembly Problem." We use a similarity-based approach that aims to produce a single gene structure based on similarities to a known homologous sequence. We target the sparse case, where filtering has been applied to the data, resulting in a set of O(n) candidate exon blocks. Our algorithm yields an O(n²√n) solution.
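To make the assembly step concrete, here is a minimal sketch of a much simpler relative of the problem, usually called exon chaining: pick a maximum-score set of non-overlapping candidate blocks. It is not the algorithm of this paper. It assumes each block already carries a fixed, independent score, whereas the similarity-based formulation scores a whole assembly against the homologous sequence. The function name `best_exon_chain`, the `(start, end, score)` tuple layout, and the inclusive coordinates are all assumptions made for illustration.

```python
from bisect import bisect_left


def best_exon_chain(blocks):
    """Return (total_score, chain) for the best non-overlapping chain.

    blocks: iterable of (start, end, score) with inclusive coordinates
    and start <= end. Hypothetical input format, not from the paper.
    """
    # Process blocks in order of their right end.
    blocks = sorted(blocks, key=lambda b: b[1])
    ends = [b[1] for b in blocks]

    # best[i] = best total score using only the first i blocks.
    best = [0] * (len(blocks) + 1)
    take = [False] * len(blocks)   # was block i used in best[i + 1]?
    prev = [0] * len(blocks)       # how many blocks end before block i starts

    for i, (start, _end, score) in enumerate(blocks):
        # Count of blocks whose end is strictly left of this start.
        p = bisect_left(ends, start)
        prev[i] = p
        with_block = best[p] + score
        if with_block > best[i]:
            best[i + 1] = with_block
            take[i] = True
        else:
            best[i + 1] = best[i]

    # Walk back through the decisions to recover the chosen blocks.
    chain = []
    i = len(blocks)
    while i > 0:
        if take[i - 1]:
            chain.append(blocks[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    chain.reverse()
    return best[-1], chain


if __name__ == "__main__":
    candidates = [(1, 5, 5), (2, 3, 3), (4, 8, 6), (9, 10, 1)]
    print(best_exon_chain(candidates))
```

With precomputed scores this runs in O(n log n) for n blocks. The gap between that and the O(n²√n) bound in the abstract reflects the extra work of scoring assemblies by similarity to a target sequence, which this sketch leaves out entirely.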

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology
  • Exons / genetics*
  • Pattern Recognition, Automated*
  • Sequence Analysis, DNA*
  • Software*