Visual representation of DNA sequences for exon detection using non-parametric spectral estimation techniques

Nucleosides Nucleotides Nucleic Acids. 2019;38(5):321-337. doi: 10.1080/15257770.2018.1536270. Epub 2019 Mar 12.

Abstract

This paper presents a new approach to modeling DNA sequences for the purpose of exon detection. The proposed model adopts the sum-of-sinusoids concept to represent DNA sequences, with the objective of representing each sequence with few coefficients. The modeling process can be performed on the DNA signal as a whole or on a segment-by-segment basis. The resulting models can be used in place of the original sequences in a subsequent spectral estimation process for exon detection. The accuracy of the modeling is evaluated using the Root Mean Square Error (RMSE) and R-square metrics. In addition, non-parametric spectral estimation methods are used to estimate the spectra of both the original and the modeled DNA sequences. The exon detection results based on the original and modeled DNA sequences coincide to a great extent, which supports the effectiveness of the proposed sum-of-sinusoids method for modeling DNA sequences.
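The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the DNA string is mapped to a number sequence with a binary indicator for one base, the sum-of-sinusoids model is built by keeping the strongest DFT components (the paper may fit the sinusoids differently), and the non-parametric estimator is a plain periodogram. All function names and the toy sequence are hypothetical.

```python
import numpy as np

def make_toy_sequence(n=300, seed=0):
    """Hypothetical toy sequence with a period-3 bias (G favored at every third base)."""
    rng = np.random.default_rng(seed)
    bases = np.array(list("ACGT"))
    out = []
    for i in range(n):
        if i % 3 == 0 and rng.random() < 0.8:
            out.append("G")
        else:
            out.append(str(rng.choice(bases)))
    return "".join(out)

def indicator(seq, base):
    """Binary indicator sequence for one base (an assumed numeric mapping)."""
    return np.array([1.0 if ch == base else 0.0 for ch in seq.upper()])

def sum_of_sinusoids(x, n_terms):
    """Approximate x by its mean plus the n_terms strongest DFT sinusoids."""
    spectrum = np.fft.rfft(x)
    # Rank the non-DC bins by magnitude and keep only the strongest ones.
    keep = np.argsort(np.abs(spectrum[1:]))[::-1][:n_terms] + 1
    reduced = np.zeros_like(spectrum)
    reduced[0] = spectrum[0]          # DC term carries the mean
    reduced[keep] = spectrum[keep]    # each kept bin is one sinusoid
    return np.fft.irfft(reduced, n=len(x))

def rmse(x, model):
    """Root Mean Square Error between the signal and its model."""
    return float(np.sqrt(np.mean((x - model) ** 2)))

def r_square(x, model):
    """Coefficient of determination of the model with respect to the signal."""
    ss_res = np.sum((x - model) ** 2)
    ss_tot = np.sum((x - np.mean(x)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def periodogram(x):
    """Non-parametric spectral estimate: squared DFT magnitude of the mean-removed signal."""
    return np.abs(np.fft.rfft(x - np.mean(x))) ** 2 / len(x)

seq = make_toy_sequence()
x = indicator(seq, "G")
model = sum_of_sinusoids(x, n_terms=10)

print("RMSE:", rmse(x, model))
print("R-square:", r_square(x, model))
# Coding regions typically show a spectral peak at period 3, i.e. at bin N/3.
print("peak bin (original):", int(np.argmax(periodogram(x))))
print("peak bin (modeled): ", int(np.argmax(periodogram(model))))
```

In this sketch the modeled signal keeps the dominant period-3 component, so its periodogram peaks at the same frequency bin as the original. That is the kind of agreement between original and modeled spectra that the abstract reports, shown here only on synthetic data.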

Keywords: DNA; non-parametric spectral estimation; signal modeling.

MeSH terms

  • Base Sequence*
  • Computational Biology
  • DNA / chemistry*
  • Databases, Nucleic Acid
  • Exons*
  • Models, Molecular
  • Nucleic Acid Conformation
  • Nucleotides / chemistry
  • Sequence Analysis, DNA

Substances

  • Nucleotides
  • DNA