Genomic signal processing methods for computation of alignment-free distances from DNA sequences

PLoS One. 2014 Nov 13;9(11):e110954. doi: 10.1371/journal.pone.0110954. eCollection 2014.

Abstract

Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. One application of GSP that has not been fully explored is computing the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate that GAFD is feasible for computing sequence distances and that the descriptors can be used to characterize DNA fragments.
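
The abstract gives neither the doublet values used by GAFD nor the names of its three distance metrics, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a hypothetical table that assigns the integers 0 to 15 to the 16 dinucleotides in lexicographic order, reads the sequence as overlapping doublets, and compares two sequences by the Euclidean distance between their length-normalized magnitude spectra sampled at a fixed set of frequencies. The names DOUBLET_VALUE, doublet_signal, spectrum and spectral_distance are all invented for this example.

```python
import cmath
import math
from itertools import product

# Hypothetical doublet table (an assumption, not taken from the paper):
# the 16 dinucleotides AA, AC, ..., TT receive the amplitudes 0..15.
DOUBLET_VALUE = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=2))}


def doublet_signal(seq):
    """Map a DNA string to a numeric signal using overlapping doublets."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("need at least two bases to form a doublet")
    # Raises KeyError for symbols outside A, C, G, T (for example N).
    return [DOUBLET_VALUE[seq[i:i + 2]] for i in range(len(seq) - 1)]


def spectrum(signal, n_freqs=32):
    """Length-normalized spectral magnitudes at n_freqs fixed frequencies.

    Sampling the same frequencies for every signal makes sequences of
    different lengths directly comparable without aligning them.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    mags = []
    for k in range(n_freqs):
        omega = 2 * math.pi * k / n_freqs
        acc = sum(x * cmath.exp(-1j * omega * t) for t, x in enumerate(centered))
        mags.append(abs(acc) / n)
    return mags


def spectral_distance(seq_a, seq_b, n_freqs=32):
    """Euclidean distance between the two sequences' magnitude spectra."""
    spec_a = spectrum(doublet_signal(seq_a), n_freqs)
    spec_b = spectrum(doublet_signal(seq_b), n_freqs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))
```

With single-base mappings the signal can take at most 4 amplitude values; a doublet table such as the one assumed here allows 16, which is the increase in amplitude range that the abstract refers to. For example, doublet_signal("ACGT") reads the doublets AC, CG and GT, and spectral_distance returns 0.0 when a sequence is compared with itself.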

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence / genetics
  • Base Sequence / genetics*
  • Chromosome Mapping / methods*
  • Computational Biology / methods*
  • DNA / genetics
  • Genomics
  • Humans
  • RNA / genetics
  • Sequence Analysis, DNA / methods*
  • Signal Processing, Computer-Assisted*

Substances

  • RNA
  • DNA

Grants and funding

The authors wish to thank the National Council for Science and Technology (CONACyT) for PhD scholarship support to EB, and FOMIXJal project no. 2010-10-149481 that supported the infrastructure for experimentation.