A Graph-Based Approach for the DNA Word Design Problem

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2747-2752. doi: 10.1109/TCBB.2020.3008346. Epub 2021 Dec 8.

Abstract

The aim of this paper is to improve the best known solution of an important problem, the DNA Word Design problem, which has its roots in Bioinformatics and Coding Theory. The problem is to design DNA codes that satisfy some combinatorial constraints. The constraints considered are: minimum Hamming distance, fixed GC content and the reverse complement Hamming distance. The problem is modeled as a maximum independent set problem. Existing complex approaches for the maximum independent set problem, suitable for large graphs, were tested. In order to tackle large instances, libraries for external memory computations and sampling techniques were investigated. Eventually, we succeed in finding good lower bounds for the instances that were analyzed.

MeSH terms

  • Algorithms*
  • Base Composition / genetics
  • Computational Biology / methods*
  • Computers, Molecular*
  • DNA / chemistry
  • DNA / genetics
  • DNA / ultrastructure*

Substances

  • DNA