Optimal Design of Synthetic DNA Sequences Without Unwanted Binding Sites

J Comput Biol. 2022 Sep;29(9):974-986. doi: 10.1089/cmb.2021.0417. Epub 2022 May 30.

Abstract

Synthesizing DNA molecules by design has become an essential tool in molecular biology and is expected to become ubiquitous in the coming decade. Successful design of a synthetic DNA molecule often requires satisfying multiple objectives, some of which may conflict with others. One particularly important objective is the elimination of unwanted protein binding sites, which may interfere with the desired function of the synthesized molecule. While most design tools offer this fundamental capability, they do not follow a systematic approach that guarantees elimination of all unwanted sites whenever a feasible solution exists. Furthermore, the algorithms these tools use (when published) are often quite naive and inefficient. We present a formal description of the binding site elimination problem and suggest several efficient algorithms that eliminate unwanted patterns with minimum interference to the desired function of the synthesized sequence. These algorithms are simple, efficient, and flexible and, therefore, can be easily incorporated in all existing DNA design tools, enhancing their design capabilities.
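To make the problem statement concrete, the following is a minimal sketch of one way to pose pattern elimination as an optimization problem. It is not the algorithm from the paper. It assumes that "interference" is measured simply as the number of substituted bases, that unwanted sites are exact strings over the alphabet ACGT, and that the sequence length must stay fixed. The function name `eliminate_patterns` and its parameters are hypothetical. The sketch runs a dynamic program over the sequence, remembering only the last k-1 emitted bases (k being the longest pattern length), and returns `None` when no pattern-free sequence exists.

```python
def eliminate_patterns(seq, patterns, alphabet="ACGT"):
    """Return (cost, new_seq), where new_seq contains none of `patterns`
    and differs from `seq` in the fewest positions (cost), or None if
    no pattern-free sequence of the same length exists.

    Assumption: cost = number of substituted bases (a stand-in for
    "interference with the desired function")."""
    if any(p == "" for p in patterns):
        raise ValueError("patterns must be non-empty strings")
    if not patterns:
        return 0, seq
    k = max(len(p) for p in patterns)

    # layers[i] maps a state (the last min(i, k-1) emitted bases) to
    # (best cost so far, previous state, base emitted at step i).
    layers = [{"": (0, None, None)}]
    for original in seq:
        nxt = {}
        for suffix, (cost, _, _) in layers[-1].items():
            for base in alphabet:
                window = suffix + base
                # Any occurrence ending here lies within the last k bases.
                if any(window.endswith(p) for p in patterns):
                    continue
                new_cost = cost + (base != original)
                state = window[-(k - 1):] if k > 1 else ""
                if state not in nxt or new_cost < nxt[state][0]:
                    nxt[state] = (new_cost, suffix, base)
        if not nxt:
            return None  # every extension creates an unwanted site
        layers.append(nxt)

    # Pick the cheapest final state and walk the back-pointers.
    state = min(layers[-1], key=lambda s: layers[-1][s][0])
    cost = layers[-1][state][0]
    out = []
    for layer in reversed(layers[1:]):
        _, prev, base = layer[state]
        out.append(base)
        state = prev
    return cost, "".join(reversed(out))


# Usage: remove an EcoRI recognition site (GAATTC) from a short sequence.
result = eliminate_patterns("AAGAATTCAA", ["GAATTC"])
print(result)  # the cost component is 1; which base is changed depends on tie-breaking
```

The number of states in this sketch grows as 4^(k-1), so it is only practical for short patterns. A pattern-matching automaton such as Aho-Corasick, whose state count is linear in the total pattern length, would be a natural replacement for the suffix states in a more scalable version.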

Keywords: pattern elimination; pattern matching; string algorithms; synthetic DNA design.

MeSH terms

  • Algorithms*
  • Base Sequence
  • Binding Sites / genetics
  • Computational Biology
  • DNA* / chemistry
  • Protein Binding

Substances

  • DNA