SDBA: Score Domain-Based Attention for DNA N4-Methylcytosine Site Prediction from Multiperspectives

J Chem Inf Model. 2024 Apr 8;64(7):2839-2853. doi: 10.1021/acs.jcim.3c00688. Epub 2023 Aug 30.

Abstract

In DNA sequence classification tasks, choosing an appropriate encoding method is challenging. Some methods encode sequences using prior knowledge, which limits the model's ability to obtain multiperspective information from the sequences. We introduced SDBA (Score Domain-Based Attention), a new trainable ensemble method based on the attention mechanism. Unlike other methods, we fed task-independent encoding results into the models and dynamically ensembled features from different perspectives using the SDBA mechanism. This approach allows the model to acquire and weight sequence features autonomously. SDBA is conceptually general and empirically powerful. It achieved new state-of-the-art results on the benchmark data sets for DNA N4-methylcytosine site prediction.
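
The general idea of an attention-weighted ensemble over several "perspectives" can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the one-hot encoding, the function names (`one_hot`, `softmax`, `attention_ensemble`), the three example scores, and the fixed attention logits are all assumptions made for illustration. In a trainable model the logits would be learned rather than set by hand.

```python
import math

def one_hot(seq):
    """Assumed example of a task-independent encoding: one 4-dim indicator per base."""
    table = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
             "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}
    return [table[base] for base in seq.upper()]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_ensemble(scores, attn_logits):
    """Combine per-perspective scores using softmax attention weights.

    scores      -- one prediction score per sub-model (hypothetical values)
    attn_logits -- one unnormalized weight per sub-model; fixed here,
                   learned by gradient descent in a trainable ensemble
    """
    weights = softmax(attn_logits)
    combined = sum(w * s for w, s in zip(weights, scores))
    return combined, weights

# Hypothetical scores from three sub-models that each saw a different encoding.
scores = [0.9, 0.4, 0.7]
attn_logits = [0.0, 0.0, 0.0]  # equal logits give equal weights
combined, weights = attention_ensemble(scores, attn_logits)
```

With equal logits the ensemble reduces to a plain average of the scores; raising one logit shifts weight toward that perspective, which is the sense in which such an ensemble can weight features dynamically.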

MeSH terms

  • Cytosine* / analogs & derivatives
  • DNA* / chemistry

Substances

  • DNA
  • 1-methylcytosine
  • Cytosine