Deep6mAPred: A CNN and Bi-LSTM-based deep learning method for predicting DNA N6-methyladenosine sites across plant species

Methods. 2022 Aug:204:142-150. doi: 10.1016/j.ymeth.2022.04.011. Epub 2022 Apr 25.

Abstract

DNA N6-methyladenine (6mA) is a key DNA modification that plays versatile roles in cellular processes, including the regulation of gene expression, DNA repair, and DNA replication. DNA 6mA is closely associated with many diseases in mammals and with the growth and development of plants. Precise detection of DNA 6mA sites is therefore of great importance for exploring 6mA functions. Although many computational methods have been proposed for DNA 6mA prediction, a wide gap remains before they are suitable for practical application. We present Deep6mAPred, a deep learning method based on a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM), for predicting DNA 6mA sites across plant species. Deep6mAPred arranges the CNNs and the Bi-LSTMs in parallel rather than in series, and it employs an attention mechanism to improve the sequence representations. Deep6mAPred reached an accuracy of 0.9556 on the independent rice dataset, far outperforming the state-of-the-art methods. Tests across plant species showed that Deep6mAPred has a remarkable advantage over the state-of-the-art methods. We developed a user-friendly web application for DNA 6mA prediction, which is freely available at http://106.13.196.152:7001/ to all scientific researchers. Deep6mAPred enriches the tools available for predicting DNA 6mA sites and should speed up the exploration of DNA modification.
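The attention step mentioned in the abstract (listed as "feed-forward attention" in the keywords) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this assumes a common form of feed-forward attention: each position's hidden vector h_t is scored as e_t = tanh(w · h_t + b), the scores are normalized with a softmax, and the output is the weighted sum of the hidden vectors. The function name, the parameters `weights` and `bias`, and the toy inputs are hypothetical and are not taken from Deep6mAPred.

```python
import math


def feed_forward_attention(hidden, weights, bias=0.0):
    """Pool a sequence of hidden vectors into a single context vector.

    hidden  : list of T vectors, each a list of D floats
    weights : list of D floats (hypothetical learned scoring weights)
    bias    : float (hypothetical learned scoring bias)
    Returns (context, alphas): the pooled D-dim vector and the T attention weights.
    """
    # Score each position: e_t = tanh(w . h_t + b)  (assumed scoring function)
    scores = [
        math.tanh(sum(w * x for w, x in zip(weights, h)) + bias)
        for h in hidden
    ]
    # Softmax over positions; subtracting the max keeps exp() numerically stable
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Context vector: attention-weighted sum of the hidden vectors
    dim = len(hidden[0])
    context = [sum(a * h[d] for a, h in zip(alphas, hidden)) for d in range(dim)]
    return context, alphas


# Toy usage: three positions with 2-dimensional hidden states
context, alphas = feed_forward_attention(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights=[0.5, -0.5]
)
```

In a model of the kind the abstract describes, `hidden` would be the per-position outputs of a sequence layer, and `weights` and `bias` would be learned during training rather than fixed by hand.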

Keywords: 6mA; Convolutional neural network; DNA modification; Deep learning; Feed-forward attention; Long short-term memory.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenosine / analogs & derivatives
  • Adenosine / genetics
  • Adenosine / metabolism
  • Animals
  • DNA / metabolism
  • DNA Methylation*
  • Deep Learning*
  • Mammals / genetics

Substances

  • DNA
  • N-methyladenosine
  • Adenosine