GC6mA-Pred: A deep learning approach to identify DNA N6-methyladenine sites in the rice genome

Methods. 2022 Aug:204:14-21. doi: 10.1016/j.ymeth.2022.02.001. Epub 2022 Feb 9.

Abstract

Motivation: DNA N6-methyladenine (6mA) is a DNA modification that plays a key role in various biological processes. Accurate prediction of 6mA methylation sites is essential for understanding the mechanisms behind these processes. However, existing prediction methods extract information from only a single dimension, which is a limitation. It is therefore necessary to capture information about 6mA sites from several dimensions in order to establish a reliable prediction method.

Results: In this study, a neural-network-based bioinformatics model named GC6mA-Pred is proposed to predict N6-methyladenine modifications in DNA sequences. GC6mA-Pred extracts information at both the sequence level and the graph level. At the sequence level, GC6mA-Pred uses a three-layer convolutional neural network (CNN) to represent the sequence. At the graph level, GC6mA-Pred employs a graph neural network (GNN) to integrate the information contained in the chemical molecular formula corresponding to the DNA sequence. On our newly built dataset, GC6mA-Pred performs better than other existing models. Comparative experiments show that GC6mA-Pred is effective at accurately identifying DNA 6mA modifications.
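
As a rough illustration of the sequence-level branch only, the following minimal sketch one-hot encodes a DNA window and passes it through three stacked 1D convolutions. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-hot encoding, the kernel size of 3, the channel widths (16, 32, 64), the random untrained weights, the global max pooling, and the hypothetical helper names `one_hot`, `conv1d_relu` and `sequence_features`. The graph-level GNN branch is not covered.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) matrix; unknown bases such as N stay all-zero."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def conv1d_relu(x, w, b):
    """Valid 1D convolution followed by ReLU.

    x: (length, in_channels), w: (kernel, in_channels, out_channels), b: (out_channels,)
    """
    k = w.shape[0]
    n = x.shape[0] - k + 1
    out = np.empty((n, w.shape[2]), dtype=np.float32)
    for i in range(n):
        # Contract the kernel and input-channel axes of the current window with the filter bank.
        out[i] = np.tensordot(x[i:i + k], w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

def sequence_features(seq, seed=0, channels=(16, 32, 64), kernel=3):
    """Three stacked conv layers (assumed sizes, random weights), then global max pooling."""
    rng = np.random.default_rng(seed)
    x = one_hot(seq)
    in_ch = x.shape[1]
    for out_ch in channels:
        w = (0.1 * rng.standard_normal((kernel, in_ch, out_ch))).astype(np.float32)
        b = np.zeros(out_ch, dtype=np.float32)
        x = conv1d_relu(x, w, b)
        in_ch = out_ch
    return x.max(axis=0)  # one feature value per output channel

# Hypothetical 41-nt window with an adenine at the centre position.
window = "ACGT" * 10 + "A"
features = sequence_features(window)
print(features.shape)
```

In a trained model the convolution weights would be learned from labelled 6mA and non-6mA windows, and the pooled vector would be combined with the graph-level representation before classification; how GC6mA-Pred does this is not specified in the abstract.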

Keywords: Convolutional neural network; DNA N6-methyladenine; Deep learning; Graph neural network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenine / chemistry
  • DNA / genetics
  • DNA Methylation / genetics
  • Deep Learning*
  • Oryza* / genetics

Substances

  • DNA
  • Adenine