4mCPred-MTL: Accurate Identification of DNA 4mC Sites in Multiple Species Using Multi-Task Deep Learning Based on Multi-Head Attention Mechanism

Front Cell Dev Biol. 2021 May 10:9:664669. doi: 10.3389/fcell.2021.664669. eCollection 2021.

Abstract

DNA methylation is one of the most extensive epigenetic modifications. DNA 4mC modification plays a key role in regulating chromatin structure and gene expression. In this study, we proposed a generic 4mC computational predictor, namely, 4mCPred-MTL using multi-task learning coupled with Transformer to predict 4mC sites in multiple species. In this predictor, we utilize a multi-task learning framework, in which each task is to train species-specific data based on Transformer. Extensive experimental results show that our multi-task predictive model can significantly improve the performance of the model based on single task and outperform existing methods on benchmarking comparison. Moreover, we found that our model can sufficiently capture better characteristics of 4mC sites as compared to existing commonly used feature descriptors, demonstrating the strong feature learning ability of our model. Therefore, based on the above results, it can be expected that our 4mCPred-MTL can be a useful tool for research communities of interest.

Keywords: DNA 4mC modification; deep learning; epigenetics; feature sharing; multi-task learning; transformer.