DeepATT: a hybrid category attention neural network for identifying functional effects of DNA sequences

Brief Bioinform. 2021 May 20;22(3):bbaa159. doi: 10.1093/bib/bbaa159.

Abstract

Quantifying DNA properties is a challenging task in the broad field of human genomics. Since the vast majority of non-coding DNA is still poorly understood in terms of function, this task is particularly important to have enormous benefit for biology research. Various DNA sequences should have a great variety of representations, and specific functions may focus on corresponding features in the front part of learning model. Currently, however, for multi-class prediction of non-coding DNA regulatory functions, most powerful predictive models do not have appropriate feature extraction and selection approaches for specific functional effects, so that it is difficult to gain a better insight into their internal correlations. Hence, we design a category attention layer and category dense layer in order to select efficient features and distinguish different DNA functions. In this study, we propose a hybrid deep neural network method, called DeepATT, for identifying $919$ regulatory functions on nearly $5$ million DNA sequences. Our model has four built-in neural network constructions: convolution layer captures regulatory motifs, recurrent layer captures a regulatory grammar, category attention layer selects corresponding valid features for different functions and category dense layer classifies predictive labels with selected features of regulatory functions. Importantly, we compare our novel method, DeepATT, with existing outstanding prediction tools, DeepSEA and DanQ. DeepATT performs significantly better than other existing tools for identifying DNA functions, at least increasing $1.6\%$ area under precision recall. Furthermore, we can mine the important correlation among different DNA functions according to the category attention module. Moreover, our novel model can greatly reduce the number of parameters by the mechanism of attention and locally connected, on the basis of ensuring accuracy.

Keywords: DNA function; category attention; deep neural network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA / genetics*
  • Databases, Nucleic Acid*
  • Neural Networks, Computer*
  • Regulatory Sequences, Nucleic Acid*
  • Sequence Analysis, DNA*

Substances

  • DNA