A pseudo-Siamese framework for circRNA-RBP binding sites prediction integrating BiLSTM and soft attention mechanism

Methods. 2022 Nov:207:57-64. doi: 10.1016/j.ymeth.2022.09.003. Epub 2022 Sep 14.

Abstract

Circular RNAs (circRNAs) are widely expressed in tissues and play a key role in diseases by interacting with RNA-binding proteins (RBPs). Because traditional techniques are costly, computational methods have been developed to identify the binding sites between circRNAs and RBPs. Unfortunately, these methods suffer from insufficient feature learning and rely on a single classifier for their output. To address these limitations, we propose a novel method named circ-pSBLA, which constructs a pseudo-Siamese framework integrating a bi-directional long short-term memory (BiLSTM) network and a soft attention mechanism for circRNA-RBP binding site prediction. A softmax function and CatBoost are each adopted as classifiers within the pseudo-Siamese framework, and circ-pSBLA combines their outputs to produce the final prediction. To validate the effectiveness of circ-pSBLA, we compare it with other state-of-the-art methods and carry out an ablation experiment on 17 sub-datasets. Moreover, we perform motif analysis on 3 sub-datasets. The results show that circ-pSBLA outperforms the other methods. All supporting source code can be downloaded from https://github.com/gyj9811/circ-pSBLA.
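
The two ideas named in the abstract, soft attention over a sequence of hidden states and the fusion of two classifier outputs, can be illustrated with a minimal pure-Python sketch. This is not the circ-pSBLA implementation: the abstract does not say how the two outputs are combined, so the weighted average in `combine_branches`, the dot-product scoring against a `query` vector, and all function names and toy values are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_pool(hidden_states, query):
    """Soft attention: score each time step by its dot product with `query`,
    normalise the scores with softmax, and return the weighted sum of the
    hidden vectors together with the attention weights."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


def combine_branches(p_softmax, p_boost, alpha=0.5):
    """Hypothetical fusion rule (an assumption, not taken from the paper):
    a convex combination of the probabilities from the two branches."""
    return alpha * p_softmax + (1.0 - alpha) * p_boost


# Toy stand-ins for BiLSTM hidden states at three sequence positions.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # in a trained model this vector would be learned
context, weights = soft_attention_pool(hidden_states, query)

# Toy positive-class probabilities from a softmax branch and a boosting branch.
final_prob = combine_branches(0.8, 0.6)
```

In a trained model the hidden states would come from the BiLSTM and the second probability from a fitted CatBoost classifier; here both are fixed toy numbers so the sketch runs with the standard library alone.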

Keywords: BiLSTM; CatBoost; RBPs; circRNAs; pseudo-Siamese framework.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • RNA, Circular* / genetics
  • RNA-Binding Proteins* / metabolism
  • Software

Substances

  • RNA, Circular
  • RNA-Binding Proteins