circDeep: deep learning approach for circular RNA classification from other long non-coding RNA

Bioinformatics. 2020 Jan 1;36(1):73-80. doi: 10.1093/bioinformatics/btz537.

Abstract

Motivation: Over the past two decades, a circular form of RNA (circular RNA), produced through alternative splicing, has become a focus of scientific study because of its major role as a modulator of microRNA (miRNA) activity and its association with various diseases, including cancer. Detecting circular RNAs is therefore vital to understanding their biogenesis and purpose. Circular RNA can be predicted in three steps: distinguishing non-coding RNAs from protein-coding gene transcripts, separating short from long non-coding RNAs, and distinguishing circular RNAs from other long non-coding RNAs (lncRNAs). However, because this last classification is difficult, the available tools are less than 80 percent accurate at distinguishing circular RNAs from other lncRNAs. A faster and more accurate machine learning method for identifying circular RNAs, one that considers their specific features, is therefore essential to the development of systematic annotation.

Results: Here we present an end-to-end deep learning framework, circDeep, to distinguish circular RNAs from other lncRNAs. circDeep fuses an RCM descriptor, an ACNN-BLSTM sequence descriptor and a conservation descriptor into high-level abstraction descriptors, in which the shared representations across the different modalities are integrated. Experiments show that circDeep is not only faster than existing tools but also more accurate, achieving a 12 percent increase in accuracy over the other tools.
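
The fusion idea described in the Results can be illustrated with a minimal sketch. It is not the circDeep implementation: the descriptor sizes, layer widths, random untrained weights, and the names `init`, `dense` and `fuse` are all assumptions made for illustration. Each of the three descriptors is projected by its own layer, the projections are concatenated, and a shared layer integrates them before a single score is produced.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible


def init(in_dim, out_dim):
    """Random weights and zero biases for one fully connected layer (untrained)."""
    w = [[random.gauss(0, 0.1) for _ in range(out_dim)] for _ in range(in_dim)]
    return w, [0.0] * out_dim


def dense(x, layer, relu=True):
    """Fully connected layer; applies ReLU unless relu=False."""
    w, b = layer
    out = [sum(xi * w[i][j] for i, xi in enumerate(x)) + b[j]
           for j in range(len(b))]
    return [max(0.0, v) for v in out] if relu else out


def fuse(rcm, seq, cons, params):
    """Project each descriptor, concatenate, integrate, and return a score in (0, 1)."""
    joint = (dense(rcm, params["rcm"])
             + dense(seq, params["seq"])
             + dense(cons, params["cons"]))      # list concatenation
    shared = dense(joint, params["shared"])      # shared cross-modality layer
    (logit,) = dense(shared, params["out"], relu=False)
    return 1.0 / (1.0 + math.exp(-logit))        # sigmoid


# Hypothetical descriptor sizes (assumptions, not taken from the paper).
RCM_DIM, SEQ_DIM, CONS_DIM, HIDDEN = 40, 128, 23, 16
params = {
    "rcm": init(RCM_DIM, HIDDEN),
    "seq": init(SEQ_DIM, HIDDEN),
    "cons": init(CONS_DIM, HIDDEN),
    "shared": init(3 * HIDDEN, 8),
    "out": init(8, 1),
}

# Random stand-ins for the three descriptors of one transcript.
rcm = [random.random() for _ in range(RCM_DIM)]
seq = [random.random() for _ in range(SEQ_DIM)]
cons = [random.random() for _ in range(CONS_DIM)]
score = fuse(rcm, seq, cons, params)
```

With trained weights, a score above a chosen threshold would mark a transcript as a candidate circular RNA; here the weights are random, so the score itself carries no meaning.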

Availability and implementation: https://github.com/UofLBioinformatics/circDeep.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology* / methods
  • Deep Learning*
  • RNA, Circular* / classification
  • RNA, Circular* / genetics
  • RNA, Long Noncoding* / genetics
  • Reproducibility of Results

Substances

  • RNA, Circular
  • RNA, Long Noncoding