Towards a comprehensive pipeline to identify and functionally annotate long noncoding RNA (lncRNA)

Comput Biol Med. 2020 Dec:127:104028. doi: 10.1016/j.compbiomed.2020.104028. Epub 2020 Oct 13.

Abstract

Long noncoding RNAs (lncRNAs) are implicated in various genetic diseases and cancer, owing to their critical role in gene regulation. They are a divergent group of RNAs that are easily differentiated from other RNA types by their unique characteristics, functions, and mechanisms of action. In this review, we list some of the prominent data repositories containing lncRNAs, their interactomes, and their predicted and validated disease associations. Next, we discuss the wet-lab experiments used to obtain the data for these repositories. We also critically review the in silico methods available for lncRNA identification and suggest techniques to further improve their performance. Most current methods focus on distinguishing lncRNA transcripts from coding ones. Functional annotation of these transcripts remains a grey area, and more effort is needed there. Finally, we detail current progress, discuss impediments, and outline a roadmap for developing a generalized computational pipeline for comprehensive annotation of lncRNAs, which is essential to accelerate research in this area.

Keywords: ANN; Bioinformatics; Epigenomics; Gene regulation; Machine learning; Noncoding RNA; lncRNA.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Gene Expression Regulation
  • Humans
  • Molecular Sequence Annotation
  • Neoplasms* / genetics
  • RNA, Long Noncoding* / genetics

Substances

  • RNA, Long Noncoding