Integrative approaches for analysis of mRNA and microRNA high-throughput data

Comput Struct Biotechnol J. 2021 Jan 26:19:1154-1162. doi: 10.1016/j.csbj.2021.01.029. eCollection 2021.

Abstract

Advanced sequencing technologies such as RNASeq produce massive amounts of data, including transcriptome-wide expression levels of coding RNAs (mRNAs) and non-coding RNAs such as miRNAs, lncRNAs, piRNAs and many other RNA species. In silico analysis of datasets representing only one RNA species is well established, and a variety of tools and pipelines are available. However, attaining a more systematic view of how different players come together to regulate the expression of a gene or a group of genes requires a more intricate approach to data analysis. To fully understand complex transcriptional networks, datasets representing different RNA species need to be integrated. In this review, we focus on miRNAs as key post-transcriptional regulators, summarizing current computational approaches for miRNA:target gene prediction as well as new data-driven methods for comprehensively and accurately dissecting miRNome-targetome interactions.

Keywords: CCA, canonical correlation analysis; CDS, coding sequence; CLASH, cross-linking, ligation and sequencing of hybrids; CLIP, cross-linking immunoprecipitation; CNN, convolutional neural network; Data integration; GO, gene ontology; ICA, independent component analysis; Matrix factorization; NGS, next-generation sequencing; NMF, non-negative matrix factorization; PCA, principal component analysis; RNASeq, high-throughput RNA sequencing; TDMD, target RNA-directed miRNA degradation; TF, transcription factors; Target prediction; Transcriptomics; circRNA, circular RNA; lncRNA, long non-coding RNA; mRNA, messenger RNA; miRNA, microRNA; microRNA.

Publication types

  • Review