Enhancing Epitranscriptome Module Detection from m6A-Seq Data Using Threshold-Based Measurement Weighting Strategy

Biomed Res Int. 2018 Jun 14:2018:2075173. doi: 10.1155/2018/2075173. eCollection 2018.

Abstract

To date, well over 100 types of RNA modification, associated with various molecular functions, have been identified on diverse types of RNA molecules, and the epitranscriptome has emerged as an important layer of gene expression regulation. It is of crucial importance and increasing interest to understand, from a global perspective, how the epitranscriptome is regulated to facilitate different biological functions. This understanding may be advanced by finding biologically meaningful epitranscriptome modules that respond to upstream epitranscriptome regulators and lead to downstream biological functions. However, owing to the intrinsic properties of RNA molecules, RNA modifications, and the relevant sequencing techniques, the epitranscriptome profiled by high-throughput sequencing often suffers from various artifacts, which jeopardize the effectiveness of epitranscriptome module identification with conventional approaches. To solve this problem, we developed a convenient measurement weighting strategy that can largely tolerate the artifacts of high-throughput sequencing data. We demonstrated on real data that the proposed measurement weighting strategy improves epitranscriptome module discovery in terms of both module accuracy and biological significance. Although the new approach is integrated here with the Euclidean distance in a hierarchical clustering scenario, it has great potential to be extended to other distance measures and algorithms for addressing various tasks in epitranscriptome analysis. Additionally, we show for the first time, with rigorous statistical analysis, that the epitranscriptome modules are biologically meaningful, with different GO functions enriched. This establishes the functional basis of epitranscriptome modules and fulfills a key prerequisite for functional characterization and deciphering of the epitranscriptome and its regulation.
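
The abstract does not spell out the weighting rule, so the following is only a minimal sketch of how a threshold-based measurement weight could be combined with the Euclidean distance in hierarchical clustering. It assumes, purely for illustration, that each measurement (one site in one sample) gets weight 1 when its read coverage reaches a threshold and weight 0 otherwise, that the weight of a sample in a pairwise distance is the product of the two sites' weights, and that average linkage is used. The function names, the toy data, and the threshold of 10 reads are all hypothetical and are not taken from the paper.

```python
import math

def threshold_weights(coverage, threshold):
    """Assumed rule: weight 1.0 if coverage >= threshold, else 0.0."""
    return [[1.0 if c >= threshold else 0.0 for c in row] for row in coverage]

def weighted_euclidean(x, y, wx, wy):
    """Weighted Euclidean distance between two sites.

    Each sample's weight is the product of the two sites' weights, and the
    sum is rescaled so that pairs sharing fewer reliable samples stay
    comparable with pairs sharing all of them.
    """
    w = [a * b for a, b in zip(wx, wy)]
    total = sum(w)
    if total == 0:
        return math.inf  # no sample is reliable for both sites
    sq = sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))
    return math.sqrt(sq * len(x) / total)

def cluster_sites(levels, weights, n_modules):
    """Naive average-linkage agglomerative clustering on weighted distances."""
    n = len(levels)
    dist = [[weighted_euclidean(levels[i], levels[j], weights[i], weights[j])
             for j in range(n)] for i in range(n)]
    modules = [[i] for i in range(n)]
    while len(modules) > n_modules:
        best = None
        for a in range(len(modules)):
            for b in range(a + 1, len(modules)):
                pairs = len(modules[a]) * len(modules[b])
                d = sum(dist[i][j] for i in modules[a] for j in modules[b]) / pairs
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        modules[a] = modules[a] + modules[b]  # merge the closest pair
        del modules[b]
    return [sorted(m) for m in modules]

# Toy methylation levels (rows: sites, columns: samples); values are made up.
levels = [
    [0.9, 0.8, 0.1, 0.5],
    [0.9, 0.8, 0.1, 0.0],  # last value comes from a low-coverage measurement
    [0.1, 0.2, 0.9, 0.5],
    [0.1, 0.2, 0.9, 1.0],  # last value comes from a low-coverage measurement
]
coverage = [
    [50, 60, 40, 30],
    [55, 45, 35, 2],
    [40, 50, 60, 30],
    [45, 40, 50, 3],
]

weights = threshold_weights(coverage, threshold=10)
modules = cluster_sites(levels, weights, n_modules=2)
print(modules)  # prints [[0, 1], [2, 3]]
```

In this toy example the low-coverage measurements no longer contribute to the distance between sites 0 and 1 or between sites 2 and 3, so each pair is treated as identical over its reliable samples. A graded weight instead of a 0/1 cutoff, or a different linkage, would fit the same structure.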

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Gene Expression Regulation*
  • High-Throughput Nucleotide Sequencing*
  • RNA / metabolism*
  • Reproducibility of Results

Substances

  • RNA