Machine learning applications in RNA modification sites prediction

Comput Struct Biotechnol J. 2021 Sep 29:19:5510-5524. doi: 10.1016/j.csbj.2021.09.025. eCollection 2021.

Abstract

Ribonucleic acid (RNA) modifications are post-transcriptional changes in chemical composition that play a fundamental role in regulating key aspects of RNA function. Large transcriptomic datasets have recently become available thanks to advances in deep sequencing and large-scale profiling. This availability has led to increased use of machine learning-based approaches in epitranscriptomics, particularly for identifying RNA modifications. In this review, we comprehensively explore machine learning-based approaches used to predict 11 RNA modification types, namely m1A, m6A, m5C, 5hmC, ψ, 2'-O-Me, ac4C, m7G, A-to-I, m2G, and D. The review covers the life cycle of machine learning methods for predicting RNA modification sites, including available benchmark datasets, feature extraction, and classification algorithms. We compare available methods in terms of datasets, target species, approach, and accuracy for each RNA modification type. Finally, we discuss the advantages and limitations of the reviewed approaches and suggest future perspectives.

Keywords: Deep learning; Feature extraction; Machine learning; Prediction; RNA modification.

Publication types

  • Review