Stacking-ac4C: an ensemble model using mixed features for identifying N4-acetylcytidine in mRNA

Front Immunol. 2023 Nov 29:14:1267755. doi: 10.3389/fimmu.2023.1267755. eCollection 2023.

Abstract

N4-acetylcytidine (ac4C) is a modification of cytidine at the nitrogen-4 position that plays a significant role in mRNA translation. However, the precise mechanism by which ac4C affects translated mRNA remains unclear. Because identifying ac4C sites with conventional experimental methods is both labor-intensive and time-consuming, there is an urgent need for a method that can recognize ac4C sites promptly. In this paper, we propose a comprehensive ensemble learning model, the Stacking-based heterogeneous integrated ac4C model, engineered explicitly to identify ac4C sites. The model integrates three distinct feature extraction methods: Kmer, electron-ion interaction pseudo-potential values (PseEIIP), and pseudo-K-tuple nucleotide composition (PseKNC). It also incorporates the Cluster Centroids algorithm to improve its handling of imbalanced data and to alleviate underfitting. In independent testing, the proposed model improves the MCC by 15.61% and the ROC by 5.97% compared to existing models. To test the model's adaptability, we also used a balanced dataset assembled by the authors of iRNA-ac4C. On this balanced dataset, our model showed an increase in Sn of 4.1%, an increase in Acc of nearly 1%, and an ROC improvement of 0.35%. The code for our model is freely accessible at https://github.com/louliliang/ST-ac4C.git, allowing users to quickly build their own models without dealing with complicated mathematical equations.
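
As an illustration of the simplest of the three feature types named above, the following is a minimal sketch of Kmer encoding, assuming normalized k-mer frequencies over the RNA alphabet. The function name `kmer_features`, the default `k=3`, and the handling of ambiguous bases are assumptions made for illustration; the abstract does not state the settings used by Stacking-ac4C.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is mapped to "U" below


def kmer_features(seq, k=3):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Hypothetical helper for illustration only; k and the normalization
    are assumptions, not the published configuration.
    """
    seq = seq.upper().replace("T", "U")
    # All 4**k possible k-mers, ordered AA..., AC..., ..., UU...
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. "N") are ignored
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Each sequence then becomes a vector of length 4^k whose entries sum to 1, which could be concatenated with other descriptors before being passed to a classifier.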

Keywords: Cluster Centroids algorithm; N4-acetylcytidine; ensemble model; feature extraction; stacking heterogeneous integration.

Publication types

  • Research Support, Non-U.S. Gov't
  • Comment

MeSH terms

  • Algorithms
  • Cytidine* / genetics
  • Nucleotides*
  • RNA, Messenger / genetics

Substances

  • N-acetylcytidine
  • RNA, Messenger
  • Cytidine
  • Nucleotides

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by grants from the National Natural Science Foundation of China (Nos. 62162032, 62062043, 32270789) and the Scientific Research Plan of the Department of Education of Jiangxi Province, China (GJJ2201004, GJJ2201038).