In search of the ratio of miRNA expression as robust biomarkers for constructing stable diagnostic models among multi-center data

Cuidie Ma; Yonghao Zhang; Rui Ding; Han Chen; Xudong Wu; Lida Xu; Changyuan Yu

doi:10.3389/fgene.2024.1381917

Front Genet. 2024 Apr 30:15:1381917. doi: 10.3389/fgene.2024.1381917. eCollection 2024.

Authors

Cuidie Ma^#¹, Yonghao Zhang^#¹, Rui Ding², Han Chen³, Xudong Wu¹, Lida Xu⁴, Changyuan Yu¹

Affiliations

¹ College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China.
² State Key Laboratory of Complex Severe and Rare Diseases, Department of Laboratory Medicine, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China.
³ Shenyang Medical College, Shenyang, China.
⁴ Beijing Hotgen Biotech Co., Ltd., Beijing, China.

^# Contributed equally.

Abstract

MicroRNAs (miRNAs) are promising biomarkers for the early detection of disease, and many miRNA-based diagnostic models have been constructed to distinguish patients from healthy individuals. To make full use of miRNA-profiling data from different sequencing platforms or multiple centers, models that account for batch effects are needed so that they generalize to medical applications. We conducted transcription factor (TF)-mediated miRNA-miRNA interaction network analysis and adopted the within-sample expression ratios of miRNA pairs as predictive markers. The ratio of the expression values within each miRNA pair proved stable across multiple data sources. A genetic algorithm-based classifier was constructed to quantify risk scores for the probability of disease and to discriminate disease states from normal states in discovery and validation datasets for COVID-19, renal cell carcinoma, and lung adenocarcinoma. The predictive models based on the expression ratios of interacting miRNA pairs performed well in both the discovery and validation datasets, and the classifier may be useful for the accurate early detection of disease.
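
As an illustration of why within-sample ratios can be stable across data sources, the following minimal Python sketch assumes that a batch effect acts as a per-sample multiplicative scaling (an assumption made here for illustration, not a claim from the article). Under that assumption, the scaling factor cancels in the ratio of any two miRNAs measured in the same sample. The function name `pair_log_ratios`, the simulated data, and the chosen pairs are all hypothetical; the sketch does not reproduce the TF-mediated network analysis or the genetic algorithm-based classifier.

```python
import numpy as np


def pair_log_ratios(expr, pairs):
    """Return log2 expression ratios for the given miRNA pairs.

    expr  : array of shape (n_samples, n_mirnas), strictly positive values
    pairs : list of (i, j) column-index pairs (hypothetical miRNA pairs)
    """
    expr = np.asarray(expr, dtype=float)
    i, j = np.array(pairs).T
    # log2(a / b) computed as a difference of logs for numerical stability
    return np.log2(expr[:, i]) - np.log2(expr[:, j])


rng = np.random.default_rng(0)

# Simulated "true" expression for 6 samples and 4 miRNAs
true_expr = rng.lognormal(mean=5.0, sigma=1.0, size=(6, 4))

# Assumed batch effect: each center rescales every sample by its own factor
center_a = true_expr * rng.uniform(0.5, 2.0, size=(6, 1))
center_b = true_expr * rng.uniform(2.0, 8.0, size=(6, 1))

pairs = [(0, 1), (2, 3), (0, 3)]
ratios_a = pair_log_ratios(center_a, pairs)
ratios_b = pair_log_ratios(center_b, pairs)

# Raw values differ between centers, but the within-sample ratios agree
raw_match = np.allclose(center_a, center_b)
ratios_match = np.allclose(ratios_a, ratios_b)
print("raw values match:", raw_match)
print("pair ratios match:", ratios_match)
```

Real batch effects are rarely purely multiplicative per sample, so the cancellation shown here is exact only under the stated assumption; in practice the ratios would be expected to be more stable than raw values rather than identical.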

Keywords: batch effect; biomarker; disease classifications; microRNA interactions; multi-center data.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by the National Natural Science Foundation of China (Grant number 82174531).