Assessing the Effectiveness of Direct Data Merging Strategy in Long-Term and Large-Scale Pharmacometabonomics

Xuejiao Cui; Qingxia Yang; Bo Li; Jing Tang; Xiaoyu Zhang; Shuang Li; Fengcheng Li; Jie Hu; Yan Lou; Yunqing Qiu; Weiwei Xue; Feng Zhu

doi:10.3389/fphar.2019.00127

Front Pharmacol. 2019 Feb 20:10:127. doi: 10.3389/fphar.2019.00127. eCollection 2019.

Authors

Xuejiao Cui¹,², Qingxia Yang¹,², Bo Li², Jing Tang¹,², Xiaoyu Zhang¹,², Shuang Li¹,², Fengcheng Li¹, Jie Hu³, Yan Lou⁴, Yunqing Qiu⁴, Weiwei Xue², Feng Zhu¹,²

Affiliations

¹ College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.
² School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China.
³ School of International Studies, Zhejiang University, Hangzhou, China.
⁴ Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, China.

Abstract

Because clinical data collection extends over long periods and the number of analyzed samples is large, long-term and large-scale pharmacometabonomic profiling is frequently encountered in drug/target discovery and in the guidance of personalized medicine. To date, integrating the results (ReIn) of multiple experiments in large-scale metabolomic profiling has become a widely used strategy for enhancing the reliability and robustness of analytical results. Direct data merging (DiMe) among experiments has also been proposed as a way to increase statistical power, reduce experimental bias, enhance reproducibility, and improve overall biological understanding. Compared with ReIn, however, DiMe has not yet been widely adopted in current metabolomics studies, owing to the difficulty of removing unwanted variation and the lack of prior knowledge about the performance of the available merging methods. It is therefore urgent to clarify whether DiMe can enhance the performance of metabolic profiling. Herein, the performance of DiMe on 4 pairs of benchmark datasets was comprehensively assessed using multiple criteria (classification capacity, robustness, and false discovery rate). Both integration/merging-based strategies (ReIn and DiMe) were found to perform better under all criteria than strategies based on a single experiment. Moreover, DiMe outperformed ReIn in classification capacity and robustness, whereas ReIn was superior in controlling the false discovery rate. In conclusion, these findings provide valuable guidance for selecting a suitable analytical strategy in current metabolomics.

Keywords: classification capacity; direct data merging; false discovery rate; long-term and large-scale metabolomics; robustness.
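
The distinction the abstract draws between ReIn and DiMe can be illustrated with a minimal Python sketch. It is not the authors' pipeline. Per-experiment mean-centering stands in for the removal of unwanted variation, a Welch t statistic stands in for marker selection, and the data are synthetic. All function and variable names here are hypothetical.

```python
import numpy as np

def center_per_experiment(X):
    # Assumption: simple mean-centering as a stand-in for batch-effect removal.
    return X - X.mean(axis=0, keepdims=True)

def t_scores(X, y):
    # Welch t statistic per feature, comparing class 1 against class 0.
    g1, g0 = X[y == 1], X[y == 0]
    se = np.sqrt(g1.var(axis=0, ddof=1) / len(g1) + g0.var(axis=0, ddof=1) / len(g0))
    return (g1.mean(axis=0) - g0.mean(axis=0)) / se

def top_k(scores, k):
    # Indices of the k features with the largest absolute score.
    return set(np.argsort(-np.abs(scores))[:k].tolist())

def direct_merge(X_a, X_b):
    # DiMe-style: correct each experiment, then stack the samples into one matrix.
    return np.vstack([center_per_experiment(X_a), center_per_experiment(X_b)])

def integrate_results(X_a, y_a, X_b, y_b, k):
    # ReIn-style: analyze each experiment separately, keep features found in both.
    return top_k(t_scores(X_a, y_a), k) & top_k(t_scores(X_b, y_b), k)

# Synthetic demo data (not from the paper): two experiments, 20 samples x 10 features.
rng = np.random.default_rng(0)
y = np.array([0] * 10 + [1] * 10)
X_a = rng.normal(size=(20, 10))
X_b = rng.normal(size=(20, 10)) + 3.0   # constant offset mimics a batch effect
X_a[y == 1, 0] += 5.0                   # feature 0 differs between classes
X_b[y == 1, 0] += 5.0

merged = direct_merge(X_a, X_b)
y_merged = np.concatenate([y, y])
dime_features = top_k(t_scores(merged, y_merged), k=3)
rein_features = integrate_results(X_a, y, X_b, y, k=3)
```

In this sketch the merged matrix feeds one analysis on twice as many samples, whereas the ReIn-style function only combines the outputs of two separate analyses. Mean-centering is a reasonable stand-in here only because both synthetic experiments have balanced classes; real studies would need a dedicated correction method.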