Non-negative matrix factorization combined with kernel regression for the prediction of adverse drug reaction profiles

Bioinform Adv. 2024 Jan 23;4(1):vbae009. doi: 10.1093/bioadv/vbae009. eCollection 2024.

Abstract

Motivation: Unexpected post-market Adverse Drug Reactions (ADRs) carry significant costs, both financial and in human health. Because clinical trials are expensive and time-consuming, there is significant interest in accurate computational methods that can aid in predicting ADRs for new drugs. As a machine learning task, ADR prediction is made more challenging by a high degree of class imbalance, and existing methods do not successfully balance two requirements: detecting the minority cases (true positives for ADR), as measured by the Area Under the Precision-Recall (AUPR) curve, and separating true positives from true negatives, as measured by the Area Under the Receiver Operating Characteristic (AUROC) curve. Surprisingly, most existing methods perform worse than a naïve method that attributes ADRs to drugs according to the frequency with which each ADR has been observed across all other drugs. The more advanced existing methods do not yield substantial gains in predictive performance.
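
To make the naïve baseline and the two metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a binary drug-by-ADR matrix stored as nested lists; the function names and the toy data are hypothetical and chosen only for illustration. The AUPR is approximated here by average precision, which is one common way to summarize the precision-recall curve.

```python
def naive_frequency_scores(train_matrix):
    """Score each ADR by the fraction of training drugs that exhibit it."""
    n_drugs = len(train_matrix)
    n_adrs = len(train_matrix[0])
    return [sum(row[j] for row in train_matrix) / n_drugs for j in range(n_adrs)]


def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive.

    Tied scores keep their input order, so results can depend on ordering
    when ties are present.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 4 known drugs (rows) x 3 ADRs (columns).
train = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
new_drug_truth = [0, 1, 0]  # assumed true ADR profile of an unseen drug

baseline = naive_frequency_scores(train)          # same scores for every new drug
roc = auroc(new_drug_truth, baseline)
ap = average_precision(new_drug_truth, baseline)
print(baseline, roc, ap)
```

Because the baseline ignores all drug features, it assigns the same ranking of ADRs to every new drug, so any feature-based method would be expected to beat it if the features are informative.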

Results: We designed a rigorous evaluation to provide an unbiased estimate of the performance of ADR prediction methods, adopting Nested Cross-Validation and a hold-out set. Among the existing methods, Kernel Regression (KR) performed best in AUPR but was at a disadvantage in AUROC relative to other methods, including the naïve method. We proposed a novel method, called VKR, that combines non-negative matrix factorization with kernel regression. This approach matched or exceeded the performance of existing methods, overcoming their weakness.
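
The evaluation design described above can be sketched generically as follows. This is an illustrative outline under stated assumptions, not the protocol used in the paper: the fold counts, the 20% hold-out fraction, the parameter grid, and the `toy_fit_and_score` callable are all hypothetical placeholders. The point it shows is that hyperparameters are chosen only on inner folds, so the outer-fold scores are not biased by tuning, and the hold-out drugs are never touched during model selection.

```python
import random


def k_fold_indices(indices, k):
    """Split a list of indices into k disjoint folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def nested_cv(indices, param_grid, fit_and_score, outer_k=5, inner_k=3):
    """Return one score per outer fold, tuning parameters on inner folds only."""
    outer_folds = k_fold_indices(indices, outer_k)
    outer_scores = []
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k)

        def inner_score(param):
            # Mean validation score of one parameter over the inner folds.
            scores = []
            for m, inner_val in enumerate(inner_folds):
                inner_train = [idx for n, fold in enumerate(inner_folds)
                               if n != m for idx in fold]
                scores.append(fit_and_score(inner_train, inner_val, param))
            return sum(scores) / len(scores)

        best_param = max(param_grid, key=inner_score)
        outer_scores.append(fit_and_score(outer_train, outer_test, best_param))
    return outer_scores


# Hypothetical setup: 100 drugs, 20 reserved as a hold-out set.
rng = random.Random(0)
drug_ids = list(range(100))
rng.shuffle(drug_ids)
holdout, development = drug_ids[:20], drug_ids[20:]


def toy_fit_and_score(train_idx, test_idx, param):
    # Placeholder for "fit a model on train_idx, score it on test_idx".
    # It ignores the data and simply peaks at param == 0.5.
    return 1.0 - abs(param - 0.5)


scores = nested_cv(development, [0.1, 0.5, 0.9], toy_fit_and_score)
print(len(scores), scores)
```

In a real experiment, `toy_fit_and_score` would be replaced by a function that trains the chosen model on the training drugs and returns a metric such as AUPR or AUROC on the validation drugs; the final model would then be assessed once on `holdout`.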

Availability: Code and data are available at https://github.com/YezhaoZhong/VKR.