Development of a multivariate prediction model to identify individual case safety reports which require clinical review

Helen R Gosselt; Elizabeth A Bazelmans; Thomas Lieber; Florence P A M van Hunsel; Linda Härmark

doi:10.1002/pds.5553

Pharmacoepidemiol Drug Saf. 2022 Dec;31(12):1300-1307. doi: 10.1002/pds.5553. Epub 2022 Oct 21.

Authors

Helen R Gosselt¹, Elizabeth A Bazelmans¹, Thomas Lieber¹, Florence P A M van Hunsel¹, Linda Härmark¹

Affiliation

¹ Netherlands Pharmacovigilance Centre Lareb, 's-Hertogenbosch, The Netherlands.

PMID: 36251280
DOI: 10.1002/pds.5553

Abstract

Background: The number of Individual Case Safety Reports (ICSRs) in pharmacovigilance databases is rapidly increasing worldwide. The majority of ICSRs at the Netherlands Pharmacovigilance Centre Lareb are reviewed manually to identify potential signal triggering reports (PSTRs) or ICSRs which need further clinical assessment for other reasons.

Objectives: To develop a prediction model to identify ICSRs that require clinical review, including PSTRs. Secondly, to identify the most important features of these reports.

Methods: All ICSRs (n = 30 424) received by Lareb between October 1, 2017 and February 26, 2021 were included. ICSRs originating from marketing authorisation holders and ICSRs reported on vaccines were excluded. The outcome was defined as PSTR (yes/no), where PSTR 'yes' was defined as an ICSR discussed at a signal detection meeting. Nineteen features were included, concerning structured information on patients, adverse drug reactions (ADRs) or drugs. Data were divided into a training set (70%) and a test set (30%) using a stratified split to maintain the PSTR/no PSTR ratio. Logistic regression, elastic net logistic regression and eXtreme Gradient Boosting models were trained and tuned on the training set. Random down-sampling of negative controls was applied to the training set to adjust for the imbalanced dataset. Final models were evaluated on the test set. Model performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC) with 95% confidence interval, and specificity and precision were assessed at the threshold giving perfect sensitivity (100%, so as not to miss any PSTRs). Feature importance plots were inspected, and a selection of features was used to re-train the models and test their performance with fewer features.
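
The data-handling steps described in the Methods can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the fixed random seeds, and the 1:1 down-sampling ratio are assumptions (the abstract does not state the ratio used), and model fitting itself is left out.

```python
import random


def stratified_split(labels, test_frac=0.30, seed=0):
    """Split indices into train/test while keeping the PSTR/no-PSTR ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in (0, 1):  # split each class separately to preserve the ratio
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def downsample_negatives(train_idx, labels, seed=0):
    """Randomly drop negatives from the training set.

    Assumption: negatives are reduced to a 1:1 ratio with positives.
    """
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    kept = rng.sample(neg, min(len(pos), len(neg)))
    return sorted(pos + kept)


def metrics_at_full_sensitivity(scores, labels):
    """Use the highest threshold that still flags every positive.

    Returns (threshold, specificity, precision) at 100% sensitivity.
    Requires at least one positive and one negative label.
    """
    threshold = min(s for s, y in zip(scores, labels) if y == 1)
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return threshold, tn / (tn + fp), tp / (tp + fp)
```

Down-sampling is applied only to the training indices, so the test set keeps the original class imbalance and the reported specificity and precision reflect the real-world ratio of PSTRs to other reports.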

Results: Of all reports, 1439 (4.7%) were PSTRs. All three models performed comparably, with a highest AUC of 0.75 (95% CI 0.73-0.77). Although overall model performance was moderate, specificity (5%) and precision (5%) at the 100% sensitivity threshold were low. The most important features were: 'absence of ADR in the Summary of product characteristics', 'ADR reported as serious', 'ADR labelled as an important medical event', 'ADR reported by physician' and 'positive rechallenge'. Model performance was similar when using only nine of the most important features.
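
The reported precision follows almost directly from the other figures. The short check below is an illustrative calculation, not part of the original analysis; it assumes that the test-set prevalence equals the overall 1439/30 424 (which the stratified split should approximately ensure) and uses the rounded specificity of 5%. The function name is hypothetical.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Precision (positive predictive value) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumption: test-set prevalence matches the overall PSTR proportion.
prevalence = 1439 / 30424
precision = precision_from_rates(1.0, 0.05, prevalence)
print(f"precision at 100% sensitivity: {precision:.1%}")
```

Under these assumptions the precision comes out at about 5%, barely above the 4.7% prevalence: at this threshold nearly every report is flagged, so the model removes very little of the manual review workload.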

Conclusions: We developed a prediction model with moderate performance to identify PSTRs using nine commonly available features. Optimisation of the model using more ICSR information (e.g., free text fields) to increase model precision is required before implementation.

Keywords: adverse drug reaction; pharmacovigilance; prediction model; supervised machine learning.

Publication types

Review

MeSH terms

Adverse Drug Reaction Reporting Systems*
Databases, Factual
Drug-Related Side Effects and Adverse Reactions* / epidemiology
Drug-Related Side Effects and Adverse Reactions* / etiology
Humans
Pharmacovigilance
ROC Curve