Automated Classification of Quality Defect Issues Relating to Substandard Medicines Using a Hybrid Machine Learning and Rule-Based Approach

Drug Saf. 2023 Oct;46(10):975-989. doi: 10.1007/s40264-023-01339-8. Epub 2023 Sep 30.

Abstract

Background and objective: Substandard medicines can lead to serious safety issues affecting public health; however, the nature of such issues can be widely heterogeneous. Health product regulators seek to prioritise critical product quality defects for review to ensure that prompt risk mitigation measures are taken. This study aims to classify the nature of issues for substandard medicines using machine learning to augment a risk-based and timely review of cases.

Methods: A machine learning algorithm combined with a keyword-based model was developed to classify quality issues using text relating to substandard medicines (CISTERM). The nature of the issue in each product defect case was classified based on Medical Dictionary for Regulatory Activities-Health Sciences Authority (MedDRA-HSA) lowest-level terms.
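The abstract does not say which learning algorithm was used or how its output was merged with the keyword model. As a minimal sketch of one plausible arrangement, the Python below lets keyword rules take precedence and falls back to a small Naive Bayes text classifier when no rule fires. The rule patterns, the `NaiveBayes` class and the `classify` function are hypothetical illustrations and not the CISTERM implementation; only the category labels are taken from the abstract.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical keyword rules. The labels appear in the abstract;
# the patterns are illustrative assumptions only.
KEYWORD_RULES = {
    "Product adulterated and/or contains prohibited substance": re.compile(
        r"\b(adulterat\w*|prohibited substance)\b", re.I
    ),
    "Out of specification or out of trend test result": re.compile(
        r"\bout[- ]of[- ](specification|trend)\b|\boos\b|\boot\b", re.I
    ),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing.

    A stand-in for the unspecified machine learning model.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify(text, model):
    """Keyword rules take precedence; the model handles all other text."""
    for label, pattern in KEYWORD_RULES.items():
        if pattern.search(text):
            return label
    return model.predict(text)
```

Other combinations are equally plausible, for example applying keyword rules only when the model's confidence is low; the abstract does not indicate which was chosen.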

Results: Product defect cases received from January 2010 to December 2021 were used for training (n = 11,082) and for testing (n = 2771). The machine learning model achieved a good recall (precision) of 92% (96%) for 'Product adulterated and/or contains prohibited substance', 86% (90%) for 'Out of specification or out of trend test result' and 90% (91%) for 'Manufacturing non-compliance'.
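The per-category figures above are recall (the share of true cases in a category that the model found) and precision (the share of cases assigned to a category that truly belong to it). The sketch below shows how such per-class values can be computed from true and predicted labels. It assumes each case carries exactly one label, which the abstract does not confirm, and the function name `per_class_precision_recall` is hypothetical.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed a case of the true label

    metrics = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        metrics[label] = (precision, recall)
    return metrics
```

Applied to the 2771 held-out test cases, a calculation of this kind would yield one precision and recall pair per MedDRA-HSA term, in the form reported above.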

Conclusion: Post-market surveillance of substandard medicines remains a key activity for drug regulatory authorities. A machine learning algorithm combined with a keyword-based model can help to prioritise the review of product quality defect issues in a timely manner.

MeSH terms

  • Algorithms
  • Drug Contamination
  • Humans
  • Machine Learning
  • Public Health
  • Substandard Drugs*

Substances

  • Substandard Drugs