Chemistry-Based Modeling on Phenotype-Based Drug-Induced Liver Injury Annotation: From Public to Proprietary Data

Chem Res Toxicol. 2023 Aug 21;36(8):1238-1247. doi: 10.1021/acs.chemrestox.2c00378. Epub 2023 Aug 9.

Abstract

Drug-induced liver injury (DILI) is an important safety concern and a major reason for removing drugs from the market. Recent advances in machine learning have led to a wide range of in silico DILI prediction models based on molecular chemical structures (fingerprints). Existing publicly available DILI data sets used for model building are based on the interpretation of drug labels or patient case reports, resulting in a typical binary clinical DILI annotation. We developed a novel phenotype-based annotation that processes hepatotoxicity information extracted from repeated-dose in vivo preclinical toxicology studies using INHAND annotation, to provide a more informative and reliable data set for machine learning algorithms. This work resulted in a data set of 430 unique compounds covering diverse liver pathology findings, which was used to develop multiple DILI prediction models trained on the publicly available data (TG-GATEs) using the compounds' fingerprints. We demonstrate that the DILI labels of the TG-GATEs compounds can be predicted well, and we show how the differences between the TG-GATEs compounds and the external test compounds (Johnson & Johnson) affect the models' generalization performance.
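For illustration only: the abstract does not state which fingerprint type was used or how the differences between the training and external compounds were measured. One common way to gauge such a difference is the Tanimoto similarity between each external compound's fingerprint and its nearest training compound. The minimal sketch below assumes binary fingerprints stored as sets of on-bit indices. The function names and toy fingerprints are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable

# Assumption: a binary fingerprint is represented by the indices of its on-bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared on-bits divided by all on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_training_similarity(
    query: Fingerprint, training: Iterable[Fingerprint]
) -> float:
    """Highest similarity between a query compound and any training compound."""
    return max((tanimoto(query, t) for t in training), default=0.0)


# Toy, made-up fingerprints standing in for a training set (hypothetical data).
train = [
    frozenset({1, 2, 3, 4}),
    frozenset({2, 3, 5}),
    frozenset({7, 8, 9}),
]

# Toy external compounds: one resembles the training set, one does not.
external: Dict[str, Fingerprint] = {
    "close": frozenset({1, 2, 3}),
    "far": frozenset({4, 20, 21, 22}),
}

# Low values flag external compounds that lie far from the training chemistry,
# where a fingerprint-based model is more likely to generalize poorly.
coverage = {
    name: nearest_training_similarity(fp, train) for name, fp in external.items()
}

for name, score in coverage.items():
    print(f"{name}: nearest-training Tanimoto = {score:.3f}")
```

In this toy case the "close" compound shares three of four bits with its nearest training neighbor, while the "far" compound shares only one of seven, so a model trained on these fingerprints would be extrapolating for the latter.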

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Chemical and Drug Induced Liver Injury*
  • Computer Simulation
  • Drug-Related Side Effects and Adverse Reactions*
  • Humans
  • Machine Learning