Machine Learning Attempts for Predicting Human Subcutaneous Bioavailability of Monoclonal Antibodies

Pharm Res. 2021 Mar;38(3):451-460. doi: 10.1007/s11095-021-03022-y. Epub 2021 Mar 12.

Abstract

Purpose: One knowledge gap related to subcutaneous (SC) delivery is unpredictable and variable bioavailability. This study aimed to develop machine learning methods to predict whether the bioavailability of a monoclonal antibody (mAb) is ≥70% or below 70%, without requiring complete knowledge of the mechanism and causality linking inputs to outputs.

Methods: A database of mAb SC products was built. Models were trained and validated on this database, and a set of inputs (product properties) was mapped to the output (bioavailability) using different machine learning algorithms. Dimensionality reduction was performed using principal component analysis (PCA).
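The workflow described in the Methods can be sketched as follows. This is a minimal illustration, assuming scikit-learn and a synthetic stand-in for the product database. The feature count, the number of principal components, the random forest settings, and the helper `label_bioavailability` are all hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the database (assumption): 40 products x 12 computed properties.
X = rng.normal(size=(40, 12))

# Synthetic bioavailability in percent, loosely tied to the first two features (assumption).
noise = rng.normal(scale=5.0, size=40)
bioavailability = np.clip(70 + 10 * X[:, 0] - 5 * X[:, 1] + noise, 0, 100)


def label_bioavailability(values, threshold=70.0):
    """Return 1 where bioavailability is >= threshold (percent), else 0."""
    return (np.asarray(values, dtype=float) >= threshold).astype(int)


y = label_bioavailability(bioavailability)

# Standardize the inputs, reduce dimensionality with PCA, then classify.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),  # number of components is an illustrative choice
    RandomForestClassifier(n_estimators=100, random_state=0),
)

# 5-fold cross-validated accuracy on the synthetic data.
scores = cross_val_score(model, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.2f}")
```

Placing the scaler and PCA inside the pipeline means both are refit on each training fold, which keeps information from the held-out fold out of the dimensionality reduction step.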

Results: The bioavailability of the mAb products investigated ranged from 35% to 90%. The tree-based methods, including random forest (RF), Adaptive Boost (AdaBoost), and decision tree (DT), showed the best predictive performance and generalization power for bioavailability classification. Models based on the multi-layer perceptron (MLP), Gaussian Naïve Bayes (GaussianNB), and k-nearest neighbor (kNN) algorithms also provided acceptable prediction accuracy.
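A comparison of the six algorithm families named in the Results might look like the sketch below. It assumes scikit-learn and uses synthetic data, so the resulting accuracies say nothing about the study's findings. All hyperparameters are illustrative assumptions rather than the authors' settings.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Synthetic binary classification problem (assumption): 60 samples x 6 features.
X = rng.normal(size=(60, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=60) >= 0).astype(int)

# One estimator per algorithm family mentioned in the abstract.
classifiers = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    "GaussianNB": GaussianNB(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
}

# Mean 5-fold cross-validated accuracy for each classifier.
mean_accuracy = {
    name: cross_val_score(clf, X, y, cv=5).mean()
    for name, clf in classifiers.items()
}

# Print the classifiers from highest to lowest mean accuracy.
for name, acc in sorted(mean_accuracy.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}")
```

With a dataset as small as a database of marketed products, cross-validated scores can vary noticeably between splits, so a ranking of this kind is best read as indicative.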

Conclusion: Machine learning is a potential tool for predicting mAb bioavailability. Because all input features were obtained from theoretical calculations and predictions rather than experiments, the models may be particularly applicable to early-stage research activities such as mAb molecule triage, design/optimization, mutant screening, molecule selection, and formulation design.

Keywords: bioavailability; machine learning; material attribute; monoclonal antibody; subcutaneous.

MeSH terms

  • Amino Acid Sequence
  • Antibodies, Monoclonal / pharmacokinetics*
  • Biological Availability
  • Computational Biology
  • Data Mining
  • Databases, Pharmaceutical
  • Drug Compounding
  • Humans
  • Machine Learning
  • Neural Networks, Computer
  • Normal Distribution
  • Predictive Value of Tests
  • Principal Component Analysis
  • Protein Conformation

Substances

  • Antibodies, Monoclonal