Predicting protein retention in ion-exchange chromatography using an open source QSPR workflow

Biotechnol J. 2024 Mar;19(3):e2300708. doi: 10.1002/biot.202300708.

Abstract

Protein-based biopharmaceuticals require high purity before final formulation to ensure product safety, which makes process development time-consuming. Applying computational approaches in the initial stages of process development can significantly reduce development effort: by preselecting process conditions, experimental screening can be limited to a subset of those conditions. One such computational selection approach is the application of Quantitative Structure Property Relationship (QSPR) models that describe the properties exploited during purification. This work presents a novel open-source Python tool that extracts a range of features from protein 3D models on a local computer, allowing total transparency of the calculations. As an open-source tool, it also lowers the initial investment third parties need to construct a QSPR workflow for protein property prediction, making it widely applicable within the field of bioprocess development. The molecular features currently calculated focus on properties projected onto the protein surface by constructing surface grid representations. Linear regression models were trained on the calculated features to predict chromatographic retention times/volumes. Model validation shows high accuracy for anion and cation exchange chromatography data (cross-validated R2 of 0.87 and 0.95, respectively). Hence, these models demonstrate the potential of QSPR to accelerate process design.
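
To illustrate what a cross-validated R2 for a linear retention model means, the following is a minimal sketch and not the published tool. It assumes a single hypothetical surface descriptor (`surface_charge`), made-up retention volumes, and leave-one-out cross-validation; the abstract does not state which validation scheme or how many features the authors used.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def loo_cv_r2(x, y):
    """Leave-one-out cross-validated R2: 1 - PRESS / total sum of squares."""
    predictions = []
    for i in range(len(x)):
        # Refit the model without sample i, then predict sample i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * x[i] + intercept)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    y_bar = mean(y)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Hypothetical, made-up data: one surface descriptor per protein and
# the corresponding retention volume. Not taken from the paper.
surface_charge = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
retention_volume = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 13.9]

slope, intercept = fit_line(surface_charge, retention_volume)
q2 = loo_cv_r2(surface_charge, retention_volume)
print(f"slope={slope:.3f} intercept={intercept:.3f} cross-validated R2={q2:.3f}")
```

Because each prediction comes from a model that never saw the held-out protein, the cross-validated R2 is a more conservative estimate of predictive accuracy than the R2 of a fit to all data.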

Keywords: Quantitative Structure Activity Relationship (QSAR); Quantitative Structure Property Relationship (QSPR); chromatography; protein features; retention prediction.

MeSH terms

  • Chromatography, Ion Exchange
  • Linear Models
  • Proteins* / chemistry
  • Quantitative Structure-Activity Relationship*
  • Workflow

Substances

  • Proteins