Model certainty in cellular network-driven processes with missing data

PLoS Comput Biol. 2023 Apr 26;19(4):e1011004. doi: 10.1371/journal.pcbi.1011004. eCollection 2023 Apr.

Abstract

Mathematical models are often used to explore network-driven cellular processes from a systems perspective. However, a dearth of quantitative data suitable for model calibration leads to models with parameter unidentifiability and questionable predictive power. Here we introduce a combined Bayesian and Machine Learning Measurement Model approach to explore how quantitative and non-quantitative data constrain models of apoptosis execution within a missing data context. We find model prediction accuracy and certainty strongly depend on rigorous data-driven formulations of the measurement, and the size and make-up of the datasets. For instance, two orders of magnitude more ordinal (e.g., immunoblot) data are necessary to achieve accuracy comparable to quantitative (e.g., fluorescence) data for calibration of an apoptosis execution model. Notably, ordinal and nominal (e.g., cell fate observations) non-quantitative data synergize to reduce model uncertainty and improve accuracy. Finally, we demonstrate the potential of a data-driven Measurement Model approach to identify model features that could lead to informative experimental measurements and improve model predictive power.
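To make the idea of a measurement model for ordinal data concrete, the following is a minimal sketch of one common formulation, a cumulative-logit (proportional odds) model that maps a model-predicted quantity to probabilities over ordered categories such as immunoblot band intensities. It is an illustration under stated assumptions, not necessarily the formulation used in this work: the function names, the `slope` and `cutpoints` parameters, and the example values are all hypothetical.

```python
import numpy as np

def ordinal_probs(x, slope, cutpoints):
    """Category probabilities under a cumulative-logit model (illustrative).

    x         : model-predicted values (e.g., normalized protein level)
    slope     : hypothetical sensitivity of the measurement to x
    cutpoints : increasing thresholds separating the ordinal categories
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    cut = np.asarray(cutpoints, dtype=float)
    # P(y <= k | x) = sigmoid(cut_k - slope * x) for each threshold k
    cum = 1.0 / (1.0 + np.exp(-(cut[None, :] - slope * x[:, None])))
    # Pad with P(y <= -1) = 0 and P(y <= K-1) = 1, then difference
    zeros = np.zeros((x.size, 1))
    ones = np.ones((x.size, 1))
    cum = np.hstack([zeros, cum, ones])
    return np.diff(cum, axis=1)

def ordinal_log_likelihood(y, x, slope, cutpoints):
    """Log-likelihood of observed ordinal categories y given predictions x."""
    probs = ordinal_probs(x, slope, cutpoints)
    y = np.asarray(y, dtype=int)
    picked = probs[np.arange(y.size), y]
    # Clip to avoid log(0) when a category is assigned negligible probability
    return float(np.sum(np.log(np.clip(picked, 1e-300, None))))

if __name__ == "__main__":
    # Made-up example: three categories (low, medium, high), two observations
    print(ordinal_probs([0.0, 1.0], slope=4.0, cutpoints=[1.0, 3.0]))
    print(ordinal_log_likelihood([0, 2], [0.0, 1.0], 4.0, [1.0, 3.0]))
```

In a Bayesian calibration, a log-likelihood of this kind could be added to the log-prior over both the mechanistic model parameters and the measurement parameters (`slope`, `cutpoints`), so that the measurement model is inferred from the data rather than fixed in advance.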

MeSH terms

  • Apoptosis
  • Bayes Theorem
  • Calibration
  • Machine Learning*
  • Models, Theoretical*

Grants and funding

This work was supported by the following funding sources: MWI was supported by the National Institutes of Health (NIH) [T32-GM139800]; CFL was supported by a National Science Foundation (NSF) CAREER Award [MCB 1942255] and by the National Institutes of Health (NIH) [U54-CA217450 and U01-CA215845]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.