Distilling identifiable and interpretable dynamic models from biological data

PLoS Comput Biol. 2023 Oct 18;19(10):e1011014. doi: 10.1371/journal.pcbi.1011014. eCollection 2023 Oct.

Abstract

Mechanistic dynamical models allow us to study the behavior of complex biological systems. They can provide an objective and quantitative understanding that would be difficult to achieve through other means. However, the systematic development of these models is a non-trivial exercise and an open problem in computational biology. Currently, many research efforts are focused on model discovery, i.e., automating the development of interpretable models from data. One of the main frameworks is sparse regression, where the sparse identification of nonlinear dynamics (SINDy) algorithm and its variants have enjoyed great success. SINDy-PI is an extension that allows the discovery of rational nonlinear terms, thus enabling the identification of kinetic functions common in biochemical networks, such as Michaelis-Menten. SINDy-PI also pays special attention to the recovery of parsimonious models (Occam's razor). Here we focus on biological models composed of sets of deterministic nonlinear ordinary differential equations. We present a methodology that, combined with SINDy-PI, allows the automatic discovery of structurally identifiable and observable models which are also mechanistically interpretable. The lack of structural identifiability and observability makes it impossible to uniquely infer parameters and state variables, which can compromise the usefulness of a model by distorting its mechanistic significance and hampering its ability to produce biological insights. We illustrate the performance of our method with six case studies. We find that, despite enforcing sparsity, SINDy-PI sometimes yields models that are unidentifiable. In these cases we show how our method transforms their equations in order to obtain a structurally identifiable and observable model which is also interpretable.
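
To make the notion of structural unidentifiability concrete, the following is a minimal illustrative sketch, not the method of the paper. It assumes a hypothetical toy model dx/dt = -(k1 * k2) * x with measured output y = x; the model, the function name `simulate`, and all parameter values are assumptions chosen for illustration. Because k1 and k2 enter only through their product, different parameter pairs with the same product produce exactly the same output, so the data cannot distinguish them. Rewriting the model in terms of the single parameter k = k1 * k2 gives an identifiable form.

```python
import math


def simulate(k1, k2, x0=1.0, t_end=2.0, n_steps=200):
    """Integrate the toy model dx/dt = -(k1 * k2) * x with classical RK4.

    Returns the sampled output y = x at n_steps + 1 equally spaced times.
    The model and its parameters are hypothetical, for illustration only.
    """
    dt = t_end / n_steps
    rate = k1 * k2  # k1 and k2 appear only through this product

    def f(x):
        return -rate * x

    x = x0
    ys = [x]
    for _ in range(n_steps):
        s1 = f(x)
        s2 = f(x + 0.5 * dt * s1)
        s3 = f(x + 0.5 * dt * s2)
        s4 = f(x + dt * s3)
        x += dt * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
        ys.append(x)
    return ys


# Two different parameter pairs with the same product k1 * k2 = 6.
y_a = simulate(2.0, 3.0)
y_b = simulate(1.0, 6.0)
same_product_gap = max(abs(a - b) for a, b in zip(y_a, y_b))

# A pair with a different product gives a distinguishable output.
y_c = simulate(2.0, 4.0)
different_product_gap = max(abs(a - c) for a, c in zip(y_a, y_c))

print("gap, same product:     ", same_product_gap)
print("gap, different product:", different_product_gap)
```

In this sketch the first gap is zero because the two simulations perform identical arithmetic, while the second is clearly non-zero. Only the combination k1 * k2 can be inferred from y, which is the kind of situation in which a reparameterized, identifiable model is preferable.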

MeSH terms

  • Algorithms
  • Computational Biology
  • Models, Biological*
  • Nonlinear Dynamics*
  • Systems Biology / methods

Grants and funding

This research has received support from grant PID2020-117271RB-C22 (BIODYNAMICS) funded by MCIN/AEI/10.13039/501100011033; from the CSIC intramural project grant PIE 202070E062 (MOEBIUS); from grant PID2020-113992RA-I00 funded by MCIN/AEI/10.13039/501100011033 (PREDYCTBIO); from grant ED431F 2021/003 funded by Consellería de Cultura, Educación e Ordenación Universitaria, Xunta de Galicia; and from grant RYC-2019-027537-I funded by MCIN/AEI/10.13039/501100011033 and by “ESF Investing in your future”. The funding bodies played no role in the design of the study, the collection and analysis of data, or in the writing of the manuscript.