Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models

PLoS One. 2016 Apr 14;11(4):e0153507. doi: 10.1371/journal.pone.0153507. eCollection 2016.

Abstract

Ensembles are a well-established machine learning paradigm that leads to accurate and robust models and is predominantly applied to predictive modeling tasks. An ensemble comprises a finite set of diverse predictive models whose combined output is expected to yield better predictive performance than any individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting) significantly improve the predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase in the computational time needed for learning. To address this problem, the paper proposes a method that aims to learn ensembles of process-based models efficiently while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles through sampling domain-specific knowledge instead of sampling data. We apply the proposed method to a set of automated predictive modeling problems in three lake ecosystems, using a library of process-based knowledge for modeling population dynamics, and evaluate its performance on them. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics than individual process-based models. Finally, while their predictive performance is comparable to that of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.
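The central idea, building each ensemble member from a random sample of the knowledge library rather than from a resample of the data, can be illustrated with a toy sketch. This is not the authors' algorithm or library: the four-process `LIBRARY`, the parameter grid, the Euler simulator, the fixed carrying capacity, and the use of a plain average to combine member trajectories are all assumptions made for illustration.

```python
import random

# Hypothetical toy library: process name -> growth-rate function f(x, p).
LIBRARY = {
    "exponential": lambda x, p: p * x,
    "logistic": lambda x, p: p * x * (1.0 - x / 10.0),  # capacity fixed at 10 (assumption)
    "constant": lambda x, p: p,
    "saturating": lambda x, p: p * x / (1.0 + x),
}
PARAM_GRID = [0.1 * k for k in range(1, 11)]  # assumed candidate parameter values


def simulate(rate, p, x0, n_steps, dt=0.1):
    """Forward-Euler simulation of dx/dt = rate(x, p)."""
    xs = [x0]
    for _ in range(n_steps - 1):
        xs.append(xs[-1] + dt * rate(xs[-1], p))
    return xs


def sse(a, b):
    """Sum of squared errors between two equally long trajectories."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_best(candidates, observed):
    """Grid-search the given processes; return the best (name, parameter)."""
    best = None
    for name in candidates:
        for p in PARAM_GRID:
            err = sse(simulate(LIBRARY[name], p, observed[0], len(observed)), observed)
            if best is None or err < best[0]:
                best = (err, name, p)
    return best[1], best[2]


def learn_ensemble(observed, n_members=5, subset_size=2, seed=0):
    """Each member sees all the data but only a random subset of the library."""
    rng = random.Random(seed)
    names = sorted(LIBRARY)
    return [fit_best(rng.sample(names, subset_size), observed)
            for _ in range(n_members)]


def predict(members, x0, n_steps):
    """Combine member simulations by averaging them at each time point."""
    trajs = [simulate(LIBRARY[name], p, x0, n_steps) for name, p in members]
    return [sum(col) / len(trajs) for col in zip(*trajs)]


# Synthetic observations generated from the logistic process (illustration only).
observed = simulate(LIBRARY["logistic"], 0.5, 1.0, 50)
members = learn_ensemble(observed)
forecast = predict(members, observed[0], 80)  # extends beyond the observed window
```

Because every member is fitted on the full time series, diversity in this sketch comes only from the restricted search space each member is given, which is also why each fit explores fewer candidate structures than a search over the whole library would.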

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Computer Simulation*
  • Ecosystem*
  • Lakes / analysis
  • Machine Learning*
  • Models, Biological*
  • Population Dynamics
  • Predatory Behavior

Grants and funding

NS received funding from the Slovenian Research Agency (https://www.arrs.gov.si, Grant P2-0103) and the European Commission (http://ec.europa.eu, Grant ICT-2013-604102 HBP). LT received funding from the Slovenian Research Agency (https://www.arrs.gov.si, Grant P5-0093(B)). SD received funding from the Slovenian Research Agency (https://www.arrs.gov.si, Grant P2-0103) and the European Commission (http://ec.europa.eu, Grants ICT-2013-612944 MAESTRA and ICT-2013-604102 HBP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.