Combining data assimilation and machine learning to infer unresolved scale parametrization

Julien Brajard; Alberto Carrassi; Marc Bocquet; Laurent Bertino

doi:10.1098/rsta.2020.0086

Philos Trans A Math Phys Eng Sci. 2021 Apr 5;379(2194):20200086. doi: 10.1098/rsta.2020.0086. Epub 2021 Feb 15.

Authors

Julien Brajard¹,², Alberto Carrassi³,⁴, Marc Bocquet⁵, Laurent Bertino¹

Affiliations

¹ Nansen Center (NERSC), 5006 Bergen, Norway.
² Sorbonne University, Paris, France.
³ Department of Meteorology, University of Reading and NCEO, Reading, UK.
⁴ Mathematical Institute, University of Utrecht, Utrecht, The Netherlands.
⁵ CEREA, joint laboratory École des Ponts ParisTech and EDF R&D, Université Paris-Est, Paris, France.

Abstract

In recent years, machine learning (ML) has been proposed to devise data-driven parametrizations of unresolved processes in dynamical numerical models. In most cases, the ML training leverages high-resolution simulations to provide a dense, noiseless target state. Our goal is to go beyond the use of high-resolution simulations and train ML-based parametrization using direct data, in the realistic scenario of noisy and sparse observations. The algorithm proposed in this work is a two-step process. First, data assimilation (DA) techniques are applied to estimate the full state of the system from a truncated model. The unresolved part of the truncated model is viewed as a model error in the DA system. In a second step, ML is used to emulate the unresolved part, a predictor of model error given the state of the system. Finally, the ML-based parametrization model is added to the physical core truncated model to produce a hybrid model. The DA component of the proposed method relies on an ensemble Kalman filter while the ML parametrization is represented by a neural network. The approach is applied to the two-scale Lorenz model and to MAOOAM, a reduced-order coupled ocean-atmosphere model. We show that in both cases, the hybrid model yields forecasts with better skill than the truncated model. Moreover, the attractor of the system is significantly better represented by the hybrid model than by the truncated model. This article is part of the theme issue 'Machine learning for weather and climate modelling'.
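The second step and the hybrid model described above can be illustrated with a minimal sketch, under several assumptions that are not taken from the article. The truncated model is a single-scale Lorenz-96 system. The "unresolved" term `-0.3 * x` is a hypothetical stand-in, not the two-scale coupling. The DA analysis is replaced by the truth plus small noise rather than produced by an ensemble Kalman filter. A linear least-squares fit is swapped in for the neural network. All function and variable names are illustrative.

```python
import numpy as np

def l96_tendency(x, forcing=8.0):
    # Single-scale Lorenz-96 tendency, playing the role of the truncated physical core.
    return ((np.roll(x, -1, axis=-1) - np.roll(x, 2, axis=-1))
            * np.roll(x, 1, axis=-1) - x + forcing)

def true_tendency(x):
    # Hypothetical "full" system: core plus a term the truncated model lacks (assumption).
    return l96_tendency(x) - 0.3 * x

def rk4_step(tendency, x, dt):
    # One fourth-order Runge-Kutta step; works on a state or a batch of states.
    k1 = tendency(x)
    k2 = tendency(x + 0.5 * dt * k1)
    k3 = tendency(x + 0.5 * dt * k2)
    k4 = tendency(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

rng = np.random.default_rng(0)
n, dt, n_steps = 40, 0.05, 3000

# Generate a reference trajectory of the full system after a spin-up.
x = 8.0 + rng.standard_normal(n)
for _ in range(500):
    x = rk4_step(true_tendency, x, dt)
truth = np.empty((n_steps + 1, n))
truth[0] = x
for k in range(n_steps):
    truth[k + 1] = rk4_step(true_tendency, truth[k], dt)

# Stand-in for step 1: in the article an EnKF produces the analysis;
# here it is simply the truth plus small noise (assumption).
analysis = truth + 0.01 * rng.standard_normal(truth.shape)

# Step 2: the model error is the part of the next analysis that the
# truncated model fails to predict from the current analysis.
train = analysis[:2001]
truncated_forecast = rk4_step(l96_tendency, train[:-1], dt)
model_error = train[1:] - truncated_forecast

# Fit model_error ~ slope * x + intercept, pooled over all grid points
# (a linear regression used in place of the neural network).
features = np.column_stack([train[:-1].ravel(), np.ones(train[:-1].size)])
coef, *_ = np.linalg.lstsq(features, model_error.ravel(), rcond=None)
slope, intercept = coef

def hybrid_step(x, dt):
    # Hybrid model: truncated physical step plus the learned correction.
    return rk4_step(l96_tendency, x, dt) + slope * x + intercept

# Compare one-step forecasts on held-out, noise-free states.
test_x, test_y = truth[2001:-1], truth[2002:]
rmse_truncated = np.sqrt(np.mean((rk4_step(l96_tendency, test_x, dt) - test_y) ** 2))
rmse_hybrid = np.sqrt(np.mean((hybrid_step(test_x, dt) - test_y) ** 2))
print(f"one-step RMSE  truncated: {rmse_truncated:.4f}  hybrid: {rmse_hybrid:.4f}")
```

The learned correction here is fitted for one fixed step `dt`, so `hybrid_step` is only meaningful when called with the same `dt` used in training. This sketch checks one-step forecast error only; it says nothing about how well the attractor is reproduced.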

Keywords: data assimilation; machine learning; numerical modelling.