History Marginalization Improves Forecasting in Variational Recurrent Neural Networks

Chen Qiu; Stephan Mandt; Maja Rudolph

doi:10.3390/e23121563

Entropy (Basel). 2021 Nov 24;23(12):1563. doi: 10.3390/e23121563.

Authors

Chen Qiu¹,², Stephan Mandt³, Maja Rudolph⁴

Affiliations

¹ Bosch Center for AI, 71272 Renningen, Germany.
² Department of Computer Science, TU Kaiserslautern, 67653 Kaiserslautern, Germany.
³ Department of Computer Science, University of California, Irvine, CA 92697, USA.
⁴ Bosch Center for AI, Pittsburgh, PA 15222, USA.

Abstract

Deep probabilistic time series forecasting models have become an integral part of machine learning. While several powerful generative models have been proposed, we provide evidence that their associated inference models are oftentimes too limited and cause the generative model to predict mode-averaged dynamics. Mode-averaging is problematic since many real-world sequences are highly multi-modal, and their averaged dynamics are unphysical (e.g., predicted taxi trajectories might run through buildings on the street map). To better capture multi-modality, we develop variational dynamic mixtures (VDM): a new variational family to infer sequential latent variables. The VDM approximate posterior at each time step is a mixture density network, whose parameters come from propagating multiple samples through a recurrent architecture. This results in an expressive multi-modal posterior approximation. In an empirical study, we show that VDM outperforms competing approaches on highly multi-modal datasets from different domains.

Keywords: sequential latent variable models; time series forecasting; variational inference.