STTRE: A Spatio-Temporal Transformer with Relative Embeddings for multivariate time series forecasting

Neural Netw. 2023 Nov:168:549-559. doi: 10.1016/j.neunet.2023.09.039. Epub 2023 Sep 30.

Abstract

The prevalence of multivariate time series data across many disciplines has created demand for, and in turn significant growth in, research on multivariate time series analysis. Drawing inspiration from the Transformer, a popular natural language processing model, we propose the Spatio-Temporal Transformer with Relative Embeddings (STTRE) for multivariate time series forecasting. This work focuses on developing a Transformer-based framework that fully exploits the spatio-temporal nature of a multivariate time series by incorporating several of the Transformer's key components, augmented so that they excel at multivariate time series forecasting. Current Transformer-based models for multivariate time series often neglect the data's spatial components and rely on absolute position embeddings as their only means of capturing its temporal components, which we show is flawed for time series applications. This failure to fully exploit the spatio-temporality of the data can lead to subpar accuracy. We redesign relative position representations, which we rename relative embeddings, to provide a new method for detecting latent spatial, temporal, and spatio-temporal dependencies more effectively than previous Transformer-based models. We couple these relative embeddings with a restructuring of multi-head attention, the Transformer's primary sequence learning mechanism, that allows full use of the relative embeddings, achieving up to a 24% improvement in accuracy over other state-of-the-art multivariate time series models on a comprehensive selection of publicly available multivariate time series forecasting datasets.
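
For readers unfamiliar with relative position representations, the following is a minimal sketch of the generic idea from the earlier Transformer literature, in which a learned vector indexed by the clipped offset between two time steps is added to the key before the attention score is computed. It is not the STTRE formulation, which the abstract describes only as a redesign of this idea; the function name `relative_attention`, the parameter `max_dist`, and all shapes and random inputs are assumptions chosen for illustration.

```python
import numpy as np

def relative_attention(x, w_q, w_k, w_v, rel_k, max_dist):
    """Single-head self-attention with relative position vectors on the keys.

    x:      (T, d_model) input sequence
    w_q/k/v:(d_model, d) projection matrices
    rel_k:  (2 * max_dist + 1, d) one vector per clipped offset (hypothetical name)
    """
    T = x.shape[0]
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]

    # Offset j - i for every query/key pair, clipped and shifted to index rel_k.
    offsets = np.arange(T)[None, :] - np.arange(T)[:, None]
    idx = np.clip(offsets, -max_dist, max_dist) + max_dist
    a = rel_k[idx]                                  # (T, T, d)

    # Score = q_i . (k_j + a_ij) / sqrt(d), split into content and position terms.
    content = q @ k.T                               # (T, T)
    position = np.einsum("id,ijd->ij", q, a)        # (T, T)
    scores = (content + position) / np.sqrt(d)

    # Numerically stable softmax over the key axis.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Illustrative inputs only: 6 time steps, model width 8, head width 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
rel_k = rng.normal(size=(2 * 2 + 1, 4))
out, weights = relative_attention(x, w_q, w_k, w_v, rel_k, max_dist=2)
```

Because the position term depends only on the offset between two steps rather than on their absolute indices, the same learned vector is reused wherever that offset occurs in the window, which is the property that distinguishes this scheme from absolute position embeddings.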

Keywords: Attention; Embeddings; Forecasting; Multivariate time series; Spatio-temporal; Transformer.

MeSH terms

  • Learning*
  • Natural Language Processing*
  • Time Factors