A Comparative Analysis between Efficient Attention Mechanisms for Traffic Forecasting without Structural Priors

Andrei-Cristian Rad; Camelia Lemnaru; Adrian Munteanu

doi:10.3390/s22197457

Sensors (Basel). 2022 Oct 1;22(19):7457. doi: 10.3390/s22197457.

Authors

Andrei-Cristian Rad¹,², Camelia Lemnaru¹, Adrian Munteanu²

Affiliations

¹ Computer Science Department, Universitatea Tehnica din Cluj-Napoca, 400027 Cluj-Napoca, Romania.
² Electronics and Informatics Department, Vrije Universiteit Brussel, 1050 Ixelles, Belgium.

Abstract

Dot-product attention is a powerful mechanism for capturing contextual information. Models built on top of it have achieved state-of-the-art performance in various domains, ranging from sequence modelling to visual tasks. However, the main bottleneck is the construction of the attention map, whose cost is quadratic with respect to the number of tokens in the sequence. Consequently, efficient alternatives have been developed in parallel, but it was only recently that their performance was compared and contrasted. This study performs a comparative analysis of several efficient attention mechanisms in the context of a purely attention-based spatio-temporal forecasting model used for traffic prediction. Experiments show that these methods can reduce training time by up to 28% and inference time by up to 31%, while performance remains on par with the baseline.
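
To illustrate the quadratic bottleneck mentioned above, the following is a minimal NumPy sketch, not the authors' implementation. It contrasts standard scaled dot-product attention, which materialises an n × n attention map, with a kernelised linear attention that avoids building that map. The feature map elu(x) + 1, the function names, and the tensor sizes are assumptions chosen for illustration. The linear variant stands in for the general family of efficient mechanisms and is not necessarily one of those evaluated in the study.

```python
import numpy as np

def softmax_attention(q, k, v):
    """Standard scaled dot-product attention; builds an (n, n) attention map."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (n, n): quadratic in n
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def feature_map(x):
    """elu(x) + 1: a strictly positive feature map (illustrative assumption)."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    """Kernelised attention: reorders the products so no (n, n) map is formed."""
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                    # (d, d_v): cost linear in n
    z = q_f @ k_f.sum(axis=0)         # (n,): per-query normaliser
    return (q_f @ kv) / z[:, None]

# Hypothetical sizes: n tokens, d features per token.
rng = np.random.default_rng(0)
n, d = 128, 16
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

out_full, attn_map = softmax_attention(q, k, v)
out_lin = linear_attention(q, k, v)
```

In this sketch the softmax variant stores n² attention weights, while the linear variant only stores a d × d summary of the keys and values, so its memory and compute grow linearly with the number of tokens. The two functions use different similarity kernels, so their outputs are not expected to match numerically.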

Keywords: artificial neural networks; deep learning; intelligent transportation systems.

MeSH terms

Forecasting*

Grants and funding

G094122N/Research Foundation - Flanders