Bayesian estimation of subset threshold autoregressions: short-term forecasting of traffic occupancy

J Appl Stat. 2020 Aug 4;47(13-15):2658-2689. doi: 10.1080/02664763.2020.1801606. eCollection 2020.

Abstract

Traffic management authorities in metropolitan areas use real-time systems that analyze high-frequency measurements from fixed sensors to perform short-term forecasting and incident detection at various locations of a road network. Published research over the last 20 years has focused primarily on modeling and forecasting traffic volumes and speeds. Traffic occupancy approximates vehicular density through the percentage of time a sensor detects a vehicle within a pre-specified time interval. It exhibits weekly periodic patterns and heteroskedasticity and has been used as a metric for characterizing traffic regimes (e.g. free flow, congestion). This article presents a Bayesian three-step model-building procedure for parsimonious estimation of Threshold-Autoregressive (TAR) models, designed for location-, day- and horizon-specific forecasting of traffic occupancy. In the first step, multiple-regime TAR models, reformulated as high-dimensional linear regressions, are estimated using Bayesian horseshoe priors. Next, significant regimes are identified through a forward selection algorithm based on Kullback-Leibler (KL) distances between the posterior predictive distribution of the full reference model and those of TAR models with fewer regimes. Given the regimes, the forward selection algorithm can be applied again to select significant autoregressive terms. In addition to forecasting, the proposed specification and model-building scheme may assist in determining location-specific congestion thresholds and associations between traffic dynamics observed in different regions of a network. Empirical results on data from a traffic forecasting competition illustrate the efficacy of the proposed procedures in obtaining interpretable models and in producing satisfactory point and density forecasts at multiple horizons.
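
The reformulation of a multiple-regime TAR model as a single linear regression, mentioned in the first step, can be illustrated with a minimal sketch. It is not the authors' specification: it assumes a self-exciting TAR in which the threshold variable is the series itself at lag `delay`, a common autoregressive order `p` in every regime, and thresholds that are supplied rather than estimated. The function name `tar_design_matrix` and all of its parameters are hypothetical. Each row of the design matrix carries the intercept and lags only in the column block of the regime active at that time, so the stacked coefficient vector could then be given a shrinkage prior such as the horseshoe.

```python
import numpy as np

def tar_design_matrix(y, p, thresholds, delay=1):
    """Build a block design matrix so a k-regime TAR(p) reads as y = X @ beta + noise.

    Assumptions (illustrative, not taken from the article):
      - the threshold variable is y[t - delay] (self-exciting TAR),
      - every regime uses the same order p,
      - thresholds are given in increasing order.
    """
    y = np.asarray(y, dtype=float)
    start = max(p, delay)                 # first index with all required lags
    target = y[start:]
    n = len(target)

    # Shared lag block: intercept, y[t-1], ..., y[t-p]
    lags = np.column_stack(
        [np.ones(n)] + [y[start - j: len(y) - j] for j in range(1, p + 1)]
    )

    # Regime index 0..k-1 from the lagged threshold variable;
    # regime 0 holds values <= the first threshold.
    z = y[start - delay: len(y) - delay]
    regime = np.searchsorted(np.asarray(thresholds, dtype=float), z, side="left")

    k = len(thresholds) + 1
    width = p + 1
    X = np.zeros((n, k * width))
    for r in range(k):
        rows = regime == r
        # Only the active regime's block is non-zero in each row
        X[rows, r * width:(r + 1) * width] = lags[rows]
    return X, target, regime
```

Under these assumptions, calling `tar_design_matrix(y, p=1, thresholds=[3.5])` on a short series returns a matrix with two column blocks (one per regime), the aligned response vector, and the regime label of each row. Columns whose coefficients shrink toward zero would correspond to lags or regimes that a subsequent selection step might drop.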

Keywords: 62M10; Threshold autoregressive model; horseshoe prior; nonlinear time series; regime-switching models; subset selection; traffic occupancy; urban network.