An integer-valued time series model for multivariate surveillance

Stat Med. 2020 Mar 30;39(7):940-954. doi: 10.1002/sim.8453. Epub 2019 Dec 26.

Abstract

Different types of surveillance data are increasingly available for public health purposes. In most cases, several variables are monitored and events of different types are reported. As the amount of surveillance data increases, statistical methods that can effectively address multivariate surveillance scenarios are needed. Although research activity in this field has increased rapidly in recent years, only a few approaches have simultaneously addressed the integer-valued nature of the data and its correlation structure (both serial correlation and cross-correlation). In this article, we propose a multivariate integer-valued autoregressive model that allows for both serial and cross-correlations between the series and can easily accommodate overdispersion and covariate information. Moreover, its structure implies a natural decomposition into an endemic and an epidemic component, a common distinction in dynamic models for infectious disease counts. Disease outbreaks are detected by comparing surveillance data with one-step-ahead predictions obtained after fitting the model to a set of clean historical data. The performance of the model is illustrated on a trivariate series of syndromic surveillance data collected during the Athens 2004 Olympic Games.
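
The detection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a first-order model with binomial thinning, independent Poisson innovations, no covariates or overdispersion, and parameters that are already known (in practice they would be estimated from clean historical data). The names `simulate_minar1` and `one_step_alarm`, the thinning matrix `A`, the innovation means `lam`, and the normal-approximation threshold `z` are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_minar1(A, lam, n_steps, rng):
    """Simulate an assumed multivariate INAR(1): X_t = A o X_{t-1} + eps_t.

    A[i, j] is the thinning probability with which counts of series j at
    time t-1 contribute to series i at time t (epidemic component);
    lam[i] is the Poisson mean of the innovation (endemic component).
    """
    A = np.asarray(A, dtype=float)
    lam = np.asarray(lam, dtype=float)
    k = lam.shape[0]
    X = np.zeros((n_steps, k), dtype=np.int64)
    X[0] = rng.poisson(lam)
    for t in range(1, n_steps):
        # Binomial thinning: entry (i, j) ~ Binomial(X[t-1, j], A[i, j]),
        # then sum over j to get the epidemic contribution to series i.
        epidemic = rng.binomial(X[t - 1][None, :], A).sum(axis=1)
        endemic = rng.poisson(lam)
        X[t] = epidemic + endemic
    return X

def one_step_alarm(A, lam, x_prev, x_obs, z=3.0):
    """Compare observed counts with the one-step-ahead prediction.

    Under the assumptions above, the conditional mean is A @ x_prev + lam
    and the conditional variance is (A * (1 - A)) @ x_prev + lam.
    An alarm is raised when x_obs exceeds mean + z * sd (an assumed,
    normal-approximation threshold).
    """
    A = np.asarray(A, dtype=float)
    lam = np.asarray(lam, dtype=float)
    x_prev = np.asarray(x_prev, dtype=float)
    mean = A @ x_prev + lam
    var = (A * (1.0 - A)) @ x_prev + lam
    upper = mean + z * np.sqrt(var)
    return mean, upper, np.asarray(x_obs) > upper

# Illustrative parameters (hypothetical, not taken from the article).
A = np.array([[0.3, 0.1],
              [0.0, 0.4]])
lam = np.array([2.0, 1.0])

history = simulate_minar1(A, lam, n_steps=50, rng=np.random.default_rng(0))
mean, upper, alarm = one_step_alarm(A, lam, x_prev=[10, 5], x_obs=[20, 4])
```

The off-diagonal entries of `A` are what induce cross-correlation between the series in this sketch; a model with correlated innovations, overdispersed counts, or covariate-dependent endemic means would need different distributional choices.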

Keywords: correlation; count data; integer-valued time series; multivariate surveillance.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Communicable Diseases* / epidemiology
  • Disease Outbreaks
  • Epidemics*
  • Humans
  • Population Surveillance
  • Public Health
  • Sentinel Surveillance