The effects of the irregular sample and missing data in time series analysis

Nonlinear Dynamics Psychol Life Sci. 2006 Apr;10(2):187-214.

Abstract

Human self-report time series data are typically marked by irregularities in sampling rates; furthermore, these irregularities are typically natural outcomes of the data generation process. Relatively little has been published to assist the analysis of irregularly sampled data. We report the results of a series of computational experiments on synthetic data sets designed to assess the utility of techniques for handling irregular time series data. Three kinds of dynamics (conservative quasiperiodic, dissipative chaotic, and self-organized critical) were sampled regularly in time, and the regular sampling was then disrupted by data point removal or by stochastic shifts in time. Missing data segments were patched by segment concatenation, by segment filling with average data values, or by local interpolation in phase space. We compared the results of nonlinear analytical tools such as autocorrelations and correlation dimensions on the complete and patched sets, and compared power spectra with Lomb periodograms of the decimated sets. Local interpolation in phase space was particularly successful at preserving key features of the original data, but required potentially impractical quantities of intact data as a primer. The other patching methods are not limited by the need for intact data, but they distort results relative to the intact series. We conclude that irregularly sampled data sets with up to 15 percent missing data can potentially be re-sampled or repaired for analysis with techniques that assume regular sampling, without introducing substantial errors.
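
The decimation and two of the simpler patching steps described above can be illustrated with a minimal sketch. This is not the authors' code: the two-frequency test signal, the sampling step, the random seed, and the helper name `autocorr` are all assumptions chosen for illustration. Only the 15 percent removal fraction comes from the abstract, and local interpolation in phase space is not attempted here. The Lomb periodogram is computed with SciPy's `lombscargle`, which expects angular frequencies.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(0)  # fixed seed, illustrative only

# Regularly sampled synthetic series: a quasiperiodic signal built from
# two incommensurate frequencies (values are assumptions, not the paper's).
dt = 0.1
t = np.arange(0, 200, dt)
f1, f2 = 0.5, 0.5 * np.sqrt(2)
x = np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)

# Disrupt the regular sampling by removing roughly 15 percent of points.
keep = rng.random(t.size) >= 0.15
t_dec, x_dec = t[keep], x[keep]

# Patch 1: segment concatenation (surviving points treated as contiguous).
x_concat = x_dec.copy()

# Patch 2: fill each missing point with the mean of the surviving data.
x_mean = x.copy()
x_mean[~keep] = x_dec.mean()


def autocorr(series, lag):
    """Normalized sample autocorrelation at a positive integer lag."""
    s = series - series.mean()
    return np.dot(s[:-lag], s[lag:]) / np.dot(s, s)


# Compare autocorrelation of the intact series with each patched series.
lag = 5
ac_intact = autocorr(x, lag)
ac_concat = autocorr(x_concat, lag)
ac_mean = autocorr(x_mean, lag)

# Lomb periodogram of the decimated set, which needs no patching because
# it works directly on the irregular sample times.
freqs_hz = np.linspace(0.05, 2.0, 400)
pgram = lombscargle(t_dec, x_dec - x_dec.mean(), 2 * np.pi * freqs_hz)
peak_hz = freqs_hz[np.argmax(pgram)]

print(f"autocorrelation at lag {lag}: intact={ac_intact:.3f}, "
      f"concatenated={ac_concat:.3f}, mean-filled={ac_mean:.3f}")
print(f"Lomb periodogram peak near {peak_hz:.3f} Hz")
```

Under these assumptions, comparing `ac_intact` with `ac_concat` and `ac_mean` gives a rough sense of how much each patch distorts a simple statistic, while `peak_hz` shows whether the dominant frequency is still recoverable from the decimated set.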