A user-friendly guide to using distance measures to compare time series in ecology

Ecol Evol. 2023 Oct 5;13(10):e10520. doi: 10.1002/ece3.10520. eCollection 2023 Oct.

Abstract

Time series are a critical component of ecological analysis, used to track changes in biotic and abiotic variables. Information can be extracted from the properties of time series for tasks such as classification (e.g., assigning species to individual bird calls); clustering (e.g., clustering similar responses in population dynamics to abrupt changes in the environment or management interventions); prediction (e.g., accuracy of model predictions to original time series data); and anomaly detection (e.g., detecting possible catastrophic events from population time series). These common tasks in ecological research all rely on the notion of (dis-) similarity, which can be determined using distance measures. A plethora of distance measures have been described, predominantly in the computer and information sciences, but many have not been introduced to ecologists. Furthermore, little is known about how to select appropriate distance measures for time-series-related tasks. Therefore, many potential applications remain unexplored. Here, we describe 16 properties of distance measures that are likely to be of importance to a variety of ecological questions involving time series. We then test 42 distance measures for each property and use the results to develop an objective method to select appropriate distance measures for any task and ecological dataset. We demonstrate our selection method by applying it to a set of real-world data on breeding bird populations in the UK and discuss other potential applications for distance measures, along with associated technical issues common in ecology. Our real-world population trends exhibit a common challenge for time series comparisons: a high level of stochasticity. We demonstrate two different ways of overcoming this challenge, first by selecting distance measures with properties that make them well suited to comparing noisy time series and second by applying a smoothing algorithm before selecting appropriate distance measures. In both cases, the distance measures chosen through our selection method are not only fit-for-purpose but are consistent in their rankings of the population trends. The results of our study should lead to an improved understanding of, and greater scope for, the use of distance measures for comparing ecological time series and help us answer new ecological questions.

Keywords: classification; clustering; dissimilarity measures; distance measure selection; time series analysis; time series comparison.

Associated data

  • Dryad/10.5061/dryad.bzkh189g7