Effect of extreme data loss on long-range correlated and anticorrelated signals quantified by detrended fluctuation analysis

Phys Rev E Stat Nonlin Soft Matter Phys. 2010 Mar;81(3 Pt 1):031101. doi: 10.1103/PhysRevE.81.031101. Epub 2010 Mar 2.

Abstract

Detrended fluctuation analysis (DFA) improves on classical fluctuation analysis for nonstationary signals where embedded polynomial trends mask the intrinsic correlation properties of the fluctuations. To better identify the intrinsic correlation properties of real-world signals where a large amount of data is missing or removed due to artifacts, we investigate how extreme data loss affects the scaling behavior of long-range power-law correlated and anticorrelated signals. We introduce a segmentation approach to generate surrogate signals by randomly removing data segments from stationary signals with different types of long-range correlations. The surrogate signals we generate are characterized by four parameters: (i) the DFA scaling exponent α of the original correlated signal u(i), (ii) the percentage p of the data removed from u(i), (iii) the average length μ of the removed (or remaining) data segments, and (iv) the functional form P(l) of the distribution of the length l of the removed (or remaining) data segments. We find that the global scaling exponent of positively correlated signals remains practically unchanged even for extreme data loss of up to 90%. In contrast, the global scaling of anticorrelated signals changes to uncorrelated behavior even when a very small fraction of the data is lost. These observations are confirmed on two examples of real-world signals: human gait and commodity price fluctuations. We further systematically study the local scaling behavior of surrogate signals with missing data to reveal subtle deviations across scales. We find that for anticorrelated signals even 10% of data loss leads to significant monotonic deviations in the local scaling at large scales from the original anticorrelated to uncorrelated behavior. In contrast, positively correlated signals show no observable changes in the local scaling for up to 65% of data loss, while for larger percentages of data loss, the local scaling shows overestimated regions (with higher local exponent) at small scales, followed by underestimated regions (with lower local exponent) at large scales. Finally, we investigate how the scaling is affected by the average length, probability distribution, and percentage of the remaining data segments in comparison to the removed segments. We find that the average length μ_r of the remaining segments is the key parameter which determines the scales at which the local scaling exponent has a maximum deviation from its original value. Interestingly, the scales where the maximum deviation occurs follow a power-law relationship with μ_r, whereas the percentage of data loss determines the extent of the deviation. The results presented in this paper are useful to correctly interpret the scaling properties obtained from signals with extreme data loss.
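
The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name in it is hypothetical. It rests on several assumptions: the test signal comes from a plain Fourier-filtering generator (a stand-in for whatever generator the paper uses), the removed-segment lengths follow an exponential distribution (one possible choice of P(l)), and removed segments are allowed to overlap, so their effective mean length only approximates the requested μ. The DFA routine follows the standard recipe: integrate the signal, split the profile into boxes of size n, subtract a polynomial fit in each box, and take the root-mean-square residual F(n), whose log-log slope estimates α.

```python
import numpy as np

def correlated_noise(n, alpha, rng):
    """Assumed generator: shape white noise so that S(f) ~ f^-(2*alpha - 1)."""
    beta = 2.0 * alpha - 1.0
    freqs = np.fft.rfftfreq(n)
    amp = np.zeros_like(freqs)
    amp[1:] = freqs[1:] ** (-beta / 2.0)      # skip the zero-frequency term
    phases = rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
    x = np.fft.irfft(amp * phases, n)
    return (x - x.mean()) / x.std()

def remove_segments(u, p, mu, rng):
    """Drop random segments until at least a fraction p of u is gone,
    then stitch the remaining samples together.
    Simplification: lengths are exponential with mean mu and may overlap."""
    keep = np.ones(u.size, dtype=bool)
    target = int(p * u.size)
    while (~keep).sum() < target:
        length = max(1, int(round(rng.exponential(mu))))
        start = rng.integers(0, u.size)
        keep[start:start + length] = False
    return u[keep]

def dfa(x, scales, order=1):
    """Return the DFA fluctuation function F(n) for each box size n."""
    y = np.cumsum(x - np.mean(x))             # integrated profile
    F = []
    for n in scales:
        n_boxes = y.size // n
        boxes = y[:n_boxes * n].reshape(n_boxes, n).T   # shape (n, n_boxes)
        t = np.arange(n)
        coeffs = np.polyfit(t, boxes, order)            # one fit per box
        trend = np.vander(t, order + 1) @ coeffs        # fitted polynomial trends
        F.append(np.sqrt(np.mean((boxes - trend) ** 2)))
    return np.array(F)

def scaling_exponent(x, scales, order=1):
    """Global exponent: slope of log F(n) versus log n."""
    slope, _ = np.polyfit(np.log(scales), np.log(dfa(x, scales, order)), 1)
    return slope

def local_exponents(x, scales, order=1):
    """Crude local exponent: pointwise derivative of log F with respect to log n."""
    return np.gradient(np.log(dfa(x, scales, order)), np.log(scales))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    scales = np.unique(np.logspace(np.log10(16), np.log10(1024), 15).astype(int))
    for alpha in (0.3, 0.9):                  # anticorrelated, positively correlated
        u = correlated_noise(2 ** 14, alpha, rng)
        surrogate = remove_segments(u, p=0.5, mu=20, rng=rng)
        print(alpha, scaling_exponent(u, scales), scaling_exponent(surrogate, scales))
```

Comparing the printed exponents before and after removal, for different values of `p` and `mu`, gives a rough way to explore the qualitative behavior the abstract describes. The numbers depend on the simplifications above and should not be read as a reproduction of the paper's results.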

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Models, Biological*
  • Models, Statistical*
  • Sample Size*
  • Signal Processing, Computer-Assisted*
  • Statistics as Topic