Representing the dynamics of high-dimensional data with non-redundant wavelets

Patterns (N Y). 2022 Jan 6;3(3):100424. doi: 10.1016/j.patter.2021.100424. eCollection 2022 Mar 11.

Abstract

A central problem in data science is extracting the meaningful information embedded in high-dimensional data into a low-dimensional set of features that can represent the original data at different levels. Wavelet analysis is a widely used method for decomposing time-series signals into a few levels with detailed temporal resolution. However, the resulting wavelets are intertwined and over-represented, both across levels within each sample and across different samples within one population. Here, using neuroscience data comprising simulated spikes, experimental spikes, calcium imaging signals, and human electrocorticography signals, we leveraged conditional mutual information between wavelets for feature selection. We verified that the selected features are meaningful: they decode the stimulus or condition with high accuracy while using only a small set of features. These results provide a new approach to wavelet analysis for extracting essential features of the dynamics of spatiotemporal neural data, which in turn can support the design of novel machine learning models built on representative features.
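
The general idea of selecting non-redundant wavelet features with conditional mutual information can be illustrated with a minimal sketch. The abstract does not specify the wavelet family, the discretization, or the exact selection rule, so everything below is an assumption made for illustration: a Haar wavelet, median binarization of each coefficient, plug-in entropy estimates, and a greedy criterion in the style of CMIM (each new feature maximizes the smallest conditional mutual information with the label given any already-selected feature). The function names (`haar_dwt`, `binarize`, `cond_mutual_info`, `select_features`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random
import statistics
from collections import Counter

def haar_dwt(signal, levels):
    """Multi-level Haar transform (assumed wavelet; length must be divisible by 2**levels).
    Returns detail coefficients from fine to coarse, followed by the final approximation."""
    approx, coeffs = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        coeffs.extend((a - b) / math.sqrt(2) for a, b in pairs)   # detail
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]       # approximation
    return coeffs + approx

def binarize(column):
    """Discretize one coefficient across samples by thresholding at its median."""
    med = statistics.median(column)
    return [int(v > med) for v in column]

def entropy(*columns):
    """Plug-in joint entropy (bits) of one or more discrete columns."""
    counts = Counter(zip(*columns))
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def cond_mutual_info(x, y, z=None):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); plain I(X;Y) when z is None."""
    if z is None:
        return entropy(x) + entropy(y) - entropy(x, y)
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def select_features(features, labels, k):
    """Greedy CMIM-style selection: a candidate scores low if any already
    selected feature makes it redundant for predicting the label."""
    selected, candidates = [], list(range(len(features)))

    def score(j):
        if not selected:
            return cond_mutual_info(features[j], labels)
        return min(cond_mutual_info(features[j], labels, features[s]) for s in selected)

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data (hypothetical): condition 1 has an elevated first half of the signal.
random.seed(0)
signals, labels = [], []
for trial in range(40):
    label = trial % 2
    sig = [random.gauss(0, 1) for _ in range(8)]
    if label:
        sig[:4] = [v + 3 for v in sig[:4]]
    signals.append(sig)
    labels.append(label)

coeff_rows = [haar_dwt(s, levels=3) for s in signals]
features = [binarize(col) for col in zip(*coeff_rows)]  # one column per wavelet coefficient
chosen = select_features(features, labels, k=3)
print("selected wavelet coefficient indices:", chosen)
```

In this sketch, a coefficient that duplicates an already selected one receives a conditional mutual information of zero and is skipped, which is the sense in which the selected set is non-redundant. A practical analysis would likely use an established wavelet library and a more careful information estimator than median binarization.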

Keywords: ECoG; calcium imaging; conditional information; dimensionality reduction; feature selection; mutual information; neural coding; neural spikes; wavelet analysis.