Manifold-adaptive dimension estimation revisited

Zsigmond Benkő; Marcell Stippinger; Roberta Rehus; Attila Bencze; Dániel Fabó; Boglárka Hajnal; Loránd G Eröss; András Telcs; Zoltán Somogyvári

doi:10.7717/peerj-cs.790

PeerJ Comput Sci. 2022 Jan 6;8:e790. doi: 10.7717/peerj-cs.790. eCollection 2022.

Authors

Zsigmond Benkő¹ ², Marcell Stippinger¹, Roberta Rehus¹, Attila Bencze¹, Dániel Fabó³, Boglárka Hajnal² ³, Loránd G Eröss⁴ ⁵, András Telcs¹ ⁶ ⁷, Zoltán Somogyvári¹ ⁸

Affiliations

¹ Department of Computational Sciences, Wigner Research Centre for Physics, Budapest, Hungary.
² János Szentágothai Doctoral School of Neurosciences, Semmelweis University, Budapest, Hungary.
³ Epilepsy Center, Department of Neurology, National Institute of Clinical Neurosciences, Budapest, Hungary.
⁴ Department of Functional Neurosurgery, National Institute of Clinical Neurosciences, Budapest, Hungary.
⁵ Faculty of Information Technology and Bionics, Péter Pázmány Catholic University, Budapest, Hungary.
⁶ Department of Computer Science and Information Theory, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, Hungary.
⁷ Department of Quantitative Methods, Faculty of Business and Economics, University of Pannonia, Veszprém, Hungary.
⁸ Neuromicrosystems ltd., Budapest, Hungary.

Abstract

Data dimensionality informs us about data complexity and sets limits on the structure of successful signal processing pipelines. In this work we revisit and improve the manifold-adaptive Farahmand-Szepesvári-Audibert (FSA) dimension estimator, making it one of the best nearest neighbor-based dimension estimators available. We compute the probability density function of local FSA estimates under the assumption that the local manifold density is uniform. Based on the probability density function, we propose to use the median of local estimates as a basic global measure of intrinsic dimensionality, and we demonstrate the advantages of this asymptotically unbiased estimator over the previously proposed statistics: the mode and the mean. Additionally, from the probability density function, we derive the maximum likelihood formula for global intrinsic dimensionality, assuming the i.i.d. condition holds. We tackle edge and finite-sample effects with an exponential correction formula, calibrated on hypercube datasets. We compare the performance of the corrected median-FSA estimator with other kNN estimators: maximum likelihood (Levina-Bickel), 2NN, and two implementations of DANCo (R and MATLAB). We show that the corrected median-FSA estimator beats the maximum likelihood estimator and is on equal footing with DANCo on standard synthetic benchmarks according to mean percentage error and error rate metrics. With the median-FSA algorithm, we reveal diverse changes in neural dynamics during the resting state and during epileptic seizures. We identify brain areas with lower-dimensional dynamics that are possible causal sources and candidates for being seizure onset zones.

Keywords: Causality; DANCo; Dynamical systems; EEG; Epilepsy; Fractal dimension; Intrinsic dimension; Manifold; Maximum likelihood; Takens theorem.
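
Illustrative sketch (not part of the published abstract): the median-of-local-estimates idea described above can be tried on toy data in a few lines of Python. The sketch assumes the commonly cited form of the local FSA estimate, ln 2 / ln(R_2k / R_k), where R_k and R_2k are the distances from a point to its k-th and 2k-th nearest neighbors. That formula is not spelled out in the abstract and is taken here as an assumption. The function names `local_fsa` and `median_fsa`, the choice k = 5, and the toy dataset are hypothetical. The exponential edge correction and the maximum likelihood formula are deliberately left out, because the abstract does not give their coefficients or form.

```python
import math
import random
import statistics


def local_fsa(points, k):
    """Local estimates ln(2) / ln(R_2k / R_k), one per point (assumed FSA form)."""
    estimates = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point; fine for small samples.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r_k, r_2k = dists[k - 1], dists[2 * k - 1]
        if r_k > 0 and r_2k > r_k:  # skip degenerate neighborhoods
            estimates.append(math.log(2) / math.log(r_2k / r_k))
    return estimates


def median_fsa(points, k=5):
    """Global estimate: median of the local estimates, without edge correction."""
    return statistics.median(local_fsa(points, k))


# Toy data (hypothetical): a uniform 2-D square linearly embedded in 5-D space.
rng = random.Random(0)
plane = [(rng.random(), rng.random()) for _ in range(400)]
points = [(u, v, u + v, u - v, 0.5 * u) for u, v in plane]

estimate = median_fsa(points, k=5)
print(f"median-FSA estimate: {estimate:.2f}")
```

On this toy set the printed value should land near the intrinsic dimension of 2 rather than the embedding dimension of 5. Some downward bias from points near the edge of the square is expected, which is the kind of effect the corrected estimator in the paper is meant to address.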

Grants and funding

The research reported in this paper was supported by the BME NC TKP2020 grant of NKFIH Hungary, by the BME-Artificial Intelligence FIKP grant of EMMI (BME FIKP-MI/SC), by the National Brain Research Program of Hungary (NAP-B, KTIA_NAP_12-2-201), by the National Brain Project II, NRDIO Hungary, PATTERN Group, and by 2017-1.2.1-NKP-2017-00002 of NKFIH and the grants K135837 and NN118902 of the NKFIH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.