A Decomposition Method for Global Evaluation of Shannon Entropy and Local Estimations of Algorithmic Complexity

Entropy (Basel). 2018 Aug 15;20(8):605. doi: 10.3390/e20080605.

Abstract

We investigate the properties of a Block Decomposition Method (BDM), which extends the power of the Coding Theorem Method (CTM). CTM approximates local estimations of algorithmic complexity based on the Solomonoff–Levin theory of algorithmic probability, providing a closer connection to algorithmic complexity than previous attempts based on statistical regularities, such as popular lossless compression schemes. The strategy behind BDM is to find short computer programs that produce the components of a larger, decomposed object; these short programs can then be arranged in sequence so as to reproduce the original object. We show that the method provides efficient estimations of algorithmic complexity, and that where it loses accuracy it behaves like Shannon entropy. We estimate errors and study the behaviour of BDM under different boundary conditions, all of which are compared and assessed in detail. The measure can be adapted to multi-dimensional objects beyond strings, such as arrays and tensors. To test the measure, we demonstrate the power of CTM on objects of low algorithmic randomness that entropy assigns maximal randomness (e.g., the digits of π) but whose numerical CTM approximations are closer to the theoretically expected low algorithmic randomness. We also test the measure on larger objects, including dual, isomorphic and cospectral graphs, for which algorithmic randomness is known to be low. Finally, we release implementations of the methods in most major programming languages (Wolfram Language/Mathematica, Matlab, R, Perl, Python, Pascal, C++, and Haskell), along with an online algorithmic complexity calculator.
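As a minimal illustration of the decomposition strategy described above (a sketch, not the authors' reference implementation), the following Python fragment aggregates precomputed CTM values over the blocks of a binary string using the BDM rule BDM(X) = Σ CTM(r) + log2(n), summed over distinct blocks r with multiplicity n. The lookup table toy_ctm is a hypothetical stand-in for real CTM data.

    from collections import Counter
    from math import log2

    def bdm(string, ctm, block_size=2):
        # Split the string into non-overlapping blocks of block_size.
        # For simplicity, assume len(string) is a multiple of block_size;
        # the paper studies boundary conditions for remainder blocks.
        blocks = [string[i:i + block_size]
                  for i in range(0, len(string), block_size)]
        # Sum, over each distinct block r with multiplicity n,
        # the block's CTM value plus log2 of its multiplicity.
        return sum(ctm[r] + log2(n) for r, n in Counter(blocks).items())

    # Hypothetical CTM table for 2-bit blocks (illustrative values only).
    toy_ctm = {"00": 2.0, "01": 3.0, "10": 3.0, "11": 2.0}
    print(bdm("00000101", toy_ctm))  # (2.0 + log2(2)) + (3.0 + log2(2)) = 7.0

Note that repeated blocks contribute only a logarithmic term beyond their first occurrence, which is what lets BDM assign low complexity to highly repetitive objects that a frequency-based entropy measure might not distinguish.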

Keywords: Kolmogorov–Chaitin complexity; Shannon entropy; Thue–Morse sequence; algorithmic probability; algorithmic randomness; information content; information theory; π.