Beyond ℓ1 sparse coding in V1

PLoS Comput Biol. 2023 Sep 12;19(9):e1011459. doi: 10.1371/journal.pcbi.1011459. eCollection 2023 Sep.

Abstract

Growing evidence indicates that only a sparse subset from a pool of sensory neurons is active for the encoding of visual stimuli at any instant in time. Traditionally, to replicate such biological sparsity, generative models have used the ℓ1 norm as a penalty because of its convexity, which makes it amenable to fast and simple algorithmic solvers. In this work, we use biological vision as a test-bed and show that the soft thresholding operation associated with the ℓ1 norm is highly suboptimal in terms of performance compared to other functions suited to approximating ℓp with 0 ≤ p < 1 (including recently proposed continuous exact relaxations). We show that ℓ1 sparsity employs a pool with more neurons, i.e. has a higher degree of overcompleteness, in order to maintain the same reconstruction error as the other methods considered. More specifically, at the same sparsity level, the thresholding algorithm using the ℓ1 norm as a penalty requires a dictionary with ten times more units than the proposed approach, which uses a non-convex continuous relaxation of the ℓ0 pseudo-norm, to reconstruct the external stimulus equally well. At a fixed sparsity level, both ℓ0- and ℓ1-based regularization develop units with receptive field (RF) shapes similar to biological neurons in V1 (and a subset of neurons in V2), but ℓ0-based regularization shows approximately five times better reconstruction of the stimulus. Our results, in conjunction with recent metabolic findings, indicate that for V1 to operate efficiently it should follow a coding regime that uses a regularization closer to the ℓ0 pseudo-norm than to the ℓ1 norm, and suggest a similar mode of operation for the sensory cortex in general.
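To make the contrast between the two thresholding rules concrete, here is a minimal sketch in plain Python. It is an illustration only and not the method used in the paper: plain hard thresholding stands in for the ℓ0-style rule, whereas the paper's proposed approach uses a continuous relaxation of the ℓ0 pseudo-norm. The function names, the coefficient values, and the threshold `t` are all hypothetical.

```python
import math

def soft_threshold(values, t):
    """Elementwise proximal operator of t * |x| (l1 penalty):
    zero small entries and shrink the survivors toward zero by t."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in values]

def hard_threshold(values, t):
    """Simple l0-style rule (an assumption for illustration):
    keep entries with |x| > t unchanged and zero the rest."""
    return [v if abs(v) > t else 0.0 for v in values]

def squared_error(a, b):
    """Sum of squared differences between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical coefficient vector and threshold, chosen for illustration.
coeffs = [3.0, -0.5, 1.5, -2.0, 0.25]
t = 1.0

soft = soft_threshold(coeffs, t)
hard = hard_threshold(coeffs, t)

# Both rules keep the same number of active units here...
active_soft = sum(1 for v in soft if v != 0)
active_hard = sum(1 for v in hard if v != 0)

# ...but soft thresholding also biases the surviving coefficients.
err_soft = squared_error(coeffs, soft)
err_hard = squared_error(coeffs, hard)

print(active_soft, active_hard)
print(err_soft, err_hard)
```

In this toy case both rules leave the same units active, but soft thresholding also shrinks every surviving coefficient, so its output lies further from the input. This shows only the shrinkage bias at the level of coefficients; it does not reproduce the dictionary-learning experiments or the reconstruction results reported in the abstract.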

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Sensory Receptor Cells*

Grants and funding

DP and IR acknowledge the support received from the French National Research Agency (ANR) through Young Investigator (JCJC) grant project ‘Redundancy-free neuro-biological design of visual and auditory sensing’ (RUBIN-VASE). LUP received funding from the ANR project ‘Bio-mimetic agile aerial robots flying in real-life conditions’ (AgileNeuRobot), grant number ANR-20-CE23-0021. LC acknowledges the support received from the French National Centre for Scientific Research (CNRS) to the research group Information, Signal, Image and ViSion (ISIS) for the project ‘Sparse and non-convex optimisation for learning of inverse image microscopy problems’ (SPLIN). LC also received support through ANR JCJC project ‘Task-adapted bilevel learning of flexible statistical models for imaging and vision’ (TASKABILE), grant number ANR-22-CE48-0010. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.