"Found in Translation": An Evolutionary Framework for Auditory-Visual Relationships

Entropy (Basel). 2022 Nov 22;24(12):1706. doi: 10.3390/e24121706.

Abstract

The development of computational artifacts to study cross-modal associations has been a growing research topic, as they allow new degrees of abstraction. In this context, we propose a novel approach to the computational exploration of relationships between music and abstract images, grounded in findings from the cognitive sciences (emotion and perception). Due to the problem's high-level nature, we rely on evolutionary programming techniques to evolve this audio-visual dialogue. To articulate the complexity of the problem, we develop a framework with four modules: (i) vocabulary set, (ii) music generator, (iii) image generator, and (iv) evolutionary engine. We test our approach by evolving a set of images that corresponds to a given music set, steered by the expression of four emotions (angry, calm, happy, sad). We then perform preliminary user tests to evaluate whether users' perception is consistent with the system's expression. Results suggest agreement between users' emotional perception of the music-image pairs and the system outcomes, favoring the integration of cognitive science knowledge. We also discuss the benefits of employing evolutionary strategies, such as genetic programming, on multi-modal problems of a creative nature. Overall, this research contributes to a better understanding of the foundations of auditory-visual associations mediated by emotions and perception.
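To make the evolutionary idea concrete, the loop below is a minimal, purely illustrative sketch of how an evolutionary engine could steer candidate images toward the emotion expressed by a music piece. All names, the genotype representation, and the toy emotion model are assumptions for illustration; they are not the authors' implementation, which the abstract does not detail.

```python
import random

random.seed(0)  # reproducible illustration

EMOTIONS = ["angry", "calm", "happy", "sad"]

def random_genotype(size=8):
    """A candidate image encoded as a flat vector of visual parameters (assumed representation)."""
    return [random.uniform(-1.0, 1.0) for _ in range(size)]

def expressed_emotion(genotype):
    """Toy stand-in for a perceptual model mapping a rendered image to emotion scores."""
    chunk = len(genotype) // len(EMOTIONS)
    return [sum(genotype[i * chunk:(i + 1) * chunk]) / chunk
            for i in range(len(EMOTIONS))]

def fitness(genotype, target):
    """Negative squared distance between expressed and target emotion vectors."""
    expr = expressed_emotion(genotype)
    return -sum((e - t) ** 2 for e, t in zip(expr, target))

def evolve(target, pop_size=30, generations=50, mutation_rate=0.2):
    """Generational loop: select the fitter half, then refill via crossover and mutation."""
    population = [random_genotype() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, target), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(a))        # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(len(child)):              # Gaussian mutation
                if random.random() < mutation_rate:
                    child[i] += random.gauss(0.0, 0.3)
            children.append(child)
        population = parents + children
    return max(population, key=lambda g: fitness(g, target))

# Hypothetical target: a music piece classified as "happy" (one-hot over the four emotions).
target = [0.0, 0.0, 1.0, 0.0]
best = evolve(target)
```

After a few dozen generations the best genotype's expressed-emotion vector approaches the target, which is the role the paper assigns to the evolutionary engine; in the actual system the fitness would come from cognitively grounded mappings between visual features and emotions rather than this toy model.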

Keywords: abstract art; auditory–visual associations; computational creativity; emotions; genetic programming; image generation; music generation; perception.