A corpus of audio-visual Lombard speech with frontal and profile views

Najwa Alghamdi; Steve Maddock; Ricard Marxer; Jon Barker; Guy J Brown

doi:10.1121/1.5042758

J Acoust Soc Am. 2018 Jun;143(6):EL523. doi: 10.1121/1.5042758.

Authors

Najwa Alghamdi¹, Steve Maddock¹, Ricard Marxer¹, Jon Barker¹, Guy J Brown¹

Affiliation

¹ Department of Computer Science, University of Sheffield, Sheffield, United Kingdom. Electronic mail: nalghamdi@ksu.edu.sa, s.maddock@sheffield.ac.uk, marxer@univ-tln.fr, j.p.barker@sheffield.ac.uk, g.j.brown@sheffield.ac.uk

PMID: 29960497
DOI: 10.1121/1.5042758

Abstract

This paper presents a bi-view (front and side) audiovisual Lombard speech corpus, which is freely available for download. It contains 5400 utterances (2700 Lombard and 2700 plain reference utterances), produced by 54 talkers, with each utterance in the dataset following the same sentence format as the audiovisual "Grid" corpus [Cooke, Barker, Cunningham, and Shao (2006). J. Acoust. Soc. Am. 120(5), 2421-2424]. Analysis of this dataset confirms previous research, showing prominent acoustic, phonetic, and articulatory speech modifications in Lombard speech. In addition, gender differences are observed in the size of the Lombard effect. Specifically, female talkers exhibit a greater increase in estimated vowel duration and a greater reduction in F2 frequency.
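
As a rough illustration of the corpus layout described above, the following Python sketch builds sentences in the six-slot Grid format (command, color, preposition, letter, digit, adverb) and works out the per-talker utterance count. The word lists are recalled from the Grid corpus description by Cooke et al. (2006) and are not given in this abstract, so treat them as an assumption. The function names are hypothetical, and the even split of utterances across talkers is likewise an assumption that the abstract does not state.

```python
import random

# Six-slot Grid sentence format. The word lists are an assumption here:
# they follow the Grid corpus description (Cooke et al., 2006) and are
# not listed in the abstract itself.
GRID_SLOTS = {
    "command": ["bin", "lay", "place", "set"],
    "color": ["blue", "green", "red", "white"],
    "preposition": ["at", "by", "in", "with"],
    "letter": [c for c in "abcdefghijklmnopqrstuvwxyz" if c != "w"],
    "digit": ["zero", "one", "two", "three", "four",
              "five", "six", "seven", "eight", "nine"],
    "adverb": ["again", "now", "please", "soon"],
}


def grid_sentence(rng: random.Random) -> str:
    """Draw one word per slot, in slot order, and join them with spaces."""
    return " ".join(rng.choice(words) for words in GRID_SLOTS.values())


def utterances_per_talker(total: int, talkers: int) -> int:
    """Per-talker count, assuming utterances are split evenly (hypothetical)."""
    if total % talkers != 0:
        raise ValueError("utterances do not divide evenly across talkers")
    return total // talkers


if __name__ == "__main__":
    rng = random.Random(0)
    print(grid_sentence(rng))
    # Figures from the abstract: 5400 utterances, 54 talkers.
    print(utterances_per_talker(5400, 54))
```

Under the even-split assumption, the abstract's figures work out to 100 utterances per talker, that is, 50 Lombard and 50 plain.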

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Acoustics
Adaptation, Psychological*
Adolescent
Adult
Female
Humans
Male
Noise / adverse effects*
Phonetics
Sex Factors
Speech Acoustics*
Speech Perception*
Speech Production Measurement
Video Recording
Visual Perception*
Voice Quality*
Young Adult