The GEMMA speech database: VCV and VCCV words for the acoustic analysis of consonants and lexical gemination in Italian

Data Brief. 2022 Jun 27;43:108373. doi: 10.1016/j.dib.2022.108373. eCollection 2022 Aug.

Abstract

The GEMMA database consists of recordings of disyllabic words: vowel-consonant-vowel (VCV) for nongeminate cases and vowel-consonant-consonant-vowel (VCCV) for geminate cases. The consonants in the words are the stops /b/, /d/, /g/, /p/, /t/, /k/; the affricates /ts/, /dz/, /ʧ/, /ʤ/; the fricatives /f/, /v/, /s/, /z/ (singleton only) and /ʃ/ (geminate only); the nasals /m/, /n/ and /ɲ/ (geminate only); and the liquids /l/, /r/ and /ʎ/ (geminate only). The database also includes recordings for the glides /j/ and /w/. The vowels in the words are /a/, /i/ and /u/, and each word is symmetric with respect to the vowel, i.e., the same vowel precedes and follows the consonant. Six native adult speakers of Standard Italian (three female, three male), raised and living in Rome, Italy, uttered the speech materials in three different recording sessions, so that three repetitions of each word per speaker were collected. The dataset also includes the durations of the vowel and consonant segments for all cases where the consonant can be either singleton or geminate (see [1] and [2]).
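As an illustration of the design described above, the following minimal Python sketch enumerates the word templates implied by the consonant and vowel inventories. It is not part of the database distribution. The names `build_items`, `contrast_pairs` and `recording_count` are hypothetical, and the sketch assumes that "(singleton only)" applies to /z/ alone and that every consonant listed without a restriction occurs in both VCV and VCCV form. Glides are left out because the abstract does not state in which form they were recorded.

```python
from itertools import product

VOWELS = ["a", "i", "u"]

# Consonants listed without restriction: assumed here to occur both as
# singleton (VCV) and as geminate (VCCV). This is a reading of the abstract.
BOTH = ["b", "d", "g", "p", "t", "k",      # stops
        "ts", "dz", "ʧ", "ʤ",              # affricates
        "f", "v", "s",                     # fricatives
        "m", "n",                          # nasals
        "l", "r"]                          # liquids
SINGLETON_ONLY = ["z"]                     # assumption: restriction applies to /z/ only
GEMINATE_ONLY = ["ʃ", "ɲ", "ʎ"]            # ʎ is the palatal lateral


def build_items():
    """Return one record per word type (same vowel on both sides)."""
    items = []
    for vowel, cons in product(VOWELS, BOTH + SINGLETON_ONLY):
        items.append({"vowel": vowel, "consonant": cons,
                      "geminate": False, "template": "VCV"})
    for vowel, cons in product(VOWELS, BOTH + GEMINATE_ONLY):
        items.append({"vowel": vowel, "consonant": cons,
                      "geminate": True, "template": "VCCV"})
    return items


def contrast_pairs(items):
    """Return (vowel, consonant) keys that have both a VCV and a VCCV item."""
    singles = {(i["vowel"], i["consonant"]) for i in items if not i["geminate"]}
    gems = {(i["vowel"], i["consonant"]) for i in items if i["geminate"]}
    return sorted(singles & gems)


def recording_count(items, speakers=6, repetitions=3):
    """Tokens implied by the design: word types x speakers x repetitions."""
    return len(items) * speakers * repetitions


if __name__ == "__main__":
    items = build_items()
    print(len(items), "word types under the stated assumptions")
    print(len(contrast_pairs(items)), "singleton/geminate contrast pairs")
    print(recording_count(items), "tokens implied, glides excluded")
```

The contrast pairs correspond to the cases for which segment durations are reported, namely those where the consonant can be singleton or geminate. Any totals printed by the sketch follow from the assumptions above and should not be read as the official size of the database.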

Keywords: Italian; Lexical gemination; Speech processing; Speech recognition.