Real-time speech MRI datasets with corresponding articulator ground-truth segmentations

Sci Data. 2023 Dec 2;10(1):860. doi: 10.1038/s41597-023-02766-z.

Abstract

The use of real-time magnetic resonance imaging (rt-MRI) of speech is increasing in clinical practice and speech science research. Analysis of such images often requires segmentation of articulators and the vocal tract, and the community is turning to deep-learning-based methods to perform this segmentation. While there are publicly available rt-MRI datasets of speech, these do not include ground-truth (GT) segmentations, a key requirement for the development of deep-learning-based segmentation methods. To begin addressing this barrier, this work presents rt-MRI speech datasets of five healthy adult volunteers with corresponding GT segmentations and velopharyngeal closure patterns. The images were acquired using standard clinical MRI scanners, coils and sequences to facilitate acquisition of similar images in other centres. The datasets include manually created GT segmentations of six anatomical features, including the tongue, soft palate and vocal tract. In addition, this work makes code and instructions to implement a current state-of-the-art deep-learning-based method for segmenting rt-MRI speech datasets publicly available, thus providing the community with a starting point for developing such methods.
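
As a rough illustration of the kind of deep-learning-based articulator segmentation the abstract refers to, the sketch below runs a toy encoder-decoder network over a single grayscale rt-MRI frame and produces a per-pixel label map with one class per anatomical feature plus background. PyTorch, the tiny architecture, the 256x256 frame size and the seven-class labelling are all assumptions made here for illustration only; they are not the authors' released method, which should be obtained from the code and instructions accompanying the paper.

```python
# Hedged sketch (not the authors' released code): minimal per-pixel multi-class
# segmentation of one rt-MRI frame. Class count (background + 6 articulator /
# vocal-tract labels), frame size and architecture are illustrative assumptions.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Deliberately small encoder-decoder, a stand-in for a U-Net-style model."""
    def __init__(self, n_classes: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                          # 256 -> 128
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_classes, 1),              # per-pixel class logits
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# One grayscale rt-MRI frame (batch, channel, height, width); random data here.
frame = torch.rand(1, 1, 256, 256)
model = TinySegNet(n_classes=7)

with torch.no_grad():
    logits = model(frame)              # shape (1, 7, 256, 256)
    labels = logits.argmax(dim=1)      # shape (1, 256, 256) integer label map

print(labels.shape, labels.unique())
```

In practice such a model would be trained with a per-pixel loss (e.g. cross-entropy or Dice) against the manually created GT segmentations provided in the datasets; the sketch above only shows the inference shape conventions.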

Publication types

  • Dataset

MeSH terms

  • Adult
  • Dental Articulators*
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Magnetic Resonance Imaging / methods
  • Speech*