Muscle Cross-Sectional Area Segmentation in Transverse Ultrasound Images Using Vision Transformers

Diagnostics (Basel). 2023 Jan 6;13(2):217. doi: 10.3390/diagnostics13020217.

Abstract

Automatically measuring a muscle’s cross-sectional area is an important application in clinical practice that has been studied extensively in recent years for its ability to assess muscle architecture. Additionally, an adequately segmented cross-sectional area can be used to estimate the echogenicity of the muscle, another valuable parameter correlated with muscle quality. This study assesses state-of-the-art convolutional neural networks and vision transformers for automating this task on a new, large, and diverse database. This database consists of 2005 transverse ultrasound images of four muscles informative for neuromuscular disorders, recorded from 210 subjects of different ages, pathological conditions, and sexes. All of the evaluated deep learning models achieved near-human-level performance. In particular, the manual and automatic measurements of the cross-sectional area exhibit an average discrepancy of less than 38.15 mm², a result demonstrating the feasibility of automating this task. Moreover, the difference in muscle echogenicity estimated from these two readings is only 0.88, another indicator of the proposed method’s success. Furthermore, Bland–Altman analysis of the measurements exhibits no systematic errors, since most differences fall within the 95% limits of agreement, and the two readings have a Pearson’s correlation coefficient of 0.97 (p < 0.001, validation set) with ICC(2,1) surpassing 0.97, showing the reliability of this approach. Finally, as a supplementary analysis, the texture of the muscle’s visible cross-sectional area was examined using deep learning to investigate whether healthy subjects can be distinguished from patients with pathological conditions solely from the muscle texture. Our preliminary results indicate that such a task is feasible, but further and more extensive studies are required for more conclusive results.

Keywords: cross-sectional area; deep learning; textural analysis; ultrasound; vision transformers.
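To make the reported quantities concrete, the minimal sketch below (not the authors' code) illustrates one common way a cross-sectional area in mm² and an echogenicity estimate can be derived from a predicted binary segmentation mask, and how Bland–Altman limits of agreement and Pearson's correlation between manual and automatic readings can be computed. The pixel spacing, function names, and numeric values are illustrative assumptions, not values from the study.

```python
# Illustrative sketch only: deriving CSA (mm^2) and echogenicity from a
# predicted binary mask, plus simple agreement statistics between manual and
# automatic readings. Pixel spacing, names, and example values are assumed.
import numpy as np

def cross_sectional_area_mm2(mask, spacing_mm):
    """CSA = number of foreground pixels times the area of one pixel (mm^2)."""
    return float(mask.sum()) * spacing_mm[0] * spacing_mm[1]

def echogenicity(image, mask):
    """Echogenicity estimated as the mean gray level inside the segmented CSA."""
    return float(image[mask.astype(bool)].mean())

def bland_altman_limits(manual, automatic):
    """Bias and 95% limits of agreement (bias +/- 1.96 SD of the differences)."""
    diff = np.asarray(manual) - np.asarray(automatic)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# Hypothetical paired CSA measurements (mm^2) from the two readings:
manual = np.array([612.0, 580.5, 701.2, 655.8])
automatic = np.array([600.3, 590.1, 695.7, 660.0])
bias, lower, upper = bland_altman_limits(manual, automatic)
r = np.corrcoef(manual, automatic)[0, 1]  # Pearson's correlation coefficient
print(f"bias={bias:.2f} mm^2, LoA=[{lower:.2f}, {upper:.2f}], r={r:.3f}")
```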