Muscle Cross-Sectional Area Segmentation in Transverse Ultrasound Images Using Vision Transformers

Diagnostics (Basel). 2023 Jan 6;13(2):217. doi: 10.3390/diagnostics13020217.

Abstract

Automatically measuring a muscle’s cross-sectional area is an important application in clinical practice that has been studied extensively in recent years for its ability to assess muscle architecture. Additionally, an adequately segmented cross-sectional area can be used to estimate the echogenicity of the muscle, another valuable parameter correlated with muscle quality. This study assesses state-of-the-art convolutional neural networks and vision transformers for automating this task on a new, large, and diverse database. This database consists of 2005 transverse ultrasound images of four muscles informative for neuromuscular disorders, recorded from 210 subjects of different ages, pathological conditions, and sexes. All of the evaluated deep learning models achieved near-human-level performance. In particular, the manual and automatic measurements of the cross-sectional area exhibit an average discrepancy of less than 38.15 mm², a result demonstrating the feasibility of automating this task. Moreover, the difference in muscle echogenicity estimated from these two readings is only 0.88, another indicator of the proposed method’s success. Furthermore, Bland–Altman analysis of the measurements exhibits no systematic errors, since most differences fall within the 95% limits of agreement, and the two readings have a Pearson’s correlation coefficient of 0.97 (p < 0.001, validation set) with ICC(2,1) surpassing 0.97, showing the reliability of this approach. Finally, as a supplementary analysis, the texture of the muscle’s visible cross-sectional area was examined using deep learning to investigate whether healthy subjects can be distinguished from patients with pathological conditions solely from the muscle texture. Our preliminary results indicate that such a task is feasible, but further and more extensive studies are required for more conclusive results.

Keywords: cross-sectional area; deep learning; textural analysis; ultrasound; vision transformers.
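To make the reported quantities concrete, the minimal sketch below (not the authors' code) illustrates one common way a cross-sectional area in mm² and an echogenicity estimate can be derived from a predicted binary segmentation mask, and how Bland–Altman limits of agreement and Pearson's correlation between manual and automatic readings can be computed. The pixel spacing, function names, and numeric values are illustrative assumptions, not values from the study.

```python
# Illustrative sketch only: deriving CSA (mm^2) and echogenicity from a
# predicted binary mask, plus simple agreement statistics between manual and
# automatic readings. Pixel spacing, names, and example values are assumed.
import numpy as np

def cross_sectional_area_mm2(mask, spacing_mm):
    """CSA = number of foreground pixels times the area of one pixel (mm^2)."""
    return float(mask.sum()) * spacing_mm[0] * spacing_mm[1]

def echogenicity(image, mask):
    """Echogenicity estimated as the mean gray level inside the segmented CSA."""
    return float(image[mask.astype(bool)].mean())

def bland_altman_limits(manual, automatic):
    """Bias and 95% limits of agreement (bias +/- 1.96 SD of the differences)."""
    diff = np.asarray(manual) - np.asarray(automatic)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# Hypothetical paired CSA measurements (mm^2) from the two readings:
manual = np.array([612.0, 580.5, 701.2, 655.8])
automatic = np.array([600.3, 590.1, 695.7, 660.0])
bias, lower, upper = bland_altman_limits(manual, automatic)
r = np.corrcoef(manual, automatic)[0, 1]  # Pearson's correlation coefficient
print(f"bias={bias:.2f} mm^2, LoA=[{lower:.2f}, {upper:.2f}], r={r:.3f}")
```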