Reliability of Judging in DanceSport

Front Psychol. 2019 May 7:10:1001. doi: 10.3389/fpsyg.2019.01001. eCollection 2019.

Abstract

Purpose: The aim of this study was to assess the reliability and validity of the new judging system in DanceSport.

Methods: Eighteen judges rated the 12 best-placed adult dancing couples competing at an international competition. They marked each couple on all judging criteria on a 10-level scale. Absolute agreement and consistency of judging were calculated for all main judging criteria and sub-criteria.

Results: The mean correlation of overall judging marks was 0.48. Kendall's coefficient of concordance for overall marks (W = 0.58) suggested relatively low agreement among judges. Slightly lower coefficients were found for the artistic part [Partnering skills (W = 0.45) and Choreography and performance (W = 0.49)] than for the technical part [Technical qualities (W = 0.56) and Movement to music (W = 0.54)]. The intraclass correlation coefficient (ICC) for overall criteria was low for absolute agreement [ICC(2,3) = 0.62] but higher for consistency [ICC(3,3) = 0.80].
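
To make the reported statistics concrete, the following is a minimal sketch of how Kendall's W and average-measure ICCs can be computed from a judges-by-couples table of marks. It is not the authors' analysis code. It assumes the ICC(2,k) and ICC(3,k) labels follow the Shrout and Fleiss convention (two-way model, mean of k judges), and the function names and the demo marks are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kendalls_w(marks):
    """Kendall's W with tie correction; marks[j][c] is judge j's mark for couple c."""
    m, n = len(marks), len(marks[0])
    rank_rows = [average_ranks(row) for row in marks]
    rank_sums = [sum(row[c] for row in rank_rows) for c in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    # Tie correction: sum of (t^3 - t) over each judge's groups of tied marks.
    ties = 0
    for row in marks:
        counts = {}
        for v in row:
            counts[v] = counts.get(v, 0) + 1
        ties += sum(t ** 3 - t for t in counts.values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def icc_average(marks):
    """Return (ICC(2,k), ICC(3,k)) from two-way ANOVA mean squares."""
    k, n = len(marks), len(marks[0])  # k judges, n couples
    grand = sum(sum(row) for row in marks) / (n * k)
    couple_means = [sum(marks[j][c] for j in range(k)) / k for c in range(n)]
    judge_means = [sum(row) / n for row in marks]
    ss_couples = k * sum((x - grand) ** 2 for x in couple_means)
    ss_judges = n * sum((x - grand) ** 2 for x in judge_means)
    ss_total = sum((marks[j][c] - grand) ** 2 for j in range(k) for c in range(n))
    ss_error = ss_total - ss_couples - ss_judges
    bms = ss_couples / (n - 1)             # between-couples mean square
    jms = ss_judges / (k - 1)              # between-judges mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    icc2k = (bms - ems) / (bms + (jms - ems) / n)  # absolute agreement
    icc3k = (bms - ems) / bms                      # consistency
    return icc2k, icc3k


if __name__ == "__main__":
    # Hypothetical marks: three judges who rank four couples identically
    # but differ by a constant offset in how they use the scale.
    demo_marks = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
    print(kendalls_w(demo_marks))
    print(icc_average(demo_marks))
```

For the hypothetical demo marks, the consistency ICC is 1.0 while the absolute-agreement ICC is about 0.83, because the judges order the couples identically but sit at different levels of the scale. This is the same direction of gap as the one reported between ICC(3,3) and ICC(2,3).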

Conclusion: The relatively large differences between judges' marks suggest that judges either disagreed to some extent on the quality of the dancing or used the judging scale in different ways. The biggest concern was the standard error of measurement (SEM), which was often larger than the difference between dancers' scores, suggesting that this judging system lacks validity. This is the first study to assess judging in DanceSport, and it offers suggestions to potentially improve both its objectivity and validity in the future.
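
The SEM argument can be illustrated with a short sketch. The abstract does not state how the SEM was computed, so this assumes the common textbook form SEM = SD × sqrt(1 − reliability), with the reliability taken from a coefficient such as an ICC. The function names and the scores in the test data are hypothetical, not values from the study.

```python
import math
import statistics


def standard_error_of_measurement(scores, reliability):
    """SEM under the assumed formula SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    sd = statistics.stdev(scores)  # sample standard deviation of the scores
    return sd * math.sqrt(1.0 - reliability)


def gaps_within_sem(scores, sem):
    """Return adjacent pairs in the ranking whose score gap is smaller than the SEM."""
    ordered = sorted(scores, reverse=True)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if a - b < sem]
```

Any pair returned by `gaps_within_sem` is separated by less than one SEM, so under this assumption the judging system could not reliably order those two couples.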

Keywords: DanceSport; aesthetic sports; ballroom dance; judging system; reliability; validity.