Monitoring Variables Influence on Random Forest Models to Forecast Injuries in Short-Track Speed Skating

Front Sports Act Living. 2022 Jul 14:4:896828. doi: 10.3389/fspor.2022.896828. eCollection 2022.

Abstract

Injuries limit the athletes' ability to participate fully in their training and competitive process. They are detrimental to performance, affecting the athletes psychologically while limiting physiological adaptations and long-term development. This study aims to present a framework for developing random forest classifier models, forecasting injuries in the upcoming 1 to 7 days, to assist the performance support staff in reducing injuries and maximizing performance within the Canadian National Female Short-Track Speed Skating Program. Forty different variables monitored daily over two seasons (2018-2019 and 2019-2020) were used to develop two sets of forecasting models. The first set (TL) includes only training load variables, and the second (ALL) combines a wide array of monitored variables (neuromuscular function, heart rate variability, training load, psychological wellbeing, past injury type, and location). The sensitivity (ALL: 0.35 ± 0.19, TL: 0.23 ± 0.03), specificity (ALL: 0.81 ± 0.05, TL: 0.74 ± 0.03) and Matthews correlation coefficient (MCC) (ALL: 0.13 ± 0.05, TL: -0.02 ± 0.02) were computed. Paired t-tests on the MCC revealed statistically significant (p < 0.01) and large positive effects (Cohen's d > 1) for the ALL forecasting models' MCC over every forecasting window (1 to 7 days). These models were highly determined by the athletes' training completion, lower limb and trunk/lumbar injury history, as well as sFatigue, a training load marker. The TL forecasting models' MCC suggests that they add no value for forecasting injuries. Combining a wide array of monitored variables and quantifying the injury etiology conceptual components significantly improve the injury forecasting performance of random forest models. The ALL forecasting models' performance is promising, especially over time windows of one or two days, with sensitivities and specificities above 0.5 and 0.7, respectively. They could add value to the decision-making process for the support staff in order to assist the Canadian National Female Short-Track Speed Skating Program in reducing the number of incomplete training days, which could potentially increase performance. On longer forecasting time windows, the ALL forecasting models' sensitivity and MCC decrease gradually. Further work is needed to determine if such models could be useful for forecasting injuries over three days or longer.
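
For readers less familiar with the three metrics reported above, the following is a minimal sketch of how sensitivity, specificity, and MCC are computed from binary confusion-matrix counts. It is not the authors' code: the function name and the counts are hypothetical and chosen only for illustration, and the zero-denominator fallbacks are an assumed convention.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    tp/fn: injury days correctly / incorrectly forecast.
    tn/fp: non-injury days correctly / incorrectly forecast.
    """
    # Sensitivity: share of actual injury days that were forecast.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of non-injury days correctly left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between forecasts and outcomes, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts, not taken from the study.
sens, spec, mcc = classification_metrics(tp=5, fp=20, tn=80, fn=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Because injury days are rare relative to non-injury days, a model can reach a reasonable sensitivity and specificity while its MCC stays low. An MCC near zero, as reported for the TL models, indicates forecasts no better than chance.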

Keywords: data mining; high performance; machine learning; modeling; sport injury prevention.