A Learn-to-Rank Approach for Predicting Road Cycling Race Outcomes

Leonid Kholkine; Thomas Servotte; Arie-Willem de Leeuw; Tom De Schepper; Peter Hellinckx; Tim Verdonck; Steven Latré

doi:10.3389/fspor.2021.714107

Front Sports Act Living. 2021 Oct 6:3:714107. doi: 10.3389/fspor.2021.714107. eCollection 2021.

Authors

Leonid Kholkine¹, Thomas Servotte², Arie-Willem de Leeuw¹, Tom De Schepper¹, Peter Hellinckx¹, Tim Verdonck², Steven Latré¹

Affiliations

¹ Department of Computer Science, University of Antwerp-IMEC, Antwerp, Belgium.
² Department of Mathematics, University of Antwerp, Antwerp, Belgium.

Abstract

Professional road cycling is a highly competitive sport, and many factors influence the outcome of a race. These factors can be internal to the rider (e.g., psychological preparedness, physiological profile, and fitness), external (e.g., the weather or team strategy), or completely unpredictable (e.g., crashes or mechanical failures). This variety makes perfectly predicting the outcome of a given race impossible, and makes the sport all the more interesting. Nonetheless, before each race, journalists, ex-pro cyclists, websites, and cycling fans try to predict the top 3, 5, or 10 riders. In this article, we use easily accessible road cycling data from the past 20 years and the machine learning technique Learn-to-Rank (LtR) to predict the top 10 contenders in 1-day road cycling races. We accomplish this by assigning a relevance weight to each of the first 10 finishing positions. We assess the performance of this approach on the 2018, 2019, and 2021 editions of six spring classic 1-day races. Finally, we compare the output of the framework with a mass fan prediction using the Normalized Discounted Cumulative Gain (NDCG) metric and the number of correct top 10 guesses. We found that our model, on average, performs slightly better on both metrics than the mass fan prediction. We also analyze which variables of our model have the most influence on the prediction for each race. This approach can give fans interesting insights before a race, and can also help sports coaches predict how a rider might perform relative to riders outside the team.
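
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact relevance weights, so the linear mapping below (first place gets weight 10, tenth place gets weight 1, all other riders get 0) is an assumption made purely for illustration, as are the function names and the standard log2 discount.

```python
import math


def relevance(finish_place, top_k=10):
    """Hypothetical linear weight: 1st -> top_k, ..., top_k-th -> 1, otherwise 0.

    The weighting actually used in the article is not given in the abstract.
    """
    return top_k - finish_place + 1 if 1 <= finish_place <= top_k else 0


def dcg(relevances):
    """Discounted cumulative gain with the common log2(position + 1) discount."""
    return sum(rel / math.log2(pos + 1)
               for pos, rel in enumerate(relevances, start=1))


def ndcg_at_k(predicted, actual, k=10):
    """NDCG@k of a predicted ranking against the actual finishing order.

    Both arguments are lists of rider identifiers, best rider first.
    """
    place = {rider: i for i, rider in enumerate(actual, start=1)}
    # Riders missing from the actual result get place 0, hence relevance 0.
    gains = [relevance(place.get(rider, 0), k) for rider in predicted[:k]]
    ideal = [relevance(p, k) for p in range(1, min(k, len(actual)) + 1)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def correct_top_k(predicted, actual, k=10):
    """Number of predicted top-k riders who actually finished in the top k."""
    return len(set(predicted[:k]) & set(actual[:k]))
```

Under these assumptions, a prediction that reproduces the actual top 10 in order scores an NDCG of 1, a prediction naming the right riders in the wrong order scores below 1 while still counting 10 correct guesses, and a prediction with no rider in the actual top 10 scores 0 on both metrics.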

Keywords: cycling race performance; learn-to-rank; machine learning; road cycling; winner prediction.