Early prediction of student performance in CS1 programming courses

PeerJ Comput Sci. 2023 Oct 31:9:e1655. doi: 10.7717/peerj-cs.1655. eCollection 2023.

Abstract

Programming courses show high failure rates and low academic performance. Predicting student performance at an early stage helps address these issues, because it allows teachers to provide timely support and interventions that help students achieve their learning objectives. The prediction of student performance has gained significant attention, with researchers focusing on machine learning features and algorithms to improve predictions. This article proposes a model for predicting student performance in a 16-week CS1 programming course at three early checkpoints: weeks 3, 5, and 7. The model uses three key factors: grades, delivery time, and the number of attempts made by students in programming labs and an exam. Eight classification algorithms were used to train and evaluate the model, with performance assessed using metrics such as accuracy, recall, F1 score, and AUC. In week 3, the gradient boosting classifier (GBC) achieved the best results with an F1 score of 86%, followed closely by the random forest classifier (RFC) with 83%. These findings demonstrate the potential of the proposed model for accurately predicting student performance.
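The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The labels, predictions, and scores below are invented for illustration and are not data from the study; the function names and the convention that 1 marks a student at risk of failing are likewise assumptions, and the code is not the authors' implementation. Precision is computed only because F1 is defined from it.

```python
# Hypothetical sketch: accuracy, recall, F1, and AUC for a binary classifier.
# All values below are made up; 1 = "at risk of failing" is an assumed convention.

def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: eight students, true outcomes, hard predictions, and scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print("AUC:", auc)
```

Recall and F1 are usually more informative than accuracy in this setting, since the students who fail may be a minority class and a missed at-risk student is the costly error.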

Keywords: Early prediction; Model prediction; Predicting student performance; Programming course; Student performance.

Grants and funding

This work was supported by the Corporación Universitaria del Huila—CORHUILA, and COLCIENCIAS sponsored the doctoral studies of Jose Llanos Mosquera. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.