E-learning recommender system dataset

Data Brief. 2023 Feb 1:47:108942. doi: 10.1016/j.dib.2023.108942. eCollection 2023 Apr.

Abstract

Mandarine Academy is an Ed-Tech company that specializes in innovative corporate training techniques such as personalized Massive Open Online Courses (MOOCs) and web conferences. With more than 550K users spread across 100 active e-learning platforms, the company creates online pedagogical content (videos, quizzes, documents, etc.) on a daily basis to support the digitization of work environments and to keep up with current trends. Mandarine Academy provided us with access to Mooc.office365-training.com, a publicly available MOOC in both French and English versions, to conduct research on recommender systems in online learning environments. Mandarine Academy collects user feedback using two types of ratings: explicit (like button, social share, bookmarks) and implicit (watch time, page view). Unfortunately, explicit ratings are underutilized, because most users avoid the burden of stating their preferences explicitly. To address this, we shift our attention to implicit interactions, which generate more data that can be significant in some cases. Implicit ratings are what constitute the Mandarine Academy Recommender System (MARS) dataset. We believe that the degree of viewing has an impact on the overall impression. For this reason, we transformed part of the implicit data into a format similar to the explicit ratings found in other well-known datasets (e.g., MovieLens). This paper presents two real-world dataset variations that consist of 89,000 explicit ratings and 276,000 implicit ratings. Data were collected from early 2016 until late 2021. Only users who had rated at least one item were included. To protect their privacy, sensitive information has been removed. To the best of our knowledge, this is the first publicly available real-world dataset of e-learning recommendations in both French and English with mixed ratings (implicit and explicit), allowing the research community to focus on pre- and post-COVID-19 behavior in online learning.
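
The abstract says part of the implicit data was reshaped into an explicit rating format based on the degree of viewing, but it does not give the mapping. As a purely illustrative sketch, one plausible way to do this is to scale the fraction of a video watched onto an integer rating. The function name, the 1-to-10 scale, and the rounding rule below are all assumptions made for illustration and are not taken from the dataset description.

```python
import math


def watch_ratio_to_rating(watch_seconds, duration_seconds, scale_max=10):
    """Map the fraction of an item watched to an integer rating in [1, scale_max].

    Hypothetical helper: the scale and the ceiling rule are assumptions,
    not the transformation used to build the MARS dataset.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Clamp watch time so replays or logging noise cannot exceed the item length.
    watched = min(max(watch_seconds, 0), duration_seconds)
    # Round up so that any partial viewing counts toward the next rating step.
    rating = math.ceil(watched * scale_max / duration_seconds)
    # A recorded interaction always receives at least the lowest rating.
    return max(1, rating)


if __name__ == "__main__":
    # Half of a 60-second video watched.
    print(watch_ratio_to_rating(30, 60))
```

Rounding up is only one design choice. A threshold-based or logarithmic mapping would be equally defensible, and the right option depends on how watch time is distributed in the data.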

Keywords: Collaborative filtering; E-Learning; Explicit ratings; Implicit ratings; MOOC; Recommender systems.