Both sides of the story: comparing student-level data on reading performance from administrative registers to application generated data from a reading app

EPJ Data Sci. 2021;10(1):44. doi: 10.1140/epjds/s13688-021-00300-y. Epub 2021 Aug 19.

Abstract

The use of learning apps in school settings is growing and is thus producing an increasing amount of usage-generated data. However, this data has so far been used only to a very limited extent for monitoring and promoting learning progress. We test whether usage-generated data from a reading app holds potential for measuring reading ability and reading speed progress, and for identifying features of a school setting that promote learning. We analyze new data from three sources: (1) usage-generated data from a widely used reading app, (2) data from a national reading ability test, and (3) register data on student background and family characteristics. First, we find that the reading app data tells, to some degree, the same story about reading ability as the formal national reading ability test. Second, we find that the reading app data has the potential to monitor reading speed progress. Finally, we tested several models, including machine learning models. Two of these were able, with some degree of success, to identify variables associated with reading speed progress and to point to conditions that promote it. We discuss the results and present avenues for further research.

Keywords: Administrative data; Application generated data; Learning analytics; Machine learning; Reading speed; Reading speed progress; Reading test.