Data-Driven User-Type Clustering of a Physical Activity Promotion App: Usage Data Analysis Study

JMIR Form Res. 2022 Aug 1;6(8):e30149. doi: 10.2196/30149.

Abstract

Background: Physical inactivity remains a leading risk factor for mortality worldwide. Owing to increasing sedentary behavior (activities in a reclining, seated, or lying position with low energy expenditure), vehicle-based transport, and insufficient physical workload, the prevalence of physical activity decreases significantly with age. To promote sufficient levels of participation in physical activities, the research prototype Fit-mit-ILSE was developed with the goal of making adults aged ≥55 years physically fit and fit for the use of assistive technologies. The system combines active and assisted living technologies and smart services in the ILSE app.

Objective: User types of health and fitness apps, especially in the context of active and assisted living projects, have mainly been defined by experts through one-dimensional cluster thresholds based on app usage frequency. We aimed to investigate and present data-driven methods for clustering app user types and to identify usage patterns based on the ILSE app function Fit at home.

Methods: During the 2 phases of the field trials, ILSE app log data were collected from 165 participants. Using this data set, 2 data-driven approaches were applied for clustering to group app users who were similar to each other. First, the common approach of user-type clustering based on expert-defined thresholds was replaced by a data-driven derivation of the cluster thresholds using the Jenks natural breaks algorithm. Second, a multidimensional clustering approach using the Partitioning Around Medoids algorithm was explored to consider the detailed app usage pattern data.
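As an illustration of the first approach, the following is a minimal Python sketch of Jenks natural breaks: for sorted one-dimensional values, it chooses class boundaries that minimize the total within-class sum of squared deviations. This is not the study's implementation; the function names `jenks_breaks` and `assign_class` and the usage values are hypothetical and serve only to show how thresholds can be derived from the data rather than set by experts.

```python
from bisect import bisect_left
from itertools import accumulate


def jenks_breaks(values, k):
    """Return the upper bound of each of k classes (the last one is the maximum).

    Dynamic programming over the sorted values; minimizes the total
    within-class sum of squared deviations (SSD).
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums allow the SSD of any slice xs[i:j] to be computed in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: minimal total SSD when xs[:j] is split into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i

    # Walk back through the stored split points to recover the class bounds.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(xs[j - 1])
        j = split[c][j]
    bounds.reverse()
    return bounds


def assign_class(value, bounds):
    """Index (0 = lowest usage) of the class whose range contains value."""
    return bisect_left(bounds[:-1], value)


# Hypothetical mean uses per day for 11 users (not study data).
mean_usage = [0.05, 0.1, 0.12, 0.3, 0.35, 0.4, 0.9, 1.0, 1.1, 2.0, 2.2]
bounds = jenks_breaks(mean_usage, 4)
groups = [assign_class(u, bounds) for u in mean_usage]
```

The dynamic program runs in O(k·n²) time, which is unproblematic for a data set of the size described here (165 participants).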

Results: Applying the Jenks clustering algorithm to the mean usage per day and clustering the users into 4 groups showed that the largest group of users (63/165, 38.2%) used the Fit at home function between once a week and every second day. More men than women were in the low usage group. In addition, younger users were more often identified as moderate or high users than older users, who were mainly classified as low users; regional differences between Vienna and Salzburg were also identified. The multidimensional approach identified 4 different user groups that differed mainly in terms of time of use, gender, and region. Overall, younger women living in Salzburg had the highest average app usage.
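The multidimensional grouping relies on Partitioning Around Medoids (PAM), which represents each cluster by an actual data point (a medoid) and minimizes the summed distance of all points to their nearest medoid. The Python sketch below is a simplified variant under stated assumptions: it uses a greedy BUILD phase and a first-improvement SWAP phase (classic PAM applies the best swap in each iteration), a Manhattan distance, and hypothetical two-dimensional features (share of morning use, share of evening use). The features, the distance measure, and the names `pam` and `manhattan` are illustrative and are not taken from the study.

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pam(points, k, dist=manhattan):
    """Return (medoid indices, cluster label per point) for k clusters."""
    n = len(points)
    # Precompute all pairwise distances once.
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    def total_cost(medoids):
        # Each point contributes its distance to the nearest medoid.
        return sum(min(d[i][m] for m in medoids) for i in range(n))

    # BUILD: greedily add the point that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(min(candidates, key=lambda c: total_cost(medoids + [c])))

    # SWAP: replace a medoid with a non-medoid whenever that lowers the cost.
    cost = total_cost(medoids)
    improved = True
    while improved:
        improved = False
        for mi in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:mi] + [c] + medoids[mi + 1:]
                trial_cost = total_cost(trial)
                if trial_cost < cost - 1e-12:  # tolerance against float noise
                    medoids, cost, improved = trial, trial_cost, True

    labels = [min(range(k), key=lambda j: d[i][medoids[j]]) for i in range(n)]
    return medoids, labels


# Hypothetical per-user features: (share of morning use, share of evening use).
users = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15),
         (0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]
medoids, labels = pam(users, 2)
```

Because medoids are real users rather than averaged centroids, each cluster can be described by a representative participant, and categorical attributes such as gender or region could be handled by swapping in a distance measure suited to mixed data.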

Conclusions: The application of different clustering approaches showed that data-driven calculations of user groups can complement expert-based definitions, provide objective thresholds for the analysis of app usage data, and identify groups that can be targeted individually based on their specific group characteristics.

Keywords: Jenks natural breaks algorithm; Partitioning Around Medoids algorithm; active and assisted living; app usage; cluster analysis; physical activity promotion; usage groups.