Distributional data analysis via quantile functions and its application to modeling digital biomarkers of gait in Alzheimer's Disease

Biostatistics. 2023 Jul 14;24(3):539-561. doi: 10.1093/biostatistics/kxab041.

Abstract

With the advent of continuous health monitoring with wearable devices, users now generate their unique streams of continuous data such as minute-level step counts or heartbeats. Summarizing these streams via scalar summaries often ignores the distributional nature of wearable data and almost unavoidably leads to the loss of critical information. We propose to capture the distributional nature of wearable data via user-specific quantile functions (QFs) and use these QFs as predictors in scalar-on-quantile-function regression (SOQFR). As an alternative approach, we also propose to represent QFs via user-specific L-moments, robust rank-based analogs of traditional moments, and use L-moments as predictors in SOQFR (SOQFR-L). These two approaches provide two mutually consistent interpretations: in terms of quantile levels by SOQFR and in terms of L-moments by SOQFR-L. We also demonstrate how to deal with multi-modal distributional data via Joint and Individual Variation Explained (JIVE) using L-moments. The proposed methods are illustrated in a study of the association of digital gait biomarkers with cognitive function in Alzheimer's disease. Our analysis shows that the proposed methods demonstrate higher predictive performance and attain much stronger associations with clinical cognitive scales compared to simple distributional summaries.
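
To make the two distributional summaries concrete, the following is a minimal, dependency-free sketch of how one user's data stream could be reduced to an empirical quantile function and to its first four sample L-moments. The function names `empirical_quantiles` and `sample_l_moments` are hypothetical, and the L-moment estimator shown is the standard unbiased one based on probability-weighted moments; the abstract does not state which estimators the authors actually use, so this should be read as an illustration rather than their implementation.

```python
from math import floor


def empirical_quantiles(values, levels):
    """Empirical quantile function evaluated at the given levels in [0, 1].

    Uses linear interpolation between order statistics (an assumption;
    other interpolation rules are equally valid).
    """
    xs = sorted(values)
    n = len(xs)
    out = []
    for p in levels:
        pos = p * (n - 1)            # fractional index into the sorted sample
        lo = floor(pos)
        hi = min(lo + 1, n - 1)
        out.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return out


def sample_l_moments(values):
    """First four sample L-moments via unbiased probability-weighted moments."""
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations for four L-moments")

    # b[r] = (1/n) * sum_i w_r(i) * x_(i), where for 0-based rank i
    # w_r(i) = i (i-1) ... (i-r+1) / ((n-1)(n-2) ... (n-r)).
    b = [0.0, 0.0, 0.0, 0.0]
    for i, x in enumerate(xs):
        w = 1.0
        for r in range(4):
            if r > 0:
                w *= (i - (r - 1)) / (n - r)
            b[r] += w * x
    b = [v / n for v in b]

    l1 = b[0]                                        # location (the mean)
    l2 = 2 * b[1] - b[0]                             # scale
    l3 = 6 * b[2] - 6 * b[1] + b[0]                  # asymmetry
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]    # tail weight
    return l1, l2, l3, l4


# Illustrative toy values only, not data from the study.
toy_stream = [5, 1, 4, 2, 3]
qf_on_grid = empirical_quantiles(toy_stream, [0.0, 0.25, 0.5, 0.75, 1.0])
l_moments = sample_l_moments(toy_stream)
```

In a regression of the kind described above, `qf_on_grid` would play the role of a user-specific functional predictor evaluated on a grid of quantile levels, while `l_moments` would give a short vector of scalar predictors for the same user. For a symmetric sample such as the toy values here, the third L-moment is zero up to floating-point error.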

Keywords: Alzheimer’s disease; Gait; JIVE; L-Moments; Quantile functions; Scalar-on-quantile-function regression; Wearable data.

MeSH terms

  • Alzheimer Disease* / diagnosis
  • Data Analysis
  • Gait
  • Humans
  • Wearable Electronic Devices*