Predictors of attrition in a longitudinal population-based study of aging

Int Psychogeriatr. 2021 Aug;33(8):767-778. doi: 10.1017/S1041610220000447. Epub 2020 Apr 17.

Abstract

Background: Longitudinal studies predictably experience non-random attrition over time. Among older adults, risk factors for attrition may be similar to risk factors for outcomes such as cognitive decline and dementia, potentially biasing study results.

Objective: To characterize participants lost to follow-up, which can inform both study design and the interpretation of results.

Methods: In a population-based longitudinal study of aging with 10 years of annual follow-up, we compared the participants who dropped out (77%) with those who remained in the study. We used multivariable logistic regression models to identify predictors of attrition. We then implemented four machine learning approaches to predict attrition status from one wave to the next and compared the results of all five approaches.
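As an illustration of what predicting attrition status from one wave to the next involves, the following minimal sketch builds the prediction target. It rests on assumptions that are not in the abstract: each participant's record is taken to be a list of the annual waves attended, a participant seen at wave t is labeled as attrited (1) if absent from wave t+1, and the name `attrition_labels` and the toy data are hypothetical.

```python
def attrition_labels(attended, n_waves):
    """Label each attended wave with next-wave attrition status.

    attended: dict mapping participant id -> list of wave numbers attended
              (hypothetical layout, waves numbered from 1).
    Returns a dict {(participant_id, wave): label}, where label is 1 if the
    participant is absent from wave + 1 and 0 otherwise. The final wave has
    no following wave, so it receives no label.
    """
    labels = {}
    for pid, waves in attended.items():
        seen = set(waves)
        for wave in range(1, n_waves):
            if wave in seen:
                labels[(pid, wave)] = 0 if (wave + 1) in seen else 1
    return labels


# Toy data (invented for illustration): three participants, three waves.
attended = {"A": [1, 2, 3], "B": [1], "C": [1, 2]}
labels = attrition_labels(attended, n_waves=3)
for key in sorted(labels):
    print(key, labels[key])
```

Each labeled row would then be paired with the covariates measured at that wave, so that any classifier can be trained on the same targets.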

Results: Multivariable logistic regression identified those more likely to drop out as older, male, not living with another study participant, having lower cognitive test scores and higher clinical dementia ratings, lower functional ability, fewer subjective memory complaints, reporting no physical activity, no hobbies, and no engagement in social activities, having worse self-rated health, and leaving the house less often. Judged by the area under the receiver operating characteristic curve, the four machine learning approaches discriminated about as well as the multivariable logistic regression model.
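The discrimination comparison above relies on the area under the receiver operating characteristic curve (AUC), which equals the probability that a randomly chosen dropout receives a higher predicted risk than a randomly chosen retained participant. The sketch below computes it directly from that definition. The function name `roc_auc`, the outcome vector, and the two sets of predicted scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win,
    which matches the Mann-Whitney formulation of the AUC.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = dropped out before the next wave, 0 = retained.
y = [1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.4, 0.5, 0.2, 0.6, 0.3]  # hypothetical predicted risks
model_b = [0.8, 0.5, 0.5, 0.1, 0.7, 0.4]  # hypothetical predicted risks

auc_a = roc_auc(y, model_a)
auc_b = roc_auc(y, model_b)
print(round(auc_a, 3), round(auc_b, 3))
```

Two models whose AUC values are close, as in this toy case, separate dropouts from retained participants about equally well, even if their individual predictions differ.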

Conclusions: Attrition was most likely to occur in participants who were older, male, inactive, socially isolated, and cognitively impaired. Ignoring attrition would bias study results, especially when the missing data might be related to the outcome (e.g., cognitive impairment or dementia). We discuss possible solutions, including oversampling and other statistical modeling approaches.

Keywords: artificial neural network (ANN); epidemiology; gradient boosting machine (GBM); least absolute shrinkage and selection operator-type regression (LASSO); loss to follow-up; random forest (RF).

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Activities of Daily Living*
  • Aged
  • Aged, 80 and over
  • Aging / physiology*
  • Female
  • Health Behavior*
  • Humans
  • Logistic Models
  • Longitudinal Studies
  • Lost to Follow-Up*
  • Machine Learning
  • Male
  • Patient Dropouts
  • Population Surveillance
  • Quality of Life*