Identifying patterns of item missing survey data using latent groups: an observational study

BMJ Open. 2017 Oct 30;7(10):e017284. doi: 10.1136/bmjopen-2017-017284.

Abstract

Objectives: To examine whether respondents to a survey of health, physical activity and their potential determinants could be grouped according to the questions they left unanswered, known as 'item missing'.

Design: Observational study of longitudinal data.

Setting: Residents of Brisbane, Australia.

Participants: 6901 people aged 40-65 years in 2007.

Materials and methods: We used a latent class model with a mixture of multinomial distributions and chose the number of classes using the Bayesian information criterion. We used logistic regression to examine whether participants' characteristics were associated with their modal latent class, and whether the amount of item missing in one survey predicted wave missing in the following survey.
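The latent class step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each question is coded as a binary indicator (1 = missing, 0 = answered), which is the two-category case of a multinomial mixture, fits the model by expectation-maximisation, and compares candidate numbers of classes by the Bayesian information criterion. The function names, the simulated data and all parameter values are hypothetical.

```python
import numpy as np


def joint_log_lik(X, weights, probs):
    """Log of P(class k) * P(row i | class k) for every respondent/class pair."""
    return X @ np.log(probs).T + (1 - X) @ np.log(1 - probs).T + np.log(weights)


def fit_latent_class(X, n_classes, n_iter=200, seed=0, eps=1e-6):
    """Fit a latent class model to a binary item-missing matrix by EM.

    X has one row per respondent and one column per question (1 = missing).
    Returns class weights, per-class missing probabilities,
    posterior class memberships and the BIC.
    """
    rng = np.random.default_rng(seed)
    n, q = X.shape
    weights = np.full(n_classes, 1.0 / n_classes)
    probs = rng.uniform(0.2, 0.8, size=(n_classes, q))  # random start
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each respondent
        ll = joint_log_lik(X, weights, probs)
        resp = np.exp(ll - np.logaddexp.reduce(ll, axis=1, keepdims=True))
        # M-step: update class sizes and per-question missing probabilities
        class_size = resp.sum(axis=0)
        weights = np.clip(class_size / n, eps, None)
        weights = weights / weights.sum()
        probs = np.clip(
            (resp.T @ X) / np.maximum(class_size, eps)[:, None], eps, 1 - eps
        )
    # Final log-likelihood and posteriors under the fitted parameters
    ll = joint_log_lik(X, weights, probs)
    row_ll = np.logaddexp.reduce(ll, axis=1)
    resp = np.exp(ll - row_ll[:, None])
    n_params = (n_classes - 1) + n_classes * q
    bic = -2.0 * row_ll.sum() + n_params * np.log(n)
    return weights, probs, resp, bic


# Simulated data (assumption, not the study data): 600 respondents,
# 12 questions, a minority group with a high missing probability.
rng = np.random.default_rng(1)
is_high = rng.random(600) < 0.2
true_p = np.where(is_high[:, None], 0.5, 0.02)
X = (rng.random((600, 12)) < true_p).astype(float)

# Choose the number of classes with the lowest BIC
bic_by_k = {k: fit_latent_class(X, k)[3] for k in (1, 2, 3)}
best_k = min(bic_by_k, key=bic_by_k.get)

# Modal latent class: the class with the highest posterior probability
weights, probs, resp, _ = fit_latent_class(X, best_k)
modal_class = resp.argmax(axis=1)
```

The modal class assigned here is the kind of outcome that could then be related to participants' characteristics in a regression model. EM can converge to local optima, so in practice several random starts per number of classes would be advisable.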

Results: Four per cent of participants missed almost one-fifth of the questions, and this group missed more questions in the middle of the survey. Eighty-three per cent of participants completed almost every question, but had a relatively high missing probability for a question on sleep time, a question which had an inconsistent presentation compared with the rest of the survey. Participants who completed almost every question were generally younger and more educated. Participants who completed more questions were less likely to miss the next longitudinal wave.

Conclusions: Examining patterns in item missing data has improved our understanding of how missing data were generated and has informed future survey design to help reduce missing data.

Keywords: epidemiology; public health.

Publication types

  • Observational Study

MeSH terms

  • Adult
  • Aged
  • Australia
  • Bayes Theorem
  • Bias*
  • Data Collection* / statistics & numerical data
  • Female
  • Humans
  • Logistic Models
  • Male
  • Middle Aged
  • Surveys and Questionnaires*