Deletion diagnostics for the generalised linear mixed model with independent random effects

Stat Med. 2016 Apr 30;35(9):1488-501. doi: 10.1002/sim.6810. Epub 2015 Dec 2.

Abstract

The generalised linear mixed model (GLMM) is widely used for modelling environmental data. However, such data are prone to influential observations, which can distort the estimated exposure-response curve, particularly in regions of high exposure. Deletion diagnostics for iterative estimation schemes commonly derive the deleted estimates from a single iteration of the full system, holding certain pivotal quantities, such as the information matrix, constant. In this paper, we present an approximate formula for the deleted estimates and Cook's distance for the GLMM, which does not assume that the estimates of variance parameters are unaffected by deletion. The procedure allows the user to calculate standardised DFBETAs for mean as well as variance parameters. In certain cases, such as when the GLMM is used as a device for smoothing, such residuals for the variance parameters are of interest in their own right. In general, the procedure leads to deleted estimates of mean parameters that are corrected for the effect of deletion on variance components, because estimation of the two sets of parameters is interdependent. The probabilistic behaviour of these residuals is investigated, and a simulation-based procedure is suggested for their standardisation. The method is used to identify influential individuals in an occupational cohort exposed to silica. The results show that failure to conduct post-model-fitting diagnostics for variance components can lead to erroneous conclusions about the fitted curve and unstable confidence intervals.
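
For orientation only, the quantities named above (deleted estimates, Cook's distance, standardised DFBETAs) can be illustrated in the much simpler setting of ordinary least squares, where the deleted estimates have an exact closed form and no variance components are involved. The sketch below is not the GLMM procedure of the paper; the function name `ols_deletion_diagnostics` and its return layout are assumptions made for this illustration.

```python
import numpy as np


def ols_deletion_diagnostics(X, y):
    """Case-deletion diagnostics for ordinary least squares.

    Illustrative only: this is the classical linear-model case, not the
    GLMM formula described in the abstract.
    """
    n, p = X.shape
    XtX = X.T @ X
    XtX_inv = np.linalg.inv(XtX)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Leverage h_i = x_i' (X'X)^{-1} x_i
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    s2 = resid @ resid / (n - p)

    # Exact deleted estimates without refitting:
    # beta - beta_(i) = (X'X)^{-1} x_i e_i / (1 - h_i)
    shift = (X @ XtX_inv) * (resid / (1.0 - h))[:, None]
    beta_deleted = beta - shift

    # Cook's distance: (beta - beta_(i))' X'X (beta - beta_(i)) / (p s^2)
    cooks = np.einsum("ij,jk,ik->i", shift, XtX, shift) / (p * s2)

    # Standardised DFBETAs use the residual variance with case i deleted
    s2_deleted = ((n - p) * s2 - resid**2 / (1.0 - h)) / (n - p - 1)
    dfbetas = shift / np.sqrt(np.outer(s2_deleted, np.diag(XtX_inv)))

    return beta, beta_deleted, cooks, dfbetas
```

Row `i` of `beta_deleted` is the estimate that a refit without observation `i` would give. In the GLMM setting of the paper, the analogous quantities are approximate and also cover the variance parameters, which this sketch does not attempt.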

Keywords: Cook's distance; DFBETAs; deletion diagnostics; exposure-response; generalised linear mixed models.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Data Interpretation, Statistical
  • Datasets as Topic
  • Environmental Exposure / statistics & numerical data
  • Humans
  • Linear Models*
  • Models, Statistical