A method of correction for heaping error in the variables using validation data

Stat Pap (Berl). 2023 Feb 21:1-18. doi: 10.1007/s00362-023-01405-4. Online ahead of print.

Abstract

When self-reported data are used to estimate the mean and variance of a distribution, or the parameters of a regression model, the estimates are in many cases biased. This is because interviewees tend to heap their answers at certain values. This paper examines the bias induced by heaping error in self-reported data and studies the effect of heaping error on the mean and variance of a distribution as well as on regression parameters. A new method is then introduced that uses validation data to correct the bias due to heaping error. Publicly available data and simulation studies show that the method is practical and can easily be applied to correct the bias in the estimated mean and variance, as well as in the estimated regression parameters, computed from self-reported data. The correction presented in this paper therefore helps researchers draw more accurate conclusions and make better-informed decisions, e.g. regarding health care planning and delivery.
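To make the bias mechanism concrete, the following is a minimal simulation sketch; it is not the estimator developed in the paper. It assumes a hypothetical heaping rule (each respondent rounds the true value down to a multiple of 5 with probability 0.6) and a validation subsample in which both the true and the reported value are observed. The correction shown is a simple additive one: the mean reporting error estimated in the validation subsample is subtracted from the naive mean, and the error variance is subtracted from the naive variance under the added assumption that the error is uncorrelated with the true value. All function names, parameter values, and the heaping rule are illustrative assumptions.

```python
import random
import statistics


def heap_down(x, rng, step=5, p_heap=0.6):
    """Hypothetical heaping rule: with probability p_heap, report x
    rounded down to a multiple of `step`; otherwise report x exactly."""
    if rng.random() < p_heap:
        return step * (x // step)
    return x


def simulate(n=20000, n_valid=2000, seed=1):
    """Simulate heaped self-reports and a simple validation-based correction."""
    rng = random.Random(seed)

    # True values (unobserved in practice, except in the validation subsample).
    true = [rng.gauss(30.0, 8.0) for _ in range(n)]
    reported = [heap_down(x, rng) for x in true]

    # Validation subsample: the first n_valid units, where both the true
    # and the reported value are assumed to be available.
    errors = [r - t for r, t in zip(reported[:n_valid], true[:n_valid])]
    bias_hat = statistics.fmean(errors)          # estimated mean reporting error
    error_var_hat = statistics.variance(errors)  # estimated error variance

    naive_mean = statistics.fmean(reported)
    naive_var = statistics.variance(reported)

    return {
        "true_mean": statistics.fmean(true),
        "true_var": statistics.variance(true),
        "naive_mean": naive_mean,
        "naive_var": naive_var,
        # Additive corrections; the variance correction assumes the
        # reporting error is uncorrelated with the true value.
        "corrected_mean": naive_mean - bias_hat,
        "corrected_var": naive_var - error_var_hat,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>15}: {value:.3f}")
```

Under this rule the naive mean is pulled downward, because heaped answers never exceed the true value, while the naive variance is inflated by the extra reporting noise. The validation-based quantities move both estimates back toward the true values. Regression parameters could be examined in the same way by adding an outcome variable to the simulation.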

Keywords: Bias; Heaping error; Measurement error; Self-reported data.