[Variance estimation considering multistage sampling design in multistage complex sample analysis]

Yichong Li; Yinjun Zhao; Limin Wang; Mei Zhang; Maigeng Zhou

doi:10.3760/cma.j.issn.0254-6450.2016.03.028

Zhonghua Liu Xing Bing Xue Za Zhi. 2016 Mar;37(3):425-9. doi: 10.3760/cma.j.issn.0254-6450.2016.03.028.

[Article in Chinese]

Authors

Yichong Li¹, Yinjun Zhao¹, Limin Wang¹, Mei Zhang¹, Maigeng Zhou²

Affiliations

¹ Division of Surveillance, Chinese Center for Disease Control and Prevention, Beijing 100050, China.
² National Center for Chronic and Noncommunicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100050, China.

PMID: 27005551

Abstract

Multistage sampling is frequently used in random sample surveys in public health. The resulting sample, often called a complex sample, usually exhibits clustering, that is, non-independence between observations. If the multistage design is not taken into account in the analysis, sampling error may be underestimated and the probability of type I error inflated. Because variance (error) estimators for complex samples are often complicated, statistical software usually adopts the ultimate cluster variance estimate (UCVE) as an approximation, which simply assumes that the sample comes from one-stage sampling. However, as the sampling fraction of primary sampling units (PSUs) increases, the contribution of subsequent sampling stages is no longer trivial, and UCVE may therefore lead to invalid variance estimation. This paper summarizes a method of variance estimation that considers the multistage sampling design. The performance of UCVE and of the method considering the multistage design is compared by simulating random sampling under different sampling schemes using real-world data. The simulation showed that as the PSU sampling fraction increased, UCVE tended to generate increasingly biased estimates, whereas the method considering the multistage sampling design yielded accurate estimates.
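
The contrast between the two approaches can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so the code below assumes the textbook case of two-stage simple random sampling without replacement and estimation of a population total. The function names, the `fpc` flag, and the toy data are all hypothetical. With N PSUs in the population, n sampled PSUs, PSU sizes M_i, and m_i sampled elements per PSU, the ultimate cluster estimate uses only the spread of the estimated PSU totals, while the two-stage estimate adds a within-PSU term.

```python
from statistics import mean, variance


def psu_total_estimates(sizes, samples):
    """Estimated total of each sampled PSU: M_i times the PSU sample mean."""
    return [M * mean(y) for M, y in zip(sizes, samples)]


def estimate_total(N, sizes, samples):
    """Estimated population total: (N / n) times the sum of PSU total estimates."""
    t_hat = psu_total_estimates(sizes, samples)
    return N / len(t_hat) * sum(t_hat)


def ucve_variance(N, sizes, samples, fpc=False):
    """Ultimate cluster variance estimate (assumed textbook form).

    Uses only the between-PSU variability of the estimated PSU totals.
    With fpc=False the PSUs are treated as drawn with replacement;
    with fpc=True a first-stage finite population correction is applied.
    """
    t_hat = psu_total_estimates(sizes, samples)
    n = len(t_hat)
    v = N ** 2 * variance(t_hat) / n
    return v * (1 - n / N) if fpc else v


def two_stage_variance(N, sizes, samples):
    """Two-stage variance estimate (assumed textbook form, SRS at both stages).

    First-stage term with finite population correction, plus the
    second-stage (within-PSU) term that the ultimate cluster form omits.
    """
    n = len(samples)
    first_stage = ucve_variance(N, sizes, samples, fpc=True)
    second_stage = N / n * sum(
        M ** 2 * (1 - len(y) / M) * variance(y) / len(y)
        for M, y in zip(sizes, samples)
    )
    return first_stage + second_stage


# Toy data (hypothetical): 2 of 4 PSUs sampled, so the PSU sampling fraction is 0.5.
N = 4
sizes = [10, 20]                 # M_i: number of elements in each sampled PSU
samples = [[1, 3], [2, 4, 6]]    # observed values within each sampled PSU

total = estimate_total(N, sizes, samples)
v_ucve_wr = ucve_variance(N, sizes, samples)
v_ucve_fpc = ucve_variance(N, sizes, samples, fpc=True)
v_two_stage = two_stage_variance(N, sizes, samples)

print(total, v_ucve_wr, v_ucve_fpc, v_two_stage)
```

Under these assumptions, the three estimates nearly coincide when n/N is small. As n/N grows, the with-replacement form overstates the variance and the form with only a first-stage correction understates it, which is consistent with the bias pattern the abstract reports for UCVE.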

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Analysis of Variance*
Bias
Cluster Analysis
Computer Simulation
Humans
Research Design*
Software