Evaluation of Respondent-Driven Sampling Prevalence Estimators Using Real-World Reported Network Degree

Lisa Avery; Michael Rotondi

doi:10.1177/00811750231163832

Sociol Methodol. 2023 Aug;53(2):269-287. doi: 10.1177/00811750231163832. Epub 2023 Apr 21.

Authors

Lisa Avery¹ ², Michael Rotondi³

Affiliations

¹ Department of Biostatistics, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada.
² Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
³ York University, Toronto, ON, Canada.

Abstract

Respondent-driven sampling (RDS) is used to measure trait or disease prevalence in populations that are difficult to reach and often marginalized. The authors evaluated the performance of RDS estimators under varying conditions of trait prevalence, homophily, and relative activity. They used large simulated networks (N = 20,000) derived from real-world RDS degree reports and an empirical Facebook network (N = 22,470) to evaluate estimators of binary and categorical trait prevalence. Variability in prevalence estimates is higher when network degree is drawn from real-world samples than from the commonly assumed Poisson distribution, resulting in lower coverage rates. Newer estimators perform well when the sample is a substantive proportion of the population, but bias is present when the population size is unknown. The choice of preferred RDS estimator needs to be study specific, considering both statistical properties and knowledge of the population under study.
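To illustrate why reported network degree matters for these estimators, the following is a minimal sketch of one widely used degree-based approach, the inverse-degree-weighted estimator usually called RDS-II or Volz-Heckathorn. The abstract does not state which estimators were compared, so this choice is an assumption made for illustration, and the function name and sample data are hypothetical.

```python
from typing import Sequence


def rds_ii_prevalence(has_trait: Sequence[bool], degrees: Sequence[float]) -> float:
    """Inverse-degree-weighted prevalence estimate (RDS-II / Volz-Heckathorn).

    Illustrative only: the function name and inputs are hypothetical and are
    not taken from the article.
    """
    if len(has_trait) != len(degrees):
        raise ValueError("has_trait and degrees must have the same length")
    if len(degrees) == 0:
        raise ValueError("sample must not be empty")
    if any(d <= 0 for d in degrees):
        raise ValueError("reported degrees must be positive")

    # Respondents with many contacts are more likely to be recruited,
    # so each respondent is down-weighted by their reported degree.
    weights = [1.0 / d for d in degrees]
    trait_weight = sum(w for w, t in zip(weights, has_trait) if t)
    return trait_weight / sum(weights)


# Hypothetical sample: respondents with the trait report higher degrees.
sample_trait = [True, False, True, False]
sample_degree = [10, 2, 5, 5]

naive = sum(sample_trait) / len(sample_trait)
weighted = rds_ii_prevalence(sample_trait, sample_degree)
print(f"naive proportion: {naive:.2f}, degree-weighted estimate: {weighted:.2f}")
```

Because the weights depend directly on reported degree, a heavy-tailed or noisy degree distribution makes the weights, and therefore the estimate, more variable than a Poisson degree distribution would, which is consistent with the higher variability the abstract reports.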

Keywords: disease prevalence; respondent-driven sampling; sampling; social networks; validation.