Estimating the size of populations at high risk for HIV using respondent-driven sampling data

Biometrics. 2015 Mar;71(1):258-266. doi: 10.1111/biom.12255. Epub 2015 Jan 13.

Abstract

The study of hard-to-reach populations presents significant challenges. Typically, a sampling frame is not available, and population members are difficult to identify or recruit from broader sampling frames. This is especially true of populations at high risk for HIV/AIDS. Respondent-driven sampling (RDS) is often used in such settings with the primary goal of estimating the prevalence of infection. In such populations, the number of people at risk for infection and the number of people infected are of fundamental importance. This article presents a case study of estimating the size of a hard-to-reach population from data collected through RDS. We study two populations in El Salvador: female sex workers and men who have sex with men. The approach is Bayesian, and we consider different forms of prior information, including the UNAIDS population size guidelines for this region. We show that the method is able to quantify the amount of information on population size available in RDS samples. As a separate validation, we compare our results to those estimated by extrapolating from a capture-recapture study of Salvadoran cities. The results of our case study are largely comparable to those of the capture-recapture study when they differ from the UNAIDS guidelines. Our method is widely applicable to data from RDS studies, and we provide a software package to facilitate its use.

Keywords: Hard-to-reach population sampling; Model-based survey sampling; Network sampling; Social networks; Successive sampling.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Data Interpretation, Statistical*
  • El Salvador / epidemiology
  • Epidemiologic Methods
  • HIV Infections / epidemiology*
  • Homosexuality, Male / statistics & numerical data*
  • Humans
  • Male
  • Models, Statistical*
  • Prevalence
  • Reproducibility of Results
  • Risk Assessment / methods*
  • Sample Size
  • Sensitivity and Specificity
  • Urban Population / statistics & numerical data*