Multiple seed structure and disconnected networks in respondent-driven sampling
(English) Manuscript (preprint) (Other academic)
Respondent-driven sampling (RDS) is a link-tracing sampling method that is especially suitable for sampling hidden populations. RDS combines an efficient snowball-type sampling scheme with inferential procedures that yield unbiased population estimates under certain assumptions about the sampling procedure and population structure. Several seed individuals are typically used to initiate RDS recruitment. However, standard RDS estimation theory assumes that all sampled individuals originate from a single seed. We use a random walk with teleportation to describe the multiple seed structure of RDS and develop an estimator based on this process. The new estimator is also valid for populations with disconnected social networks. We evaluate our estimator numerically by simulations on artificial and real networks. Our estimator outperforms previous estimators, especially when the proportion of seeds in the sample is large. We recommend using the new estimator in RDS studies, in particular when the number of seeds is large or the social network of the population is disconnected.
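The following minimal sketch illustrates the general idea of a random walk with teleportation on a disconnected network. It is not the estimator developed in the manuscript. The function name `teleporting_walk`, the parameter `restart_prob`, the uniform choice among seeds, and the toy network are all assumptions made for illustration only.

```python
import random

def teleporting_walk(adj, n_steps, restart_prob, seed_nodes, rng):
    """Simulate a random walk that occasionally jumps back to a seed.

    Hypothetical illustration: at each step the walker teleports to a
    uniformly chosen seed with probability restart_prob (or when it has
    no neighbors); otherwise it moves to a uniformly chosen neighbor.
    """
    current = rng.choice(seed_nodes)
    visited = [current]
    for _ in range(n_steps - 1):
        neighbors = adj[current]
        if not neighbors or rng.random() < restart_prob:
            current = rng.choice(seed_nodes)   # teleport to a seed
        else:
            current = rng.choice(neighbors)    # ordinary link-tracing step
        visited.append(current)
    return visited

# Toy network with two components that share no edges (assumed data).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],   # component A: a triangle
    3: [4], 4: [3, 5], 5: [4],         # component B: a path
}

# A single seed and no teleportation: the walk never leaves component A.
single_seed_walk = teleporting_walk(adj, 500, 0.0, [0], random.Random(0))

# One seed per component plus teleportation: both components can be reached.
multi_seed_walk = teleporting_walk(adj, 2000, 0.2, [0, 3], random.Random(1))

print(sorted(set(single_seed_walk)))
print(sorted(set(multi_seed_walk)))
```

With a single seed and no teleportation the walk is confined to the component of that seed, which is why a one-seed model cannot describe a disconnected population. Placing seeds in several components and allowing jumps between them lets the process reach all of them.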
Probability Theory and Statistics
Research subject: Mathematical Statistics
Identifiers: URN: urn:nbn:se:su:diva-129257; OAI: oai:DiVA.org:su-129257; DiVA: diva2:920624
Funder: Swedish Research Council, 621-2012-3868