Estimating the size of hidden populations from register data
Stockholm University, Faculty of Social Sciences, Centre for Social Research on Alcohol and Drugs (SoRAD).
Stockholm University, Faculty of Social Sciences, Centre for Social Research on Alcohol and Drugs (SoRAD).
2014 (English). In: BMC Medical Research Methodology, ISSN 1471-2288, Vol. 14, 58- p. Article in journal (Refereed), Published
Abstract [en]

Background: Prevalence estimates of drug use, or of its consequences, are considered important in many contexts and may have substantial influence over public policy. However, it is rarely possible to simply count the relevant individuals, in particular when the defining characteristics might be illegal, as in the case of drug use. Consequently, methods are needed to estimate the size of such partly 'hidden' populations, and many such methods have been developed and used within epidemiology, including in studies of alcohol and drug use. Here we introduce a method appropriate for estimating the size of human populations given a single source of data, for example entries in a health-care registry.

Methods: The setup is the following: during a fixed time period, e.g. a year, individuals belonging to the target population have a non-zero probability of being registered. Each individual might be registered multiple times, and the time points of the registrations are recorded. Assuming that the population is closed and that the probability of being registered at least once is constant, we derive a family of maximum likelihood (ML) estimators of total population size. We study the ML estimator using Monte Carlo simulations and delimit the range of cases where it is useful. In particular, we investigate the effect of making the population heterogeneous with respect to the probability of being registered.

Results: The new estimator is asymptotically unbiased, and we show that high-precision estimates can be obtained for samples covering as little as 25% of the total population size. However, if the total population size is small (say, on the order of 500), a larger fraction needs to be sampled to achieve reliable estimates. Further, we show that the estimator gives reliable estimates even when individuals differ in the probability of being registered. We also compare the ML estimator to an estimator known as Chao's estimator and show that the latter can have a substantial bias when applied to epidemiological data.

Conclusions: The population size estimator suggested herein complements existing methods and is less sensitive to certain types of dependencies typical in epidemiological data.
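
For orientation, the single-source setting described in the abstract can be sketched in Python. This is an illustration only and is not the estimator derived in the article, which also uses the time points of registrations and is not specified in this record. The sketch shows two standard count-based estimators: Chao's lower-bound estimator, n + f1^2 / (2 f2), and the classical zero-truncated Poisson ML estimator. The function names and the example data are hypothetical.

```python
import math
from collections import Counter


def chao_estimate(counts):
    """Chao's estimator from per-individual registration counts (all >= 1).

    n is the number of observed individuals, f1 and f2 the numbers of
    individuals registered exactly once and exactly twice.
    """
    freq = Counter(counts)
    n = len(counts)
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    if f2 == 0:
        # Bias-corrected form, used here to avoid division by zero.
        return n + f1 * (f1 - 1) / 2.0
    return n + f1 * f1 / (2.0 * f2)


def truncated_poisson_estimate(counts, tol=1e-10, max_iter=1000):
    """Classical zero-truncated Poisson ML estimate of population size.

    Solves lam / (1 - exp(-lam)) = mean(counts) by fixed-point iteration,
    then scales n by the estimated probability of at least one registration.
    Returns (estimated population size, estimated rate lam).
    """
    n = len(counts)
    mean = sum(counts) / n
    if mean <= 1.0:
        raise ValueError("mean count must exceed 1 for a finite estimate")
    lam = mean  # starting value above the solution; iteration decreases it
    for _ in range(max_iter):
        new_lam = mean * (1.0 - math.exp(-lam))
        if abs(new_lam - lam) < tol:
            lam = new_lam
            break
        lam = new_lam
    return n / (1.0 - math.exp(-lam)), lam


# Hypothetical register: 60 people seen once, 30 twice, 10 three times.
example_counts = [1] * 60 + [2] * 30 + [3] * 10
chao_n = chao_estimate(example_counts)
tp_n, tp_lam = truncated_poisson_estimate(example_counts)
```

Both functions assume a closed population and take only the number of registrations per observed individual as input. The truncated Poisson version additionally assumes that every individual is registered at the same rate.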

Place, publisher, year, edition, pages
2014. Vol. 14, 58- p.
Keyword [en]
Prevalence, Hidden population, Capture-recapture, Truncated Poisson, Opiates, Heroin, Mortality
National Category
Sociology; Substance Abuse
URN: urn:nbn:se:su:diva-104419
DOI: 10.1186/1471-2288-14-58
ISI: 000335462100001
OAI: diva2:725922


Available from: 2014-06-17. Created: 2014-06-10. Last updated: 2014-06-17. Bibliographically approved.

Authors
Ledberg, Anders; Wennberg, Peter