Some Contributions to Statistical Disclosure Control
Stockholm University, Faculty of Social Sciences, Department of Statistics.
Responsible organisation
2002 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

An important issue associated with the release of statistical data is the possibility of disclosing individual information about respondents. Statistical disclosure control (SDC) is the discipline that deals with methods of producing statistical data that are safe enough to be released while retaining their analytical value, and with methods of assessing disclosure risks. This thesis deals with both aspects.

In the first paper, a method for limiting disclosure risks in microdata (individual data) is described. The method is a variant of so-called data-swapping, is intended for quantitative data, and is based on the rank structure of the original data. Theoretical results and simulation studies indicate that the method performs at least reasonably well when applied to bivariate normal data.

An important measure of identification risk associated with the release of microdata or large complex tables is the proportion of population units that can be uniquely identified by a set of matchable attributes. In the second paper, a model based on the Poisson-inverse Gaussian distribution is proposed as a possible approach within this context. Disclosure risk measures are discussed and derived under the proposed model, as are various methods of estimation. The results indicate that the model may be a useful and analytically tractable alternative to other models.

The third paper reports the results of an empirical comparison between different methods of assessing file-level disclosure risk, as measured by the estimated number of unique population units amongst unique records and the number of unique units in the population. The results indicate that no one model or method performs uniformly best and that performance varies greatly between different types of data.

The fourth and last paper presents a method for assessing a per-record measure of disclosure risk based on a Poisson-inverse Gaussian regression model. Per-record measures may be used to identify sensitive (atypical) records in a file, which can then be modified separately using SDC techniques prior to release. The method builds on loglinear modelling and is exemplified using both sample and population level information. The results indicate that the model provides a tractable alternative to the Poisson-lognormal model and that using population level information sharpens the measure.
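The uniqueness measure mentioned in the abstract, the proportion of units that can be uniquely identified by a set of matchable attributes, can be illustrated with a minimal sketch. This is not the thesis's estimation method (which models uniqueness via the Poisson-inverse Gaussian distribution); it only counts uniques directly in a fully observed file. The function name `proportion_unique`, the attribute names, and the toy records are all hypothetical.

```python
from collections import Counter


def proportion_unique(records, key_attributes):
    """Share of records whose combination of key attribute values occurs exactly once."""
    if not records:
        return 0.0
    # Build the key (cell) each record falls into.
    keys = [tuple(rec[a] for a in key_attributes) for rec in records]
    # Count how many records share each key.
    counts = Counter(keys)
    n_unique = sum(1 for k in keys if counts[k] == 1)
    return n_unique / len(records)


# Hypothetical toy file: four records, three matchable attributes.
population = [
    {"age_group": "30-39", "sex": "F", "region": "North"},
    {"age_group": "30-39", "sex": "F", "region": "North"},
    {"age_group": "40-49", "sex": "M", "region": "South"},
    {"age_group": "50-59", "sex": "F", "region": "South"},
]

print(proportion_unique(population, ["age_group", "sex", "region"]))
```

In practice only a sample is usually available, so the population-level proportion has to be estimated under a model rather than counted as here.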

Place, publisher, year, edition, pages
Stockholm: Department of Statistics, Stockholm University, 2002. 114 p.
Keyword [en]
Data dissemination, Data-swapping, Disclosure control, Poisson-inverse Gaussian, Risk-per-record, Superpopulation, Uniqueness
National Category
Probability Theory and Statistics
Research subject
URN: urn:nbn:se:su:diva-7813
ISBN: 91-7265-568-2
OAI: diva2:199087
Public defence
2003-01-17, Högbomsalen, Geovetenskapens hus, Stockholm, 10:00 (English)
Available from: 2002-12-25 Created: 2002-12-25 Last updated: 2009-12-16 Bibliographically approved

Open Access in DiVA

No full text
