Theory of selective editing with score functions
Stockholm University, Faculty of Social Sciences, Department of Statistics.
2014 (English). In: Journal of Official Statistics, ISSN 0282-423X, E-ISSN 2001-7367. Article in journal (Refereed), Accepted
Abstract [en]

In many realistic datasets there are values that we may suspect to be erroneous. To clean data cost-effectively we must prioritise which observations to recontact or measure again in order to validate or correct them. Sometimes there is auxiliary information, apart from the initially reported and possibly dubious values, that allows the true values to be predicted prior to editing. The weighted absolute difference between a predicted and a reported value is referred to as an item score. A large item score indicates a need to check the observation. Usually we want to edit and verify the values of all items of the same unit that need to be looked at, rather than going back to the same unit several times, once for each item. The article discusses ways of forming a unit score out of a generic set of p item scores for continuous variables. A generalised unit score function that unifies the functions widely used in statistical editing is presented. The optimal choice of unit score function is discussed in a variety of scenarios. The motivating example has been the problem of prioritising manual statistical editing of business survey data.
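The abstract does not give the score functions themselves, so the following is only an illustrative sketch, not the article's definition. It assumes that an item score is a weight times the absolute difference between predicted and reported value, and that the p item scores are combined with a Minkowski-type norm (suggested by the keyword "Minkowski's metric"). The function names, the parameter name `alpha`, and the example figures are all hypothetical.

```python
import math


def item_score(weight, predicted, reported):
    """Weighted absolute difference between predicted and reported value."""
    return weight * abs(predicted - reported)


def unit_score(item_scores, alpha=1.0):
    """Combine item scores into one unit score (assumed Minkowski-type form).

    alpha = 1 gives the sum of the item scores, alpha = 2 the Euclidean
    norm, and alpha = math.inf the largest item score.
    """
    if alpha == math.inf:
        return max(item_scores)
    return sum(s ** alpha for s in item_scores) ** (1.0 / alpha)


# Hypothetical units, each with (weight, predicted, reported) per item.
units = {
    "A": [(1.0, 100.0, 97.0), (1.0, 50.0, 46.0)],
    "B": [(1.0, 100.0, 100.0), (1.0, 50.0, 44.0)],
}

scores = {
    name: unit_score([item_score(*item) for item in items], alpha=2)
    for name, items in units.items()
}

# Units with the largest unit score would be prioritised for recontact.
priority = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, a small `alpha` rewards units with many moderately suspicious items, while a large `alpha` rewards units with a single very suspicious item; which behaviour is preferable is the kind of question the article addresses.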

Keyword [en]
measurement errors, Minkowski’s metric, subsample for recontact, validation.
National Category
Probability Theory and Statistics
URN: urn:nbn:se:su:diva-97146
OAI: diva2:675861
Available from: 2013-12-04 Created: 2013-12-04 Last updated: 2016-09-12

By author/editor
Hedlin, Dan