Bias Reduction Of Finite Population Imputation By Kernel Methods
Stockholm University, Faculty of Social Sciences, Department of Statistics.
(English) In: Statistics in Transition, ISSN 1234-7655. Article in journal (Refereed). In press.
Abstract [en]

Missing data is a nuisance in statistics. Real donor imputation can be used with item nonresponse. A pool of donor units with similar values on auxiliary variables is matched to each unit with missing values. The missing value is then replaced by a copy of the corresponding observed value from a randomly drawn donor. Such methods can to some extent protect against nonresponse bias. But bias also depends on the estimator and the nature of the data. We adopt techniques from kernel estimation to combat this bias. Motivated by Pólya urn sampling, we sequentially update the set of potential donors with units already imputed, and use multiple imputations via Bayesian bootstrap to account for imputation uncertainty. Simulations with a single auxiliary variable show that our imputation method performs almost as well as competing methods with linear data, but better when data is nonlinear, especially with large samples.
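The sequential donor updating described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `polya_donor_impute`, the nearest-neighbour donor pool of size `k`, and the random processing order are all assumptions made for illustration, and the kernel features and Bayesian bootstrap step are omitted. It only shows the core idea that each missing value is replaced by a copy of a real observed value, and that units already imputed join the set of potential donors.

```python
import numpy as np

def polya_donor_impute(x, y, k=5, rng=None):
    """Hypothetical sketch: real donor imputation with a growing donor set.

    x   : auxiliary variable, fully observed
    y   : study variable, NaN where missing
    k   : assumed donor pool size (nearest units in x)
    rng : seed or numpy Generator
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    y = np.array(y, dtype=float)  # copy, so the caller's data is untouched

    donors = list(np.flatnonzero(~np.isnan(y)))
    missing = np.flatnonzero(np.isnan(y))
    if not donors:
        raise ValueError("at least one observed value is required")

    # Process units with missing values in random order (an assumption).
    for i in rng.permutation(missing):
        d = np.asarray(donors)
        # Donor pool: the k current donors closest to unit i in x.
        nearest = np.argsort(np.abs(x[d] - x[i]), kind="stable")[:k]
        pool = d[nearest]
        # Copy the value of a randomly drawn donor from the pool.
        y[i] = y[rng.choice(pool)]
        # Polya urn style update: the imputed unit becomes a potential donor.
        donors.append(i)
    return y

# Usage: nonlinear data with roughly 30 percent of y missing.
gen = np.random.default_rng(0)
x = gen.uniform(0, 1, 50)
y = np.sin(6 * x) + gen.normal(0, 0.1, 50)
y[gen.random(50) < 0.3] = np.nan
completed = polya_donor_impute(x, y, k=5, rng=1)
```

Because every imputed value is a copy of some donor's value, the completed data contain only values that were actually observed, which is the defining property of a real donor method.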

National Category
Social Sciences; Probability Theory and Statistics
Research subject
URN: urn:nbn:se:su:diva-89309
OAI: diva2:616997
Available from: 2013-04-20 Created: 2013-04-20 Last updated: 2013-04-25 Bibliographically approved
In thesis
1. Multiple Kernel Imputation: A Locally Balanced Real Donor Method
2013 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

We present an algorithm for imputation of incomplete datasets based on Bayesian exchangeability through Pólya sampling. Each (donee) unit with a missing value is imputed multiple times by observed (real) values on units from a donor pool. The donor pools are constructed using auxiliary variables. Several features from kernel estimation are used to counteract imbalances that are due to sparse and bounded data. Three balancing features can be used with only a single continuous auxiliary variable, but an additional fourth feature needs multiple continuous auxiliary variables. They mainly contribute by reducing nonresponse bias. We examine how the donor pool size, that is, the number of potential donors within the pool, should be determined. External information is shown to be easily incorporated in the imputation algorithm. Our simulation studies show that with a study variable which can be seen as a function of one or two continuous auxiliaries plus residual noise, the method performs as well or almost as well as competing methods when the function is linear, but usually much better when the function is nonlinear.
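As a rough illustration of imputing each donee multiple times from a donor pool, the sketch below combines nearest-neighbour donor pools with Bayesian bootstrap weights. It is an assumption-laden example rather than the method of the thesis: the function name `bb_multiple_impute`, the pool size `k`, the number of imputations `m`, and the use of Dirichlet(1, ..., 1) weights to choose a donor within the pool are all hypothetical, and none of the kernel balancing features are implemented.

```python
import numpy as np

def bb_multiple_impute(x, y, k=5, m=20, seed=0):
    """Hypothetical sketch: m completed datasets via Bayesian bootstrap donor draws.

    x : auxiliary variable, fully observed
    y : study variable, NaN where missing
    k : assumed donor pool size
    m : number of imputations
    Returns an array of shape (m, len(y)).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = np.flatnonzero(~np.isnan(y))
    mis = np.flatnonzero(np.isnan(y))
    if obs.size == 0:
        raise ValueError("at least one observed value is required")

    completed = np.tile(y, (m, 1))
    for j in range(m):
        # Bayesian bootstrap: one Dirichlet(1, ..., 1) weight per observed donor.
        w = rng.dirichlet(np.ones(obs.size))
        for i in mis:
            # Positions (within obs) of the k donors closest to donee i in x.
            pool = np.argsort(np.abs(x[obs] - x[i]), kind="stable")[:k]
            # Select a donor with probability proportional to its weight.
            p = w[pool] / w[pool].sum()
            donor = obs[rng.choice(pool, p=p)]
            completed[j, i] = y[donor]
    return completed

# Usage: estimate the mean of y and the spread between imputations.
gen = np.random.default_rng(0)
x = gen.uniform(0, 1, 60)
y = x ** 2 + gen.normal(0, 0.05, 60)
y[gen.random(60) < 0.25] = np.nan
sets = bb_multiple_impute(x, y, k=5, m=20, seed=1)
means = sets.mean(axis=1)
point_estimate = means.mean()
between_variance = means.var(ddof=1)
```

Redrawing the weights for each imputation makes the donor selection probabilities themselves random, so the variation between the `m` completed datasets reflects some of the uncertainty due to imputation.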

Place, publisher, year, edition, pages
Stockholm: Department of Statistics, Stockholm University, 2013. 40 p.
Keywords
Bayesian Bootstrap, Boundary Effects, External Information, Kernel estimation features, Local Balancing, Pólya Sampling
National Category
Probability Theory and Statistics
Research subject
urn:nbn:se:su:diva-89435 (URN)
978-91-7447-699-6 (ISBN)
Public defence
2013-05-28, hörsal 4, hus B, Universitetsvägen 10 B, Stockholm, 10:00 (English)

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 1: In press. Paper 3: Submitted. Paper 4: Submitted.

Available from: 2013-05-06 Created: 2013-04-25 Last updated: 2014-06-02 Bibliographically approved

Author
Pettersson, Nicklas