Multiple Kernel Imputation: A Locally Balanced Real Donor Method
Stockholm University, Faculty of Social Sciences, Department of Statistics.
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

We present an algorithm for imputation of incomplete datasets based on Bayesian exchangeability through Pólya sampling. Each (donee) unit with a missing value is imputed multiple times by observed (real) values on units from a donor pool. The donor pools are constructed using auxiliary variables. Several features from kernel estimation are used to counteract imbalances that are due to sparse and bounded data. Three balancing features can be used with only a single continuous auxiliary variable, but an additional fourth feature requires multiple continuous auxiliary variables. These features mainly contribute by reducing nonresponse bias. We examine how the donor pool size, that is, the number of potential donors within the pool, should be determined. External information is shown to be easily incorporated in the imputation algorithm. Our simulation studies show that with a study variable which can be seen as a function of one or two continuous auxiliaries plus residual noise, the method performs as well or almost as well as competing methods when the function is linear, but usually much better when the function is nonlinear.
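The basic idea of real donor imputation with donor pools can be sketched as follows. This is a minimal illustration, not the thesis algorithm: it assumes a single continuous auxiliary variable, builds each pool from the `k` nearest observed units, draws donors uniformly, and omits the kernel balancing features and Pólya sampling. All function names and the toy data are hypothetical.

```python
import random

def donor_pool(x_donee, donors, k):
    """Return the k donors whose auxiliary value is closest to x_donee."""
    return sorted(donors, key=lambda d: abs(d[0] - x_donee))[:k]

def impute_once(data, k, rng):
    """Complete one dataset. `data` is a list of (x, y); y is None if missing."""
    donors = [(x, y) for x, y in data if y is not None]
    completed = []
    for x, y in data:
        if y is None:
            pool = donor_pool(x, donors, k)
            y = rng.choice(pool)[1]  # copy a real, observed value from the pool
        completed.append((x, y))
    return completed

def multiple_imputation(data, k=3, m=5, seed=0):
    """Return m completed datasets (assumption: uniform draws within each pool)."""
    rng = random.Random(seed)
    return [impute_once(data, k, rng) for _ in range(m)]

# Toy data: two donee units (y missing) and three potential donors.
data = [(0.1, 1.0), (0.2, None), (0.3, 3.0), (0.9, 9.0), (1.0, None)]
for completed in multiple_imputation(data, k=2, m=3, seed=42):
    print(completed)
```

Every imputed value is a copy of an observed value, which is what distinguishes real donor methods from model-based imputation.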

Place, publisher, year, edition, pages
Stockholm: Department of Statistics, Stockholm University, 2013. 40 p.
Keyword [en]
Bayesian Bootstrap, Boundary Effects, External Information, Kernel estimation features, Local Balancing, Pólya Sampling
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:su:diva-89435
ISBN: 978-91-7447-699-6 (print)
OAI: oai:DiVA.org:su-89435
DiVA: diva2:617951
Public defence
2013-05-28, hörsal 4, hus B, Universitetsvägen 10 B, Stockholm, 10:00 (English)
Note

At the time of the doctoral defense, the following papers were unpublished and had the following status: Paper 1: In press. Paper 3: Submitted. Paper 4: Submitted.

Available from: 2013-05-06 Created: 2013-04-25 Last updated: 2014-06-02 Bibliographically approved
List of papers
1. Bias Reduction Of Finite Population Imputation By Kernel Methods
(English). In: Statistics in Transition, ISSN 1234-7655. Article in journal (Refereed). In press
Abstract [en]

Missing data is a nuisance in statistics. Real donor imputation can be used with item nonresponse. A pool of donor units with similar values on auxiliary variables is matched to each unit with missing values. The missing value is then replaced by a copy of the corresponding observed value from a randomly drawn donor. Such methods can to some extent protect against nonresponse bias. But bias also depends on the estimator and the nature of the data. We adopt techniques from kernel estimation to combat this bias. Motivated by Pólya urn sampling, we sequentially update the set of potential donors with units already imputed, and use multiple imputations via Bayesian bootstrap to account for imputation uncertainty. Simulations with a single auxiliary variable show that our imputation method performs almost as well as competing methods with linear data, but better when data is nonlinear, especially with large samples.
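The sequential updating described above can be illustrated with a plain Pólya urn. This is a simplified sketch under stated assumptions: one shared urn of observed values stands in for the donor pools, draws are uniform, and the multiple-imputation point estimate is the average of the completed-data means. The names `polya_impute` and `mi_mean` are hypothetical, and the kernel techniques of the paper are not included.

```python
import random

def polya_impute(pool_values, n_missing, rng):
    """Impute n_missing values by Pólya urn sampling from pool_values."""
    urn = list(pool_values)
    imputed = []
    for _ in range(n_missing):
        value = rng.choice(urn)
        imputed.append(value)
        urn.append(value)  # the imputed unit joins the set of potential donors
    return imputed

def mi_mean(observed, n_missing, m, seed=0):
    """Average the completed-data mean over m independent imputations."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(m):
        completed = list(observed) + polya_impute(observed, n_missing, rng)
        estimates.append(sum(completed) / len(completed))
    return sum(estimates) / m, estimates

point, per_imputation = mi_mean([1.0, 2.0, 3.0, 4.0], n_missing=3, m=5, seed=7)
print(point, per_imputation)
```

The spread of the per-imputation estimates reflects the imputation uncertainty that multiple imputation is meant to capture.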

National Category
Social Sciences; Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-89309 (URN)
Available from: 2013-04-20 Created: 2013-04-20 Last updated: 2013-04-25 Bibliographically approved
2. Real donor imputation pools
2012 (English). In: Proceedings of the Workshop of the Baltic-Nordic-Ukrainian network on survey statistics, 2012 / [ed] Mārtiņš Liberts, Valmiera, 2012, p. 162-168. Conference paper, Published paper (Other academic)
Abstract [en]

Real donor matching is associated with hot deck imputation. Auxiliary variables are used to match donee units with missing values to a set of donor units with observed values, and the donee missing values are ‘replaced’ by copies of the donor values, so as to create completely filled-in datasets. The matching of donees and donors is complicated by the fact that the observed sample survey data are often both sparse and bounded. The important choice of how many possible donors to choose from involves a trade-off between bias and variance. We transfer concepts from kernel estimators to real donor imputation. In a simulation study we show how bias, variance and the estimated variance of a population behave, focusing on the size of donor pools.
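The bias side of the donor pool size trade-off, and the effect of bounded data, can be illustrated with a deterministic toy calculation. This sketch is not the simulation study of the paper: it assumes noise-free data with y = x² on an evenly spaced grid, pools formed from the `k` nearest donors, and uniform draws within the pool. The function `pool_bias` is hypothetical.

```python
def pool_bias(x_donee, donor_x, f, k):
    """Expected imputed value minus the true value f(x_donee), when one donor
    is drawn uniformly from the k nearest donors (noise-free toy data)."""
    pool = sorted(donor_x, key=lambda x: abs(x - x_donee))[:k]
    expected = sum(f(x) for x in pool) / k
    return expected - f(x_donee)

def square(x):
    return x ** 2

donor_x = [i / 100 for i in range(100)]  # donors on a grid over [0, 0.99]

# Donee at the boundary (x = 1.0): all donors lie on one side of it.
for k in (1, 5, 20):
    print("boundary", k, pool_bias(1.0, donor_x, square, k))

# Donee in the interior (x = 0.5): donors surround it on both sides.
for k in (1, 5, 20):
    print("interior", k, pool_bias(0.5, donor_x, square, k))
```

At the boundary the pool can only reach inward, so the absolute bias grows as the pool gets larger. In the interior the pool is roughly symmetric around the donee and the bias stays small for the same pool sizes.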

Place, publisher, year, edition, pages
Valmiera, 2012
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-89431 (URN)
Conference
Workshop of the Baltic-Nordic-Ukrainian network on survey statistics, 2012.
Available from: 2013-04-25 Created: 2013-04-25 Last updated: 2013-04-25 Bibliographically approved
3. Kernel imputation with multivariate auxiliaries
(English). Article in journal (Refereed). Submitted
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-89433 (URN)
Available from: 2013-04-25 Created: 2013-04-25 Last updated: 2013-04-25 Bibliographically approved
4. Informed kernel imputation
(English). Article in journal (Refereed). Submitted
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-89434 (URN)
Available from: 2013-04-25 Created: 2013-04-25 Last updated: 2013-04-25 Bibliographically approved

Open Access in DiVA

fulltext (596 kB), 265 downloads
File information
File name: FULLTEXT01.pdf
File size: 596 kB
Checksum (SHA-512): 9242f0f828001a86d351ec3932fadb9f6b48f24f009928f171ef9c5e854b869f75b78010a2e45767e72d9a71b11d1ee89161f560de9f38548bd9cbc097f5c315
Type: fulltext
Mimetype: application/pdf
errata (27 kB), 10 downloads
File information
File name: ERRATA01.pdf
File size: 27 kB
Checksum (SHA-512): b19533d6fae568857fec30e701f14e26c54c78eff99f669d97547478ceba3fe42f576d898d1fa17fc414cc8d43cd23a2dccfb37e6f6ab7206b673b6c600b5d48
Type: errata
Mimetype: application/pdf

Search in DiVA

By author/editor
Pettersson, Nicklas
By organisation
Department of Statistics
Probability Theory and Statistics
