optimStrat: An R package for assisting the choice of robust and efficient sampling strategies
Stockholm University, Faculty of Social Sciences, Department of Statistics.
(English) Manuscript (preprint) (Other academic)
Abstract [en]

At the planning stage of a survey, statisticians have to choose which sampling strategy to implement. If auxiliary information is available, the statistician might, for example, consider sampling with probabilities proportional to size or stratified sampling. With this context in mind, optimStrat, an R package, has been developed. The package assists the choice through two superpopulation models. The first, called the working model, reflects the knowledge or beliefs the statistician has about the relation between the auxiliary variables and the variable of interest. This model, however, might be misspecified. This possibility is reflected in the second model, called the true model. In this way, the package makes it possible to determine which sampling strategy is more efficient when a given working model is used but the population is in reality generated by a different model. The package includes an interactive web app that allows users not familiar with R to perform the comparisons for five sampling strategies, namely STSI--HT, STSI--pos, STSI--reg, πps--pos and πps--reg. Additional functions allow the user to simulate study variables, stratify a given population, or calculate the variance of the GREG estimator, among others.
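The kind of comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the optimStrat API and does not reproduce the package's models. The function names, the form of the "true" model (y = x^delta + x^gamma * e), the equal-sized strata and the proportional allocation are all assumptions made for illustration. The sketch compares the design variance of the Horvitz-Thompson estimator of a total under simple random sampling (SI--HT, used here only as a baseline) and under stratified simple random sampling with strata built from the auxiliary variable (STSI--HT).

```python
import random
import statistics


def simulate_population(N, delta, gamma, seed=1):
    """Simulate (x, y) under an assumed 'true' model y = x**delta + x**gamma * e.

    delta controls the trend and gamma the spread; both are hypothetical
    parameters chosen for illustration only.
    """
    rng = random.Random(seed)
    x = [rng.gammavariate(2.0, 5.0) + 1.0 for _ in range(N)]  # positive auxiliary variable
    y = [xi ** delta + (xi ** gamma) * rng.gauss(0.0, 1.0) for xi in x]
    return x, y


def var_si_ht(y, n):
    """Design variance of the HT estimator of the total under simple random sampling."""
    N = len(y)
    return N ** 2 * (1 - n / N) * statistics.variance(y) / n


def var_stsi_ht(x, y, n, H):
    """Design variance of the HT estimator under stratified simple random sampling.

    Units are sorted by x and cut into H strata of (nearly) equal size;
    the sample is allocated proportionally to stratum size.
    """
    N = len(y)
    order = sorted(range(N), key=lambda i: x[i])
    size = N // H
    total = 0.0
    for h in range(H):
        # The last stratum absorbs any remainder units.
        idx = order[h * size:(h + 1) * size] if h < H - 1 else order[h * size:]
        N_h = len(idx)
        n_h = min(N_h, max(2, round(n * N_h / N)))
        y_h = [y[i] for i in idx]
        total += N_h ** 2 * (1 - n_h / N_h) * statistics.variance(y_h) / n_h
    return total


if __name__ == "__main__":
    x, y = simulate_population(N=1000, delta=1.0, gamma=0.5)
    print("SI--HT variance:  ", var_si_ht(y, n=100))
    print("STSI--HT variance:", var_stsi_ht(x, y, n=100, H=5))
```

Rerunning the comparison with other values of `delta` and `gamma` shows how the ranking of strategies can change when the population departs from the model the statistician had in mind.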

Keywords [en]
survey sampling, stratified sampling, probability proportional-to-size sampling, sampling strategy, R, Shiny
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:su:diva-185928
OAI: oai:DiVA.org:su-185928
DiVA, id: diva2:1477254
Available from: 2020-10-17 Created: 2020-10-17 Last updated: 2022-02-25
Bibliographically approved
In thesis
1. Essays on Sample Surveys: Design and Estimation
2020 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Sampling is a core stage in every survey. A carefully elaborated sampling design may yield not only more accurate estimates of the parameters of interest, but also a reduction in the sample size a study requires. In this thesis we consider two distinct but connected subjects. The first is the selection of samples with probabilities proportional to some prescribed values, to which the first two papers are devoted. The second is the choice of the sampling design to implement in a given survey, to which the last two papers are devoted.

Probability proportional to size sampling designs, often referred to as πps designs, are of practical interest due to their potential efficiency. The literature contains many such designs, each with different characteristics. In the first paper we describe and compare ten πps designs with respect to several desired properties. The results suggest that the so-called order sampling methods, as well as those proposed by Sunter and Chromy, may be considered good options from a practitioner's point of view.

In the second paper we introduce an algorithm for approximating a target distribution by a mixture distribution. Being a mixture, most of its properties are easy to calculate. We illustrate the use of the algorithm with several examples, both univariate and multivariate. The results indicate that the algorithm succeeds in approximating the target distribution.

The strategy that couples πps designs with the generalized regression estimator is optimal under a given superpopulation model. However, this optimality assumes that the model is correct and that some of its parameters are known, assumptions that are rarely satisfied in practice. In the third paper we introduce a method that allows for incorporating uncertainty about the model parameters into the choice of the sampling design and then quantifying this uncertainty with a risk measure. The method is illustrated with a real dataset. The results show that the method allowed us to correctly choose the sampling design. The risk measure, as well as other functions that are useful at the planning stage of a survey, is implemented in the package optimStrat developed for R. The fourth paper in this thesis describes the functions in this package.

Place, publisher, year, edition, pages
Stockholm: Department of Statistics, Stockholm University, 2020. p. 40
Keywords
GREG estimator, mixture distribution, probability proportional to size sampling, sampling algorithms, sampling design, sampling strategy, survey sampling, stratified sampling
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-185930 (URN)
978-91-7911-268-4 (ISBN)
978-91-7911-269-1 (ISBN)
Public defence
2020-12-04, Nordenskiöldsalen, Geovetenskapens hus, Svante Arrhenius väg 12, Stockholm, 13:00 (English)
Available from: 2020-11-11 Created: 2020-10-17 Last updated: 2022-02-25
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Bueno, Edgar
