Design and analysis of response selective samples in observational studies
Stockholm University, Faculty of Science, Department of Mathematics.
2011 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Outcome dependent sampling may increase efficiency in observational studies. It is however not always obvious how to sample efficiently, and how to analyze the resulting data without introducing bias. This thesis describes a general framework for efficiency calculations in multistage sampling, with focus on what is sometimes referred to as ascertainment sampling. A method for correcting for the sampling scheme in analysis of ascertainment samples is also presented. Simulation based methods are used to overcome computational issues in both efficiency calculations and analysis of data.

Place, publisher, year, edition, pages
Stockholm: Department of Mathematics, Stockholm University, 2011. 68 p.
Keyword [en]
ascertainment, missing data, outcome dependent sampling, response selective samples, sequential design, stochastic EM algorithm
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:su:diva-49328
ISBN: 978-91-7447-201-1 (print)
OAI: oai:DiVA.org:su-49328
DiVA: diva2:377687
Public defence
2011-02-04, sal 14, hus 5, Kräftriket, Roslagsvägen 101, Stockholm, 10:00 (English)
Note
At the time of the doctoral defence, the following paper was unpublished and had the following status: Paper 1: Submitted.
Available from: 2011-01-13. Created: 2010-12-13. Last updated: 2011-05-26. Bibliographically approved.
List of papers
1. A General Statistical Framework for Multistage Designs
2010 (English). Manuscript (preprint) (Other academic)
Abstract [en]

The efficiency of observational studies may be increased by applying multistage sampling designs. It is, however, not always clear how to construct such a design in order to obtain increased efficiency. We here present a general statistical framework for describing and constructing multistage designs. We also provide tools for efficiency and cost-efficiency comparisons, to facilitate the choice of sampling scheme. The comparisons are based on Fisher information matrices, and we suggest presenting the results in graphs, where either efficiency or cost-adjusted efficiency is plotted against a normalized measure of cost. The former curve resides in the unit square and is analogous to the receiver operating characteristic curve used for testing.
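
As a rough illustration of a Fisher-information-based comparison, the sketch below computes a single point of an efficiency-versus-cost curve for a one-stage ascertainment design. Everything specific in it is an assumption made for illustration and is not taken from the manuscript: the outcome is N(mu, 1) with known variance, the ascertainment probability is a hypothetical step function, cost is taken to be the expected fraction of the population that is sampled, and efficiency is the conditional-likelihood Fisher information for mu relative to observing the whole population.

```python
import math


def normal_pdf(y, mu):
    """Density of N(mu, 1) at y."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2.0 * math.pi)


def ascertainment_prob(y):
    """Hypothetical design: units with a large outcome are sampled more often."""
    return 0.9 if y > 1.0 else 0.1


def design_point(mu=0.0, lo=-10.0, hi=10.0, steps=200_000):
    """Return (normalized cost, efficiency) for the assumed design.

    For Y ~ N(mu, 1) observed only when ascertained, the conditional score is
    (y - mu) - E[Y - mu | ascertained], so the information per sampled unit is
    Var(Y | ascertained). Moments are computed with a midpoint rule.
    """
    h = (hi - lo) / steps
    m0 = m1 = m2 = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        w = normal_pdf(y, mu) * ascertainment_prob(y) * h
        m0 += w
        m1 += w * y
        m2 += w * y * y
    cost = m0  # expected fraction of the population that is sampled
    info_per_sampled_unit = m2 / m0 - (m1 / m0) ** 2
    # Full data on every unit carries information 1 per unit about mu.
    efficiency = cost * info_per_sampled_unit
    return cost, efficiency


cost, efficiency = design_point()
print(f"normalized cost = {cost:.3f}, efficiency = {efficiency:.3f}")
```

Under these assumptions a simple random sample of the same expected size would have efficiency equal to its cost, so a point above the diagonal of the unit square indicates a gain from outcome-dependent sampling.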

 

Keyword
Cost-efficiency, efficient design, Fisher information, Hierarchical multistage model, Multistage sampling
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:su:diva-49327 (URN)
Funder
Swedish Research Council, 621-2005-2810
Swedish Research Council, 621-2008-4946
Available from: 2010-12-13. Created: 2010-12-13. Last updated: 2010-12-15. Bibliographically approved.
2. Efficient ascertainment schemes for maximum likelihood estimation
2010 (English). In: Journal of Statistical Planning and Inference, ISSN 0378-3758, E-ISSN 1873-1171, Vol. 140, no 7, p. 2078-2088. Article in journal (Refereed) Published
Abstract [en]

A well-chosen sampling scheme can substantially increase the efficiency of a study. However, it is not always obvious how to sample well. Neyman (1938) presents the possibility of two-stage sampling to increase efficiency in field sampling, and concludes that two-stage sampling sometimes, but not always, reduces the variance of estimates of means. Since then, various authors have investigated the effects of two-stage and multistage sampling in different settings, most of which focus on binary outcome variables. In some special cases, such as case-control studies, there are rules of thumb to follow with regard to efficiency, see for example Maydrech and Kupper (1978), but in most other settings more elaborate calculations are necessary to discriminate between different options. Multistage sampling is described in the context of genetic epidemiology by, among others, Whittemore and Halpern (1997): case-control status of prostate cancer is first ascertained, and then more expensive measures such as family history of disease and DNA samples are collected. Asymptotic variances of Horvitz–Thompson estimates are derived. Reilly (1996) investigates optimal allocation of available resources for two-stage data with binary outcomes. In that setting, complete information is available from variables sampled in Stage 1, while Stage 2 variables are sampled more sparsely, with probabilities determined by Stage 1 data. Cost is allowed to differ between sampling in Stage 1 and sampling in Stage 2. The author emphasizes the usefulness of pilot studies to obtain the information needed to find the optimal allocation. Zhou et al. (2007) investigate outcome dependent sampling where the outcome variable is continuous. The power of tests based on a semi-parametric estimator is compared with the power of an inverse probability weighted estimator and the power of a maximum likelihood estimator based on a simple random sample.

Keyword
Ascertainment, Cost adjusted efficiency, Fisher information, Efficient design, Outcome dependent sampling, Multistage design, Continuous outcome variables
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:su:diva-49325 (URN)
10.1016/j.jspi.2010.02.003 (DOI)
000276369000039 ()
Funder
Swedish Research Council, 621-2005-2810
Swedish Research Council, 621-2008-4946
Available from: 2010-12-13. Created: 2010-12-13. Last updated: 2017-12-11. Bibliographically approved.
3. A Stochastic EM Type Algorithm for Parameter Estimation in Models with Continuous Outcomes, under Complex Ascertainment
2010 (English). In: The International Journal of Biostatistics, ISSN 1557-4679, E-ISSN 1557-4679, Vol. 6, no 1, Article 23. Article in journal (Refereed) Published
Abstract [en]

Outcome-dependent sampling probabilities can be used to increase efficiency in observational studies. For continuous outcomes, appropriate consideration of sampling design in estimating parameters of interest is often computationally cumbersome. In this article, we suggest a Stochastic EM type algorithm for estimation when ascertainment probabilities are known or estimable. The computational complexity of the likelihood is avoided by filling in missing data so that an approximation of the full data likelihood can be used. The method is not restricted to any specific distribution of the data and can be used for a broad range of statistical models. 
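
The fill-in idea can be illustrated with a minimal sketch. This is not the algorithm as published in the article; it is a toy version under stated assumptions: the outcome is N(mu, 1) with known variance, the ascertainment probability is a known, hypothetical step function, and all function names are invented for the example. In each iteration, non-ascertained outcomes are simulated under the current parameter value and added to the observed data, and mu is then re-estimated as the mean of the filled-in data.

```python
import random


def ascertainment_prob(y):
    """Known (hypothetical) ascertainment probability: large outcomes are favoured."""
    return 0.9 if y > 1.0 else 0.1


def simulate_observed(mu, n_population, rng):
    """Draw a population from N(mu, 1) and keep only the ascertained outcomes."""
    draws = (rng.gauss(mu, 1.0) for _ in range(n_population))
    return [y for y in draws if rng.random() < ascertainment_prob(y)]


def draw_non_ascertained(mu, rng):
    """Simulate outcomes under mu until one is ascertained; return the rejected ones."""
    rejected = []
    while True:
        y = rng.gauss(mu, 1.0)
        if rng.random() < ascertainment_prob(y):
            return rejected
        rejected.append(y)


def stochastic_em(observed, n_iter=200, burn_in=100, seed=1):
    """Toy stochastic EM type estimate of mu from an ascertained sample."""
    rng = random.Random(seed)
    mu = sum(observed) / len(observed)  # start from the naive (biased) mean
    trace = []
    for _iteration in range(n_iter):
        # Stochastic fill-in step: add simulated non-ascertained outcomes.
        filled = list(observed)
        for _unit in observed:
            filled.extend(draw_non_ascertained(mu, rng))
        # Maximization step: full-data MLE of a normal mean is the sample mean.
        mu = sum(filled) / len(filled)
        trace.append(mu)
    kept = trace[burn_in:]
    return sum(kept) / len(kept)  # average the iterates after burn-in


data_rng = random.Random(0)
observed = simulate_observed(mu=0.0, n_population=2000, rng=data_rng)
naive = sum(observed) / len(observed)
estimate = stochastic_em(observed)
print(f"naive mean = {naive:.3f}, stochastic EM estimate = {estimate:.3f}")
```

In this toy setting the iteration settles where the observed mean matches the expected outcome among ascertained units under the current mu, which is the score equation of the ascertainment-corrected likelihood. The naive mean of the ascertained sample is biased upwards by design, while the averaged iterates should lie close to the value of mu used to simulate the data.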

Keyword
ascertainment, stochastic EM algorithm, missing data, outcome-dependent sampling, genetic epidemiology
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:su:diva-49323 (URN)
10.2202/1557-4679.1222 (DOI)
Funder
Swedish Research Council, 523-2006-972
Swedish Research Council, 621-2005-2810
Available from: 2010-12-13. Created: 2010-12-13. Last updated: 2017-12-11. Bibliographically approved.

Open Access in DiVA

fulltext (336 kB)
File information
File name: FULLTEXT01.pdf
File size: 336 kB
Checksum (SHA-512): 91834676bbbb5e402529af75e16d3886849ec458a81a87dc720ac4a0b65e222b4b82ac6a244eb2987b0e39413df9086f0d686a4604a8a88134cbe2ee90b95b32
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Grünewald, Maria
By organisation
Department of Mathematics
Probability Theory and Statistics
