When are Bayesian model probabilities overconfident?
Stockholm University, Faculty of Social Sciences, Department of Statistics.
Stockholm University, Faculty of Social Sciences, Department of Statistics. ORCID iD: 0000-0003-2786-2519
2020 (English). Manuscript (preprint) (Other academic)
Place, publisher, year, edition, pages
2020.
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:su:diva-187858
OAI: oai:DiVA.org:su-187858
DiVA, id: diva2:1510655
Available from: 2020-12-16. Created: 2020-12-16. Last updated: 2022-11-04. Bibliographically approved.
In thesis
1. Learning local predictive accuracy for expert evaluation and forecast combination
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis consists of four papers that study several topics related to expert evaluation and aggregation. 

Paper I explores the properties of Bayes factors. Bayes factors, which are used for Bayesian hypothesis testing as well as to aggregate models using Bayesian model averaging, are sometimes observed to behave erratically. We analyze some of the sources of this erratic behavior, which we call overconfidence, by deriving the sampling distribution of Bayes factors for a class of linear models. We show that overconfidence is most likely to occur when comparing models that are complex and approximate the data-generating process in widely different ways.
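As a rough illustration of how such erratic behavior can show up, the following minimal sketch (not taken from the paper; the two Gaussian models, the prior variance tau2, the true mean, and the sample size are assumed purely for illustration) simulates the sampling distribution of the log Bayes factor of Model 2 against Model 1:

import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n, tau2, mu_true, reps = 30, 10.0, 0.2, 2000  # assumed illustration values

# Model 1: y_i ~ N(0, 1).  Model 2: y_i ~ N(theta, 1) with theta ~ N(0, tau2);
# marginalizing theta gives y ~ N(0, I + tau2 * J) under Model 2.
cov_m2 = np.eye(n) + tau2 * np.ones((n, n))

log_bf = np.empty(reps)
for r in range(reps):
    y = rng.normal(mu_true, 1.0, size=n)                # draw from the data-generating process
    log_m1 = multivariate_normal.logpdf(y, mean=np.zeros(n), cov=np.eye(n))
    log_m2 = multivariate_normal.logpdf(y, mean=np.zeros(n), cov=cov_m2)
    log_bf[r] = log_m2 - log_m1                          # log Bayes factor BF_21

print("log BF_21 mean:", log_bf.mean().round(2), "sd:", log_bf.std().round(2))

The spread of the simulated log Bayes factors over repeated samples gives a rough sense of how strongly the evidence can swing from one realized data set to another.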

Paper II proposes a general framework for creating linear aggregate density forecasts based on local predictive ability, where we define local predictive ability as the conditional expected log predictive density given an arbitrary set of pooling variables. We call the space spanned by the variables in this set the pooling space and propose the caliper method as a way to estimate local predictive ability. We further introduce a local version of linear optimal pools that optimizes the historic performance of a linear pool only over past observations made at points in the pooling space close to the new point at which we want to make a prediction. Both methods are illustrated in two applications: macroeconomic forecasting and predictions of bike sharing usage in Washington, D.C.
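A hedged sketch of the caliper idea (the function below is a hypothetical reading with an assumed fixed Euclidean caliper, not the paper's exact estimator): estimate local predictive ability at a new point by averaging the historic log scores of the observations whose pooling variables fall within a given radius of that point.

import numpy as np

def caliper_lpa(z_new, z_hist, log_scores, caliper=1.0):
    # z_new: pooling variables at the new point, shape (d,)
    # z_hist: pooling variables of past observations, shape (T, d)
    # log_scores: the expert's historic log predictive scores, shape (T,)
    dist = np.linalg.norm(z_hist - z_new, axis=1)  # distances in the pooling space
    inside = dist <= caliper
    if not inside.any():
        return log_scores.mean()                   # no neighbors: fall back to the global mean
    return log_scores[inside].mean()               # local estimate: mean log score inside the caliper

The local optimal pool described above works differently: rather than averaging log scores, it re-optimizes the pool weights using only the past observations that lie near the new point in the pooling space.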

Paper III builds on Paper II by introducing a Gaussian process (GP) as a model for estimating local predictive ability. When the predictive distribution of an expert, as well as the data-generating process, is normal, the distribution of the log scores is a scaled and translated noncentral chi-squared distribution with one degree of freedom. We show that, after a power transform, the log scores can be modeled using a Gaussian process with Gaussian noise. The proposed model has the advantage that the latent Gaussian process surface can be marginalized out to quickly obtain the marginal posteriors of the hyperparameters of the GP, which is important since the computational cost of the unmarginalized model is often prohibitive. The paper demonstrates the GP approach to modeling local predictive ability with a simulation study and an application using the bike sharing data from Paper II, and develops new methods for pooling predictive distributions conditional on full posterior distributions of local predictive ability.
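The distributional claim can be sketched with a standard normal-normal calculation (the symbols m, s, \mu, \sigma below are generic and not the thesis's notation): if the expert's predictive density is N(m, s^2) and the data-generating process is y \sim N(\mu, \sigma^2), then

\log p(y) = -\tfrac{1}{2}\log(2\pi s^2) - \frac{(y-m)^2}{2 s^2},
\qquad
\frac{(y-m)^2}{\sigma^2} \sim \chi^2_1(\lambda), \quad \lambda = \frac{(\mu - m)^2}{\sigma^2},

so the log score equals the constant -\tfrac{1}{2}\log(2\pi s^2) minus \sigma^2/(2 s^2) times a noncentral chi-squared variable with one degree of freedom, i.e. a scaled and translated noncentral chi-squared as stated above.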

Paper IV further expands on Paper III by considering the problem of estimating local predictive ability for a set of experts jointly using a multi-output Gaussian process. In Paper III, the posterior distribution of the local predictive ability of each expert is obtained separately. By instead estimating a joint posterior, we can exploit correlations between the predictive abilities of the experts to create better aggregate predictions. We can also use this joint posterior for inference, for example to learn about the relationships between the different experts. The method is illustrated using a simulation study and the same bike sharing data as in Paper III.
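One standard way to build such a joint model, shown here only as an assumed illustration (the thesis may use a different multi-output construction), is an intrinsic coregionalization kernel, where the covariance between the latent predictive-ability surfaces f_j and f_k of experts j and k at pooling-space points z and z' factorizes as

\operatorname{Cov}\big(f_j(z),\, f_k(z')\big) = B_{jk}\, k(z, z'),

with B a positive semidefinite matrix capturing between-expert correlations and k a kernel shared over the pooling space.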

Place, publisher, year, edition, pages
Stockholm: Department of Statistics, Stockholm University, 2022. p. 34
Keywords
Bayesian, forecast combination, predictive density, Gaussian process, bootstrap, Bayes factors, model selection, Bayesian predictive synthesis, nonparametric methods, power transformation, expected log predictive density, variable selection
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:su:diva-210919
ISBN: 978-91-8014-102-4
ISBN: 978-91-8014-103-1
Public defence
2022-12-19, lärosal 22, hus 4, plan 2, Albanovägen 12, and online via Zoom, public link is available at the department website, Stockholm, 13:15 (English)
Available from: 2022-11-24. Created: 2022-11-04. Last updated: 2022-11-11. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Oelrich, Oscar; Villani, Mattias
