Classification under partial reject options
Stockholm University, Faculty of Science, Department of Mathematics. ORCID iD: 0000-0001-9662-507X
Stockholm University, Faculty of Science, Department of Mathematics. ORCID iD: 0000-0003-2767-8818
(English) Manuscript (preprint) (Other academic)
Abstract [en]

We study set-valued classification for a Bayesian model where data originate from one of a finite number N of possible hypotheses. We thus consider the scenario where the size of the classified set of categories ranges from 0 to N. The empty set corresponds to an outlier, size 1 represents a firm decision that singles out one hypothesis, size N corresponds to a rejection to classify, whereas sizes 2, …, N−1 represent a partial rejection, where some hypotheses are excluded from further analysis. We introduce a general framework of reward functions with a set-valued argument and derive the corresponding optimal Bayes classifiers, both for a homogeneous block of hypotheses and for the case where hypotheses are partitioned into blocks and ambiguity within blocks differs in severity from ambiguity between blocks. We illustrate classification using an ornithological dataset, with taxa partitioned into blocks and parameters estimated using MCMC. The tuning parameters of the associated reward function are chosen through cross-validation.
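The idea of an optimal Bayes classifier for a reward function with a set-valued argument can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the manuscript: the functions `expected_reward`, `bayes_set_classifier` and `make_linear_reward`, and in particular the reward 1{i ∈ S} − c·|S| with a single tuning parameter `c`. The sketch enumerates every subset S of the N hypotheses and returns the one with the largest posterior expected reward.

```python
from itertools import combinations


def expected_reward(subset, posterior, reward):
    """Posterior expected reward of reporting `subset` as the classified set."""
    return sum(p * reward(subset, i) for i, p in enumerate(posterior))


def bayes_set_classifier(posterior, reward):
    """Return the subset of {0, ..., N-1} with the largest expected reward.

    Brute-force search over all 2^N subsets, so only suitable for small N.
    """
    n = len(posterior)
    candidates = (
        frozenset(combo)
        for size in range(n + 1)
        for combo in combinations(range(n), size)
    )
    return max(candidates, key=lambda s: expected_reward(s, posterior, reward))


def make_linear_reward(c):
    """Hypothetical reward: 1 if the true hypothesis i is in the set,
    minus a penalty c for every hypothesis included in the set."""
    def reward(subset, i):
        return (1.0 if i in subset else 0.0) - c * len(subset)
    return reward


if __name__ == "__main__":
    posterior = [0.6, 0.3, 0.1]  # illustrative posterior probabilities
    for c in (0.05, 0.2, 0.4, 0.7):
        chosen = bayes_set_classifier(posterior, make_linear_reward(c))
        print(c, sorted(chosen))
```

For this particular assumed reward the expected reward of S equals the sum over i in S of (p_i − c), so the brute-force search agrees with simply keeping every hypothesis whose posterior probability exceeds c. A small c returns all N hypotheses (a rejection to classify), a large c returns the empty set (an outlier), and intermediate values give firm decisions or partial rejections.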

Keywords [en]
Blockwise cross-validation, Bayesian classification, conformal prediction, classes of hypotheses, indifference zones, Markov Chain Monte Carlo, reward functions with set-valued inputs, set-valued classifiers
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:su:diva-203754; DOI: arXiv.2202.14011; OAI: oai:DiVA.org:su-203754; DiVA, id: diva2:1653217
Available from: 2022-04-21 Created: 2022-04-21 Last updated: 2022-04-21
In thesis
1. Statistical Methods for Taxon Classification and Bird Migration Phenology
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The connection between ecology and statistics is deep. Methodological advances in statistics open up new possibilities to understand the distribution of life on earth, and research questions in ecology cause new statistical methods to be developed. The four papers of this thesis exemplify this exchange by providing a statistical approach to taxon classification and by developing novel measures of distributional properties, driven by the application area of phenology.

Paper I contains a comprehensive Bayesian approach to phenotypical taxon classification with covariates. We formulate a multivariate regression model for a collection of phenotypical traits, which are assumed to be partial observations of latent variables with a Gaussian distribution. Through blocked Gibbs sampling we estimate the parameters of these distributions for a real data set, and derive decision regions for new observations in terms of set-valued classifiers, called Karlsson-Hössjer (K-H) classifiers, analogous to partial reject options. We introduce model selection through cross-validation and compare the K-H classifier’s performance with that of other existing methods on real data.

Paper II introduces a general Bayesian framework for K-H classification. This is achieved by using a reward function with a set-valued argument, and in this context we derive the optimal Bayes classifier, for a homogeneous block of hypotheses as well as for scenarios where the hypotheses are divided into blocks and misclassification or ambiguity within blocks is less or more serious than between blocks. These reward functions include tuning parameters, which we choose using cross-validation, and we apply the method to a real data set with block structure.

In Paper III a large class of L-functionals is studied for the response variable in regression models. These L-functionals are given order numbers through an orthogonal series expansion of the quantile function of the response variable. We apply the framework to quantile regression models with and without transformations of the outcome variable, and present a unified asymptotic theory for estimates of L-functionals. The derived estimators are applied to a quantile regression model for phenological analysis, and in this context a novel version of the coefficient of determination is introduced.

In Paper IV two statistical approaches for phenological analysis are compared, for single-species as well as multiple-species models. For single species, we show that the estimates from linear models fitted to empirical quantiles of the response distribution give less detailed results on the effects of covariates than non-parametric quantile regression. For multiple-species models, we highlight an identifiability issue in quantile regression with random effects, and deduce that a mixed effects linear model for empirical quantiles performs similarly to a quantile regression model with species as one of the covariates.

Place, publisher, year, edition, pages
Stockholm: Department of Mathematics, Stockholm University, 2022. p. 39
Keywords
Classification, quantile regression, phenology, statistical ornithology, L-functionals, set-valued classification, species identification, statistical ecology, multispecies modelling
National Category
Probability Theory and Statistics; Ecology
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:su:diva-204128 (URN); 978-91-7911-892-1 (ISBN); 978-91-7911-893-8 (ISBN)
Public defence
2022-06-07, sal 15, hus 5, Kräftriket, Roslagsvägen 101, online via Zoom, public link is available at the department website, Stockholm, 09:00 (English)
Available from: 2022-05-13 Created: 2022-04-21 Last updated: 2022-05-06 Bibliographically approved

Authority records

Karlsson, Måns; Hössjer, Ola