On the use of L-functionals in regression models
Stockholm University, Faculty of Science, Department of Mathematics. ORCID iD: 0000-0003-2767-8818
Stockholm University, Faculty of Science, Department of Mathematics. ORCID iD: 0000-0001-9662-507x
Number of authors: 2. 2023 (English). In: Open Mathematics, ISSN 2391-5455, Vol. 21, no. 1, article id 20220597. Article, review/survey (Refereed). Published
Abstract [en]

In this article, we survey and unify a large class of L-functionals of the conditional distribution of the response variable in regression models. This includes robust measures of location, scale, skewness, and heavy-tailedness of the response, conditionally on covariates. We generalize the concepts of L-moments (G. Sillito, Derivation of approximants to the inverse distribution function of a continuous univariate population from the order statistics of a sample, Biometrika 56 (1969), no. 3, 641–650.), L-skewness, and L-kurtosis (J. R. M. Hosking, L-moments: analysis and estimation of distributions using linear combinations of order statistics, J. R. Stat. Soc. Ser. B Stat. Methodol. 52 (1990), no. 1, 105–124.) and introduce order numbers for a large class of L-functionals through orthogonal series expansions of quantile functions. In particular, we motivate why location, scale, skewness, and heavy-tailedness have order numbers 1, 2, (3,2), and (4,2), respectively, and describe how a family of L-functionals, with different order numbers, is constructed from Legendre, Hermite, Laguerre, or other types of polynomials. Our framework is applied to models where the relationship between quantiles of the response and the covariates follows a transformed linear model, with a link function that determines the appropriate class of L-functionals. In this setting, the distribution of the response is treated parametrically or nonparametrically, and the response variable is either censored/truncated or not. We also provide a framework for asymptotic theory of estimates of L-functionals and illustrate our approach by analyzing the arrival time distribution of migrating birds. In this context, a novel version of the coefficient of determination is introduced, which makes use of the above-mentioned orthogonal series expansion.
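As a rough illustration of the classical quantities that the abstract generalizes, the sketch below computes the first four sample L-moments from probability-weighted moments, following the standard unconditional definitions in Hosking (1990), and forms L-skewness and L-kurtosis as ratios of the third and fourth L-moments to the second. This is not the article's conditional, regression-based estimator. The function names `sample_l_moments` and `l_skewness_kurtosis` are hypothetical and chosen only for this example.

```python
def sample_l_moments(x):
    """Return the first four sample L-moments (l1, l2, l3, l4).

    Uses the unbiased probability-weighted moments b_0..b_3 of the
    ordered sample; this is the classical unconditional estimator,
    shown here only as an illustration.
    """
    xs = sorted(float(v) for v in x)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations")

    b = [0.0, 0.0, 0.0, 0.0]
    for i, v in enumerate(xs, start=1):  # i is the rank of the order statistic
        w = 1.0
        for r in range(4):
            if r > 0:
                # weight for b_r: (i-1)...(i-r) / ((n-1)...(n-r))
                w *= (i - r) / (n - r)
            b[r] += w * v
    b0, b1, b2, b3 = (s / n for s in b)

    l1 = b0                                  # location (order 1)
    l2 = 2 * b1 - b0                         # scale (order 2)
    l3 = 6 * b2 - 6 * b1 + b0                # order 3
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0    # order 4
    return l1, l2, l3, l4


def l_skewness_kurtosis(x):
    """Return (L-skewness, L-kurtosis) = (l3 / l2, l4 / l2)."""
    _, l2, l3, l4 = sample_l_moments(x)
    if l2 == 0:
        raise ValueError("L-scale is zero; ratios are undefined")
    return l3 / l2, l4 / l2
```

For example, `l_skewness_kurtosis([1, 2, 3, 4, 100])` gives a positive L-skewness, reflecting the long right tail. The ratio form with the second L-moment in the denominator mirrors the order numbers (3,2) and (4,2) quoted in the abstract.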

Place, publisher, year, edition, pages
2023. Vol. 21, no. 1, article id 20220597
Keywords [en]
bird phenology, coefficient of determination, L-functionals, L-statistics, order numbers, orthogonal series expansion, quantile function, quantile regression
National subject category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:su:diva-203755
DOI: 10.1515/math-2022-0597
ISI: 001053084400001
Scopus ID: 2-s2.0-85170428452
OAI: oai:DiVA.org:su-203755
DiVA, id: diva2:1653221
Available from: 2022-04-21. Created: 2022-04-21. Last updated: 2023-09-21. Bibliographically approved.
Part of thesis
1. Statistical Methods for Taxon Classification and Bird Migration Phenology
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The connection between ecology and statistics is deep. Methodological advances in statistics open up new possibilities to understand the distribution of life on earth, and research questions in ecology cause new statistical methods to be developed. The four papers of this thesis exemplify this exchange by providing a statistical approach to taxon classification and by developing novel measures of distributional properties driven by the application area of phenology.

Paper I contains a comprehensive Bayesian approach to phenotypical taxon classification with covariates. We formulate a multivariate regression model for a collection of phenotypical traits, which are assumed to be partial observations of latent variables with a Gaussian distribution. Through blocked Gibbs sampling we estimate the parameters of these distributions for a real data set, and derive decision regions of new observations in terms of set-valued classifiers, called Karlsson-Hössjer (K-H) classifiers, analogous to partial reject options. We introduce model selection through cross-validation and compare the K-H classifier’s performance with other existing methods on real data.

Paper II introduces a general Bayesian framework for K-H classification. This is achieved by using a reward function with a set-valued argument, and in this context we derive the optimal Bayes classifier, for a homogeneous block of hypotheses as well as for scenarios where the hypotheses are divided into blocks, and where misclassification or ambiguity within blocks is less or more serious than between. These reward functions include tuning parameters which we choose using cross-validation, and we apply the method to a real data set with block structure.

In Paper III a large class of L-functionals is studied for the response variable in regression models. These L-functionals are given order numbers through an orthogonal series expansion of the quantile function of the response variable. We apply the framework to quantile regression models with and without transformations of the outcome variable, and present a unified asymptotic theory for estimates of L-functionals. The derived estimators are applied to a quantile regression model for phenological analysis, and in this context a novel version of the coefficient of determination is introduced.

In Paper IV two statistical approaches for phenological analysis are compared, for singular as well as for multiple species models. For singular species, we show that the estimates from linear models fitted to empirical quantiles of the response distribution give less detailed results on the effects of covariates compared to non-parametric quantile regression. For multiple species models, we highlight an identifiability issue in quantile regression with random effects, and deduce similarity of performance of a mixed effects linear model for empirical quantiles and a quantile regression model with species as one of the covariates.
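To make the notion of quantile regression used in Papers III and IV slightly more concrete, the sketch below shows the check (pinball) loss that underlies it, in the simplest setting without covariates: minimizing the total check loss over a sample recovers a sample quantile. This is a generic textbook illustration, not the thesis's model, and the function names `pinball_loss` and `quantile_by_check_loss` are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}) for a residual u."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_by_check_loss(y, tau):
    """Return a sample point minimising the total check loss at level tau.

    The objective is convex and piecewise linear with kinks at the
    observations, so a minimiser can always be found among the sample
    points; an exhaustive search over them is enough for a small sketch.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    ys = sorted(float(v) for v in y)
    return min(ys, key=lambda q: sum(pinball_loss(v - q, tau) for v in ys))
```

With `y = [1, 2, 3, 4, 100]`, `quantile_by_check_loss(y, 0.5)` returns the sample median 3.0, which is unaffected by the outlying value. In a regression setting the constant `q` would be replaced by a function of the covariates, which is where the models compared in Paper IV differ.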

Place, publisher, year, edition, pages
Stockholm: Department of Mathematics, Stockholm University, 2022. p. 39
Keywords
Classification, quantile regression, phenology, statistical ornithology, L-functionals, set-valued classification, species identification, statistical ecology, multispecies modelling
National subject category
Probability Theory and Statistics; Ecology
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:su:diva-204128 (URN), 978-91-7911-892-1 (ISBN), 978-91-7911-893-8 (ISBN)
Public defence
2022-06-07, room 15, building 5, Kräftriket, Roslagsvägen 101, online via Zoom (public link is available at the department website), Stockholm, 09:00 (English)
Opponent
Supervisors
Available from: 2022-05-13. Created: 2022-04-21. Last updated: 2022-05-06. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authors

Hössjer, Ola; Karlsson, Måns
