Machine Learning, Regression Models, and Prediction of Claims Reserves
Stockholm University, Faculty of Science, Department of Mathematics.
Stockholm University, Faculty of Science, Department of Mathematics.
Stockholm University, Faculty of Science, Department of Mathematics.
2020 (English). In: Casualty Actuarial Society E-Forum, Summer 2020, Arlington: Casualty Actuarial Society, 2020. Conference paper, Published paper (Refereed)
Abstract [en]

The current paper introduces regression-based reserving models that allow for separate RBNS and IBNR reserves based on aggregated discrete-time data containing information about accident years, reporting years, and payment delay since reporting. All introduced models are closely related to the cross-classified over-dispersed Poisson (ODP) chain-ladder model. More specifically, two types of models are introduced: (i) models consisting of an explicit claim count part, where payments are modelled in a second step conditionally on claim counts, and (ii) models defined directly in terms of claim payments without using claim count information. Further, these general ODP models are estimated using regression functions defined by (i) tree-based gradient boosting machines (GBM), and (ii) feed-forward neural networks (NN). This provides machine-learning-based reserving models that have interpretable output and that are easy to bootstrap from. The paper gives a brief introduction to GBMs and NNs, including calibration and model selection. All of this is illustrated in a longer numerical simulation study, which shows the benefits that can be gained by using machine-learning-based reserving models.
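As background for the abstract above: under the cross-classified ODP model on a standard run-off triangle, the maximum likelihood reserve estimates are known to coincide with those of the classical chain-ladder method. The following is a minimal sketch of that classical method only, not of the paper's RBNS/IBNR models or its GBM and NN estimators. The function name `chain_ladder` and the toy triangle values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def chain_ladder(cumulative):
    """Complete a cumulative run-off triangle with the classical chain-ladder method.

    `cumulative` is an (n, n) array: cell (i, j) holds cumulative payments for
    accident year i up to development year j, and is NaN when i + j > n - 1.
    Returns the development factors, the completed triangle, and the reserve
    per accident year (projected ultimate minus latest observed value).
    """
    tri = np.array(cumulative, dtype=float)
    n = tri.shape[0]

    # Development factor j: ratio of column sums over the accident years
    # for which both column j and column j + 1 are observed.
    factors = np.ones(n - 1)
    for j in range(n - 1):
        rows = n - 1 - j
        factors[j] = tri[:rows, j + 1].sum() / tri[:rows, j].sum()

    # Fill the unobserved lower-right part by rolling the latest value forward.
    full = tri.copy()
    for i in range(1, n):
        for j in range(n - i, n):
            full[i, j] = full[i, j - 1] * factors[j - 1]

    latest = np.array([tri[i, n - 1 - i] for i in range(n)])
    reserves = full[:, -1] - latest
    return factors, full, reserves


# Hypothetical toy triangle (illustrative numbers, not data from the paper).
triangle = [
    [100.0, 150.0, 165.0],
    [200.0, 300.0, np.nan],
    [120.0, np.nan, np.nan],
]

factors, full, reserves = chain_ladder(triangle)
print("development factors:", factors)
print("reserve per accident year:", reserves)
print("total reserve:", reserves.sum())
```

The sketch works on total payments per accident and development year; the models in the paper additionally use reporting-year information to separate RBNS from IBNR reserves, which this example does not attempt.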

Place, publisher, year, edition, pages
Arlington: Casualty Actuarial Society, 2020.
Keywords [en]
Claims reserving, Reported But Not Settled Claims, Incurred But Not Reported Claims, Gradient Boosting Machines, Neural Networks
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:su:diva-190526, OAI: oai:DiVA.org:su-190526, DiVA id: diva2:1530233
Conference
Casualty Actuarial Society, E-Forum, Summer 2020
Available from: 2021-02-22. Created: 2021-02-22. Last updated: 2024-02-19. Bibliographically approved.
In thesis
1. Tree-based machine learning methods with non-life insurance applications
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Non-life insurance is a field which has been data-driven for a long time, with the statistical framework behind modern-day actuarial sciences laid out at the beginning of the 20th century. Problems regarding the estimation and prediction of risk are relevant to the insurance industry specifically, but also for society as a whole. The rise of machine learning methods has created a new set of tools that can be used to solve these problems. This thesis contains five individual papers, all of which are related to developing machine learning- or data-driven methods and algorithms that can be applied to, but are not limited to, non-life insurance applications.

Paper I takes an existing probabilistic model for claims reserving, the Collective Reserving Model (CRM), and replaces the linear modeling approach of the original paper with non-linear machine learning methods. The paper addresses issues in these applications and provides a framework for how to implement and evaluate machine learning models in a reserving setting. It also discusses how to implement early stopping methods given different levels of data granularity. The models are evaluated on a series of simulated data sets with promising results.

Paper II does not use a machine learning method per se but instead develops the CRM used in Paper I by adding the openness status of the claims to the dynamics, presenting the CRM with Openness (CRMO) as a means to model the non-linear effects implied in Paper I. The paper shows how the model can be estimated using regression methods and provides recursive formulas for the moments of the predicted reserve. The algorithm is evaluated in terms of accuracy on the same data set as in Paper I and shows results that are comparable to the machine learning implementations of the CRM.

Paper III presents a new boosting algorithm called the Cyclic Gradient Boosting Machine (CGBM). The algorithm extends the classical gradient boosting machine to provide multi-dimensional function approximation. The paper shows how the CGBM can be used to estimate entire probability distributions rather than just the mean of the distribution. The paper also discusses potential problems with hyperparameter tuning in this higher-dimensional hyperparameter space and provides a dimension-wise early stopping method, which proves useful for avoiding overfitting. Numerical illustrations show accurate results on simulated and real data sets.

Paper IV is not directly related to non-life insurance but rather to decision trees used for classification and regression. The paper presents the trinary tree algorithm, a new way to handle missing input data for tree-based models, meant to provide a more regularized model than other suggested methods. The algorithm is benchmarked against standard methods for handling missing data and shows promising results even for high rates of missing data.

Paper V presents a generalized linear model with non-linear effects induced by varying coefficients, with the varying coefficients estimated using the CGBM from Paper III. This is a special case of a varying coefficient model (VCM). The model can handle highly non-linear effects while maintaining local interpretability. The paper also shows how tuning, feature selection, and evaluation of interaction effects can be simplified as compared to other VCMs. The model is evaluated on the same data set as in Paper III and shows promising results in terms of accuracy and interpretability.
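For orientation, a generic varying coefficient model in GLM form can be written as below. This is a textbook-style sketch, not necessarily the exact specification used in Paper V; the link function $g$, the feature vector $x = (x_1, \dots, x_p)$, and the choice to let the coefficient functions $\beta_j$ depend on the same features $x$ are assumptions made for illustration.

```latex
g\bigl(\mathbb{E}[Y \mid X = x]\bigr) = \beta_0(x) + \sum_{j=1}^{p} \beta_j(x)\, x_j
```

For a fixed $x$ the right-hand side is linear in the features with coefficients $\beta_j(x)$, which is the sense in which such a model can stay locally interpretable; with constant $\beta_j$ it reduces to an ordinary GLM.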

Place, publisher, year, edition, pages
Stockholm: Department of Mathematics, Stockholm University, 2024. p. 65
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:su:diva-226748 (URN), 978-91-8014-677-7 (ISBN), 978-91-8014-678-4 (ISBN)
Public defence
2024-04-12, hörsal 4, hus 2, Campus Albano, Greta Arwidssons väg 28, Stockholm, 13:00 (English)
Available from: 2024-03-20. Created: 2024-02-19. Last updated: 2024-03-12. Bibliographically approved.

Authority records

Lindholm, Mathias; Wahl, Felix; Zakrisson, Henning