The Impact of SMOTE and ADASYN on Random Forest and Advanced Gradient Boosting Techniques in Telecom Customer Churn Prediction
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Ayatollah Boroujerdi University, Boroujerd, Iran.
University of Kashan, Kashan, Iran.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0003-4884-4600
Number of Authors: 4
2024 (English)
In: 2024 10th International Conference on Web Research (ICWR), IEEE (Institute of Electrical and Electronics Engineers), 2024, p. 202-209
Conference paper, Published paper (Refereed)
Abstract [en]

This paper explores the capability of various machine learning algorithms, including Random Forest and advanced gradient boosting techniques such as XGBoost, LightGBM, and CatBoost, to predict customer churn in the telecommunications sector. For this analysis, a publicly available dataset was employed. The performance of these algorithms was assessed using recognized metrics, including Accuracy, Precision, Recall, F1-score, and the Receiver Operating Characteristic Area Under Curve (ROC AUC). These metrics were evaluated at different phases: after data preprocessing and feature selection; after applying the SMOTE and ADASYN sampling methods; and after hyperparameter tuning on the data adjusted by SMOTE and ADASYN.

The outcomes underscore the notable efficiency of upsampling techniques such as SMOTE and ADASYN in addressing the class imbalance inherent in customer churn prediction. Notably, random grid search for hyperparameter optimization left the results largely unchanged. The algorithms' performance after ADASYN marginally surpassed that observed after SMOTE. Remarkably, LightGBM achieved an F1-score of 89% and an ROC AUC of 95% following the ADASYN sampling technique. This underlines the effectiveness of advanced boosting algorithms and upsampling methods like SMOTE and ADASYN in navigating the complexities of imbalanced datasets and intricate feature interdependencies.
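The core of the resampling step described above is SMOTE's interpolation idea: new minority-class samples are synthesized along the line segments between a minority point and one of its k nearest minority neighbours (ADASYN extends this by generating more samples near points that are harder to classify). The sketch below is a minimal, illustrative NumPy implementation of that interpolation, not the paper's code; in practice the imbalanced-learn library's `SMOTE` and `ADASYN` classes are the standard implementations.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch (illustrative only): synthesize n_new
    minority-class samples by interpolating between randomly chosen
    minority points and one of their k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    # k nearest neighbours of each point (column 0 is the point itself).
    nn = np.argsort(d, axis=1)[:, 1:k + 1]
    base = rng.integers(0, n, size=n_new)                      # points to grow from
    nbr = nn[base, rng.integers(0, nn.shape[1], size=n_new)]   # one random neighbour each
    gap = rng.random((n_new, 1))                               # interpolation factor in [0, 1)
    # Each synthetic sample lies on the segment between base and neighbour.
    return X_min[base] + gap * (X_min[nbr] - X_min[base])
```

The oversampled minority set would then be concatenated with the majority class before fitting a classifier such as LightGBM; because every synthetic point is a convex combination of two real minority points, the new samples stay inside the minority class's feature range.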

Place, publisher, year, edition, pages
IEEE (Institute of Electrical and Electronics Engineers), 2024, p. 202-209
Series
International Conference on Web Research (ICWR), E-ISSN 2837-8296
Keywords [en]
Customer Churn Prediction, Machine Learning, Classification Techniques, SMOTE, ADASYN, Random Forest, XGBoost, LightGBM, CatBoost
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-231318
DOI: 10.1109/ICWR61162.2024.10533320
Scopus ID: 2-s2.0-85194893134
ISBN: 979-8-3503-9498-6 (print)
OAI: oai:DiVA.org:su-231318
DiVA, id: diva2:1872813
Conference
10th International Conference on Web Research (ICWR), 24-25 April 2024, Tehran, Iran.
Available from: 2024-06-18
Created: 2024-06-18
Last updated: 2025-01-03
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Imani, Mehdi
Beikmohammadi, Ali
