The Impact of SMOTE and ADASYN on Random Forest and Advanced Gradient Boosting Techniques in Telecom Customer Churn Prediction
Number of Authors: 4
2024 (English)
In: 2024 10th International Conference on Web Research (ICWR), IEEE (Institute of Electrical and Electronics Engineers), 2024, p. 202-209
Conference paper, Published paper (Refereed)
Abstract [en]
This paper explores the capability of various machine learning algorithms, including Random Forest and advanced gradient boosting techniques such as XGBoost, LightGBM, and CatBoost, to predict customer churn in the telecommunications sector. For this analysis, a dataset available to the public was employed. The performance of these algorithms was assessed using recognized metrics, including Accuracy, Precision, Recall, F1-score, and the Receiver Operating Characteristic Area Under Curve (ROC AUC). These metrics were evaluated at different phases: subsequent to data preprocessing and feature selection; following the application of SMOTE and ADASYN sampling methods; and after conducting hyperparameter tuning on the data that had been adjusted by SMOTE and ADASYN. The outcomes underscore the notable efficiency of upsampling techniques such as SMOTE and ADASYN in addressing the imbalance inherent in customer churn prediction. Notably, the application of random grid search for hyperparameter optimization did not significantly alter the results, which remained comparatively unchanged. The algorithms' performance post-ADASYN application marginally surpassed that observed after SMOTE application. Remarkably, LightGBM achieved an exceptional F1-score of 89% and an ROC AUC of 95% subsequent to the ADASYN sampling technique. This underlines the effectiveness of advanced boosting algorithms and upsampling methods like SMOTE and ADASYN in navigating the complexities of imbalanced datasets and intricate feature interdependencies.
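As an illustration of the upsampling step the abstract refers to, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation, and the record does not say which library they used. The function name `smote_oversample`, the parameters `k` and `seed`, and the toy data are assumptions made for this example. ADASYN follows the same interpolation idea but generates more synthetic points around minority samples that have more majority-class neighbours; that weighting is not shown here.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration only: each synthetic point lies on
    the line segment between a randomly chosen minority sample and one of its
    k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): 6 majority-class rows and 3 minority-class (churn) rows.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.3]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]

# Generate enough synthetic rows to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 6 6
```

In a setup like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that the reported Precision, Recall, F1-score, and ROC AUC are measured on untouched test data.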
Place, publisher, year, edition, pages
IEEE (Institute of Electrical and Electronics Engineers), 2024. p. 202-209
Series
International Conference on Web Research (ICWR), E-ISSN 2837-8296
Keywords [en]
Customer Churn Prediction, Machine Learning, Classification Techniques, SMOTE, ADASYN, Random Forest, XGBoost, LightGBM, CatBoost
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-231318
DOI: 10.1109/ICWR61162.2024.10533320
Scopus ID: 2-s2.0-85194893134
ISBN: 979-8-3503-9498-6 (print)
OAI: oai:DiVA.org:su-231318
DiVA, id: diva2:1872813
Conference
10th International Conference on Web Research (ICWR), 24-25 April 2024, Tehran, Iran.
2024-06-18 2024-06-18 2025-01-03
Bibliographically approved