Comparative Analysis of Several Models for Churning Customer Prediction
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2025-01-01 |
Series: | SHS Web of Conferences |
Online Access: | https://www.shs-conferences.org/articles/shsconf/pdf/2025/09/shsconf_icdde2025_02013.pdf |
Summary: | Customer churn prediction is critical for financial institutions to retain clients and optimize resource allocation, because keeping current clients is less expensive than acquiring new ones. Much research exists in this field, but model performance is often limited by data imbalance. This study compares three machine learning models (Random Forest, XGBoost Classifier, and Light Gradient Boosting Machine (LGBM) Classifier) for predicting credit card customer churn using a dataset from Kaggle. The research addresses data imbalance through oversampling-based techniques (SMOTE, SMOTEENN, Borderline SMOTE) and evaluates model performance using accuracy and F1 score. Results show that the LGBM Classifier with Borderline SMOTE achieves the highest accuracy (97.43%) and F1 score (0.9259), outperforming the other methods. This approach effectively balances precision and recall, improving minority class prediction. These findings provide actionable insights for financial institutions to implement proactive retention strategies. Limitations remain: future work should consider more diverse datasets, updated models for small datasets, and additional feature engineering methods. |
---|---|
ISSN: | 2261-2424 |
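
The summary names two ideas that a small example can make concrete: SMOTE-style oversampling of the minority (churn) class, and the use of F1 score alongside accuracy on imbalanced data. The sketch below is not the study's code. It is a pure-Python illustration under stated assumptions: the function names `smote_like` and `accuracy_and_f1`, the toy points, and the 90/10 label split are all hypothetical, and the interpolation shown is only the core SMOTE step, not the Borderline SMOTE or SMOTEENN variants.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (core SMOTE idea).
    Assumes `minority` holds at least two points (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1), treating `positive` as the churn class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy minority class: four corner points of the unit square (made-up data).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority_points, n_new=5)

# Toy labels: 90 non-churners (0) and 10 churners (1). A model that always
# predicts "no churn" looks accurate but never finds a churner.
y_true = [0] * 90 + [1] * 10
always_zero = [0] * 100
acc, f1 = accuracy_and_f1(y_true, always_zero)
```

With these toy labels the always-negative predictor reaches an accuracy of 0.9 but an F1 of 0.0, which illustrates why a study on imbalanced churn data would report F1 in addition to accuracy. Because each synthetic point lies on a segment between two existing minority points, the generated points stay inside the region the minority class already occupies.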