Comparative Analysis of Several Models for Churning Customer Prediction

Bibliographic Details
Main Author: Tan Zhaoyuan
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: SHS Web of Conferences
Online Access: https://www.shs-conferences.org/articles/shsconf/pdf/2025/09/shsconf_icdde2025_02013.pdf
Description
Summary: Customer churn prediction is critical for financial institutions seeking to retain clients and optimize resource allocation, because keeping current clients is less expensive than acquiring new ones. There is a large body of research in this field, but model performance is often limited by data imbalance. This study compares three machine learning models, Random Forest, XGBoost Classifier, and Light Gradient Boosting Machine (LGBM) Classifier, for predicting credit card customer churn using a dataset from Kaggle. The research addresses data imbalance through resampling techniques (SMOTE, SMOTEENN, Borderline SMOTE) and evaluates model performance using accuracy and F1 score. Results show that the LGBM Classifier with Borderline SMOTE achieves the highest accuracy (97.43%) and F1 score (0.9259), outperforming the other combinations. This approach effectively balances precision and recall, improving minority class prediction. These findings provide actionable insights for financial institutions implementing proactive retention strategies. Limitations remain: future work should consider more diverse datasets, models suited to small datasets, and additional feature engineering methods.
ISSN: 2261-2424
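
Note on the metrics: the summary reports both accuracy and F1 score because accuracy alone can be misleading on imbalanced data. The sketch below illustrates why, using only the standard definitions of precision, recall, and F1. The confusion-matrix counts and the helper name `summarize` are hypothetical, chosen for illustration; they are not taken from the article or its Kaggle dataset.

```python
def summarize(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 for the positive (churn) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test set: 1000 customers, 160 of whom churn (minority class).

# A trivial baseline that predicts "no churn" for everyone.
baseline = summarize(tp=0, fp=0, fn=160, tn=840)

# A hypothetical classifier that finds most churners with few false alarms.
model = summarize(tp=140, fp=15, fn=20, tn=825)

for name, scores in (("baseline", baseline), ("model", model)):
    print(name, {k: round(v, 4) for k, v in scores.items()})
```

With these made-up counts, the baseline reaches 84% accuracy while identifying no churners at all, so its F1 score is 0. The F1 score is the harmonic mean of precision and recall, so it only rises when the minority class is actually being predicted well, which is the property the summary relies on when comparing resampling methods.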