Predicting Customer Churn in Retail Using Machine Learning on Transaction Data

Date
2025
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
Customer churn prediction is critical for businesses seeking to retain customers and reduce revenue loss. This paper presents a churn prediction study for retail customers. We preprocess transactional data from a retail dataset comprising approximately 19.7 million transactions from over 1 million customers. Temporal behavioral features, such as purchase frequency, monetary value, product variety, and promotional engagement metrics, are engineered over a four-month observation window. A customer is labeled as churned if they make no purchase in the subsequent six-month period. A Random Forest classifier is trained with balanced class weighting to address the imbalance between churned and retained customers. The Random Forest model achieves approximately 84% accuracy, 86% precision, 85% recall, and an F1-score of 85%. An XGBoost model achieves similar accuracy (≈ 84%) but higher recall (93%) and F1-score (89%), indicating that it identifies a larger share of churners. A confusion matrix is reported to illustrate model performance. This study demonstrates that carefully engineered RFM-based features and ensemble learning approaches significantly enhance churn prediction in retail contexts. © 2025 IEEE.
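The windowing and labeling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the record layout, the dates, the feature names, and the helper functions are all hypothetical, and the toy data is invented purely for illustration. The class-weight formula shown is the common "balanced" heuristic (the one scikit-learn documents for `class_weight="balanced"`); the abstract does not say which library or formula the study used.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical records: (customer_id, purchase_date, amount, product_id, on_promotion)
transactions = [
    ("c1", date(2024, 1, 5), 20.0, "p1", False),
    ("c1", date(2024, 3, 10), 35.0, "p2", True),
    ("c1", date(2024, 6, 1), 15.0, "p1", False),  # falls in the label window
    ("c2", date(2024, 2, 14), 50.0, "p3", True),
    ("c3", date(2024, 4, 20), 10.0, "p1", False),
    ("c3", date(2024, 4, 25), 12.0, "p4", False),
]

# Assumed window boundaries (end dates are exclusive).
OBS_START = date(2024, 1, 1)   # start of the four-month observation window
OBS_END = date(2024, 5, 1)     # end of observation, start of the label window
LABEL_END = date(2024, 11, 1)  # end of the six-month label window


def build_features(records):
    """Aggregate per-customer behavior inside the observation window."""
    acc = defaultdict(lambda: {"n": 0, "spend": 0.0, "products": set(), "promo": 0})
    for cust, day, amount, product, promo in records:
        if OBS_START <= day < OBS_END:
            row = acc[cust]
            row["n"] += 1
            row["spend"] += amount
            row["products"].add(product)
            row["promo"] += int(promo)
    return {
        cust: {
            "frequency": row["n"],
            "monetary": row["spend"],
            "product_variety": len(row["products"]),
            "promo_share": row["promo"] / row["n"],
        }
        for cust, row in acc.items()
    }


def label_churn(records, customers):
    """Label 1 (churned) if a customer has no purchase in the label window."""
    active = {cust for cust, day, *_ in records if OBS_END <= day < LABEL_END}
    return {cust: int(cust not in active) for cust in customers}


def balanced_class_weights(labels):
    """'Balanced' heuristic: n_samples / (n_classes * class_count)."""
    counts = Counter(labels.values())
    n_samples = len(labels)
    return {cls: n_samples / (len(counts) * cnt) for cls, cnt in counts.items()}


features = build_features(transactions)
labels = label_churn(transactions, features.keys())
weights = balanced_class_weights(labels)
```

In this toy data, customer `c1` is the only one with a purchase in the label window, so `c1` is labeled retained and the other two are labeled churned. Under the balanced heuristic the rarer class receives the larger weight, which is the effect balanced class weighting is meant to have during training.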
Keywords
Customer Behavior, Customer Churn, Ensemble Learning, Feature Engineering, Machine Learning, Predictive Modeling, Random Forest, Retention Strategies
Source
International Conference on Computer Science and Engineering, UBMK
Volume
Issue
2025
Start Page
1135
End Page
1140

