Authors: Bozan M.T.; Gozukara H.; Patel J.; Kizilay A.; Sahin Z.; Tosun B.; Cakar T.
Record dates: 2026-03-05, 2026-03-05
Year: 2025
ISSN: 2521-1641
URL: https://doi.org/10.1109/UBMK67458.2025.11207037
Handle: https://hdl.handle.net/20.500.11779/3228

Abstract: Customer churn prediction is critical for businesses to retain customers and reduce revenue loss. This paper presents a retail customer churn prediction study. We preprocess transactional data from a retail dataset comprising approximately 19.7 million transactions involving over 1 million customers. Temporal behavioral features, such as purchase frequency, monetary value, product variety, and promotional engagement metrics, are engineered using a four-month observation window. A Random Forest classifier is trained, utilizing balanced class weighting to address churn class imbalance. The churn label is defined as customers not purchasing in the subsequent six-month period. Our Random Forest model achieves approximately 84% accuracy, 86% precision, 85% recall, and an F1-score of 85%. Additionally, an XGBoost model achieves similar accuracy (≈ 84%) but higher recall (93%) and F1-score (89%), indicating improved churn prediction. The confusion matrix illustrates clear model performance. This study demonstrates that carefully engineered RFM-based features and ensemble learning approaches significantly enhance churn prediction in retail contexts. © 2025 IEEE.

Language: en
Access rights: info:eu-repo/semantics/closedAccess
Keywords: Customer Behavior; Customer Churn; Ensemble Learning; Feature Engineering; Machine Learning; Predictive Modeling; Random Forest; Retention Strategies
Title: Predicting Customer Churn in Retail Using Machine Learning on Transaction Data
Type: Conference Object
DOI: 10.1109/UBMK67458.2025.11207037
Scopus ID: 2-s2.0-105030842914
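
Illustrative note: the abstract describes features computed over a four-month observation window and a churn label based on the following six months. The sketch below shows one way such a split could look. It is not the authors' pipeline: the row layout, the window dates, the feature names, and the function `build_features_and_labels` are all assumptions made for illustration, and only customers with at least one purchase in the observation window receive a row.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (customer_id, purchase_date, amount, product_id, used_promo)
transactions = [
    ("c1", date(2024, 1, 10), 25.0, "p1", False),
    ("c1", date(2024, 3, 2), 40.0, "p2", True),
    ("c1", date(2024, 6, 15), 12.5, "p1", False),   # falls in the label window
    ("c2", date(2024, 2, 20), 80.0, "p3", True),
    ("c2", date(2024, 11, 30), 15.0, "p3", False),  # after both windows
]

# Assumed window boundaries (end dates are exclusive).
OBS_START = date(2024, 1, 1)   # start of the four-month observation window
OBS_END = date(2024, 5, 1)     # end of observation, start of the label window
LABEL_END = date(2024, 11, 1)  # end of the six-month label window


def build_features_and_labels(rows):
    """Aggregate RFM-style features per customer and attach a churn flag."""
    acc = defaultdict(
        lambda: {"n": 0, "spend": 0.0, "products": set(), "promo": 0, "last": None}
    )
    active_later = set()  # customers who purchased during the label window

    for cust, day, amount, product, used_promo in rows:
        if OBS_START <= day < OBS_END:
            a = acc[cust]
            a["n"] += 1
            a["spend"] += amount
            a["products"].add(product)
            a["promo"] += int(used_promo)
            if a["last"] is None or day > a["last"]:
                a["last"] = day
        elif OBS_END <= day < LABEL_END:
            active_later.add(cust)

    return {
        cust: {
            "frequency": a["n"],
            "monetary": a["spend"],
            "product_variety": len(a["products"]),
            "promo_share": a["promo"] / a["n"],
            "recency_days": (OBS_END - a["last"]).days,
            # Churned means no purchase in the six months after observation.
            "churned": cust not in active_later,
        }
        for cust, a in acc.items()
    }


table = build_features_and_labels(transactions)
```

Keeping the observation and label windows disjoint means the features never see purchases from the period that defines the label. The resulting per-customer rows could then be passed to a classifier such as a Random Forest with balanced class weights, as the abstract describes.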
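
Illustrative note: the reported metrics can be checked against the standard F1 definition, F1 = 2PR / (P + R). The sketch below recomputes the Random Forest F1-score from its stated precision and recall. It also solves the same formula for the precision implied by the XGBoost recall and F1-score, which the abstract does not report. Because the published figures are rounded, the implied precision is a rough estimate rather than a reported result. The helper names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


# Random Forest: precision 86%, recall 85% (from the abstract).
rf_f1 = f1_score(0.86, 0.85)

# XGBoost: recall 93%, F1 89% (from the abstract); precision is derived here.
xgb_precision = implied_precision(0.89, 0.93)

print(f"Random Forest F1 from P and R: {rf_f1:.3f}")
print(f"XGBoost precision implied by R and F1: {xgb_precision:.3f}")
```

Under these rounded inputs the Random Forest F1 comes out consistent with the stated 85%, and the implied XGBoost precision is close to that of the Random Forest, which suggests the XGBoost F1 gain comes mainly from higher recall.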