Explainable Machine Learning for Telecom Customer Churn Prediction and Actionable Retention Strategies
DOI: https://doi.org/10.65455/1z4v1627

Keywords: Explainable Machine Learning, SHAP, Logistic Regression, Telecom Operations, Customer Retention Strategies

Abstract
Customer churn is a critical issue in customer relationship management (CRM) in the telecommunications industry. Accurately identifying high-risk customers and providing actionable intervention recommendations is crucial for improving customer lifetime value and reducing operating costs. This paper addresses the task of predicting customer churn in the telecommunications industry by constructing an interpretable "prediction-intervention" machine learning analysis framework. On the Telco customer dataset containing 21 customer features, the discriminative performance of three models—XGBoost, Random Forest, and Logistic Regression—is compared. SHAP (SHapley Additive exPlanations) is applied to the best-performing models to provide interpretability analysis at both the global and individual levels. Experimental results show that the three models perform similarly, with Logistic Regression achieving the highest AUC (0.835) and F1 (0.593), while XGBoost and Random Forest reach AUCs of 0.833 and 0.829, respectively. Confusion matrix analysis reveals that the main bottleneck under the current setup is false negatives (FN), indicating the need for threshold and cost-sensitive optimization. SHAP results indicate that contract type, tenure, online security, monthly charges, and technical support are the most critical churn drivers. Furthermore, this paper proposes a three-tiered, precise retention strategy targeting different risk levels and driving factors, providing a practical reference for deploying explainable AI-driven churn management systems in real-world telecom operations.
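The threshold and cost-sensitive optimization mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a trained classifier has already produced churn probabilities, and the cost weights (`c_fn`, `c_fp`) and the example probabilities are hypothetical values chosen to show the mechanics of penalizing false negatives more heavily than false positives.

```python
def select_threshold(y_true, y_prob, c_fn=5.0, c_fp=1.0):
    """Pick the decision threshold minimizing c_fn*FN + c_fp*FP.

    y_true : list of 0/1 labels (1 = churned)
    y_prob : predicted churn probabilities from any classifier
    c_fn   : cost of missing a churner (false negative)
    c_fp   : cost of needlessly targeting a loyal customer (false positive)
    """
    best_t, best_cost = 0.5, float("inf")
    # Only thresholds at observed probabilities can change the confusion matrix.
    for t in sorted(set(y_prob)):
        fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < t)
        fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= t)
        cost = c_fn * fn + c_fp * fp
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

# Illustrative scores: three churners, three non-churners.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.4, 0.35, 0.6, 0.2, 0.1]
threshold, cost = select_threshold(labels, probs)
print(threshold, cost)  # 0.35 1.0
```

Because missing a churner costs five times more than a wasted retention offer, the chosen threshold (0.35) falls below the default 0.5, flagging more borderline customers — the direction of adjustment the confusion matrix analysis above calls for.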
License
Copyright (c) 2026 The Author(s). Applied Artificial Intelligence Research, published by CSTDP.

This work is licensed under a Creative Commons Attribution 4.0 International License.