The telecommunications industry is one of the major sectors at high risk of losing revenue due to customer churn. Effective churn management therefore gives a telecom company a competitive advantage over its rivals by increasing its customer retention rate. Although many machine learning algorithms exist today, few handle the imbalanced nature of telecommunications datasets effectively. Real telecommunications data also differs from publicly available datasets, so the effectiveness of machine learning algorithms may differ as well. This research tries to bridge this gap by applying XGBoost to a native dataset from one of the major telecommunications companies of Nepal. The dataset contains 52,332 customer records, of which 46,204 are non-churned and 6,128 are churned customers. The accuracy and F1-score obtained on the native dataset are 97% and 88%, respectively. For comparison with previous research, this work also uses a publicly available dataset of 3,333 subscribers, on which it obtains an improved accuracy and F1-score of 96.25% and 86.34%, respectively.
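To make the class imbalance in the native dataset concrete, the following minimal sketch works through the arithmetic using only the record counts reported above. The variable names are illustrative. The comment about `scale_pos_weight` refers to a commonly used XGBoost heuristic and is an assumption for illustration, not a setting reported in this work.

```python
# Record counts reported for the native dataset.
non_churned = 46204
churned = 6128
total = non_churned + churned

# Share of customers who churned (the minority class).
churn_rate = churned / total

# Ratio of majority to minority class. XGBoost's documentation suggests
# a value like this as a starting point for its `scale_pos_weight`
# parameter; whether the authors used it is NOT stated in the source.
imbalance_ratio = non_churned / churned

# Accuracy of a trivial model that always predicts "non-churned".
# Its F1-score on the churned class would be 0, which is why accuracy
# alone is a weak metric on imbalanced data.
baseline_accuracy = non_churned / total

print(f"total records:     {total}")
print(f"churn rate:        {churn_rate:.4f}")
print(f"imbalance ratio:   {imbalance_ratio:.2f}")
print(f"baseline accuracy: {baseline_accuracy:.4f}")
```

Because a classifier that never predicts churn already scores roughly 88% accuracy on these counts, the F1-score reported alongside accuracy is the more informative of the two figures.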
Publication URL:
https://sciencedirect.com/science/article/pii/S187705092202138X