Applied Sciences
Article
Telecom Churn Prediction System Based on Ensemble Learning Using Feature Grouping
Tianpei Xu, Ying Ma and Kangchul Kim *

 
Citation: Xu, T.; Ma, Y.; Kim, K. Telecom Churn Prediction System Based on Ensemble Learning Using Feature Grouping. Appl. Sci. 2021, 11, 4742. https://doi.org/10.3390/app11114742
Academic Editor: João Carlos de Oliveira Matias
Received: 20 April 2021
Accepted: 19 May 2021
Published: 21 May 2021
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Department of Computer Engineering, Chonnam National University, Yeosu 59626, Korea;
197525@jnu.ac.kr (T.X.); 207939@jnu.ac.kr (Y.M.)
* Correspondence: kkc@jnu.ac.kr
Abstract:
In recent years, the telecom market has become highly competitive, and retaining existing
telecom customers costs less than attracting new ones. A telecom company therefore needs to
understand customer churn through customer relationship management (CRM), and CRM
analyzers are required to predict which customers will churn. This study proposes a customer-churn
prediction system that uses an ensemble-learning technique consisting of stacking models and soft
voting. The XGBoost, logistic regression, decision tree, and naïve Bayes machine-learning algorithms are
selected to build a two-level stacking model, and the three outputs of the second level are used
for soft voting. Feature construction on the churn dataset includes equidistant grouping of customer
behavior features to expand the feature space and discover latent information in the churn
dataset. The original and new churn datasets are analyzed with the stacking ensemble model using four
evaluation metrics. The experimental results show that the proposed customer-churn prediction system
achieves accuracies of 96.12% and 98.09% on the original and new churn datasets, respectively. These
results are better than those of state-of-the-art churn-recognition systems.
Keywords: customer churn; CRM; machine learning; ensemble learning; feature grouping
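The two building blocks named in the abstract, equidistant grouping of a behavior feature and soft voting over model outputs, can be illustrated with a minimal sketch. It assumes that "equidistant grouping" means equal-width binning over the observed range and that soft voting is an unweighted average of predicted churn probabilities; this excerpt does not give the paper's actual group counts, features, or voting weights. The function names and sample values are hypothetical.

```python
from typing import List, Sequence


def equal_width_groups(values: Sequence[float], n_groups: int) -> List[int]:
    """Map each value to one of n_groups equal-width intervals over [min, max]."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no grouping information.
        return [0 for _ in values]
    width = (hi - lo) / n_groups
    # The maximum value would fall into index n_groups, so clamp it
    # into the last group.
    return [min(int((v - lo) / width), n_groups - 1) for v in values]


def soft_vote(prob_rows: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average per-model churn probabilities and threshold the mean.

    prob_rows[m][i] is model m's predicted churn probability for sample i.
    Unweighted averaging is an assumption made for this sketch.
    """
    n_models = len(prob_rows)
    n_samples = len(prob_rows[0])
    labels = []
    for i in range(n_samples):
        mean_p = sum(row[i] for row in prob_rows) / n_models
        labels.append(1 if mean_p >= threshold else 0)
    return labels


# Hypothetical monthly call minutes for six customers (illustrative only).
minutes = [0.0, 50.0, 120.0, 199.0, 260.0, 400.0]
groups = equal_width_groups(minutes, n_groups=4)

# Hypothetical churn probabilities from three second-level models,
# one row per model and one column per customer.
probs = [
    [0.9, 0.2, 0.90],
    [0.8, 0.4, 0.45],
    [0.7, 0.3, 0.45],
]
labels = soft_vote(probs)
```

In this sketch the group index would be appended to the dataset as a new categorical feature alongside the raw value. For the third customer, two of the three models predict a probability below 0.5, so a simple majority vote would label the customer as non-churn, whereas the averaged probability exceeds the threshold and soft voting labels the customer as churn.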
1. Introduction
Owing to fierce competition among telecom companies, customer churn is inevitable.
Customer churn is the act of a customer ending a subscription to a service provider and
choosing the services of another company.
Companies must reduce customer churn because it weakens the company. A survey
showed that the annual churn rate in the telecom industry ranges from 20% to 40%, and
that retaining existing customers costs 5–10 times less than acquiring
new customers [1]. Predicting which customers will churn costs 16 times less than
acquiring new customers [2]. Decreasing the churn rate by 5% increases profit
by 25% to 85% [3]. These figures show that customer-churn prediction is important for the
telecom sector. Telecom companies consider customer relationship management (CRM) an
important factor in retaining existing customers and preventing customer churn.
To retain existing customers, CRM analyzers must predict which customers will churn
and analyze the reasons for churn. Once the at-risk customers are identified,
the company must run marketing campaigns aimed at those customers to maximize
their retention. Therefore, customer-churn prediction is an important part of
CRM [4].
The accuracy of the prediction systems used by CRM analyzers is important. If analyzers
predict customer churn inaccurately, no campaigns can be performed. Owing
to recent advancements in data science, data-mining and machine-learning technologies
provide solutions to the customer-churn problem. However, existing models have several
limitations. For example, logistic regression, a common churn-prediction model based on
older data-mining methods, is relatively inaccurate. Furthermore, feature construction [5]
Appl. Sci. 2021, 11, 4742. https://doi.org/10.3390/app11114742 https://www.mdpi.com/journal/applsci