Churn modelling
Uploaded by: yogesh-khandelwal
Category: Data & Analytics
![Page 1: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/1.jpg)
Predicting Churn in Telecom
![Page 2: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/2.jpg)
Outline
• Business Problem
• Variable Description
• Exploratory Data Analysis
• Feature Selection
• Data Pre-Processing
• Model Development
• Model Validation
![Page 3: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/3.jpg)
Business Problem
• Consumers today go through a complex decision-making process before subscribing to any one of the numerous telecom service options.
• The services provided by telecom vendors are not highly differentiated, and number portability is commonplace.
• As a result, customer loyalty becomes an issue. It is therefore increasingly important for telecommunications companies to proactively identify the customers who are likely to unsubscribe, and the factors that drive them to do so, and to take preventive measures to retain them.
![Page 4: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/4.jpg)
Variable Description
• State: categorical, for the 50 states and the District of Columbia
• Account Length: integer-valued, how long the account has been active
• Area Code: categorical
• Phone: phone number of the customer
• Int'l Plan: international plan activated (yes, no)
• VMail Plan: voice mail plan activated (yes, no)
• VMail Message: number of voice mail messages
• Day Mins: total day minutes used
• Day Calls: total day calls made
• Day Charge: total day charge
• Eve Mins: total evening minutes
• Eve Calls: total evening calls
• Eve Charge: total evening charge
• Night Mins: total night minutes
• Night Calls: total night calls
• Night Charge: total night charge
• Intl Mins: total international minutes used
• Intl Calls: total international calls made
• Intl Charge: total international charge
• CustServ Calls: number of customer service calls made
• Churn: customer churn (target variable; 1 = churned, 0 = not churned)
![Page 5: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/5.jpg)
Exploratory Data Analysis
![Page 6: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/6.jpg)
Summary statistics
![Page 7: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/7.jpg)
Visualizing statistics
![Page 8: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/8.jpg)
![Page 9: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/9.jpg)
Plot 1
![Page 10: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/10.jpg)
Plot 2
![Page 11: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/11.jpg)
Plot 3
![Page 12: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/12.jpg)
A few observations from the exploratory analysis
• Customers with the International Plan tend to churn more frequently.
• Customers with the Voice Mail Plan tend to churn less frequently.
• Customers with four or more customer service calls churn more than four times as often as do the other customers.
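Observations like these come from comparing churn rates between groups of customers. The sketch below shows one way to compute such rates. It is written in plain Python rather than the R used for the slides, and the rows, field names, and helper `churn_rate_by_group` are all made up for illustration; they are not taken from the actual dataset.

```python
from collections import defaultdict

def churn_rate_by_group(rows, key):
    """Return {group value: fraction of customers in that group who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        group = key(row)
        totals[group] += 1
        churned[group] += row["churn"]  # churn is coded 1 = churned, 0 = not churned
    return {group: churned[group] / totals[group] for group in totals}

# Made-up rows for illustration only; not taken from the actual dataset.
toy_rows = [
    {"intl_plan": "yes", "custserv_calls": 5, "churn": 1},
    {"intl_plan": "yes", "custserv_calls": 1, "churn": 0},
    {"intl_plan": "no", "custserv_calls": 4, "churn": 1},
    {"intl_plan": "no", "custserv_calls": 0, "churn": 0},
    {"intl_plan": "no", "custserv_calls": 2, "churn": 0},
    {"intl_plan": "no", "custserv_calls": 1, "churn": 0},
]

# Churn rate with and without the international plan.
by_plan = churn_rate_by_group(toy_rows, key=lambda r: r["intl_plan"])

# Churn rate for customers with four or more service calls versus the rest.
by_calls = churn_rate_by_group(toy_rows, key=lambda r: r["custserv_calls"] >= 4)
```

The same helper works for any grouping, since the grouping rule is passed in as a function.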
![Page 13: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/13.jpg)
Feature Selection
• Important features were identified during the model-building process, for example:
– Stepwise regression indicates which variables are important to consider
– A variable importance graph was generated using random forest
![Page 14: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/14.jpg)
Data Pre-Processing
• The dataset considered for this project is already cleaned
• We partitioned the dataset into training and testing sets using simple random sampling
• We dropped the following four variables, as they do not add meaningful information for modelling:
– State
– Area.code
– Account.length
– Phone number
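The two pre-processing steps above can be sketched as follows. This is a minimal illustration in plain Python, not the R code behind the slides. The 30% test fraction and the seed are assumptions (the slides do not state the split ratio), and the column name `Phone` and the helper names are hypothetical.

```python
import random

# Columns dropped before modelling (names assumed to match the data file).
DROPPED = {"State", "Area.code", "Account.length", "Phone"}

def drop_columns(row, dropped=DROPPED):
    """Return a copy of the row without the dropped columns."""
    return {name: value for name, value in row.items() if name not in dropped}

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Simple random sampling: shuffle once, then cut into train and test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Made-up rows for illustration only.
toy_rows = [
    {"State": "KS", "Area.code": 415, "Account.length": 100 + i,
     "Phone": "000-0000", "Day.Mins": 150.0 + i, "Churn": i % 2}
    for i in range(10)
]

cleaned = [drop_columns(row) for row in toy_rows]
train, test = train_test_split(cleaned)
```

Fixing the seed makes the partition reproducible, which matters when several models are compared on the same test set.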
![Page 15: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/15.jpg)
Model 1: Decision Tree
• Easy to interpret
• Generates if-else business rules
• A recursive partitioning classification technique is used
• Trees built:
– Fully grown tree (results in overfitting of the data)
– Pruned tree (optimal tree)
• R packages used:
– rpart
– caret
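Recursive partitioning repeatedly picks the split that best separates churners from non-churners, then applies the same search to each resulting subset. The sketch below shows a single split search on one numeric variable using Gini impurity, a common splitting criterion. It is a plain-Python illustration with made-up data, not the rpart implementation, and the slides do not state which criterion their trees used.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted Gini) for the best rule `value < threshold`.

    Returns (None, impurity of the whole node) if no split improves on it.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between two identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Made-up data: customer service calls and churn flag (1 = churned).
calls = [0, 1, 1, 2, 4, 5]
churn = [0, 0, 0, 0, 1, 1]
threshold, impurity = best_split(calls, churn)
```

On this toy data the search yields an if-else rule of the form "if customer service calls < 3 then not churned, else churned". A fully grown tree keeps splitting until the leaves are pure, which is what leads to overfitting; pruning removes splits that do not improve performance enough.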
![Page 16: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/16.jpg)
Tree 1: Full Tree
![Page 17: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/17.jpg)
Performance measure of full tree : ROC Curve
![Page 18: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/18.jpg)
Performance measure of full tree : Confusion Matrix and other statistics
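The statistics usually reported alongside a confusion matrix, such as accuracy, sensitivity, and specificity, follow directly from its four cells. The sketch below computes them in plain Python on made-up labels; the numbers are not the results shown on the slide, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count (TP, FP, TN, FN), treating 1 (churned) as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def summary_statistics(tp, fp, tn, fn):
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # share of churners caught
        "specificity": ratio(tn, tn + fp),  # share of non-churners kept
        "precision": ratio(tp, tp + fp),    # share of churn flags that were right
    }

# Made-up labels for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
counts = confusion_counts(actual, predicted)
stats = summary_statistics(*counts)
```

Because churners are usually the minority class, sensitivity is often more informative than accuracy when comparing churn models.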
![Page 19: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/19.jpg)
Tree 2: Pruned Tree
![Page 20: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/20.jpg)
Performance measure of pruned tree : ROC Curve
![Page 21: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/21.jpg)
Performance measure of Pruned tree : Confusion Matrix and other statistics
![Page 22: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/22.jpg)
Comparing performance of both trees : ROC Curve
![Page 23: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/23.jpg)
Comparing both trees : Confusion Matrix and other statistics (Full Tree vs. Pruned Tree)
![Page 24: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/24.jpg)
Model 2: Logistic Regression
• Widely used across industry
• R functions and packages used:
– glm (from the stats package) for model building
– caret for model evaluation
![Page 25: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/25.jpg)
Model summary with all variables as input
![Page 26: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/26.jpg)
Model summary on statistically significant variables
![Page 27: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/27.jpg)
Model Evaluation-Confusion Matrix
![Page 28: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/28.jpg)
Model 3: Support Vector Machine
• Widely used black box technique for binary classification
• R packages used:
– e1071 (for model building)
– caret (for model evaluation)
![Page 29: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/29.jpg)
Model performance: Confusion Matrix
![Page 30: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/30.jpg)
Model Evaluation: SVM ROC Curve
![Page 31: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/31.jpg)
Model 4: Ensemble (Random Forest)
• An ensemble of decision trees is built
• R packages used:
– randomForest (model development)
– caret (model evaluation)
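The core idea of ensembling trees can be sketched as follows: fit many small trees on bootstrap resamples of the training data and let them vote. This plain-Python sketch is a simplification, not the randomForest algorithm. It bags one-split trees on a single variable and omits the random feature selection a real random forest adds. The data, the number of trees, and the seed are assumptions made for illustration.

```python
import random
from collections import Counter

def fit_stump(values, labels):
    """Fit a one-split tree; returns a function mapping a value to 0 or 1."""
    majority = Counter(labels).most_common(1)[0][0]
    best_errors = sum(y != majority for y in labels)
    best_rule = lambda v: majority  # fallback: always predict the majority class
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        for high_class in (1, 0):
            rule = lambda v, t=t, c=high_class: c if v >= t else 1 - c
            errors = sum(rule(v) != y for v, y in zip(values, labels))
            if errors < best_errors:
                best_errors, best_rule = errors, rule
    return best_rule

def fit_bagged_stumps(values, labels, n_trees=25, seed=0):
    """Fit each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(values)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([values[i] for i in idx], [labels[i] for i in idx]))
    return stumps

def predict(stumps, value):
    """Majority vote over all stumps."""
    votes = Counter(stump(value) for stump in stumps)
    return votes.most_common(1)[0][0]

# Made-up data: customer service calls and churn flag (1 = churned).
calls = [0, 1, 2, 3, 7, 8, 9, 10]
churn = [0, 0, 0, 0, 1, 1, 1, 1]
stumps = fit_bagged_stumps(calls, churn)
```

Averaging over many resampled trees reduces the variance of a single fully grown tree, which is why an ensemble tends to overfit less.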
![Page 32: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/32.jpg)
Variable Importance Plot : Random Forest
![Page 33: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/33.jpg)
Model Evaluation : Confusion Matrix
![Page 34: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/34.jpg)
Model Evaluation : ROC curve (Random Forest)
![Page 35: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/35.jpg)
Models Comparison: ROC curve
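ROC curves are typically compared through the area under the curve (AUC). The sketch below builds the ROC points by sweeping a threshold over predicted churn scores and integrates them with the trapezoid rule. It is plain Python on made-up scores, not the output of any model from the slides, and it assumes both classes are present in the labels.

```python
def roc_points(actual, scores):
    """(FPR, TPR) pairs from sweeping the threshold over every distinct score."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= t and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= t and a == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up labels and predicted churn probabilities for illustration only.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
points = roc_points(actual, scores)
area = auc(points)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect ranking, so the model whose curve encloses the largest area ranks churners above non-churners most reliably.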
![Page 36: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/36.jpg)
CUSTOMER SEGMENTATION & CLTV CALCULATION
• Different techniques are available for customer segmentation.
• Customers can be segmented into different kinds of profiles, such as high value, low value, warm, cold, and so on.
• RFM analysis, CLTV-based segmentation, and clustering-based segmentation are a few of these techniques.
![Page 37: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/37.jpg)
CLTV (Customer Lifetime Value)
• CLTV (Customer Lifetime Value) refers to the amount of revenue that you expect to generate from a customer during the period over which your service will be of value.
• On the basis of these values, we segment customer profiles and treat them accordingly.
![Page 38: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/38.jpg)
Assumptions
• Due to limitations in our dataset, we performed the CLTV analysis on the basis of the following assumptions:
– The given data contains one year of transaction details
– The unit of amount is dollars
– The company earns the following margins from its customers:
  • 5% of day charge
  • 10% of evening charge
  • 15% of night charge
  • 20% of international charge
– The monthly churn rate of the telecom industry is 4%
Note: the above numbers are for illustration purposes only; actual values depend on the domain knowledge of the analyst.
![Page 39: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/39.jpg)
CLTV calculation
• On the basis of these assumptions, the net profit from any customer can be calculated as:
-> netprofit = 0.05 * day.charge + 0.10 * eve.charge + 0.15 * night.charge + 0.20 * intl.charge
-> churnrate = 0.04
-> customer_cltv = (netprofit - 0.5 * cust_serv_call) / churnrate
• For illustration purposes, customers whose CLTV is less than mean(cltv) are considered low-value customers (LVC) and the others high-value customers (HVC).
Note: the above segmentation can be done in a better way with the help of a business domain expert.
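The calculation and the mean-based segmentation above can be sketched as follows. This is a plain-Python illustration using the coefficients from the formula on this slide. The customer records and the field and function names are made up, and the slides' own R code is not shown.

```python
from statistics import mean

CHURN_RATE = 0.04  # assumed churn rate from the slides

def net_profit(customer):
    """Margin-weighted sum of the four charge columns."""
    return (0.05 * customer["day_charge"]
            + 0.10 * customer["eve_charge"]
            + 0.15 * customer["night_charge"]
            + 0.20 * customer["intl_charge"])

def cltv(customer, churn_rate=CHURN_RATE):
    """Net profit, less 0.5 per customer service call, divided by churn rate."""
    return (net_profit(customer) - 0.5 * customer["custserv_calls"]) / churn_rate

def segment(customers):
    """Label each customer LVC if below the mean CLTV, otherwise HVC."""
    values = [cltv(c) for c in customers]
    cutoff = mean(values)
    return ["LVC" if v < cutoff else "HVC" for v in values]

# Made-up customers for illustration only.
customer_a = {"day_charge": 40.0, "eve_charge": 20.0, "night_charge": 10.0,
              "intl_charge": 5.0, "custserv_calls": 1}
customer_b = {"day_charge": 20.0, "eve_charge": 10.0, "night_charge": 10.0,
              "intl_charge": 0.0, "custserv_calls": 4}

labels = segment([customer_a, customer_b])
```

For customer A the net profit is 2.0 + 2.0 + 1.5 + 1.0 = 6.5, so the CLTV is (6.5 - 0.5) / 0.04 = 150. Customer B comes out at (3.5 - 2.0) / 0.04 = 37.5, which is below the mean of 93.75 and therefore falls in the LVC segment.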
![Page 40: Churn modelling](https://reader036.fdocuments.us/reader036/viewer/2022062223/58997e961a28abb97c8b49fb/html5/thumbnails/40.jpg)
• THANK YOU