Fraud Analytics with Machine Learning and Big Data Engineering for Telecom
Fraud Analytics with Machine Learning & Engineering (FAME) for Telecom using Big Data
Presented by:
Sudarson Roy Pratihar
Pranab Kumar Dash
Subhadip Paul
Amartya Kumar Das
Copyright © 2015 Authors. All rights reserved.
A Quick Intro – Telecom Frauds
Fraud Analytics With Machine Learning & Engineering
• Have you received missed calls from unknown overseas numbers?
• Have you heard of PBX hacking and corporations facing huge bills?
Problem Definition
• The telecom industry loses 46.3 billion USD globally to various types of fraud
• 10% of operators have bad debt due to fraud
• Detection is a cat-and-mouse game: fraud patterns change to evade the available data mining techniques
• Raising timely alerts while processing a huge volume of call records is a challenge
• Alerts with a high false positive rate increase operational expenses
Importance to Telecom Industry & Society
• An efficient, self-adaptive detection mechanism can significantly reduce the loss due to fraud (about 2.1% of revenue) and the operational cost
• Less “bad money” in the system
Data Source
• More than 1 TB of Call Detail Records (CDR) from a reputed wholesale carrier as history data
• Tested on a few weeks of live CDR from the carrier
Analytics Technique
• The basic components of FAME are:
– A self-adaptive machine learning methodology
– An actionable dashboard on which the operations and investigations teams act upon alerts; their feedback is sent to the machine learning model to adjust its weights
– A high-performance big data platform for data processing and machine learning
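The slides do not say how operator feedback adjusts the model weights. As a minimal sketch only, assuming each detector casts a fraud / not-fraud vote and carries one weight, a multiplicative-weights update could look like the following. The detector names, the `penalty` parameter and both function names are hypothetical and do not come from the deck.

```python
def weighted_vote(weights, votes):
    """Return True when the weighted share of 'fraud' votes exceeds one half."""
    total = sum(weights.values())
    fraud_share = sum(w for name, w in weights.items() if votes[name]) / total
    return fraud_share > 0.5


def update_weights(weights, votes, is_fraud, penalty=0.5):
    """Shrink the weight of every detector that disagreed with the operator.

    weights:  dict detector name -> current weight
    votes:    dict detector name -> bool (True means the detector flagged fraud)
    is_fraud: the operator's verdict from the dashboard
    penalty:  hypothetical shrink factor applied to wrong detectors
    """
    updated = {
        name: w * (1.0 if votes[name] == is_fraud else penalty)
        for name, w in weights.items()
    }
    total = sum(updated.values())
    # Renormalise so the weights keep summing to 1.
    return {name: w / total for name, w in updated.items()}


# Hypothetical detectors; the operator confirms the alert as fraud.
weights = {"boxplot": 0.5, "clustering": 0.5}
votes = {"boxplot": True, "clustering": False}
new_weights = update_weights(weights, votes, is_fraud=True)
```

With this update, a detector that keeps disagreeing with operators gradually loses influence over the combined alert, which is one way the "self adaptive" behaviour could be realised.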
How it detects and adapts …
[Figure: detection and adaptation workflow with numbered steps 1 to 10. Components: CDR Feed, Fraud Detection Model Pipeline, Novelty Detection Pipeline / Stacking, Actionable Dashboards, Pattern validation and tuning workbench. Flow labels: Frauds detected, Remaining Data, New Patterns, More frauds, Operators feedback, New model addition / Tuning of existing. Roles: Analyst, Operator.]
Novelty Detection Pipeline
• Novelty detection is run separately for origin and destination numbers
• Various contextual anomaly detection techniques are used and their outputs are combined
• Below are some examples of the algorithms used:
– Box-plot based outlier detection
– Clustering to find clusters with a distinct centroid
– Use of Mahalanobis distance: Mdist > ɸ · IQR
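Two of the listed techniques can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: each number is described by two hypothetical features (calls per hour, average call duration in seconds), and the rule Mdist > ɸ · IQR is read as "flag a record when its Mahalanobis distance exceeds ɸ times the interquartile range of the distances seen in history". The slide does not define ɸ or say which quantity the IQR is taken over, so both are assumptions here.

```python
import math
import statistics


def boxplot_outliers(values, k=1.5):
    """Classic box-plot rule: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def fit_profile(history):
    """Mean vector and inverse 2x2 sample covariance of two-feature records."""
    n = len(history)
    mx = statistics.fmean(r[0] for r in history)
    my = statistics.fmean(r[1] for r in history)
    sxx = sum((x - mx) ** 2 for x, _ in history) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in history) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in history) / (n - 1)
    det = sxx * syy - sxy ** 2
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse


def mahalanobis(record, mean, inverse):
    """Mahalanobis distance of one record from the fitted profile."""
    dx, dy = record[0] - mean[0], record[1] - mean[1]
    q = dx * (inverse[0][0] * dx + inverse[0][1] * dy) \
        + dy * (inverse[1][0] * dx + inverse[1][1] * dy)
    return math.sqrt(max(q, 0.0))


def flag_novel(history, new_records, phi=3.0):
    """Flag records whose distance exceeds phi * IQR of historical distances."""
    mean, inverse = fit_profile(history)
    hist_dist = [mahalanobis(r, mean, inverse) for r in history]
    q1, _, q3 = statistics.quantiles(hist_dist, n=4)
    threshold = phi * (q3 - q1)
    return [mahalanobis(r, mean, inverse) > threshold for r in new_records]


# Illustrative values only: (calls per hour, average duration in seconds).
history = [(10, 60), (12, 65), (11, 58), (9, 62),
           (10, 61), (13, 59), (11, 63), (12, 60)]
flags = flag_novel(history, [(11, 61), (80, 5)])
calls_flags = boxplot_outliers([10, 12, 11, 9, 10, 13, 11, 80])
```

The box-plot rule looks at one feature at a time, while the Mahalanobis distance accounts for how the features vary together, which is one reason to combine the outputs of several detectors.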
Fraud Detection Pipeline
• Use history data and flag records with the “Novelty Detection Pipeline”
• Verify the flagged records and label them
• Build separate models (logistic regression, random forest and threshold-based) for different patterns
• Combine the outputs of the models
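How the model outputs are combined is not stated on the slide. The sketch below assumes a weighted average of per-model fraud scores, with a logistic scorer and a threshold rule standing in for trained models. The feature names, coefficients, limit and weights are all hypothetical, and the random forest is left out to keep the example dependency-free; any model that returns a score between 0 and 1 would plug in the same way.

```python
import math


def logistic_score(record, coef, intercept):
    """Fraud probability from a logistic model with the given coefficients."""
    z = intercept + sum(c * record[name] for name, c in coef.items())
    return 1.0 / (1.0 + math.exp(-z))


def threshold_score(record, field, limit):
    """Threshold-based rule: score 1.0 when the field exceeds the limit."""
    return 1.0 if record[field] > limit else 0.0


def combine(scores, weights):
    """Weighted average of the per-model scores."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical weights; in practice they would come from validation or feedback.
WEIGHTS = {"logistic": 0.5, "threshold": 0.5}


def fraud_score(record):
    """Score one aggregated record with both stand-in models and combine them."""
    scores = {
        # Hypothetical coefficients, not fitted on real CDR data.
        "logistic": logistic_score(
            record, {"calls_per_hour": 0.1, "intl_ratio": 2.0}, intercept=-6.0
        ),
        "threshold": threshold_score(record, "calls_per_hour", limit=50),
    }
    return combine(scores, WEIGHTS)


suspicious = fraud_score({"calls_per_hour": 80, "intl_ratio": 0.9})
normal = fraud_score({"calls_per_hour": 10, "intl_ratio": 0.1})
```

A record would then raise an alert when its combined score passes a cut-off chosen to keep false positives at an acceptable level.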
System Behind Magic …
[Figure: system architecture. Labels: Actionable Dashboard; Ensemble of Self Adaptive Algos; Big Data Platform Powered by Hadoop & Spark; Integration Facets; Feedback; CDR Feed from Telecom System.]
Accuracy Results
[Figure: chart of True positive, False positive and Accuracy on a scale of 0 to 1, with one series each for B-Number and A-Number.]
• Accuracy is reported individually for origin (A-number) and destination (B-number) detection
• The combined mechanism has a false positive rate below 5%
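For reference, the three quantities named in the chart can be computed from labelled alerts as sketched below. The labels in the example are made up for illustration and are not the results reported in this deck; the function name is hypothetical.

```python
def detection_rates(actual, predicted):
    """True positive rate, false positive rate and accuracy for binary labels.

    actual, predicted: equal-length sequences where a truthy value means fraud.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    return {
        # Share of real frauds that were caught.
        "true_positive_rate": tp / (tp + fn) if tp + fn else 0.0,
        # Share of legitimate records that raised a false alert.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        # Share of all records classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Made-up labels: 3 frauds and 5 legitimate records.
rates = detection_rates([1, 1, 1, 0, 0, 0, 0, 0],
                        [1, 1, 0, 0, 0, 0, 0, 1])
```

Because fraud is rare in call records, accuracy alone can look high even for a weak detector, so the true and false positive rates are the more informative pair.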
What Next …
• Test on different types of telecom fraud
• Extend this industrialized approach to other areas (such as network intrusion detection)
• Productize as a cloud-based service as well as an on-premise implementation
Contact Us @
Amartya Kumar Das [email protected]
https://in.linkedin.com/pub/amartya-das/b/72b/637
Subhadip Paul [email protected]
https://in.linkedin.com/in/subhadippaul
Pranab Kumar Dash [email protected]
www.linkedin.com/profile/view?id=19155039
Sudarson Roy Pratihar [email protected]
www.linkedin.com/in/sudarson
Follow us #FAMETELCO