Fraud Detection and Prevention: Leveraging Machine Learning to Detect Fraud Patterns, Anomalies, and Unusual Behaviors

Ralf Klinkenberg, Founder & Head of Data Science Research, RapidMiner

[email protected]

www.RapidMiner.com

RapidMiner Wisdom 2018, New Orleans, LA, USA, October 12th, 2018

#1 Agile Predictive Analytics Platform for Today’s Modern Analysts


© 2017 RapidMiner, GmbH & RapidMiner, Inc. All rights reserved.

Creating Value from Big Data

1. Fraud: Areas & Types & Relevance

2. Machine Learning for Fraud Detection & Prevention

3. Credit Card Fraud Detection & Prevention

4. Healthcare Fraud Detection & Prevention


Fraud


Fraud: Areas and Types of Fraud

• Credit Card Fraud

• Tax Fraud

– EU: Value Added Tax (VAT) Fraud in Transactions within Networks of Companies

– Income Tax Fraud / Corporate Tax Fraud

• Fraud in Supply Chains, Retail Networks, Purchase Departments, Procurement

• Insurance Fraud:

– Car Insurance (Faked Accidents)

– Fire Insurance

– Healthcare Insurance


Fraud: Healthcare Insurance Fraud

• Example: Medicaid/Medicare in the USA: one US state alone has a budget of 6 billion US$ per year => estimated 10-20% fraud & waste => about 1 billion US$ per year lost

• Fraudulent Patients (e.g. Drug Addicts/Dealers/Resellers)

• Fraudulent Doctors

• Fraudulent Pharmacies / Hospitals / Service Providers / Suppliers

• Individuals as well as Networks of Fraudsters


Fraud: Challenges for Fraud Detection

• Large Number of Potential Types and Areas of Fraud

• Intelligent and Constantly Improving Adversaries

• Changing Fraud Patterns and Types

• Large Amounts of Potentially Relevant Data

• Large Variety of Potentially Relevant Data Sources & Types

– Structured and Unstructured Data: Transactions, Time Series Data, Textual Data, Network Data, Entity Relations, etc.

• Limited Resources for Fraud Detection & Prevention

– Which cases to investigate (first / at all)?

– Prioritize & focus to maximize effectiveness & efficiency

Fraud: Known vs. Unknown Types of Fraud

• New instances of known types of fraud should be automatically identified:

=> use Machine Learning to automatically find patterns (in data from the past with known fraud cases)

=> deploy generated models to automatically identify new cases

• But what about new types of fraud?

Machine Learning for Fraud Detection

Predictive Analytics Transforms Insight into ACTION

• Descriptive: OBSERVE (What happened)

• Diagnostic: EXPLAIN (Why did it happen)

• Predictive: ANTICIPATE (What will happen)

• Prescriptive: ACT (Operationalize)

[Figure axis: Value]


Metrics & Indicators for Fraud Risk

• Domain experts often know metrics that may be indicative of a high risk of fraud => incorporate into entity features

• Examples:

– Entity = Patient:

▪ Total Payments Received,

▪ Number of Prescriptions,

▪ Number of Doctors Visited,

▪ Number of Pills per Month, etc.

– Entity = Prescriber (e.g. Doctor):

▪ Total Payments Received, Number of Patients per Month, Amount per Patient, etc.

– Entity = Service Provider (e.g. Pharmacy, Hospital, etc.):

▪ Total Payments Received, Price per Unit, Price per Treatment of Type X, etc.
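The entity metrics listed above could be derived from raw claim records roughly as follows. This is a minimal sketch in plain Python, not the RapidMiner process from the talk; the record layout and the field names (`patient`, `doctor`, `pills`, `amount`) are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical claim records; field names and values are illustrative assumptions.
claims = [
    {"patient": "P1", "doctor": "D1", "pills": 30, "amount": 120.0},
    {"patient": "P1", "doctor": "D2", "pills": 60, "amount": 240.0},
    {"patient": "P1", "doctor": "D3", "pills": 90, "amount": 360.0},
    {"patient": "P2", "doctor": "D1", "pills": 30, "amount": 100.0},
]

def patient_features(records):
    """Aggregate claim records into one feature row per patient."""
    doctors = defaultdict(set)
    feats = defaultdict(lambda: {"n_prescriptions": 0, "n_pills": 0, "total_amount": 0.0})
    for r in records:
        f = feats[r["patient"]]
        f["n_prescriptions"] += 1
        f["n_pills"] += r["pills"]
        f["total_amount"] += r["amount"]
        doctors[r["patient"]].add(r["doctor"])
    for patient, f in feats.items():
        f["n_doctors"] = len(doctors[patient])  # number of distinct doctors visited
    return dict(feats)

features = patient_features(claims)
print(features["P1"])
```

The same aggregation pattern would apply to prescribers or service providers by grouping on a different key.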


Comparison to Peer Groups

• Does a high value of "Total Amounts Prescribed" automatically mean the entity (e.g. doctor) is fraudulent?

• No, but a high total amount prescribed may indicate a high risk of fraud.

• Oncologists often need to prescribe expensive anti-cancer drugs

=> oncologists may have higher "Total Amounts Prescribed" than other types of doctors (specializations)

=> compare a doctor's metric to the average value of his/her peers (and not to the average for all doctors) => ratio.
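The peer-group ratio can be sketched as below. The doctors, specializations, and amounts are invented for illustration; only the idea of dividing a doctor's metric by the peer-group average comes from the slide.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical doctors: (id, specialization, total amount prescribed in US$).
doctors = [
    ("D1", "oncology", 900_000.0),
    ("D2", "oncology", 1_100_000.0),
    ("D3", "general", 50_000.0),
    ("D4", "general", 70_000.0),
    ("D5", "general", 180_000.0),
]

def peer_ratios(rows):
    """Return each doctor's metric divided by the mean of their specialization."""
    by_group = defaultdict(list)
    for _, group, value in rows:
        by_group[group].append(value)
    group_mean = {g: mean(values) for g, values in by_group.items()}
    return {doc: value / group_mean[group] for doc, group, value in rows}

ratios = peer_ratios(doctors)
for doc, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(doc, round(ratio, 2))
```

In this made-up data the general practitioner D5 has the highest ratio, although both oncologists prescribe far larger absolute amounts.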


Leverage Fraud Risk Indicators

• Does a high value of "Total Payments Received" automatically mean the entity (e.g. doctor) is fraudulent?

• No, but a high total amount received may indicate a high risk of fraud.

• => Rank entities by value of key metrics => suspects

• => Combine metrics (e.g. weighted sum): Fraud Risk Score => Rank entities by value of combined metric => suspects

• No machine learning yet, but an often-used initial solution to rank and prioritize entities for review / audits / investigation

• => more effective & efficient use of resources (auditors)
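A weighted-sum Fraud Risk Score and the resulting ranking might look like the sketch below. The metric names, values, and weights are hypothetical; in practice domain experts would choose them.

```python
# Hypothetical peer-normalized metrics per entity (1.0 = peer-group average).
metrics = {
    "D1": {"payments_ratio": 0.9, "patients_ratio": 1.0, "amount_per_patient_ratio": 0.9},
    "D2": {"payments_ratio": 2.0, "patients_ratio": 1.0, "amount_per_patient_ratio": 2.0},
    "D3": {"payments_ratio": 1.2, "patients_ratio": 1.5, "amount_per_patient_ratio": 0.8},
}

# Hypothetical expert-chosen weights; they encode how much each indicator matters.
weights = {"payments_ratio": 0.5, "patients_ratio": 0.2, "amount_per_patient_ratio": 0.3}

def risk_score(entity_metrics, weights):
    """Combine several indicators into one score via a weighted sum."""
    return sum(weights[name] * entity_metrics[name] for name in weights)

# Entities with the highest combined score are reviewed first.
ranking = sorted(metrics, key=lambda e: risk_score(metrics[e], weights), reverse=True)
print(ranking)
```

No model is learned here; the ranking only prioritizes which entities auditors look at first.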


Supervised Learning

• Classification: algorithms to predict classes (Fraud / No Fraud)

• Regression: algorithms to predict numbers (Fraud Risk Scores or Expected Values)

Unsupervised Learning

• Grouping: group similar items together (Segmentation, Clustering, Item Sets, Association Rules, Sequence Analysis, Network Analysis)

• Anomaly Detection: find outliers in your data (unusual behaviors)

Supporting tasks: Feature Extraction & Selection, Optimization, Automation, Deployment


Machine Learning: Supervised vs. Unsupervised

• Supervised Machine Learning:

– Data from the past with known fraud and non-fraud cases (label);

– Machine Learning of Classification models or Association rules to find fraud patterns from the past and to automatically identify new instances of these fraud types in new data;

– Applicable to known fraud cases, patterns, and types.

• Unsupervised Machine Learning:

– Clustering (Segmentation): Grouping entities into clusters of similar entities (patients, doctors, service providers, etc.);

– Anomaly Detection / Outlier Detection: detect unusual behaviors;

– Both depend on selected attributes, normalization and/or weighting;

– Attribute Weighting can be used to incorporate domain knowledge and/or priorities;

– Allows finding previously unknown types of fraud.
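Why clustering and anomaly detection depend on normalization and weighting can be illustrated with a small distance calculation. The entities, attribute values, and weights below are assumptions; z-score normalization and a weighted Euclidean distance are one common choice, not necessarily what the talk used.

```python
import math
from statistics import mean, pstdev

# Hypothetical entities: (total payments in US$, number of doctors visited).
entities = {"A": (10_000.0, 2), "B": (10_500.0, 12), "C": (20_000.0, 2)}

def zscore_columns(rows):
    """Normalize each attribute to zero mean and unit standard deviation."""
    cols = list(zip(*rows.values()))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return {k: tuple((v - m) / s for v, (m, s) in zip(row, stats)) for k, row in rows.items()}

def weighted_distance(p, q, weights):
    """Euclidean distance in which each attribute is scaled by a weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))

# Raw values: the payment attribute dominates, so A looks almost identical to B.
raw_ab = weighted_distance(entities["A"], entities["B"], (1, 1))
raw_ac = weighted_distance(entities["A"], entities["C"], (1, 1))

# Normalized values with extra weight on "doctors visited" (domain knowledge).
norm = zscore_columns(entities)
weighted_ab = weighted_distance(norm["A"], norm["B"], (1, 4))
weighted_ac = weighted_distance(norm["A"], norm["C"], (1, 4))

print(raw_ab < raw_ac, weighted_ab > weighted_ac)
```

With raw values B is A's nearest neighbor; after normalization and weighting, B's unusually high number of doctors makes it the more distant entity.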


Fraud Detection and Prediction


Fraud Detection with Machine Learning

• Step 1: Finding Known Fraud Patterns by Embedding Domain Expert Knowledge: Fraud Risk Scoring & Ranking of Entities => From Random Checks to Systematic Automated Checks & Prioritization => Data Mining to Automate Fraud Detection

• Step 2: Identifying Known Fraud Patterns with Machine Learning and Automatically Detecting Them in the Future: Supervised Learning: Automated Classification, Risk Score Regression, Association Rule Generation

• Step 3: Identifying Previously Unknown Fraud Cases or Patterns: Unsupervised Learning: Anomaly Detection, Outlier Detection

• Step 4: Comparison with Expectations: Predict Volumes & Prices and Compare with Actual Medications

• Step 5: Adversarial Machine Learning / Text Analytics / Process Mining

• Step 6: (Semi-)Automated Audits (Auditors Remain in Control)
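Step 4 (comparison with expectations) can be sketched with a simple regression: fit expected volumes on historical data, then compare actual values against the prediction. The data, the single predictor (number of patients), and the use of ordinary least squares are illustrative assumptions.

```python
from statistics import mean

# Hypothetical history: (number of patients, prescribed volume) for typical doctors.
history = [(10, 100.0), (20, 205.0), (30, 295.0), (40, 400.0)]

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

slope, intercept = fit_line(history)

# New observations to check: (doctor, number of patients, actual volume).
observed = [("D1", 25, 255.0), ("D2", 25, 600.0)]

# Ratio of actual to expected volume; values far above 1 deserve a closer look.
deviation = {doc: actual / (slope * n + intercept) for doc, n, actual in observed}
print(deviation)
```

A ratio close to 1 matches the expectation, while a ratio well above 1 (D2 in this made-up data) marks a candidate for review rather than proof of fraud.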

Credit Card Fraud


Meta Data: Amount, Location, Receiver, TimeStamp, CardId

Three Methods to Combine

• Random

• Unsupervised

• (Semi-)Supervised

[Figure: Card-Number (ID) and Probability for the Random, Unsupervised, and (Semi-)Supervised methods]


Challenge I

Being good at detecting known patterns vs. seeing the new and unknown


Challenge II

Detection Rate


Transforming transactional data (e.g., purchase/date) into a table (RapidMiner Example Set)

Data aggregation and enrichment

=> Creating a profile of the customer
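The transformation from raw transactions to a per-customer profile table could look like this sketch. It uses the transaction attributes named in the deck (CardId, TimeStamp, Amount, Location); the concrete values and the chosen profile columns are assumptions, and the code is plain Python rather than a RapidMiner process.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactions; values are invented for illustration.
transactions = [
    {"CardId": "C1", "TimeStamp": "2018-10-01T09:15", "Amount": 25.0, "Location": "Berlin"},
    {"CardId": "C1", "TimeStamp": "2018-10-01T18:40", "Amount": 40.0, "Location": "Berlin"},
    {"CardId": "C1", "TimeStamp": "2018-10-02T03:05", "Amount": 900.0, "Location": "Lagos"},
    {"CardId": "C2", "TimeStamp": "2018-10-01T12:00", "Amount": 60.0, "Location": "Paris"},
]

def build_profiles(rows):
    """Aggregate raw transactions into one profile row per card."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["CardId"]].append(r)
    profiles = {}
    for card, txs in grouped.items():
        amounts = [t["Amount"] for t in txs]
        days = {datetime.fromisoformat(t["TimeStamp"]).date() for t in txs}
        profiles[card] = {
            "n_transactions": len(txs),
            "total_amount": sum(amounts),
            "max_amount": max(amounts),
            "n_locations": len({t["Location"] for t in txs}),
            "active_days": len(days),
        }
    return profiles

profiles = build_profiles(transactions)
print(profiles["C1"])
```

Each profile row then serves as one example for rule-based checks, outlier detection, or supervised learning.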

Being good at detecting known patterns vs. seeing the new and unknown

Unsupervised vs. (Semi-)Supervised Learning

Detection rate is critical: relatively few fraud cases compared to thousands of legitimate transactions
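Why the detection rate matters more than plain accuracy when fraud is rare can be shown with a tiny calculation. The labels below are made up, and "detection rate" is interpreted here as recall on the fraud class, which is an assumption about the slide's meaning.

```python
# Hypothetical labels: 1 = fraud, 0 = legitimate; 5 fraud cases among 1,000 transactions.
actual = [1] * 5 + [0] * 995

# A "model" that never raises an alert.
always_legit = [0] * 1000

# A detector that finds 4 of the 5 frauds and raises 10 false alarms.
detector = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 985

def accuracy(y_true, y_pred):
    """Share of transactions that were classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def detection_rate(y_true, y_pred):
    """Recall on the fraud class: detected frauds / all frauds."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / sum(y_true)

print(accuracy(actual, always_legit), detection_rate(actual, always_legit))
print(accuracy(actual, detector), detection_rate(actual, detector))
```

The never-alerting model scores higher on accuracy while detecting no fraud at all, which is why class imbalance has to be considered when evaluating fraud models.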


Now we have a profile, what do we do with it?

Rule based

• Daily amount < 500€ p.d.

Local Outlier Factor (LOF)

• Distance-based algorithm for outlier detection

• Source: https://en.wikipedia.org/wiki/Local_outlier_factor

Supervised

• Random Forest, SVM, …
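The rule-based option can be sketched as a check of the daily limit from the slide. Only the 500€ threshold comes from the source; the transactions and the function name are hypothetical.

```python
from collections import defaultdict

DAILY_LIMIT_EUR = 500.0  # threshold taken from the rule on the slide

# Hypothetical transactions: (card id, date, amount in EUR).
transactions = [
    ("C1", "2018-10-01", 120.0),
    ("C1", "2018-10-01", 450.0),
    ("C1", "2018-10-02", 80.0),
    ("C2", "2018-10-01", 499.0),
]

def daily_limit_violations(rows, limit=DAILY_LIMIT_EUR):
    """Return (card, date, total) for every card-day that breaks 'daily amount < limit'."""
    totals = defaultdict(float)
    for card, day, amount in rows:
        totals[(card, day)] += amount
    return sorted((card, day, total) for (card, day), total in totals.items() if total >= limit)

violations = daily_limit_violations(transactions)
print(violations)
```

Rules like this are transparent and cheap to run, but they only catch what someone has already thought of, which is where outlier detection and supervised models come in.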


Being good at detecting known patterns vs. seeing the new and unknown

What my customer profile should look like

Class balance: relatively few fraud cases compared to thousands of legitimate transactions


Local Outlier Factor (LOF)

• Distance-based algorithm for outlier detection

• Incorporates the concept of local density (similar to DBSCAN clustering)

• Calculated scores are comparable

• Source: https://en.wikipedia.org/wiki/Local_outlier_factor

Rule Based Systems

• A fixed set of rules for classifying events

• Classic example: Naïve Bayes for detecting spam mails

HypGraphs and HypTrails

• Bayesian methods for comparing hypotheses about sequential data

• Can be applied to transition networks
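Returning to the Local Outlier Factor on this slide, the sketch below follows the standard definition (k-distance, reachability distance, local reachability density) in plain Python. The sample points and `k=3` are assumptions, and the code assumes no duplicate points; it is meant to show the idea, not to replace an optimized library implementation.

```python
import math

def local_outlier_factors(points, k=3):
    """Compute the LOF score of every point; values well above 1 indicate outliers."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])                 # indices of the k nearest neighbors
        k_distance.append(dist[i][order[k - 1]])    # distance to the k-th nearest neighbor

    def reach_dist(a, b):
        # Reachability distance of a from b: at least b's k-distance.
        return max(k_distance[b], dist[a][b])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [1.0 / (sum(reach_dist(i, j) for j in neighbors[i]) / k) for i in range(n)]

    # LOF: mean density of the neighbors relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / k / lrd[i] for i in range(n)]

# Hypothetical 2-D profiles: a tight cluster plus one point far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = local_outlier_factors(points, k=3)
print([round(s, 2) for s in scores])
```

Points inside the cluster receive scores near 1, while the isolated point receives a much larger score because its local density is far lower than that of its neighbors.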


Healthcare Fraud Detection

RapidMiner Demo


Anticipating the risk of fraud: Russia’s Largest Electronic Payment Service

Safeguarding Electronic Payments

The Challenge

• Protecting against fraud and anticipation of risk 7x24

• Large and diverse set of partners (merchants): over 70,0000

• How to classify and check merchant ecommerce sites for payment system compliance?

RapidMiner Solution

• Analyze, classify and check merchants’ ecommerce sites for compliance

• Utilize text mining with NLP to auto-categorize with high sentiment accuracy

• Mashup the widest data sets: historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior

• Detect anomalies, misuse and fraud through operationalized classification model

Outcome

• Only 8-10% of merchant sites now screened manually at 80% confidence threshold

• Accurate automated analysis of high risk sites: 92% correctly classified

• Elimination of false positives: no normal sites classified as high risk

• Time and cost to resolve fraud case radically reduced
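The confidence threshold mentioned in the outcome can be pictured as a routing step: predictions at or above the threshold are accepted automatically, the rest go to manual screening. Only the 80% value comes from the case study; the sites, labels, and confidences are invented, and the actual deployed logic is not described in the source.

```python
CONFIDENCE_THRESHOLD = 0.80  # threshold mentioned in the case study

# Hypothetical classifier output: (merchant site, predicted class, confidence).
predictions = [
    ("shop-a.example", "compliant", 0.97),
    ("shop-b.example", "high_risk", 0.91),
    ("shop-c.example", "compliant", 0.62),
    ("shop-d.example", "high_risk", 0.80),
]

def route(preds, threshold=CONFIDENCE_THRESHOLD):
    """Split predictions into automatically accepted ones and a manual review queue."""
    automatic = [(site, label) for site, label, conf in preds if conf >= threshold]
    manual = [site for site, _, conf in preds if conf < threshold]
    return automatic, manual

automatic, manual = route(predictions)
manual_share = len(manual) / len(predictions)
print(manual, manual_share)
```

Raising the threshold sends more sites to manual screening; lowering it automates more decisions at the cost of accepting less certain predictions.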


Process Mining & Fraud Detection

• Insurance Claims & Payments Leave Footprints and Audit Trails:

– Contracts

– Claim reports / incidents

– Payments / transactions

– Individuals & organisations involved

– IT system log files

• Use Process Mining to:

– Collect

– Normalize

– Correlate

– Analyze

• RapidMiner RapidProM Extension on the RapidMiner Marketplace

• Financial Audits

– Compliance / regulatory audits

– Operational audits

– Transactional services (M&A)

• Purchase Processes & Procurement

• IT Audits

– IT Service management

– Cyber security

– Systems compliance

– IT forensic services

• Manufacturing

– Identifying assembly bottlenecks


Process Mining – RapidMiner with RapidProM

[Figure: process flow Task 1 => Task 2 => IF/THEN => Task 3a or Task 3b => Task 4 => Task 5, with the tasks executed in Applications A, B, and C]

Log File App A:

…
200612 10:30 User0015 Task1 Case0099
260612 23:01 User4801 Task1 Case0223

Log File App B:

…
200612 10:31 User0015 Task2 Case0099
200612 10:35 User0015 Task3b Case0099
…

Log File App C:

…
200612 10:37 System Task4 Case0099
200612 10:38 System Task5 Case0099

Log File Normalization and Merge => Process Log Data Lake => RapidMiner with RapidProM

• Process Documentation (bottom-up model generation, determination of reference processes)

• Social Collaboration: Social Graphs Analysis

• Process Harmonization (compare against to-be processes and show deltas)

• Process Optimization (runtime analysis, late runners, waiting times, unexpected stops, congestion)

http://www.rapidprom.org
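The log normalization and merge step can be sketched as follows, using the log lines shown on the slide. The column order (date, time, user, task, case) is read off those lines; interpreting the date as DDMMYY is an assumption, and the code is plain Python rather than RapidProM.

```python
from collections import defaultdict
from datetime import datetime

# Log lines as shown on the slide, one list per application.
log_a = ["200612 10:30 User0015 Task1 Case0099", "260612 23:01 User4801 Task1 Case0223"]
log_b = ["200612 10:31 User0015 Task2 Case0099", "200612 10:35 User0015 Task3b Case0099"]
log_c = ["200612 10:37 System Task4 Case0099", "200612 10:38 System Task5 Case0099"]

def parse(line, source):
    """Normalize one log line into an event dict (date format DDMMYY is an assumption)."""
    date, time, user, task, case = line.split()
    timestamp = datetime.strptime(f"{date} {time}", "%d%m%y %H:%M")
    return {"time": timestamp, "user": user, "task": task, "case": case, "source": source}

def merge_logs(**logs):
    """Merge per-application logs into one time-ordered trace per case."""
    traces = defaultdict(list)
    for source, lines in logs.items():
        for line in lines:
            event = parse(line, source)
            traces[event["case"]].append(event)
    for events in traces.values():
        events.sort(key=lambda e: e["time"])
    return dict(traces)

traces = merge_logs(app_a=log_a, app_b=log_b, app_c=log_c)
trace_0099 = [e["task"] for e in traces["Case0099"]]
print(trace_0099)
```

Once events from all applications are correlated by case ID and ordered by time, each trace can be compared with the reference process to spot skipped or unexpected steps.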


Thanks for your Attention!

Ralf Klinkenberg

[email protected]

www.RapidMiner.com
