Data mining in Health Insurance
![Page 1: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/1.jpg)
Data mining in Health Insurance
![Page 2: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/2.jpg)
Introduction
• Rob Konijn, [email protected]
  – VU University Amsterdam
  – Leiden Institute of Advanced Computer Science (LIACS)
  – Achmea Health Insurance
    • Currently working here
    • Delivering leads for other departments to follow up
      – Fraud, abuse
• Research topic keywords: data mining / unsupervised learning / fraud detection
![Page 3: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/3.jpg)
Outline
• Intro Application
  – Health Insurance
  – Fraud detection
• Part 1: Subgroup discovery
• Part 2: Anomaly detection (slides partly by Z. Slavik, VU)
![Page 4: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/4.jpg)
Intro Application
• Health Insurance Data
• Health Insurance in NL
  – Obligatory
  – Only private insurance companies
  – About 100 euro/month (everyone) + 170 euro (income)
  – Premium increase of 5-12% each year
Achmea: about 6 million customers
![Page 5: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/5.jpg)
Funding of Health Insurance Costs in the Netherlands
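[Diagram: money flows between the insured (verzekerde), the health insurer (zorgverzekeraar) and the risk equalization fund (vereveningsfonds)]
• State contribution for insured under 18 (rijksbijdrage): 2 bn euro
• Income-dependent contribution, employers (inkomensafhankelijke bijdrage): 17 bn euro
• Equalization contribution (vereveningsbijdrage): 18 bn euro
• Nominal premium, 18 and older (nominale premie):
  – Calculation premium (rekenpremie, ~€ 947 per insured): 12 bn euro
  – Surcharge (opslag, ~€ 150 per insured): 2 bn euro
• Health care expenditure (zorguitgaven): 30 bn euro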
![Page 6: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/6.jpg)
Risk equalization model (vereveningsmodel)
• By population characteristics
  – Age
  – Gender
  – Income, social class
  – Type of work
• Calculation afterwards
  – High costs compensation (>15.000 euro)
[Figure: values per age group (0-4 years up to 90 years and older), shown separately for men (Mannen) and women (Vrouwen)]
![Page 7: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/7.jpg)
Fraud in health care
![Page 8: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/8.jpg)
Introduction Application: The Data
• Transactional data
  – Records of an event
  – Visit to a medical practitioner
• Charged directly by medical practitioner
• Patient is not involved
• Risk of fraud
![Page 9: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/9.jpg)
Transactional Data
• Transactions: facts
  – Achmea: about 200 mln transactions per year
• Info of customers and practitioners: dimensions
![Page 10: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/10.jpg)
Different levels of hierarchy
• Records represent events
• However, for fraud detection, for example, we are interested in customers or medical practitioners
• See examples on the next pages
• Groups of records: subgroup discovery
• Individual patients/practitioners: outlier detection
![Page 11: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/11.jpg)
Different types of fraud hierarchy
• On a patient level, or on a hospital level:
![Page 12: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/12.jpg)
Handling different hierarchy
• Creating profiles from transactional data
• Aggregating costs over a time period
  – Each record: patient
    • Each attribute i = 1 to n: cost spent on treatment i
• Feature construction, for example
  – The ratio of long/short consults (G.P.)
  – The ratio of 3-way and 2-way fillings (Dentist)
  – Usually used for one-way analysis
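The profile construction described above can be sketched as follows. This is only an illustration: the transaction tuples, the treatment names and the helper functions `build_profiles` and `ratio_feature` are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical transactions: (patient_id, treatment_code, cost_in_euro).
transactions = [
    ("p1", "short_consult", 9.0),
    ("p1", "short_consult", 9.0),
    ("p1", "long_consult", 18.0),
    ("p2", "long_consult", 18.0),
    ("p2", "long_consult", 18.0),
    ("p2", "short_consult", 9.0),
    ("p3", "short_consult", 9.0),
]

def build_profiles(rows):
    """Aggregate transactions into one record per patient: total cost per treatment."""
    profiles = defaultdict(lambda: defaultdict(float))
    for patient, treatment, cost in rows:
        profiles[patient][treatment] += cost
    return {patient: dict(costs) for patient, costs in profiles.items()}

def ratio_feature(profile, numerator, denominator):
    """Constructed feature: cost ratio of two treatments (None if the denominator is absent)."""
    denom = profile.get(denominator, 0.0)
    if denom == 0.0:
        return None
    return profile.get(numerator, 0.0) / denom

profiles = build_profiles(transactions)
long_short = {
    patient: ratio_feature(profile, "long_consult", "short_consult")
    for patient, profile in profiles.items()
}
```

The same aggregation could be keyed on practitioner instead of patient when the practitioner is the unit of interest.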
![Page 13: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/13.jpg)
Different types of fraud detection
• Supervised
  – A labeled fraud set
  – A labeled non-fraud set
  – Credit cards, debit cards
• Unsupervised
  – No labels
  – Health Insurance, Cargo, telecom, tax etc.
![Page 14: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/14.jpg)
Unsupervised learning in Health Insurance Data
• Anomaly Detection (outlier detection)
  – Finding individual deviating points
• Subgroup Discovery
  – Finding (descriptions of) deviating groups
• Focus on differences and uncommon behavior
  – In contrast to other unsupervised learning methods
    • Clustering
    • Frequent Pattern mining
![Page 15: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/15.jpg)
Subgroup Discovery
• Goal: Find differences in claim behavior of medical practitioners
• To detect inefficient claim behavior
  – Actions:
    • A visit from the account manager
    • To include in contract negotiations
  – In the extreme case: fraud
    • Investigation by the fraud detection department
• By describing deviations of a practitioner from its peers
  – Subgroups
![Page 16: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/16.jpg)
Patient-level, Subgroup Discovery
• Subgroup (orange): group of patients
• Target (red)
  – Indicates whether a patient visited a practitioner (1), or not (0)
![Page 17: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/17.jpg)
Subgroup Discovery: Quality Measures
• Target dentist: 1672 patients
  – Compare with peer group, 100,000 patients in total
• Subgroup V11 >= 42 euro: 10347 patients
  – V11: one-sided filling
• Cross table

| | target dentist | rest | total |
|---|---|---|---|
| V11 >= 42 | 871 | 9476 | 10347 |
| rest | 801 | 88852 | 89653 |
| total | 1672 | 98328 | 100000 |
![Page 18: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/18.jpg)
The cross table
• Cross table in data

| | target dentist | rest | total |
|---|---|---|---|
| V11 >= 42 | 871 | 9476 | 10347 |
| rest | 801 | 88852 | 89653 |
| total | 1672 | 98328 | 100000 |

• Cross table expected, assuming independence

| | target dentist | rest | total |
|---|---|---|---|
| V11 >= 42 | 173 | 10174 | 10347 |
| rest | 1499 | 88154 | 89653 |
| total | 1672 | 98328 | 100000 |
![Page 19: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/19.jpg)
Calculating Wracc and Lift
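• Size subgroup = $P(S) = 0.10347$, size target dentist = $P(T) = 0.01672$
• Weighted Relative ACCuracy (WRAcc) = $P(S \cap T) - P(S)P(T) = (871 - 173)/100000 = 698/100000$
• Lift = $\dfrac{P(S \cap T)}{P(S)P(T)} = 871/173 = 5.03$

Cross table expected (assuming independence):

| | target dentist | rest | total |
|---|---|---|---|
| V11 >= 42 | 173 | 10174 | 10347 |
| rest | 1499 | 88154 | 89653 |
| total | 1672 | 98328 | 100000 |

Cross table in data:

| | target dentist | rest | total |
|---|---|---|---|
| V11 >= 42 | 871 | 9476 | 10347 |
| rest | 801 | 88852 | 89653 |
| total | 1672 | 98328 | 100000 |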
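As a check on the arithmetic, the two measures can be computed directly from the counts in the cross table. The function name `wracc_and_lift` and its parameter names are chosen here for illustration only.

```python
def wracc_and_lift(n_subgroup_target, n_subgroup, n_target, n_total):
    """Compute WRAcc and lift from the counts of a 2x2 cross table."""
    p_s = n_subgroup / n_total          # P(S): relative size of the subgroup
    p_t = n_target / n_total            # P(T): relative size of the target
    p_st = n_subgroup_target / n_total  # P(S and T): subgroup patients of the target dentist
    wracc = p_st - p_s * p_t            # difference between observed and expected
    lift = p_st / (p_s * p_t)           # ratio of observed to expected
    return wracc, lift

# Counts from the dentist example: subgroup V11 >= 42 euro.
wracc, lift = wracc_and_lift(871, 10347, 1672, 100000)
```

With these counts WRAcc comes out at about 0.00698 (698 patients per 100,000) and the lift at about 5.03.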
![Page 20: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/20.jpg)
Example dentistry, at depth 1, one target dentist
![Page 21: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/21.jpg)
ROC analysis, target dentist
![Page 22: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/22.jpg)
Making SD more useful: adding prior knowledge
• Adding prior knowledge
  – Background variables patient (age, gender, etc.)
  – Specialism practitioner
  – For dentistry: choice of insurance
• Adding already known differences
  – Already detected by domain experts themselves
  – Already detected during a previous data mining run
![Page 23: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/23.jpg)
Prior Knowledge, Motivation
![Page 24: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/24.jpg)
Example, influence of prior knowledge
![Page 25: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/25.jpg)
The idea: create an expected cross table using prior knowledge
![Page 26: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/26.jpg)
Quality Measures
• Ratio (Lift)
• Difference (WRAcc)
• Squared sum (Chi-square statistic)
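A minimal sketch of the three measures, each comparing an observed cross table with an expected one. Here the expected table is derived under independence; passing in an expected table built from prior knowledge instead is an assumption about how the idea from the previous slides would plug in, not something the slides spell out. The function names are illustrative.

```python
def expected_counts(observed):
    """Expected 2x2 counts assuming independence of subgroup and target."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def quality_measures(observed, expected):
    """Return (lift, wracc, chi2) for the subgroup-and-target cell of a 2x2 table."""
    n = sum(sum(row) for row in observed)
    o, e = observed[0][0], expected[0][0]
    lift = o / e                 # ratio
    wracc = (o - e) / n          # difference
    chi2 = sum(                  # squared differences, scaled by the expected counts
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    return lift, wracc, chi2

# Rows: subgroup / rest. Columns: target dentist / rest (dentist example).
observed = [[871, 9476], [801, 88852]]
expected = expected_counts(observed)
lift, wracc, chi2 = quality_measures(observed, expected)
```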
![Page 27: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/27.jpg)
Example, iterative approach
• Idea: add subgroup to prior knowledge iteratively
• Target = single pharmacy
• Patients that visited the hospital in the last 3 years removed from data
• Compare with peer group (400,000 patients); 2929 patients of target pharmacy
• Top subgroup: “B03XA01 (Erythropoietin) > 0 euro”

| | target pharmacy | rest |
|---|---|---|
| B03XA01 > 0 | 1297 | 224 |
| rest | 1632 | 396,847 |
![Page 28: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/28.jpg)
Next iteration
• Add “B03XA01 (EPO) > 0 euro” to prior knowledge
• Next best subgroup: “N05AX08 (Risperdal) >= 500 euro”
![Page 29: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/29.jpg)
Figure describing subgroup: N05AX08 > 500
Left: target pharmacy, right: other pharmacies
![Page 30: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/30.jpg)
Addition: adding costs to quality measure
– M55: dental cleaning
– V11: 1-way filling
– V21: polishing
• Cost of treatments in subgroup: about 370 euro (average)
• 791 more patients than expected
• Total quality: 791 × ~370 ≈ 292,469 euro
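The cost-based quality is the excess number of patients weighted by the average treatment cost. A small sketch, with a function name chosen for illustration:

```python
def cost_based_quality(excess_patients, average_cost):
    """Excess patients (observed minus expected) weighted by the average treatment cost."""
    return excess_patients * average_cost

# Rounded inputs as shown on the slide.
quality = cost_based_quality(excess_patients=791, average_cost=370)
```

With the rounded inputs this gives 292,670 euro. The slide reports 292,469 euro, which presumably (an assumption) comes from the unrounded average cost.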
![Page 31: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/31.jpg)
Iterative approach, top 3 subgroups
V12: 2-sided filling
V21: polishing
V60: indirect pulp capping
V21 and V60 are not allowed on the same day
Claim back (from all dentists): 1.3 million euro
![Page 32: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/32.jpg)
3d isometrics, cost based QM
![Page 33: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/33.jpg)
![Page 34: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/34.jpg)
Other target types: double binary target
• Target 1: year: 2009 or 2008
• Target 2: target practitioner
• Pattern:
  – M59: extensive (expensive) dental cleaning
  – C12: second consult in one year
• Crosstable:
![Page 35: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/35.jpg)
Other target types: Multiclass target
• Subgroup (orange): group of patients
• Target (red) is now a multi-value column, one value per dentist
![Page 36: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/36.jpg)
Multiclass target, in ROC Space
![Page 37: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/37.jpg)
Anomaly Detection
The example above contains a contextual anomaly...
![Page 38: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/38.jpg)
Outline Anomaly Detection
• Anomalies
  – Definition
  – Types
  – Technique categories
  – Examples
• Lecture based on
  – Chandola et al. (2009). Anomaly Detection: A Survey
  – Paper in BB
![Page 39: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/39.jpg)
Definition
• “Anomaly detection refers to the problem of finding patterns in data that do not conform to expected behavior”
• Anomalies, a.k.a.
  – Outliers
  – Discordant observations
  – Exceptions
  – Aberrations
  – Surprises
  – Peculiarities
  – Contaminants
![Page 40: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/40.jpg)
Anomaly types
• Point anomalies
  – A data point is anomalous with respect to the rest of the data
![Page 41: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/41.jpg)
Not covered today
• Other types of anomalies:
  – Collective anomalies
  – Contextual anomalies
• Other detection approaches:
  – Supervised learning
  – Semi-supervised
    • Assume training data is from normal class
    • Use to detect anomalies in the future
![Page 42: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/42.jpg)
We focus on outlier scores
• Scores
  – You get a ranked list of anomalies
  – “We investigate the top 10”
  – “An anomaly has a score of at least 134”
  – Leads followed by fraud investigators
• Labels
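The difference between scores and labels can be illustrated with a short sketch. The practitioner ids and score values are hypothetical; the threshold of 134 is the one quoted on the slide.

```python
def rank_anomalies(scores, top_k):
    """Return the top_k record ids ordered by descending outlier score."""
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def label_anomalies(scores, threshold):
    """Turn scores into labels: True means the score reaches the threshold."""
    return {record: score >= threshold for record, score in scores.items()}

# Hypothetical outlier scores per practitioner id.
scores = {"A": 12.0, "B": 250.5, "C": 134.0, "D": 87.3}
leads = rank_anomalies(scores, top_k=2)          # "we investigate the top 2"
labels = label_anomalies(scores, threshold=134)  # "an anomaly has a score of at least 134"
```

A ranked list lets investigators decide how many leads to follow up, whereas labels fix that decision in advance.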
![Page 43: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/43.jpg)
Detection method categorisation
1. Model based
2. Depth based
3. Distance based
4. Information theory related (not covered)
5. Spectral theory related (not covered)
![Page 44: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/44.jpg)
Model based
• Build a (statistical) model of the data
• Normal data instances occur in high probability regions of a stochastic model, while anomalies occur in low probability regions
• Or: data instances that have a high distance to the model are outliers
• Or: data instances that have a high influence on the model are outliers
![Page 45: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/45.jpg)
Example: one way outlier detection
• Pharmacy records
• Records represent patients
• One attribute at a time:
  – This example: attribute describing the costs spent on fertility medication (gonadotropin) in a year
• We could use such one-way detection for each attribute in the data
![Page 46: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/46.jpg)
Example, model = parametric probability density function
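A minimal sketch of a parametric one-way detector. It assumes, purely for illustration, that yearly costs are roughly log-normal, so a normal distribution is fitted to the log costs; the slide does not say which density was used. The cost values are hypothetical.

```python
import math
from statistics import NormalDist

def parametric_scores(costs):
    """Outlier score = negative log density under a normal model fitted to log costs."""
    logs = [math.log(c) for c in costs]       # costs must be strictly positive
    model = NormalDist.from_samples(logs)     # fit mean and standard deviation
    return [-math.log(model.pdf(x)) for x in logs]

# Hypothetical yearly costs (euro) for one medication attribute.
costs = [100, 120, 130, 150, 160, 170, 180, 200, 5000]
scores = parametric_scores(costs)
most_anomalous = costs[scores.index(max(scores))]
```

Records in low-density regions of the fitted model receive the highest scores.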
![Page 47: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/47.jpg)
Example, model = non-parametric distribution
• Left: kernel density estimate
• Right: boxplot
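The boxplot view corresponds to a simple non-parametric rule: flag values beyond the whiskers. The sketch below uses the common 1.5 × IQR whisker length, which is an assumption here, and hypothetical cost values.

```python
import statistics

def boxplot_outliers(values, whisker=1.5):
    """Return the values lying outside the boxplot whiskers (Tukey fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical yearly costs (euro) for one medication attribute.
costs = [100, 120, 130, 150, 160, 170, 180, 200, 5000]
outliers = boxplot_outliers(costs)
```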
![Page 48: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/48.jpg)
Example: regression model
![Page 49: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/49.jpg)
Other models possible
• Probabilistic
  – Bayesian networks
• Regression models
  – Regression trees / random forests
  – Neural networks
• Outlier score = prediction error (residual)
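A sketch of the residual-as-score idea, using ordinary least squares on a single predictor as a stand-in for the richer models listed above. The data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_scores(xs, ys):
    """Outlier score per record = absolute prediction error of the fitted model."""
    slope, intercept = fit_line(xs, ys)
    return [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]

# Hypothetical attribute pairs; the fourth record deviates from the trend.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 20, 10, 12]
scores = residual_scores(xs, ys)
```

Note that a strong outlier also pulls the fitted line towards itself, so robust fitting may be preferable in practice.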
![Page 50: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/50.jpg)
Depth based methods
• Applied on 1-4 dimensional datasets
  – Or 1-4 attributes at a time
• Objects that have a high distance to the “center of the data” are considered outliers
• Example Pharmacy:
  – Records represent patients
  – 2 attributes:
    • Costs spent on diabetes medication
    • Costs spent on diabetes testing material
![Page 51: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/51.jpg)
Example: bagplot, halfspace depth
![Page 52: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/52.jpg)
Distance based (nearest neighbor based)
• Assumption:
  – Normal data instances occur in dense neighbourhoods, while anomalies occur far from their closest neighbours
![Page 53: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/53.jpg)
Similarity/distance
• You need a similarity measure between two data points
  – Numeric attributes: Euclidean, etc.
  – Nominal: simple match often enough
  – Multivariate:
    • Distance using all attributes
    • Distance between attribute values, then combine
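One way to combine the two kinds of attributes is sketched below: Euclidean distance on the numeric attributes plus a simple mismatch count on the nominal ones. Adding the two parts with equal weight is an assumption for illustration, as are the attribute names; numeric attributes are assumed to be scaled to comparable ranges beforehand.

```python
import math

def mixed_distance(a, b, numeric_keys, nominal_keys):
    """Euclidean distance on numeric attributes plus mismatch count on nominal ones."""
    numeric = math.sqrt(sum((a[k] - b[k]) ** 2 for k in numeric_keys))
    nominal = sum(1 for k in nominal_keys if a[k] != b[k])  # simple match: 0 if equal, 1 if not
    return numeric + nominal

# Hypothetical practitioner records with pre-scaled numeric attributes.
a = {"cost": 0.2, "visits": 0.5, "specialism": "dentist"}
b = {"cost": 0.5, "visits": 0.9, "specialism": "orthodontist"}
d = mixed_distance(a, b, ["cost", "visits"], ["specialism"])
```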
![Page 54: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/54.jpg)
Example, dentistry data
• Records represent dentists
• Attributes are 14 cost categories
  – Denote the percentage of patients that received a claim from the category
![Page 55: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/55.jpg)
Option 1: Distance to kth neighbour as anomaly score
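A brute-force sketch of this score on a small hypothetical 2-D data set. It compares every pair of points, so it is only meant for illustration, not for data of the size mentioned earlier.

```python
import math

def kth_neighbour_distance(points, k):
    """Anomaly score per point: Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical data: four points in a dense square and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = kth_neighbour_distance(points, k=2)
```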
![Page 56: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/56.jpg)
Option 2: Use relative densities of neighbourhoods
• Density of neighbourhood estimated for each instance
• Instances in the low density neighbourhoods are anomalous, others normal
• Note:
  – Distance to kth neighbour is an estimate for the inverse of density (large distance ⇒ low density)
  – But this estimates outliers in varying density neighbourhoods badly
![Page 57: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/57.jpg)
LOF
• Local Outlier Factor:

$$\mathrm{LOF} = \frac{\text{average local density of the } k \text{ nearest neighbours}}{\text{local density of the instance}}$$

• Local density:
  – k divided by the volume of the smallest hyper-sphere centred around the instance, containing k neighbours
• Anomalous instance:
  – Local density will be lower than that of the k nearest neighbours
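The ratio on this slide can be sketched directly. This follows the simplified definition given here (density = k divided by the hyper-sphere volume); the original LOF formulation by Breunig et al. uses reachability distances instead, so the numbers will differ from a library implementation. The volume constant is dropped because it cancels in the ratio, and the points are assumed to be distinct so that no radius is zero.

```python
import math

def simplified_lof(points, k):
    """Ratio of the neighbours' average local density to the instance's own local density."""
    dim = len(points[0])
    neighbours, density = [], []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
        radius = ranked[-1][0]             # radius of the smallest sphere holding k neighbours
        neighbours.append([j for _, j in ranked])
        density.append(k / radius ** dim)  # volume is proportional to radius ** dim
    return [
        (sum(density[j] for j in neighbours[i]) / k) / density[i]
        for i in range(len(points))
    ]

# Hypothetical data: four points in a dense square and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = simplified_lof(points, k=2)
```

Scores close to 1 indicate a point whose density matches that of its neighbours; much larger values indicate an outlier.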
![Page 58: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/58.jpg)
Example LOF outlier, dentistry
![Page 59: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/59.jpg)
3. Clustering based anomaly detection techniques
• 3 possibilities:
  1. Normal data instances belong to a cluster in the data, while anomalies do not belong to any cluster
     – Use clustering methods that do not force all instances to belong to a cluster
       • DBSCAN, ROCK, SNN
  2. Distance to the cluster center = outlier score
  3. Clusters with too few points are outlying clusters
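Possibility 2 can be sketched with a plain k-means loop followed by a distance-to-centre score. The data points, the initial centres and the fixed number of iterations are hypothetical choices for illustration.

```python
import math

def kmeans(points, centres, iterations=10):
    """Plain Lloyd iterations starting from the given initial centres."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda c: math.dist(p, centres[c]))
            clusters[nearest].append(p)
        # New centre = mean of the members; an empty cluster keeps its old centre.
        centres = [
            tuple(sum(v) / len(members) for v in zip(*members)) if members else centre
            for members, centre in zip(clusters, centres)
        ]
    return centres

def centre_distance_scores(points, centres):
    """Outlier score per point = distance to the closest cluster centre."""
    return [min(math.dist(p, c) for c in centres) for p in points]

# Hypothetical data: two dense groups and one point that fits neither well.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9), (4, 15)]
centres = kmeans(points, centres=[(0, 0), (9, 9)])
scores = centre_distance_scores(points, centres)
```

Because k-means assigns every point to some cluster, the outlier also shifts the centre of the cluster it joins; this is one reason possibility 1 prefers methods that can leave points unassigned.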
![Page 60: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/60.jpg)
K-means with 6 clusters, centers of the dentistry data set
• Attributes: percentage of patients that received a claim from the cost category
• Clusters correspond to specialism
  1. Dentist
  2. Orthodontist
  3. Orthodontist (charged by dentist)
  4. Dentist
  5. Dentist
  6. Dental hygienist
![Page 61: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/61.jpg)
Combining Subgroup Discovery and Outlier Detection
• Describe regions with outliers using SD
• Identify suspicious medical practitioners
• 2 or 3 step approach to describe outliers:
  1. Calculate outlier score
  2. Use subgroup discovery to describe regions with outliers
  3. (optional) Identify the involved medical practitioners
![Page 62: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/62.jpg)
Example output:
• Look at patients with ‘P30>1050 euro’ for practitioner number 221
• Left: all data, right: practitioner 221
![Page 63: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/63.jpg)
Descriptions of outliers: LOCI outlier score
• 1. Calculate outlier score
  – LOCI is a density based outlier score
• 2. Describe outlying regions
• Result top subgroup:
  – Orthodontics (dentist) 0.044 ^ Orthodontics 0.78
  – Group of 9 dentists with an average score of 3.9
![Page 64: Data mining in Health Insurance](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813c02550346895da55f0b/html5/thumbnails/64.jpg)
Conclusions
• Health insurance: Interesting application domain
  – Very relevant
• Outlier Detection and Subgroup discovery are useful