Using data mining to collect taxes - development and implementation of automated collection in the Danish public sector
Mads Krogh Nielsen
Danish Ministry of Taxation
More efficient collection in SKAT
27 January 2012
The motivation for a shift in regime
• National Audit Office 2003: "The IT-based structures are ineffective".
• New structure in SKAT – centralized collection, 30 centres -> 6 regions
• Staff reduction over a 4-year plan: 12,000 FTE in 2006, 7,000 FTE in 2010
Also – great potential:
• Shift in strategy – OECD Compliance approach rather than solely focusing on control.
• Denmark already has a single ID code for every individual and company, which offers a great opportunity for data analysis.
• Established cooperation with many third-party partners such as banks, other authorities, employers and mortgage institutions.
[Chart: vertical axis from 0 to 10,000, horizontal axis with the years 2007 to 2013]
A need for improved efficiency in the public sector
What do we do to achieve this?
1. Higher rate of efficiency - Doing the right things at the right time
2. Maximum use of automation - Intelligent systems – Knowledge-based models
   - Using data we already have available
3. Automated collection
The EFI system with graphic representation
• Decisions
• Dialogue
• The Nervous System: adjustment of the comprehensive machinery; proper manning at the right time and place
• Collection Engine: ensures a uniform and fair collection
• Approved collection strategies
$$\log(\text{odds of being a good account}) = B_0 + B_1 \cdot Var_1 + B_2 \cdot Var_2 + \dots + B_n \cdot Var_n$$
$$Score(B_0) = E + D \cdot B_0 / \log(2)$$
$$Score(B_i) = D \cdot B_i / \log(2)$$
$$Score = \sum_i Score(B_i)$$
So how does it work, this modelling?
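The scorecard formulas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not define E and D, so the values 500 (the score at even odds) and 40 (the points that double the odds) are taken from the later slides of this deck, each Var_i is assumed to be a 0/1 indicator, and the function names are hypothetical.

```python
import math

# Assumed scaling constants, read off the later slides of this deck:
# E is the score at even odds, D is the number of points that doubles the odds.
E = 500.0
D = 40.0


def intercept_points(b0, e=E, d=D):
    """Score(B_0) = E + D * B_0 / log(2)."""
    return e + d * b0 / math.log(2)


def attribute_points(b_i, d=D):
    """Score(B_i) = D * B_i / log(2), for an attribute that applies (Var_i = 1)."""
    return d * b_i / math.log(2)


def total_score(b0, coefficients):
    """Intercept points plus the points of every attribute that applies."""
    return intercept_points(b0) + sum(attribute_points(b) for b in coefficients)
```

With these constants, a coefficient of log(2), which doubles the odds, is worth 40 points, and a debtor whose log-odds sum to zero lands on 500 points.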
Risk assessment – the projection of risk based on events and/or states that have occurred.
First we must agree on a "Definition of Bad".
Then we find what characterizes such a person/company.
Subsequently we are able to score individuals/companies who have not yet failed, based on the results of other individuals.
[12 months of non-payment] = [Properties of the individual]
So how does it work, this modelling?
Scoring the Danes
Definition of Bad – the very core of the technology: BAD
The toughest decision!
"Customers who did not pay on their debts in the last 12 months".
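As an illustration of how such a definition could be turned into a label for modelling, here is a minimal sketch. The function name, the input shape (a list of payment dates per customer) and the approximation of 12 months as 365 days are all assumptions, not taken from the EFI system.

```python
from datetime import date, timedelta


def is_bad(payment_dates, observation_date, window_days=365):
    """Return True when no payment falls in the window ending at observation_date.

    window_days=365 is an assumed approximation of "the last 12 months".
    """
    window_start = observation_date - timedelta(days=window_days)
    return not any(
        window_start < paid <= observation_date for paid in payment_dates
    )
```

A customer with no payments at all, or only payments older than the window, is flagged as bad; a single payment inside the window is enough to count as good under this sketch.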
Daily sequences of the scoring
Segmenting -> Scorecard -> Cut off -> Tracks
• Segmenting: grouping of customers in segments by simple queries (Companies, Persons, Sole proprietorships).
• Scorecard: calculation of the debtor's score based on the scorecard of the segment.
• Cut off: the calculated score is placed in intervals that give a grouping of debtors (0 – 25, 26 – 40, 41 – 90, 91 – 100).
• Tracks: the group is attached to a number which indicates the collection effort (placement on track 1, 2, 3 or 4).
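The daily sequence can be sketched as a small pipeline. This is a sketch under assumptions: the slide lists four intervals and four tracks, and pairing them in order is a guess; how scorecard points map onto the 0 to 100 cut-off scale is not stated, so the scorecards here are plain callables that return a value on that scale. All names are hypothetical.

```python
# Assumed pairing of the slide's intervals with tracks 1 to 4, in order.
CUT_OFFS = [(0, 25, 1), (26, 40, 2), (41, 90, 3), (91, 100, 4)]


def track_for(score):
    """Cut off: place a score in its interval and return the track number."""
    rounded = round(score)  # the intervals are given as whole numbers
    for low, high, track in CUT_OFFS:
        if low <= rounded <= high:
            return track
    raise ValueError(f"score {score} is outside the 0-100 cut-off scale")


def daily_run(debtors, scorecards):
    """Segment each debtor, score with the segment's scorecard, assign a track."""
    placements = {}
    for debtor in debtors:
        segment = debtor["segment"]              # segmenting by a simple lookup
        score = scorecards[segment](debtor)      # scorecard of that segment
        placements[debtor["id"]] = track_for(score)
    return placements
```

Keeping the cut-offs in one table means the intervals can be adjusted without touching the scorecards or the track logic.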
DW
After this, it is the tracks that execute, and save resources:
Various payment strategies, by "toughness":
• "A nice letter"
• Telephone
• Debt collection
Analysis of 200+ parameters.
From 70 to 50:
Significant parameters on B2C:
DW
Employment_CAT = 6
MARITAL_STATE
Arrears_amt
Nbr_claims_last_4_agreements
N_cars_owned
N_houses_08
Assets_08
AGE_YRS
COMMUNE_CODE
Debt_National_Train_Company
Debt_National_Broadcast_license
AVG_OWNERSHIP_SHARE
RELATIONSHIP_TO_COMPANY
TOTAL_ARREAR_OPEN_BOD
LARGESTUDENTDEBT
TOTAL_ARREAR_OPEN_BOD
TOTAL_AGREEMENT_BOD_2
TOTAL_HOUSES_VALUE
ARREAR
What are we looking for?
Scoring the Danes
Univariate – followed by Multivariate (Least Angle Regression)
Scorecard Persons
• Intercept information: 289 points
• Demographic information (max points = 98)
• Company involvement (max points = 13)
• Income and asset information (max points = 197)
• Special debt information (max points = 79)
Total Score
Yvonne, Willem Jr., Dorthe, Moped Mullen, Jane, Gunnar
Willem Jr.
Age 23 years
Lives in Allerød municipality (201)
Married to Dorthe
They live together
Works in a bank (not involved in any owner relationship)
Owes: "600 kr. in overpaid wages"
Has never before owed money
Has no debt for TV license, train tickets or large student loans
Portrait of a good guy
Willem Jr. score = 619 points
Moped Mullen
Portrait of a sinner
28 years
Lives alone on Lolland (rural area)
Single
Is co-owner of an MLM company
Has many payment agreements, which he keeps up very poorly.
The latest challenge in a long line is alimony to Jane (even though he claims it is not his kid)
Has an unpaid TV license and a train fine, but no student debt to pay back.
Moped Mullen score = 328 points
Portrait of another guy
Yvonne
Age 27 years
Lives in Lemvig municipality
Married to and lives with Gunnar
Works at Matas
Has a former payment agreement (3 parking tickets). She pays these every month.
Her latest challenge is a debt on personal tax.
She has a large student debt, as she studied music therapy in Aalborg.
No unpaid DSB fine or TV license.
Yvonne score = 584 points
Portrait of Yvonne
Connection between the score and probability of keeping a payment agreement
• 500 points = fifty-fifty chance
• Points that double the odds = 40
• Intercept = 289
Score: probability to keep the agreement
• 380: 11%
• 460: 33%
• 500: 50%
• 540: 67%
• 620: 89%
The score axis runs from 300 to 700.
Connection between the score and probability of keeping a payment agreement
The same score scale (500 points = fifty-fifty chance, intercept = 289, points that double the odds = 40), with the example persons marked:
• Names shown: Søren, Kaj, Yvonne
• Scores shown: 619, 584, 328
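The relationship between score and probability on these slides follows from the two stated anchors: 500 points is a fifty-fifty chance, and every 40 points doubles the odds. A minimal sketch of the conversion, with hypothetical function names:

```python
def odds_from_score(score, even_odds_score=500.0, points_to_double=40.0):
    """Odds of keeping the agreement: 1:1 at 500 points, doubling every 40 points."""
    return 2.0 ** ((score - even_odds_score) / points_to_double)


def probability_from_score(score):
    """Convert the odds into a probability of keeping the payment agreement."""
    odds = odds_from_score(score)
    return odds / (1.0 + odds)
```

Applied to the scale on the slide, 540 points corresponds to odds of 2:1 (about 67%) and 620 points to odds of 8:1 (about 89%), which matches the percentages shown. The same function can be applied to the example scores 619, 584 and 328.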
Persons Score Card
Bad rate in Danish municipalities
Conclusion
These improvements materialize as follows:
• A promising automated handling of the collection process.
• An effective and swift iterative process thanks to the modelling software (being able to do ETL and analysis in one operation).
• Striking reductions in collection costs and process cycle time as a result of the automation.
• Higher service level due to standardization and better resource distribution.