Using data mining to collect taxes: development and implementation of automated collection in the Danish public sector

Mads Krogh Nielsen

Danish Ministry of Taxation

[email protected]

More efficient collection in SKAT

27 January 2012

The motivation for a shift in regime

• National Audit Office, 2003: "The IT-based structures are ineffective".
• New structure in SKAT: centralized collection, from 30 centres to 6 regions.
• Staff reduction under a 4-year plan: 12,000 FTE in 2006, 7,000 FTE in 2010.

Also, great potential:

• Shift in strategy: the OECD compliance approach rather than focusing solely on control.
• Denmark already has a single ID code for every individual and company, which gives a great opportunity for data analysis.
• Established cooperation with many third-party partners such as banks, other authorities, employers and mortgage institutions.

[Chart: A need for improved efficiency in the public sector. Vertical axis from 0 to 10,000; horizontal axis from 2007 to 2013.]

What do we do to achieve this?

1. Higher rate of efficiency: doing the right things at the right time
2. Maximum use of automation: intelligent systems, knowledge-based models, and using data we already have available
3. Automated collection

The EFI system with graphic representation

• Decisions
• Dialogue
• The Nervous System
• Adjustment of the comprehensive machinery
• Proper manning at the right time and place
• Collection Engine
• Ensures a uniform and fair collection
• Approved collection strategies

$$Score = \sum_i Score(B_i)$$

$$\log(\text{odds of being a good account}) = B_0 + B_1 \cdot Var_1 + B_2 \cdot Var_2 + \dots + B_n \cdot Var_n$$

$$Score(B_0) = E + D \cdot B_0 / \log(2)$$

$$Score(B_i) = D \cdot B_i / \log(2)$$

So how does it work, this modelling?
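The scaling of regression coefficients into scorecard points can be sketched in a few lines of Python. This is a minimal illustration and not SKAT's implementation. It assumes E = 500 (the score at even odds) and D = 40 (the points that double the odds), values taken from the later slides of this deck, and it assumes each variable's points are multiplied by the value of that variable. The coefficient names and values are hypothetical, and the intercept coefficient is back-calculated so that it lands on the 289 points shown on the scorecard slide.

```python
import math

# Assumed scaling constants, read off the later slides of this deck:
# 500 points = even odds, 40 points double the odds.
E = 500.0
D = 40.0


def score_intercept(b0: float) -> float:
    """Points for the intercept: Score(B_0) = E + D * B_0 / log(2)."""
    return E + D * b0 / math.log(2)


def score_coefficient(bi: float) -> float:
    """Points per unit of a variable: Score(B_i) = D * B_i / log(2)."""
    return D * bi / math.log(2)


def total_score(b0: float, coefficients: dict, values: dict) -> float:
    """Sum the intercept points and the points of every variable.

    Assumption: the points of a variable are multiplied by its value
    (1 or 0 for an indicator variable).
    """
    points = score_intercept(b0)
    for name, bi in coefficients.items():
        points += score_coefficient(bi) * values.get(name, 0.0)
    return points


# Hypothetical coefficients, for illustration only.
# B0_EXAMPLE is back-calculated to give the 289 intercept points.
B0_EXAMPLE = (289 - E) * math.log(2) / D
COEFFS_EXAMPLE = {"married": 0.35, "unpaid_tv_license": -0.90}

if __name__ == "__main__":
    print(round(score_intercept(B0_EXAMPLE)))
    print(round(total_score(B0_EXAMPLE, COEFFS_EXAMPLE, {"married": 1}), 1))
```

Dividing by log(2) is what makes D the number of points that double the odds: a coefficient equal to log(2) doubles the odds and is worth exactly D points.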

Risk assessment: the projection of risk based on events and/or states that have occurred.

First we must agree on a "Definition of Bad".

Then we find what characterizes such a person/company.

Subsequently we are able to score individuals/companies who have not yet failed, based on the results of other individuals.

[12 months of non-payment] = [Properties of the individual]

So how does it work, this modelling?

Scoring the Danes

Definition of Bad, the very core of the technology: BAD

The toughest decision!

"Customers who have not paid on their debts in the last 12 months".
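A definition like this becomes the target label for the model. The sketch below shows one way to express it in Python. The function name, the input (a list of payment dates per customer) and the use of 365 days for "12 months" are assumptions made for illustration, not details from the slides.

```python
from datetime import date, timedelta


def is_bad(payment_dates, as_of, window_days=365):
    """Return True if the customer made no payment in the window before as_of.

    Assumption: "the last 12 months" is approximated by 365 days.
    """
    cutoff = as_of - timedelta(days=window_days)
    return not any(cutoff < paid <= as_of for paid in payment_dates)


if __name__ == "__main__":
    scoring_day = date(2012, 1, 27)
    print(is_bad([date(2010, 5, 1)], scoring_day))   # last payment is too old
    print(is_bad([date(2011, 6, 1)], scoring_day))   # paid within the window
```

A customer with no recorded payments at all is also labelled bad under this sketch, which may or may not match the definition used in practice.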

Daily sequence of the scoring

• Segmenting: grouping of customers into segments by simple queries (companies, persons, sole proprietorships).
• Scorecard: calculation of the debtor's score based on the scorecard of the segment.
• Cut-off: the calculated score is placed in intervals that group the debtors (0–25, 26–40, 41–90, 91–100).
• Tracks: each group is attached to a number that indicates the collection effort (placement on track 1, 2, 3 or 4).
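The cut-off step can be sketched as a simple interval lookup. The slide lists four intervals and four tracks but does not state which interval belongs to which track, so the mapping below (intervals in ascending order to tracks 1 to 4) is an assumption, as are the function name and the integer 0 to 100 score scale.

```python
from bisect import bisect_left

# Upper bounds of the intervals 0-25, 26-40, 41-90 and 91-100 from the slide.
UPPER_BOUNDS = [25, 40, 90, 100]


def track_for(score: int) -> int:
    """Map a score between 0 and 100 to a track number between 1 and 4.

    Assumption: the intervals map to tracks 1..4 in ascending order.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return bisect_left(UPPER_BOUNDS, score) + 1


if __name__ == "__main__":
    for s in (0, 25, 26, 40, 41, 90, 91, 100):
        print(s, track_for(s))
```

Keeping the bounds in one list means the cut-offs can be changed without touching the lookup logic.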

After this, it is the tracks that execute, and save resources:

Various payment strategies, ranked by "toughness":

• "A nice letter"
• Telephone
• Debt collection (incasso)

DW

Analysis of more than 200 parameters.

From 70 to 50:

Significant parameters on B2C:

DW

Employment_CAT = 6

MARITAL_STATE

Arrears_amt

Nbr_claims_last_4_agreements

N_cars_owned

N_houses_08

Assets_08

AGE_YRS

COMMUNE_CODE

Debt_National_Train_Company

Debt_National_Broadcast_license

AVG_OWNERSHIP_SHARE

RELATIONSHIP_TO_COMPANY

TOTAL_ARREAR_OPEN_BOD

LARGESTUDENTDEBT

TOTAL_ARREAR_OPEN_BOD

TOTAL_AGREEMENT_BOD_2

TOTAL_HOUSES_VALUE

ARREAR

What are we looking for?

Scoring the Danes

Univariate analysis, followed by multivariate analysis (Least Angle Regression)
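The univariate step can be illustrated with a small screening routine that ranks candidate variables by the strength of their individual relationship with the bad flag. This sketch uses plain Pearson correlation as the univariate measure, which is an assumption: the slides do not say which statistic was used. The variable names, toy data and threshold are hypothetical. The multivariate Least Angle Regression step is not shown; in practice it would normally come from a statistics library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable carries no information
    return cov / math.sqrt(var_x * var_y)


def screen(candidates, bad_flags, threshold=0.1):
    """Return variable names whose absolute correlation with the bad flag
    reaches the threshold, strongest first."""
    ranked = sorted(
        ((abs(pearson(values, bad_flags)), name)
         for name, values in candidates.items()),
        reverse=True,
    )
    return [name for strength, name in ranked if strength >= threshold]


if __name__ == "__main__":
    # Hypothetical toy data: 1 = bad, 0 = good.
    bad = [1, 1, 0, 0, 1, 0]
    variables = {
        "arrears_amt": [5, 4, 1, 0, 6, 1],
        "constant": [1, 1, 1, 1, 1, 1],
    }
    print(screen(variables, bad))
```

Variables that survive such a screening would then be passed to the multivariate step, which decides which of them remain in the final scorecard.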

Scorecard Persons: Total Score

• Intercept: 289 points
• Demographic information (max points = 98)
• Company involvement (max points = 13)
• Income and asset information (max points = 197)
• Special debt information (max points = 79)
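As a quick arithmetic check, the highest total a person can reach on this scorecard is the intercept plus the maximum of every group. The sketch below only adds up the figures from the slide; the dictionary keys are shorthand chosen for illustration.

```python
# Figures taken from the Scorecard Persons slide.
INTERCEPT = 289
MAX_POINTS = {
    "demographic": 98,
    "company_involvement": 13,
    "income_and_assets": 197,
    "special_debt": 79,
}


def max_total_score() -> int:
    """Intercept plus the maximum points of every group."""
    return INTERCEPT + sum(MAX_POINTS.values())


if __name__ == "__main__":
    print(max_total_score())  # 289 + 98 + 13 + 197 + 79 = 676
```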

[Figure: the cast of example debtors: Yvonne, Willem Jr., Dorthe, Moped Mullen, Jane and Gunnar.]

Willem Jr.

• Age: 23 years
• Lives in Allerød municipality (201)
• Married to Dorthe; they live together
• Works in a bank (not involved in any owner relationship)
• Owes 600 kr. in overpaid wages
• Has never owed money before
• Has no unpaid TV license, train fines or large student debt

Portrait of a good guy

Willem Jr. score = 619 points

Moped Mullen

Portrait of a sinner

• Age: 28 years
• Lives alone on Lolland (rural area)
• Single
• Is co-owner of an MLM company
• Has many payment agreements, which he keeps up with very poorly
• The latest challenge in a long line is child support owed to Jane (even though he claims it is not his kid)
• Has an unpaid TV license and a train fine, but no student grants to repay

Moped Mullen score = 328 points

Portrait of another guy

Yvonne

• Age: 27 years
• Lives in Lemvig municipality
• Married to and lives with Gunnar
• Works at Matas
• Has an earlier payment agreement (3 parking tickets), on which she pays every month
• Her latest challenge is a personal tax debt
• Has a large student debt, as she studied music therapy in Aalborg
• Has no unpaid DSB fine or TV license

Yvonne score = 584 points

Portrait of Yvonne

Connection between the score and the probability of keeping a payment agreement

• 500 points = fifty-fifty chance
• Points that double the odds = 40
• Intercept = 289

[Chart: probability of keeping the agreement against score. Score axis: 300, 380, 460, 500, 540, 620, 700. Marked probabilities: 11% at 380, 33% at 460, 50% at 500, 67% at 540, 89% at 620.]
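The relationship on this slide follows from the two constants it states: even odds at 500 points, and a doubling of the odds for every 40 points. A minimal sketch of the conversion, with hypothetical function and parameter names, could look like this:

```python
def odds_from_score(score: float, even: float = 500.0, pdo: float = 40.0) -> float:
    """Odds of keeping the agreement: 1:1 at `even`, doubling every `pdo` points."""
    return 2.0 ** ((score - even) / pdo)


def probability_from_score(score: float, even: float = 500.0, pdo: float = 40.0) -> float:
    """Convert the odds into a probability between 0 and 1."""
    odds = odds_from_score(score, even, pdo)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    for s in (380, 460, 500, 540, 620):
        print(s, round(probability_from_score(s), 2))
```

Under the same assumptions, the three portrait scores in this deck would correspond to roughly 89% (619 points), 81% (584 points) and 5% (328 points).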

Connection between the score and the probability of keeping a payment agreement

[Chart: the same score-to-probability curve (500 points = fifty-fifty chance, points that double the odds = 40, intercept = 289), with three example persons, labelled Søren, Kaj and Yvonne on the slide, marked at the scores 619, 584 and 328.]

Persons Score Card

Bad rate in Danish municipalities

Conclusion

These improvements materialize as follows:

• Promising automated handling of the collection process.
• An effective and swift iterative process thanks to the modelling software (able to do ETL and analysis in one operation).
• Striking reductions in collection costs and process cycle time as a result of the automation.
• Higher service level due to standardization and better resource distribution.

Thank you for your attention…

[email protected]