Legal Analytics Course - Class 7 - Binary Classification with Decision Tree Learning - Professor...
Class 7: Binary Classification & Decision Tree Learning
Legal Analytics
Professor Daniel Martin Katz
Professor Michael J Bommarito II
legalanalyticscourse.com
http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
access more at legalanalyticscourse.com
Regression to Predict Quantity: Regression Methods
Classification to Predict Category: Trees, Forests, Knn, etc.
Adapted from Slides By Victor Lavrenko and Nigel Goddard @ University of Edinburgh
Take a Look at These 12
Age  Sex     Species
72   Female  Human
3    Female  Horse
36   Male    Human
21   Male    Human
67   Male    Human
29   Female  Human
54   Male    Human
44   Male    Human
50   Male    Human
42   Female  Human
6    Male    Dog
7    Female  Human
Task = Determine Whether the Agents Will Obtain Employment
f( ) -> Job? Yes / No
Binary Classification (Supervised Learning)
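As a minimal sketch of what f( ) means in this task, the function below maps one agent to "Yes" or "No". The twelve rows are the agents listed above; the field names and the rule itself (human, aged 18 to 65) are hypothetical illustrations, not something the slides state and not a learned model.

```python
from typing import NamedTuple

class Agent(NamedTuple):
    age: int
    sex: str
    species: str

# The twelve agents from the slide.
AGENTS = [
    Agent(72, "Female", "Human"), Agent(3, "Female", "Horse"),
    Agent(36, "Male", "Human"), Agent(21, "Male", "Human"),
    Agent(67, "Male", "Human"), Agent(29, "Female", "Human"),
    Agent(54, "Male", "Human"), Agent(44, "Male", "Human"),
    Agent(50, "Male", "Human"), Agent(42, "Female", "Human"),
    Agent(6, "Male", "Dog"), Agent(7, "Female", "Human"),
]

def f(agent: Agent) -> str:
    """Binary classifier: returns exactly one of two labels.

    Assumption: this hand-written rule is invented for illustration only.
    """
    is_working_age = 18 <= agent.age <= 65
    return "Yes" if agent.species == "Human" and is_working_age else "No"

predictions = [f(agent) for agent in AGENTS]
```

In supervised learning the rule is not written by hand as it is here; it is learned from examples whose Yes / No answers are already known.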
Classification (Supervised Learning)
f( ) -> Job? Yes / No
[Figure: the Yes and No agents separated by a decision boundary]
https://www.youtube.com/watch?v=p5rTio1G4ys
Task = Determine Whether the Agents Will Obtain a Loan
f( ) -> Loan? Yes / Perhaps / No
Multi Class Classification (Supervised Learning)
[Figure: the agents grouped into Yes, No and Maybe regions]
Task = Determine the Age of the Respective Agents
f( ) -> Age? #
Regression (Supervised Learning)
Intro to Decision Tree Learning
Classification And Regression Tree (CART)
Decision Trees in Decision Theory ≠ Decision Trees in Machine Learning
Introduction to Decision Trees
Uses a set of binary rules applied to calculate a target value
Used for classification (categorical variables) or regression (continuous variables)
Different algorithms are used to determine the “best” split at a node
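The idea of binary rules leading to a target value can be sketched as a small data structure. This is one possible representation, not the course's code: the `Node` class, its field names, and the example thresholds and leaf values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union

@dataclass
class Node:
    feature: Optional[int] = None          # index of the feature the rule tests
    threshold: Optional[float] = None      # rule asked here: x[feature] > threshold ?
    yes: Optional["Node"] = None           # subtree followed when the answer is yes
    no: Optional["Node"] = None            # subtree followed when the answer is no
    value: Union[int, float, None] = None  # target value, set only on leaves

def predict(node: Node, x: Sequence[float]) -> Union[int, float]:
    """Apply binary rules from the root down until a leaf supplies the target value."""
    while node.value is None:
        node = node.yes if x[node.feature] > node.threshold else node.no
    return node.value

# Classification: leaves hold a category (here 0 or 1).
classifier = Node(feature=0, threshold=1.0, yes=Node(value=0), no=Node(value=1))
# Regression: same structure, but leaves hold a continuous value.
regressor = Node(feature=0, threshold=1.0, yes=Node(value=7.0), no=Node(value=2.5))
```

The same traversal serves both uses; only the kind of value stored in the leaves changes.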
“CART Approach” to Decision Trees
Classification And Regression Tree (CART)
https://www.youtube.com/watch?v=WOOTNBxbi8c
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
http://www.r-bloggers.com/classification-tree-models/
https://www.youtube.com/watch?v=_RxqyvRK0Rw&list=PLD0F06AA0D2E8FFBA
Given Some Data: (X1, Y1), ... , (Xn, Yn)
Now We Have a New Set of X’s
We Want to Predict the Y
CART (Classification & Regression Trees)
Form a Binary Tree that Minimizes the Error in each leaf of the tree
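What "the error in each leaf" means depends on the task. The sketch below uses two common choices, which are assumptions here since the slides do not name a specific error measure: the misclassification rate for classification and the mean squared deviation from the leaf mean for regression.

```python
from collections import Counter
from statistics import mean

def classification_leaf_error(labels):
    """Fraction of points in the leaf that disagree with the leaf's majority label."""
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1 - majority_count / len(labels)

def regression_leaf_error(values):
    """Mean squared deviation from the leaf's prediction (the mean of its Y values)."""
    prediction = mean(values)
    return mean((v - prediction) ** 2 for v in values)

# A leaf holding four 0s and one 1: one of its five points is misclassified.
mixed_leaf = classification_leaf_error([0, 0, 0, 0, 1])
# A leaf holding Y values 1.0 and 3.0 predicts 2.0 for both.
spread_leaf = regression_leaf_error([1.0, 3.0])
```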
[Figure: scatter plot of training points labeled 0 or 1 in the (Xi1, Xi2) plane]
Adapted from Example By Mathematical Monk
We want to build an approach which can lead to the proper classification (labeling) of new data points ( ) that are dropped into this space
Let's Begin to Partition the Space
[Figure: the scatter plot with axis ticks at 1 and 2 on Xi1 and Xi2, with split 1 drawn and zone (a) marked]
This Split Will Be Memorialized in the Tree
We Ask the Question: is Xi1 > 1 ? - with a binary (yes or no) response
Tree so far:
  Xi1 > 1 ? (split 1), with branches Yes and No
If No - then we are in zone (a) ... we tally the number of zeros and ones
Using majority rule, we assign a classification to this leaf
Tree so far:
  Xi1 > 1 ?
    No:  zone (a), tally (0,5), classify as 1
    Yes: (not yet split)
Here we classify as 1 because the tally is (0,5): 0 zeros and 5 ones
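The majority rule for a leaf can be sketched in a few lines. The (zeros, ones) tally format follows the slides; the function names are hypothetical, and breaking ties in favor of class 0 is an assumption, since the slides do not say how ties are handled.

```python
def tally(labels):
    """Count the zeros and ones among the training points that fall in a zone."""
    return (labels.count(0), labels.count(1))

def majority_label(zone_tally):
    """Assign the leaf the label held by most of its points."""
    zeros, ones = zone_tally
    # Assumption: a tie goes to class 0.
    return 1 if ones > zeros else 0

zone_a = tally([1, 1, 1, 1, 1])   # the five points in zone (a)
label_a = majority_label(zone_a)
```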
Using a Similar Approach, Let's Begin to Fill in the Rest of the Tree
[Figure: the scatter plot with splits 1 and 2 drawn]
Tree so far:
  Xi1 > 1 ?
    No:  zone (a), tally (0,5), classify as 1
    Yes: Xi2 > 1.45 ? (split 2), with branches Yes and No
[Figure: the scatter plot with splits 1, 2 and 3 drawn; axis ticks at Xi1 = 1, 2, 2.2 and Xi2 = 1.45; zones (a), (b) and (c)]
Tree so far:
  Xi1 > 1 ?
    No:  zone (a), tally (0,5), classify as 1
    Yes: Xi2 > 1.45 ?
      one branch: Xi1 > 2 ?, ending in zone (b) and zone (c), tallies (4,1) and (2,3)
By majority rule, the (4,1) leaf classifies as 0 and the (2,3) leaf classifies as 1
[Figure: the scatter plot with splits 1 through 4 drawn; axis ticks at Xi1 = 1, 2, 2.2 and Xi2 = 1.45; zones (a) through (e)]
Tree:
  Xi1 > 1 ?
    No:  zone (a), tally (0,5), classify as 1
    Yes: Xi2 > 1.45 ?
      one branch:       Xi1 > 2 ?,   ending in zone (b) and zone (c), tallies (4,1) and (2,3)
      the other branch: Xi1 > 2.2 ?, ending in zone (d) and zone (e), tallies (1,4) and (5,0)
By majority rule, the (4,1) and (5,0) leaves classify as 0; the (2,3) and (1,4) leaves classify as 1
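The leaf tallies shown on the tree are enough to compute how well it fits the training points. The sketch below assumes each tally is (zeros, ones) and that every leaf predicts its majority label, so the minority count in each leaf is the number of training points it gets wrong; the variable names are illustrative.

```python
# (zeros, ones) tallies of the five leaves, as they appear on the slide.
LEAF_TALLIES = [(0, 5), (1, 4), (5, 0), (4, 1), (2, 3)]

def misclassified(leaf_tally):
    """Training points in the leaf that do not carry the majority label."""
    return min(leaf_tally)

total_points = sum(zeros + ones for zeros, ones in LEAF_TALLIES)
total_errors = sum(misclassified(t) for t in LEAF_TALLIES)
training_error = total_errors / total_points
```

With these tallies, 4 of the 25 training points sit in a leaf whose label disagrees with their own, so the training error is 0.16 even after four splits.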
How about this one?
[Figure: a second scatter plot of points labeled 0 or 1 in the (Xi1, Xi2) plane, with axis ticks at 1, 2 and 1, 2, 3 and regions labeled A through G]
In this simple example, we eyeballed the 2D space, partitioned it, and stopped after 4 splits
(2) For real problems, you need to select criteria (or a criterion) for deciding where to partition (split) the data
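One widely used split criterion is Gini impurity, sketched below for a single feature with 0/1 labels. The slides do not say which criterion the course uses, so treating Gini as the criterion is an assumption, as are the function names and the toy data.

```python
def gini(labels):
    """Gini impurity of a set of 0/1 labels: 0 when pure, 0.5 when evenly mixed."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2 * p1 * (1 - p1)

def best_threshold(xs, ys):
    """Try the rule x > t at every midpoint between neighboring values.

    Returns (t, weighted_impurity) for the candidate with the lowest
    size-weighted impurity across the two sides of the split.
    """
    values = sorted(set(xs))
    best = None
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2
        no_side = [y for x, y in zip(xs, ys) if x <= t]
        yes_side = [y for x, y in zip(xs, ys) if x > t]
        score = (len(no_side) * gini(no_side) + len(yes_side) * gini(yes_side)) / len(ys)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Toy data (hypothetical): the labels switch from 1 to 0 between x = 1.5 and x = 2.0.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [1, 1, 1, 0, 0, 0]
threshold, impurity = best_threshold(xs, ys)
```

Replacing the eyeballed cut with a search like this gives a repeatable answer to "where do we split?" for each feature.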
(3) For real problems, you must develop a stopping condition or pursue recursive partitioning of the space
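Recursive partitioning with an explicit stopping condition might look like the sketch below. It is a simplified illustration, not CART as implemented in any library: the stopping rules (pure region, fewer than `min_samples` points, or `max_depth` reached), the misclassification-count criterion, the dictionary-based tree, and the toy data are all assumptions.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def errors(labels):
    """Points in a region that disagree with the region's majority label."""
    return len(labels) - Counter(labels).most_common(1)[0][1]

def build(points, labels, depth=0, max_depth=3, min_samples=2):
    # Stopping condition (assumed): pure region, too few points, or depth limit reached.
    if len(set(labels)) == 1 or len(labels) < min_samples or depth >= max_depth:
        return {"leaf": majority(labels)}
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            no = [i for i, p in enumerate(points) if p[f] <= t]
            yes = [i for i, p in enumerate(points) if p[f] > t]
            e = errors([labels[i] for i in no]) + errors([labels[i] for i in yes])
            if best is None or e < best[0]:
                best = (e, f, t, no, yes)
    if best is None:  # every point is identical, so no split is possible
        return {"leaf": majority(labels)}
    _, f, t, no, yes = best

    def grow(idx):
        return build([points[i] for i in idx], [labels[i] for i in idx],
                     depth + 1, max_depth, min_samples)

    return {"feature": f, "threshold": t, "no": grow(no), "yes": grow(yes)}

def predict(tree, x):
    while "leaf" not in tree:
        tree = tree["yes"] if x[tree["feature"]] > tree["threshold"] else tree["no"]
    return tree["leaf"]

# Toy data (hypothetical), two features per point.
points = [(0.5, 0.5), (0.5, 1.5), (1.5, 0.5), (1.5, 1.5), (2.5, 0.5), (2.5, 1.5)]
labels = [1, 1, 0, 1, 0, 1]
tree = build(points, labels)
stump = build(points, labels, max_depth=0)
```

Tightening `max_depth` or raising `min_samples` stops the recursion earlier and yields a smaller tree with more mixed leaves.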
Solutions to these 3 Problems are among the core questions in algorithm selection / development
From an Algorithmic Perspective - The Task is to Develop a Method to Partition the Trees
Must Do So Without Knowing the Specific Contours of the Data / Problem in Question
“Although any given solution to an NP-complete problem can be verified quickly (in polynomial time), there is no known efficient way to locate a solution in the first place; indeed, the most notable characteristic of NP-complete problems is that no fast solution to them is known. That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows”
The key implication is that one cannot determine the “optimal tree” in advance
Greedy Optimization Method is used to calculate the MLE
(maximum-likelihood estimation)
Greedy is a heuristic that “makes the locally optimal choice at each stage with the hope of finding a global optimum. In many problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.”
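A small case where the locally optimal choice is not enough can be sketched as follows. The four points form an XOR pattern, which is a hypothetical example and not data from the slides: no single split lowers the training error, yet a two-level tree classifies every point correctly.

```python
from collections import Counter

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
LABELS = [0, 1, 1, 0]  # XOR pattern (hypothetical data)

def errors(labels):
    """Points that disagree with the majority label of their region."""
    return len(labels) - max(Counter(labels).values())

def split_errors(feature, threshold):
    """Training errors after one split using the rule x[feature] > threshold."""
    no = [y for p, y in zip(POINTS, LABELS) if p[feature] <= threshold]
    yes = [y for p, y in zip(POINTS, LABELS) if p[feature] > threshold]
    return errors(no) + errors(yes)

no_split_errors = errors(LABELS)
best_first_split_errors = min(split_errors(f, 0.5) for f in (0, 1))

def two_level_tree(x):
    """Hand-built depth-2 tree: split on x[0], then on x[1] inside each branch."""
    if x[0] > 0.5:
        return 0 if x[1] > 0.5 else 1
    return 1 if x[1] > 0.5 else 0

two_level_errors = sum(two_level_tree(p) != y for p, y in zip(POINTS, LABELS))
```

Here every first split leaves two errors, the same as not splitting at all, so a greedy learner that only accepts splits with an immediate improvement would stop at the root, while the depth-2 tree has zero errors. Whether a given greedy implementation stops in this situation depends on its stopping rule.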
Legal Analytics
Class 7 - Binary Classification with Decision Tree Learning

daniel martin katz
blog | ComputationalLegalStudies
corp | LexPredict
twitter | @computational
site | danielmartinkatz.com

michael j bommarito
blog | ComputationalLegalStudies
corp | LexPredict
twitter | @mjbommar
site | bommaritollc.com

more content available at legalanalyticscourse.com