Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of...
![Page 1: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/1.jpg)
Active Cost-sensitive Learning
(Intelligent Test Strategies)
Charles X. Ling, PhD
Department of Computer Science
University of Western Ontario, Ontario, Canada
[email protected]
http://www.csd.uwo.ca/faculty/cling
Joint work with Victor Sheng, Qiang Yang, …
![Page 2: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/2.jpg)
Outline
Introduction
Cost-sensitive decision trees
Test strategies
  Sequential Test
  Single Batch Test
  Sequential Batch Test
Conclusions and future work
![Page 4: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/4.jpg)
Everything has a cost/benefit!
Materials, products, services
Disease, working/living condition, waiting, …
Happiness, love, life, …
  "Money, Sex and Happiness: An Empirical Study," by David G. Blanchflower & Andrew J. Oswald, The Scandinavian Journal of Economics, 106(3), 2004, pp. 393-415
  Lasting/happy marriage is worth about $100,000 in happiness
Utility-based learning: optimization; unifies many issues and is the ultimate goal
![Page 5: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/5.jpg)
Everything has a cost/benefit!
In medical diagnosis…
  Tests have costs: temperature ($1), X-ray ($30), biopsy ($900)
  Diseases have costs: flu ($100), diabetes ($100k), cancer ($10^8)
  Misdiagnosis has (different) costs
    Cost of false alarm ($500) << cost of missing a cancer ($500,000)
  Doctors: balance the cost of tests and misdiagnosis
Our goal: to minimize the total cost
Many other similar applications…
Model this process
  Cost-sensitive learning
  Intelligent test strategies
Patient  Test 1  Test 2  …    Test n  Cancer?
(Cost)   $1      $30     ...  $900    FP/FN = 100/300k
001      39      Low     …    High    1
002      35      Med     …    ?       0
003      42      ?       …    ?       0
…        …       …       …    …       …
New1     ?       Med     …    ?       ?
![Page 6: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/6.jpg)
Review of Previous Work
Cost-sensitive learning: a survey (Turney 2000)
  Active research, also for the imbalanced data problem
  CS meta learning (wrapper): thresholding, sampling, weighting, …
  CS learning algorithms: CSNB, our CS trees
  …but all consider misclassification costs only
Some work considers test costs only
A few previous works consider both test costs and misclassification costs (Turney 1995, Zubek and Dietterich 2002, Lizotte et al 2003); all computationally expensive
![Page 7: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/7.jpg)
Review of Previous Work
Active learning: actively seeking extra information
  Pool-based: given a pool of unlabeled examples, which ones to label?
  Membership query: is this instance positive?
  Feature value acquisition
    During training. But “missing is useful!”
    During testing: our work
Human learning is active in many ways
![Page 8: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/8.jpg)
Review of Previous Work
Diagnosis: wide applications in medicine, mechanical systems, software, …
Most previous AI-based diagnosis systems…
  Are (partially) manually built
  Do not incorporate costs/benefits
  Cannot actively suggest the processes
Our work: cost-sensitive and active; useful for diagnosis and policy setting
![Page 10: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/10.jpg)
Cost-sensitive Decision Tree
Patient Test 1 Test 2 … Test n Cancer?
(Cost) $1 $30 ... $900 FP/FN= 100/300k
001 39 Low … High 1
002 35 Med … ? 0
003 42 ? … ? 0
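… … … … … …
[Figure: example decision tree built from the table above; test nodes T1, T2, T3, …; branch labels such as Low/Med and <36/>=36; leaves labeled 0 or 1]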
Advantages: tree structure, comprehensibility
Objective: minimizing the total cost of tests and misclassification.
![Page 11: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/11.jpg)
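Attribute Splitting Criteria
Previous methods: C4.5 reduces the entropy (randomness); performs badly on cost-sensitive tasks
  [Figure: a node with entropy E split by test T into three children with entropies E1, E2, E3]
  Choose T such that E – (E1+E2+E3) is max
New (ICML’04): we reduce the total expected cost
  [Figure: a node with cost C split by test T into three children with costs C1, C2, C3]
  Choose T such that C – (C1+C2+C3+C_Test) is max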
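The cost-based criterion above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the cost of a node is taken as the cheaper of "predict positive" and "predict negative" with zero cost for correct predictions, C_Test is taken as the test cost times the number of examples at the node, and the function names and the toy data are hypothetical.

```python
from collections import defaultdict

def leaf_cost(pos, neg, fp_cost, fn_cost):
    """Misclassification cost of labelling a node with its cheaper class."""
    # Predict positive: every negative is a false positive.
    # Predict negative: every positive is a false negative.
    return min(neg * fp_cost, pos * fn_cost)

def cost_reduction(examples, attribute, test_cost, fp_cost, fn_cost):
    """C - (C1 + ... + Ck + C_Test) for splitting `examples` on `attribute`."""
    pos = sum(1 for _, label in examples if label == 1)
    parent = leaf_cost(pos, len(examples) - pos, fp_cost, fn_cost)

    counts = defaultdict(lambda: [0, 0])  # value -> [positives, negatives]
    for features, label in examples:
        counts[features[attribute]][0 if label == 1 else 1] += 1

    children = sum(leaf_cost(p, n, fp_cost, fn_cost) for p, n in counts.values())
    total_test_cost = test_cost * len(examples)  # assumption: every example pays
    return parent - (children + total_test_cost)

# Hypothetical toy data: Test 2 separates the two classes perfectly.
examples = [
    ({"T2": "Low"}, 1), ({"T2": "Low"}, 1),
    ({"T2": "Med"}, 0), ({"T2": "Med"}, 0),
]
gain = cost_reduction(examples, "T2", test_cost=30, fp_cost=100, fn_cost=300_000)
print(gain)
```

With these toy numbers the split removes all misclassification cost, so the reduction is the parent cost minus what the four tests cost; raising `test_cost` far enough makes the reduction negative, which is the case where a leaf is preferred.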
![Page 12: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/12.jpg)
Case Study: Heart Disease
Predict coronary artery disease
  Class 0: less than 50% artery narrowing
  Class 1: more than 50% artery narrowing
~300 patients, collected from hospitals
13 non-invasive tests on patients
![Page 13: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/13.jpg)
13 Tests (Heart Disease)
Tests     Costs    Meaning
age       $1       age of the patient
sex       $1       sex
cp        $1       chest pain type
trestbps  $1       resting blood pressure
chol      $7.27    cholesterol in mg/dl
fbs       $5.20    fasting blood sugar
restecg   $15.50   resting electrocardiography results
thalach   $102.90  maximum heart rate
thal      $102.90  maximum heart rate reached
exang     $87.30   exercise induced angina
oldpeak   $87.30   ST depression induced by exercise
slope     $87.30   slope of the peak exercise ST segment
ca        $100.90  number of major vessels colored by fluoroscopy
![Page 14: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/14.jpg)
Cost-sensitive tree for Heart Disease
[Figure: cost-sensitive decision tree for Heart Disease. Test nodes: thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), age ($1); branches are numbered by attribute value; leaves are labeled 0 or 1]
• Naturally prefer tests with small cost
• Balance cost and discriminating power
• Local heart-failure specialist thinks this tree is reasonable.
![Page 15: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/15.jpg)
Considering Group Discount
Tests     Costs    Meaning
age       $1       age of the patient
sex       $1       sex
cp        $1       chest pain type
trestbps  $1       resting blood pressure
chol      $7.27    cholesterol in mg/dl
fbs       $5.20    fasting blood sugar
restecg   $15.50   resting electrocardiography results
thalach   $102.90  maximum heart rate
thal      $102.90  finishing heart rate
exang     $87.30   exercise induced angina
oldpeak   $87.30   ST depression induced by exercise
slope     $87.30   slope of the peak exercise ST segment
ca        $100.90  number of major vessels colored by fluoroscopy
Discount: $2.10
Discount: $101.90
Discount: $86.30
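A group discount can be sketched as a cost update that fires once one test of a group has been done. The slide lists the three discount amounts but the extracted text does not say which tests share each one, so the grouping below is an assumption made for illustration (it is chosen so that thalach drops from $102.90 to $1, the value shown in the later tree); the function name is hypothetical.

```python
def apply_group_discount(test_costs, groups, performed):
    """Return a copy of `test_costs` after the test `performed` has been done.

    `groups` maps a group name to (member tests, discount). Once one member
    has been performed, the other members of the same group get cheaper.
    """
    updated = dict(test_costs)
    for members, discount in groups.values():
        if performed in members:
            for test in members:
                if test != performed:
                    updated[test] = round(test_costs[test] - discount, 2)
    return updated

test_costs = {"thalach": 102.90, "thal": 102.90, "exang": 87.30,
              "oldpeak": 87.30, "slope": 87.30}

# Assumed grouping: the slide gives the discounts but not their members.
groups = {
    "heart_rate": ({"thalach", "thal"}, 101.90),
    "exercise": ({"exang", "oldpeak", "slope"}, 86.30),
}

after_thal = apply_group_discount(test_costs, groups, "thal")
print(after_thal["thalach"], after_thal["exang"])
```

The function always discounts from the original prices, so it should be called on the undiscounted table; calling it again on its own output would apply a discount twice.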
![Page 16: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/16.jpg)
[Figure: two cost-sensitive trees for Heart Disease, side by side. Before: the tree built from individual test costs, which contains two restecg ($15.5) nodes. After: the tree built with group discounts, in which one restecg ($15.5) node is replaced by thalach ($1; individual cost: $102.9)]
Different trees without/with group discount
![Page 17: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/17.jpg)
Algorithm of Cost-sensitive Decision Tree
CSDT(Examples, Attributes, TestCosts)
  If all examples are positive, return root with label = +
  If all examples are negative, return root with label = -
  If maximum cost reduction < 0, return root with label according to min(P×TP + N×FP, N×TN + P×FN)
  Let A be an attribute with maximum cost reduction
  root ← A
  Update TestCosts if discount applies
  For each possible value vi of the attribute A
    Add a new branch A = vi below root
    Segment the training examples Examples_vi into the new branch
    Call CSDT(Examples_vi, Attributes - A, TestCosts) to build subtree
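The recursion above might look roughly like this in Python. It is a simplified sketch, not the authors' code: P and N are read as the numbers of positive and negative examples at the node, correct predictions are assumed to cost nothing (TP = TN = 0), the group-discount update is omitted, and the nested-dictionary tree shape and all names are hypothetical.

```python
def leaf_cost(pos, neg, fp, fn):
    # Cheaper of "predict positive" (neg * fp) and "predict negative" (pos * fn).
    return min(neg * fp, pos * fn)

def count(examples):
    pos = sum(label for _, label in examples)  # labels are 0 or 1
    return pos, len(examples) - pos

def split(examples, attr):
    parts = {}
    for feats, label in examples:
        parts.setdefault(feats[attr], []).append((feats, label))
    return parts

def csdt(examples, attributes, test_costs, fp, fn):
    pos, neg = count(examples)
    label = 1 if neg * fp <= pos * fn else 0  # cheaper prediction at this node
    if pos == 0 or neg == 0 or not attributes:
        return {"label": label}

    def reduction(attr):
        children = sum(leaf_cost(*count(part), fp, fn)
                       for part in split(examples, attr).values())
        return leaf_cost(pos, neg, fp, fn) - children - test_costs[attr] * len(examples)

    best = max(attributes, key=reduction)
    if reduction(best) < 0:  # no test pays for itself: stop with a leaf
        return {"label": label}
    rest = [a for a in attributes if a != best]
    return {"test": best,
            "children": {value: csdt(part, rest, test_costs, fp, fn)
                         for value, part in split(examples, best).items()}}

# Hypothetical toy data: T2 separates the classes, T1 does not.
examples = [
    ({"T1": "a", "T2": "Low"}, 1), ({"T1": "b", "T2": "Low"}, 1),
    ({"T1": "a", "T2": "Med"}, 0), ({"T1": "b", "T2": "Med"}, 0),
]
tree = csdt(examples, ["T1", "T2"], {"T1": 1, "T2": 30}, fp=100, fn=300_000)
print(tree)
```

On this toy data the cheap test T1 has a negative cost reduction because it does not change any child's cost, so the more expensive but discriminating T2 is chosen at the root.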
![Page 19: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/19.jpg)
Patient Test 1 Test 2 … Test n Cancer?
(Cost) $1 $30 ... $900 FP/FN= 100/300k
001 39 Low … High 1
002 35 Med … ? 0
003 42 ? … ? 0
… … … … … …
New1 ? ? … ? ?
[Figure: the example decision tree from the "Cost-sensitive Decision Tree" slide; test nodes T1, T2, T3, …; leaves labeled 0 or 1]
Three categories of intelligent test strategies
1. Sequential Test: one test, wait, … then predict
2. Single Batch Test: one batch of tests, then predict
3. Sequential Batch Test: batch 1, batch 2, … then predict
Minimizing the total cost of tests and misclassification is not trivial
Our methods: utilizing the minimum-cost tree structure
![Page 21: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/21.jpg)
Sequential Test
Use tree structure to guide test sequence
“Optimal” because tree is (locally) optimal
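Using the tree to guide the test sequence can be sketched as a walk from the root that pays for a test only when its value is still unknown. The tree below is a hypothetical two-level toy, not the Heart Disease tree from the slides, and the function and parameter names are assumptions made for illustration.

```python
def sequential_test(tree, known, run_test, test_costs):
    """Walk the tree, requesting one test at a time when a value is missing."""
    known = dict(known)  # do not modify the caller's record
    spent = 0.0
    performed = []
    node = tree
    while "label" not in node:
        attr = node["test"]
        if attr not in known:
            known[attr] = run_test(attr)  # perform the test and wait for it
            spent += test_costs[attr]
            performed.append(attr)
        node = node["children"][known[attr]]
    return node["label"], performed, spent

# Hypothetical toy tree and patient.
tree = {"test": "cp", "children": {
    1: {"label": 0},
    2: {"test": "fbs", "children": {0: {"label": 0}, 1: {"label": 1}}},
}}
test_costs = {"cp": 1.0, "fbs": 5.20}
true_values = {"cp": 2, "fbs": 1}

label, performed, spent = sequential_test(
    tree, known={"fbs": 1}, run_test=true_values.get, test_costs=test_costs)
print(label, performed, spent)
```

Only tests on the path actually taken are requested, which is why this strategy tends to spend little on tests at the price of waiting for each result in turn.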
![Page 22: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/22.jpg)
Sequential Test
[Figure: cost-sensitive tree for Heart Disease used to guide the test sequence. Test nodes: thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), thalach ($1), age ($1); leaves are labeled 0 or 1]
![Page 23: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/23.jpg)
Experimental Comparison
Using 10 datasets from UCI
Dataset      No. of Attributes  No. of Examples  Class dist. (N/P)
Ecoli        6                  332              230/102
Breast       9                  683              444/239
Heart        8                  261              98/163
Thyroid      24                 2000             1762/238
Australia    15                 653              296/357
Tic-tac-toe  9                  958              332/626
Mushroom     21                 8124             4208/3916
Kr-vs-kp     36                 3196             1527/1669
Voting       16                 232              108/124
Cars         6                  446              328/118
![Page 24: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/24.jpg)
Comparing Sequential Test
Eager learning: Sequential Test (OST) (ICML’04)
Lazy learning: Lazy Sequential Test (LazyOST) (TKDE’05)
Cost-sensitive Naïve Bayes (CSNB) (ICDM’04)
[Figure: total cost (y-axis, 40 to 100) versus ratio of unknown attributes (x-axis, 0.2 to 1) for CSNB, OST, and LazyOST]
![Page 26: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/26.jpg)
Single Batch Test
Only one batch: not an easy task
  If too few, important tests are not requested; prediction is not accurate; total cost high
  If too many, some tests are wasted; total cost high
The test example may not be classified by a leaf
![Page 27: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/27.jpg)
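Single Batch Test
Expected cost reduction: if a test is done, what are the possible outcomes and cost reduction
$$E(i) = \mathrm{misc}(i) - \Big[\, c(i) + \sum p(R(i)) \times \mathrm{misc}(R(i)) \,\Big]$$
R(.): all reachable unknown nodes and leaves
[Figure: a test node i whose branches 1, 2, 3 lead to nodes j1, j2, j3]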
![Page 28: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/28.jpg)
Single Batch Test
A*-like search algorithm
  Form a candidate list (L) and a batch list (B)
  Choose a test with maximum positive expected cost reduction from L, add it to B
  Update L: add all reachable unknowns to L
  Repeat until expected cost reduction is 0
Efficient with tree structure
![Page 29: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/29.jpg)
Single Batch Test
L = empty   /* list of reachable and unknown attributes */
B = empty   /* the batch of tests */
u = the first unknown attribute when classifying a test case
Add u into L
Loop
  For each i ∈ L, calculate E(i): E(i) = misc(i) - [c(i) + Σ p(R(i)) × misc(R(i))]
  E(t) = max E(i)   /* t has the maximum cost reduction */
  If E(t) > 0 then add t into B, delete t from L, add r(t) into L
  else exit Loop   /* No positive cost reduction */
Until L is empty
Output B as the batch of tests
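The batch-selection loop can be sketched in Python as follows. This is one possible reading of the slide, not the published implementation: misc(i) is taken as the per-example misclassification cost of predicting at node i, p(R(i)) as the fraction of i's training examples that reach each reachable node, and the sum runs over all reachable unknown nodes and leaves. The node layout (dictionaries with `pos`/`neg` counts), the toy tree, and the cost values 100/300 are hypothetical.

```python
def misc(node, fp, fn):
    """Per-example misclassification cost if we stop and predict at `node`."""
    total = node["pos"] + node["neg"]
    return min(node["neg"] * fp, node["pos"] * fn) / total

def reachable(node, known):
    """R(node): first unknown test or leaf met below each branch of `node`."""
    found = []
    for child in node["children"].values():
        while "test" in child and child["test"] in known:
            child = child["children"][known[child["test"]]]
        found.append(child)
    return found

def expected_reduction(node, known, fp, fn):
    total = node["pos"] + node["neg"]
    after = sum((r["pos"] + r["neg"]) / total * misc(r, fp, fn)
                for r in reachable(node, known))
    return misc(node, fp, fn) - (node["cost"] + after)

def single_batch(tree, known, fp, fn):
    node = tree
    while "test" in node and node["test"] in known:  # classify to first unknown
        node = node["children"][known[node["test"]]]
    candidates = [node] if "test" in node else []
    batch = []
    while candidates:
        best = max(candidates, key=lambda n: expected_reduction(n, known, fp, fn))
        if expected_reduction(best, known, fp, fn) <= 0:
            break  # no positive cost reduction left
        if best["test"] not in batch:
            batch.append(best["test"])
        candidates = [c for c in candidates if c is not best]
        candidates += [r for r in reachable(best, known) if "test" in r]
    return batch

def leaf(pos, neg):
    return {"pos": pos, "neg": neg}

# Hypothetical toy tree: cp -> fbs -> thal, with training counts at each node.
thal = {"test": "thal", "cost": 102.9, "pos": 50, "neg": 2,
        "children": {3: leaf(50, 0), 7: leaf(0, 2)}}
fbs = {"test": "fbs", "cost": 5.2, "pos": 50, "neg": 10,
       "children": {0: thal, 1: leaf(0, 8)}}
tree = {"test": "cp", "cost": 1.0, "pos": 50, "neg": 50,
        "children": {1: leaf(0, 40), 2: fbs}}

batch = single_batch(tree, known={}, fp=100, fn=300)
print(batch)
```

In this toy case the two cheap tests each have a positive expected reduction and join the batch, while the expensive thal test would cost more than the misclassification cost it could remove, so the search stops before adding it.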
![Page 30: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/30.jpg)
Single Batch Test
[Figure: cost-sensitive tree for Heart Disease. Test nodes: thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), thalach ($1), age ($1); leaves are labeled 0 or 1]
![Page 31: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/31.jpg)
Single Batch Test
[Figure: cost-sensitive tree for Heart Disease, as on the previous slide]
cp is unknown. cp has positive expected cost reduction. cp is added to the batch. cp’s reachable unknown nodes are added into the candidate list.
![Page 32: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/32.jpg)
Single Batch Test
[Figure: cost-sensitive tree for Heart Disease, as on the previous slide]
From the candidate list, choose one with maximum positive expected cost reduction. Add it to the batch, and update the candidate list. Repeat. After 7 steps, expected cost reduction is 0.
![Page 33: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/33.jpg)
Single Batch Test
[Figure: cost-sensitive tree for Heart Disease, as on the previous slide]
Do all tests in the batch
![Page 34: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/34.jpg)
Single Batch Test
[Figure: cost-sensitive tree for Heart Disease, annotated "Predict by internal node"]
Make a prediction. Some tests are wasted.
![Page 35: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/35.jpg)
Comparing Single Batch Tests
Naïve Single Batch (NSB) (ICML’04)
Cost-sensitive Naïve Bayes Single Batch (CSNB-SB) (ICDM’04)
Greedy Single Batch (GSB) (TKDE’05)
Single Batch Test (OSB) (TKDE’05)
[Figure: total cost (y-axis, 350 to 750) versus ratio of unknown attributes (x-axis, 0.2 to 1) for CSNB-SB, NSB, GSB, and OSB]
![Page 37: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/37.jpg)
Sequential Batch
Batch 1, batch 2, …, prediction
Must include the cost of waiting in tests
  Wait cost of a batch: maximum wait cost in the batch
  Less than the sum
Combines Sequential Test and Single Batch Test
  If all waiting costs = 0, it becomes Sequential Test
  If all waiting costs are very large, it becomes Single Batch Test
![Page 38: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/38.jpg)
Sequential Batch
The wait cost is derived from wait time
Test wait time in hours
Test      Wait time (hours)
age       0.001
sex       0.001
cp        0.001
trestbps  0.01
chol      4
fbs       4
restecg   0.5
thalach   1
exang     1
oldpeak   1
slope     1
ca        1
thal      1
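As a small worked example of why a batch waits less than the same tests done one after another, the sketch below uses the wait times from the table and takes the batch wait as the maximum over its tests. Reading "maximum wait cost in the batch" as "the tests run concurrently" is an interpretation, and the function names are hypothetical; converting hours into a dollar wait cost is left out because the slides do not give the rate.

```python
WAIT_HOURS = {
    "age": 0.001, "sex": 0.001, "cp": 0.001, "trestbps": 0.01,
    "chol": 4, "fbs": 4, "restecg": 0.5, "thalach": 1, "exang": 1,
    "oldpeak": 1, "slope": 1, "ca": 1, "thal": 1,
}

def batch_wait(batch):
    """Wait for one batch: the slowest test in it (assumed to run concurrently)."""
    return max(WAIT_HOURS[test] for test in batch)

def sequential_wait(tests):
    """Wait when the same tests are requested one at a time."""
    return sum(WAIT_HOURS[test] for test in tests)

batch = ["chol", "fbs", "restecg"]
print(batch_wait(batch), sequential_wait(batch))
```

For this batch the wait is 4 hours as one batch versus 8.5 hours in sequence.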
![Page 39: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/39.jpg)
Sequential Batch
Extending the Single Batch to include the batch cost
An additional constraint: cumulative ROI
$$\mathrm{ROI} = \frac{\sum \mathrm{CostReduction}}{\sum \mathrm{testCost} + \mathrm{BatchCost}}$$
No more batches!
![Page 40: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/40.jpg)
Sequential Batch
Loop
  L = empty   /* list of reachable and unknown attributes */
  B = empty   /* the batch of tests */
  u = the first unknown attribute when classifying a test case
  Add u into L
  Loop
    For each i ∈ L, calculate E(i): E(i) = misc(i) - [c(i) + Σ p(R(i)) × misc(R(i))]
    E(t) = max E(i)   /* t has the maximum cost reduction */
    If E(t) > 0 & ROI increases then add t into B, delete t from L, add r(t) into L
    else exit Loop   /* No positive cost reduction */
  Until L is empty
  If (B is not empty) then
    Output B as the current batch of tests; obtain their values at a cost
    Classify the test example further, until encountering another unknown test
  Else exit the first Loop
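The extra ROI condition in the inner loop can be illustrated in isolation. The sketch below is a deliberate simplification: it assumes the expected cost reductions E(i) have already been computed and stay fixed, it skips the step that adds newly reachable tests to the candidate list, and it treats the batch cost as a single constant. ROI is taken as cumulative cost reduction divided by cumulative test cost plus batch cost; all names and numbers are hypothetical.

```python
def grow_batch(candidates, batch_cost):
    """Greedily add tests while E > 0 and the cumulative ROI keeps increasing.

    `candidates` is a list of (test, expected_cost_reduction, test_cost).
    """
    batch, gain, spent, roi = [], 0.0, 0.0, 0.0
    # Consider the test with the largest expected cost reduction first.
    for test, reduction, cost in sorted(candidates, key=lambda c: -c[1]):
        new_roi = (gain + reduction) / (spent + cost + batch_cost)
        if reduction <= 0 or new_roi <= roi:
            break  # no positive reduction, or ROI would stop increasing
        batch.append(test)
        gain, spent, roi = gain + reduction, spent + cost, new_roi
    return batch, roi

# Hypothetical expected reductions and test costs.
candidates = [
    ("cp", 39.0, 1.0),
    ("sex", 10.0, 1.0),
    ("fbs", 8.0, 5.0),
    ("thal", -99.0, 102.9),
]
batch, roi = grow_batch(candidates, batch_cost=4.0)
print(batch, roi)
```

Here fbs still has a positive expected reduction, but adding it would lower the cumulative ROI, so it is left for a later batch; that is the behaviour the additional constraint is meant to produce.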
![Page 41: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/41.jpg)
Comparing Sequential Batch Test
[Figure: total cost (y-axis, 120 to 470) versus unknown attribute ratio (x-axis, 0.2 to 1) for SingB, SeqT, and SBT]
![Page 43: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/43.jpg)
Future Work
Deal with different test examples differently
Consider more costs: acquiring new examples
  If $10 for each new example, how many do I need?
  For $10, tell me if this patient has cancer
If a test is not accurate (e.g. 90%), how to build trees and how to do tests (will I do it again)?
From cost-sensitive trees, derive medical policy for expensive/risky or cheap/effective tests
![Page 44: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/44.jpg)
Conclusions
Cost-sensitive decision tree: effective for learning with minimal total cost
  Can be used to model learning from data with costs
Design and compare various test strategies
  Sequential Test: one test, wait, …: low cost but long wait
  Single Batch Test: one batch of tests: quick but higher cost
  Sequential Batch Test: batch, wait, batch, …: best tradeoff
Our methods perform better than previous ones
Can be readily applied to real-world diagnoses
![Page 45: Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,](https://reader036.fdocuments.us/reader036/viewer/2022081519/56649c7b5503460f9492f42d/html5/thumbnails/45.jpg)
References
C.X. Ling, Q. Yang, J. Wang, and S. Zhang. Decision Trees with Minimal Costs. ICML'2004.
X. Chai, L. Deng, Q. Yang, and C.X. Ling. Test-Cost Sensitive Naive Bayes Classification. ICDM'2004.
C.X. Ling, S. Sheng, and Q. Yang. Intelligent Test Strategies for Cost-sensitive Decision Trees. IEEE TKDE, to appear, 2005.
S. Zhang, Z. Qin, C.X. Ling, and S. Sheng. "Missing is Useful": Missing Values in Cost-sensitive Decision Trees. IEEE TKDE, to appear, 2005.
P.D. Turney. Types of Cost in Inductive Concept Learning. Workshop on Cost-Sensitive Learning at ICML’2000.
V.B. Zubek and T. Dietterich. Pruning Improves Heuristic Search for Cost-sensitive Learning. ICML’2002.
P.D. Turney. Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm. JAIR, 2:369-409, 1995.
D. Lizotte, O. Madani, and R. Greiner. Budgeted Learning of Naïve-Bayes Classifiers. Uncertainty in AI, 2003.