GENERALIZED INVERSE CLASSIFICATION
SDM ’17
Michael T. Lash1, Qihang Lin2, W. Nick Street2, Jennifer G. Robinson3, and Jeffrey Ohlmann2
1Department of Computer Science, 2Department of Management Sciences, 3Department of Epidemiology
www.michaeltlash.com
What is inverse classification?
# The process of making meaningful perturbations to a test instance such that the probability of a desirable outcome is maximized.
What about the meaningful part of the definition?
Meaningful Perturbations
Well... let’s visit some past work!
# Michael T. Lash, Qihang Lin, W. Nick Street, and Jennifer G. Robinson, “A budget-constrained inverse classification framework for smooth classifiers”, arXiv preprint arXiv:1605.09068, submitted.
Meaningful Perturbations
Begin with a basic formulation, then build it up:
# Segment features.
# Estimate the indirectly changeable features with some regressor.
# Update the objective function; add constraints.
# Cost-change function.
# Budget.
# Bounds.
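The builds above assemble an optimization problem piece by piece. A hedged LaTeX sketch of what it plausibly looks like, following the cited arXiv paper’s setup; the symbol names here (x_U unchangeable features, x_D directly changeable features, H the regressor estimating the indirectly changeable features, c the cost-change function, B the budget, l and u the bounds) are my labels, not taken verbatim from the slides:

```latex
\begin{aligned}
\min_{x_D}\quad & f\big(x_U,\ H(x_U, x_D),\ x_D\big) \\
\text{s.t.}\quad & c(x_D) \le B, \\
                 & l \le x_D \le u
\end{aligned}
```

Here f is the trained risk estimator being inverted, H fills in the features the person cannot set directly, and the cost-change function, budget, and bounds encode the “meaningful” part of the definition.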
Main Contributions
1. Relax assumptions about f(·): generalized inverse classification.
2. Quadratic cost-change function.
3. Three real-valued heuristic optimization methods and two sensitivity analysis-based optimization methods.
* Projection operator to maintain feasibility.
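The projection operator itself is not spelled out on the slide. A minimal sketch of one plausible version, assuming box bounds plus a linear cost of change (a simplification; the paper’s own cost-change function, per contribution 2, is quadratic):

```python
import numpy as np

def project(x_d, x0, lower, upper, cost_w, budget):
    """Project a candidate x_d back into the feasible set:
    per-feature box bounds plus a linear-cost budget.
    A simplified sketch, not the paper's exact operator."""
    # Clip to the per-feature bounds first.
    x_d = np.clip(x_d, lower, upper)
    # Linear cost of the change from the original instance x0.
    delta = x_d - x0
    cost = np.dot(cost_w, np.abs(delta))
    if cost > budget:
        # Shrink the perturbation uniformly until the budget holds,
        # then re-clip (clipping can only reduce the cost further).
        delta *= budget / cost
        x_d = np.clip(x0 + delta, lower, upper)
    return x_d
```

Heuristic candidates that wander out of the feasible region get mapped back in this way, so every iterate the search evaluates is a legal recommendation.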
Optimization Methodology
Heuristic
# Hill Climbing + Local Search (HC+LS)
# Genetic Algorithm (GA)
# Genetic Algorithm + Local Search (GA+LS)
Sensitivity Analysis
# Local Variable Perturbation – First Improvement (LVP-FI)
# Local Variable Perturbation – Best Improvement (LVP-BI)
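As a rough illustration of the heuristic family, a toy hill-climbing loop over a black-box f; the step size, acceptance rule, and the omission of the budget projection and local-search refinement pass are all simplifications of mine, not the paper’s exact HC+LS:

```python
import numpy as np

def hill_climb(f, x0, lower, upper, step=0.1, iters=200, seed=0):
    """Minimize a black-box risk f over the changeable features:
    propose a random perturbation, clip it to the bounds, and keep
    it only if f improves. A toy sketch of the hill-climbing idea;
    the paper's version also projects onto the budget constraint
    and follows up with a local search."""
    rng = np.random.default_rng(seed)
    x, best = x0.copy(), f(x0)
    for _ in range(iters):
        cand = np.clip(x + rng.normal(scale=step, size=x.shape), lower, upper)
        val = f(cand)
        if val < best:  # accept only improving moves
            x, best = cand, val
    return x, best
```

Because f is only ever called, never differentiated, the same loop works for a random forest or any other learned classifier, which is exactly the point of relaxing the assumptions about f(·).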
Experiment Decisions and Data
# f(·): Random forest
# H(·): Kernel regression
# Dataset 1: Student Performance (UCI Machine Learning Repository).
# Dataset 2: ARIC
# One f for optimization, a separate f for held-out evaluation.
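The “one f for optimization, separate f for held-out evaluation” point can be sketched as two independently trained forests on disjoint splits; the data below is synthetic and for illustration only:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 5 features, label driven by the first two.
rng = np.random.default_rng(0)
X = rng.random((400, 5))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# f_opt guides the inverse-classification search; f_eval, trained on a
# disjoint split, scores the perturbed instance, so recommendations are
# not judged by the same model that produced them.
f_opt = RandomForestClassifier(n_estimators=50, random_state=0).fit(X[:200], y[:200])
f_eval = RandomForestClassifier(n_estimators=50, random_state=1).fit(X[200:], y[200:])

x = X[:1]                                   # a test instance to perturb
risk_opt = f_opt.predict_proba(x)[0, 1]     # drives the optimizer
risk_eval = f_eval.predict_proba(x)[0, 1]   # reported in evaluation
```

Scoring the optimized instances with an independently trained model guards against the search overfitting to one model’s idiosyncrasies.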
Results: Student Performance
Results: ARIC
Need sparsity constraints.
Conclusions
# Generalized Inverse Classification: can use virtually any learned f (as shown by experiments with a Random Forest classifier).
# Our proposed methods were successful, although this varied by dataset.
Causality and Inverse Classification
# Yes! ...
# What we’re doing:
1. Imposing our own causal structure (DAG).
2. We’re not taking the usual counterfactual approach.
# Future work will focus on incorporating causal methodology...