Feature and Variable Selection in Classification


Description: A 20-minute presentation on the why and how of variable selection, with just a touch of feature creation.

Transcript of Feature and Variable Selection in Classification

Page 1: Feature and Variable Selection in Classification

Feature and Variable Selection in Classification

Aaron Karper

University of Bern

Aaron Karper (UniBe) Feature selection 1 / 12

Page 2: Feature and Variable Selection in Classification

Why?

Why not use all the features?

Overfitting
Interpretability
Computational complexity

[Figure: error versus model complexity — training error and test error curves]


Page 3: Feature and Variable Selection in Classification

What are the options?

Ranking

Measure relevance for each feature separately.

The good:

Fast

The bad:

XOR problem.

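The slide does not fix a particular relevance measure; a minimal sketch of the ranking approach, using absolute Pearson correlation with the label as a stand-in relevance score:

```python
import numpy as np

def rank_features(X, y):
    """Score every feature independently: |Pearson correlation| with y."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    order = np.argsort(scores)[::-1]      # best feature first
    return order, scores

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.column_stack([y + 0.3 * rng.normal(size=200),   # tracks the label
                     rng.normal(size=200)])            # pure noise
order, scores = rank_features(X, y)
```

Because every feature is scored in isolation, this is fast (one pass per feature) — which is exactly why it fails on XOR-style interactions, as the next slide shows.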


Page 5: Feature and Variable Selection in Classification

What are the options?

The XOR problem: two features can each look irrelevant in isolation yet determine the class jointly, so per-feature ranking discards both.

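A small illustration of the failure mode (synthetic data, not from the talk): each XOR input is nearly uncorrelated with the label, while the pair predicts it exactly:

```python
import numpy as np

# XOR: univariate relevance scores miss features that only matter jointly.
rng = np.random.default_rng(1)
x1 = rng.integers(0, 2, 1000)
x2 = rng.integers(0, 2, 1000)
y = x1 ^ x2                             # label is the XOR of the two features

corr1 = abs(np.corrcoef(x1, y)[0, 1])   # each feature alone: ~0 correlation
corr2 = abs(np.corrcoef(x2, y)[0, 1])
joint_acc = np.mean((x1 ^ x2) == y)     # both together: perfect prediction
```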

Page 6: Feature and Variable Selection in Classification

What are the options?

Filters

Walk in feature-subset space → evaluate a proxy measure → train the classifier.

The good:

Flexibility

The bad:

Suboptimal performance

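One way to realize the loop (a sketch; the proxy here — normalized class-mean separation — is an illustrative choice, not the talk's): greedily walk through feature subsets, scoring each with the cheap proxy instead of training a classifier:

```python
import numpy as np

def proxy_score(X, y):
    """Cheap proxy measure: class-mean separation, normalized by std."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    return np.sum(np.abs(m0 - m1) / (X.std(axis=0) + 1e-12))

def filter_forward_select(X, y, k):
    """Greedy walk in feature-subset space; no classifier is trained."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        best = max(remaining,
                   key=lambda j: proxy_score(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(2)
y = rng.integers(0, 2, 300)
X = np.column_stack([y + 0.5 * rng.normal(size=300),   # informative
                     rng.normal(size=300),             # noise
                     y + 0.5 * rng.normal(size=300)])  # informative
selected = filter_forward_select(X, y, 2)
```

The classifier is only trained once, on the final subset — hence the speed, and hence the risk that the proxy and the classifier disagree about which features matter.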

Page 7: Feature and Variable Selection in Classification

What are the options?

Wrappers

Walk in feature-subset space → train the classifier.

The good:

Accuracy

The bad:

Slow training

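The same greedy walk becomes a wrapper when each candidate subset is scored by actually training the classifier (a sketch using cross-validated logistic regression; the choice of model and search strategy is illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def wrapper_forward_select(X, y, k):
    """Greedy forward search; every candidate subset costs a full
    cross-validated training run of the classifier."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        cv_acc = lambda j: cross_val_score(
            LogisticRegression(), X[:, selected + [j]], y, cv=3).mean()
        best = max(remaining, key=cv_acc)
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(2)
y = rng.integers(0, 2, 200)
X = np.column_stack([rng.normal(size=200),             # noise
                     y + 0.3 * rng.normal(size=200),   # informative
                     rng.normal(size=200)])            # noise
selected = wrapper_forward_select(X, y, 1)
```

Scoring subsets by the real classifier's accuracy is what makes wrappers accurate — and the repeated training runs are what makes them slow.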

Page 8: Feature and Variable Selection in Classification

What are the options?

Embedded methods

Integrate feature selection into classifier.

The good:

Accuracy, training time

The bad:

Lacks flexibility

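A standard embedded method (one concrete instance, not necessarily the talk's): an L1-penalized logistic regression performs selection during training by driving irrelevant coefficients exactly to zero:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
y = rng.integers(0, 2, 500)
X = np.column_stack([y + 0.3 * rng.normal(size=500)] +        # informative
                    [rng.normal(size=500) for _ in range(5)])  # 5 noise columns

# The L1 penalty makes selection part of the fit itself.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
kept = np.flatnonzero(clf.coef_[0])    # features that survived training
```

The lack of flexibility is visible here too: the selection mechanism is baked into this one model family and cannot be reused with an arbitrary classifier.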

Page 9: Feature and Variable Selection in Classification

What should I use?

What is the best one?

Accuracy-wise: embedded or wrapper.
Complexity-wise: ranking or filters.
Why not both?


Page 10: Feature and Variable Selection in Classification

Examples

Probabilistic feature selection

For the model p(c|x) ∝ p(c) p(x|c):

Can be retrofitted with p(c) = p(M) p(c|M) for a model M.
More degrees of freedom spread the model thin.
Standard optimizations apply.

[Figure: probability assigned to possible data — a specific model concentrates its mass, a widely spread model spreads it thin]

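The "spread thin" intuition in one dimension (a toy illustration, not from the slides): a model with wider reach must dilute its probability over more possible datasets, so it assigns less to the data actually seen:

```python
import numpy as np
from math import pi

# A specific model concentrates its mass near the observed data;
# a widely spread model covers more possibilities but assigns each less.
data = np.array([0.1, -0.2, 0.05, 0.15, -0.1])

def log_likelihood(x, sigma):
    """Log density of the data under a zero-mean Gaussian with scale sigma."""
    return np.sum(-0.5 * np.log(2 * pi * sigma**2) - x**2 / (2 * sigma**2))

specific = log_likelihood(data, sigma=0.2)   # concentrated near the data
spread = log_likelihood(data, sigma=5.0)     # spread over a wide range
```

This is why extra degrees of freedom are penalized automatically in a fully probabilistic treatment: the bigger model must spread its probability thinner.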


Page 12: Feature and Variable Selection in Classification

Examples

Probabilistic feature selection

Akaike information criterion: every additional variable needs to explain e times as much data.
Bayesian information criterion: unused parameters are marginalized.
Minimum description length.

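The criteria in formula form (standard definitions, not derived in the talk): AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, so under AIC an extra parameter pays for itself only if it gains at least one nat of log-likelihood, i.e. a factor of e in likelihood:

```python
import numpy as np

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * np.log(n) - 2 * log_lik

# An extra parameter breaks even under AIC exactly when it buys
# one nat of log-likelihood, i.e. multiplies the likelihood by e.
base = aic(-100.0, k=3)
extra = aic(-99.0, k=4)     # +1 nat of fit, +1 parameter: same AIC
```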

Page 13: Feature and Variable Selection in Classification

Examples

Autoencoder

Deep neural network.
Create a fixed-size information bottleneck.
Train it to reconstruct the original data.

[Figure: autoencoder with layer widths 2000, 1000, 500 narrowing to a 30-unit bottleneck, then mirrored back up to reconstruct the input]

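A much smaller sketch of the idea (illustrative dimensions, not the network from the figure): train a network to map X back to X through a narrow hidden layer, then read the bottleneck activations as learned features:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
latent = rng.normal(size=(300, 2))        # true 2-dimensional signal
X = latent @ rng.normal(size=(2, 10))     # observed in 10 dimensions

# Target equals input: the 2-unit hidden layer is the bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(2,), activation="identity",
                  solver="lbfgs", max_iter=5000, random_state=0)
ae.fit(X, X)
codes = X @ ae.coefs_[0] + ae.intercepts_[0]   # bottleneck features
```

Anything the network cannot squeeze through the bottleneck is lost in the reconstruction, so a good reconstruction means the bottleneck codes capture the informative structure of the input.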

Page 14: Feature and Variable Selection in Classification

Prediction

Predictions

Embedded methods will improve more than other approaches.
The others remain useful as a first step, for complexity reasons.
