Download - 2 Task Embedding for Model Recommendation · Model recommendation Feature Extractor Zoo 3 X L(y) recommender Input: Task = (dataset, loss) For each feature extractor architecture

TASK2VEC: Task Embedding for Model RecommendationSubhransu MajiCollege of Information and Computer SciencesUniversity of Massachusetts, Amherst

http://people.cs.umass.edu/smaji

February 19, 2019 @ ICERM, Brown University

https://arxiv.org/abs/1902.03545

http://people.cs.umass.edu/smaji


Task Embedding for Model RecommendationAllesandro,Michael,Rahul,Avinash,Subhransu,Charless,Stefano,Pietro

2

Task={dataset,labels,loss}

Whataresimilartasks?WhatarchitectureshouldIuse?Whatpre-trainingdataset?Whathyperparameters?DoIneedmoretrainingdata?Howdifficultisthistask?...

IfwehaveauniversalvectorialrepresentationoftaskswecanframeallsortsofinterestingCVapplicationsengineeringproblemsasmeta-learningproblems

Model recommendation

3FeatureExtractorZoo

X L(y)

recommender

Input:Task=(dataset,loss)

ForeachfeatureextractorarchitectureF:1.TrainclassifieronF(dataset)2.Computevalidationperformance

Output:bestperformingmodel

BruteForce:

Input:Task=(dataset,loss)

1.Computetaskembeddingt=E(Task)2.PredictbestextractorF=M(t)2.TrainclassifieronF(dataset)3.Computevalidationperformance

Output:bestperformingmodel

Taskrecommendation:

1. Givenatask,trainaclassifierwiththetasklossonfeaturesfromageneric“probenetwork”

2. Computegradientsofprobenetworkparametersw.r.t.taskloss

3. Usestatisticsoftheprobeparametergradientsasthefixeddimensionaltaskembedding

4

Intuition:Fprovidesinformationaboutthesensitivityofthetaskperformancetosmallperturbationsofparametersintheprobenetwork

Ex⇠p̂

KLp

✓

0(y|x)p✓

(y|x) = �✓ · F · �✓ + o(�✓2),

Task embedding using Fisher Information

5

Properties of TASK2VECembedding

(xi, yi), i = 1 . . . n, yi 2 {0, 1}Dataset:

Classifier:

FIMforcrossentropylossforthelastlayer:

Fw =1

N

X

i

pi(1� pi)�(xi)�(xi)T

pi = �

�w

T�(xi)

�

Twolayernetwork

x ! �(x)

1. Invariancetolabelspace2. Encodestaskdifficulty3. Encodestaskdomain4. Encodesusefulfeaturesforthetask

6

Properties of TASK2VECembedding

(xi, yi), i = 1 . . . n, yi 2 {0, 1}Dataset:

Classifier:

FIMforcrossentropylossforthelastlayer:

Fw =1

N

X

i

pi(1� pi)�(xi)�(xi)T

pi = �

�w

T�(xi)

�

D =1

N

X

i

�(xi)�(xi)T

Representativedomainembedding

Properties of TASK2VECembedding1. Binarytasksonunitsquare,

i.e.,eachtileisatask2. 10RandomReLUfeatures,i.e.,

φᵢ=max(0,aᵢx+bᵢy+cᵢ)3. T-SNEtomap10x10FIMto2D

7

Properties of TASK2VECembedding1. Binarytasksonunitsquare,

i.e.,eachtileisatask2. 10RandomReLUfeatures,i.e.,

φᵢ=max(0,aᵢx+bᵢy+cᵢ)3. T-SNEtomap10x10FIMto2D

8Polynomialdegree3

9

Robust Fisher Computation

1. ForrealisticCVtaskswewanttousedeepCNNs(e.g.,ResNet)andestimateFIMforalltheparameters.

2. Challenge:FIMcanbehardtoestimate(noisylosslandscape;highdimensions;smalltrainingset)

3. RobustFIM1. Restrictittoadiagonal2. Restrictitasinglevalueper

filter(CNNlayer)3. Robustestimationvia

perturbation

EstimateΛofaGaussianperturbation:

OptimalΛsatisfies:

“TrivialEmbedding”

10

Similarity measures on the space of tasks

TaskA TaskB

Task={dataset,labels,loss}


Domainsimilarity

11

Unbiasedlookatdatasetbias,TorralbaandEfros,CVPR11


Domainsimilarity

Range/labelsimilarity• e.g.,Taxonomicdistance

12

https://www.pinterest.com/pin/520799144386337065/


Domainsimilarity

Range/labelsimilarity• e.g.,Taxonomicdistance

Transfer“distance”• Fine-tuneonafollowedbyb

13

Taskonomy:DisentanglingTaskTransferLearning,AmirZamir,AlexanderSax,WilliamShen,LeonidasGuibas,JitendraMalik,SilvioSavarese,CVPR18

Distance measures on TASK2VECembedding

Symmetricdistance

Asymmetric“distance”

14

MODEL2VEC: Joint embedding of tasks and models

1. Sofarwehavebeenassociatingmodels(featureextractors)withthetaskstheyaretrainedon.

2. Howabout1. legacy/black-boxfeatureextractors?E.g.,SIFT,HOG,Fishervector2. modelsofdifferentcomplexitytrainedonthesamedataset

3. MODEL2VEC:Jointlyembedfeatureextractors(encodedasone-hot-vectors)andtaskssuchthatsimilarityreflectsameta-taskobjective.1. Needstrainingdata

15

• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]

16

Task Zoo


17

Task Zoo


18

Task Zoo


• Fewtasks>10Ktrainingsamplesbutmosthave100-1000samples

19

Task Zoo

Experiment: TASK2VEC recapitulates iNaturalist taxonomy

Taskembeddingcosinesimilarity

plants

reptilesbirdsmammals

insects

ResNettrainedonImageNetasprobenetwork

20

Experiment: TASK2VEC norm encodes task difficulty

ResNettrainedonImageNetasprobenetwork

21

Task Embeddings Domain Embeddings

Actinopterygii (n)

Amphibia (n)

Arachnida (n)

Aves (n)

Fungi (n)

Insecta (n)

Mammalia (n)

Mollusca (n)

Plantae (n)

Protozoa (n)

Reptilia (n)

Category (m)

Color (m)

Gender (m)

Material (m)

Neckline (m)

Pants (m)

Pattern (m)

Shoes (m)

Experiment: TASK2VEC vs DOMAIN2VEC

22


23

• FeatureZoo[156experts]• ResNet-34pertainedonImageNet• Followedbyfine-tunedontaskswithenoughexamples

Task and Feature Zoo

FeatureExtractorZoo

X L(y)

recommender

The Matrix

24Featureextractors

Tasks The Matrix

25

iNaturalist+CUB

The Matrix

Experts

Tasks

26

ImageNet expert is usually good but on many tasks the best expert handily outperforms the ImageNet expert

27

Data efficiency of TASK2VEC

ImageNetfixed

ImageNetfinetune

Task2vecfinetune

Task2vecfixed

Bruteforcefixed

28

Choice of distance for TASK2VEC

Relativeerrorincreaseovertheoracle(bestchoice)

29

Choice of the probe network on TASK2VEC

Relativeerrorincreaseovertheoracle(bestchoice)

Thank you!

Task2Vec:TaskEmbeddingforMeta-Learning,AlessandroAchille,MichaelLam,RahulTewari,AvinashRavichandran,SubhransuMaji,CharlessFowlkes,StefanoSoatto,PietroPerona(https://arxiv.org/abs/1902.03545)

30