TASK2VEC: Task Embedding for Model RecommendationSubhransu MajiCollege of Information and Computer SciencesUniversity of Massachusetts, Amherst
http://people.cs.umass.edu/smaji
February 19, 2019 @ ICERM, Brown University
https://arxiv.org/abs/1902.03545
Task Embedding for Model RecommendationAllesandro,Michael,Rahul,Avinash,Subhransu,Charless,Stefano,Pietro
2
Task={dataset,labels,loss}
Whataresimilartasks?WhatarchitectureshouldIuse?Whatpre-trainingdataset?Whathyperparameters?DoIneedmoretrainingdata?Howdifficultisthistask?...
IfwehaveauniversalvectorialrepresentationoftaskswecanframeallsortsofinterestingCVapplicationsengineeringproblemsasmeta-learningproblems
Model recommendation
3FeatureExtractorZoo
X L(y)
recommender
Input:Task=(dataset,loss)
ForeachfeatureextractorarchitectureF:1.TrainclassifieronF(dataset)2.Computevalidationperformance
Output:bestperformingmodel
BruteForce:
Input:Task=(dataset,loss)
1.Computetaskembeddingt=E(Task)2.PredictbestextractorF=M(t)2.TrainclassifieronF(dataset)3.Computevalidationperformance
Output:bestperformingmodel
Taskrecommendation:
1. Givenatask,trainaclassifierwiththetasklossonfeaturesfromageneric“probenetwork”
2. Computegradientsofprobenetworkparametersw.r.t.taskloss
3. Usestatisticsoftheprobeparametergradientsasthefixeddimensionaltaskembedding
4
Intuition:Fprovidesinformationaboutthesensitivityofthetaskperformancetosmallperturbationsofparametersintheprobenetwork
Ex⇠p̂
KLp
✓
0(y|x)p✓
(y|x) = �✓ · F · �✓ + o(�✓2),
Task embedding using Fisher Information
5
Properties of TASK2VECembedding
(xi, yi), i = 1 . . . n, yi 2 {0, 1}Dataset:
Classifier:
FIMforcrossentropylossforthelastlayer:
Fw =1
N
X
i
pi(1� pi)�(xi)�(xi)T
pi = �
�w
T�(xi)
�
Twolayernetwork
x ! �(x)
1. Invariancetolabelspace2. Encodestaskdifficulty3. Encodestaskdomain4. Encodesusefulfeaturesforthetask
6
Properties of TASK2VECembedding
(xi, yi), i = 1 . . . n, yi 2 {0, 1}Dataset:
Classifier:
FIMforcrossentropylossforthelastlayer:
Fw =1
N
X
i
pi(1� pi)�(xi)�(xi)T
pi = �
�w
T�(xi)
�
D =1
N
X
i
�(xi)�(xi)T
Representativedomainembedding
Properties of TASK2VECembedding1. Binarytasksonunitsquare,
i.e.,eachtileisatask2. 10RandomReLUfeatures,i.e.,
φᵢ=max(0,aᵢx+bᵢy+cᵢ)3. T-SNEtomap10x10FIMto2D
7
Properties of TASK2VECembedding1. Binarytasksonunitsquare,
i.e.,eachtileisatask2. 10RandomReLUfeatures,i.e.,
φᵢ=max(0,aᵢx+bᵢy+cᵢ)3. T-SNEtomap10x10FIMto2D
8Polynomialdegree3
9
Robust Fisher Computation
1. ForrealisticCVtaskswewanttousedeepCNNs(e.g.,ResNet)andestimateFIMforalltheparameters.
2. Challenge:FIMcanbehardtoestimate(noisylosslandscape;highdimensions;smalltrainingset)
3. RobustFIM1. Restrictittoadiagonal2. Restrictitasinglevalueper
filter(CNNlayer)3. Robustestimationvia
perturbation
EstimateΛofaGaussianperturbation:
OptimalΛsatisfies:
“TrivialEmbedding”
10
Similarity measures on the space of tasks
TaskA TaskB
Task={dataset,labels,loss}
Similarity measures on the space of tasks
Domainsimilarity
11
Unbiasedlookatdatasetbias,TorralbaandEfros,CVPR11
Similarity measures on the space of tasks
Domainsimilarity
Range/labelsimilarity• e.g.,Taxonomicdistance
12
https://www.pinterest.com/pin/520799144386337065/
Similarity measures on the space of tasks
Domainsimilarity
Range/labelsimilarity• e.g.,Taxonomicdistance
Transfer“distance”• Fine-tuneonafollowedbyb
13
Taskonomy:DisentanglingTaskTransferLearning,AmirZamir,AlexanderSax,WilliamShen,LeonidasGuibas,JitendraMalik,SilvioSavarese,CVPR18
Distance measures on TASK2VECembedding
Symmetricdistance
Asymmetric“distance”
14
MODEL2VEC: Joint embedding of tasks and models
1. Sofarwehavebeenassociatingmodels(featureextractors)withthetaskstheyaretrainedon.
2. Howabout1. legacy/black-boxfeatureextractors?E.g.,SIFT,HOG,Fishervector2. modelsofdifferentcomplexitytrainedonthesamedataset
3. MODEL2VEC:Jointlyembedfeatureextractors(encodedasone-hot-vectors)andtaskssuchthatsimilarityreflectsameta-taskobjective.1. Needstrainingdata
15
• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]
16
Task Zoo
• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]
17
Task Zoo
• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]
18
Task Zoo
• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]
• Fewtasks>10Ktrainingsamplesbutmosthave100-1000samples
19
Task Zoo
Experiment: TASK2VEC recapitulates iNaturalist taxonomy
Taskembeddingcosinesimilarity
plants
reptilesbirdsmammals
insects
ResNettrainedonImageNetasprobenetwork
20
Experiment: TASK2VEC norm encodes task difficulty
ResNettrainedonImageNetasprobenetwork
21
Task Embeddings Domain Embeddings
Actinopterygii (n)
Amphibia (n)
Arachnida (n)
Aves (n)
Fungi (n)
Insecta (n)
Mammalia (n)
Mollusca (n)
Plantae (n)
Protozoa (n)
Reptilia (n)
Category (m)
Color (m)
Gender (m)
Material (m)
Neckline (m)
Pants (m)
Pattern (m)
Shoes (m)
Experiment: TASK2VEC vs DOMAIN2VEC
22
• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]
23
• FeatureZoo[156experts]• ResNet-34pertainedonImageNet• Followedbyfine-tunedontaskswithenoughexamples
Task and Feature Zoo
FeatureExtractorZoo
X L(y)
recommender
The Matrix
24Featureextractors
Tasks The Matrix
25
iNaturalist+CUB
The Matrix
Experts
Tasks
26
ImageNet expert is usually good but on many tasks the best expert handily outperforms the ImageNet expert
27
Data efficiency of TASK2VEC
ImageNetfixed
ImageNetfinetune
Task2vecfinetune
Task2vecfixed
Bruteforcefixed
28
Choice of distance for TASK2VEC
Relativeerrorincreaseovertheoracle(bestchoice)
29
Choice of the probe network on TASK2VEC
Relativeerrorincreaseovertheoracle(bestchoice)
Thank you!
Task2Vec:TaskEmbeddingforMeta-Learning,AlessandroAchille,MichaelLam,RahulTewari,AvinashRavichandran,SubhransuMaji,CharlessFowlkes,StefanoSoatto,PietroPerona(https://arxiv.org/abs/1902.03545)
30
Top Related