Transcript of Tensor Contraction for Parsimonious Deep Nets (tensorlab.cms.caltech.edu/users/...presentation.pdf)
Tensor Contraction for Parsimonious Deep Nets
Jean Kossaifi, Aran Khanna, Zachary C. Lipton, Tommaso Furlanello and Anima Anandkumar
AMAZON AI
Traditional approaches
• DATA => CONV => RELU => POOL => activation tensor
• Flattening loses information
• Can we leverage the activation tensor directly, before flattening?
• Potential space savings
• Performance improvement
Tensor Contraction
• Tensor contraction: contract along each mode to obtain a low-rank, compact tensor
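The per-mode contraction described above can be sketched in plain numpy. This is an illustrative example, not the authors' implementation; the tensor sizes and target ranks are hypothetical, and the factor matrices are random stand-ins for learned parameters.

```python
import numpy as np

def mode_dot(tensor, matrix, mode):
    """Contract `tensor` with `matrix` along the given mode.

    `matrix` has shape (new_dim, tensor.shape[mode]); the result replaces
    that mode's dimension with new_dim.
    """
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    # tensordot puts the new axis first; move it back to position `mode`
    return np.moveaxis(out, 0, mode)

# Activation tensor (batch omitted): channels x height x width
X = np.random.randn(64, 32, 32)

# Hypothetical target ranks (16, 8, 8): one factor matrix per mode
U = [np.random.randn(16, 64), np.random.randn(8, 32), np.random.randn(8, 32)]

# Contract along each mode in turn to get the low-rank, compact tensor
Xc = X
for mode, factor in enumerate(U):
    Xc = mode_dot(Xc, factor, mode)

print(Xc.shape)  # (16, 8, 8)
```

In practice the same operation is available in TensorLy, which the authors use; the numpy version is shown only so the contraction itself is explicit.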
Tensor Contraction Layers (TCL)
• Take the activation tensor as input
• Feed it through a TCL
• Output a low-rank activation tensor
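The three bullets above can be sketched as a minimal forward pass. This is a hedged illustration, not the paper's code: the class name, shapes, and ranks are hypothetical, the factor matrices (the layer's learnable weights) are randomly initialised, and the batch mode is left uncontracted.

```python
import numpy as np

class TCL:
    """Sketch of a Tensor Contraction Layer (forward pass only).

    Contracts every non-batch mode of the input activation tensor with a
    learnable factor matrix, producing a low-rank activation tensor.
    """

    def __init__(self, input_shape, ranks):
        # One factor matrix per non-batch mode (random init for illustration)
        self.factors = [np.random.randn(r, d) * 0.01
                        for r, d in zip(ranks, input_shape)]

    def forward(self, x):
        # x: (batch, d0, d1, ...); mode 0 is the batch and is left untouched
        for mode, U in enumerate(self.factors, start=1):
            x = np.moveaxis(np.tensordot(U, x, axes=(1, mode)), 0, mode)
        return x

# Hypothetical activation tensor after the last pooling layer
layer = TCL(input_shape=(256, 7, 7), ranks=(64, 5, 5))
out = layer.forward(np.random.randn(32, 256, 7, 7))
print(out.shape)  # (32, 64, 5, 5)
```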
Tensor Contraction Layers (TCL)
• Compact representation with fewer parameters
• Measured as the percentage of space savings in the fully connected layers:
1 − n_TCL / n_original
• Similar and sometimes better performance
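The space-savings formula above is easy to evaluate for a concrete layer. The numbers below are purely illustrative (a hypothetical 256x7x7 activation contracted to 64x5x5 before a 4096-unit dense layer), not figures from the slides.

```python
# Space savings in the fully connected layers: 1 - n_TCL / n_original
def space_savings(n_tcl, n_original):
    return 1 - n_tcl / n_original

# Hypothetical example: dense layer weights on the flattened activation
n_original = 256 * 7 * 7 * 4096

# With a TCL: factor matrices plus the dense layer on the contracted tensor
n_tcl = (64 * 256 + 5 * 7 + 5 * 7) + 64 * 5 * 5 * 4096

print(f"{100 * space_savings(n_tcl, n_original):.1f}% space savings")  # 87.2%
```

The factor matrices themselves are tiny compared with the dense layer they shrink, which is why the savings are dominated by the reduced flattened size.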
Performance of the TCL
• Trained end-to-end
• On ImageNet with VGG:
  • 65.9% space savings
  • performance drop of only 0.6%
• On ImageNet with AlexNet:
  • 56.6% space savings
  • performance improvement of 0.5%
Low-rank tensor regression
Tensor Regression Networks, J. Kossaifi, Z. C. Lipton, A. Khanna, T. Furlanello and A. Anandkumar, arXiv preprint
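Low-rank tensor regression replaces flattening plus a fully connected layer with a regression whose weight tensor is kept in low-rank (Tucker) form. The sketch below is a hedged illustration under assumed shapes and random parameters, not the paper's implementation; the key point is that the full weight tensor is never materialised.

```python
import numpy as np

def mode_dot(t, m, mode):
    # Contract tensor t with matrix m (new_dim x t.shape[mode]) along `mode`
    return np.moveaxis(np.tensordot(m, t, axes=(1, mode)), 0, mode)

class TRL:
    """Sketch of a low-rank Tensor Regression Layer (forward pass only).

    The regression weight tensor (input_shape x n_outputs) is stored in
    Tucker form: a small core plus one factor matrix per mode. That
    factorisation is what yields the parameter savings.
    """

    def __init__(self, input_shape, ranks, n_outputs):
        # Tucker core and factors, randomly initialised for illustration
        self.core = np.random.randn(*ranks) * 0.01          # (r0, r1, r2, r_out)
        self.factors = [np.random.randn(d, r) * 0.01
                        for d, r in zip(input_shape, ranks[:-1])]
        self.out_factor = np.random.randn(n_outputs, ranks[-1]) * 0.01

    def forward(self, x):
        # x: (batch, d0, d1, d2); project each non-batch mode onto the factors
        for mode, U in enumerate(self.factors, start=1):
            x = mode_dot(x, U.T, mode)                      # (batch, r0, r1, r2)
        # Contract with the core over all non-batch modes, then map to outputs
        z = np.tensordot(x, self.core, axes=([1, 2, 3], [0, 1, 2]))
        return z @ self.out_factor.T                        # (batch, n_outputs)

# Hypothetical rank, matching the (r, 1, 1, r) pattern used in the results
trl = TRL(input_shape=(256, 7, 7), ranks=(50, 3, 3, 50), n_outputs=1000)
y = trl.forward(np.random.randn(8, 256, 7, 7))
print(y.shape)  # (8, 1000)
```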
Performance and rank
Performance of the TRL
TRL rank          Top-1 (%)   Top-5 (%)   Space savings (%)
baseline          77.1        93.4        0
(200, 1, 1, 200)  77.1        93.2        68.2
(150, 1, 1, 150)  76.0        92.9        76.6
(100, 1, 1, 100)  74.6        91.7        84.6
(50, 1, 1, 50)    73.6        91.0        92.4
Results on ImageNet with a ResNet-101
• 92.4% space savings, 4% decrease in Top-1 accuracy
• 68.2% space savings, no decrease in Top-1 accuracy
Implementation
• MXNet as the deep learning framework: http://mxnet.io/
• TensorLy for tensor operations: https://tensorly.github.io
• Coming soon: MXNet backend for TensorLy, for tensor operations on GPU and CPU
Conclusion and future work
• Tensor contraction and tensor regression for deep neural nets
• Add as an additional layer or replace an existing one
• Fewer parameters
• Similar or even better performance
Future work
• Faster tensor operations
• Explore more tensor operations and network architectures