Deep Convolutional Ranking for Multilabel Image...
Transcript of Deep Convolutional Ranking for Multilabel Image...
![Page 1: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/1.jpg)
Deep Convolutional Ranking for
Multilabel Image Annotation
Xie, Jiayu Liu, Xiao
![Page 2: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/2.jpg)
• features based on Deep Neural Networks
• a significant performance gain could be obtained by combining convolutional architectures with approximate top-k ranking objectives
• multiple labels to one image
Introduction
![Page 3: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/3.jpg)
NUS-WIDE: Image from Flickr, 81 tags Each image is resized to 256*256, then 220*220 patches are extracted from the whole image, at the center and the four corners to provide an augmentation of the dataset
Dataset
Flickr tag: uk hairy nature field animal scotland ginger cow cattle unfound coo mugdock highlandcowNUS-WIDE tag:animal cloud cow grass
![Page 4: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/4.jpg)
Five convolutional layers, some followed by max-pooling layers three fully-connected layers with final softmax
Network Architecture
![Page 5: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/5.jpg)
SoftmaxThe output of f (x) is a scoring function of the data point x
• fj(xi) means the activation value for image xi and class j
• c+ denotes the number of positive labels for each image
• m maybe is batch-size
Multilabel Ranking Losses (1/3)
![Page 6: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/6.jpg)
Pairwise Ranking• The output of f (x) is a scoring function of the data point x
• fj(xi) means the activation value for image xi and class j
• c+ is the positive labels and c_ is the negative labels
• rank the positive labels have higher scores than negative labels
Multilabel Ranking Losses (2/3)
![Page 7: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/7.jpg)
Weighted Approximate Ranking (WARP)• The output of f (x) is a scoring function of the data point x
• fj(xi) means the activation value for image xi and class j
• c+ is the positive labels and c_ is the negative labels
• rank the positive labels have higher scores than negative labels
Multilabel Ranking Losses (3/3)
![Page 8: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/8.jpg)
Baseline
Visual Feature:• GIST (Gradient-domain Image STitching): 960d• SIFT (Scale-Invariant Feature Transform): 5000d *6
2 methods (dense, Harris corner) *3 descriptors (SIFT, CSIFT, RGBSIFT)
• HOG (Histogram of Oriented Gradient): 5000d• Color (RGB histogram): 512dUse Kernel PCA to reduce each dimensionality to 500d (36,472 ⇒ 4,500 dimensional feature vector)
Learning Algorithms:• Visual Feature + kNN• Visual Feature + SVM
![Page 9: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/9.jpg)
Experiments
![Page 10: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/10.jpg)
Result (WARP)
![Page 11: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/11.jpg)
Results
![Page 12: Deep Convolutional Ranking for Multilabel Image Annotationvicente/recognition/2016/presentations/deep-ranking.pdfdenotes the number of positive labels for each image •m maybe is](https://reader033.fdocuments.us/reader033/viewer/2022050215/5f60c44823ab9443f4274f33/html5/thumbnails/12.jpg)
Questions?