
Hyper-class Augmented and Regularized Deep Learning for Fine-grained Image Classification

Saining Xie1, Tianbao Yang2, Xiaoyu Wang3, Yuanqing Lin4

1University of California, San Diego. 2University of Iowa. 3Snapchat Research. 4NEC Labs America, Inc.

Fine-grained image classification (FGIC) is challenging because (i) fine-grained labeled data is much more expensive to acquire (usually requiring domain expertise), and (ii) there is large intra-class and small inter-class variance. In this paper, we propose a systematic framework for learning a deep CNN that addresses these challenges from two new perspectives: (i) identifying easily annotated hyper-classes inherent in the fine-grained data, acquiring a large number of hyper-class-labeled images from readily available external sources, and formulating the problem as multi-task learning, to address the data scarcity issue. We use two common types of hyper-classes to augment our data: super-type hyper-classes, which subsume a set of fine-grained classes, and factor-type hyper-classes (e.g., different viewpoints of a car), which explain the large intra-class variance. (ii) A novel learning model that exploits a regularization between the fine-grained recognition model and the hyper-class recognition model, to mitigate the issue of large intra-class variance and improve generalization performance. The proposed approach also closely relates to attribute-based learning, since factor-type hyper-classes are (or can be generalized to) object attributes. We demonstrate the success of the proposed framework on two small-scale fine-grained datasets (Stanford Dogs and Stanford Cars) and on a large-scale car dataset that we collected.

…"

……."(a) Super-type

…"

……."(b) Factor-type

Figure 1: Two types of relationships between hyper-classes and fine-grained classes.

Hyper-class Regularized Learning

Since a factor-type hyper-class can be considered a hidden variable that generates the fine-grained class, we model Pr(y|x) by marginalizing over it:

\Pr(y|x) = \sum_{v=1}^{K} \Pr(y|v,x)\,\Pr(v|x)    (1)

where Pr(v|x) is the probability of any factor-type hyper-class v, and Pr(y|v,x) specifies the probability of any fine-grained class given the factor-type hyper-class and the input image x. If we let h(x) denote the high-level features of x, we model the probability Pr(v|x) by a softmax function

Pr(v|x) = (exp(u>v h(x)))/(K

∑v′=1

exp(u>v′h(x))) (2)

where {u_v} denote the weights of the hyper-class classification model. Given the factor-type hyper-class v and the high-level features h(x), the probability Pr(y|v,x) is computed by

Pr(y = c|v,x) = (exp(w>v,ch(x)))/(C

∑c=1

exp(w>v,ch(x))) (3)

where {w_{v,c}} denote the weights of the factor-specific fine-grained recognition model. Putting together (2) and (3), we obtain the following predictive probability for a specific fine-grained class, which we use to make the final predictions:

\Pr(y=c|x) = \sum_{v=1}^{K} \frac{\exp(w_{v,c}^\top h(x))}{\sum_{c'=1}^{C} \exp(w_{v,c'}^\top h(x))} \cdot \frac{\exp(u_v^\top h(x))}{\sum_{v'=1}^{K} \exp(u_{v'}^\top h(x))}    (4)
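To make the marginalization in (4) concrete, the following is a minimal NumPy sketch of the factor-type predictive probability. It is an illustration under assumed shapes, not the authors' implementation; the names h, U, W, predict and the dimensions D, K, C are ours.

import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def predict(h, U, W):
    """Pr(y = c | x) for all fine-grained classes c, per Eq. (4).

    h : (D,)      high-level features h(x)
    U : (K, D)    hyper-class weights {u_v}
    W : (K, C, D) factor-specific fine-grained weights {w_{v,c}}
    """
    p_v = softmax(U @ h)                  # Pr(v|x), Eq. (2), shape (K,)
    p_y_given_v = softmax(W @ h, axis=1)  # Pr(y|v,x), Eq. (3), shape (K, C)
    return p_v @ p_y_given_v              # sum over v, Eq. (4), shape (C,)

# Toy usage: 4096-d features (e.g., an AlexNet fc7 layer), an assumed
# K = 8 viewpoint hyper-classes, and C = 196 classes as in Stanford Cars.
rng = np.random.default_rng(0)
D, K, C = 4096, 8, 196
p = predict(rng.normal(size=D), rng.normal(size=(K, D)), rng.normal(size=(K, C, D)))
assert np.isclose(p.sum(), 1.0)  # a convex combination of distributions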

We introduce the following regularization between {w_{v,c}} and {u_v}:

R(\{w_{v,c}\}, \{u_v\}) = \frac{\beta}{2} \sum_{v=1}^{K} \sum_{c=1}^{C} \|w_{v,c} - u_v\|_2^2    (5)

The regularization is responsible for transferring knowledge to the per-viewpoint category classifiers, and thus helps model the intra-class variance in the fine-grained task.
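As an illustration of how (5) acts on the weights, here is a hedged NumPy sketch using the same assumed shapes as above; the helper name and the returned gradient are ours, not from the paper.

import numpy as np

def hyper_class_regularizer(W, U, beta):
    """Eq. (5): (beta / 2) * sum over v, c of ||w_{v,c} - u_v||_2^2.

    W : (K, C, D) factor-specific fine-grained weights {w_{v,c}}
    U : (K, D)    hyper-class weights {u_v}
    """
    diff = W - U[:, None, :]      # broadcast u_v across the C fine-grained classes
    value = 0.5 * beta * np.sum(diff ** 2)
    grad_W = beta * diff          # dR/dw_{v,c} = beta * (w_{v,c} - u_v)
    return value, grad_W

The gradient beta * (w_{v,c} - u_v) shrinks each factor-specific classifier toward the shared hyper-class weights, which is the knowledge-transfer effect described above.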

This is an extended abstract. The full paper is available at the Computer Vision Foundation webpage.

Figure 2: Network structures. Both (a) HA-CNN and (b) HAR-CNN map inputs through lower-level features and high-level features to classifiers, trained with a softmax loss on hyper-class data and a softmax loss on fine-grained data; HAR-CNN additionally applies the hyper-class regularization.

For super-type hyper-class regularized deep learning, the only difference is in Pr(y|v,x), which can simply be modeled by

\Pr(y=c|v_c,x) = \frac{\exp(w_{v_c,c}^\top h(x))}{\sum_{c'=1}^{C} \exp(w_{v_{c'},c'}^\top h(x))}

since the super-type hyper-class v_c is implicitly indicated by the fine-grained label c. The regularization then becomes

R(\{w_{v_c,c}\}, \{u_v\}) = \frac{\beta}{2} \sum_{c=1}^{C} \|w_{v_c,c} - u_{v_c}\|_2^2    (6)
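The super-type variant (6) admits the same kind of sketch. Here, parent is a hypothetical index array mapping each fine-grained class c to its super-class v_c; as before, the names and shapes are assumptions for illustration.

import numpy as np

def super_type_regularizer(W, U, parent, beta):
    """Eq. (6): (beta / 2) * sum over c of ||w_{v_c,c} - u_{v_c}||_2^2.

    W      : (C, D) one fine-grained classifier w_{v_c,c} per class c
    U      : (K, D) super-type hyper-class weights {u_v}
    parent : (C,)   integer index v_c of the super-class containing class c
    """
    diff = W - U[parent]          # align each class with its own super-class weight
    return 0.5 * beta * np.sum(diff ** 2)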

Experimental Results

Table 1: Accuracy on the Stanford Cars [2] dataset.

Method               Accuracy (%)
LLC                  69.5
ELLF                 73.9
ImageNet-Feat-LR     54.1
CNN                  68.6
HA-CNN-Random        69.8
FT-CNN               83.1
HA-CNN (ours)        76.7
HAR-CNN (ours)       80.8
FT-HA-CNN (ours)     83.5
FT-HAR-CNN (ours)    86.3

Table 2: Accuracy on the Stanford Dogs dataset.

Method               Accuracy (%)
UGA                  57.0
Gnostic Fields       47.7
CNN                  42.3
HA-CNN (ours)        48.3
HAR-CNN (ours)       49.4

Table 3: Accuracy on the Large-scale Cars dataset.

Method               Accuracy (%)
ImageNet-Feat-LR     42.8
CNN                  81.6
HA-CNN (ours)        82.4
HAR-CNN (ours)       83.6

Our network structure and experimental settings are built upon AlexNet [3]. The experimental results demonstrate that, when training on a small-scale dataset from scratch and without any fine-tuning, the proposed approach enables us to train a model that yields reasonably good performance. When integrated into the ImageNet fine-tuning process [1], our approach significantly outperforms the current state of the art on the Stanford Cars dataset. To further explore whether the proposed framework remains useful when training on large-scale FGIC data, we collected a large dataset and performed experiments similar to those for the Stanford Cars data.

[1] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 580–587, 2014.

[2] Jonathan Krause, Jia Deng, Michael Stark, and Li Fei-Fei. Collecting a large-scale dataset of fine-grained cars. In The Second Workshop on Fine-Grained Visual Categorization, 2013.

[3] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NIPS), pages 1097–1105, 2012.