CNN-Based Incremental Learning Strategies for Face Recognition



Vincenzo Lomonaco and Davide Maltoni

Biometric System Lab – DISI, University of Bologna

Contacts

Emails: {vincenzo.lomonaco, davide.maltoni}@unibo.it

Websites: www.vincenzolomonaco.com, http://bias.csr.unibo.it/maltoni/

References

[1] Franco, A., Maio, D., Maltoni, D.: The Big Brother Database: Evaluating Face Recognition in Smart Home Environments. pp. 142–150 (2009).

[2] Franco, A., Maio, D., Maltoni, D.: Incremental template updating for face recognition in home environments. Pattern Recognit. 43, 2891–2903 (2010).

[3] Maltoni, D., Lomonaco, V.: Semi-supervised Tuning from Temporal Coherence. Tech. Report, DISI - University of Bologna. http://arxiv.org/pdf/1511.03163v3.pdf. pp. 1–14 (2015).

Abstract

In the last decade, Convolutional Neural Networks (CNNs) have been shown to perform remarkably well on face recognition tasks, coping with large occlusions, extremely low resolutions, strong illumination variations, etc. However, partly because of their complex training and tricky hyper-parameter tuning, CNNs have been scarcely studied in the context of incremental learning. In this work we compare different incremental learning strategies for CNN-based architectures in the context of face recognition.

Incremental Learning Strategies

One possible approach to this incremental scenario is to store all previously seen data and retrain the model from scratch as soon as a new batch of data becomes available. However, this solution is often impractical for real-world systems where memory and computational resources are subject to stiff constraints.

A different approach is to update the model based only on the newly available batch of data. We compare three strategies of this kind: an ad-hoc CNN trained from scratch (LeNet7), a pre-trained CNN used as a fixed feature extractor feeding an SVM (CaffeNet + SVM, VGG + SVM), and a pre-trained CNN fine-tuned on the new data (CaffeNet + FT, VGG + FT). A minimal sketch contrasting the two update regimes is given below.
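To make the trade-off concrete, here is a minimal sketch of the two regimes. Everything in it is an illustrative stand-in (the name make_day_batch, the synthetic features, and a linear SGD classifier in place of the CNN-based models), not the authors' pipeline:

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
classes = np.arange(7)  # 7 subjects, as in SETB
n_days, n_feat = 5, 64

def make_day_batch(day):
    """Stand-in for one day of the updating set (random features here)."""
    X = rng.normal(size=(30, n_feat))
    y = rng.choice(classes, size=30)
    return X, y

# Regime 1: cumulative retraining -- store everything seen so far and refit
# from scratch at each new batch (accurate, but memory/compute hungry).
seen_X, seen_y = [], []
for day in range(n_days):
    X, y = make_day_batch(day)
    seen_X.append(X)
    seen_y.append(y)
    cumulative = SGDClassifier().fit(np.vstack(seen_X), np.concatenate(seen_y))

# Regime 2: incremental update -- touch only the newest batch
# (cheap, but exposed to forgetting).
incremental = SGDClassifier()
for day in range(n_days):
    X, y = make_day_batch(day)
    incremental.partial_fit(X, y, classes=classes)  # classes required on 1st call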

Big Brother dataset

The BigBrother dataset (SETB) [1] was created from 2 DVDs made commercially available at the end of the 2006 edition of the "Big Brother" reality show produced for Italian TV. It consists of 14,675 gray-scale face images (70×70 pixels) belonging to 7 subjects, often characterized by bad lighting, poor focus, occlusions, and non-frontal pose. In addition to the typical training and test sets, it provides a large additional set of images called the "updating set", split into 54 days, for incremental learning/tuning purposes.

Figure 1. The seven subjects of the Big Brother dataset (SETB).

Experiments and results

Figure 2. Accuracy of the different strategies tested on the SETB of the Big Brother dataset.

Figure 3. The impact of the learning rate on forgetting.

Final Acc. %        LeNet7   CaffeNet + SVM   VGG + SVM   CaffeNet + FT   VGG + FT
34 Days Split       82.35%   80.10%           96.96%      73.23%          91.39%
Orig. Days Split    75.33%   75.13%           96.73%      70.23%          89.58%
Cumulative Days     90.50%   86.79%           97.65%      84.26%          95.51%
Gain                +7.03%   +4.97%           +0.23%      +3.00%          +1.81%
Loss                -8.15%   -6.69%           -0.69%      -11.03%         -4.12%

Table 1. Accuracy gain and loss of the 34 Days split with respect to the Original and Cumulative days splits, respectively (Gain = 34 Days − Orig. Days; Loss = 34 Days − Cumulative Days).

Conclusions

• Forgetting can be a very detrimental issue: hence, when possible (i.e., transfer learning from the same domain), it is preferable to use the CNN as a fixed feature extractor feeding an incremental classifier (see the sketch below).

• If the features are not optimized (transfer learning from a different domain), tuning the low-level layers may be preferable, and the learning strength (i.e., learning rate, number of iterations, etc.) can be used to control forgetting.

• Training a CNN from scratch can be advantageous if the problem patterns (and feature invariances) are highly specific and a sufficient number of samples is available.
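As a companion to the first two points, here is a minimal sketch of the "pre-trained CNN + SVM" and "pre-trained CNN + FT" strategies. Everything in it is an illustrative stand-in (a tiny random convnet in place of CaffeNet/VGG, synthetic 70×70 batches in place of the SETB days, scikit-learn's SGDClassifier in place of the SVM), not the authors' actual setup:

import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import SGDClassifier

# Stand-in for a pre-trained CNN body: 70x70 grayscale faces -> 128-d features.
extractor = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten())

# --- "Pre-trained CNN + SVM": freeze the CNN, update only the classifier. ---
for p in extractor.parameters():
    p.requires_grad_(False)  # frozen features: the representation cannot forget
classes = np.arange(7)       # 7 subjects, as in SETB
clf = SGDClassifier()        # incremental linear classifier (SVM stand-in)

rng = np.random.default_rng(0)
for day in range(5):         # stand-in for the day-batches of the updating set
    imgs = torch.from_numpy(rng.normal(size=(30, 1, 70, 70)).astype("float32"))
    labels = rng.choice(classes, size=30)
    with torch.no_grad():
        feats = extractor(imgs).numpy()
    clf.partial_fit(feats, labels, classes=classes)  # update on the new batch only

# --- "Pre-trained CNN + FT": unfreeze and fine-tune on the new batch. ---
for p in extractor.parameters():
    p.requires_grad_(True)
head = nn.Linear(8 * 4 * 4, len(classes))
opt = torch.optim.SGD([
    {"params": extractor.parameters(), "lr": 1e-4},  # low LR: adapt slowly, forget less
    {"params": head.parameters(), "lr": 1e-2},
])
loss_fn = nn.CrossEntropyLoss()
imgs = torch.from_numpy(rng.normal(size=(30, 1, 70, 70)).astype("float32"))
labels = torch.from_numpy(rng.choice(classes, size=30))
loss = loss_fn(head(extractor(imgs)), labels)
opt.zero_grad()
loss.backward()
opt.step()  # one tuning step; learning rate / #iterations control forgetting

Freezing the extractor rules out forgetting in the representation by construction; in the fine-tuning variant, the per-group learning rates are the "learning strength" knob whose effect on forgetting Figure 3 illustrates.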
