Automatic Attendance System using CNN

AUTOMATIC ATTENDANCE SYSTEM By: Pinaki Ranjan Sarkar Under the guidance of: Dr. Gorthi R.K.S.S. Manyam & Dr. Deepak Mishra

Transcript of Automatic Attendance System using CNN

Page 1: Automatic Attendance System using CNN

AUTOMATIC ATTENDANCE SYSTEM

By: Pinaki Ranjan Sarkar

Under the guidance of: Dr. Gorthi R.K.S.S. Manyam & Dr. Deepak Mishra

Page 2: Automatic Attendance System using CNN

OUTLINE

▪ Motivation

▪ Objective

▪ System Requirements

▪ Design Details

▪ Tried methods

▪ Inspiration

▪ Main design

▪ Status so far

▪ Future work

Page 3: Automatic Attendance System using CNN

MOTIVATION

▪ Taking attendance in large classes is:

▪ Cumbersome

▪ Repetitive

▪ Consumes valuable class time

▪ What if we make an efficient face detection and recognition system for this task?

Page 4: Automatic Attendance System using CNN

OBJECTIVES

▪ Automatic user identification via face detection and recognition.

▪ Develop and implement an efficient face detection and recognition system.

▪ End-to-end face recognition system using deep learning.

Page 5: Automatic Attendance System using CNN

DIFFICULTIES

▪ Large pose variation

▪ Hidden faces & tiny faces

▪ Different illumination conditions, occlusions

Page 6: Automatic Attendance System using CNN

SYSTEM REQUIREMENTS

▪ Hardware:

▪ A camera

▪ PC or Raspberry Pi

▪ Software:

▪ MATLAB 2013+

▪ Python 2.7

▪ Lasagne API

Page 7: Automatic Attendance System using CNN

DESIGN DETAILS

[Figure: pipeline with three blocks: Database, Face Detection, Face Recognition]

Example attendance output:

▪ Abhi - 1

▪ Priya - 1

▪ Ayushi - 0

▪ Pinaki - 0

▪ Akshay - 1

▪ Sidd - 1

All are using CNN!!
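
The last step of the pipeline, turning recognized names into the 1/0 list above, can be sketched as follows. This is a minimal illustration only: `mark_attendance`, the roster, and the list of recognized names are hypothetical and stand in for the output of the recognition stage, and reading 1 as "present" is an assumption.

```python
def mark_attendance(roster, recognized):
    """Return {name: 1 or 0} for every student on the roster.

    roster: names stored in the database.
    recognized: names returned by the face recognition stage
                (hypothetical input; duplicates are allowed).
    """
    seen = set(recognized)
    return {name: int(name in seen) for name in roster}


roster = ["Abhi", "Priya", "Ayushi", "Pinaki", "Akshay", "Sidd"]
# Hypothetical recognizer output for one class photo.
recognized = ["Abhi", "Priya", "Akshay", "Sidd", "Sidd"]

attendance = mark_attendance(roster, recognized)
for name in roster:
    print("{} - {}".format(name, attendance[name]))
```

A face recognized twice in the same photo still counts once, because the names are collected in a set.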

Page 8: Automatic Attendance System using CNN

GOING DEEP INTO FACE RECOGNITION

▪ Various methods are employed to recognize a person in the wild.

▪ Compared with traditional handcrafted features such as high-dimensional LBP, the Active Appearance Model (AAM), the Active Shape Model (ASM), Bayesian face, or Gaussian face, automatically learnt deep features based on personal identity are more advantageous.

▪ In most deep-learning-based face recognition methods, the inputs to the deep model are aligned face images.

Page 9: Automatic Attendance System using CNN

TRIED METHODS

▪ Tal Hassner, Shai Harel, Eran Paz, Roee Enbar, "Effective Face Frontalization in Unconstrained Images", CVPR-2015

Page 11: Automatic Attendance System using CNN

TRIED METHODS

▪ Zhu, Xiangyu, et al. "Face alignment across large poses: A 3D solution." CVPR-2016

Page 14: Automatic Attendance System using CNN

TRIED METHODS

▪ I tried to implement some more papers, but they failed under large pose variation.

▪ Instead of AAM or a 3D fitted model (3D frontalisation doesn't show significant improvements over simple 2D alignment*), we used deep learning techniques to recognize a face using only personal identity clues.

* Banerjee, Sandipan, et al. "To Frontalize or Not To Frontalize: Do We Really Need Elaborate Pre-Processing to Improve Face Recognition Performance?." arXiv preprint arXiv:1610.04823 (2016).

Page 15: Automatic Attendance System using CNN

INSPIRATION

▪ Our work is inspired by some of the state-of-the-art papers.

▪ DeepFace: Closing the Gap to Human-Level Performance in Face Verification, CVPR-2014

▪ FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR-2015

▪ DeepID3: Face Recognition with Very Deep Neural Networks, arXiv-2015

▪ Supervised Transformer Network for Efficient Face Detection, ECCV-2016

▪ Towards End-to-End Face Recognition through Alignment Learning, arXiv-2017

▪ Spatial Transformer Networks, NIPS-2015

▪ Finding Tiny Faces, arXiv-2016

Page 16: Automatic Attendance System using CNN

MAIN DESIGN

▪ The complete architecture has two stages

▪ Face Detection

▪ Face Recognition

Page 17: Automatic Attendance System using CNN
Page 18: Automatic Attendance System using CNN

ARCHITECTURE FOR DETECTION

▪ The authors of "Finding Tiny Faces" provide an in-depth analysis of image resolution, object scale, and spatial context for the purpose of finding small faces.

▪ A detailed study of the paper is still pending, as I found it only recently. I will briefly describe its architecture in the next slide.

Page 19: Automatic Attendance System using CNN

ARCHITECTURE FOR DETECTION

Page 20: Automatic Attendance System using CNN

ARCHITECTURE FOR DETECTION

Page 21: Automatic Attendance System using CNN

ARCHITECTURE FOR DETECTION

Page 22: Automatic Attendance System using CNN

WHERE IT FAILS?

▪ Their method works well under out-of-plane rotation, but its accuracy drops when 2D (in-plane) rotation comes into the picture.

▪ Some of the failures are shown in the next slide.

Page 23: Automatic Attendance System using CNN

▪ 11/14 true detections, 1 false detection

▪ 1/14 true detections, 1 false detection

Page 24: Automatic Attendance System using CNN
Page 25: Automatic Attendance System using CNN

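ARCHITECTURE FOR RECOGNITION

[Figure: recognition pipeline. The augmented image (128 × 128) is fed to the localization network, which outputs the transform parameters; the transformer uses them to produce the aligned face (64 × 64). The localization network and the transformer together form the spatial transformer network. The recognition network then extracts features from the aligned face.]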

Page 27: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

▪ Intuition behind STN

Sampling

Page 28: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Page 29: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Page 30: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

▪ According to the original DeepMind paper, the spatial transformer can be used to implement any parametrizable transformation, including translation, scaling, affine, and projective.

▪ Suppose that for the $i$-th target point $p_i^t = (x_i^t, y_i^t, 1)$ in the output image, a grid generator generates its source coordinates $(x_i^s, y_i^s, 1)$ in the input image according to the transformation parameters.

Projective transformation equation
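
The equation itself is not part of this transcript. As a minimal sketch of what a grid generator does, the code below assumes the common homography form: a 3 × 3 matrix with eight free parameters and the last entry fixed to 1, applied to target points in homogeneous coordinates and followed by division by the third component. The function name `projective_grid`, the parameter order, and the normalization of coordinates to [-1, 1] are assumptions of this sketch, not taken from the slides.

```python
import numpy as np


def projective_grid(theta, out_h, out_w):
    """Map every target pixel (x_t, y_t, 1) to source coordinates (x_s, y_s).

    theta: the eight free parameters in row-major order; the ninth matrix
           entry is fixed to 1 (assumed parameterization).
    Coordinates are normalized to [-1, 1] in both axes (assumption).
    """
    H = np.append(np.asarray(theta, dtype=float), 1.0).reshape(3, 3)

    # Regular grid of target points in homogeneous coordinates, shape (3, N).
    xs = np.linspace(-1.0, 1.0, out_w)
    ys = np.linspace(-1.0, 1.0, out_h)
    x_t, y_t = np.meshgrid(xs, ys)
    target = np.stack([x_t.ravel(), y_t.ravel(), np.ones(out_h * out_w)])

    # Apply the transformation, then divide by the third coordinate.
    src = H.dot(target)
    x_s = src[0] / src[2]
    y_s = src[1] / src[2]
    return x_s.reshape(out_h, out_w), y_s.reshape(out_h, out_w)


# Identity parameters: every target point maps to itself.
identity = [1, 0, 0, 0, 1, 0, 0, 0]
x_s, y_s = projective_grid(identity, 64, 64)

# Pure horizontal translation by 0.5 in normalized units.
shifted_x, shifted_y = projective_grid([1, 0, 0.5, 0, 1, 0, 0, 0], 64, 64)
```

Setting the last two parameters to zero reduces the mapping to an affine transformation, which is why the same generator covers translation, scaling, and affine warps.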

Page 31: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

▪ Sampler: (Mathematical Formulation)

Page 35: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

▪ We use the bilinear kernel so that:

▪ So overall transformer model will be:

This is equivalent to convolving a sampling kernel k with the source image of dimension H × W.

▪ All the blocks should be differentiable.
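
The sampler equations on these slides are not in the transcript. The sketch below shows bilinear sampling in the usual form, where each output value is a weighted sum of the four nearest source pixels with weights max(0, 1 - |x_s - m|) · max(0, 1 - |y_s - n|). It assumes a single-channel image, source coordinates given in pixel units, and clamping at the border; `bilinear_sample` is a name chosen here for illustration.

```python
import numpy as np


def bilinear_sample(image, x_s, y_s):
    """Sample a 2-D image (H x W) at real-valued pixel coordinates.

    x_s, y_s: arrays of source coordinates in pixel units.
    Points outside the image are clamped to the border (assumption).
    """
    h, w = image.shape
    x = np.clip(x_s, 0, w - 1)
    y = np.clip(y_s, 0, h - 1)

    # Integer corners surrounding each sampling point.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional distances act as the bilinear weights.
    wx = x - x0
    wy = y - y0

    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom


image = np.array([[0.0, 10.0],
                  [20.0, 30.0]])
centre = bilinear_sample(image, np.array([0.5]), np.array([0.5]))
corner = bilinear_sample(image, np.array([1.0]), np.array([0.0]))
```

Because the output is a piecewise-linear function of the sampling coordinates, it has (sub-)gradients with respect to them, which is what lets the error signal reach the localization network.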

Page 36: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Page 37: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

During the backward propagation, we need to calculate the gradient of Vi with respect to each of the eight transformation parameters.
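
The gradient formulas on these slides are not in the transcript. As a hedged reconstruction that follows the bilinear sampler of the Spatial Transformer Networks paper, the chain rule can be written as below. The notation is assumed here rather than taken from the slides: $\theta_k$ denotes the eight transformation parameters, $U_{nm}$ the source pixel at row $n$ and column $m$, and $(x_i^s, y_i^s)$ the source coordinates in pixel units.

```latex
\frac{\partial V_i}{\partial \theta_k}
  = \frac{\partial V_i}{\partial x_i^s}\,\frac{\partial x_i^s}{\partial \theta_k}
  + \frac{\partial V_i}{\partial y_i^s}\,\frac{\partial y_i^s}{\partial \theta_k},
  \qquad k = 1, \dots, 8

\frac{\partial V_i}{\partial x_i^s}
  = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}\,
    \max\bigl(0,\, 1 - |y_i^s - n|\bigr)
    \begin{cases}
      0  & \text{if } |m - x_i^s| \ge 1 \\
      1  & \text{if } m \ge x_i^s \\
      -1 & \text{if } m < x_i^s
    \end{cases}
```

The expression for $\partial V_i / \partial y_i^s$ is symmetric, with the roles of $x$ and $y$ (and of $m$ and $n$) exchanged.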

Page 40: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Page 41: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Page 42: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Page 43: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

Where,

Page 44: Automatic Attendance System using CNN

SPATIAL TRANSFORMER NETWORK

▪ The similarity transformation is defined here

in which α is the rotation angle, λ is the scaling factor, and t1, t2 are the horizontal and vertical translation displacements respectively. Analogously, the gradients of Vi with respect to α and λ are shown below:
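
The defining equation is not in the transcript. A minimal sketch, assuming the usual form in which the source point is the target point rotated by α, scaled by λ, and shifted by (t1, t2), is given below. The rotation direction and the function names are assumptions of this sketch.

```python
import numpy as np


def similarity_matrix(alpha, lam, t1, t2):
    """2x3 similarity transform: rotation alpha, scale lam, shift (t1, t2).

    Assumed form:
        [x_s]         [cos(alpha)  -sin(alpha)] [x_t]   [t1]
        [y_s] = lam * [sin(alpha)   cos(alpha)] [y_t] + [t2]
    """
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[lam * c, -lam * s, t1],
                     [lam * s,  lam * c, t2]])


def apply_similarity(matrix, x_t, y_t):
    """Map one target point to its source coordinates."""
    x_s, y_s = matrix.dot(np.array([x_t, y_t, 1.0]))
    return x_s, y_s


# Rotate by 90 degrees, double the scale, shift one unit horizontally.
M = similarity_matrix(np.pi / 2, 2.0, 1.0, 0.0)
x_s, y_s = apply_similarity(M, 1.0, 0.0)
```

With only four parameters instead of eight, this transform cannot shear or change perspective, so the localization network has fewer values to predict.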

Page 47: Automatic Attendance System using CNN

STATUS SO FAR

▪ The STN is implemented and tested on the Labeled Faces in the Wild (LFW) dataset.

▪ Out of 5423 classes, we took only 1000 because of limited computation.

▪ During training we augmented the face data with random 2D affine transformations to increase the training set size.

▪ We had 15399 training images, 3501 testing images and 2100 validation images during training.

▪ We introduced a CNN architecture to extract deep features from the transformed face.
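
The augmentation step can be sketched as follows. The slides do not state how the random affine transformations were drawn, so the scheme here (identity matrix plus uniform jitter on all six entries) and the `jitter` value are assumptions chosen for illustration; the split sizes are the ones quoted above.

```python
import numpy as np


def random_affine(rng, jitter=0.1):
    """Draw a random 2x3 affine matrix close to the identity.

    jitter: maximum absolute perturbation of each entry (hypothetical value).
    """
    identity = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])
    return identity + rng.uniform(-jitter, jitter, size=(2, 3))


def transform_points(matrix, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous.dot(matrix.T)


rng = np.random.RandomState(0)
# Corners of a face crop in normalized [-1, 1] coordinates.
corners = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])
warped = transform_points(random_affine(rng), corners)

# Split sizes quoted on the slide.
n_train, n_test, n_val = 15399, 3501, 2100
total = n_train + n_test + n_val
print(total, round(100.0 * n_train / total, 1))
```

Each training image can be warped with a freshly drawn matrix, so the network sees a slightly different pose of the same face every time.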

Page 48: Automatic Attendance System using CNN

STATUS SO FAR

▪ Output of STN network

Page 49: Automatic Attendance System using CNN

STATUS SO FAR

[Figure: block diagrams of the STN architecture and the recognition architecture, built from Conv, Pool & Actv, Dense, and Actv blocks.]

Page 51: Automatic Attendance System using CNN

FUTURE WORK

▪ Validate the architecture on real data (taken from a classroom).

▪ Without training a new CNN model, compare recognition accuracy with the ImageNet-winning pre-trained models.

▪ Add 2D-rotation-invariant face detection to the recent model.

Page 52: Automatic Attendance System using CNN

THANK YOU!