Personalized Object Recognition for Augmenting Human Memory

HOSUB LEE 1, CAMERON UPRIGHT 2, STEVEN ELIUK 2, ALFRED KOBSA 1

1: UNIVERSITY OF CALIFORNIA, IRVINE

2: SAMSUNG RESEARCH AMERICA

2016-09-12 WAHM 2016 1

Summary

Personalized Object Recognition System

◦ Users can easily create their own image classifiers via Google Glass (and a server)

◦ Users may use it to augment their memory (e.g., as a memory enhancer)

[Figure: system overview. Google Glass sends a classification request (“What is this?”) or a training request (w/ new data) to the server; the server updates the ML model (w/ new data) and returns the classification result (This is “Jessica”).]

Introduction (1/2)

Wearable Computing: Google Glass

◦ Users can easily collect image data about their surroundings through the Google Glass camera

Deep Learning: Convolutional Neural Networks (CNN)

◦ A CNN, a type of neural network, loosely mimics how the human brain perceives images

◦ CNN-based image classifiers have reached near-human accuracy levels

[Figure: Google Glass (with camera) and Convolutional Neural Networks]

Introduction (2/2)

Problem

◦ Most deep learning applications thus far have been developed for the general population

◦ Some users may want to build their “own” applications

◦ People with memory problems: “What was this?”

◦ A professor who lectures to 300 students: “What was your name?”

Solution

◦ We collect user-generated image data via Google Glass

◦ We train a personalized deep learning model (CNN) on the user-generated image data

◦ We run an image classifier based on the personalized deep learning model upon user request

Related Work

Wearable Visual Recognition System

◦ Wearable personal imaging device recognizing human faces (Steve Mann, 1997)

◦ Object recognition system recognizing American Sign Language (Thad Starner et al., 1998)

◦ Wearable system recognizing 24 different types of objects (Antonio Torralba et al., 2003)

◦ Image recognition app for Google Glass (AlchemyAPI, 2013)

◦ Emotion recognition software for Google Glass (Fraunhofer, 2014)

◦ Google Glass application retrieving meta information from images (Way et al., 2015)

Limitations

◦ Prototypes were cumbersome to wear

◦ No consideration of personalized machine learning models

◦ Concept only, no implementation

[Figure: examples of prior systems (Steve Mann, 1997; AlchemyAPI, 2013). Annotations: “Communication units”; “Google Glass, but ML model for public”]

Personalized Object Recognition System: DeepEye

SYSTEM ARCHITECTURE

WORKFLOW – TRAINING AND CLASSIFICATION

EXPERIMENT

DeepEye: System Architecture

Client-Server Model

◦ Client: Google Glass

◦ Collect images and send them to the server with a specific task type (training or classification)

◦ Server: Linux workstation w/ Caffe deep learning framework

◦ [Training] train (or update) the CNN using finetuning whenever new image data is available

◦ [Classification] classify an image through the most recently trained CNN
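
The client-server split above can be sketched as a small dispatcher. This is only an illustration under stated assumptions, not the DeepEye implementation: the `Request` type, the `DeepEyeServer` class, and the version counter are hypothetical stand-ins for a server that would actually run Caffe. Only the task type names (`caffe::train`, `caffe::classify`) come from the slides.

```python
from dataclasses import dataclass, field
from typing import Optional

# Task types as named on the slides; everything else here is hypothetical.
TASK_TRAIN = "caffe::train"
TASK_CLASSIFY = "caffe::classify"


@dataclass
class Request:
    task_type: str
    image: bytes
    label: Optional[str] = None  # only set for training requests


@dataclass
class DeepEyeServer:
    """Toy stand-in for the server; it stores images instead of training a CNN."""
    training_data: dict = field(default_factory=dict)
    model_version: int = 0

    def handle(self, request: Request) -> str:
        if request.task_type == TASK_TRAIN:
            if request.label is None:
                raise ValueError("training request needs a label")
            self.training_data.setdefault(request.label, []).append(request.image)
            # A real server would finetune the CNN here; we only bump a counter.
            self.model_version += 1
            return f"model updated (version {self.model_version})"
        if request.task_type == TASK_CLASSIFY:
            if self.model_version == 0:
                return "no trained model yet"
            # A real server would run the most recently trained CNN on the image.
            return f"classified with model version {self.model_version}"
        raise ValueError(f"unknown task type: {request.task_type}")
```

Routing on a single task-type field keeps the client simple: Google Glass only has to attach the image and say which of the two tasks it wants.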

[Figure: system architecture with training and classification paths. Google Glass sends a training request (w/ new data) or a classification request (“What is this?”); the server updates the ML model (w/ new data) and returns the classification result (This is “Jessica”).]

DeepEye: Workflow – Training (1/3)

Labeling

◦ User enters the name of the target object (i.e., its label) through Google Voice Input

Data Collection

◦ DeepEye begins taking a photo of the object every five seconds

◦ DeepEye then transmits each collected image w/ the task type (caffe::train) to the server

◦ The process repeats until the user explicitly terminates the training task
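
The data collection loop above might look roughly like the following. It is a sketch under assumptions: `take_photo`, `send_to_server`, and `should_stop` are hypothetical callbacks standing in for the Google Glass camera, the network call, and the user's stop gesture. The five-second interval and the `caffe::train` task type are the only details taken from the slides.

```python
import time
from typing import Callable

CAPTURE_INTERVAL_SECONDS = 5  # from the slides: one photo every five seconds


def collect_training_images(
    label: str,
    take_photo: Callable[[], bytes],
    send_to_server: Callable[[str, str, bytes], None],
    should_stop: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Capture and upload photos until the user stops; return the number sent."""
    sent = 0
    while not should_stop():
        image = take_photo()
        send_to_server("caffe::train", label, image)
        sent += 1
        sleep(CAPTURE_INTERVAL_SECONDS)
    return sent
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets the loop be exercised without waiting five seconds per iteration.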

[Figure: screens for Initial Screen, Labeling via voice, and Data Collection]

DeepEye: Workflow – Training (2/3)

Training: Finetuning

◦ Train a new model by reusing a model that was already fully trained on a larger dataset

◦ Exploit the pre-trained CNN’s parameter values representing generic visual features like edges

◦ Focus on updating parameters representing object-specific (high-level) features for our image data
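
One common way to realize this idea is to give each layer its own learning-rate multiplier, so that pre-trained generic layers change little or not at all while the object-specific layers are updated. The sketch below assumes that approach. The slides do not say which layers were frozen or what rates were used, and the layer names, weights, and gradients are made up for illustration.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def finetune_step(
    weights: Weights,
    grads: Weights,
    base_lr: float,
    lr_mult: Dict[str, float],
) -> Weights:
    """One SGD step in which each layer's learning rate is scaled by its multiplier."""
    updated = {}
    for layer, values in weights.items():
        step = base_lr * lr_mult.get(layer, 1.0)
        updated[layer] = [w - step * g for w, g in zip(values, grads[layer])]
    return updated


# Hypothetical two-layer model: "conv1" holds pre-trained generic features,
# "fc_out" is the object-specific output layer for the user's labels.
pretrained = {"conv1": [0.5, -0.2], "fc_out": [0.0, 0.0]}
grads = {"conv1": [1.0, 1.0], "fc_out": [1.0, -1.0]}

# Freeze the generic layer (multiplier 0) and update only the output layer.
new_weights = finetune_step(
    pretrained, grads, base_lr=0.1, lr_mult={"conv1": 0.0, "fc_out": 1.0}
)
```

With a multiplier of 0 the generic layer keeps its pre-trained values exactly; a small nonzero multiplier would instead let it adapt slowly.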

[Figure: Finetuning CNN. Generic features: edges; high-level features: shapes]

DeepEye: Workflow – Training (3/3)

Training: Finetuning (cont’d)

[Figure: Training Process]

DeepEye: Workflow – Classification

Classification

◦ User takes a photo of the object by tapping the Google Glass touchpad

◦ DeepEye sends the image w/ the task type (caffe::classify) to the server

◦ Server uses the latest trained CNN to execute the Caffe classification command on the image

◦ Server then sends the classification result (w/ probability) back to DeepEye

◦ DeepEye displays the result to the user through Google Glass’s heads-up display
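
The last two steps (a result with a probability, then a short message on the display) can be sketched as follows. This assumes the classifier produces one raw score per label and that a softmax turns those scores into probabilities; the slides do not spell this out. The function names and the message format are hypothetical; the wording only echoes the “This is …” reply shown in the system diagram.

```python
import math
from typing import Dict, Tuple


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def top_prediction(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    label = max(probs, key=probs.get)
    return label, probs[label]


def format_for_display(label: str, probability: float) -> str:
    """Build the short message shown on the heads-up display."""
    return f'This is "{label}" ({probability:.0%})'
```

Showing the probability next to the label lets the user judge how much to trust a given answer.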

[Figure: Classification]

DeepEye: Experiment (1/2)

10-Class Object Recognition

◦ We evaluated the predictive power of the CNNs trained via DeepEye in a real-world scenario

Training Data

◦ We selected 10 personal objects belonging to a member of our research team

◦ We collected 100 images for each class, and augmented them by creating four variations of each image

◦ Three rotated by 90, 180, and 270 degrees, and one mirrored
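
The four variations can be sketched in plain Python. Here an image is modeled as a list of pixel rows, which is an assumption made to keep the example self-contained; the slides do not say which image library or mirror axis was used, and a left-to-right flip is assumed.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values; a stand-in for a real image type


def rotate90(image: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def mirror(image: Image) -> Image:
    """Flip the image left to right."""
    return [row[::-1] for row in image]


def augment(image: Image) -> List[Image]:
    """Return the four variations named on the slide: three rotations and one mirror."""
    r90 = rotate90(image)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    return [r90, r180, r270, mirror(image)]
```

Applied to every collected photo, this yields four extra images per photo without any additional capture time on the device.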

Validation Data

◦ We also collected 30 additional images for each class (w/ different photographing conditions)

[Figure: Training and Validation Data (sample)]

DeepEye: Experiment (2/2)

Validation Accuracy

◦ For up to 7 different objects, the trained CNNs showed near-perfect performance

◦ Accuracy diminished slightly as the number of object categories increased from 8 to 9

◦ The final trained CNN’s validation accuracy was 97% with a loss of 0.116
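
For readers unfamiliar with these two numbers, the sketch below shows one usual way to compute them from a classifier's output probabilities. It assumes top-1 accuracy and a mean cross-entropy loss; the slides do not define the loss, so this is an assumption rather than a description of the authors' evaluation. The function name and the inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def accuracy_and_loss(
    probabilities: Sequence[Sequence[float]],
    true_labels: Sequence[int],
) -> Tuple[float, float]:
    """Return (top-1 accuracy, mean cross-entropy loss) over a validation set."""
    correct = 0
    total_loss = 0.0
    for probs, truth in zip(probabilities, true_labels):
        predicted = max(range(len(probs)), key=lambda i: probs[i])
        correct += int(predicted == truth)
        # Clamp to avoid log(0) when the true class gets zero probability.
        total_loss += -math.log(max(probs[truth], 1e-12))
    n = len(true_labels)
    return correct / n, total_loss / n
```

Accuracy only counts whether the top guess is right, while the loss also penalizes correct guesses made with low confidence, which is why both are usually reported together.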

Training Time

◦ It took about 7 minutes to train the final model in our GPU environment (GeForce GTX 970)

[Figure: Validation Accuracy]

Discussion and Future Work

Google Glass

◦ Google Glass generates a lot of heat when the camera is used continuously

◦ The Google Glass battery drains quickly (< 2 hours)

Scalability and Applicability

◦ Tested on small datasets only (100-class object recognition?)

◦ Tested on the object recognition task only (face recognition?)

Effectiveness

◦ Need to assess the usability of the system for people with memory disorders

◦ Need to verify whether the system can improve their memory and cognitive abilities

Conclusion

In This Paper

◦ We developed a personalized object recognition system for augmenting human memory

◦ Wearable computing + deep learning

◦ We used a finetuning approach to efficiently train personalized deep learning models

◦ We plan to test the system with more complex object recognition tasks

◦ We also plan to verify its effectiveness in augmenting human memory and perception

Thank You!

ANY QUESTIONS?
