Image Recognition using Hierarchical Temporal Memory

Post on 18-Jan-2016


Image Recognition using Hierarchical Temporal Memory

Radoslav Škoviera
Ústav merania SAV

Fakulta matematiky, fyziky a informatiky UK

Image Recognition

• Applications: digital image databases, surveillance, industry, medicine

• Tasks: object recognition, automatic annotation, content-based image search

• Input: digital image
– Single object
– Scene (multiple objects – clutter, occlusion, merging)

• Output: description of the input image
– Keywords, scene semantics, similar images

• Subtasks: image segmentation, feature extraction, classification

Motivation

• Image recognition
– Very easy for us humans (and [other] animals)
– Computers cannot yet do it quickly or accurately enough

• Good motivation for researchers in the field of AI – bio-inspired models

Hierarchical Temporal Memory (HTM)

• Developed by Jeff Hawkins and Dileep George (Numenta)
• Hierarchical tree-shaped network
• Bio-inspired – based on a large-scale model of the neocortex
• Consists of basic operational units – nodes
– Each node uses the same two-stage learning algorithm:
1) Spatial learning (pooling)
2) Temporal learning (pooling)
– Learning is performed layer by layer
– Nodes have receptive fields – each node (except the top node) can see only a portion of the input image
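The receptive-field layout described above can be sketched as follows (a minimal illustration assuming a square image tiled by non-overlapping patches; the patch size and layout are assumptions, not the slides' exact configuration):

```python
import numpy as np

def receptive_fields(image, patch_size):
    """Split an image into non-overlapping patches, one per bottom-level node.

    Each node of the lowest HTM layer sees only its own patch of the input
    image; higher layers see the outputs of several child nodes, so the
    effective receptive field grows up the hierarchy.
    """
    h, w = image.shape
    patches = []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    return patches

# A 32x32 image tiled into 4x4 patches yields 64 bottom-level nodes.
image = np.zeros((32, 32))
print(len(receptive_fields(image, 4)))  # 64
```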

Spatial Learning

• Observe common patterns in the input space (training images)

• Group them into clusters of spatially similar patterns

• Use only one representative of each cluster
– Generate a "codebook"

• Reduces the input space and spatial noise
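The slides do not name the clustering method; one minimal sketch of spatial pooling quantizes training patterns into a codebook by simple distance-threshold clustering (the threshold and vectors here are illustrative assumptions, not the authors' exact algorithm):

```python
import numpy as np

def build_codebook(patterns, max_dist):
    """Spatial pooling sketch: keep one representative per cluster.

    A new pattern starts its own codebook entry unless it lies within
    max_dist (Euclidean) of an existing entry -- that entry then stands
    for the whole cluster, reducing the input space and spatial noise.
    """
    codebook = []
    for p in patterns:
        if not any(np.linalg.norm(p - c) <= max_dist for c in codebook):
            codebook.append(p)
    return codebook

patterns = [np.array([0.0, 0.0]), np.array([0.1, 0.0]),
            np.array([5.0, 5.0]), np.array([5.1, 5.0])]
print(len(build_codebook(patterns, max_dist=1.0)))  # 2
```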

Temporal Learning

• Uses time sequences to learn correlations of spatial patterns


• In each training step, the time-adjacency matrix (TAM) is incremented at the locations corresponding to co-occurring codebook patterns, according to a defined update function
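The slide's update formula did not survive the transcript. A typical time-adjacency-matrix update of this kind (a hedged reconstruction, not the slide's exact definition) increments the entry for consecutively winning codebook patterns:

```latex
% c(t) = index of the winning codebook pattern at time t
% T    = time-adjacency matrix (TAM), initialized to zero
T_{c(t-1),\,c(t)} \leftarrow T_{c(t-1),\,c(t)} + 1
```

Temporal groups are then formed by clustering codebook patterns with large mutual TAM entries.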

Inference & Classification

• Uses a similar dataflow as learning
• Two stages of inference in each node:
– Spatial inference – find the closest pattern in the codebook
– Temporal inference – calculate membership in the temporal groups

• Classification – the HTM itself does not classify images; it only transforms the input space into another (hopefully more invariant) space
– An external classifier must be used
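The two inference stages can be sketched as follows (a minimal illustration: the temporal stage here is a hard group assignment, whereas the actual implementation may compute soft memberships):

```python
import numpy as np

def spatial_inference(patch, codebook):
    """Spatial inference: index of the closest codebook pattern."""
    dists = [np.linalg.norm(patch - c) for c in codebook]
    return int(np.argmin(dists))

def temporal_inference(pattern_idx, groups):
    """Temporal inference: membership vector over temporal groups.

    Here: 1.0 for the group containing the winning pattern, 0.0 elsewhere.
    This vector is the node's output; an external classifier (e.g. k-NN
    or SVM) is trained on the top-level outputs.
    """
    return [1.0 if pattern_idx in g else 0.0 for g in groups]

codebook = [np.array([0.0, 0.0]), np.array([5.0, 5.0]), np.array([0.0, 5.0])]
groups = [{0, 2}, {1}]  # temporal groups over codebook indices
idx = spatial_inference(np.array([0.2, 0.1]), codebook)
print(idx, temporal_inference(idx, groups))  # 0 [1.0, 0.0]
```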

ATM Security

• ATM (automated teller machine) semi-automatic fraud detection system
– Detection of masked individuals interacting with the ATM through the ATM's camera – a possible sign of illegal activity

• Pilot system implemented and tested in an experimental environment

• Uses a Kinect as the input device

Kinect

• RGB camera developed for the XBOX game console
– Capable of providing a depth image of the scene and a "skeleton" if a person is detected in the scene

Experiment Setup

Face Image Segmentation using Kinect


• Two image classes: normal and anomalous faces
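The segmentation itself appears only as images on the slides; one plausible sketch (an assumption, not the authors' actual pipeline) crops a box around the Kinect skeleton's head joint and masks out pixels whose depth differs too much from the head's depth:

```python
import numpy as np

def segment_face(depth, head_xy, head_depth, box=60, tol=200):
    """Depth-based face crop sketch (illustrative, not the authors' code).

    Crop a box around the head joint reported by the Kinect skeleton and
    keep only pixels within tol millimetres of the head's depth,
    suppressing the background.
    """
    x, y = head_xy
    crop = depth[max(0, y - box):y + box, max(0, x - box):x + box]
    mask = np.abs(crop.astype(int) - head_depth) <= tol
    return crop * mask

depth = np.full((200, 200), 3000, dtype=np.uint16)  # background at 3 m
depth[60:140, 60:140] = 1000                        # head region at 1 m
face = segment_face(depth, head_xy=(100, 100), head_depth=1000)
print(face.shape, int(face[0, 0]), int(face[60, 60]))  # (120, 120) 0 1000
```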

ATM Security – Results

• Image set inflated with translated, rotated and mirrored copies of the original images
• A k-NN classifier in the input space was compared with the combinations HTM + k-NN and HTM + SVM

• Scenario 1: the whole data set was used
• Scenario 2: translated images were excluded from the training set

New features and algorithms for the HTM

• New temporal pooler

• Images transformed to different image spaces
– Different image features

• Various settings for the temporal pooler
• SOM as a spatial pooler

Testing of new image features

• Dataset: selected images from Caltech 256
– 10 classes, 30 testing and 30 training images per class

• Single-layer network
– With a 1-NN classifier as the top node
– Image features extracted from image patches corresponding to the receptive fields of the nodes

Results

Classification accuracy (%) by image feature and TE window step size (pixels):

Feature | Stat | s1    | s2    | s4    | s8
RGB     | CA   | 42.87 | 41.61 | 40.86 | 38.00
RGB     | med  | 42.50 | 41.33 | 41.00 | 38.17
Grey    | CA   | 40.13 | 39.63 | 38.41 | 34.68
Grey    | med  | 39.67 | 39.33 | 37.83 | 35.67
Canny   | CA   | 40.35 | 42.33 | 43.66 | 43.55
Canny   | med  | 40.50 | 41.83 | 43.00 | 43.00
Lab     | CA   | 44.92 | 44.17 | 44.23 | 43.17
Lab     | med  | 44.83 | 44.50 | 43.67 | 43.67
GLD     | CA   | 45.95 | 46.01 | 46.43 | 46.10
GLD     | med  | 46.00 | 46.12 | 46.17 | 46.00

Problems – background

Thank you for your attention