Using associative memory principles to enhance perceptual ability of vision systems (Giving a meaning to what you see) CVPR Workshop on Face Processing in Video June 28, 2004, Washington, DC, USA Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology National Research Council Canada www.cv.iit.nrc.ca/~dmitry/pinn

Designing visual memory using attractor-based neural networks with application to perceptual vision systems

CVPR Workshop on Face Processing in Video, June 28, 2004, Washington, DC, USA

Dr. Dmitry Gorodnichy, Computational Video Group, Institute for Information Technology, National Research Council Canada

www.cv.iit.nrc.ca/~dmitry/pinn www.perceptual-vision.com/memory

3. Associative memory for video (Dr. Dmitry Gorodnichy)

The unique place of this research

You are here

Computer vision

Pattern recognition

Your eyes, your brain

4. Associative memory for video (Dr. Dmitry Gorodnichy)

Talk overview

1. On the neurobiology side
- How it works in the brain: from the retina, to the primary visual cortex, to neurons, to synapses

2. Memories as attractors of an associative neural network
- Finding the best learning rule to tune the synapses

3. On the computer vision path
- Evolution of Perceptual Vision User Interface Systems: from Face Detection to Face Tracking to Face Localization to Face Recognition

4. Putting it all together: visual memory for analyzing faces
- What makes processing in video special
- Canonical face representation
- Memories of faces as attractors of the network

5. Associative memory for video (Dr. Dmitry Gorodnichy)

How do we (humans) do it: see and memorize what we see?

Dorsal (“where”) stream: V1, V2, V3… deals with object localization

Ventral (“what”) stream: V1, V2, V4, inferior temporal cortex (TE/IT) deals with object recognition

Refs: Perus, Ungerleider, Haxby, Riesenhuber, Poggio …

• Seeing

6. Associative memory for video (Dr. Dmitry Gorodnichy)

How do we (humans) do it: see and memorize what we see? (cont'd)

• In the brain: $10^{10}$ to $10^{13}$ interconnected neurons

• Neurons are either at rest or activated (modelled as units taking values $Y_i \in \{+1,-1\}$), depending on the values of other neurons $Y_j$ and the strength of the synaptic connections $C_{ij}$

• The brain is thus modelled as a network of binary neurons evolving in time from an initial state (e.g. a stimulus coming from the retina) until it reaches a stable state, an attractor

• The attractors of the network are what we actually remember: associative memory.

• Recognizing / Memorizing

Refs: Hebb’49, Little’74,’78, Willshaw’71, …
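The evolution towards an attractor described on this slide can be sketched as follows. This is a minimal illustration, assuming synchronous updates $Y_i \leftarrow \mathrm{sign}(\sum_j C_{ij} Y_j)$ and the convention sign(0) = +1. The function name `evolve` and the toy weights are hypothetical and are not taken from the talk.

```python
def evolve(C, y, max_iters=100):
    """Iterate Y <- sign(C Y) until the state stops changing (an attractor)."""
    n = len(y)
    state = list(y)
    for _ in range(max_iters):
        # Postsynaptic potential of every neuron given the current state.
        potentials = [sum(C[i][j] * state[j] for j in range(n)) for i in range(n)]
        new_state = [1 if s >= 0 else -1 for s in potentials]
        if new_state == state:
            return state  # stable state reached
        state = new_state
    return state  # no stable state within max_iters (e.g. a cycle)


# Toy weights C_ij = V_i * V_j for a single pattern V (illustrative only).
V = [1, -1, 1, -1]
C = [[vi * vj for vj in V] for vi in V]

stimulus = [1, -1, 1, 1]        # V with its last neuron flipped
recalled = evolve(C, stimulus)  # the network settles back on V
```

Starting from the corrupted stimulus, the state moves to the stored pattern and then stays there, which is what "reaching an attractor" means here.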

7. Associative memory for video (Dr. Dmitry Gorodnichy)

Recognition / memorization: formally

• Main question:

How to compute $C_{ij}$ so that

a) the desired patterns $V^m$ become attractors, i.e. $V^m \sim C V^m$

and

b) the network exhibits the best associative (error-correction) properties, i.e.
- largest attraction radius (tolerated noise)
- largest number of prototypes M stored

Refs: Hebb’49, McCulloch-Pitts’43, Amari’71,’77, Hopfield’82, Sejnowski’89, Willshaw’71

•Attractor-based neural networks

8. Associative memory for video (Dr. Dmitry Gorodnichy)

Learning rules: From biologically plausible to mathematically justifiable

Neurophysiological Postulate: “If two neurons on either side of a synapse are activated, then the strength of the synapse is strengthened”

“When a child is born, she knows nothing. As she repeatedly observes, she learns” – Postulate from the Montessori approach to infant development.

Models

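How to update weights: $C^m_{ij} = C^{m-1}_{ij} + \Delta C^m_{ij}$

Hebb ($C = \frac{1}{N} V V^T$): $\Delta C^m_{ij} = \frac{1}{N} V^m_i V^m_j$

Generalized Hebb: $\Delta C^m_{ij} = F(V^m_i, V^m_j)$

Better, however: $\Delta C^m_{ij} = F(C^{m-1}_{ij}, V^m_i, V^m_j)$

or even: $\Delta C^m_{ij} = F(C^{m-1}, V^m)$

Refs: Hebb’49, Hopfield’82, Sejnowski’77, Willshaw’71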
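The Hebb rule on this slide, read as an incremental update, can be sketched as below. It is an illustration only: the helper names (`hebb_update`, `hebb_learn`, `apply_weights`) and the two toy prototypes are hypothetical. The prototypes are chosen to be orthogonal, the case in which the Hebb rule stores them exactly.

```python
def hebb_update(C, v):
    """Add one prototype in place: C_ij <- C_ij + (1/N) * V_i * V_j."""
    n = len(v)
    for i in range(n):
        for j in range(n):
            C[i][j] += v[i] * v[j] / n
    return C


def hebb_learn(prototypes):
    """Start from zero weights and apply the Hebb update once per prototype."""
    n = len(prototypes[0])
    C = [[0.0] * n for _ in range(n)]
    for v in prototypes:
        hebb_update(C, v)
    return C


def apply_weights(C, y):
    """Matrix-vector product C y."""
    return [sum(c * x for c, x in zip(row, y)) for row in C]


# Two orthogonal prototypes (their dot product is 0).
prototypes = [[1, 1, 1, 1], [1, -1, 1, -1]]
C = hebb_learn(prototypes)
```

For these orthogonal prototypes `apply_weights(C, v)` returns `v` itself, so each stored pattern satisfies the stability condition. For correlated prototypes the Hebb rule adds cross-talk terms, which is the motivation for the rules that depend on $C^{m-1}$.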

9. Associative memory for video (Dr. Dmitry Gorodnichy)

Pseudo-inverse as the best learning rule

$C = V V^+$

• Obtained mathematically from the stability condition: $V^m = C V^m$

• With reduced self-connection ($C_{ii} := 0.15\,C_{ii}$), it is guaranteed [Gorodnichy’97] to retrieve
- M=0.5N patterns from 8% noise
- M=0.7N patterns from 2% noise
(for comparison: the Hebb rule stops retrieving when M=0.14N)

• Widrow-Hoff’s (delta) rule is the iterative approximation of it.

• The Hebb rule is the special case of it for orthogonal prototypes.

Refs: Amari’71,’77, Kohonen’72, Personnaz’85, Kanter-Sompolinsky’86, Gorodnichy’95-’99
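A minimal sketch of the projection rule $C = V V^+$ follows. It assumes the projection matrix is built one prototype at a time with the standard Gram-Schmidt-style update $C \leftarrow C + r r^T / \lVert r \rVert^2$, where $r = V^m - C V^m$. This is one common way to obtain a projection matrix without an explicit pseudo-inverse, and it is not claimed to be the code distributed on the PINN website. The function names and the toy prototypes are hypothetical. The factor 0.15 is the self-connection reduction quoted on the slide.

```python
def projection_learn(prototypes, eps=1e-9):
    """Build C = V V^+ incrementally, one prototype at a time."""
    n = len(prototypes[0])
    C = [[0.0] * n for _ in range(n)]
    for v in prototypes:
        # Residual of v outside the subspace stored so far: r = v - C v.
        r = [v[i] - sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        energy = sum(x * x for x in r)
        if energy < eps:
            continue  # v is (numerically) a combination of stored prototypes
        for i in range(n):
            for j in range(n):
                C[i][j] += r[i] * r[j] / energy
    return C


def reduce_self_connections(C, d=0.15):
    """Return a copy of C with every diagonal weight scaled: C_ii := d * C_ii."""
    n = len(C)
    return [[C[i][j] * (d if i == j else 1.0) for j in range(n)] for i in range(n)]


def recall_step(C, y):
    """One synchronous update Y <- sign(C Y), with sign(0) = +1."""
    return [1 if sum(c * x for c, x in zip(row, y)) >= 0 else -1 for row in C]


# Three linearly independent, non-orthogonal toy prototypes.
prototypes = [
    [1, 1, 1, 1, 1, 1],
    [1, -1, 1, -1, 1, 1],
    [1, 1, -1, -1, -1, 1],
]
C = projection_learn(prototypes)
C_reduced = reduce_self_connections(C)
```

Because $C$ is a projection onto the span of the prototypes, $C V^m = V^m$ holds even for correlated prototypes. Scaling the diagonal by 0.15 keeps every stored pattern a fixed point of `recall_step`, since each component is only multiplied by the positive factor $1 - 0.85\,C_{ii}$.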

10. Associative memory for video (Dr. Dmitry Gorodnichy)

What else is good about the PI rule

… besides yielding the best retrieval for this type of network:

• It is non-iterative, which is good for fast (real-time) learning.

• It is also fast in retrieval.

• The performance of the network can be examined and improved analytically. It is guaranteed to converge.

• It can deal with a continuous stream of data without ever being saturated, if dynamic desaturation is used:
-> maintaining the capacity of 0.2N (with complete retrieval)
-> providing means for forgetting obsolete data
-> setting the basis for the design of adaptive filters

• All this makes the network very suitable for real-time memorization and recognition, as needed for video processing tasks.

• Finally, there is free C++ code which you can compile and try yourself, at the PINN website: www.cv.iit.nrc.ca/~dmitry/pinn

11. Associative memory for video (Dr. Dmitry Gorodnichy)

These neural networks are known as…

• pseudo-inverse networks - for using the Moore-Penrose pseudoinverse $V^+$ in computing the synapses

• projection networks - for the synaptic (weight) matrix $C = V V^+$ being the projection matrix onto the space of prototypes

• Hopfield-like networks - for being binary and fully connected in the learning stage

• recurrent networks - for evolving in time, based on external input and internal memory

• attractor-based networks - for storing patterns as attractors (i.e. stable states of the network)

• dynamic systems - for allowing dynamic systems theory to be applied

• associative memory - for being able to memorize, recall and forget patterns, much as humans do.

12. Associative memory for video (Dr. Dmitry Gorodnichy)

Analytical examination

By looking at the synaptic weights $C_{ij}$, one can say a lot about the properties of the memory:

- how many main attractors (stored memories) it has.

- how good the retrieval is.

13. Associative memory for video (Dr. Dmitry Gorodnichy)

Attraction Radius as function of weights

Theoretical result:

(for direct attraction radius)

14. Associative memory for video (Dr. Dmitry Gorodnichy)

Dynamics of the network

The behaviour of the network is governed by the energy functions

The network always converges, as long as $C_{ij} = C_{ji}$

• Cycles are possible when D<1

• However:
-> They are few when D>0.1 [Gorodnichy&Reznik’97]
-> They are detected automatically

15. Associative memory for video (Dr. Dmitry Gorodnichy)

Update flow neuro-processing

[Gorodnichy&Reznik’94]: “Process only those neurons which change during the evolution”, i.e. instead of N multiplications, do only a few of them.

-> is very fast (as only a few neurons actually change in one iteration)

-> detects cycles automatically

-> suitable for parallel implementation
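The idea of processing only the neurons that change can be sketched as follows. The slide's formulas did not survive extraction, so this is a reconstruction under an assumption: the postsynaptic potentials $S_i = \sum_j C_{ij} Y_j$ are computed in full once, and afterwards only the contributions of the flipped neurons are added, $S_i \leftarrow S_i + 2 \sum_{j \in \text{flipped}} C_{ij} Y_j^{new}$. The function names are hypothetical, and cycle detection is reduced to an iteration cap.

```python
def full_potentials(C, y):
    """Reference computation: S_i = sum_j C_ij * Y_j (N multiplications per neuron)."""
    return [sum(c * x for c, x in zip(row, y)) for row in C]


def evolve_update_flow(C, y, max_iters=100):
    """Synchronous evolution that updates potentials only for flipped neurons."""
    n = len(y)
    y = list(y)
    S = full_potentials(C, y)  # done in full only once
    for _ in range(max_iters):
        # Neurons whose current value disagrees with the sign of their potential.
        flipped = [i for i in range(n) if (1 if S[i] >= 0 else -1) != y[i]]
        if not flipped:
            return y, S  # stable state reached
        for j in flipped:
            y[j] = -y[j]
        # Each flip changes Y_j by 2 * Y_j(new), so only those terms are added.
        for i in range(n):
            S[i] += 2 * sum(C[i][j] * y[j] for j in flipped)
    return y, S  # iteration cap hit (e.g. a cycle)


# Toy weights C_ij = V_i * V_j for a single pattern V (illustrative only).
V = [1, -1, 1, -1]
C = [[vi * vj for vj in V] for vi in V]
state, potentials = evolve_update_flow(C, [1, -1, 1, 1])
```

The incrementally maintained potentials match a full recomputation at the final state, while each iteration costs a number of multiplications per neuron equal to the number of flipped neurons rather than N.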

16. Associative memory for video (Dr. Dmitry Gorodnichy)

How is video information processed?

• As we know how to memorize, the question is: what should be memorized?

What type of video information needs to be processed?

Let's see what mother nature (neurobiology) tells us

17. Associative memory for video (Dr. Dmitry Gorodnichy)

Visual Processing mechanisms

• Images are of very low resolution except at the fixation point.

• The eyes look at points which attract visual attention.

• Saliency is in: a) motion, b) colour, c) disparity, d) intensity.

• These channels are processed independently in the brain. Intensity means: frequencies, orientation, gradient.

• The brain processes sequences of images rather than a single image.
- The poor quality of the images is compensated for by their abundance.

• Colour & motion are used for segmentation.

• Intensity is used for recognition.

• Bottom-up (image-driven) visual attention is very fast and precedes top-down (goal-driven) attention: 25 ms vs 1 sec.

Refs: Itti, …

18. Associative memory for video (Dr. Dmitry Gorodnichy)

Visual recognition mechanism

- What to learn: generality vs specifics, invariance vs selectivity

- Affine transformations in 2D (rotation in image plane, scale) are easily dealt with.

- No 3D model stored. Instead, several view-based 2D models stored

- One neural network per view.

In the context of face recognition:

- Faces are stored in a canonical representation

- 2D transformations are easy in image/video processing!

- Video allows one to wait (until a face is in a position in which it was stored)

Refs: Poggio,…

19. Associative memory for video (Dr. Dmitry Gorodnichy)

Orientation selectivity, top-down vs bottom-up detection

From [Riesenhuber-Poggio, Nature Neuroscience, 2000]

20. Associative memory for video (Dr. Dmitry Gorodnichy)

On computer vision side

21. Associative memory for video (Dr. Dmitry Gorodnichy)

Perceptual Vision System

Goal: To detect, track and recognize the face and facial movements of the user.

[Figure: PUI diagram. Labels: x, y, z; PUI; monitor; binary event ON/OFF; recognition / memorization; “Unknown User!”]

Setup:

+ face close to the camera (within hand distance)

+ approximately front-facing orientation

+ limited number of users and motions

- off-the-shelf camera (low quality, low resolution)

- desktop computer (with limited processing power)

22. Associative memory for video (Dr. Dmitry Gorodnichy)

What can be “perceived”: Face processing tasks

“I look and see…”

• Face Segmentation: “Something yellow moves”

• Face Detection: “It’s a face”

• Face Tracking (crude): “Let’s follow it!”

• Face Localization (precise): “It’s at (x,y,z,”

• Face Classification: “It’s the face of a child”

• Facial Event Recognition: “S/he smiles, blinks”

• Face Memorization: “Face unknown. Store it!”

• Face Identification: “It’s Mila!”

23. Associative memory for video (Dr. Dmitry Gorodnichy)

Computer Vision results achieved

• 1998. Proof-of-concept PUI: colour-based tracking [Bradski]
– Unlikely to be used for precise tracking…

• 1999-2002. Several good skin colour models developed (HSV, UCS, YCrCb*)
– Unlikely to get better than that…

• 2002. Subpixel-accuracy convex-shape nose tracking [Nouse]

• 1999-2001. Motion-based segmentation & localization
– 2001. Non-linear change detection
– 2003. Second-order change detection [Double-blink]

• 2001. Viola-Jones face detection using Haar-like wavelets

• 2004. Stereotracking using nose and projective vision [Gorodnichy,Roth-IVC]

24. Associative memory for video (Dr. Dmitry Gorodnichy)

Face Detection and Tracking

25. Associative memory for video (Dr. Dmitry Gorodnichy)

Face Detection and Tracking (lights off)

26. Associative memory for video (Dr. Dmitry Gorodnichy)

Face Detection and Tracking (lights on)

27. Associative memory for video (Dr. Dmitry Gorodnichy)

Demand and applications

Internet, trends & technology

The nose used as a mouse

At the Institute for Information Technology in Canada, a system called Nouse was developed that makes it possible to operate software with movements of the face. The creator of the program, Dmitry Gorodnichy, explained by e-mail to LA NACION LINE how it works and what it can be used for.

For more information, related content, audiovisual material and readers' opinions, visit: http://www.lanacion.com.ar/03/05/21/dg_497588.asp Copyright S. A. LA NACION 2003. All rights reserved.

28. Associative memory for video (Dr. Dmitry Gorodnichy)

On importance of nose

Test: The user rotates his head only! (the shoulders do not move)

Precision / convenience is such that it allows one to use nose as mouse (or a joystick handle) – to Nouse

29. Associative memory for video (Dr. Dmitry Gorodnichy)

Nouse™: range and speed of tracking

30. Associative memory for video (Dr. Dmitry Gorodnichy)

Stereotracking with nose feature

31. Associative memory for video (Dr. Dmitry Gorodnichy)

Second-order change detection

• Detecting change in a change [Gorodnichy’03]

• Non-linear change detection deals with changes due to illumination changes [Durucan’02]

32. Associative memory for video (Dr. Dmitry Gorodnichy)

Eye Blink Detection

• Previously very difficult with moving heads

• Became possible with second-order change detection

• Is currently used to enable face-to-face communication for people with brain injuries [AAATE’03]

33. Associative memory for video (Dr. Dmitry Gorodnichy)

Something (special) about video

Importance:

- Video is becoming ubiquitous. Cameras are everywhere.

- For security, computer–human interaction, video-conferencing, entertainment …

Constraints:

- Real-time processing is required.

- Low resolution: 160x120 images or mpeg-decoded.

- Low quality: weak exposure, blurriness, cheap lenses

Essence:

- It is inherently dynamic, so temporal information can make up for bad quality.

- It has parallels with biological vision, so it can be processed efficiently.

34. Associative memory for video (Dr. Dmitry Gorodnichy)

Applicability of 160x120 video

• According to face anthropometrics (studied on the BioID database)

• Tested with Perceptual User Interfaces

Face size            1/2 image   1/4 image   1/8 image   1/16 image
In pixels            80x80       40x40       20x20       10x10
Between eyes (IOD)   40          20          10          5
Eye size             20          10          5           2
Nose size            10          5           -           -
FS                   ✓           ✓           ✓           b
FD                   ✓           ✓           b           -
FT                   ✓           ✓           b           -
FL                   ✓           b           -           -
FER                  ✓           ✓           b           -
FC                   ✓           ✓           b           -
FM / FI              ✓           ✓           -           -

✓ – good
b – barely applicable
- – not good

35. Associative memory for video (Dr. Dmitry Gorodnichy)

Choosing the face model.

On importance of eyes:

• Eyes are the most salient features on a face.

• Besides, there are two of them, which makes them an excellent reference frame.

• They are also the best (and the only) stable landmarks on a face that can be used as a reference.

Intra-ocular distance (IOD) makes a very convenient unit of measurement!

Eye-centered face model

On resolution:

• Lowest resolution possible, so as not to cause overfitting to the noise present (and there’s a lot of noise in video!)

36. Associative memory for video (Dr. Dmitry Gorodnichy)

Eye-centered face representations

Suitable for Face Analysis from video

[Figure: eye-centered face model; labels: d, 24, 2·IOD]

Suitable for Face Recognition in travel documents [ICAO’02]

Size 24 x 24 is sufficient for face memorization & recognition and is optimal for low-quality video and for fast processing.

37. Associative memory for video (Dr. Dmitry Gorodnichy)

From image pixels to feature vectors

• When the eyes are detected and a face is converted to a canonical representation, it is easy to memorize and to recognize.

• Using (orientational, frequency) features: Gabor filters?

• As faces are already rectified (to the same scale and orientation), there is no need for complex transformations. Just deal with illumination changes.

• Converting a 24x24 face to a binary feature vector:

A) $V_i = I_{xy} - I_{ave}$, $N = 24 \times 24 = 576$

B) $V_{i,j} = \mathrm{sign}(I_i - I_j)$, $N = 24^4$

C) $V_{i,j} = \text{Haar-like}(i,j,k,l)$, N much larger

• Some pixels may be ignored (corners, eye location)
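Options A and B above can be sketched as follows. Two assumptions are made that the slide does not state: option A is binarized by taking the sign of the difference from the mean intensity, and sign(0) is mapped to +1. The function names and the synthetic 24x24 "face" are hypothetical stand-ins for a real rectified face image.

```python
def sign(x):
    """Binary sign with the convention sign(0) = +1 (an assumption)."""
    return 1 if x >= 0 else -1


def features_mean(image):
    """Option A: one feature per pixel, the sign of (I_xy - I_ave); N = width * height."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [sign(p - mean) for p in pixels]


def features_pairwise(image):
    """Option B: one feature per ordered pixel pair, sign(I_i - I_j); N = (width * height)^2."""
    pixels = [p for row in image for p in row]
    return [sign(a - b) for a in pixels for b in pixels]


# Synthetic 24x24 intensity image standing in for a rectified face.
face = [[(7 * x + 13 * y) % 256 for x in range(24)] for y in range(24)]

vec_a = features_mean(face)      # 576 binary features
vec_b = features_pairwise(face)  # 576 * 576 = 24**4 binary features
```

Both encodings depend only on intensity differences, so adding a constant brightness offset to the whole image leaves the feature vector unchanged, which fits the slide's remark that only illumination changes remain to be dealt with.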

38. Associative memory for video (Dr. Dmitry Gorodnichy)

Closer to experiments

• A network of size N=576 stores
– M=N/2 states with …
– M=N/4 states with 25%N error correction

• Faces are extracted using OpenCV Viola-Jones function

• Another way: from blinking as in [avbpa03]:

39. Associative memory for video (Dr. Dmitry Gorodnichy)

Visual memory for user perception

• What can be retrieved:
– user identity
– face orientation
– facial expression

40. Associative memory for video (Dr. Dmitry Gorodnichy)

Retrieving orientation

41. Associative memory for video (Dr. Dmitry Gorodnichy)

A few more demos: taped and live…

… as time allows…

• Watch how memory is being filled out, as you learn new prototypes

42. Associative memory for video (Dr. Dmitry Gorodnichy)

Conclusions ?

Computer vision

Pattern recognition

Neuro-biology

43. Associative memory for video (Dr. Dmitry Gorodnichy)

Conclusions

• A lot has been done in PR, CV, NB.
– How to know all of these?…
– How to use all of these?… Or which way would you prefer?

• The attractor-based network is a great tool:
– very easy to understand what it is doing
– very suitable for live real-time video processing
– very much in line with biological vision
– You are invited to try it yourself, from our website

• Other contributions:
– Canonical face representation for FPIV

• Is it possible to work while on parental leave with two kids?

44. Associative memory for video (Dr. Dmitry Gorodnichy)

Dealing with a stream of data

Dynamic desaturation:

-> maintains the capacity of 0.2N (with complete retrieval)
-> allows data to be stored in real time (no need for iterative learning methods!)
-> provides means for forgetting obsolete data
-> is the basis for the design of adaptive filters