Page 1:

Spatio-temporal saliency model to predict eye movements in video free viewing

Gipsa-lab, Grenoble Département Images et Signal

CNRS, UMR 5216

S. Marat, T. Ho Phuoc, L. Granjon, N. Guyader, D. Pellerin, A. Guérin-Dugué

GDR-vision 12/06/2008

Page 2:

Plan

Introduction

Model

Experiment and results

Conclusion

2/24

Page 3:

A salient region attracts attention, and so the eyes.

Saliency depends mainly on two factors:
- Bottom-up: task-independent, depending on intrinsic features of the stimuli
- Top-down: task-dependent, integrating high-level processes (cognitive state, ...)

3/24

Introduction

Page 4:

Spatio-temporal saliency model

Achromatic stimuli

Simulates some parts of the human visual system: retina, primary visual cortex (V1)

Two pathways: static and dynamic

4/24

Model

[Diagram: the two pathways produce a static map Ms and a dynamic map Md]


Page 8:

Two outputs:
- Magnocellular-like: low spatial frequencies, band-pass filter, whitens the spectrum, provides global information
- Parvocellular-like: high spatial frequencies, high-pass filter, whitens the spectrum, enhances frame contrast
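The two retinal outputs can be sketched as a low/high spatial-frequency split of each frame. This is only an illustrative sketch: a Gaussian low-pass as the splitting kernel is my assumption, not the authors' retina filter.

```python
import numpy as np

def gaussian_kernel(size=9, sigma=2.0):
    """2-D Gaussian kernel, normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def convolve2d_same(img, k):
    """Naive 'same'-size 2-D convolution with zero padding."""
    p = k.shape[0] // 2
    padded = np.pad(img, p)
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def retina_outputs(frame):
    """Split a frame into a magnocellular-like component (low spatial
    frequencies) and a parvocellular-like residual (high spatial frequencies)."""
    low = convolve2d_same(frame, gaussian_kernel())
    magno = low            # low spatial frequencies: global information
    parvo = frame - low    # high-pass residual: fine contrast
    return magno, parvo
```

By construction the two components sum back to the input frame, which makes the split easy to sanity-check.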

8/24

Retina model

Page 9:

9/24

Retina model

[Diagram: retina with "Parvocellular-like" and "Magnocellular-like" outputs feeding the static (Ms) and dynamic (Md) pathways]

Page 10:

Visual stimuli are processed in V1 in different frequency bands and orientations:
- Static: 6 orientations, 4 frequency bands
- Dynamic: 6 orientations, 3 (lower) frequency bands
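A V1-like decomposition of this kind can be sketched as a Gabor filter bank; the kernel size, bandwidth and the specific frequency values below are placeholders, not the authors' parameters.

```python
import numpy as np

def gabor_kernel(size, freq, theta, sigma):
    """Real Gabor kernel: Gaussian envelope times an oriented cosine grating."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    xr = xx * np.cos(theta) + yy * np.sin(theta)  # coordinate along orientation theta
    envelope = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * freq * xr)

def filter_bank(n_orient=6, freqs=(0.05, 0.1, 0.2, 0.4), size=15, sigma=3.0):
    """Bank of n_orient orientations x len(freqs) spatial-frequency bands
    (6 x 4 for the static pathway; the dynamic pathway would reuse the
    3 lowest-frequency bands)."""
    thetas = [k * np.pi / n_orient for k in range(n_orient)]
    return {(t, f): gabor_kernel(size, f, t, sigma) for t in thetas for f in freqs}
```

Each frame would be convolved with every kernel in the bank, yielding one feature map per (orientation, frequency) pair.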

10/24

Cortical-like filters

Page 11:

11/24

Static pathway

- Interactions: strengthen the contours
  - Short: between cells with overlapping receptive fields
  - Long: between collinear cells
- Normalization
- Summation over all orientations and frequency bands: static saliency map Ms
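The final two steps above (normalization, then summation over all orientations and frequency bands) can be sketched as follows. The min-max normalization is one simple choice, assumed here; the slides do not specify which normalization is used.

```python
import numpy as np

def normalize_map(m, eps=1e-9):
    """Rescale a feature map to [0, 1] (one simple normalization choice)."""
    lo, hi = m.min(), m.max()
    return (m - lo) / (hi - lo + eps)

def static_saliency(feature_maps):
    """Sum normalized filter responses over all orientations and
    frequency bands to obtain the static saliency map Ms."""
    ms = sum(normalize_map(np.abs(m)) for m in feature_maps)
    return normalize_map(ms)
```

`feature_maps` would be the outputs of the cortical-like filter bank after the interaction step.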

Page 12:

12/24

Dynamic pathway

- Two motion estimation steps:
  - Dominant motion compensation
  - Local motion estimation, using the same bank of cortical filters as the static pathway
- Temporal filtering
- Dynamic saliency map Md: modulus of the motion vector
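The last two steps can be sketched directly: Md is the per-pixel modulus of the compensated motion field, optionally smoothed over time. The recursive low-pass below is my assumption for the slide's "temporal filtering"; the motion estimation itself is omitted.

```python
import numpy as np

def dynamic_saliency(vx, vy):
    """Dynamic saliency map Md: modulus of the (dominant-motion-compensated)
    local motion vector at each pixel."""
    return np.sqrt(vx**2 + vy**2)

def temporal_filter(md_prev, md_new, alpha=0.5):
    """Simple recursive low-pass over time to suppress spurious motion
    (one plausible form of temporal filtering, assumed here)."""
    return alpha * md_new + (1 - alpha) * md_prev
```

A pixel moving with velocity (3, 4) pixels/frame thus gets a raw dynamic saliency of 5.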


Page 14:

Multiplicative fusion

14/24

Fusion and example of saliency maps

Mand(x, y) = Ms(x, y) × Md(x, y)

[Example: original video frame with its saliency maps Ms, Md and Mand]
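The multiplicative fusion can be sketched as below; rescaling both maps to [0, 1] before multiplying is my assumption for comparability (the slides do not specify a pre-fusion normalization).

```python
import numpy as np

def fuse(ms, md, eps=1e-9):
    """Multiplicative fusion: Mand(x, y) = Ms(x, y) * Md(x, y),
    after rescaling both maps to [0, 1] (an assumption)."""
    ms_n = (ms - ms.min()) / (ms.max() - ms.min() + eps)
    md_n = (md - md.min()) / (md.max() - md.min() + eps)
    return ms_n * md_n
```

With a product, only regions salient in both the static and the dynamic pathways keep a high value, which is the point of this fusion choice.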

Page 15:

Purpose: compare model results with human eye positions

- Free viewing; eye positions recorded with an EyeLink II eye tracker
- 15 subjects
- 20 clips of 30 s, each composed of different snippets strung together
- Stimulus size: 720×576 pixels, 40°×30° field of view

15/24

Experiment and results

[Diagram: a clip built from Snippet 1, Snippet 2, ..., Snippet k strung together]

[Itti]: R. Carmi and L. Itti, "Visual causes versus correlates of attentional selection in dynamic scenes", Vision Research, vol. 46, 2006

Page 16:

Criterion: Normalized Scanpath Saliency (NSS) [Itti]

16/24

Global analysis

NSS(k) = Σ(x,y) Mh(x, y, k) × (Mm(x, y, k) − M̄m) / σ(Mm)

with
- Mh(x, y, k): human eye position density map
- Mm(x, y, k): model saliency map
- M̄m, σ(Mm): mean and standard deviation of Mm
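The criterion can be sketched as: z-score the model map, then weight it by the eye position density map (assumed here to sum to 1). A minimal sketch, not the authors' exact implementation.

```python
import numpy as np

def nss(model_map, eye_density, eps=1e-9):
    """Normalized Scanpath Saliency for one frame: z-score the model
    saliency map Mm, then average it at the human eye positions,
    weighted by the eye position density map Mh (assumed to sum to 1)."""
    z = (model_map - model_map.mean()) / (model_map.std() + eps)
    return float(np.sum(eye_density * z))
```

NSS is positive when eye positions fall on above-average model saliency, and near zero when the model map is unrelated to where people look.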

NSS between real eye movements and the model saliency maps (Ms: static, Md: dynamic, Mand: fusion):

Saliency maps        Ms    Md    Mand
Real eye movements   0.68  0.87  0.96

NSS for naive saliency maps (MsnH: entropy, MsnSD: standard deviation, Mdn: absolute difference):

Saliency maps        MsnH  MsnSD  Mdn
Real eye movements   0.54  0.44   0.54

[Itti]: R. J. Peters and L. Itti, "Applying computational tools to predict gaze direction in interactive visual environments", ACM Trans. on Applied Perception, vol. 5, 2008

Page 17:

NSS as a function of frame

17/24

Temporal analysis

Average over the k-th frame of each snippet (Snippet 1, Snippet 2, ..., Snippet N)

Frame rate = 25 fps
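The frame-wise averaging described above can be sketched as follows; truncating to the shortest snippet is an assumption about how unequal snippet lengths are handled.

```python
import numpy as np

def nss_by_frame(nss_per_snippet):
    """Given per-snippet NSS sequences (one list of per-frame NSS values
    per snippet), average the k-th frame across all snippets,
    up to the length of the shortest snippet."""
    k_max = min(len(s) for s in nss_per_snippet)
    stacked = np.array([s[:k_max] for s in nss_per_snippet])
    return stacked.mean(axis=0)

def frame_to_ms(k, fps=25):
    """Convert a 1-based frame index to milliseconds at the given frame rate."""
    return 1000.0 * k / fps
```

At 25 fps, frames 10-13 correspond to 400-520 ms after a snippet onset, which is the time window the dispersion analysis refers to.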



Page 22:

NSS as a function of frame

22/24

Temporal analysis

Dispersion of eye positions as a function of frame

Frame rate = 25 fps

10-13th frame ≈ 400-520 ms


Page 24:

New model of spatio-temporal saliency, biologically inspired

- Retina filter with two outputs
- Interactions
- Same bank of cortical-like filters for the static and dynamic pathways

This model reliably predicts the first fixations

References:
- S. Marat, T. Ho Phuoc, L. Granjon, N. Guyader, D. Pellerin, A. Guérin-Dugué, "Spatio-temporal saliency model to predict eye movements in video free viewing", Proc. EUSIPCO 2008
- S. Marat, T. Ho Phuoc, L. Granjon, N. Guyader, D. Pellerin, A. Guérin-Dugué, "Modelling spatio-temporal saliency to predict gaze direction for short videos", submitted to the International Journal of Computer Vision

24/24

Conclusion

Page 25:

Thanks for your attention !