Visual Summary of Egocentric Photostreams by Representative Keyframes (BSc Ricard Mestre)


Author: Ricard Mestre

Supervisor: Xavier Giró

Date: Tuesday, 17th of February 2015


Contents

● Collaboration

● Motivation and goals

● State of the art

● Methodology

● Evaluation

● Conclusions and future work


Collaboration

Collaboration with the UB group BCNPCL (Barcelona Perceptual Computing Laboratory)


Motivation and goals

● Lifelogging with Narrative Clip

● Up to 2000 images/day

● A visual summary can aid the memory of people affected by Alzheimer's disease


Motivation and goals

● Extract a visual summary of a day

○ Clustering strategy for event detection

○ Automatic selection of representative frames


State of the art

Chandrasekhar et al., “Efficient retrieval from large-scale egocentric visual data using a sparse graph representation” (CVPR Workshop 2014)


http://dx.doi.org/10.1109/CVPRW.2014.84





State of the art

Lu and Grauman, “Story-driven summarization for egocentric video” (CVPR 2013)


http://dx.doi.org/10.1109/CVPR.2013.350



Methodology

Feature extraction → Clustering → Division-fusion → Keyframe extraction


Feature extraction

● Convolutional Neural Networks (CNN) trained with ImageNet.


Jia et al., “Caffe: Convolutional Architecture for Fast Feature Embedding” (ACM MM 2014)

http://dx.doi.org/10.1145/2647868.2654889


Clustering

● Obtain separated events

● Agglomerative clustering, cut at a cutoff parameter

Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., & Radeva, P. (2015). “R-Clustering for Egocentric Video Segmentation”. In 7th Iberian Conference on Pattern Recognition and Image Analysis (ACCEPTED).

Clustering: linkage method

● Different linkage methods

● Our case: average linkage
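The clustering step above can be illustrated with a minimal sketch of average-linkage agglomerative clustering. It is not the author's implementation: the Euclidean metric, the function name `average_linkage`, the toy 2-D descriptors and the cutoff value are all assumptions made for illustration, and the loop is deliberately naive (a library routine would normally be used on real CNN features).

```python
import math

def average_linkage(features, cutoff):
    """Merge the two closest clusters until their average distance exceeds cutoff."""
    clusters = [[i] for i in range(len(features))]

    def cluster_distance(a, b):
        # Average linkage: mean pairwise Euclidean distance between members.
        total = sum(math.dist(features[i], features[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters (x < y).
        dist, x, y = min(
            (cluster_distance(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if dist > cutoff:
            break  # remaining clusters are too dissimilar to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Toy 2-D descriptors standing in for CNN features (hypothetical values).
frames = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (5.0, 4.9)]
events = average_linkage(frames, cutoff=1.0)
print(events)
```

A lower cutoff stops merging earlier and yields more, shorter events; a higher cutoff yields fewer, longer ones.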



Division

● Long events with short events inside

● Groundtruth labelling


Fusion

● Short clusters (fewer than 5 images) are not representative

● Join the short events into larger ones
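A possible reading of the fusion step is sketched here. The slides only state that clusters with fewer than 5 images are joined into larger ones; the rule used below (merge a short event into whichever temporally adjacent event has the closer mean descriptor) is an assumption, as are the names `fuse_short_events` and `centroid` and the toy data.

```python
import math

MIN_EVENT_LENGTH = 5  # from the slide: fewer than 5 images is too short

def centroid(event, features):
    """Mean feature vector of the frames in an event."""
    dims = len(features[0])
    return [sum(features[i][d] for i in event) / len(event) for d in range(dims)]

def fuse_short_events(events, features, min_length=MIN_EVENT_LENGTH):
    """Merge each short event into its visually closest temporal neighbour (assumed rule)."""
    events = [list(e) for e in events]
    while len(events) > 1:
        short = next((k for k, e in enumerate(events) if len(e) < min_length), None)
        if short is None:
            break  # every remaining event is long enough
        neighbours = [k for k in (short - 1, short + 1) if 0 <= k < len(events)]
        here = centroid(events[short], features)
        target = min(neighbours,
                     key=lambda k: math.dist(here, centroid(events[k], features)))
        lo, hi = sorted((short, target))
        # Replace the two adjacent events by their union, keeping temporal order.
        events[lo:hi + 1] = [events[lo] + events[hi]]
    return events

# Toy 1-D descriptors: a 2-frame event that looks like the event before it.
features = [(0.0,)] * 6 + [(0.2,)] * 2 + [(3.0,)] * 6
events = [list(range(0, 6)), [6, 7], list(range(8, 14))]
fused = fuse_short_events(events, features)
print(fused)
```

Each pass removes one event, so the loop always terminates, even when only a single short event remains.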


Example of good segmentation


Example of bad segmentation


Keyframe extraction

● Criterion: visual similarity-based keyframe

● Graph-based approach: similarity graph and its adjacency matrix

Random walk

● One walker moves along the graph

● The most visited node is the most representative
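The random-walk idea can be sketched as a simulation on a similarity matrix, counting how often each frame is visited. This is an illustrative sketch only: the transition rule (step to another frame with probability proportional to similarity), the step count, the seed and the toy matrix are assumptions, and `random_walk_keyframe` is a hypothetical name.

```python
import random

def random_walk_keyframe(similarity, steps=10000, seed=0):
    """Return the most visited node of a similarity-weighted random walk, and all visit counts."""
    rng = random.Random(seed)
    n = len(similarity)
    visits = [0] * n
    current = rng.randrange(n)
    for _ in range(steps):
        # Step to a different frame with probability proportional to similarity.
        # Assumes every frame has non-zero similarity to at least one other frame.
        weights = [similarity[current][j] if j != current else 0.0 for j in range(n)]
        current = rng.choices(range(n), weights=weights)[0]
        visits[current] += 1
    return max(range(n), key=visits.__getitem__), visits

# Toy symmetric similarity matrix: frame 1 is similar to every other frame.
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.7],
    [0.2, 0.8, 1.0, 0.3],
    [0.1, 0.7, 0.3, 1.0],
]
keyframe, visits = random_walk_keyframe(similarity)
print(keyframe, visits)
```

For a connected, symmetric graph the long-run visit frequency of a node is proportional to its total similarity to the other nodes, so the walk favours frames that resemble many others in the event.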


Minimum distance

● Adjacency matrix approach

● The frame with the minimum distance is the most representative
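One way to read the minimum-distance criterion is to pick the frame whose summed distance to all other frames of the event is smallest (the medoid). The slides do not spell out the exact rule, so the summed Euclidean distance, the name `min_distance_keyframe` and the toy descriptors below are assumptions.

```python
import math

def min_distance_keyframe(features):
    """Return the index of the frame with the smallest total distance to the others."""
    n = len(features)
    # Pairwise distance matrix, i.e. the weighted adjacency matrix of the complete graph.
    distances = [[math.dist(features[i], features[j]) for j in range(n)]
                 for i in range(n)]
    totals = [sum(row) for row in distances]
    return min(range(n), key=totals.__getitem__)

# Toy 1-D descriptors: frame 2 sits in the middle, frame 4 is an outlier.
frames = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
keyframe = min_distance_keyframe(frames)
print(keyframe)
```

Unlike a mean descriptor, the result is always an actual frame of the event, which is what a visual summary needs.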


Example of summary


Contents

● Collaboration

● Motivation and goals

● State of the art

● Methodology

● Evaluation

○ Database

○ Clustering

○ Keyframe extraction

● Conclusions and future work

Evaluation: Database

● 5 days

● 3 users

● 4005 images

● Groundtruth available


Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., & Radeva, P. (2015). “R-Clustering for Egocentric Video Segmentation”. In 7th Iberian Conference on Pattern Recognition and Image Analysis (ACCEPTED).


○ Database

○ Clustering

■ Jaccard index

■ Linkage effect

■ Relabelling effect

○ Keyframe extraction

● Conclusions and future work

Evaluation: Clustering

● Jaccard index:
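For reference, the Jaccard index of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below applies it to events represented as sets of frame indices. How the per-event scores are aggregated into a single segmentation score is not given on the slide; averaging the best match of each groundtruth event is an assumption, and `mean_best_jaccard` is a hypothetical helper.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of frame indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_best_jaccard(ground_truth, predicted):
    """Average, over groundtruth events, of the best Jaccard match among predicted events."""
    best = [max(jaccard(g, p) for p in predicted) for g in ground_truth]
    return sum(best) / len(best)

# Toy day of 20 frames: the predicted boundary is two frames too early.
ground_truth = [range(0, 10), range(10, 20)]
predicted = [range(0, 8), range(8, 20)]
score = mean_best_jaccard(ground_truth, predicted)
print(round(score, 3))
```

A score of 1.0 means the predicted events coincide exactly with the groundtruth events; misplaced boundaries lower it.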


Linkage effect


Relabelling effect



○ Database

○ Clustering

○ Keyframe extraction

■ Blind taste test

■ Representative quality of keyframe

■ Summary validations

● Conclusions and future work

Evaluation: keyframe extraction

● User surveys:

○ Representative quality of keyframe

○ Quality of summary


● Methodology: Blind taste test

Lu and Grauman, “Story-driven summarization for egocentric video” (CVPR 2013)

Figure: brandchannel.com

Blind taste test: quality of keyframe

http://dx.doi.org/10.1109/CVPR.2013.350

http://www.brandchannel.com/home/post/2012/05/18/Pepsi-Coke-Taste-Test-051812.aspx

Representative quality of keyframe


Do you think that the image on the left/center/right can represent the event?

Example of multi-event segmentation


Representative quality of keyframe


Which image is more representative of the event, in your opinion?

Blind taste test: quality of summary


Summary validations


Can this set of images represent the complete day?

Summary validations


Which summary is the best, in your opinion?



Conclusions and future work

● New methodology taking into account visual and temporal information

● Keyframe extraction through graph-based approaches


Conclusions and future work

● 0.53 Jaccard index of segmentation

● 88-86% user acceptance with our summaries

● 58% of users chose our summaries as the best option


Conclusions and future work

● Temporal information causes important improvements

● First method of summary extraction for high temporal resolution sets


Conclusions and future work

● Apply object detection

● Different criteria of representativity

● Clinical application of this work


Conclusions and future work


Planned submission: March 30, 2015

Thanks for your attention!
