MIC-TJU at MediaEval Violent Scenes Detection (VSD) 2014

Tongji University

Bowen Zhang, Yun Yi and Hanli Wang

Content

Introduction

System Description

Shot Boundary Detection

Video and Audio Features4.

Results5.

Content

Introduction

System Description

Results5.

Violent Scene Detection (VSD) contains two sub-

task: Main Task and Generalization Task.

Train set: 24 movies from Hollywood

Test set:

Main Task: 5 movies from Hollywood

Generalization Task: 86 web videos

Introduction

Challenges: Difficulties in feature detection since no shot boundary

given Camera jitters make it hard to track trajectories

Introduction

Shot from Main Task Shot from Generalization Task

Motivation

Use shot boundary detection algorithm to detect shot

boundary.

Use camera motion elimination technique to remove

camera jitters.

Introduction

Content

Introduction

System Description

Results5.

Our approach: Shot boundary detection

Trajectory based video features

Appearance video feature

Audio feature

System Description

Trajectory based features

Appearance feature

Audio feature

Content

Introduction

System Description

Results5.

Based on difference of histograms Adaptive threshold

Shot boundary detection

Content

Motivation

System Description

Results5.

Video and audio features

Video and audio features Trajectory based features Camera motion elimination Appearance feature Audio feature

Saliant Trajectory Camera motion elimitation Classification

Why? Choose good points for tracking.

Saliant trajectory appraoch: SIFT keypoints detection

Dense optical flow

Multiple spacial scale tracking

Trajectory descriptions HOG (96=2×2×8×3)

HOF (108=2×2×9×3)

MBH (192=2×2×8×3×2)

(a) Frame k (b) Frame k+1 (c) HOG

(d) HOF (e) MBHx (f) MBHyIn Munsell system, orientation is indicated by color and magnitude by saturation.

Action may be fused by camera motion.

Background detection (Pixel-Based Adaptive Segmenter method)[2]

Keypoints match

Homography estimation (Random Sample Consensus)

Frame rectification

Camera motion elimitation

(a) Frame k (b) Background

(c) Keypoint match

Camera motion elimitation

(a) Frame k (b) Optical flow

(c) Warped optical flow (d) Trajectory

Dense SIFT

Dense Grid: patches with 4 pixel steps and 5 scales

SIFT: Calculate SIFT on dense grid.

Appearance feature

Dense Grid

The time window for each MFCC is 32 ms.

There is 50% overlap between two adjacent windows.

We integrate delta and double-delta of 20 dimensions MFCC vector

into the original MFCC vector to generate a 60-dimension MFCC

vector.

Audio feature

Content

Motivation

System Description

Results5.

Configuration of submitted runs

Conclusion

Run Trajectory based Features

Appearance Feature

Audio Feature

Fusion Weights

Run 1 HOG, HOF, MBH ‐ MFCC Late Fusion 4:1

Run 2 HOG, HOF, MBH Dense SIFT MFCC Double Fusion 4:1

Run 3 HOG, HOF, MBH Dense SIFT MFCC Double Fusion 1:1

Run 4 HOG, HOF, MBH Dense SIFT MFCC Late Fusion 4:1:1

Run 5 HOG, HOF, MBH Dense SIFT MFCC Late Fusion 4:1:1

Configuration of submitted runs

In the late fusion, an arithmetic sum of scores outputted from SVM

for video features (trajectory based features and appearance feature)

and audio feature is calculated.

In double fusion, we firstly early fuse video features and then late fuse

video features and audio feature.

The weight setting segmented by colon in Table stands for the weights

applied to different kinds of features during late fusion.

Conclusion

Results

Metrics of result is MAP2014.

Dense SIFT improves score.

Generalization Task outperform Main Task, because shots in

Generalization Task do not change as frequent as that in Main Task.

Conclusion

Run Main Task Generalization Task

Run 1 44.17% 56.01%

Run 2 43.07% 56.52%

Run 3 44.60% 55.56%

Run 4 39.23% 56.62%

Run 5 38.50% 56.00%

Thank you!

MIC-TJU at MediaEval Violent Scenes Detection (VSD) 2014

Software

Transcript of MIC-TJU at MediaEval Violent Scenes Detection (VSD) 2014

Mediaeval Studies 78

MediaEval 2015 - Recod @ MediaEval 2015: Diverse Social Images Retrieval

VSD Project (VSD#1348)

VSD/A - VSD/B

MediaEval 2015 - UNIZA System for the "Emotion in Music" task at MediaEval 2015

MediaEval 2015 - The Placing Task at MediaEval 2015

Mediaeval Church Vaulting

CHAPTER IN BIG DATA PROCESSING 7 SYSTEMS - TJU

DZS VSD and DZM VSD - Atlas Copco · DZS 100 – 400 VSD+ The DZS 100 VSD +, DZS 200 VSD and DZS 400 VSD are a range of Class 0 certified dry claw vacuum pumps; single stage, oil-free,

MediaEval 2015 - CERTH at MediaEval 2015 Synchronization of Multi-User Event Media Task

Mediaeval Mason

1.4a Procurement Issues (Dr. Zhang-TJU).ppt

E-Air VSD range · 131 166 191 237 360 466 H450 VSD H250 VSD H185 VSD E-Air VSD: Pressure range adjustable with PACE functionality Every E-Air VSD comes with PACE technology (Pressure

The Mediaeval Fair

A Mediaeval Burglary

ROBERT TOWNSON DISCOGRAPHY...81. vsd-5311 korngold: sinfonietta 82. vsd-5312 switch 83. vsd-5313 oscar 84. vsd-5314 la femme nikita 85. vsd-5315 the hard way 86. vsd-5316 love field

Vsd Manual

MediaEval 2015 - DMUN at the MediaEval 2015 C@merata Task: the Stravinsqi Algorithm

MediaEval Workshop 2011

MediaEval 2015 - OHSU @ MediaEval 2015: Adapting Textual Techniques to Multimedia Search