Video Analysis Video Search System - 國立臺灣大學winston/papers/tv05.poster.g.ppt.pdf ·...

Columbia UniversityVideo Search System

Evaluation

Shih-Fu Chang, Winston Hsu, Lyndon Kennedy, Akira Yanagawa, Eric Zavesky, Dong-Qing Zhang, in collaboration with IBM Research

http://www.ee.columbia.edu/dvmm

Text+Story+Anchor Removal

+cue-X Reranking+CBIR

+Concept Search

0.114


+cue-X reranking+CBIR

0.111


+cue-X reranking

0.107

Text+Story+Anchor Removal0.095

Text+Story0.087

Text0.039

Components

(in automatic search TRECVID 2005)

MAP

Objectives and Unique Features Fusion of detection and searching techniques at different levels across modalities Combination of parts-based graph model and fixed-feature classifiers for concept detection Information-theoretic Cue-X clustering for story segmentation and search re-ranking Semantic search over multi-modal metadata (ASR/MT and semantic concepts) Automatic discovery of query topic classes and user search patterns

http://www.ee.columbia.edu/cuvidsearch

Video AnalysisContent Extraction

automaticstory

segmentation

video,speech,

text

near-duplicatedetection

concept detectionfeature

extraction (text,video, prosody)

concept search

text search

Image matching

storybrowsing

Near-duplicatesearch

Interactivesearch

automatic/manualsearch

cue-Xre-ranking

mining querytopic classes

user searchpatternmining

Cue-X Information-theoretic Framework

• Robust story segmentation over TRECVID 05 data,F1=0.87 (ARB), 0.84 (CHN), 0.52 (ENG) w/o using text• Cue-X discovers optimal relevant clusters andre-orders the search results. Shown effective in videosearch.

Parts-Based Detector for Objects & Near-DuplicatesNear-duplicates

Concept Search and Query expansion

jenningstonighttext

audio

Person: 99%

Clinton: 90%US Flag: 80%

Podium: 70%

Give speech: 75%Press conference: 70%

“Find shots of Bill Clintonspeaking with a US flagvisible behind him.”

Query

Term extractionQuery expansion

(Web, WordNet etc)

MetadataIndex

Concept detection:

visual

• Populate visual concepts over a large lexicon• Shots represented as smoothed concept vectors• Match text queries to concept vectors by

incorporating semantic relevance (WordNet) anddetector robustness

Find Event D

Find Event E

Find Object F

Find Object G

AutomaticJointsemantics-performancegrouping

Find Person A

Find Person B

Find Person C

Query Semantics Search Performance

Manuallydefinedqueryclasses

Automatic Discovery of Query Classes

• Learn distinct query topic classes by clustering querysemantics and search performances jointly

• Parts-based statistical graph modelseffective for detecting objects/sceneswith dominant spatio-feature cues.

• Improvement of 10% when fusedwith fixed-feature classifiers.

• Near-duplicate browser as mosteffective tool in our interactivesearch.

• Tested over 170+ hours of multi-lingual news video, TRECVID 2005

• Story-based searches improves> 100% accuracy in automaticsearch

• Near-duplicate browsing doublesthe relevant shots in interactivesearch

• Cue-X re-ranking and conceptsearch proved very useful

• Web-based interactivesearch engine deployment

• Standard database(MySQL) and modularprocessing add-ons

• Personal user profilesand activity logs

Columbia on-line Video Search Engine

System/Demo

(close to the top performance of automatic search)

VideoTextAudio

Key:

Cue-X clusters automatically discoveredvia Information Bottleneck principle &Kernel Density Estimation

Y= story boundary

Y=“demonstration”

low-level features

… …

Information Bottleneck principle

Y= search relevance

Video Analysis Video Search System - 國立臺灣大學winston/papers/tv05.poster.g.ppt.pdf ·...

Documents

Transcript of Video Analysis Video Search System - 國立臺灣大學winston/papers/tv05.poster.g.ppt.pdf ·...