Video Analysis Video Search System - 國立臺灣大學winston/papers/tv05.poster.g.ppt.pdf ·...
Transcript of Video Analysis Video Search System - 國立臺灣大學winston/papers/tv05.poster.g.ppt.pdf ·...
Columbia UniversityVideo Search System
Evaluation
Shih-Fu Chang, Winston Hsu, Lyndon Kennedy, Akira Yanagawa, Eric Zavesky, Dong-Qing Zhang, in collaboration with IBM Research
http://www.ee.columbia.edu/dvmm
Text+Story+Anchor Removal
+cue-X Reranking+CBIR
+Concept Search
0.114
Text+Story+Anchor Removal
+cue-X reranking+CBIR
0.111
Text+Story+Anchor Removal
+cue-X reranking
0.107
Text+Story+Anchor Removal0.095
Text+Story0.087
Text0.039
Components
(in automatic search TRECVID 2005)
MAP
Objectives and Unique Features Fusion of detection and searching techniques at different levels across modalities Combination of parts-based graph model and fixed-feature classifiers for concept detection Information-theoretic Cue-X clustering for story segmentation and search re-ranking Semantic search over multi-modal metadata (ASR/MT and semantic concepts) Automatic discovery of query topic classes and user search patterns
http://www.ee.columbia.edu/cuvidsearch
Video AnalysisContent Extraction
automaticstory
segmentation
video,speech,
text
near-duplicatedetection
concept detectionfeature
extraction (text,video, prosody)
concept search
text search
Image matching
storybrowsing
Near-duplicatesearch
Interactivesearch
automatic/manualsearch
cue-Xre-ranking
mining querytopic classes
user searchpatternmining
Cue-X Information-theoretic Framework
• Robust story segmentation over TRECVID 05 data,F1=0.87 (ARB), 0.84 (CHN), 0.52 (ENG) w/o using text• Cue-X discovers optimal relevant clusters andre-orders the search results. Shown effective in videosearch.
Parts-Based Detector for Objects & Near-DuplicatesNear-duplicates
Concept Search and Query expansion
jenningstonighttext
audio
Person: 99%
Clinton: 90%US Flag: 80%
Podium: 70%
Give speech: 75%Press conference: 70%
“Find shots of Bill Clintonspeaking with a US flagvisible behind him.”
Query
Term extractionQuery expansion
(Web, WordNet etc)
MetadataIndex
Concept detection:
visual
• Populate visual concepts over a large lexicon• Shots represented as smoothed concept vectors• Match text queries to concept vectors by
incorporating semantic relevance (WordNet) anddetector robustness
Find Event D
Find Event E
Find Object F
Find Object G
AutomaticJointsemantics-performancegrouping
Find Person A
Find Person B
Find Person C
Query Semantics Search Performance
Manuallydefinedqueryclasses
Automatic Discovery of Query Classes
• Learn distinct query topic classes by clustering querysemantics and search performances jointly
• Parts-based statistical graph modelseffective for detecting objects/sceneswith dominant spatio-feature cues.
• Improvement of 10% when fusedwith fixed-feature classifiers.
• Near-duplicate browser as mosteffective tool in our interactivesearch.
• Tested over 170+ hours of multi-lingual news video, TRECVID 2005
• Story-based searches improves> 100% accuracy in automaticsearch
• Near-duplicate browsing doublesthe relevant shots in interactivesearch
• Cue-X re-ranking and conceptsearch proved very useful
• Web-based interactivesearch engine deployment
• Standard database(MySQL) and modularprocessing add-ons
• Personal user profilesand activity logs
Columbia on-line Video Search Engine
System/Demo
(close to the top performance of automatic search)
VideoTextAudio
Key:
Cue-X clusters automatically discoveredvia Information Bottleneck principle &Kernel Density Estimation
Y= story boundary
Y=“demonstration”
low-level features
… …
Information Bottleneck principle
Y= search relevance