![Page 1: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/1.jpg)
Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM)
Guo-Jun Qi
Beckman Institute
University of Illinois at Urbana-Champaign
![Page 2: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/2.jpg)
LSCOM (Large Scale Concept Ontology for Multimedia)
A broadcast news video dataset
200+ news videos / 170 hours
61,901 shots
Languages
◦ English/Arabic/Chinese
![Page 3: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/3.jpg)
Why a broadcast news ontology?
Critical mass of users, content providers, applications
Good content availability (TRECVID LDC FBIS)
Shares a large set of core concepts with other domains
![Page 4: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/4.jpg)
LSCOM Provides
Richly annotated video content for accomplishing required access and analysis functions over massive amounts of video content
Large-scale, useful, well-defined semantic lexicon
◦ More than 3000 concepts
◦ 374 annotated concepts
◦ Bridging the semantic gap from low-level features to high-level concepts
![Page 5: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/5.jpg)
An LSCOM concept
000 - Parade
Concept ID: 000
Name: Parade
Definition: Multiple units of marchers, devices, bands, banners or music.
Labeled: Yes
![Page 6: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/6.jpg)
LSCOM Hierarchy
http://www.lscom.org/ontology/index.html
Thing
.Individual
..Dangerous_Thing
...Dangerous_Situation
....Emergency_Incident
.....Disaster_Event
......Natural_Disaster
....Natural_Hazard
.....Avalance
.....Earthquake
.....Mudslide
.....Natural_Disaster
.....Tornado
...Dangerous_Tangible_Thing
....Cutting_Device
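The hierarchy listing above encodes depth by the number of leading dots before each concept name. As a minimal sketch (the function names `parse_dotted_hierarchy` and `paths` are illustrative, not part of any LSCOM tooling), the excerpt can be turned into a nested tree like this:

```python
def parse_dotted_hierarchy(lines):
    """Turn lines like '..Dangerous_Thing' into a nested tree.

    The number of leading dots is taken as the depth of the concept.
    """
    root = {"name": "<root>", "children": []}
    stack = [root]  # stack[k] is the latest node seen at depth k - 1
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        name = line.lstrip(".")
        depth = len(line) - len(name)
        node = {"name": name, "children": []}
        # Drop deeper nodes so that the parent ends up on top of the stack.
        del stack[depth + 1:]
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


def paths(node, prefix=()):
    """Yield every root-to-node path as a tuple of concept names."""
    for child in node["children"]:
        path = prefix + (child["name"],)
        yield path
        yield from paths(child, path)


# Excerpt copied from the slide (spelling of concept names kept as shown).
excerpt = """Thing
.Individual
..Dangerous_Thing
...Dangerous_Situation
....Emergency_Incident
.....Disaster_Event
......Natural_Disaster
....Natural_Hazard
.....Avalance
.....Earthquake
.....Mudslide
.....Natural_Disaster
.....Tornado
...Dangerous_Tangible_Thing
....Cutting_Device""".splitlines()

tree = parse_dotted_hierarchy(excerpt)
all_paths = list(paths(tree))
```

Each path gives the chain of ancestors for a concept, which is the kind of parent/child relation a tree-structured ontology provides.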
![Page 7: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/7.jpg)
Definition: What is an ontology? (Wikipedia)
An ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain.
![Page 8: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/8.jpg)
Ontology
Represents the visual knowledge base in a structured way
◦ Graph structure
◦ Tree (hierarchy) structure
Images/videos can be effectively learned and retrieved by the coherence between concepts
◦ Logical coherence
◦ Statistical coherence
![Page 9: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/9.jpg)
An Ontology Hierarchy: Military Vehicle
![Page 10: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/10.jpg)
An example from Wikipedia
![Page 11: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/11.jpg)
Ontology Tree for LSCOM
![Page 12: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/12.jpg)
A Light Scale Concept Ontology for Multimedia Understanding (LSCOM-Lite)
The aim is to break up the semantic space using a few concepts (39 concepts).
Selection Criteria
◦ Semantic Coverage
  As many of the semantic concepts in news videos as possible should be covered by the light concept set.
◦ Compactness
  These concepts should not semantically overlap.
◦ Modelability
  These concepts can be modeled with a smaller semantic gap.
![Page 13: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/13.jpg)
Selected concept dimensions
Divide the semantic space into a multi-dimensional space, where each dimension is nearly orthogonal
◦ Program Category
◦ Setting/Scene/Site
◦ People
◦ Objects
◦ Activities
◦ Events
◦ Graphics
![Page 14: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/14.jpg)
Histogram of LSCOM-Lite Concepts
![Page 15: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/15.jpg)
Some example keyframes
![Page 16: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/16.jpg)
Applications
Application I: Conceptual Fusion (most basic – early fusion)
Application II: Cross-Category Classification (inter-class relation)
Application III: Event Dynamics in Concept Space
![Page 17: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/17.jpg)
Application I: Conceptual Fusion
[Diagram labels: Video; Visual Features; Classifier; Concept 1, Concept 2, Concept 3, …, Concept n]
![Page 18: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/18.jpg)
LSCOM 374 Models
374 LIBSVM models
◦ http://www.ee.columbia.edu/ln/dvmm/columbia374/
◦ Features used (MPEG-7 descriptors)
  Color Moments
  Edge Histogram
  Wavelet Texture
◦ LIBSVM – a library for support vector machines at http://www.csie.ntu.edu.tw/~cjlin/libsvm/
![Page 19: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/19.jpg)
Application II: Cross-Category Classification with Concept Transfer
G.-J. Qi et al. Towards Cross-Category Knowledge Propagation for Learning Visual Concepts, in CVPR 2011
![Page 20: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/20.jpg)
Instance-Level Concept Correlation
[Figure: instances labeled +1 / -1 for the concepts Mountain and Castle; example groups: Mountain and castle, Castle only, Mountain only]
![Page 21: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/21.jpg)
Transfer Function
[Figure labels: Mountain, Castle; Mountain; Castle; None of them]
![Page 22: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/22.jpg)
Model Concept Relations
![Page 23: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/23.jpg)
Automatically construct ontology in a data-driven manner
![Page 24: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/24.jpg)
Application III – Event Dynamics in Concept Space
![Page 25: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/25.jpg)
Event Detection with Concept Dynamics
W. Jiang et al., Semantic event detection based on visual concept prediction, ICME, Germany, 2008.
![Page 26: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/26.jpg)
Open Problems
Cross-Dataset Gap
◦ Generalize the LSCOM dataset to other datasets (e.g., non-news video datasets)
Cross-Domain Gap
◦ Text scripts associated with news videos
  Can they help information extraction for visual concepts?
Automatic ontology construction
◦ Task dependent vs. task independent
◦ Data driven vs. prior knowledge (e.g., WordNet)
◦ Incorporate prior human knowledge (logic relations etc.)
![Page 27: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/27.jpg)
TRECVID Competition
Task I: High-Level Feature Extraction
◦ Input: subshot
◦ Output: detection results for 39 LSCOM-Lite concepts in the subshot
![Page 28: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/28.jpg)
High-Level Feature Extraction
Each concept is assumed to be binary (absent or present) in each subshot
Submission: Find subshots that contain a certain concept, rank them by the detection confidence score, and submit the top 2000.
Evaluation: NIST evaluated 20 medium-frequency concepts out of the 39 concepts using a 50% random sample of all the submission pools
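The submission step described above (rank subshots by confidence, keep the top 2000) can be sketched as follows. The subshot ids, the `scores` dictionary, and the function name `build_submission` are hypothetical; the official submission file format is not shown in these slides.

```python
def build_submission(scores, max_results=2000):
    """Return subshot ids sorted by descending confidence, truncated to max_results.

    scores maps a subshot id to the detector's confidence for one concept.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [subshot_id for subshot_id, _ in ranked[:max_results]]


# Toy confidences for a single concept (made-up values for illustration).
scores = {"shot1_1": 0.12, "shot1_2": 0.93, "shot2_1": 0.55}
submission = build_submission(scores, max_results=2)
```

One ranked list of this kind would be produced per concept.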
![Page 29: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/29.jpg)
20 Evaluated Concepts
![Page 30: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/30.jpg)
Evaluation Metric: Average Precision
Relevant subshots should be ranked higher than the irrelevant ones.
R is the total number of relevant images, R_j is the number of relevant images among the top j images, and I_j indicates whether the j-th image is relevant (I_j = 1) or not (I_j = 0). The sum runs over the N ranked images.
$$\text{Average Precision} = \frac{1}{R} \sum_{j=1}^{N} \frac{R_j}{j} I_j$$
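As a small worked sketch of the average precision definition above, the function below takes the relevance flags I_j in ranked order. The function name and the optional `total_relevant` argument (for the case where R counts relevant items that never appear in the ranked list) are assumptions made for illustration.

```python
def average_precision(relevance, total_relevant=None):
    """Average precision of a ranked list.

    relevance: list of 0/1 flags I_j, best-ranked item first.
    total_relevant: R; defaults to the number of relevant items in the list.
    """
    if total_relevant is None:
        total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0      # R_j: relevant items seen in the top j
    acc = 0.0
    for j, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            acc += hits / j   # the (R_j / j) * I_j term
    return acc / total_relevant


# Relevant items at ranks 1, 3 and 4: AP = (1/1 + 2/3 + 3/4) / 3
ap = average_precision([1, 0, 1, 1, 0])
```

Moving a relevant item to a higher rank increases the score, which matches the requirement that relevant subshots be ranked above irrelevant ones.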
![Page 31: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/31.jpg)
Results
![Page 32: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/32.jpg)
TRECVID Competition
Task II: Video Search
◦ Input: 24 text-based topics
◦ Output: relevant subshots in the database
![Page 33: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/33.jpg)
Topics to search
![Page 34: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/34.jpg)
Topics to search (cont'd)
![Page 35: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/35.jpg)
Topics to search
![Page 36: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/36.jpg)
Three Types of Search Systems
![Page 37: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/37.jpg)
Results: Automatic Runs
![Page 38: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/38.jpg)
Results: Manual Runs
![Page 39: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/39.jpg)
Results: Interactive Runs
![Page 40: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/40.jpg)
Machine Problem 7: Shot Boundary Detection in Videos
![Page 41: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/41.jpg)
Goals
Detect the abrupt content changes between consecutive frames.
◦ Scene changes
◦ Scene cuts
![Page 42: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/42.jpg)
Steps
Step 1: Measure the change of content between video frames
◦ Visual/Acoustic measurements
Step 2: Compare the content distance between successive frames. If the distance is larger than a certain threshold, then a shot boundary may exist.
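A minimal sketch of Step 2, assuming each frame has already been reduced to a feature vector and that Euclidean distance is used (the slides do not fix a particular distance, so this choice and the toy numbers are assumptions):

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect_boundaries(features, threshold):
    """Return indices i where the distance between frame i-1 and frame i
    exceeds the threshold, i.e. frame i may start a new shot."""
    return [
        i
        for i in range(1, len(features))
        if euclidean(features[i - 1], features[i]) > threshold
    ]


# Toy two-dimensional features: the content changes sharply at frame 2.
frames = [[0.0, 1.0], [0.1, 0.9], [0.9, 0.1], [0.8, 0.2]]
boundaries = detect_boundaries(frames, threshold=0.5)
```

The threshold trades missed boundaries against false alarms, so it is worth plotting the successive distances before fixing a value.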
![Page 43: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/43.jpg)
Measuring Content based on Visual Information
256-dimensional Color Histogram
◦ In RGB space, normalize r, g, b to [0,1]
◦ Color space: nr, ng
8×8 histogram
![Page 44: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/44.jpg)
Color Histograms
Divide each image into four parts; each part has an 8×8 histogram, giving 256-dimensional features in total.
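The 256-dimensional feature described above (four parts × 8 × 8 bins) could be computed roughly as below. Two points are assumptions, since the slide text does not spell them out: the normalized coordinates are taken to be nr = r/(r+g+b) and ng = g/(r+g+b), and the four parts are taken to be the 2 × 2 quadrants of the image. The image is represented as a plain list of rows of (r, g, b) tuples to keep the sketch dependency-free.

```python
def rg_histogram(pixels, bins=8):
    """bins x bins histogram over (nr, ng); values sum to 1 for a non-empty block."""
    hist = [0.0] * (bins * bins)
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            nr = ng = 1.0 / 3.0  # assumption: treat pure black as neutral
        else:
            nr, ng = r / total, g / total
        i = min(int(nr * bins), bins - 1)  # clamp nr == 1.0 into the last bin
        j = min(int(ng * bins), bins - 1)
        hist[i * bins + j] += 1.0
    n = len(pixels)
    return [h / n for h in hist] if n else hist


def frame_feature(image, bins=8):
    """Concatenate the histograms of the four image quadrants (assumed 2 x 2 split)."""
    height, width = len(image), len(image[0])
    mid_y, mid_x = height // 2, width // 2
    feature = []
    for rows in (range(0, mid_y), range(mid_y, height)):
        for cols in (range(0, mid_x), range(mid_x, width)):
            block = [image[y][x] for y in rows for x in cols]
            feature.extend(rg_histogram(block, bins))
    return feature


# Toy 4 x 4 image: top half pure red, bottom half pure green.
image = [[(255, 0, 0)] * 4 for _ in range(2)] + [[(0, 255, 0)] * 4 for _ in range(2)]
feature = frame_feature(image)
```

Normalizing each quadrant's histogram keeps the feature independent of image size, so frames of different resolution remain comparable.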
![Page 45: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/45.jpg)
Acoustic Features
12 cepstral coefficients
Energy (sum of squares of the raw signal)
Zero crossing rate (ZCR)
ZCR = sum(|sign(S(2:N))-sign(S(1:N-1))|)
Hint: normalize the energy to avoid it over-dominating when computing distances between successive frames
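The ZCR expression above is written in MATLAB-style indexing. A direct Python transcription, together with the energy definition, might look like this (the function names and the toy audio frame are illustrative):

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def zero_crossing_rate(samples):
    """Python version of sum(|sign(S(2:N)) - sign(S(1:N-1))|)."""
    return sum(abs(sign(b) - sign(a)) for a, b in zip(samples, samples[1:]))


def energy(samples):
    """Sum of squares of the raw signal samples."""
    return sum(s * s for s in samples)


# Toy audio frame with three sign changes.
frame = [1.0, -1.0, 2.0, 3.0, -0.5]
zcr = zero_crossing_rate(frame)
frame_energy = energy(frame)
```

Note that with this formula each full sign change contributes 2, so the value is twice the number of crossings when no sample is exactly zero.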
![Page 46: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/46.jpg)
Datasets
Two videos of a little over one minute
Manually label the shot boundaries
![Page 47: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/47.jpg)
What to submit
Source code
Report
◦ Compare shot boundary detection results returned by your algorithm with the manually labeled boundaries
◦ Compare
◦ Explain your choice of threshold
◦ Explain the differences between the acoustic-based and visual-based detection results
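One possible way to compare detected boundaries with the manual labels is to count matches within a small frame tolerance and report precision and recall. The slides do not prescribe a metric, so the metric, the greedy matching, the `tolerance` parameter, and the frame numbers below are all assumptions for illustration.

```python
def match_boundaries(detected, labeled, tolerance=0):
    """Greedy one-to-one matching of detected to labeled boundaries.

    A detection counts as a hit if an unmatched labeled boundary lies
    within `tolerance` frames. Returns (precision, recall).
    """
    unmatched = sorted(labeled)
    hits = 0
    for d in sorted(detected):
        for t in unmatched:
            if abs(d - t) <= tolerance:
                unmatched.remove(t)  # each label can be matched only once
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(labeled) if labeled else 0.0
    return precision, recall


# Made-up frame indices: two of three detections fall near a labeled boundary.
precision, recall = match_boundaries([30, 95, 150], [31, 150, 210], tolerance=2)
```

Running this for several thresholds, and separately for the acoustic and visual features, gives numbers that can support the explanations asked for in the report.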
![Page 49: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.](https://reader030.fdocuments.us/reader030/viewer/2022032707/56649e525503460f94b47d83/html5/thumbnails/49.jpg)
Thanks!
Q&A