Using Multiple Synchronized Views Presenter: Teklu Urgessa Efficient Video Browsing.
Using Multiple Synchronized Views
Presenter: Teklu Urgessa
Efficient Video Browsing
Authors
Arnon Amir, Savitha Srinivasan and Dulce Ponceleon
IBM Almaden Research Center
Key Words for Publication
Video Retrieval, Multimedia Browsing, Video Browsing, Synchronized Views, Audio Time Scale Modification (TSM), Fast Playback, Video Browser, Anima-visualization Techniques, Slide Show, Adaptive Accelerating ……
Table of contents
Introduction
Traditional Methods
Problems with Traditional Methods
Advanced Technology/Methods
Technology for Visual
Technology for Audio
Summary
Reference
1. Introduction
Text Browsing vs. Multimedia Browsing (MMB)
Browsing text documents is simple and fast.
Browsing multimedia documents is not as easy as text browsing; it is complex and time consuming.
Production and application of video content is increasing over time.
An efficient way of browsing video is therefore crucial.
The paper deals with different methods of efficient video browsing.
Growth of Digital Contents
Increasing Demand in Video Content
Factors for the Fast Growth of Digital Content (DC)
Digital video becomes common, from our: smart phones, notebooks, webcams, digital cameras and camcorders, security and monitoring cameras
Advanced streaming technology
Fast Internet access
MPEG-4 format
Diversity in the application areas of video
Application Areas: Where Videos are Important
Entertainment
Education and Training
Distance Learning: Online Distance Learning
Medical and Technical Manuals
Advertisements
...
Problem/Challenges
As the amount of video-rich (multimedia) data grows:
Finding and accessing content in large video repositories becomes a critical problem
Given need of users: quick and efficient retrieval
Need of Research in the Area of the Efficient Video Retrieval
Major research efforts were underway in the last decade to find the best and most efficient methods of video indexing, searching and retrieval.
Nature of Video Retrieval Research: Multidisciplinary
Areas of research:
Computer Vision
Pattern Recognition
Speech Recognition
Information Retrieval
…
Basic Concepts: Searching and Browsing
Both activities are tightly coupled.
Searching: needs specific entries, i.e. you can search for a specific company or a person.
Browsing: a generic approach, e.g. Korean foods or houses.
A combination of both can also happen: first search the broader concept and then browse to reach the specific concept, and vice versa.
2. Traditional Methods
Finding Video Data
Search through categories
Similar to an Internet shopping mall
We search for big categories, then smaller categories… and so on…
User should choose which to browse
Should check whether the selected data matches what the user needs
Manual categorization and annotation
One by one? Time consuming!
Problems with Traditional Video Search and Browsing Technologies
The authors stated that:
Too complicated: lack of efficient algorithms
Time consuming: multimedia calculations are complex and demanding
Inaccuracy
Video data is increasing exponentially
Manual cataloging is a big limitation
Manual cataloging is error prone: it lacks accuracy due to subjectivity
3. Advanced Technologies for Image and Video Retrieval
MPEG-7 Standards
Speech Indexing
Shot Boundary Detection
Time Scale Modification of Audio Signals
Storyboards, Moving Storyboards and Animation
Adaptive Accelerating Fast Playback
Streaming Synchronized Views
MPEG-7: Multimedia Description Standard
Standardized by: International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC)
ISO/IEC 15938 (Multimedia content description interface)
Not a video encoding format for moving pictures like MPEG-1, MPEG-2 and MPEG-4
MPEG-7 uses XML to store metadata/descriptions
The description can be attached to a timecode in the multimedia content in order to tag a particular event.
With this tag, content can be indexed and searched efficiently
Yet, improvement is needed
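To illustrate how a description attached to timecodes makes events searchable, here is a minimal Python sketch. The XML layout, the element and attribute names (`description`, `event`, `start`, `label`) and the file name are hypothetical and deliberately simplified; they do not follow the actual MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified description document. Element and attribute
# names are illustrative only and do NOT follow the real MPEG-7 schema.
DESCRIPTION_XML = """
<description media="lecture.mp4">
  <event start="00:00:05" label="title slide"/>
  <event start="00:01:30" label="speaker introduces MPEG-7"/>
  <event start="00:04:10" label="demo of shot boundary detection"/>
</description>
"""


def timecode_to_seconds(timecode):
    """Convert an HH:MM:SS timecode string to seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def load_events(xml_text):
    """Parse the description into a list of (seconds, label) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (timecode_to_seconds(event.get("start")), event.get("label"))
        for event in root.iter("event")
    ]


def find_events(events, keyword):
    """Return all tagged events whose label contains the keyword."""
    keyword = keyword.lower()
    return [(t, label) for t, label in events if keyword in label.lower()]


events = load_events(DESCRIPTION_XML)
matches = find_events(events, "mpeg-7")
```

Because the description is stored separately from the encoded video, a search like this never has to decode the media itself; it only returns the time offsets a player would seek to.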
Illustration: Independence between Description and Content
Source: http://en.wikipedia.org/wiki/File:Mpeg7image1.svg
How it works
Source: http://en.wikipedia.org/wiki/File:Mpeg7image1.svg
Speech Indexing
Search through speech transcripts
Uses the familiar metaphor of free text search
Automatic speech recognition (ASR)
Indexed transcript → semantic information
Main advantage: representation
Speech is built of words
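The idea of free text search over an indexed transcript can be sketched as a small inverted index. This is only an illustration: the transcript segments and their start times below are made up, and a real system would take them from an ASR engine.

```python
import re
from collections import defaultdict

# Hypothetical ASR output: (start time in seconds, recognized text).
TRANSCRIPT = [
    (0.0, "welcome to the lecture on video browsing"),
    (12.5, "shot boundary detection finds cuts in the video"),
    (30.0, "time scale modification speeds up the audio"),
]


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(transcript):
    """Map each word to the start times of the segments that contain it."""
    index = defaultdict(list)
    for start, text in transcript:
        for word in set(tokenize(text)):
            index[word].append(start)
    return index


def search(index, query):
    """Return start times of segments that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


index = build_index(TRANSCRIPT)
video_hits = search(index, "video")
```

Each hit is a time offset, so a browser can jump straight to the point in the video where the words were spoken.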
Shot Boundary Detection
Shot Boundary Detection (SBD) algorithm
Completely automatic
Key frames are selected and extracted
Saved as JPEG files
High accuracy and efficiency
Still, the fault detection problem is unsolved
Definitions: Basic Concepts
Frame: composed of picture elements, just like a chess board
Key frame: represents a shot
Shot: a group of similar frames
Start key frame, End key frame, Animation
3 levels of Video Browsing
SBD
The key to efficient video visualization is accurate detection of boundaries
A shot is a continuous sequence of frames as captured by the camera
Often represented by a single key frame in the storyboard
Shot boundaries: changes between shots
Created during the editing phase (hard cut, fade, dissolve)
Can be gradual or abrupt
SBD Algorithms
Four shot boundary detection algorithms:
1. Color Histogram Differences: the best and most balanced “older” algorithm; used for hard cuts
2. Edge Change Ratio: the recently proposed algorithm; used for hard cut, fade and editing
3. Standard Deviation of Pixel Intensities: for fades
4. Contrast: for dissolves
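As a rough illustration of the first algorithm, the sketch below flags a hard cut wherever the histograms of two consecutive frames differ strongly. It makes several simplifying assumptions that are not from the slides: frames are flat, non-empty lists of grayscale intensities (0 to 255) rather than color images, and the bin count and threshold are arbitrary values.

```python
def histogram(frame, bins=8, max_value=256):
    """Count the pixel intensities of one frame into equal-width bins."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return counts


def histogram_difference(frame_a, frame_b, bins=8):
    """Normalized sum of absolute bin differences, in the range 0..1."""
    hist_a = histogram(frame_a, bins)
    hist_b = histogram(frame_b, bins)
    total = len(frame_a) + len(frame_b)
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b)) / total


def detect_hard_cuts(frames, threshold=0.5):
    """Return indices i where a cut is assumed between frame i-1 and frame i."""
    return [
        i
        for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i]) > threshold
    ]


# Toy input: three dark frames followed by two bright frames.
dark = [10, 20, 15, 12]
bright = [240, 250, 245, 230]
frames = [dark, dark, dark, bright, bright]
cuts = detect_hard_cuts(frames)
```

A fixed threshold works for abrupt changes like this one; gradual transitions such as fades and dissolves spread the change over many frames, which is why the other listed algorithms target them.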
Time Scale Modification of Audio Signals
Efficient video browsing needs efficient audio browsing
Except for images, most digital contents are audible
Faster audio browsing is necessary
TSM: allows speeding up or slowing down audio without noticeable distortion
Skip pitch periods to speed up; duplicate them to slow down
Human speech signals are quasi-periodic
Changing total play time: deleting or inserting small audio segments
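The skip-or-duplicate idea can be shown with a deliberately naive sketch. It assumes a known, constant pitch period given in samples, which real speech does not have; practical algorithms such as SOLA search for well-aligned splice points and cross-fade the overlapping segments. The function name and parameters are illustrative only.

```python
def time_scale(samples, period, speed):
    """Naive TSM: keep, skip or repeat whole pitch periods.

    speed > 1 shortens the signal (periods are skipped),
    speed < 1 lengthens it (periods are repeated).
    """
    if period <= 0 or speed <= 0:
        raise ValueError("period and speed must be positive")
    # Cut the signal into consecutive chunks of one pitch period each.
    periods = [samples[i:i + period] for i in range(0, len(samples), period)]
    output = []
    position = 0.0  # read position, measured in input periods
    while int(position) < len(periods):
        output.extend(periods[int(position)])
        position += speed
    return output


samples = list(range(8))                # stand-in for audio samples
faster = time_scale(samples, 2, 2.0)    # every second period is skipped
slower = time_scale(samples, 2, 0.5)    # every period is played twice
```

Because whole periods are removed or repeated, the pitch of each remaining period is unchanged; only the total play time changes.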
Improvement of TSM
Time Scale Modification (TSM) algorithms
Waveform Similarity Overlap-Add (WSOLA)
Time-Domain Harmonic Scaling (TDHS) technique
Time-Domain Pitch Synchronous Overlap-Add (TD-PSOLA)
Foundation and general formulation
Simple time domain
Modern speech TSM algorithm
Pointer Interval Controlled Overlap-Add (PICOLA)
Optional and applicable to all MPEG-4 audio coding schemes
Used in the paper
Synchronous Overlap-Add (SOLA)
Storyboards, Moving Storyboards and Animation
Storyboard: a set of one or more pages, each consisting of a two-dimensional array of key frames, sorted in chronological order.
Animation: a quick slide show, where each of the key frames is shown for a fixed short period (e.g., 0.6 seconds).
Moving Storyboard (MSB): the animated key frames, fully synchronized with the original audio track. Each key frame is shown for the entire duration of the associated shot.
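The three views differ mainly in how key frames are laid out in space or time. The sketch below computes a page layout for a storyboard and display schedules for the animation and the moving storyboard. The function names, the page size and the shot times are assumptions chosen for illustration; only the 0.6 second period comes from the definition above.

```python
def storyboard_pages(key_frames, rows, cols):
    """Split chronologically sorted key frames into pages of rows x cols."""
    per_page = rows * cols
    return [
        key_frames[i:i + per_page]
        for i in range(0, len(key_frames), per_page)
    ]


def animation_schedule(num_key_frames, seconds_per_frame=0.6):
    """(key frame index, start, end) for a fixed-rate slide show."""
    return [
        (i, i * seconds_per_frame, (i + 1) * seconds_per_frame)
        for i in range(num_key_frames)
    ]


def moving_storyboard_schedule(shot_times):
    """(key frame index, start, end), one entry per shot.

    shot_times holds the start time of every shot followed by the end
    time of the video, so each key frame stays on screen for the whole
    duration of its shot and remains in sync with the audio track.
    """
    return [
        (i, start, end)
        for i, (start, end) in enumerate(zip(shot_times, shot_times[1:]))
    ]


pages = storyboard_pages(["kf0", "kf1", "kf2", "kf3", "kf4"], rows=2, cols=2)
animation = animation_schedule(3)
moving = moving_storyboard_schedule([0.0, 4.0, 9.5, 12.0])
```

For the same three shots, the animation finishes in under two seconds, while the moving storyboard lasts as long as the original video (12 seconds in this toy example) because it follows the audio.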
Example: http://www.youtube.com/watch?v=-l4Xzak9LpM
Example of Storyboard
Adaptive Accelerating Fast Playback
Very fast video playback (without audio)
Ordinary fast forward depends only on speed
There is a chance to miss an important scene
Accelerates until a new scene is met
Requires less computational load
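One possible reading of "accelerates until a new scene is met" is a playback speed that ramps up inside a shot and drops back at every shot boundary. The sketch below assumes a linear ramp with a speed cap; the slides do not give the actual rule, so the parameters `base_speed`, `acceleration` and `max_speed` are purely illustrative.

```python
def playback_speeds(num_frames, shot_starts, base_speed=1.0,
                    acceleration=0.5, max_speed=8.0):
    """Speed factor per frame: ramps up within a shot, resets at a new shot.

    shot_starts holds the frame indices at which a new shot begins,
    for example as produced by a shot boundary detector.
    """
    starts = set(shot_starts)
    speeds = []
    speed = base_speed
    for frame in range(num_frames):
        if frame in starts:
            speed = base_speed  # new scene: slow down so it is not missed
        speeds.append(speed)
        speed = min(speed + acceleration, max_speed)
    return speeds


speeds = playback_speeds(6, shot_starts=[0, 4])
```

Long static shots are skimmed at increasing speed, while the start of every new shot is shown at the base speed, which addresses the risk of missing an important scene with ordinary fast forward.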
Conclusion
Multimedia browsing is not as simple as text browsing
Studies on efficient video browsing are still underway
Adaptive accelerating fast playback: most useful for analyzing surveillance videos
SBD: useful for visual content
TSM: useful for audio content
Efficient video retrieval implements the above technologies
Questions
1. Explain the different levels of the MPEG-7 description method for visual content.
2. What method is appropriate for efficient audio retrieval?
3. Is MPEG-7 a content compression tool? If not, why? Who standardized it, and what is the name of its standard?
4. What method is an efficient way of visual content retrieval?
5. Explain the differences among shot, key frame and shot boundary.
References
Shot Boundary Detection
http://muvis.cs.tut.fi/sbd.html
http://link.springer.com/chapter/10.1007%2F11795131_95?LI=true#page-1
Key frame
http://en.wikipedia.org/wiki/Key_frame
Synchronous Overlap-Add
http://www.surina.net/article/time-and-pitch-scaling.html
Growth of Digital Information Created and Replicated
http://www.emc.com/leadership/digital-universe
MPEG-7 standard
http://www.en.wikipedia.org/wiki/File:Mpeg7image1.svg
PSOLA (Pitch Synchronous Overlap and Add)
http://en.wikipedia.org/wiki/PSOLA