ENEE408G Capstone -- Multimedia Signal Processing (F'05) Video Analysis, Processing & Communications...
ENEE408G Capstone -- Multimedia Signal Processing (F'05)
Video Analysis, Processing & Communications
Fall’05 Instructor: Carol Espy-Wilson
Electrical & Computer Engineering
University of Maryland, College Park
http://www.ece.umd.edu/class/enee408g/
ENEE408G Fall 2005 Lecture-9
ENEE408G Capstone -- Multimedia Signal Processing (F'05)Lec9 – Video Proc/Analysis/Comm.
4/7/04 [2]
Last Lecture
Motion estimation and compensation
Hybrid video coding/compression
– MPEG-1 video coding
Today:
– Video content analysis
– Different coding approaches and representations of video
– Overview: video capturing and display
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
Video Content Analysis
Introduction to Video Content Analysis
Teach computer to “understand” video content
– Define features that computer can learn to measure and compare: color (RGB values or other color coordinates), motion (magnitude and directions), shape (contours), texture and patterns
– Give example correspondences so that computer can learn to build connections between features and higher-level semantics/concepts: statistical classification and recognition techniques
Video understanding
– Break a video sequence into chunks, each with consistent content ~ “shot”
– Group similar shots into scenes that represent certain events
– Describe connections among scenes via story boards or scene graphs
– Associate shot/scene with representative feature/semantics for future query
Video Understanding (step-1)
– Break a video sequence into chunks, each with consistent content ~ “shot”
From Yeung-Yeo-Liu: STG (Princeton)
Video Understanding (step-2)
– Group similar shots into scenes
From Yeung-Yeo-Liu: STG (Princeton)
Video Understanding (step-3)
– Describe connections among scenes via story boards or scene graphs
From Yeung-Yeo-Liu: STG (Princeton)
Video Temporal Segmentation
A first step toward video content understanding
Two types of transitions
– “Cut” ~ abrupt transition
– Gradual transition: fade out and fade in; dissolve; wipe
– Demo
Detecting transitions
– Detecting a cut is relatively easy ~ check frame-wise difference
– Detecting dissolve and fade by checking linearity: $f_0 (1 - t/T) + f_1 \cdot t/T$
– Detecting a wipe is more difficult ~ via projection, edge pattern, or linearity of color histogram
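The two simpler detectors listed above can be sketched in a few lines. This is only an illustration, assuming grayscale frames stored as lists of rows; the function names, the mean-absolute-difference measure, and the threshold are assumptions made for the example and do not come from the slides.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def detect_cuts(frames, threshold):
    """Indices t where the change from frame t-1 to frame t exceeds the threshold."""
    return [t for t in range(1, len(frames))
            if mean_abs_difference(frames[t - 1], frames[t]) > threshold]


def dissolve_residual(frames):
    """Largest deviation of any pixel from the linear blend f0*(1 - t/T) + f1*t/T,
    where f0 and f1 are the first and last frames of the candidate transition."""
    f0, f1 = frames[0], frames[-1]
    T = len(frames) - 1  # requires at least two frames
    worst = 0.0
    for t, frame in enumerate(frames):
        w = t / T
        for row0, row1, row in zip(f0, f1, frame):
            for p0, p1, p in zip(row0, row1, row):
                worst = max(worst, abs(p - (p0 * (1 - w) + p1 * w)))
    return worst
```

A small residual over a window suggests the frames follow the linear blend and may form a dissolve or fade; how small is "small" would have to be tuned on real footage.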
Types of Transitions
– [above] Transition types offered by Adobe Premiere
– See also transition demos provided by PowerPoint
From talks by Joyce-Liu (Princeton)
Video transition collection (Rob Joyce) www.ee.princeton.edu/~robjoyce/research/transitions/
Compressed-Domain Processing
Use I & P frames only to reduce computation and to enhance robustness in scene change detection
– … I b b P b b P b b P b b I b b P …
Working in compressed domain
– Process video with only partial decoding (inverse VLC, etc.), without a full decoding (IDCT), to save computation
Low-resolution version already provides enough information for transition detection
– DC-image (spatially reduced version of the original image): pixel (i,j) in the DC image is the average value of block (i,j) in the original image
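The block-average definition of the DC image can be illustrated directly in the pixel domain. This is a sketch, assuming a grayscale frame stored as a list of rows and 8x8 blocks; `dc_image` is a hypothetical helper name. In a real compressed-domain system the values would be read from the bitstream's DC coefficients rather than computed from decoded pixels (for an orthonormal 8x8 DCT the DC coefficient is 8 times the block mean).

```python
def dc_image(frame, block=8):
    """Return the image of block averages: output pixel (i, j) is the mean of
    block (i, j) of the input frame. Partial blocks at the border are ignored."""
    rows = len(frame) // block
    cols = len(frame[0]) // block
    out = []
    for bi in range(rows):
        out_row = []
        for bj in range(cols):
            total = sum(frame[bi * block + y][bj * block + x]
                        for y in range(block) for x in range(block))
            out_row.append(total / (block * block))
        out.append(out_row)
    return out
```

For a 352x240 frame this produces a 44x30 image, which is the kind of size reduction that makes frame-to-frame comparison cheap.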
DC Image
– Put DC of each block together
– Already contains most information of the video
DC Frame
Example From Joyce-Liu (Princeton)
Fast Extraction of DC Image From MPEG-1
I frame
– Take DC coeff. from each block and put together
P/B frame
– Fast approximation of reference block’s DC
– Adding DC of the motion compensation residue (recall DCT is a linear transform)
$$[\mathrm{DCT}(P_{\mathrm{cur}})]_{00} = [\mathrm{DCT}(P_{\mathrm{ref}})]_{00} + [\mathrm{DCT}(P_{\mathrm{diff}})]_{00}$$
$$[\mathrm{DCT}(P_{\mathrm{ref}})]_{00} \approx \sum_{i=1}^{4} \frac{h_i w_i}{64} \, [\mathrm{DCT}(P_i)]_{00}$$
[Figure: blocks labeled 1, 2, 3, 4, with C (current block) and R (reference block)]
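The weighted-sum approximation for P/B frames can be sketched as follows. The code assumes 8x8 blocks and that the reference block is displaced by `(dx, dy)` pixels from the top-left of its four overlapped neighbors, so that the overlap heights and widths play the role of h_i and w_i; the function and parameter names are illustrative assumptions, not taken from the slides.

```python
def approx_reference_dc(neighbor_dc, dx, dy, block=8):
    """Approximate the DC of a motion-compensated reference block.

    neighbor_dc: [[dc1, dc2], [dc3, dc4]], DC values of the four blocks
                 that the reference block overlaps (top row, bottom row).
    dx, dy:      offset of the reference block inside block 1,
                 with 0 <= dx, dy < block.
    """
    heights = [block - dy, dy]  # overlap height with top and bottom rows
    widths = [block - dx, dx]   # overlap width with left and right columns
    total = 0.0
    for r in range(2):
        for c in range(2):
            total += heights[r] * widths[c] * neighbor_dc[r][c]
    return total / (block * block)  # the 1/64 factor for 8x8 blocks


def approx_current_dc(neighbor_dc, dx, dy, residual_dc, block=8):
    """DC of the current block = approximate reference DC + DC of the residue."""
    return approx_reference_dc(neighbor_dc, dx, dy, block) + residual_dc
```

Only the DC terms of the neighbors are used, which is why the result is an approximation: it avoids any inverse DCT at the cost of ignoring the AC contributions of the overlapped blocks.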
Compressed-Domain Scene Change Detection
Compare nearby frames
– Take pixel-wise difference of nearby DC-frames
– Or take pixel-wise difference of every N frames to accumulate more changes => useful for detecting gradual transitions
Observe the pixel-wise difference for different frame pairs
– Peaks @ cuts, and plateaus @ gradual transitions
Figure from Yeo-Liu CSVT’95 paper
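The peak-versus-plateau behavior can be reproduced on synthetic data. This is a sketch under simple assumptions: DC frames are lists of rows, the difference measure is a sum of absolute differences, and `n_frame_differences` is a hypothetical name chosen for the example.

```python
def n_frame_differences(dc_frames, n):
    """Sum of absolute pixel differences between DC frame t and DC frame t + n,
    for every t where both frames exist."""
    diffs = []
    for t in range(len(dc_frames) - n):
        a, b = dc_frames[t], dc_frames[t + n]
        diffs.append(sum(abs(x - y)
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))
    return diffs


# Synthetic one-pixel "videos": an abrupt cut and a ten-step linear dissolve.
cut_video = [[[0]]] * 4 + [[[100]]] * 4
dissolve_video = [[[v]] for v in [0, 0] + [10 * k for k in range(1, 11)] + [100, 100]]

cut_diffs = n_frame_differences(cut_video, 1)
dissolve_diffs = n_frame_differences(dissolve_video, 3)
```

With these inputs the cut shows up as a single isolated large value, while the dissolve produces a run of equal mid-sized values, since each pair of frames N apart inside the transition differs by the same amount.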
Scene Change Detection (cont’d)
Figure from Yeo-Liu CSVT’95 paper
Wipe Detection (cont’d)
More diverse and fancy wipes
Linear change in color histogram
[Figure: color histogram during a wipe, with labels G_k, H_k, m, n and axes Bin 1, Bin 2, Bin 3]
From talks by Joyce-Liu (Princeton)
Color Histogram
What is color histogram?
– Count the # of pixels with the same color
– Plot color value vs. corresponding pixel count
Similarly for luminance histogram
Gives an idea of the dominant color and color distribution
– Ignores the exact spatial location of each color value
– Useful in image and video analysis
Color histogram can be used to:
– Detect gradual shot transitions, esp. fancy wipes
– Measure content similarity between images / video shots
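A coarse color histogram and one possible similarity score can be sketched as below. The quantization into 4 bins per channel and the use of histogram intersection are assumptions chosen for the example; the slides do not specify a bin layout or a similarity measure.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram of (r, g, b) pixels with values in 0..255.
    bins_per_channel should divide 256 evenly."""
    step = 256 // bins_per_channel
    hist = [0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1
        count += 1
    return [h / count for h in hist]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]: 1 for identical normalized histograms, 0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

Because pixel positions are discarded, two images with the same colors in different places get a similarity of 1, which matches the slide's point that the histogram ignores spatial location.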
Overview of Video Capturing & Display
Video Camera
Frame-by-frame capturing
CCD sensors (Charge-Coupled Devices)
– 2-D array of solid-state sensors
– Each sensor corresponds to a pixel
– Stored in a buffer and sequentially read out
– Widely used: small and light
Video Display
CRT (Cathode Ray Tube)
– Large dynamic range
– Bulky for large display: CRT physical depth has to be similar to screen width
LCD flat-panel display
– Use electrical field to change the optical properties, hence the brightness/color, of liquid crystal
– Generating the electrical field: by an array of transistors (active-matrix thin-film transistors), or by plasma
“Active-matrix display” (also known as TFT) has a transistor located at each pixel, allowing the display to be switched more frequently and with less current to control pixel luminance. A passive-matrix LCD has a grid of conductors with pixels located at the grid intersections.
Composite vs. Component Video
Component video
– Three separate signals for tristimulus color representation or luminance-chrominance representation
– Pro: higher quality
– Con: needs high bandwidth and synchronization
Composite video
– Multiplex into a single signal
– Historical reason: transmitting color TV through monochrome channel
– Pro: saves bandwidth
– Con: cross talk
S-video: luminance signal + single multiplexed chrominance signal
Analog Video Raster
Line-by-line “Raster Scan”
– Represent line-by-line image frame with 1-D analog waveform
– Synchronization signal for horizontal and vertical retrace
Forming Picture on TV Tube (Monochrome)
How many lines?
From B.Liu EE330S’01 Princeton
How Many TV Lines?
Determined by spatial freq. response of HVS (recall Lecture-2)
[Figure: two dots viewed from a distance]
Cannot resolve if distance > 2000 x separation (~ 0.03 degree viewing angle)
From B.Liu EE330S’01 Princeton
N = 500 for D=4H
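The figure N = 500 can be checked with a short calculation. The slide does not define its symbols, so the reading assumed here is: N is the number of scan lines, H the picture height, and D the viewing distance, with adjacent lines separated by H/N. Lines stop being individually resolvable once the viewing distance reaches 2000 times their separation:

```latex
\[
D = 2000 \cdot \frac{H}{N}
\quad\Longrightarrow\quad
N = \frac{2000\,H}{D}
\quad\Longrightarrow\quad
N\big|_{D = 4H} = \frac{2000}{4} = 500,
\qquad
\theta = \arctan\frac{1}{2000} \approx 0.029^\circ \approx 0.03^\circ .
\]
```

Under this reading, adding lines beyond about 500 gives no visible benefit to a viewer sitting at four picture heights.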
Progressive vs. Interlaced Scan
From B.Liu EE330S’01 Princeton
Analog Color TV Systems
Historical notes
– Color TV system had to be compatible with earlier monochrome TV system
3 formats
– NTSC ~ North America + Japan/Taiwan
– PAL ~ Western Europe + Asia (China) + Middle East
– SECAM ~ Eastern Europe + France
– What format in your home country?
From Wang’s Preprint Fig.1.5
Comparison of Three Analog TV Systems
– Spatial and temporal resolution
– Color coordinate
– Signal bandwidth
– Multiplexing of luminance, chrominance, and audio
(From Wang’s Book Preprint)
NTSC
4:3 aspect ratio (width:height)
525 lines/frame, 2:1 interlace at field rate 59.94 Hz
– 483 active lines per frame; vertical retrace takes time of 9 lines
– rest for broadcaster’s info. like closed caption
YIQ color coordinate for transmission
– RGB primary slightly different from PAL
– Orthogonal chrominance: I ~ orange-to-cyan; Q ~ green-to-purple (needs less bandwidth)
Multiplexing over 6 MHz total bandwidth
– Artifacts due to cross talk between luminance and chrominance
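The meaning of the I and Q axes can be checked numerically with an RGB-to-YIQ conversion. The coefficients below are the commonly quoted rounded textbook values, not numbers taken from the slides, and the inputs are assumed to be normalized to the range 0..1.

```python
def rgb_to_yiq(r, g, b):
    """Convert normalized RGB (0..1) to YIQ using rounded textbook coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    i = 0.596 * r - 0.274 * g - 0.322 * b   # orange-to-cyan axis
    q = 0.211 * r - 0.523 * g + 0.312 * b   # green-to-purple axis
    return y, i, q


white = rgb_to_yiq(1.0, 1.0, 1.0)
orange = rgb_to_yiq(1.0, 0.5, 0.0)
cyan = rgb_to_yiq(0.0, 1.0, 1.0)
green = rgb_to_yiq(0.0, 1.0, 0.0)
purple = rgb_to_yiq(1.0, 0.0, 1.0)
```

White carries essentially no chrominance, orange and cyan land on opposite signs of I, and green and purple land on opposite signs of Q, which is the sense in which the two chrominance components span those color directions.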
NTSC 6 MHz Bandwidth
From Wang’s Book Preprint Fig.1.6(b)
Analog Video Recording
Comparison of common formats
From Wang’s Book Preprint Table 1.2
Digital Video Formats
ITU-R BT.601 recommendation (ITU-R: International Telecommunication Union – Radio sector)
Downsampled chrominance
– Y Cb Cr coordinate and four subsampling formats
From Wang’s Book Preprint Fig.1.8
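The effect of chrominance downsampling on data volume can be tabulated with a small helper. The slide does not list the four formats; the sketch assumes they are the usual 4:4:4, 4:2:2, 4:1:1 and 4:2:0 layouts, with 8 bits per sample, and the names below are chosen for the example.

```python
# Factor by which each chroma plane is reduced: (horizontal, vertical).
# Assumed set of formats; the slide only says there are four.
CHROMA_FACTORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:1:1": (4, 1),
    "4:2:0": (2, 2),
}


def bits_per_pixel(fmt, bits_per_sample=8):
    """Average bits per pixel: one luma sample plus two subsampled chroma samples."""
    fx, fy = CHROMA_FACTORS[fmt]
    return bits_per_sample * (1 + 2 / (fx * fy))


def samples_per_frame(width, height, fmt):
    """Total number of Y, Cb and Cr samples in one frame."""
    fx, fy = CHROMA_FACTORS[fmt]
    return width * height + 2 * (width // fx) * (height // fy)
```

Under these assumptions 4:2:0 and 4:1:1 both halve the raw data relative to 4:4:4; they differ only in whether the chroma reduction is split between the two directions or applied horizontally alone.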
Summary: Source Formats
Other Digital Video Coding Standards
H.26x for Video Telephony
Remote face-to-face communication: a dream for years
H.26x – Video coding targeted at low bit rate
– Through ISDN or regular analog telephone line ~ on the order of 64 kbps
– Need roughly symmetric complexity on encoder and decoder
H.261 (early 1990s)
– Similar to simplified MPEG-1 ~ block-based DCT/MC hybrid coder
– Integer-pel motion compensation with I/P frames only ~ no B frames
– Restricted picture size/fps format and M.V. range
H.263 (mid 1990s) and H.263+/H.263++ (late 1990s)
– Support half-pel motion compensation & many options for improvement
H.264 (latest 2001-): also known as H.26L / JVT / MPEG-4 Part 10
– Hybrid coding framework with many advanced techniques
– Focusing on greatly improving compression ratio at a cost of complexity
MPEG-2
Extends MPEG-1
Targets high-resolution, high-bit-rate applications
– Digital video broadcasting, HDTV, …
– Also used for DVD
Supports scalability
Supports interlaced video
– Frame pictures vs. field pictures
– New prediction modes for motion compensation related to interlaced video: use previously encoded fields to do M.E.-M.C.
Scalability in Video Codecs
Scalability: provide different quality in a single stream
– Stack up more bits on base layer to provide improved quality
Possible ways of achieving scalability
– SNR scalability ~ multiple-quality video services: basic vs. premium quality
– Spatial scalability ~ multiple-dimension displays: display on PDA vs. PC vs. super-resolution display
– Temporal scalability ~ multiple frame rates
– Frequency scalability ~ blurred version to sharp, detailed version
Layered coding concept facilitates:
– Unequal error protection
– Efficient use of resources
– Different needs from customers
– Multiple services
SNR Scalability
Two layers with same spatio-temporal resolution but different qualities
[Block diagram: video in, base-layer encoder, base-layer decoder, subtractor (+ -), enhancement-layer encoder, multiplexer; signals labeled base-layer bitstream, enhancement-layer bitstream, output bitstream]
From R.Liu Seminar Course @ UMCP
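The two-layer idea can be sketched with plain scalar quantization. This is a simplified illustration that quantizes sample values directly, whereas a real codec would operate on transform coefficients inside the hybrid coding loop; the step sizes and function names are assumptions made for the example.

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each value to an integer level."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Reconstruct approximate values from quantizer levels."""
    return [level * step for level in levels]


def encode_two_layers(samples, base_step, enh_step):
    """Coarsely quantize the input (base layer), then finely quantize
    what the base layer missed (enhancement layer)."""
    base_levels = quantize(samples, base_step)
    base_recon = dequantize(base_levels, base_step)
    residual = [s - r for s, r in zip(samples, base_recon)]
    enh_levels = quantize(residual, enh_step)
    return base_levels, enh_levels


def decode(base_levels, base_step, enh_levels=None, enh_step=None):
    """Decode the base layer alone, or base plus enhancement when available."""
    recon = dequantize(base_levels, base_step)
    if enh_levels is not None:
        refinement = dequantize(enh_levels, enh_step)
        recon = [r + e for r, e in zip(recon, refinement)]
    return recon
```

A receiver that only gets the base layer still produces a usable picture at lower quality; adding the enhancement layer shrinks the reconstruction error without changing resolution or frame rate.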
MPEG-4
Many functionalities targeting a variety of applications
Introduced object-based coding strategy
– For better support of interactive applications & graphics/animation video
– Requires encoder to perform object segmentation: difficult for general applications
Introduced error resilient coding techniques
– “Streaming video profile” for wireless multimedia applications
Part 10 has converged with H.264
– Focused on improving compression ratio and error resilience
– Sticks with hybrid coding framework
Object-based Coding in MPEG-4
Interactive functionalities
Higher compression efficiency by separately handling
– Moving objects
– Unchanged background
– New regions
– M.C.-failure regions
=> “Sprite” encoding
Object segmentation needed (not easy)
– Based on color, motion, edge, texture, etc.
Revised from R.Liu Seminar Course @ UMCP
Object-based Coding in MPEG-4 (cont’d)
From Wang’s book preprint Fig. 13.30
MPEG-7
“Multimedia Content Description Interface”
– Not a video coding/compression standard like previous MPEG standards
– Emphasis on how to describe the video content for efficient indexing, search, and retrieval
Standardize the description mechanism of content
– Descriptor, Description Scheme, and Description Definition Language
– See the Bonus Part in Design Project 2 for demos of MPEG-7 visual descriptors: color, texture, shape, …
Figure from MPEG-7 Document N4031 (March 2001)
Summary
Video content analysis
Video capturing and display
Different video coding standards
This week’s Lab session: video processing
Reading Assignment
Chapter 7 “Data Compression” (handout)
– Sec. 7.6 => H.261 & H.263
– Sec. 7.7.5 & 7.7.6 => MPEG-4 & MPEG-7
Tutorial on MPEG Video Coding (handout)
– IEEE Signal Processing Magazine, Sept. 1997
Yeo-Liu paper on DC-image and scene change detection
– Electronic version @ course web page
Chapter 9 “Content Analysis” (by Steinmetz et al.)
– Hard-copy handout; Sec. 9.1, 9.2, 9.3.2-9.3.3