1E8 Introduction to Electrical Engineering
Image and Video Processing
Or Electronics is not all about Circuits
Dr. Anil Kokaram
www.mee.tcd.ie/~ack -> [email protected]
You have been learning about Resistors, Inductors, Capacitors and now electric circuit design in 1E7 [do NOT miss those labs]
But electronics is more than circuit design
This course in a way shows you that Engineering is more about problem solving than one particular discipline
Content
Introduction to Image and Video Processing
Human Visual Perception
Cleaning Dirty Pictures (Motion Picture Restoration)
Image and Video Compression
Digital Compositing in the Movies
What we’re really doing …
How does DVD/DTV work?
What the hell is MPEG and JPEG exactly?
What is Digital Cinema? Who cares?
What is digital compositing and how do they use it in the movies?
What do people doing research actually do? And what have they done lately anyway? Who cares?
The Digital Image
The basic idea is to represent the continuous colours in a real image with a set of numbers from a fixed range at a subset of locations
The smallest element of a digital picture is called a Pixel (Picture Element)
Typically, a television image is created with 576 lines of 720 pixels each
A film frame is represented with over 2048 lines of 2880 pixels each
The Digital Image
[Figure: a 3 x 3 block of pixels and its values in the three colour planes (r, g, b)]
101 109 110
 99  90  94
112 123 108

123 131 141
121 112 118
134 145 132

 38  46  65
 75  66  86
 88  99 100
The Digital Image
[Figure: image with axes h and k and origin 0; the pixel at position [h,k] has value I([h,k])]
I([50,50])=[40 70 200]
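To make the notation concrete, here is a minimal Python sketch of a digital image as a grid of pixels, each holding three numbers. It assumes h indexes the first dimension and k the second, and that each pixel is an [r, g, b] triple in the range 0 to 255; the helper name `make_image` and the 100 x 100 size are illustrative choices, not something taken from the slides.

```python
def make_image(rows, cols, fill=(0, 0, 0)):
    """Return a rows x cols image; every pixel is its own [r, g, b] list."""
    return [[list(fill) for _ in range(cols)] for _ in range(rows)]

# A small all-black image (assumed size, for illustration only).
image = make_image(100, 100)

# Set the pixel from the slide: I([50,50]) = [40 70 200].
image[50][50] = [40, 70, 200]

# Read the three colour values back out of that pixel.
r, g, b = image[50][50]
print("I([50,50]) =", [r, g, b])
```

Reading or writing a pixel is just indexing into the grid, which is all the notation I([h,k]) is saying.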
The Rise of Digital Visual Media
Began in early 1960’s with NASA
Rise of Digital Image Processing (DIP) coincides with availability of good picture reproduction/printing. [Why do DIP if you can’t see the results?]
Last 5 years has seen exponential increase in digital visual devices: DV cameras, digital cameras
DTV for last 4 years. Free SKY in Ireland
Digital Video (Versatile) Disc allows movies to be played on a CD-like device
Internet streaming video, Real Networks, PacketVideo (video for mobile phones), PDAs
Complex systems
Devices and media require increasingly complex system design
Image and Video compression one of the key technologies that enabled consumer devices like digital cameras
Designer needs to understand compromises to be made in handling visual media
A LOT of electronic circuit design is about making hardware for Digital Video Processing in Mobile Phones these days
Motivation: Impact of Digital Media
TV Services merging with Internet?
DVD Replacing VHS. Digital Television works.
Watching TV on your PC
Games consoles = TV = PC = internet access = DVD player [Sony PlayStation II]
Cheap, high quality broadcasting
Equality. Small producers can reach the same audience as large conglomerates
So many “channels” but nothing on!!
Motivation: Impact of Digital Media
Digital TV broadcasters cannot find content
Mobile operators looking for more interesting content for their phone users e.g. through WAP or iMODE
Compelling content in demand
Content creators = producers of movies, editors of live events, character generators
Archives more important
Tools for Digital Movie making more important
Motivation: Automated Restoration
Archive material in demand
DVD re-release important
Pictures in bad shape
Need Automated restoration
A challenge
Compression is improved (Bonus!)
[Figure: examples of picture defects: Jitter, Ghosting, Dropout, Dirt and Sparkle]
DIY De-Blotching (To get you thinking …)
Detect then Interpolate
[Figure: frames n-1, n, n+1, showing the area occluded in the next frame and the area uncovered from the previous frame]
A simple example
Some dirty pictures
[Figure: frames n-1, n, n+1]
Simple Detection
Hmmm … not so good. Too many false alarms due to bad motion
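The slides do not spell out the simple detector, so the following Python sketch is only one plausible reading of "detect": flag a pixel in frame n as dirt when it differs a lot from the same location in both frame n-1 and frame n+1. The function name `detect_blotches`, the threshold of 30 and the tiny grey-level frames are all assumptions made for illustration. Because nothing here compensates for motion, a moving object also changes between frames and gets flagged, which is the false alarm problem noted above.

```python
def detect_blotches(prev_frame, cur_frame, next_frame, threshold=30):
    """Return a 0/1 mask: 1 where cur differs from BOTH neighbours by > threshold."""
    mask = []
    for p_row, c_row, n_row in zip(prev_frame, cur_frame, next_frame):
        mask.append([
            1 if abs(c - p) > threshold and abs(c - n) > threshold else 0
            for p, c, n in zip(p_row, c_row, n_row)
        ])
    return mask

# Grey-level frames n-1, n, n+1 (made-up values).
prev_frame = [[100, 100, 100], [100, 100, 100]]
cur_frame = [[100, 250, 100], [100, 100, 160]]   # 250 appears in frame n only
next_frame = [[100, 100, 100], [100, 100, 160]]  # 160 persists, so it is not dirt

mask = detect_blotches(prev_frame, cur_frame, next_frame)
for row in mask:
    print(row)
```

The pixel that changes once and stays changed is left alone; only the one-frame flash is marked.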
A Model for Blotches(How to make Dirty Pictures)
[Figure: Dirty picture = (Location of OK pixels) x (Original) + (Location of Corruption) x (Corruption Data)]
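The blotch model can be tried out directly. This is a minimal Python sketch, assuming grey-level pictures stored as nested lists and a 0/1 mask in which 1 marks a corrupted location, so that "location of OK pixels" is simply 1 minus the mask. The function name `corrupt` and the sample values are illustrative assumptions.

```python
def corrupt(original, blotch_mask, corruption):
    """Per pixel: dirty = (1 - b) * original + b * corruption, with b in {0, 1}."""
    return [
        [(1 - b) * o + b * c for o, b, c in zip(o_row, b_row, c_row)]
        for o_row, b_row, c_row in zip(original, blotch_mask, corruption)
    ]

original = [[100, 110, 120],
            [105, 115, 125],
            [110, 120, 130]]

# 1 = location of corruption, 0 = location of OK pixels (a chunky 2 x 2 blotch).
blotch_mask = [[0, 0, 0],
               [0, 1, 1],
               [0, 1, 1]]

# Corruption data: a flat bright patch.
corruption = [[255] * 3 for _ in range(3)]

dirty = corrupt(original, blotch_mask, corruption)
for row in dirty:
    print(row)
```

Where the mask is 0 the original survives untouched; where it is 1 the original is replaced outright, which is why the missing data has to be rebuilt from other frames.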
Better Detection
Use more knowledge: Blotches are flat and chunky
What about reconstructing the Picture?
[Figure: frames n-2, n-1, n, n+1, n+2]
Motion is key for Picture Building
[Figure: frames n-2, n-1, n, n+1, n+2]
But we don’t know the motion
[Figure: frames n-2, n-1, n, n+1, n+2]
Trick is to unify all sources of information
[Diagram labels: Motion, Picture, Blotches; Motion Smoothness, Picture Smoothness, Spatial Smoothness, Blotch Colour]
Needs probabilistic framework. Need to know about probability and statistics
Motivation: Automated Restoration
Removing blotches, lines, noise
Motivation: Robust Video Transmission
Video comms over difficult channels: Wireless, Internet
MPEG4 (Moving Picture Experts Group) partly addresses this
BUT encoder techniques not defined: Object Segmentation/Tracking still a major challenge
AND a few errors can lead to big problems. Error detection/correction/concealment v. important.
Motivation: Robust Video Transmission
[Figure, left to right: MP4 correctly rec’d; MP4 with errors]
Motivation: Robust Video Transmission
[Figure, left to right: MP4 with errors; corrected with video processing]
Motivation: Digital Special Fx (Rig Removal)
www.mee.tcd.ie/~sigmedia/postpro/postpro
See paper on www.mee.tcd.ie/~sigmedia
Tool being used in movies now.
Produced by www.thefoundry.co.uk
Motivation: Content Protection
Digital Media easy to copy
Need to protect rights of exploitation
Digital Watermarking allows an i.d. signature to be embedded invisibly into pictures thus connecting the actual data to an owner or legitimate exploitation chain
Seminal work done in EEE, TCD around 1996
Motivation: Digital Cinema
Digital Cinema possible because of digital projection (Texas Instruments)
Cheaper film <-> digital scanners
New Digital Cameras (Star Wars will use Panasonic HD Cameras)
Allows more control by distributors, better reproduction [no film developing accidents]
BUT one frame approx. 1920 x 3815 = 7MB!
Good Quality Compression needed badly
Motivation: Information management/Retrieval [DVD Example]
Somebody has to generate the DVD Index
Use Storyboard (sometimes not available)
But need FRAME accurate location (24 or 25 frames per second)
Editors do this all the time e.g. news, live events. But have to watch the whole event.
Painful to do by hand; possible but painful
Motivation: Information management/Retrieval
Indexing is part of a BIGGER problem
Want to access digital media in a way relevant to users
You are familiar with text searching on Internet e.g. AltaVista, Google sites
You want to look something up: just type in a keyword combination
Works pretty well if your keywords exist in the documents to be searched i.e. text is OK
Motivation: Information management/Retrieval
Want to be able to search digital media in the same way
But associating keywords to audio or video is hard in general
People want to access the same data for different reasons
In other words: people describe the same audio and video using different keywords
Picture archives like www.bridgeman.co.uk
Book publisher wants a picture from the Manet collection with predominantly red shades, and trees in the background. HUH?
Nobody thought of keywording pictures for predominant colour when they were first put into the database !!!!
So ask the 4 people in cataloguing if they can REMEMBER seeing a picture like that ??
Motivation: Information Retrieval Example
Motivation: Information Retrieval Possible
Can solve these problems by using Signal/Image Processing
Analyse the data itself AS THE NEW QUERY OCCURS
Predominant Colour : easy to automate: remember RGB pixels?
Spotting famous people in video: harder but possible
Biometrics ….
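As a rough illustration of why predominant colour is easy to automate, here is a minimal Python sketch. It takes one simple reading of "predominant": add up the r, g and b values over the whole picture and report the plane with the largest total. The function name `predominant_colour` and the tiny sample image are assumptions for illustration; a real archive search would likely use something more refined.

```python
def predominant_colour(image):
    """Return 'red', 'green' or 'blue': the plane with the largest total value."""
    totals = [0, 0, 0]
    for row in image:
        for pixel in row:
            for channel, value in enumerate(pixel):
                totals[channel] += value
    names = ("red", "green", "blue")
    return names[totals.index(max(totals))]

# A 2 x 2 picture of [r, g, b] pixels: three reddish pixels and one bluish one.
image = [
    [[200, 30, 40], [180, 60, 50]],
    [[210, 20, 30], [40, 70, 200]],
]

print(predominant_colour(image))
```

Nothing needs to be keyworded in advance: the query is answered by looking at the pixel data at the moment the question is asked.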
Motivation: Information Retrieval Automated Storyboarding? [DVD]
Want to automatically segment the movie into Semantically Meaningful Scenes/Chapters
Hard to make a signal processing algorithm which understands movies (don’t even try…)
The basic building block is the SHOT
Can detect each Shot automatically by detecting cuts
Motivation: Information Retrieval Automated Storyboarding? [DVD]
[Figure: Shot 1 | Cut 1 | Shot 2 | Cut 2 | Shot 3 | Cut 3 | Shot 4]
Motivation: Shot Cut DetectionYour first Image Processing Algorithm
How to detect cuts automatically?
Cuts = consecutive images that show drastic change between shots
Need to define cuts Mathematically
Remember Digital Images are composed of Pixels
Each pixel is associated with 3 numbers
Any ideas? (see next slide for reminder)
The Digital Image (a reminder)
[Figure: image with axes h and k and origin 0; the pixel at position [h,k] has value I([h,k])]
I([50,50])=[40 70 200]
A Video sequence is just a bunch of frames recorded at regular intervals in time.
[Figure: Frame 1, Frame 2, Frame 3]
Using position vector notation: $I(\mathbf{x})$
Motivation: Shot Cut DetectionYour first Image Processing Algorithm
$e_n = \sum_{\mathbf{x}} | I_n(\mathbf{x}) - I_{n-1}(\mathbf{x}) |$
[Plot: $e_n$ against Frame Number]
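The frame difference can be computed in a few lines. This is a minimal Python sketch, assuming grey-level frames stored as nested lists (one number per pixel rather than three, to keep it short) and a hand-picked threshold; the names `frame_difference` and `detect_cuts`, the threshold of 100 and the five tiny frames are illustrative assumptions.

```python
def frame_difference(cur, prev):
    """e_n: sum over all pixel positions of |I_n(x) - I_{n-1}(x)|."""
    return sum(
        abs(c - p)
        for c_row, p_row in zip(cur, prev)
        for c, p in zip(c_row, p_row)
    )

def detect_cuts(frames, threshold):
    """Return every frame index n whose difference from frame n-1 exceeds threshold."""
    return [
        n for n in range(1, len(frames))
        if frame_difference(frames[n], frames[n - 1]) > threshold
    ]

# Three frames of a dark shot followed by two frames of a bright shot.
frames = [
    [[10, 12], [11, 10]],
    [[11, 12], [10, 10]],
    [[10, 13], [11, 11]],
    [[200, 210], [205, 199]],
    [[201, 209], [204, 200]],
]

differences = [frame_difference(frames[n], frames[n - 1]) for n in range(1, len(frames))]
print("e_n:", differences)
print("cuts at frames:", detect_cuts(frames, threshold=100))
```

Within a shot the difference stays small; at the cut it jumps, so a threshold on e_n picks out the first frame of the new shot.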
Motivation: Shot Cut DetectionYour first Image Processing Algorithm
Frame Difference not so good
Histograms better (will show you what these are later)
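A minimal Python sketch of why histograms can do better, under stated assumptions: a histogram here is just a count of how many pixels fall into each band of grey levels, and two frames are compared by summing the absolute differences of their counts. The function names, the choice of 4 bins and the tiny frames are illustrative, not taken from the lecture.

```python
def histogram(frame, bins=4, levels=256):
    """Count how many pixels fall into each of `bins` equal bands of grey level."""
    counts = [0] * bins
    width = levels // bins
    for row in frame:
        for value in row:
            counts[min(value // width, bins - 1)] += 1
    return counts

def histogram_difference(cur, prev, bins=4):
    """Sum of absolute differences between the two frames' histogram counts."""
    return sum(abs(a - b) for a, b in zip(histogram(cur, bins), histogram(prev, bins)))

def pixel_difference(cur, prev):
    """Plain frame difference, for comparison."""
    return sum(abs(c - p) for c_row, p_row in zip(cur, prev) for c, p in zip(c_row, p_row))

frame_a = [[0, 0, 255], [0, 0, 0]]            # bright object on the right
frame_b = [[255, 0, 0], [0, 0, 0]]            # same shot, object has moved left
frame_c = [[255, 255, 255], [255, 255, 255]]  # a different, all-bright shot

print("moving object:", pixel_difference(frame_b, frame_a), histogram_difference(frame_b, frame_a))
print("real cut:     ", pixel_difference(frame_c, frame_a), histogram_difference(frame_c, frame_a))
```

Motion inside a shot moves pixels around without changing how many there are of each brightness, so the histogram difference stays at zero while the plain frame difference reacts strongly.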
Image Moments for Sports Parsing
Content Based Analysis for Video from Snooker Broadcasts, H. Denman, N. Rea and A. Kokaram, International Conference on Image and Video Retrieval (CIVR), July 2002, London, UK
Content Retrieval for Snooker at www.mee.tcd.ie/~sigmedia
Rea, Denman, Kokaram
Tracking with a particle filter
Using histograms for matching
Content Based Analysis for Video from Snooker Broadcasts, H. Denman, N. Rea and A. Kokaram, to be published in Computer Vision and Image Understanding Journal (CVIU): Special Issue on Video Retrieval and Summarization
Multimodal Fusion (?) www.mee.tcd.ie/~sigmedia
Dahyot, Rea, Denman, Delacourt, Kokaram
Joint Audio Visual Retrieval for Tennis Broadcasts, R. Dahyot, A. Kokaram, N. Rea and H. Denman, International Conference on Acoustics, Speech, and Signal Processing, 2003
Conferences/Journals/etc
ICASSP, ICIP (Research/Industrial)
NAB, IBC (Industrial)
www.citeseer.org
IEEE Trans IP, Ccts Sys Video Tech, PAMI, SP, SPL, Systems & Cybernetics [H/M Interfacing, Sensors etc]
EURASIP SP, Image Comms.
SIGGRAPH: Cool conference for movie special fx industry
www.howstuffworks.com
Overview
Image and Video Processing useful for more than just compression
Many interesting new areas driven by availability of new devices
Information Retrieval from Digital Media requires even more understanding of Image and Video analysis [MPEG7 on its way]
Let’s have a look at some background … now for Sampling then Perception
Sampling an audio signal
Original CD Audio 44.1 kHz
Downsampled by 4 = 11.02 kHz (no anti-aliasing)
Downsampled by 8 = 5.5 kHz (no anti-aliasing)
Downsampled by 8 = 5.5 kHz (with anti-aliasing)
Sampling frequency affects the position of the samples in time hence affects frequency content of signal (sort of)
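A minimal Python sketch of what goes wrong without anti-aliasing, under stated assumptions: the test signal is a single 4 kHz tone (an arbitrary choice), downsampling means keeping every 8th sample, and the anti-aliasing filter is a crude 8-sample moving average standing in for a proper low-pass filter. All function names are illustrative.

```python
import math

def tone(freq_hz, rate_hz, seconds):
    """Samples of a unit-amplitude sine at freq_hz, sampled at rate_hz."""
    count = int(rate_hz * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(count)]

def moving_average(samples, width):
    """Crude low-pass filter: the mean of each run of `width` consecutive samples."""
    return [sum(samples[i:i + width]) / width
            for i in range(len(samples) - width + 1)]

def downsample(samples, factor):
    """Keep every `factor`-th sample and throw the rest away."""
    return samples[::factor]

def apparent_frequency(freq_hz, rate_hz):
    """Frequency at which a freq_hz tone shows up after sampling at rate_hz."""
    return abs(freq_hz - rate_hz * round(freq_hz / rate_hz))

def rms(samples):
    """Root mean square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

cd_rate = 44100.0
factor = 8
new_rate = cd_rate / factor            # 5512.5 Hz: only tones below 2756.25 Hz fit
signal = tone(4000.0, cd_rate, 0.1)    # 4 kHz does not fit at the new rate

aliased = downsample(signal, factor)                           # no anti-aliasing
filtered = downsample(moving_average(signal, factor), factor)  # crude anti-aliasing

print("new rate:", new_rate, "Hz; the 4 kHz tone now appears at",
      apparent_frequency(4000.0, new_rate), "Hz")
print("level without filter:", round(rms(aliased), 3))
print("level with filter:   ", round(rms(filtered), 3))
```

Without the filter the 4 kHz tone survives at full strength but at the wrong frequency; filtering first turns it down before the samples are discarded, so less of the false tone ends up in the result.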
Quantisation of the samples to make a digital signal
Original CD Audio 16 bit 44.1 kHz (65536 levels)
8 bit Quantisation (256 levels)
4 bit Quantisation (16 levels)
2 bit Quantisation (4 levels)
Quantisation introduces NOISE into the digital signal because the accuracy of the digital samples as compared to the analogue signal is affected
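The link between bits, levels and noise can be checked numerically. This is a minimal Python sketch, assuming samples in the range -1 to 1 and a uniform quantiser that maps each sample to the centre of its level; the function names and the sine-wave test signal are illustrative assumptions.

```python
import math

def quantise(x, bits):
    """Map x in [-1, 1] to the centre of one of 2**bits equal-width levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((x + 1.0) / step), levels - 1)   # clamp x = 1.0 into the top level
    return -1.0 + (index + 0.5) * step

def noise_power(samples, bits):
    """Mean squared difference between the samples and their quantised versions."""
    return sum((s - quantise(s, bits)) ** 2 for s in samples) / len(samples)

# One cycle of a sine wave stands in for the analogue signal.
samples = [math.sin(2 * math.pi * i / 1000) for i in range(1000)]

for bits in (16, 8, 4, 2):
    print(bits, "bits:", 2 ** bits, "levels, noise power", noise_power(samples, bits))
```

Each bit removed halves the number of levels, doubles the step between them and so raises the quantisation noise.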
Vision : The Human Visual System (HVS)
Light is focussed onto the retina
Electrical Impulses from the retina are channelled by the optic nerve to the Visual Cortex
The Visual Cortex does a whole bunch of smart things including filtering, object recognition, edge detection.
In ‘primitive animals’ A LOT of processing happens just behind the retina. Frogs and Rabbits have TEMPLATES for spotting birds of prey.
Our motion sensitivity is better at the periphery of vision than at the centre. [Helps to avoid people sneaking up on you.]
[Figure: the eye, labelled Lens, Pupil, Retina, Optic Nerve, Blind Spot (hence the usual card trick ..)]
Intensity Sensitivity of HVS
Foreground 167 Foreground 140
Objects appear to have similar brightness
[Plot: Greyscale (-) and Sensitivity (--) against Column]
Spatial Freq. Response
Mach Banding
Meaning of Spatial Frequency
[Figure: viewing geometry with lengths h and 5h]
$\tan^{-1}(h/(5h)) = \tan^{-1}(1/5)$
768 pels = 11.3 degrees
Monitor Display
CCIR rec 500
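The numbers on this slide can be reproduced with a short calculation. This is a minimal Python sketch, assuming the geometry is a length h seen from a distance of 5h and that the 768 pels span that angle; the step from pels per degree to cycles per degree (one cycle needs one light and one dark pel) is an added illustration, not something stated on the slide.

```python
import math

viewing_distance_in_h = 5     # viewer is 5h away from a length h
pels_across_angle = 768       # from the slide: 768 pels = 11.3 degrees

# Angle subtended: inverse tan of h / (5h) = inverse tan of 1/5.
angle_deg = math.degrees(math.atan(1 / viewing_distance_in_h))

# How many pels fit into each degree of the visual field.
pels_per_degree = pels_across_angle / angle_deg

# Finest grating the display can show: one cycle = one light pel + one dark pel.
max_cycles_per_degree = pels_per_degree / 2

print("angle:", round(angle_deg, 1), "degrees")
print("pels per degree:", round(pels_per_degree, 1))
print("highest spatial frequency:", round(max_cycles_per_degree, 1), "cycles per degree")
```

Spatial frequency measured this way depends on how far away you sit: move back and the same pels squeeze into fewer degrees.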
Frequency Sensitivity
Grating increases in freq. Left to Right
Intensity decreases vertically
Sensitivity given by j.n.d. (just noticeable difference) junction
Colour Spaces
Matlab demo color_wheels and why_hsv
Vision: Your colour sensitivity isn’t great cf intensity
Rods are active at low light levels
Cones allow you to see colour
There are 100 Million Rods and 7 Million Cones in your retina
Hence you SAMPLE luminance space A LOT more finely (at a higher frequency) than COLOUR SPACE
The cells are arranged in a hexagonal pattern .. Hence some suspect that hexagonal arrangements of light sensitive ccts in cameras is a good idea. Better than rectangular anyway.
Fuji cameras claim to use hex grids, while others use normal grids at higher densities of elements
[Figure: Retina, with Rod and Cone labelled]
Original at full colour resolution
Consequences of Colour Sensitivity
512 x 512 x 3 = 0.64 MB
Original Image
Subsampling Colour Planes
2:1 in both directions
[Figure: Keep | Discard]
4:1 Colour Downsampling: OK
512 x 512 + 256 x 256 x 2 = 0.31 MB (1/2 bandwidth of original)
16:1 Colour Downsampling: Still OK
512 x 512 + 128 x 128 x 2 = 0.24 MB (1/3 bandwidth of original)
16:1 Luminance Downsampling: Not good
128 x 128 x 3 = 0.04 MB (1/16 bandwidth of original)
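The savings from subsampling can be worked out by counting samples. This is a minimal Python sketch, assuming one full-size luminance plane plus two colour planes, square planes, and subsampling done by simply keeping one sample in every few along rows and columns; the function names are illustrative. At one byte per sample 512 x 512 x 3 comes to 786,432 bytes, so the MB figures on the slides presumably rest on a different convention; the ratios are what matter here.

```python
def subsample(plane, factor):
    """Keep one sample in every `factor` along both rows and columns."""
    return [row[::factor] for row in plane[::factor]]

def sample_count(luma_side, chroma_side):
    """Samples in one luminance plane plus two colour planes (all square)."""
    return luma_side ** 2 + 2 * chroma_side ** 2

full = sample_count(512, 512)            # 512 x 512 x 3
colour_4_to_1 = sample_count(512, 256)   # colour planes halved in both directions
colour_16_to_1 = sample_count(512, 128)  # colour planes quartered in both directions
all_16_to_1 = sample_count(128, 128)     # luminance quartered as well: 128 x 128 x 3

for name, count in [("original", full),
                    ("4:1 colour", colour_4_to_1),
                    ("16:1 colour", colour_16_to_1),
                    ("16:1 luminance and colour", all_16_to_1)]:
    print(name, count, "samples,", count / full, "of original")

# Keep/discard on a tiny 4 x 4 plane: 2:1 in both directions leaves 2 x 2.
plane = [[r * 4 + c for c in range(4)] for r in range(4)]
print(subsample(plane, 2))
```

Halving the colour planes in both directions already halves the total, and because colour is sampled coarsely by the eye anyway the picture still looks fine; doing the same to luminance is what hurts.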
A Noisy Picture
Activity Masking and Weber’s Law
Noise hard to see in Textured areas, easy in flat areas
Noise harder to see in Bright Areas than dark Areas
[Figure, left to right: I(h,k); I(h,k)+e(h,k)]
Measuring Picture Quality
Matlab demo snr_mse
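The Matlab demo itself is not reproduced in the slides, so the following Python sketch only shows the two measures its name suggests, using common textbook definitions: mean squared error (MSE) between a clean picture and a degraded one, and signal to noise ratio (SNR) in decibels as clean signal power over that error. The function names and the tiny grey-level pictures are assumptions for illustration.

```python
import math

def mse(reference, test):
    """Mean squared error between two equal-sized grey-level pictures."""
    total = 0
    count = 0
    for r_row, t_row in zip(reference, test):
        for r, t in zip(r_row, t_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def snr_db(reference, test):
    """10 * log10(mean squared reference value / MSE); infinite if identical."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    power = sum(r * r for row in reference for r in row) / sum(len(row) for row in reference)
    return 10 * math.log10(power / error)

clean = [[100, 100], [100, 100]]
noisy = [[101, 99], [100, 102]]   # clean picture plus a small error e(h,k)

print("MSE:", mse(clean, noisy))
print("SNR:", round(snr_db(clean, noisy), 2), "dB")
```

Both numbers treat every pixel error alike, which is why they do not always agree with what the eye sees: the same error is far more visible in a flat area than in a textured one.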