Post on 13-Feb-2021
January 22, 2014 Sam Siewert
Computer and Machine Vision
Introduction to Continuous Camera Capture, Sampling, Encoding, Decoding and Transport
Overview
Video Camera Fundamentals
Introduction to Codecs (Encode/Decode for Digital Media)
Digital Sampling
– Color Models
– Color: Luminance/Chrominance
Digital Video Encoding
– Pixel (POINT)
– Frame (XY Pixel Map)
– Sequence of Frames (Group of Pictures)
Digital Media Systems – Encoding, Transport, Decoding
MPEG Basics and Standards
CCD Frame Grabbers
Bt878 Micro-programmable NTSC Capture
Luminance and Chrominance ADC Channels
YPrPb (YCrCb) – We Will See how Y is Gray, and PrPb are Red and Blue Bands that are Sub-sampled
CCD Collects Charge Over Time Through Exposure to Photons (Light field) – Integration Time
Charge is Read Out Through ADC Channels
Analog NTSC Signal for Camera
As Seen on Agilent MSO
– Composite Luminance and Chroma Signals at 6 MHz
– Interlaced ODD/EVEN line raster
Rev-A Computational Photometer Board
Dual TI NTSC Decoders and FTDI Uplink
– Plugs Into DE0, DE2i and DE4 Terasic Boards GPIO Header(s)
– Can Uplink to DE2i or any Embedded Linux Board Via USB 2.0
– Reference Stratix State Machine, FIFO, and Transforms
Example NTSC Digital Output
As Seen on Texas Instruments Decoder Digital Output for YCrCb by Intronix Logic Analyzer
– Clock, Data, Control at Logic Levels
– 6 MHz NTSC decode, Sampled at 10 MHz or Higher
Real-Time Video Codecs and Tools
Codec = Compression, Decompression (Encode/Decode)
Basic Inputs to Encoder (Original Uncompressed Data)
– POINT or Pixel: RGB and YCrCb Color Encoding
– MACRO BLOCK: Portion of a frame (e.g. 8x8 pixels)
– FRAME: XY Spatial Map of Pixels (Aspect Ratio, Resolution)
– SEQUENCE: Group of Pictures (GoP)
I-Frame Single Frame Compression (Start)
Forward/Backward Difference Images
Uncompressed Images – Portable Pixmap (PPM)
Leverage HW and SW MPEG Encoder/Decoder
– VLC Media Player
– ffmpeg - http://www.ffmpeg.org/
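The uncompressed PPM format mentioned above is simple enough to write by hand, which makes it a handy container for raw frames. The following is a minimal sketch, assuming the binary "P6" variant with 8 bits per band; the function names `ppm_bytes` and `write_ppm` are hypothetical and not part of the course material.

```python
def ppm_bytes(width, height, pixels):
    """Serialize RGB pixels as a binary P6 PPM image.

    pixels is a row-major list of (r, g, b) tuples, each band in [0, 255].
    """
    if len(pixels) != width * height:
        raise ValueError("pixel count must equal width * height")
    # P6 header: magic number, dimensions, maximum band value.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    # Body: raw R, G, B bytes for each pixel, no compression.
    body = bytes(band for pixel in pixels for band in pixel)
    return header + body


def write_ppm(path, width, height, pixels):
    """Write the serialized image to a file (path is caller-supplied)."""
    with open(path, "wb") as f:
        f.write(ppm_bytes(width, height, pixels))
```

A frame written this way should open in common viewers, and tools such as ffmpeg can typically read sequences of PPM files as encoder input.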
Color Considerations
1. Physics – Visible Spectrum
2. Physiology – What the Human Eye is Sensitive to (Rods and Cones) and what the Brain Perceives
3. Psychology – What Looks Good
4. Technology – Palette and Methods of Display and Projection of Light
POINT or PIXEL: RGB Color Model
RGB, 24-bit, 8 bits [0-255] for each color band (x, y, z) sampled
Each Pixel is a 3-D Vector in RGB Space, Opponent Colors
[Figure: RGB color cube with vertices Black, Red, Green, Blue, Yellow, Cyan, Magenta, and White]
Discussion – What Does Eye See?
Color Models (E.g. CIE, ITU-R Rec. 709)
– RGB Cube
– HSV - Hue/Saturation/Value
Hue – Similarity to R, G, Y, B
Saturation – Color vs. Brightness
Value – Low=Black, High=Color
– Luminance (Candela/Square-Meter)
Light Passing Through Area Forming a Solid Angle in a Direction
Candela (Luminous Intensity) = Lumens/Steradian, i.e. Watts/Steradian Weighted by the Eye's Sensitivity
More Precise than “Brightness”
– Chrominance (“CrCb” or “UV” in YCrCb or YUV)
U = Blue minus Luminance (B - Y)
V = Red minus Luminance (R - Y)
– Wavelength Spectrum - ROYGBIV
[Figures: HSV Cylinder/Cone; RGB Cube]
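To make the relationship between the RGB cube and the HSV cylinder concrete, here is a small sketch that converts an 8-bit RGB pixel to hue, saturation, and value using Python's standard `colorsys` module. The wrapper name `rgb8_to_hsv` and the choice to report hue in degrees are assumptions made for illustration.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value).

    colorsys works on floats in [0, 1], so the bands are scaled first.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


if __name__ == "__main__":
    for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)),
                      ("white", (255, 255, 255)), ("black", (0, 0, 0))]:
        print(name, rgb8_to_hsv(*rgb))
```

Pure colors sit on the rim of the cylinder (saturation 1), grays lie on its central axis (saturation 0), and black is the low end of the value axis.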
YUV
ITU-R BT.601 Component Video Standard
Y is Luma
UV is Color information designed so that BW TV (CRT) will still display grayscale image
Used by NTSC, PAL, SECAM (Standard Definition TV)
RGB to YUV Conversion
– Y = (0.299 * R) + (0.587 * G) + (0.114 * B)
– U = -(0.147 * R) - (0.289 * G) + (0.436 * B)
– V = (0.615 * R) - (0.515 * G) - (0.100 * B)
YUV to RGB Conversion
– R = Y + (1.140 * V)
– G = Y - (0.394 * U) - (0.581 * V)
– B = Y + (2.032 * U)
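The two conversions above translate directly into code. This is a minimal sketch using the slide's coefficients on floating-point values; the function names are hypothetical. Because the coefficients are rounded to three decimals, a round trip is only approximately the identity.

```python
def rgb_to_yuv(r, g, b):
    """RGB to YUV using the coefficients listed above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v


def yuv_to_rgb(y, u, v):
    """YUV back to RGB using the coefficients listed above."""
    r = y + 1.140 * v
    g = y - 0.394 * u - 0.581 * v
    b = y + 2.032 * u
    return r, g, b


if __name__ == "__main__":
    # A neutral gray carries no color information: U and V are near zero.
    print(rgb_to_yuv(128, 128, 128))
    print(yuv_to_rgb(*rgb_to_yuv(255, 0, 0)))
```

The gray case illustrates why a black-and-white receiver can display Y alone: for any R = G = B input the U and V terms cancel.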
POINT: R, G, or B band only vs. Balance
[Figure: the same image shown as R band only, G band only, B band only, and Balanced]
Y = 0.3R + 0.59G + 0.11B
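As a sketch of the difference between a single-band image and the balanced conversion, the following applies Y = 0.3R + 0.59G + 0.11B to one pixel and compares it with simply picking one band. The function names and the round-half-up choice are assumptions for illustration.

```python
def balanced_gray(r, g, b):
    """Weighted grayscale value using Y = 0.3R + 0.59G + 0.11B."""
    y = 0.3 * r + 0.59 * g + 0.11 * b
    # Round to the nearest integer and keep the result within 8 bits.
    return min(255, int(y + 0.5))


def single_band_gray(pixel, band):
    """Grayscale value taken from one band only (0 = R, 1 = G, 2 = B)."""
    return pixel[band]


if __name__ == "__main__":
    pixel = (0, 255, 0)  # pure green
    print("balanced:", balanced_gray(*pixel))
    print("R only:", single_band_gray(pixel, 0))
    print("G only:", single_band_gray(pixel, 1))
```

A pure green pixel is invisible in the R-only and B-only images and saturated in the G-only image, while the balanced value lands in between, weighted by how strongly the eye responds to green.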
POINT: YCrCb RGB
An Alternative to RGB is YUV, Where Y is Luminance and CrCb is Chrominance
The following 2 sets of formulae are taken from information from Keith Jack's excellent book "Video Demystified" (ISBN 1-878707-09-4).
RGB to YCrCb Conversion (For Computers with RGB [0-255])
– Y = (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
– Cr = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
– Cb = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
YCrCb to RGB Conversion
– R = 1.164(Y - 16) + 1.596(Cr - 128)
– G = 1.164(Y - 16) - 0.813(Cr - 128) - 0.392(Cb - 128)
– B = 1.164(Y - 16) + 2.017(Cb - 128)
In both these cases, you have to clamp the output values to keep them in the [0-255] range.
http://www.video-demystified.com/
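The conversions and the clamping requirement above can be sketched as follows. The coefficients are the ones listed on the slide; the helper name `clamp8` and the round-to-nearest step are assumptions, since the slide does not say how fractional results are quantized.

```python
def clamp8(x):
    """Round to the nearest integer and clamp into the 8-bit range [0, 255]."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycrcb(r, g, b):
    """8-bit RGB to 8-bit (Y, Cr, Cb) using the coefficients listed above."""
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    cr = 0.439 * r - 0.368 * g - 0.071 * b + 128
    cb = -0.148 * r - 0.291 * g + 0.439 * b + 128
    return clamp8(y), clamp8(cr), clamp8(cb)


def ycrcb_to_rgb(y, cr, cb):
    """8-bit (Y, Cr, Cb) back to 8-bit RGB, clamping out-of-range results."""
    r = 1.164 * (y - 16) + 1.596 * (cr - 128)
    g = 1.164 * (y - 16) - 0.813 * (cr - 128) - 0.392 * (cb - 128)
    b = 1.164 * (y - 16) + 2.017 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


if __name__ == "__main__":
    print(rgb_to_ycrcb(255, 255, 255))  # white
    print(ycrcb_to_rgb(255, 128, 128))  # Y above nominal white needs clamping
```

Note that Y occupies only part of the 8-bit range for valid RGB input (black maps to 16), which is why inputs outside that range, such as Y = 255 or Y = 0, overshoot and must be clamped on the way back to RGB.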
POINT: YCrCb 4:4:4 24-bit Format
For every Y sample in a scan-line, there is also one CrCb sample
– Each Y (Y7:Y0), Cr (Cr7:Cr0), & Cb (Cb7:Cb0) Sample is 8 bits
– No compression between RGB and YCrCb 4:4:4
[Figure: pixel grid numbered 0 to 319 on the first scan-line through 76,480 to 76,799 on the last; legend: Y, Cr, and Cb sample vs. Y sample only]
Two 4:4:4 pixels (48 bits) become 32 bits in 4:2:2
POINT: YCrCb 4:2:2 16-bit Format
For every 2 Y samples in a scan-line, there is one CrCb sample
[Figure: pixel grid numbered 0 to 319 on the first scan-line through 76,480 to 76,799 on the last; legend: Y, Cr, and Cb sample vs. Y sample only]
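Pixel-0 = Y7:Y0 (Y sample 0), Cb7:Cb0 (Cb sample 0); Pixel-1 = Y7:Y0 (Y sample 1), Cr7:Cr0 (Cr sample 0)
Pixel-2 = Y7:Y0 (Y sample 2), Cb7:Cb0 (Cb sample 1); Pixel-3 = Y7:Y0 (Y sample 3), Cr7:Cr0 (Cr sample 1)
Pixel-4 = Y7:Y0 (Y sample 4), Cb7:Cb0 (Cb sample 2); Pixel-5 = Y7:Y0 (Y sample 5), Cr7:Cr0 (Cr sample 2)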
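The pixel layout above (Y with Cb on even pixels, Y with Cr on odd pixels) can be sketched as a packing routine from 4:4:4 samples. This is an illustration only: the function name `pack_422` is hypothetical, and taking the chroma from the even pixel of each pair (rather than averaging the pair) is an assumption, not something the slide specifies.

```python
def pack_422(pixels):
    """Pack 4:4:4 (y, cr, cb) samples into 4:2:2 bytes: Y0 Cb0 Y1 Cr0 ...

    Each pair of pixels shares one Cb and one Cr sample, so two pixels
    occupy 4 bytes (16 bits per pixel) instead of 6 bytes.
    """
    if len(pixels) % 2:
        raise ValueError("4:2:2 packing needs an even number of pixels")
    out = bytearray()
    for i in range(0, len(pixels), 2):
        y0, cr0, cb0 = pixels[i]      # chroma is kept from the even pixel
        y1, _, _ = pixels[i + 1]      # odd pixel keeps only its Y sample
        out += bytes((y0, cb0, y1, cr0))
    return bytes(out)


if __name__ == "__main__":
    frame = [(16, 128, 128)] * (320 * 240)
    print(len(frame) * 3, "bytes as 4:4:4")
    print(len(pack_422(frame)), "bytes as 4:2:2")
```

For a 320x240 frame (pixels 0 through 76,799) the packed size is two thirds of the 4:4:4 size, which is the saving that chroma sub-sampling buys before any further compression.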
FRAME: XY Pixel Maps
FRAME Resolution
Computer Graphics Resolutions
– VGA = 640x480
– SVGA = 800x600
NTSC = Standard Definition, 720x480
High Definition (Progressive full frame or Interlaced)
– 720i/p = 1280x720
– 1080i/p = 1920x1080
FRAME Aspect Ratios
X to Y Ratio (divisor used to reduce the ratio in parentheses)
– NTSC = 3:2, 720x480 (240)
– HD 720 = 16:9, 1280x720 (80)
– HD 1080 = 16:9, 1920x1080 (120)
– 2K = approximately 17:9, 2048x1080 (120)
FRAME Rates
– 60i, NTSC = 59.94 odd/even fields per second, or 29.97 frames per second
– 24p, Cinema = 24 fps (motion blur)
– 60p, HDTV Progressive Modes
http://en.wikipedia.org/wiki/Display_resolution
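The aspect ratios listed above follow from dividing both dimensions by a common divisor. A minimal sketch using the greatest common divisor is shown below; the function name `aspect_ratio` is hypothetical.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a resolution to its exact X:Y ratio in lowest terms."""
    d = gcd(width, height)
    return width // d, height // d


if __name__ == "__main__":
    for w, h in [(720, 480), (1280, 720), (1920, 1080), (2048, 1080)]:
        print(f"{w}x{h} ->", aspect_ratio(w, h))
```

The first three resolutions reduce exactly to 3:2 and 16:9. The 2K case reduces to 256:135, which is close to but not exactly 17:9, so that entry is best read as a nominal ratio.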
Display and Camera Resolutions
Red Epic 645 9K: 9334x7000, Red Epic 617 28K: 28000x9334
SEQUENCE: Series of Frames in Encoded Group of Pictures
I-Frame – Initial (Intra-coded) Frame in GoP, Compression Within Frame Only
P-Frame – Predicted Frame
B-Frame – Bi-Directional Interpolated Frame (Differences Between Last I-Frame or P-Frame and Next P-Frame or I-Frame)
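To visualize how the three frame types are arranged, the sketch below generates the display-order frame types for one Group of Pictures. The parameters `n` (frames per GoP) and `m` (spacing between I/P anchor frames) follow a common way of describing MPEG GoP structure, but they are assumptions here, as is the function name `gop_pattern`.

```python
def gop_pattern(n, m):
    """Return display-order frame types for one GoP as a string.

    n: number of frames in the GoP (the first is always an I-frame)
    m: distance between anchor frames (I or P); frames in between are B
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")   # intra-coded, starts the GoP
        elif i % m == 0:
            types.append("P")   # predicted from the previous anchor
        else:
            types.append("B")   # interpolated between surrounding anchors
    return "".join(types)


if __name__ == "__main__":
    print(gop_pattern(12, 3))
    print(gop_pattern(6, 1))
```

With m = 1 there are no B-frames at all, which trades compression efficiency for lower latency because no frame has to wait for a future anchor.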
Uncompressed Streams
Used Only in Digital Cinema and Computer / Machine Vision
– True Color (RGB, 8-bits Each), YCrCb (16-bits Each Pixel)
Uncompressed Lossless or Lossy I-Frame Only
– MJPEG (http://en.wikipedia.org/wiki/MJPEG), JPEG 2000 (http://en.wikipedia.org/wiki/JPEG_2000)
– Not Practical for Digital Video Transport
Standard Definition 720x480 RGB @ 30fps Requires 30MB/sec
High Definition 720p (1280x720x3 = 2700KB/frame) @ 30fps = 80MB/sec
High Definition 1080p (1920x1080x3 = 6075KB/frame) @ 30fps = 178MB/sec
DCI 2K, 2048x1080x3 (6480KB/frame) @ 24fps = 152MB/sec
DCI 4K, 4096x2160x3 (25920KB/frame) @ 24fps = 607.5MB/sec
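The data rates above can be reproduced with a few lines of arithmetic. This sketch assumes 3 bytes per RGB pixel and binary units (1 KB = 1024 bytes, 1 MB = 1024 KB), which is what the per-frame figures on the slide imply; the function name `stream_rate` is hypothetical.

```python
def stream_rate(width, height, bytes_per_pixel, fps):
    """Return (KB per frame, MB per second) for an uncompressed stream."""
    frame_bytes = width * height * bytes_per_pixel
    kb_per_frame = frame_bytes / 1024
    mb_per_sec = frame_bytes * fps / (1024 * 1024)
    return kb_per_frame, mb_per_sec


if __name__ == "__main__":
    formats = [
        ("SD 720x480 @ 30fps", 720, 480, 30),
        ("HD 720p @ 30fps", 1280, 720, 30),
        ("HD 1080p @ 30fps", 1920, 1080, 30),
        ("DCI 2K @ 24fps", 2048, 1080, 24),
        ("DCI 4K @ 24fps", 4096, 2160, 24),
    ]
    for name, w, h, fps in formats:
        kb, mb = stream_rate(w, h, 3, fps)
        print(f"{name}: {kb:.1f} KB/frame, {mb:.1f} MB/sec")
```

Calling the same function with 2 bytes per pixel gives the corresponding rates for 16-bit YCrCb 4:2:2, two thirds of the RGB figures.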
MPEG2 Encode/Decode Introduction
Group of Pictures: High Level View
MPEG-2: Order Of Operators
POINT (Pixel) Encoding
Macro-Block Lossy Intra-Frame Compression
Motion-Based Compression in Group of Pictures