Unit II


Transcript of Unit II

UNIT II: Image Processing

Basic image fundamentals, image data types, image file formats (GIF, BMP, TIFF, JPEG, PCX, etc.), image acquisition, storage, processing, communication and display; image enhancement: enhancement by point processing, spatial filtering, color image processing. Image compression: types of compression: lossy and lossless, symmetrical and asymmetrical, intraframe and interframe, hybrid. Lossless: RLE, Shannon–Fano algorithm, arithmetic coding. Lossy: vector quantization, fractal compression technique, transform coding, psychoanalysis, interframe correlation. Hybrid: JPEG-DCT.

Basic Image Fundamentals

Pictures are of two types:

- images (black & white, gray shades, color)
- graphics

Color is a sensation that light of different frequencies produces in our eyes: the higher frequencies produce the blue end and the lower frequencies the red end of the visible spectrum. To recognize and communicate color information we need color models, e.g., the RGB model (monitors) and the CMYK model (print on paper).

Basic Image Fundamentals ...

Image Processing involves 3 stages :

1. input (scanner, digital camera)

2. editing (selecting, copying, scaling, rotating, trimming, changing the contrast, brightness, color tones, etc.)

3. output (Compression, file formats, resolution and bit depth)

Image Types

1. Hard Copy vs Soft Copy – a hard copy exists on a physical surface such as plastic, cloth or wood; a soft copy exists in electronic form.

2. Continuous Tone, Half Tone and Bitone – photographs are continuous-tone images; newspaper / magazine photographs are half-tone images; black & white images are bitone.

Image Data Types

Some file formats used in Macromedia Director:

File Import :-

Image – BMP, GIF, JPEG etc

Palette – PAL, ACT

Sound – AIFF, AU, MP3, WAV

Video – AVI, MOV

Animation – DIR, FLA, GIF, PPT

File Export :-

Image – BMP

Video – AVI, MOV

Native – DIR, DXR, EXE

Image Data Types (Ref: Fundamentals of Multimedia by Ze-Nian Li and Mark S. Drew)

1-bit images – an image consists of pixels, or pels. A 1-bit image contains only on and off bits and is also referred to as a binary image, or a 1-bit monochrome image, since it contains no colour.

8-bit gray-level images – the entire image can be thought of as a two-dimensional array of pixel values (stored in hardware called a frame buffer), i.e., a bitmap. Image resolution refers to the number of pixels in a digital image.

Dithering

Full-colour photographs may contain an almost infinite range of colour values. Dithering is the most common means of reducing the colour range of images down to the 256 (or fewer) colours seen in 8-bit GIF images.

Dithering is the process of juxtaposing pixels of two colours to create the illusion that a third colour is present. A simple example is an image with only black and white in the colour palette. By combining black and white pixels in complex patterns a graphics program like Adobe Photoshop can create the illusion of grey values:

Black & white image gives grey effect

Colour dithering
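To make the juxtaposition idea concrete, here is a minimal sketch of ordered (Bayer) dithering in Python/NumPy, assuming a grayscale array with values in [0, 1]; the 2×2 threshold matrix and the function name are illustrative choices, and real tools such as Photoshop use more elaborate patterns or error diffusion.

```python
import numpy as np

# 2x2 Bayer threshold matrix, normalized to [0, 1).
BAYER_2X2 = np.array([[0, 2],
                      [3, 1]]) / 4.0

def ordered_dither(gray):
    """Dither a grayscale image (values in [0, 1]) to pure black and white.

    Each pixel is compared against a repeating 2x2 threshold pattern, so
    mid-grey regions become a checkerboard of black and white pixels that
    the eye blends into an apparent grey tone.
    """
    h, w = gray.shape
    # Tile the threshold matrix over the whole image.
    thresholds = np.tile(BAYER_2X2, (h // 2 + 1, w // 2 + 1))[:h, :w]
    return (gray > thresholds).astype(np.uint8)  # 1 = white, 0 = black

if __name__ == "__main__":
    # A horizontal grey ramp: dithering turns it into black/white patterns.
    ramp = np.tile(np.linspace(0.0, 1.0, 16), (8, 1))
    print(ordered_dither(ramp))
```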

Image Data Types

The two most common data types for graphics and image file formats:

24-bit color – each pixel (picture element in a digital image) is represented by 3 bytes for RGB, i.e., it supports 256 × 256 × 256 possible combined colors. Many 24-bit images are actually stored as 32-bit images, with the extra byte per pixel storing an alpha value that represents special-effect information. The alpha channel is used for compositing several overlapping objects.

Image Data Types

8-bit color – used to save space by quantizing the color information to collapse it. A Look-Up Table (LUT) stores the color information, and a data structure called a color histogram stores the number of occurrences of each particular color.

CLUTs (palettes) – if a pixel stores the value 25, the display looks up row 25 of the CLUT to obtain the actual color. Images are usually stored in row-column order as simply a long series of values.
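A minimal sketch of how an 8-bit indexed image is expanded through a CLUT, assuming NumPy arrays; the palette contents here are made up purely for illustration.

```python
import numpy as np

def apply_clut(indexed, clut):
    """Expand an 8-bit indexed image to 24-bit RGB via a colour look-up table.

    `indexed` is an (H, W) array of palette indices (0..255);
    `clut` is a (256, 3) array of RGB entries. A pixel storing 25 simply
    selects row 25 of the table.
    """
    return clut[indexed]          # NumPy fancy indexing performs the lookup

if __name__ == "__main__":
    clut = np.zeros((256, 3), dtype=np.uint8)
    clut[25] = [255, 0, 0]        # hypothetical palette entry: index 25 = red
    indexed = np.array([[25, 0],
                        [0, 25]], dtype=np.uint8)
    print(apply_clut(indexed, clut))   # a (2, 2, 3) RGB image
```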

Note : Formats can be platform independent or platform dependent.

Image File Formats

Each file format is characterized by a specific compression type and colour depth. The choice of file format depends on:

- the final image quality required, and
- the import capabilities of the authoring system.

Popular file formats are:

- BMP (Bitmap)
- JPEG (Joint Photographic Experts Group)
- GIF (Graphics Interchange Format)
- TIFF (Tagged Image File Format)
- PNG (Portable Network Graphics)
- PICT (Picture)
- TGA (Targa)
- PSD (Photoshop Document)

BMP

It is the standard Windows image format on DOS- and Windows-compatible computers.

It supports RGB, Indexed Color, Greyscale and Bitmap color modes and does not support alpha channels.

JPEG

It is used to display photographs and other continuous-tone images in HTML documents over the World Wide Web.

It supports CMYK, RGB and Greyscale color modes and does not support alpha channels.

GIF

It is used to display indexed-color graphics and images in HTML documents over the World Wide Web.

It preserves transparency in indexed color images. It uses 8 bit color and efficiently compresses solid areas of color while preserving sharp detail.

It can represent at most 256 colors; images of higher color depth are mapped down to 256 colors through a Color Look-Up Table (CLUT).

TIFF

It is used to exchange files between applications and computer platforms. It is a flexible bitmap image format supported by virtually all paint, image-editing, and page layout applications.

It supports pixel depths of up to 48 bits (16 bits for each of R, G and B) and can store images in a number of different color models, including CMYK, RGB, indexed color and greyscale.

It uses a lossless compression method and hence is an appropriate format for printing.

PNG, PICT, TGA, PSD

PNG supports 24 bit images and produces background transparency without jagged edges.

PICT format is especially effective at compressing images with large areas of solid color.

TGA format supports 24 bit RGB images (8 bits x 3 color channels) and 32 bit RGB images (8 bits x 3 color + 8 bit alpha channel).

PSD format is used by the Adobe Photoshop package and is the only format supporting all available image modes, guides, alpha channels, spot channels and layers.

Image Acquisition

Image Input / Acquisition is the first step of image processing.

It deals with conversion of analog images into digital form, mainly done with 2 devices:

Scanner – convert a printed image or document into the digital form.

Digital Camera – digitizes real world images, similar to how a conventional camera works.

Scanner

The scan head contains a source of white light. The light reflected by the paper image is directed onto a grid of electronic sensors by an arrangement of mirrors and lenses. The electronic sensors are called Charge Coupled Devices (CCDs) and are basically converters of light energy into voltage pulses.

After a complete scan, the image has been converted from a continuous entity into a discrete form represented by a series of voltage pulses. This process is called sampling.

Scanner ...

Scanner Types :

1. Flatbed scanner – head with a source of white light, mirrors.

2. Drum scanners – cylindrical drum, photomultiplier tube (PMT).

3. Bar-code scanners – read bar codes, a machine-readable representation of information in a visual format.

Color Scanning

Image Acquisition (Ref: Digital Image Processing by Gonzalez and Woods)

Elements of Visual Perception

1. Structure of the human eye – cornea, sclera, choroid and retina

2. Image formation in the eye – the radius of curvature of the anterior surface of the lens is greater than the radius of its posterior surface. The distance between the center of the lens and the retina, called the focal length, is variable. When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power.

3. Brightness adaptation and discrimination

Illusions (figures of visual illusions shown on the slides)

Image Sensing and Acquisition

The types of images in which we are interested are generated by the combination of an illumination source and the reflection or absorption of energy from that source by the elements of the scene being imaged.

Transforming illumination energy into digital images :

1. Incoming energy is transformed into a voltage by the combination of input electrical power and sensor material that is responsive to the particular type of energy being detected.

2. The output voltage waveform is the response of the sensor, and a digital quantity is obtained from each sensor by digitizing its response.

Image Acquisition...

Using a Single Sensor

A photodiode is constructed of silicon material and its output voltage waveform is proportional to the light falling on it. A filter in front of the sensor improves selectivity. To generate a 2-D image, there is relative displacement in the x and y directions between the sensor and the area being imaged. Mechanical digitizers of this type are referred to as microdensitometers.

Eg: a flat bed with a bidirectional sensor.

Image Acquisition...

Using Sensor Strips

Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain cross-sectional images of 3-D objects. Sensing devices with 4000 or more in-line sensors are possible, and in-line sensors are used routinely in airborne imaging applications.

Eg: flat-bed scanners.


Basic steps of image processing : Input, Editing and Output

Basic Concepts in Sampling and Quantization

Suppose there is a continuous image, f(x, y), that is to be converted to digital form. An image may be continuous with respect to the x and y coordinates and also in amplitude.

Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.
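A minimal sketch of these two steps, assuming the continuous image is modelled as a Python function f(x, y) over the unit square; the grid size and the number of gray levels are illustrative parameters.

```python
import numpy as np

def sample_and_quantize(f, width, height, levels):
    """Digitize a continuous image f(x, y) defined on [0, 1] x [0, 1].

    Sampling: evaluate f on a regular width x height grid of coordinates.
    Quantization: round each sampled amplitude (assumed in [0, 1]) to one
    of `levels` discrete gray values.
    """
    xs = np.linspace(0.0, 1.0, width)
    ys = np.linspace(0.0, 1.0, height)
    samples = np.array([[f(x, y) for x in xs] for y in ys])   # sampling
    quantized = np.round(samples * (levels - 1)).astype(int)  # quantization
    return quantized

if __name__ == "__main__":
    # A smooth synthetic "image": brightness varies sinusoidally over the plane.
    f = lambda x, y: 0.5 + 0.5 * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)
    print(sample_and_quantize(f, width=8, height=4, levels=4))
```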

Storage Processing (Ref: Multimedia in Practice by Judith Jeffcoate)

The factors influencing the choice of a suitable storage system for multimedia vary according to the user's circumstances. They include:

1. Quantity of data to be stored, required access time and acceptable transfer rate.

2. Type of information to be stored: alphanumeric data, text, line art, halftones or grey scale, color, audio or video.

3. Stability of data, rates at which it is acquired and changed, its expected life span and any legal requirements.

4. Number of copies of data, its distribution, whether the system must be portable between sites.

5. Cost of data preparation, capture, storage media and related equipment.

6. Skill and experience of users, their training needs.

7. Interfaces required to existing systems, backups, security.

8. Conversion of existing data, e.g., microfiche to optical disk.

Storage processing ...

1. Magnetic media – RAID (Redundant Array of Inexpensive Disks).

2. Optical media – analogue media, digital media.

3. Compact disk – CD-DA (compact disk digital audio), CD-ROM (compact disk read only memory), recordable compact disk, CD-ROM XA (CD-ROM Extended Architecture), CD-I (CD Interactive).

Communication

1. Building multimedia networks:

- Bandwidth – high capacity
- Synchronization – video, sound and data
- Different types of information flow – isochronous (continuous) and asynchronous (bursty)
- Variable demand

Image Enhancement (Ref: Fundamentals of Digital Image Processing by Anil K. Jain)

1. Enhancement by point processing – contrast stretching, clipping, window slicing, histogram modelling.

2. Spatial filtering – smoothing, filtering, un-sharp masking, zooming

3. Colour image processing.

Image Enhancement...

Image enhancement refers to the accentuation or sharpening of image features such as edges, boundaries or contrast to make a graphic display more useful for display and analysis.

It does not increase the inherent information content in the data.

The greatest difficulty in image enhancement is quantifying the criterion for enhancement.

Point Operations

Point operations are zero-memory operations in which a given gray level u ∈ [0, L] is mapped into a gray level v ∈ [0, L] according to a transformation

v = f(u)

The following transformations are commonly used.

1. Contrast Stretching

Low-contrast images often occur due to poor or nonuniform lighting conditions, or due to nonlinearity or a small dynamic range of the imaging sensor.

The slope of transformation is chosen greater than unity in the region of stretch.

Explanation

For example, the gray scale intervals where pixels occur most frequently would be stretched most to improve the overall visibility of a scene.
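A minimal sketch of a piecewise-linear contrast stretch of this kind, assuming 8-bit gray levels; the breakpoints a, b and the slopes are illustrative parameters, not values from the reference text.

```python
import numpy as np

def contrast_stretch(u, a, b, L=255, low_slope=0.5, stretch_slope=2.0):
    """Piecewise-linear contrast stretching of gray levels u in [0, L].

    Levels below `a` and above `b` are compressed (slope < 1), while the
    middle region [a, b] is stretched with a slope greater than unity,
    improving the visibility of detail that falls in that range.
    """
    u = np.asarray(u, dtype=float)
    va = low_slope * a                       # output value at the first breakpoint
    vb = va + stretch_slope * (b - a)        # output value at the second breakpoint
    high_slope = (L - vb) / (L - b) if L > b else 0.0
    v = np.where(u < a, low_slope * u,
        np.where(u <= b, va + stretch_slope * (u - a),
                 vb + high_slope * (u - b)))
    return np.clip(v, 0, L).astype(np.uint8)

if __name__ == "__main__":
    # A low-contrast image whose gray levels cluster between 100 and 150.
    img = np.random.randint(100, 150, size=(4, 4))
    print(contrast_stretch(img, a=100, b=150))
```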

Point Operations...

2. Clipping and thresholding

Clipping is a special case of contrast stretching in which gray levels outside a given range [a, b] are suppressed. It is useful for noise reduction when the input signal is known to lie in that range.

Thresholding is a special case of clipping. A binary image may not give a binary output when scanned because of sensor noise and background illumination variations. Thresholding is used to make such an image binary.
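A minimal sketch of both operations, assuming 8-bit NumPy images; the particular threshold and range values are illustrative.

```python
import numpy as np

def threshold(u, t, L=255):
    """Thresholding: map gray levels below t to 0 and the rest to L.

    Used to restore a binary image that has been corrupted by sensor noise
    and background illumination variations during scanning.
    """
    u = np.asarray(u)
    return np.where(u < t, 0, L).astype(np.uint8)

def clip(u, a, b):
    """Clipping (here: suppress gray levels outside the range [a, b]).

    A special case of contrast stretching with zero slope outside [a, b];
    useful for noise reduction when the true signal lies inside that range.
    """
    u = np.asarray(u)
    return np.where((u >= a) & (u <= b), u, 0).astype(np.uint8)

if __name__ == "__main__":
    noisy_scan = np.array([[12, 240, 35], [200, 8, 250]])
    print(threshold(noisy_scan, t=128))   # binary output: 0 or 255
    print(clip(noisy_scan, a=30, b=220))  # values outside [30, 220] removed
```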

Point Operations...

3. Digital Negative

A negative image can be obtained by reverse scaling of the gray levels according to the transformation v = L − u, where L is the maximum gray level.

4. Intensity Level Slicing

These transformations permit segmentation of certain gray-level regions from the rest of the image. The technique is useful when different features of an image are contained in different gray levels, for example segmenting low-temperature regions (clouds, a hurricane) in images where high gray-level intensities correspond to low temperatures.


Point Operations...

5. Bit Extraction

Suppose each image pixel is quantized to B bits. It is desired to extract the n-th most significant bit and display it.
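A minimal sketch of bit-plane extraction, assuming B-bit unsigned pixel values stored in a NumPy array.

```python
import numpy as np

def extract_bit_plane(u, n, B=8):
    """Extract the n-th most significant bit plane of a B-bit image.

    n = 1 selects the most significant bit. The result is a binary image
    (0 or 1) that can be displayed to inspect how much visual information
    each bit plane carries.
    """
    u = np.asarray(u).astype(np.uint16)
    shift = B - n                       # position of the requested bit
    return (u >> shift) & 1

if __name__ == "__main__":
    img = np.array([[200, 100], [50, 255]])   # 8-bit gray levels
    print(extract_bit_plane(img, n=1))        # MSB plane: 1 where pixel >= 128
    print(extract_bit_plane(img, n=8))        # LSB plane: typically noise-like
```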

Compression: the process of file size reduction using mathematical algorithms.

Raw / uncompressed media data – an analog signal that has been digitized and stored on disk as a digital file. To reduce its size, it must be processed by software called a CODEC (Compressor/Decompressor, or Coder/Decoder).

The compression process must be reversible.

Decompressed data may not be the same as the original uncompressed data.

Image compression

Lossless – the CODEC represents the existing information in a more compact form without actually discarding any data, ensuring good quality, but the compression ratio is not very high. Eg: medical images.

Lossy – parts of the original data are discarded permanently to reduce the file size; quality is compromised but the compression ratio is very high. Eg: multimedia presentations, web page content.

Types of compression

Lossy Compression

(Figure: original image, 92% compressed, 98% compressed)

Symmetrical – a compression system that requires the same processing power and time scale to compress and decompress an image.

Asymmetrical – a compression system that requires different processing power and time scales to compress and to decompress an image.

symmetrical and asymmetrical

Based on the kind of redundancies.

Intraframe – applicable within a still image or a single video frame. Redundancies that occur when different portions of an image are identical are detected and exploited for compression.

Interframe – applicable when redundancies occur between adjacent frames in a video sequence (temporal redundancy).

Statistical Redundancy – relationship existing within media data.

Psycho-Visual Redundancy – visual information is not perceived equally.

Intraframe and Interframe

These are also known as entropy encoding: the compression techniques do not consider the nature of the information to be compressed, i.e., they ignore its semantics. A few such methods are:

- RLE (Run Length Encoding)
- Shannon–Fano algorithm
- Arithmetic coding

Lossless / Statistical Compression Techniques

RLE Method

A sequence of repeated characters may be replaced by a more compact form: 'n' successive occurrences of a character are replaced by a single instance of the character together with its count.
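A minimal sketch of run-length encoding and decoding over a character string; the (character, count) pair representation used here is one common choice among several.

```python
def rle_encode(data):
    """Run-length encode a string: runs of a repeated character become
    (character, count) pairs. Effective when long runs are common, e.g.
    large flat areas in simple graphics."""
    if not data:
        return []
    runs, current, count = [], data[0], 1
    for ch in data[1:]:
        if ch == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = ch, 1
    runs.append((current, count))
    return runs

def rle_decode(runs):
    """Invert rle_encode exactly: RLE is lossless."""
    return "".join(ch * count for ch, count in runs)

if __name__ == "__main__":
    text = "AAAAAABBBCCCCCD"
    encoded = rle_encode(text)
    print(encoded)                        # [('A', 6), ('B', 3), ('C', 5), ('D', 1)]
    assert rle_decode(encoded) == text    # round trip recovers the original
```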

This is a basic information theoretic algorithm. A simple example will be used to illustrate the algorithm:

Symbol   A    B    C    D    E
------------------------------
Count    15   7    6    6    5

Encoding for the Shannon-Fano Algorithm:

* A top-down approach

1. Sort symbols according to their frequencies/probabilities, e.g., ABCDE.

2. Recursively divide into two parts, each with approx. same number of counts.

Shannon – Fano algorithm

Symbol   Count   log2(1/p)   Code   Subtotal (# of bits)
------   -----   ---------   ----   --------------------
A        15      1.38        00     30
B        7       2.48        01     14
C        6       2.70        10     12
D        6       2.70        110    18
E        5       2.96        111    15

TOTAL (# of bits): 89

Shannon – Fano algorithm example
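A minimal sketch of the top-down Shannon–Fano construction, splitting at the point where the two halves have the most nearly equal total counts; on the table above it reproduces the codes shown and the 89-bit total.

```python
def shannon_fano(symbols):
    """Build Shannon-Fano codes for a list of (symbol, count) pairs.

    Top-down: sort symbols by decreasing count, split the list at the point
    where the two halves have the most nearly equal total counts, prefix
    one half with '0' and the other with '1', and recurse on each half.
    """
    codes = {}

    def split(items, prefix):
        if len(items) == 1:
            codes[items[0][0]] = prefix or "0"
            return
        total = sum(count for _, count in items)
        best_cut, best_diff, running = 1, float("inf"), 0
        for i in range(1, len(items)):
            running += items[i - 1][1]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_cut, best_diff = i, diff
        split(items[:best_cut], prefix + "0")
        split(items[best_cut:], prefix + "1")

    split(sorted(symbols, key=lambda sc: -sc[1]), "")
    return codes

if __name__ == "__main__":
    counts = [("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]
    codes = shannon_fano(counts)
    print(codes)                                   # A=00, B=01, C=10, D=110, E=111
    total_bits = sum(c * len(codes[s]) for s, c in counts)
    print("total bits:", total_bits)               # 89, as in the table above
```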

Also called source coding; the nature of the input signal is considered during compression.

Human audio visual capabilities and limitations are considered for isolating portions of media that cannot be perceived by average human senses.

Eg: discarding colours that cannot be perceived by the human eye; filtering out frequencies not audible to the human ear.

Lossy Compression

Lossy Compression Techniques

Vector quantization, fractal compression technique, transform coding, psychoanalysis, interframe correlation. Hybrid: JPEG-DCT

Vector quantization

Vector quantization (VQ) is a lossy data compression method based on the principle of block coding. It is a fixed-to-fixed length algorithm.

Block coding – divide the message into blocks of k bits each, called datawords, and add r redundant bits to each block to make the length n = k + r. The resulting n-bit blocks are called codewords.

Vector quantization is used for lossy data compression, lossy data correction and density estimation.
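A minimal sketch of vector quantization on 2×2 image blocks using a k-means-style codebook; the block size, codebook size and iteration count are illustrative choices.

```python
import numpy as np

def train_codebook(vectors, k, iters=20, seed=0):
    """Learn a codebook of k codewords from training vectors (k-means style)."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest codeword.
        dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each codeword to the mean of the vectors assigned to it.
        for j in range(k):
            members = vectors[labels == j]
            if len(members):
                codebook[j] = members.mean(axis=0)
    return codebook

def vq_encode(vectors, codebook):
    """Fixed-to-fixed length mapping: each input vector becomes one index."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def vq_decode(indices, codebook):
    """Decoding is a table lookup; the result only approximates the input."""
    return codebook[indices]

if __name__ == "__main__":
    # Treat 2x2 image blocks (4-dimensional vectors) as the units to be coded.
    img = np.random.randint(0, 256, size=(16, 16)).astype(float)
    blocks = img.reshape(8, 2, 8, 2).transpose(0, 2, 1, 3).reshape(-1, 4)
    codebook = train_codebook(blocks, k=8)
    indices = vq_encode(blocks, codebook)          # 3 bits per block instead of 32
    approx = vq_decode(indices, codebook)
    print("mean absolute error:", np.abs(approx - blocks).mean())
```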

fractal compression technique

The method is best suited for textures and natural images, relying on the fact that parts of an image often resemble other parts of the same image. Fractal algorithms convert these parts into mathematical data called "fractal codes", which are used to recreate the encoded image.

transform coding

In transform coding, knowledge of the application is used to choose information to discard, thereby lowering its bandwidth. The remaining information can then be compressed via a variety of methods. When the output is decoded, the result may not be identical to the original input, but is expected to be close enough for the purpose of the application.

psychoanalysis

It is responsible for analyzing the transformed data and identifying which portions may be irrelevant with respect to human visual or acoustic system.

Ear – Psycho acoustic model, frequency masking (sensitivity to sound), temporal masking (difference in sound level)

Eye – spatial frequency (closely spaced light and dark patterns are difficult to distinguish)

interframe correlation

An inter frame is a frame in a video compression stream that is expressed in terms of one or more neighboring frames. The "inter" part of the term refers to the use of inter-frame prediction, which tries to take advantage of the temporal redundancy between neighboring frames, allowing higher compression rates. Two techniques are used:

- Frame replenishment
- Motion compensation
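A minimal sketch of block-matching motion estimation, the core operation behind motion compensation, assuming grayscale NumPy frames; the block size and the ±4-pixel search window are illustrative.

```python
import numpy as np

def match_block(prev, block, top, left, search=4):
    """Find the displacement of `block` in the previous frame by exhaustive
    search within +/-`search` pixels, minimizing the sum of absolute
    differences (SAD)."""
    h, w = block.shape
    best = (0, 0, float("inf"))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > prev.shape[0] or x + w > prev.shape[1]:
                continue
            sad = np.abs(prev[y:y + h, x:x + w] - block).sum()
            if sad < best[2]:
                best = (dy, dx, sad)
    return best

if __name__ == "__main__":
    # Synthetic example: the current frame is the previous frame shifted right by 2.
    prev = np.random.randint(0, 256, size=(32, 32)).astype(float)
    curr = np.roll(prev, shift=2, axis=1)
    block = curr[8:16, 8:16]                  # an 8x8 block of the current frame
    dy, dx, sad = match_block(prev, block, top=8, left=8)
    print("motion vector:", (dy, dx), "SAD:", sad)   # expect (0, -2) with SAD 0
```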

Hybrid – JPEG DCT

JPEG is a compression standard for continuous-tone gray-scale or colour images. It uses a combination of the discrete cosine transform (DCT), quantization and run-length encoding, and supports various modes of operation including lossless and lossy modes. Its performance depends on the complexity of the image.

DCT transforms each block from the spatial domain to the frequency domain.
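A minimal sketch of the 8×8 two-dimensional DCT followed by a uniform quantization step, assuming an 8-bit grayscale block; the single step size q stands in for the full JPEG quantization table, so this only approximates the standard pipeline.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that F = C @ block @ C.T."""
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def jpeg_like_block(block, q=16):
    """Transform one 8x8 block to the frequency domain, quantize, and reconstruct."""
    C = dct_matrix(8)
    coeffs = C @ (block - 128.0) @ C.T          # level shift + 2-D DCT
    quantized = np.round(coeffs / q)            # most high-frequency terms become 0
    recon = C.T @ (quantized * q) @ C + 128.0   # dequantize + inverse DCT
    return quantized, recon

if __name__ == "__main__":
    block = np.random.randint(100, 160, size=(8, 8)).astype(float)  # a smooth-ish block
    quantized, recon = jpeg_like_block(block)
    print("nonzero coefficients:", int(np.count_nonzero(quantized)), "of 64")
    print("max reconstruction error:", np.abs(recon - block).max())
```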

Thank you