
Post on 25-Dec-2015


Media File Formats

Jon Ivins, DMU

Text Files

Two types:

1. Plain text (unformatted)

ASCII character set is the most common
  7 bits are used
  This can represent 128 code words
  A = 1000001, a = 1100001
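A short Python sketch of the 7-bit code words quoted above. The function name ascii_bits is illustrative and not part of the slides.

```python
def ascii_bits(ch: str) -> str:
    """Return the 7-bit ASCII code word for a single character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    # "07b" means: binary, padded with leading zeros to 7 digits
    return format(code, "07b")


print(ascii_bits("A"))  # 1000001
print(ascii_bits("a"))  # 1100001
```

Upper and lower case letters differ in a single bit, which is why the two code words look so similar.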

Parity / Extended Character Sets

Computers store data in bytes (8 bits)
The extra bit can be used for:
  Error detection
    A parity bit is added: A = 1000001 becomes 10000011 (odd parity)
  Extending the code to 256 code words
    Extended 8-bit character sets; IBM’s EBCDIC is another 8-bit code
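The parity idea can be sketched in a few lines of Python. The names add_parity and check_parity are illustrative; the parity bit is assumed to be appended on the right, as in the 10000011 example.

```python
def add_parity(bits7: str, odd: bool = True) -> str:
    """Append a parity bit so the total number of 1s is odd (or even)."""
    ones = bits7.count("1")
    if odd:
        parity = "0" if ones % 2 == 1 else "1"
    else:
        parity = "1" if ones % 2 == 1 else "0"
    return bits7 + parity


def check_parity(bits8: str, odd: bool = True) -> bool:
    """Return True if the received byte still has the expected parity."""
    return (bits8.count("1") % 2 == 1) == odd


sent = add_parity("1000001")      # 'A' with odd parity
print(sent)                       # 10000011
print(check_parity(sent))         # True
print(check_parity("10000001"))   # False: one bit was flipped in transit
```

A single parity bit detects any odd number of flipped bits but cannot say which bit is wrong, and it misses an even number of errors.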

Text Files

2. Formatted text

Used by word processors / DTP
Characters are used to give both text and formatting information
  Bold, italic, position, etc.
Also contains information on page numbers, version, index, etc.
Formatted files are usually much larger than their plain text equivalent

Graphics Files

Consist of objects
Contain data on size, position, colour
These are called VECTOR graphics
Use ANTI-ALIASING to smooth lines

Image Files

Consist of PIXELS
A pixel is a small area of the screen
VGA displays are 640 x 480
  480 lines of 640 pixels
  This is 307,200 pixels
Pixels contain data on colour
Greyscale uses one byte
  Black = 0, White = 255
Colour uses 3 bytes
  1 for Red, 1 for Green and 1 for Blue (RGB)
  24 bits gives about 16.7 million RGB combinations
BUT most monitors are usually set to 256 colours
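The arithmetic behind these figures, as a small Python sketch. The helper name image_bytes is illustrative, and the sizes are for raw, uncompressed pixel data only.

```python
def image_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of the raw pixel data for an uncompressed image."""
    return width * height * bytes_per_pixel


pixels = 640 * 480
print(pixels)                      # 307200 pixels on a VGA screen
print(image_bytes(640, 480, 1))    # 307200 bytes for greyscale
print(image_bytes(640, 480, 3))    # 921600 bytes for 24-bit RGB
print(2 ** 24)                     # 16777216 possible RGB combinations
```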

Bit Mapped Files

Graphics use a mathematical relationship to describe their position & size
A line might be described by its end points: 0,0 to 10,10
To double the size, the co-ordinates are simply doubled: 0,0 to 20,20
Graphic objects are scaleable
Normally, though, graphics are saved as bitmaps (BMP files), which are not scaleable
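Scaling a vector object is just arithmetic on its co-ordinates. A minimal Python sketch, assuming a line is stored as a pair of (x, y) end points; the name scale_line is illustrative.

```python
def scale_line(line, factor):
    """Scale a line, given as ((x0, y0), (x1, y1)), about the origin."""
    (x0, y0), (x1, y1) = line
    return ((x0 * factor, y0 * factor), (x1 * factor, y1 * factor))


line = ((0, 0), (10, 10))
print(scale_line(line, 2))  # ((0, 0), (20, 20))
```

A bitmap has no end points to multiply, only pixels, so enlarging it means inventing new pixels, which is why bitmaps do not scale cleanly.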

GIF Files

Image files hold a lot of data
Image files tend to be large files
To reduce storage space, COMPRESSION techniques are used
One solution is RUN LENGTH ENCODING
  Count the number of consecutive pixels that are the same
  Decoder uses this count to copy the original pixel X times
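Run length encoding as described above can be sketched in Python. This is an illustration only: the (count, value) pair layout and the function names are assumptions, not the byte format of any particular image file.

```python
def rle_encode(pixels):
    """Collapse runs of identical pixels into (count, value) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][1] == p:
            runs[-1] = (runs[-1][0] + 1, p)   # extend the current run
        else:
            runs.append((1, p))               # start a new run
    return runs


def rle_decode(runs):
    """Copy each pixel value 'count' times to rebuild the original row."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


row = [255, 255, 255, 255, 0, 0, 255]
encoded = rle_encode(row)
print(encoded)                     # [(4, 255), (2, 0), (1, 255)]
print(rle_decode(encoded) == row)  # True
```

The scheme only pays off when runs are long; a noisy image with no repeated neighbours would come out larger than it went in.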

GIF Files

Developed by CompuServe
Used for single or multiple images
Based on LZW compression
  Lempel and Ziv invented the original algorithm
  Welch developed it further
Replaces repeated strings of data with a short TOKEN (in LZW, a code that indexes a dictionary of strings seen so far)
LZW can give reasonable compression (around 50%)
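A textbook-style LZW sketch in Python, to show how repeated strings turn into dictionary codes. It is a simplification: real GIF files add variable-width codes, clear codes and a limited dictionary size, none of which are modelled here. The function names are illustrative.

```python
def lzw_compress(data: bytes) -> list:
    """Return a list of integer codes for the input bytes."""
    dictionary = {bytes([i]): i for i in range(256)}  # single bytes first
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate                  # keep growing the match
        else:
            codes.append(dictionary[current])    # emit code for the match
            dictionary[candidate] = next_code    # learn the new string
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the bytes, re-creating the same dictionary on the fly."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[:1]      # code defined by this step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


data = b"ABABABABABAB"
codes = lzw_compress(data)
print(len(data), "bytes in,", len(codes), "codes out")
print(lzw_decompress(codes) == data)  # True
```

The decoder never receives the dictionary; it rebuilds it from the codes themselves, which is part of why decompression is quick.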

GIF Files

Decompression is fairly quick
Universal standard
Not optimised for image compression
UNISYS holds a patent on LZW, so there may be a problem with royalties

JPEG Files

Joint Photographic Experts Group
Uses a Fourier-related transform (the Discrete Cosine Transform) to eliminate high frequency components in the image
Uses several algorithms, including run-length encoding
Can be lossy; typical artefacts are:
  blockiness
  posterisation
  ringing
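To illustrate what eliminating high frequency components means, here is a one-dimensional Python sketch using the Discrete Cosine Transform, the Fourier-related transform that JPEG uses. It is only an illustration: JPEG works on two-dimensional 8 x 8 blocks and quantises coefficients rather than simply zeroing them. The sample values and function names are made up for the example.

```python
import math


def dct(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(samples))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct(): rebuild samples from the coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


row = [52, 55, 61, 66, 70, 61, 64, 73]      # one row of pixel values
coeffs = dct(row)
kept = coeffs[:4] + [0.0] * 4               # drop the 4 highest frequencies
approx = idct(kept)
print([round(v) for v in approx])           # close to, but not equal to, row
```

Throwing away the high frequency coefficients is where the loss comes from: the rebuilt row is smoother than the original, and pushing this too far produces the artefacts listed above.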

Video Files

AVI (Audio Video Interleave)
  Supported on all versions of Windows from 1995
  Almost all PC users can watch AVI files
  Mac users probably won’t be able to watch AVI files
  Large file size (20 Mbytes per second)
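A rough Python check on where a figure of that order comes from, assuming uncompressed 640 x 480 frames at 3 bytes per pixel and 25 frames per second. The frame rate and the uncompressed assumption are mine, not stated in the slides.

```python
def video_bytes_per_second(width, height, bytes_per_pixel, frames_per_second):
    """Raw data rate of uncompressed video, ignoring audio and headers."""
    return width * height * bytes_per_pixel * frames_per_second


rate = video_bytes_per_second(640, 480, 3, 25)
print(rate)               # 23040000 bytes per second
print(rate / 1_000_000)   # 23.04, i.e. roughly 20 Mbytes per second
```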

MPEG

Moving Picture Experts Group
Popular format
  Good compression
  Still large files
Uses similar compression techniques to JPEG

Other Video Formats

MOV
  Mac format
  Can be difficult to play on PCs
Real Audio & Shockwave
  “Streaming” files
  Optimised for the Internet

Sound Files

Two main types:
WAV files
  Digital samples of analogue waveforms
MIDI files
  Set of instructions to control a computer

WAV Files

Sound is sampled according to the Nyquist Sampling Theorem
  SAMPLE RATE = 2 x highest frequency (as a minimum)
Telephone bandwidth is 300 - 3400 Hz
  Sampling rate is 6800 times a second
Audio is 20 - 20,000 Hz
  Sampling rate is 40,000 Hz
We also need control information, so the sampling rate is always higher than the Nyquist limit
  Telephone speech is 8 kHz
  CD audio is approx. 44 kHz
The higher the frequencies to be captured, the higher the sampling rate, and so the higher the quality
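The Nyquist figures above follow directly from the rule. A small Python sketch; the function name is illustrative, and 44,100 Hz is used for CD audio as the usual exact value behind the approximate figure.

```python
def nyquist_rate(highest_frequency_hz: float) -> float:
    """Minimum sampling rate: twice the highest frequency present."""
    return 2 * highest_frequency_hz


print(nyquist_rate(3400))    # 6800  -> telephone speech is sampled at 8000
print(nyquist_rate(20000))   # 40000 -> CD audio is sampled at 44100

# The rates used in practice sit above the Nyquist minimum
print(8000 >= nyquist_rate(3400))     # True
print(44100 >= nyquist_rate(20000))   # True
```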

[Figure: sound level in volts (axis from -15 to 15) plotted against time]

The sound is sampled at regular intervals

Conversion to digital

There are 21 signal levels: -10 to 0 to +10
We need 5 bits to represent this range
  Note: 5 bits gives 32 combinations
  Use 0XXXX for positive values
  Use 1XXXX for negative values

  3 volts is represented by 00011
  7 volts is represented by 00111
  10 volts is represented by 01010
  -3 volts is represented by 10011
  -7 volts is represented by 10111
  -10 volts is represented by 11010
  0 volts is represented by 00000

Each sample is transmitted to an output device sequentially
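The sign-plus-magnitude scheme (0XXXX for positive, 1XXXX for negative) can be sketched in Python as follows. The function names are illustrative, and the magnitude is assumed to be plain 4-bit binary, so a magnitude of 10 is 1010.

```python
def encode_sample(volts: int) -> str:
    """Encode a whole-volt sample as 1 sign bit plus 4 magnitude bits."""
    if not -10 <= volts <= 10:
        raise ValueError("sample outside the -10..+10 volt range")
    sign = "1" if volts < 0 else "0"
    return sign + format(abs(volts), "04b")


def decode_sample(bits: str) -> int:
    """Turn a 5-bit code word back into a signed voltage."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


for v in (3, 7, 10, -3, -7, -10, 0):
    print(v, encode_sample(v))
# 3 00011, 7 00111, 10 01010, -3 10011, -7 10111, -10 11010, 0 00000
```

Only 21 of the 32 possible code words are used, so this scheme wastes about a third of its code space.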

Quantisation noise

The example uses a 1 volt step size
What if the audio sample is 7.5 volts?
  The encoder gives a value of 8 volts
  The decoder outputs an 8 volt signal
This error is called QUANTISATION NOISE
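A minimal Python sketch of the rounding that causes quantisation noise. It assumes halves are rounded up, which matches the 7.5 volt example; the function name is illustrative.

```python
import math


def quantise(volts: float, step: float = 1.0) -> float:
    """Round a sample to the nearest step, with halves rounded up."""
    return math.floor(volts / step + 0.5) * step


sample = 7.5
level = quantise(sample)
print(level)            # 8.0 -> what the encoder sends and the decoder plays
print(level - sample)   # 0.5 -> the quantisation error for this sample
```

With a 1 volt step the error can never exceed half a volt; halving the step size halves the worst-case error but costs one more bit per sample.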

Companding

Most audio signals are quiet
  There are more signals at lower levels than at high levels
Companding means using a non-linear scale
For example:
  0 - 5 volts might have 20 values
  5 - 8 volts might have 8 values
  8 - 10 volts might have 2 values
This gives better resolution at lower levels at the expense of high signal levels
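The three-band example above can be turned into a small Python quantiser. This is a sketch of that example only, not of a standard companding law such as mu-law or A-law; the choice to output the midpoint of each step, and the names used, are assumptions.

```python
# (low volts, high volts, number of values) for each band of the example
SEGMENTS = [(0.0, 5.0, 20), (5.0, 8.0, 8), (8.0, 10.0, 2)]


def compand_quantise(volts: float) -> float:
    """Quantise using fine steps at low levels and coarse steps at high."""
    magnitude = min(abs(volts), 10.0)            # clip to the 10 volt range
    for low, high, count in SEGMENTS:
        if magnitude <= high:
            step = (high - low) / count
            index = min(int((magnitude - low) / step), count - 1)
            level = low + (index + 0.5) * step   # midpoint of the step
            return level if volts >= 0 else -level
    raise AssertionError("unreachable: magnitude was clipped to 10.0")


for v in (1.1, 6.0, 9.4):
    q = compand_quantise(v)
    print(v, "->", q, "error", round(abs(q - v), 3))
```

Step sizes work out at 0.25, 0.375 and 1 volt, so a quiet sample is never more than 0.125 volts out while a loud one can be up to 0.5 volts out.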

CD Quality WAV files

Use 16 bits x 2 channels (stereo) to represent the audio signal
This gives 65,536 “steps” per channel
Quantisation noise is low
A lot of bits will carry no information (low sound levels)
This means a lot of data redundancy
WAV file size becomes large
  1 Mbyte = about 6 seconds of sound (1 Mbit = about 0.7 seconds)
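The file size can be worked out from the sample format. A Python sketch, assuming 44,100 samples per second (the usual CD rate, not stated on this slide), 16 bits per sample and 2 channels; the function name is illustrative.

```python
def wav_bytes_per_second(sample_rate_hz: int, bits_per_sample: int,
                         channels: int) -> int:
    """Raw audio data rate, ignoring the small WAV header."""
    return sample_rate_hz * bits_per_sample * channels // 8


rate = wav_bytes_per_second(44100, 16, 2)
print(rate)                          # 176400 bytes per second
print(round(1_000_000 / rate, 2))    # 5.67 seconds of sound per Mbyte
print(round(125_000 / rate, 2))      # 0.71 seconds of sound per Mbit
print(2 ** 16)                       # 65536 steps per channel
```

On these assumptions, 0.7 seconds corresponds to one megabit of data, and one megabyte holds a little under 6 seconds.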

MIDI Files

These are digital files of instructions, not sampled sound
Control computers, sequencers, etc.
Each bit in the signal is used
Must have a MIDI player to hear the sound
File size is very small compared to WAV files

Audio Compression

ADPCM
  Predicts next sample value
TrueSpeech
  Based on mathematical model of airflow over vocal tract
  Highly efficient (1/16th)
MPEG Audio
  Fits with MPEG Video files

Zip Files

Popular file compression utility
Based on Lempel-Ziv compression (most zip files use DEFLATE, which combines LZ77 with Huffman coding)
Used to transfer or store large files
Zipped files give good results for text and WAV files
Poor results for graphics / video (typically 3%)
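The difference between compressible and incompressible data can be seen with Python's zlib module, which implements DEFLATE, the method used in most zip files. The random bytes below are an assumption standing in for already-compressed graphics or video, and the sample text is made up.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


text = b"the quick brown fox jumps over the lazy dog. " * 200

rng = random.Random(0)   # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

print("text :", round(compression_ratio(text), 3))    # a small fraction
print("noise:", round(compression_ratio(noise), 3))   # about 1.0, no gain
```

Data that has already been compressed looks close to random, so a second pass of general-purpose compression finds almost nothing left to remove.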

File Size / Performance

There is a trade-off between:

Speed of loading

File size

Quality

There is no one correct solution for all multimedia applications