Bits, Bytes and Files

CS 1 Introduction to Computers and Computer Technology

Rick Graziani
Spring 2007

Rick Graziani graziani@cabrillo.edu

Digitization

• Digitize – To represent information with digits or symbols.
• Digitization does not require digits; any set of symbols will do.
• We use the digits 0 through 9 to digitize a phone number:
  – (Area code) Prefix – Number
  – (aaa) ppp – nnnn

• We use numbers, but we could have chosen any group of symbols.

Digitization

• The advantage of using digits instead of other symbols is that they can be listed in numerical order.

• Symbols can have an order to them, known as a collating sequence, such as player encoding for DVD and VCR players.

Other Collating Sequences?

Encoding with Dice

• Because information can be digitized using any symbols, consider a representation based on using dice.

• Encoding - The process of putting information into a specific format (usually digital format).

• A die has 6 unique patterns on it.
• If we used a single die, we could encode (represent) 6 pieces of information.
• This would not be enough to represent the entire alphabet.

A B C D E F

Encoding with Dice

• Two dice patterns together produce 6 x 6 = 36 different pattern sequences, because each die can be paired with each of 6 different patterns of the other die.

• Three dice would produce 6 x 6 x 6 = 216 different pattern sequences.

Pairing two dice patterns results in 6 x 6 = 36 possible pattern sequences.

Encoding with Dice

• If we used two dice to encode information…

A B C D

E F G…

Encoding with Dice

• Two dice patterns together produce 6 x 6 = 36 different pattern sequences.

• Notice that only 26 of the 36 unique pattern sequences are needed for the 26 letters in the alphabet.
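The two-dice idea can be sketched in a few lines of Python. This is only an illustration: the slides do not fix which pattern pair stands for which letter, so the in-order assignment and the name `dice_code` are assumptions.

```python
from itertools import product
from string import ascii_uppercase

# Every ordered pair of faces from two six-sided dice: 6 x 6 = 36 pairs.
pairs = list(product(range(1, 7), repeat=2))

# Assumed assignment: A gets the first pair, B the second, and so on.
dice_code = dict(zip(ascii_uppercase, pairs))

print(len(pairs))                    # number of two-dice pattern sequences
print(len(pairs) - len(dice_code))   # pairs left over after the 26 letters
print(dice_code["A"], dice_code["G"])
```

The leftover pairs are the ones available for a space, digits, or punctuation.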

Encoding with Dice

• Encode the phrase CABRILLO COLLEGE using two dice.
• Choose a sequence for the space.

__ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __

C A B R I L L

O space C O L L

Encoding with Dice

• The other 10 combinations could be used for the numbers 0 through 9.
• Or they could be used for punctuation, including an Escape Character or a Shift Key.

Encoding with Dice

• Using one of the keys as a special character key, like the Shift Key, can double the number of possibilities for all other keys.

• Using combinations of two keys can triple the possible combinations, e.g. Control-Shift-Delete.

a Shift
a b c d e f
g h i j k l
m n o p q r
s t u v w x
y z

Moving to Bits…

• The most fundamental form of information is the presence or absence of a physical phenomenon:
  – Is something there, or not?
  – Is light detected, or not?
  – Is it magnetized, or not?

• PandA is the name we use for the fundamental patterns of digital information based on the presence or absence of a physical phenomenon.

Presence    Absence
True        False
1           0
On          Off
Yes         No
+           -
Black       White
For         Against
Yang        Yin
Lisa        Bart

BIT – BInary digiT

• In computer technology, the PandA unit is known as a bit.
• Bit (Binary Digit) = Basic unit of information, representing one of two discrete states. The smallest unit of information within the computer.
• The only thing a computer understands.
• Abbreviation: b
• A bit has one of two values: 1 (ON) or 0 (OFF)

OFF ON

• The two patterns are known as the states of the bit.

• For example, magnetic encoding of information on tapes, floppy disks, and hard disks is done with positive or negative polarity.

The boxes illustrate positions where magnetism may be set and sensed; pluses (red) indicate magnetism of positive polarity, interpreted as “present” (1 bit), and minuses (blue) indicate magnetism of negative polarity, interpreted as “absent” (0 bit).

0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1

Analogy: Sidewalk Memory

• Memory (temporary or permanent) is like a strip of concrete sidewalk.
• A stone on a sidewalk section represents a 1 bit; the absence of a stone represents a 0 bit.

Sidewalk sections as a sequence of bits (1010 0010).

Combining Bit Patterns

• Using a single bit, with two discrete states, gives only two options (ON or OFF).

• Like using multiple dice, combining two or more bits gives us more options. (Bits only have 2 unique patterns, whereas the die had 6).

• 1 bit, 2 unique patterns: 0 or 1
• 2 bits, 4 unique patterns: 00, 01, 10 or 11
• 4 bits, 16 unique patterns: 0000, 0001, 0010, 0011, … 1111
• 8 bits, 256 unique patterns: 00000000, 00000001, 00000010, … 11111111
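As a quick check of these counts, the following Python sketch lists every pattern for a given number of bits. The function name `bit_patterns` is made up for this illustration.

```python
from itertools import product

def bit_patterns(n):
    """Return every pattern of n bits as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

for n in (1, 2, 4, 8):
    patterns = bit_patterns(n)
    print(n, "bits:", len(patterns), "patterns, from", patterns[0], "to", patterns[-1])
```

Each extra bit doubles the number of patterns, which is why n bits give 2 multiplied by itself n times.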

bit OFF ON

Binary Math

• Why do 7 bits give you 128 unique patterns?
• Why do 8 bits give you 256 unique patterns?

Starting with all “off’s” or 0’s: 0 0 0 0 0 0 0 0
Add 1 until you get all 1’s (on’s): 1 1 1 1 1 1 1 1
You get 256 unique combinations of 0’s and 1’s.

Binary Math

www.thinkgeek.com

Binary Math

0 + 0 = 0        (decimal 0)
0 + 1 = 1        (decimal 1)
1 + 1 = 10       (decimal 2)
10 + 1 = 11      (decimal 3)
11 + 1 = 100     (decimal 4)
100 + 1 = 101    (decimal 5)
101 + 1 = 110    (decimal 6)
111 + 1 = 1000
……
00000000 + 0 = 00000000 -> 11111110 + 1 = 11111111

Base 10 (Decimal) Number System

Digits (10): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Number of:  10^4      10^3     10^2   10^1   10^0
            10,000’s  1,000’s  100’s  10’s   1’s
                               1      2      3
                                             9
                                      1      0
                                      9      9
                               1      0      0

Base 10 (Decimal) Number System

Digits (10): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Number of: 10^4 10^3 10^2 10^1 10^0

10,000’s 1,000’s 100’s 10’s 1’s 4 1 0 8 3 8 2 1 0 0 0 9 1 0 0 1 0

Rick’s Number System Rules

• All digits start with 0.
• A Base-n number system has n digits:
  – Decimal: Base-10 has 10 digits
  – Binary: Base-2 has 2 digits
  – Hexadecimal: Base-16 has 16 digits
• The first column is always the number of 1’s.
• Each of the following columns is n times the previous column (n = Base-n):
  – Base 10: 10,000 1,000 100 10 1
  – Base 2: 16 8 4 2 1
  – Base 16: 65,536 4,096 256 16 1
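The column rule can be written as a small Python function. This is a sketch for illustration; `place_values` is a hypothetical name, not something from the course materials.

```python
def place_values(base, columns):
    """Column values for a base-n system, largest column first.

    The rightmost column is always 1 (base ** 0); each column to the
    left is `base` times the one before it.
    """
    return [base ** i for i in reversed(range(columns))]

print(place_values(10, 5))
print(place_values(2, 5))
print(place_values(16, 5))
```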

Base 2 (Binary) Number System

Digits (2): 0, 1

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
2                                               1      0
10                                1      0      1      0
17
70
130
255

Base 2 (Binary) Number System

Digits (2): 0, 1

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
2                                               1      0
10                                1      0      1      0
17                         1      0      0      0      1
70           1      0      0      0      1      1      0
130   1      0      0      0      0      0      1      0
255   1      1      1      1      1      1      1      1

Converting between Decimal and Binary

Digits (2): 0, 1

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
             1      0      0      0      1      1      0
                    1      0      1      0      0      0
      0      0      0      0      0      0      0      0
      1      0      0      0      0      0      0      0
172
192

Converting between Decimal and Binary

Digits (2): 0, 1

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
70           1      0      0      0      1      1      0
40                  1      0      1      0      0      0
0     0      0      0      0      0      0      0      0
128   1      0      0      0      0      0      0      0
172   1      0      1      0      1      1      0      0
192   1      1      0      0      0      0      0      0
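The table method can be mirrored in Python. This is a minimal sketch limited to 8 bits, as in the table; the names `to_binary` and `to_decimal` are assumptions chosen for this example.

```python
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]

def to_binary(number):
    """Convert 0 through 255 to 8 bits by working across the columns."""
    if not 0 <= number <= 255:
        raise ValueError("expected a value from 0 through 255")
    bits = ""
    for value in PLACE_VALUES:
        if number >= value:      # this column is needed: write 1, subtract it
            bits += "1"
            number -= value
        else:                    # this column is too big: write 0
            bits += "0"
    return bits

def to_decimal(bits):
    """Add up the column value of every position that holds a 1."""
    return sum(2 ** i for i, bit in enumerate(reversed(bits)) if bit == "1")

print(to_binary(172), to_binary(192))
print(to_decimal("1000110"), to_decimal("101000"))
```

Python's built-ins `format(172, "08b")` and `int("10101100", 2)` do the same conversions; the longer version is shown because it follows the column-by-column steps of the table.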

Computers do Binary

0 1
• Bits have two values: OFF and ON
• The Binary number system (Base-2) can represent OFF and ON very well since it has two values, 0 and 1
  – 0 = OFF
  – 1 = ON

• Understanding Binary to Decimal conversion is critical in networking.

• Although we use decimal numbers in networking to display information such as IP addresses (LATER), they are transmitted as OFF’s and ON’s that we represent in binary.

Rick’s Program

Binary Math

Why do 7 bits give you 128 unique patterns?
Starting with all “off’s” or 0’s: 0 0 0 0 0 0 0 (decimal 0)
Add 1 until you get all 1’s (on’s): 1 1 1 1 1 1 1 (decimal 127)
You get 128 unique combinations of 0’s and 1’s.
With 7 bits, each having 2 possible values, 2^7 = 128

Number of:
      2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  64’s   32’s   16’s   8’s    4’s    2’s    1’s
0     0      0      0      0      0      0      0
127   1      1      1      1      1      1      1

Binary Math

Why do 8 bits give you 256 unique patterns?
Starting with all “off’s” or 0’s: 0 0 0 0 0 0 0 0 (decimal 0)
Add 1 until you get all 1’s (on’s): 1 1 1 1 1 1 1 1 (decimal 255)
You get 256 unique combinations of 0’s and 1’s.
With 8 bits, each having 2 possible values, 2^8 = 256

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
0     0      0      0      0      0      0      0      0
255   1      1      1      1      1      1      1      1

Digitizing Text

• One of the earliest uses of PandA was to digitize text (keyboard characters).
• We will look at digitizing images and video later.
• Assigning Symbols in United States:
  – 26 upper case letters
  – 26 lower case letters
  – 10 numerals
  – 20 punctuation characters
  – 10 typical arithmetic characters
  – 3 non-printable characters (enter, tab, backspace)
  – 95 symbols needed

ASCII-7

• In the early days, a 7 bit code was used, with 128 combinations of 0’s and 1’s, enough for a typical keyboard.

• The standard is called ASCII (American Standard Code for Information Interchange).

• Each group of 7 bits was mapped to a single keyboard character.

0 = 0000000
1 = 0000001
2 = 0000010
3 = 0000011
…
127 = 1111111

Byte = A collection of bits (usually 7 or 8 bits) which represents a character, a number, or other information.

• More common: 8 bits = 1 byte
• Abbreviation: B

1 byte (B)

Kilobyte (KB) = 1,024 bytes (2^10)
• “one thousand bytes”
• 1,024 = 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2

Megabyte (MB) = 1,048,576 bytes (2^20)
• “one million bytes”

Gigabyte (GB) = 1,073,741,824 bytes (2^30)
• “one billion bytes”
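These sizes are all powers of two, which a short Python sketch can confirm. The dictionary name `UNITS` is made up for this example.

```python
# Binary unit sizes as powers of two, following the definitions above.
UNITS = {"KB": 2 ** 10, "MB": 2 ** 20, "GB": 2 ** 30}

for name, size in UNITS.items():
    print(f"1 {name} = {size:,} bytes")
```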

ASCII-8

• IBM later extended the standard, using 8 bits per byte.

• This was known as Extended ASCII or ASCII-8

• This gave 256 unique combinations of 0’s and 1’s.

0 = 00000000
1 = 00000001
2 = 00000010
3 = 00000011
…
255 = 11111111

ASCII-8

Remember the Dice…

36 possible combinations 6 x 6 = 36

256 possible combinations 2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 = 256 or shown as 16 x 16 = 256

Try it!

• Write out Cabrillo College (Upper and Lower case) in bits (binary) using the chart above.

0100 0011  0110 0001  …
C          a

The answer!

0100 0011  0110 0001  0110 0010  0111 0010  0110 1001  0110 1100
C          a          b          r          i          l

0110 1100  0110 1111  0010 0000  0100 0011  0110 1111  0110 1100
l          o          space      C          o          l

0110 1100  0110 0101  0110 0111  0110 0101
l          e          g          e
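The exercise can be checked with a short Python sketch. It assumes the chart is standard ASCII, where Python's built-in `ord` gives each character's code (upper case C is 67, or 0100 0011); `text_to_bits` is a hypothetical helper name.

```python
def text_to_bits(text):
    """Encode each character as 8 bits, shown as two groups of 4."""
    groups = []
    for ch in text:
        bits = format(ord(ch), "08b")   # assumes character codes below 256
        groups.append(bits[:4] + " " + bits[4:])
    return groups

phrase = "Cabrillo College"
for ch, bits in zip(phrase, text_to_bits(phrase)):
    print(bits, repr(ch))
```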

Unicode

• Although ASCII works fine for English, many other languages need more than 256 characters, including numbers and punctuation.
• Unicode originally used a 16 bit representation, with 65,536 possible symbols; it has since been extended beyond 16 bits to make room for more characters.
• Unicode aims to handle all languages.
• www.unicode.org

Non-text Files: Representing Images and Sound

RGB Colors and Binary Representation

• A monitor’s screen is divided into a grid of small units called picture elements or pixels. (See reading from Chapter 1).

• The more pixels per inch the better the resolution, the sharper the image.

• All colors on the screen are a combination of red, green and blue (RGB), just at various intensities.

• Each color intensity of red, green and blue is represented as a quantity from 0 through 255.

• The higher the number, the more intense the color.
• Black has no intensity or no color and has the value (0, 0, 0)
• White is full intensity and has the value (255, 255, 255)
• Between these extremes is a whole range of colors and intensities.
• Grey is somewhere in between (127, 127, 127)

• You can use your favorite program that allows you to choose colors to view these various red, green and blue values.

• Let’s convert these colors from Decimal to Binary!

          Red   Green   Blue
Purple:   172   73      185
Gold:     253   249     88

          Red   Green   Blue
Purple:   172   73      185
Gold:     253   249     88

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
172
73
185

253
249
88

          Red   Green   Blue
Purple:   172   73      185
Gold:     253   249     88

Number of:
      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
Dec.  128’s  64’s   32’s   16’s   8’s    4’s    2’s    1’s
172   1      0      1      0      1      1      0      0
73    0      1      0      0      1      0      0      1
185   1      0      1      1      1      0      0      1

253   1      1      1      1      1      1      0      1
249   1      1      1      1      1      0      0      1
88    0      1      0      1      1      0      0      0

• We have now converted these colors from Decimal to Binary!

          Red        Green      Blue
Purple:   172        73         185
          10101100   01001001   10111001

Gold:     253        249        88
          11111101   11111001   01011000
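The same conversion in Python, as a sketch: each intensity from 0 through 255 becomes one 8-bit byte. The function name `rgb_to_bits` is an assumption for this example. Note that 185 is 128 + 32 + 16 + 8 + 1, or 10111001.

```python
def rgb_to_bits(red, green, blue):
    """Show a color as three 8-bit bytes, one per intensity (0 through 255)."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each intensity must be from 0 through 255")
    return " ".join(format(value, "08b") for value in (red, green, blue))

print("Purple:", rgb_to_bits(172, 73, 185))
print("Gold:  ", rgb_to_bits(253, 249, 88))
```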

• Why does this matter?

First a word about Pixels Per Inch

graphicssoft.about.com
• PPI stands for pixels per inch.
• PPI is a measurement of image resolution that defines the size an image will print.
• An image that is 1600 by 1200 pixels at 300ppi will print at a size of 5.3 by 4 inches.
• Or it could be printed at 180 ppi for a printed size of 8.89 by 6.67 inches.
• The higher the PPI value, the better quality print you will get--but only up to a point.
• 300ppi is generally considered the point of diminishing returns when it comes to ink jet printing of digital photos.

1600 pixels

1200 pixels

1200/300 = 4 inches
1600/300 = 5.3 inches
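The print-size arithmetic is a single division per side, sketched here in Python. The function name `print_size` is made up for this example.

```python
def print_size(width_px, height_px, ppi):
    """Printed size in inches: pixels divided by pixels per inch."""
    return width_px / ppi, height_px / ppi

for ppi in (300, 180):
    width_in, height_in = print_size(1600, 1200, ppi)
    print(f"{ppi} ppi: {width_in:.2f} by {height_in:.2f} inches")
```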

First a word about Pixels Per Inch

• The higher the PPI value, the better quality print you will get--but only up to a point.

          Red        Green      Blue
Purple:   172        73         185
          10101100   01001001   10111001

24 bits for one pixel!

• “True color” systems require 3 bytes or 24 bits per pixel.
• There are also 8 bit and 16 bit color systems, which give you a smaller color palette.

          Red        Green      Blue
Purple:   172        73         185
          10101100   01001001   10111001   = 24 bits per pixel

• An 8 inch by 10 inch image scanned in at 300 pixels per inch:
  – 8 x 300 = 2,400 pixels; 10 x 300 = 3,000 pixels
  – 2,400 pixels by 3,000 pixels = 7,200,000 pixels or 7.2 megapixels
  – At 24 bits per pixel (7,200,000 x 24) = 172,800,000 bits or 21,600,000 bytes (21.6 megabytes)
• This affects RAM, video memory, disk space, bandwidth, …
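The storage calculation for an uncompressed scan can be sketched as follows. The function name `image_size` and its parameter names are assumptions for this example; the default of 24 bits per pixel is the “true color” figure from the slides.

```python
def image_size(width_in, height_in, ppi, bits_per_pixel=24):
    """Return (pixels, bits, bytes) for an uncompressed scanned image."""
    pixels = (width_in * ppi) * (height_in * ppi)
    bits = pixels * bits_per_pixel
    return pixels, bits, bits // 8

pixels, bits, size_bytes = image_size(8, 10, 300)
print(f"{pixels:,} pixels, {bits:,} bits, {size_bytes:,} bytes")
```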

10 inches or 3,000 pixels

8 inches or 2,400 pixels

Small (100 x 100, 72 ppi)

From NASA

Medium (720 x 720, 72 ppi)

Large (2242 x 2242, 300 ppi)

Just a part of the image!

File Compression

• A typical computer screen only has about 100 pixels per inch, not 300.

• Images still require a lot of memory and disk space, not to mention transferring images over the network or Internet.

• Compression – A means to change the representation to use fewer bits to store or transmit information.

• Information sent via a fax is either black or white, long strings of 0’s or long strings of 1’s.

Run-length encoding

• Many fax machines use run-length encoding.

• Run-length encoding uses binary numbers to specify how long the first sequence (run) of 0’s is, then how long the following sequence of 1’s is, then how long the following sequence of 0’s is, and so on.

• 0-100 1-373 0-96 etc.
• Fewer bits are needed than sending 100 0’s, then 373 1’s, etc.
• Run-length encoding is a lossless compression scheme, meaning that the original representation of 0’s and 1’s can be reconstructed exactly.
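A minimal Python sketch of run-length encoding, using the run lengths from the example above. It is an illustration only: it stores the bit value alongside each run length, which is a simplification and not the format any particular fax standard uses. The names `rle_encode` and `rle_decode` are made up for this example.

```python
from itertools import groupby

def rle_encode(bits):
    """Turn a string of 0s and 1s into (bit, run length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]

def rle_decode(runs):
    """Rebuild the original string exactly from the (bit, run length) pairs."""
    return "".join(bit * length for bit, length in runs)

line = "0" * 100 + "1" * 373 + "0" * 96
runs = rle_encode(line)
print(runs)
print(rle_decode(runs) == line)
```

Because decoding gives back the original string bit for bit, the scheme is lossless.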

JPEG Compression

• JPEG – Joint Photographic Experts Group
• JPEG is a common standard for compressing and storing still images.
• Our eyes are not very sensitive to small changes in hue (chrominance), but we are sensitive to brightness (luminance).
• This means we can store a less accurate description of the hue of the picture (fewer bits) and our eyes will not notice it.
• This is a lossy compression scheme, because we have lost some of the original representation of the image and it cannot be reconstructed exactly.

JPEG Compression Scheme

• With JPEG we can get 20:1 compression ratio or more, without being able to see a difference.

• There are large areas of similar hues in pictures that can be lumped together without our noticing.

• Because of this, when run-length compression is used there is more compression because there is less variation in the hue.

This is what happens when they leave me alone with my niece Emmalia…

MPEG Compression Scheme

• MPEG (Moving Picture Experts Group)
• MPEG compression is similar to JPEG, but applied to movies.
  – JPEG compression is applied to each frame.
  – Then interframe coherency is used, which only records and transmits the “differences” between frames.
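The idea of recording only the differences between frames can be shown with a toy Python sketch. This is not how MPEG itself is implemented; it assumes a frame is just a short list of pixel values, and `frame_diff` and `apply_diff` are hypothetical names.

```python
def frame_diff(previous, current):
    """Record only the (position, new value) pairs that changed."""
    return [(i, new)
            for i, (old, new) in enumerate(zip(previous, current))
            if old != new]

def apply_diff(previous, diff):
    """Rebuild the next frame from the previous frame plus the differences."""
    frame = list(previous)
    for position, value in diff:
        frame[position] = value
    return frame

frame1 = [10, 10, 10, 200, 200, 10, 10, 10]
frame2 = [10, 10, 10, 10, 200, 200, 10, 10]   # the bright area moved right
diff = frame_diff(frame1, frame2)
print(diff)
print(apply_diff(frame1, diff) == frame2)
```

When little changes from one frame to the next, the list of differences is much shorter than the frame itself.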

Digitizing Sound

• There are many definitions of analog.
• (Our definition) An analog wave is a wave form analogous to the human voice.
• The telephone system uses an analog wave to transmit your voice over the telephone line to the Central Office.

Digitizing Sound

• Two parts of the wave:
  – Amplitude – Height of the wave, which equates to volume.
  – Frequency – Number of waves per second, which equates to pitch.
• Computers are digital devices, so the analog wave needs to be converted to a digital format.

Digitizing Sound

• Converting Analog to Digital requires three steps:
  1. Sampling
  2. Quantifying
  3. Coding

Digitizing Sound

• Sampling – To take measurements at regular intervals.
• The more samples you take, the more accurately you represent the original wave, and the more accurately you can reproduce the original wave.

Digitizing Sound

• Nyquist’s Theorem states that sampling at two times the highest frequency in the wave is sufficient for reconstructing the analog wave from the digital data.

• Humans can hear frequencies up to about 20,000 Hz, or 20,000 cycles per second.

• Using Nyquist’s Theorem, this means we need to sample the analog wave 40,000 times per second of sound.

• In other words, each second of sound gets sampled 40,000 times. (Actually, 44,100 times per second.)

1 second, 40,000 samples

Digitizing Sound

• Quantifying – This is the process of giving a value to each of the samples taken.

• The larger the range of numbers, the more detailed or specific you can be in your quantifying.

Digitizing Sound

• Coding – This is the process of taking the value quantified and representing it as a binary number.

• Audio CDs use 16 bits for coding.
• 16 bits gives 65,536 possible values (0 to 65,535).
• Actually:
  – 15 bits are used for the range of numbers
  – 1 bit is used for + (positive) or – (negative)
• 32,768 positive values and 32,768 negative values
• How many bits does it take to record one minute of digital audio?
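The three steps (sampling, quantifying, coding) can be tied together in a Python sketch. Several things here are assumptions for illustration: the 440 Hz test tone, the function names, and the bit layout, which follows the sign-bit-plus-15-bits description above and may differ from how real audio hardware arranges the bits.

```python
import math

SAMPLE_RATE = 44_100   # samples per second
TONE_HZ = 440          # assumed test tone, not from the slides

def sample(count):
    """Sampling: measure the wave at regular intervals (values -1.0 to 1.0)."""
    return [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE)
            for i in range(count)]

def quantify(value):
    """Quantifying: give the sample a whole-number value, -32767 to 32767."""
    return round(value * 32767)

def code(level):
    """Coding: 1 sign bit followed by 15 bits for the size of the number."""
    sign = "1" if level < 0 else "0"
    return sign + format(abs(level), "015b")

for value in sample(5):
    level = quantify(value)
    print(f"{value:+.4f} -> {level:+6d} -> {code(level)}")
```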

Digitizing Sound

• How many bits does it take to record one minute of digital audio?
• 1 minute = 60 seconds
• 44,100 samples per second
• This equals 2,646,000 samples.
• Each sample requires 16 bits.
• 2,646,000 samples times 16 bits per sample equals 42,336,000 bits.
• 42,336,000 bits times 2 for stereo equals 84,672,000 bits for 1 minute of audio.

• 84,672,000 bits divided by 8 bits per byte equals 10,584,000 bytes for 1 minute of audio. (More than 10 megabytes!)

• One hour of audio equals 635,040,000 bytes or 635 MB (megabytes)!
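The audio storage arithmetic can be reproduced with a short Python sketch. The function name `audio_bytes` and its parameter names are assumptions for this example; the default values are the figures used above.

```python
def audio_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Bytes needed for uncompressed audio of the given length."""
    bits = seconds * sample_rate * bits_per_sample * channels
    return bits // 8

print(f"1 minute: {audio_bytes(60):,} bytes")
print(f"1 hour:   {audio_bytes(60 * 60):,} bytes")
```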

MP3 Compression

• Compressing digital audio means to reduce the number of bits needed to represent the information.

• There are many sounds, frequencies, that the human ear cannot hear, some too high, some too low.

• These waves can be removed without impacting the quality of the audio.

• MP3 uses this sort of compression for a typical compression ratio of 10:1, so one minute of MP3 music takes about 1 megabyte instead of about 10 megabytes.

Suggested music for your enjoyment…

• Couple of the first concerts I ever went to…

Advantage of Digitizing Information

• A key advantage of the digital representation of information, images and sounds is that it can be reproduced exactly without losing a “bit” of the quality.
