Post on 18-Feb-2016
Bits, Bytes and Files
CS 1 Introduction to Computers and Computer Technology
Rick Graziani, Spring 2007
Rick Graziani graziani@cabrillo.edu 2
Digitization
• Digitize – To represent information with digits or symbols.
• Digitization does not require digits; any set of symbols will do.
• We use the digits 0 through 9 to digitize a phone number:
– (Area code) Prefix – Number
– (aaa) ppp – nnnn
• We use numbers, but we could have chosen any group of symbols.
Digitization
• The advantage of using digits instead of other symbols is that they can be listed in numerical order.
• Symbols can also have an order, known as a collating sequence, such as the player encoding used for DVD and VCR players.
Other Collating Sequences?
Encoding with Dice
• Because information can be digitized using any symbols, consider a representation based on using dice.
• Encoding - The process of putting information into a specific format (usually digital format).
• A die has 6 unique patterns on it.
• If we used a single die we could encode (represent) 6 pieces of information.
• This would not be enough to represent the entire alphabet.
A B C D E F
Encoding with Dice
• Two dice patterns together produce 6 x 6 = 36 different pattern sequences, because each die can be paired with each of 6 different patterns of the other die.
• Three dice would produce 6 x 6 x 6 = 216 different pattern sequences.
Pairing two dice patterns results in 6 x 6 = 36 possible pattern sequences.
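As a quick check of this counting rule, the Python sketch below lists every ordered sequence of die faces and counts them. The function name `dice_sequences` is made up for this illustration and is not from the slides.

```python
from itertools import product

FACES = range(1, 7)  # the 6 patterns on one die


def dice_sequences(num_dice):
    """Return every ordered sequence of faces for the given number of dice."""
    return list(product(FACES, repeat=num_dice))


for n in (1, 2, 3):
    print(n, "dice:", len(dice_sequences(n)), "pattern sequences")
```

Each extra die multiplies the count by 6, which is why the totals are 6, 36 and 216.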
Encoding with Dice
• If we used two dice to encode information…
A B C D
E F G…
Encoding with Dice
• Two dice patterns together produce 6 x 6 = 36 different pattern sequences.
• Notice that only 26 of the 36 unique pattern sequences are needed for the 26 letters in the alphabet.
Encoding with Dice
• Encode the phrase CABRILLO COLLEGE using two dice.
• Choose a sequence for the space.
__ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
C A B R I L L O space C O L L E G E
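The exercise can also be sketched in Python. The code table here is an assumption, since the dice chart from the slide did not survive: A = (1, 1), B = (1, 2), and so on in order, with the 27th pattern, (5, 3), chosen for the space. The names `DICE_CODE`, `encode` and `decode` are hypothetical.

```python
import string

# Assumed code table (hypothetical): A = (1, 1), B = (1, 2), ... in order,
# with the 27th pattern reserved for the space.
SYMBOLS = string.ascii_uppercase + " "
DICE_CODE = {ch: (i // 6 + 1, i % 6 + 1) for i, ch in enumerate(SYMBOLS)}
DECODE = {pair: ch for ch, pair in DICE_CODE.items()}


def encode(text):
    """Encode upper-case text as a list of (first die, second die) pairs."""
    return [DICE_CODE[ch] for ch in text]


def decode(pairs):
    """Turn a list of dice pairs back into text."""
    return "".join(DECODE[pair] for pair in pairs)


encoded = encode("CABRILLO COLLEGE")
print(encoded)
print(decode(encoded))
```

Any other assignment of the 36 patterns would work equally well, as long as the sender and receiver agree on the same table.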
Encoding with Dice
• The other 10 combinations could be used for the numbers 0 through 9.
• Or they could be used for punctuation, including an Escape Character or a Shift Key.
Encoding with Dice
• Using one of the keys as a special character key, like the Shift Key, can double the number of possibilities for all other keys.
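• Using two special keys together extends this further: each other key can be pressed alone, with either special key, or with both (e.g. Control-Shift-Delete), giving up to four times the possibilities.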
[Figure: dice chart for the lower-case letters a through z; the Shift pattern followed by a produces A.]
Moving to Bits…
• The most fundamental form of information is the presence or absence of a physical phenomenon:
– Is something there, or not?
– Is light detected, or not?
– Is it magnetized, or not?
• PandA is the name we use for the fundamental patterns of digital information based on the presence or absence of a physical phenomenon.
Presence   Absence
True       False
1          0
On         Off
Yes        No
+          -
Black      White
For        Against
Yang       Yin
Lisa       Bart
BIT – BInary digiT
• In computer technology, the PandA unit is known as a bit.
• Bit (Binary Digit) = Basic unit of information, representing one of two discrete states. The smallest unit of information within the computer.
• The only thing a computer understands.
• Abbreviation: b
• A bit has one of two values: 1 (ON) or 0 (OFF)
OFF ON
Bits
• The two patterns are known as the states of the bit.
• For example, magnetic encoding of information on tapes, floppy disks, and hard disks is done with positive or negative polarity.
The boxes illustrate positions where magnetism may be set and sensed; pluses (red) indicate magnetism of positive polarity (1 bit), interpreted as “present”, and minuses (blue) indicate negative polarity (0 bit), interpreted as “absent”.
0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1
Bits
Analogy: Sidewalk Memory
• Memory (temporary or permanent) is like a strip of concrete sidewalk.
• A stone on a section of sidewalk represents a 1 bit; the absence of a stone represents a 0 bit.
Sidewalk sections as a sequence of bits (1010 0010).
Combining Bit Patterns
• Using a single bit, with two discrete states, gives only two options (ON or OFF).
• Like using multiple dice, combining two or more bits gives us more options. (Bits only have 2 unique patterns, whereas the die had 6).
• 1 bit, 2 unique patterns: 0 or 1
• 2 bits, 4 unique patterns: 00, 01, 10 or 11
• 4 bits, 16 unique patterns: 0000, 0001, 0010, 0011, … 1111
• 8 bits, 256 unique patterns: 00000000, 00000001, 00000010, … 11111111
die
bit OFF ON
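The same counting can be tried in Python. This is only an illustrative sketch; the helper name `bit_patterns` is an assumption, not something from the slides.

```python
from itertools import product


def bit_patterns(num_bits):
    """Return all patterns of the given number of bits, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=num_bits)]


print(bit_patterns(2))        # ['00', '01', '10', '11']
print(len(bit_patterns(4)))   # 16
print(len(bit_patterns(8)))   # 256
```

Each added bit doubles the number of patterns, just as each added die multiplied the dice patterns by 6.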
Binary Math
• Why do 7 bits give you 128 unique patterns?
• Why do 8 bits give you 256 unique patterns?
Starting with all “off’s” or 0’s: 0 0 0 0 0 0 0 0
Add 1 until you get all 1’s (on’s): 1 1 1 1 1 1 1 1
You get 256 unique combinations of 0’s and 1’s.
Binary Math
0 + 0 = 0 (decimal 0)
0 + 1 = 1 (decimal 1)
1 + 1 = 10 (decimal 2)
10 + 1 = 11 (decimal 3)
11 + 1 = 100 (decimal 4)
100 + 1 = 101 (decimal 5)
101 + 1 = 110 (decimal 6)
111 + 1 = 1000
00000000 + 0 = 00000000
…
11111110 + 1 = 11111111
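Adding 1 in binary follows the same carry rule as decimal addition. The sketch below does the carrying by hand on a string of 0's and 1's; `add_one` is a hypothetical helper written for this illustration.

```python
def add_one(bits):
    """Add 1 to a string of binary digits, carrying exactly as on paper."""
    digits = list(bits)
    i = len(digits) - 1
    while i >= 0 and digits[i] == "1":
        digits[i] = "0"        # 1 + 1 = 10: write 0, carry the 1 to the left
        i -= 1
    if i >= 0:
        digits[i] = "1"        # the carry lands on a 0
    else:
        digits.insert(0, "1")  # the carry runs off the left end
    return "".join(digits)


value = "0"
for decimal in range(1, 9):
    value = add_one(value)
    print(value, "= decimal", decimal)
```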
Base 10 (Decimal) Number System
Digits (10): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Number of: 10^4 10^3 10^2 10^1 10^0
10,000’s 1,000’s 100’s 10’s 1’s
1 2 3 9 1 0 9 9 1 0 0
Base 10 (Decimal) Number System
Digits (10): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Number of: 10^4 10^3 10^2 10^1 10^0
10,000’s 1,000’s 100’s 10’s 1’s
4 1 0 8 3 8 2 1 0 0 0 9 1 0 0 1 0
Rick’s Number System Rules
• All digits start with 0
• A Base-n number system has n digits:
– Decimal: Base-10 has 10 digits
– Binary: Base-2 has 2 digits
– Hexadecimal: Base-16 has 16 digits
• The first column is always the number of 1’s
• Each of the following columns is n times the previous column (n = Base-n)
– Base 10: 10,000 1,000 100 10 1
– Base 2: 16 8 4 2 1
– Base 16: 65,536 4,096 256 16 1
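The column rule can be written as a few lines of Python. This is a sketch for illustration only; the function name `column_values` is an assumption.

```python
def column_values(base, num_columns):
    """Return column values from left to right, ending with the 1's column."""
    values = [1]                            # the first column is always the 1's
    for _ in range(num_columns - 1):
        values.insert(0, values[0] * base)  # next column is base times the previous
    return values


print(column_values(10, 5))  # [10000, 1000, 100, 10, 1]
print(column_values(2, 5))   # [16, 8, 4, 2, 1]
print(column_values(16, 5))  # [65536, 4096, 256, 16, 1]
```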
Base 2 (Binary) Number System
Digits (2): 0, 1
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
2                                            1     0
10                               1     0     1     0
17
70
130
255
Base 2 (Binary) Number System
Digits (2): 0, 1
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
2                                            1     0
10                               1     0     1     0
17                         1     0     0     0     1
70             1     0     0     0     1     1     0
130      1     0     0     0     0     0     1     0
255      1     1     1     1     1     1     1     1
Converting between Decimal and Binary
Digits (2): 0, 1
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
?              1     0     0     0     1     1     0
?                    1     0     1     0     0     0
?        0     0     0     0     0     0     0     0
?        1     0     0     0     0     0     0     0
172
192
Converting between Decimal and Binary
Digits (2): 0, 1
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
70             1     0     0     0     1     1     0
40                   1     0     1     0     0     0
0        0     0     0     0     0     0     0     0
128      1     0     0     0     0     0     0     0
172      1     0     1     0     1     1     0     0
192      1     1     0     0     0     0     0     0
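The column method in these tables can be sketched in Python for 8-bit values. The names `COLUMNS`, `to_binary` and `to_decimal` are made up for this example; it assumes numbers from 0 through 255 and bit patterns exactly 8 digits long.

```python
COLUMNS = [128, 64, 32, 16, 8, 4, 2, 1]


def to_binary(number):
    """Convert 0..255 to 8 bits by asking 'does this column fit?' left to right."""
    bits = ""
    for column in COLUMNS:
        if number >= column:
            bits += "1"
            number -= column
        else:
            bits += "0"
    return bits


def to_decimal(bits):
    """Add up the column values wherever an 8-bit pattern has a 1."""
    return sum(column for column, bit in zip(COLUMNS, bits) if bit == "1")


for n in (70, 40, 0, 128, 172, 192):
    print(n, to_binary(n))
print(to_decimal("10101100"))  # 172
```

For example, 172 = 128 + 32 + 8 + 4, so the 128's, 32's, 8's and 4's columns get a 1 and every other column gets a 0.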
Computers do Binary
0 1
• Bits have two values: OFF and ON
• The Binary number system (Base-2) can represent OFF and ON very well since it has two values, 0 and 1
– 0 = OFF
– 1 = ON
• Understanding Binary to Decimal conversion is critical in networking.
• Although we use decimal numbers in networking to display information such as IP addresses (LATER), they are transmitted as OFF’s and ON’s that we represent in binary.
Rick’s Program
Binary Math
Why do 7 bits give you 128 unique patterns?
Starting with all “off’s” or 0’s: 0 0 0 0 0 0 0 (decimal 0)
Add 1 until you get all 1’s (on’s): 1 1 1 1 1 1 1 (decimal 127)
You get 128 unique combinations of 0’s and 1’s.
With 7 bits, each with 2 possible values, 2^7 = 128
Number of: 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec.  64’s  32’s  16’s   8’s   4’s   2’s   1’s
0        0     0     0     0     0     0     0
127      1     1     1     1     1     1     1
Binary Math
Why do 8 bits give you 256 unique patterns?
Starting with all “off’s” or 0’s: 0 0 0 0 0 0 0 0 (decimal 0)
Add 1 until you get all 1’s (on’s): 1 1 1 1 1 1 1 1 (decimal 255)
You get 256 unique combinations of 0’s and 1’s.
With 8 bits, each with 2 possible values, 2^8 = 256
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
0        0     0     0     0     0     0     0     0
255      1     1     1     1     1     1     1     1
Digitizing Text
• One of the earliest uses of PandA was to digitize text (keyboard characters).
• We will look at digitizing images and video later.
• Assigning Symbols in United States:
– 26 upper case letters
– 26 lower case letters
– 10 numerals
– 20 punctuation characters
– 10 typical arithmetic characters
– 3 non-printable characters (enter, tab, backspace)
– Total: 95 symbols needed
ASCII-7
• In the early days, a 7 bit code was used, with 128 combinations of 0’s and 1’s, enough for a typical keyboard.
• The standard is known as ASCII (American Standard Code for Information Interchange).
• Each group of 7 bits was mapped to a single keyboard character.
0 = 0000000
1 = 0000001
2 = 0000010
3 = 0000011
…
127 = 1111111
Byte
Byte = A collection of bits (usually 7 or 8 bits) which represents a character, a number, or other information.
• More common: 8 bits = 1 byte
• Abbreviation: B
Bytes
1 byte (B)
Kilobyte (KB) = 1,024 bytes (2^10)
• “one thousand bytes”
• 1,024 = 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2
Megabyte (MB) = 1,048,576 bytes (2^20)
• “one million bytes”
Gigabyte (GB) = 1,073,741,824 bytes (2^30)
• “one billion bytes”
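These sizes are all powers of 2, which a short Python sketch can confirm. The `UNITS` table is just an illustrative name.

```python
# Each unit is 2 raised to a multiple of 10.
UNITS = {"Kilobyte (KB)": 10, "Megabyte (MB)": 20, "Gigabyte (GB)": 30}

for name, exponent in UNITS.items():
    print(f"{name} = 2^{exponent} = {2 ** exponent:,} bytes")
```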
ASCII-8
• IBM later extended the standard, using 8 bits per byte.
• This was known as Extended ASCII or ASCII-8
• This gave 256 unique combinations of 0’s and 1’s.
0 = 00000000
1 = 00000001
2 = 00000010
3 = 00000011
…
255 = 11111111
ASCII-8
Remember the Dice…
36 possible combinations 6 x 6 = 36
256 possible combinations 2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 = 256 or shown as 16 x 16 = 256
Try it!
• Write out Cabrillo College (Upper and Lower case) in bits (binary) using the chart above.
0100 0011  0110 0001  …
C          a
The answer!
0100 0011  C
0110 0001  a
0110 0010  b
0111 0010  r
0110 1001  i
0110 1100  l
0110 1100  l
0110 1111  o
0010 0000  space
0100 0011  C
0110 1111  o
0110 1100  l
0110 1100  l
0110 0101  e
0110 0111  g
0110 0101  e
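The lookup can also be done in Python, whose built-in `ord` returns a character's code number (the same as ASCII for these characters). The helper name `to_ascii_bits` is an assumption made for this sketch.

```python
def to_ascii_bits(text):
    """Return one 8-bit group per character, split into two 4-bit halves."""
    groups = []
    for ch in text:
        bits = format(ord(ch), "08b")  # e.g. 'C' is 67, which is '01000011'
        groups.append(bits[:4] + " " + bits[4:])
    return groups


for ch, bits in zip("Cabrillo College", to_ascii_bits("Cabrillo College")):
    print(bits, repr(ch))
```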
Unicode
• Although ASCII works fine for English, many other languages need more than 256 characters, including numbers and punctuation.
• Unicode was originally designed as a 16 bit representation, with 65,536 possible symbols; it has since been extended to cover even more.
• Unicode can handle all languages.
• www.unicode.org
Non-text Files: Representing Images and Sound
RGB Colors and Binary Representation
• A monitor’s screen is divided into a grid of small units called picture elements, or pixels. (See reading from Chapter 1).
• The more pixels per inch the better the resolution, the sharper the image.
• All colors on the screen are a combination of red, green and blue (RGB), just at various intensities.
• The intensity of each color (red, green and blue) is represented as a quantity from 0 through 255.
• The higher the number, the more intense the color.
• Black has no intensity or no color and has the value (0, 0, 0)
• White is full intensity and has the value (255, 255, 255)
• Between these extremes is a whole range of colors and intensities.
• Grey is somewhere in between, e.g. (127, 127, 127)
RGB Colors and Binary Representation
• You can use your favorite program that allows you to choose colors to view these various red, green and blue values.
RGB Colors and Binary Representation
• Let’s convert these colors from Decimal to Binary!
         Red   Green  Blue
Purple:  172   73     185
Gold:    253   249    88
RGB Colors and Binary Representation
         Red   Green  Blue
Purple:  172   73     185
Gold:    253   249    88
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
172
73
185

253
249
88
RGB Colors and Binary Representation
         Red   Green  Blue
Purple:  172   73     185
Gold:    253   249    88
Number of: 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Dec. 128’s  64’s  32’s  16’s   8’s   4’s   2’s   1’s
172      1     0     1     0     1     1     0     0
73       0     1     0     0     1     0     0     1
185      1     0     1     1     1     0     0     1

253      1     1     1     1     1     1     0     1
249      1     1     1     1     1     0     0     1
88       0     1     0     1     1     0     0     0
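The same conversions can be checked with Python's built-in `format`, which writes a number in binary. The names `COLORS` and `rgb_to_bits` are illustrative, not part of the slides.

```python
COLORS = {"Purple": (172, 73, 185), "Gold": (253, 249, 88)}


def rgb_to_bits(rgb):
    """Return the 8-bit binary form of each of the three intensities."""
    return tuple(format(value, "08b") for value in rgb)


for name, rgb in COLORS.items():
    print(name, rgb, rgb_to_bits(rgb))
```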
RGB Colors and Binary Representation
• We have now converted these colors from Decimal to Binary!
         Red       Green     Blue
Purple:  172       73        185
         10101100  01001001  10111001
Gold:    253       249       88
         11111101  11111001  01011000
• Why does this matter?
First a word about Pixels Per Inch
graphicssoft.about.com
• PPI stands for pixels per inch.
• PPI is a measurement of image resolution that defines the size an image will print.
• An image that is 1600 by 1200 pixels at 300 ppi will print at a size of 5.3 by 4 inches.
• Or it could be printed at 180 ppi for a printed size of 8.89 by 6.67 inches.
• The higher the PPI value, the better quality print you will get, but only up to a point.
• 300 ppi is generally considered the point of diminishing returns when it comes to ink jet printing of digital photos.
1600 pixels
1200 pixels
1600/300 = 5.3 inches; 1200/300 = 4 inches
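The print-size arithmetic is simply pixels divided by pixels per inch. A minimal Python sketch, with `print_size` as a made-up helper name:

```python
def print_size(width_px, height_px, ppi):
    """Return the printed (width, height) in inches at the given pixels per inch."""
    return width_px / ppi, height_px / ppi


for ppi in (300, 180):
    width, height = print_size(1600, 1200, ppi)
    print(f"{ppi} ppi: {width:.2f} by {height:.2f} inches")
```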
RGB Colors and Binary Representation
         Red       Green     Blue
Purple:  172       73        185
         10101100  01001001  10111001
24 bits for one pixel!
• “True color” systems require 3 bytes or 24 bits per pixel.
• There is also 8 bit and 16 bit color, which give you a smaller color palette.
RGB Colors and Binary Representation
         Red       Green     Blue
Purple:  172       73        185
         10101100  01001001  10111001  = 24 bits per pixel
• An 8 inch by 10 inch image scanned in at 300 pixels per inch:
– 8 x 300 = 2,400 pixels; 10 x 300 = 3,000 pixels
– 2,400 pixels by 3,000 pixels = 7,200,000 pixels or 7.2 megapixels
– At 24 bits per pixel: 7,200,000 x 24 = 172,800,000 bits or 21,600,000 bytes (21.6 megabytes)
• This affects RAM, video memory, disk space, bandwidth, …
10 inches or 3,000 pixels
8 inches or 2,400 pixels
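The storage calculation above can be repeated for any image size. This Python sketch assumes an uncompressed image; `image_size` and its parameter names are hypothetical.

```python
def image_size(width_in, height_in, ppi, bits_per_pixel=24):
    """Return (pixels, bits, bytes) for an uncompressed scanned image."""
    pixels = (width_in * ppi) * (height_in * ppi)
    bits = pixels * bits_per_pixel
    return pixels, bits, bits // 8


pixels, bits, size_bytes = image_size(8, 10, 300)
print(f"{pixels:,} pixels, {bits:,} bits, {size_bytes:,} bytes")
```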
Small (100 x 100, 72 ppi)
From NASA
Medium (720 x 720, 72 ppi)
Large (2242 x 2242, 300 ppi)
Just a part of the image!
File Compression
• A typical computer screen has only about 100 pixels per inch, not 300.
• Images still require a lot of memory and disk space, not to mention transferring images over the network or Internet.
• Compression – A means to change the representation to use fewer bits to store or transmit information.
• Information sent via a fax is either black or white, long strings of 0’s or long strings of 1’s.
Run-length encoding
• Many fax machines use run-length encoding.
• Run-length encoding uses binary numbers to specify how long the first sequence (run) of 0’s is, then how long the following sequence of 1’s is, then how long the following sequence of 0’s is, and so on.
• 0-100 1-373 0-96 etc.
• Fewer bits needed than sending 100 0’s, then 373 1’s etc.
• Run-length encoding is a lossless compression scheme, meaning that the original representation of 0’s and 1’s can be reconstructed exactly.
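The idea can be sketched in a few lines of Python. This is a simplified illustration that stores each run as a (bit, length) pair; it is not the actual encoding used by fax machines, and the function names are assumptions.

```python
from itertools import groupby


def rle_encode(bits):
    """Collapse a string of 0's and 1's into (bit, run length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]


def rle_decode(runs):
    """Rebuild the original string exactly from its runs (lossless)."""
    return "".join(bit * length for bit, length in runs)


scan_line = "0" * 100 + "1" * 373 + "0" * 96
runs = rle_encode(scan_line)
print(runs)                           # [('0', 100), ('1', 373), ('0', 96)]
print(rle_decode(runs) == scan_line)  # True
```

Three short pairs stand in for 569 individual bits, and decoding gives back exactly the original, which is what makes the scheme lossless.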
JPEG Compression
• JPEG – Joint Photographic Experts Group
• JPEG is a common standard for compressing and storing still images.
• Our eyes are not very sensitive to small changes in hue (chrominance), but we are sensitive to brightness (luminance).
• This means we can store a less accurate description of the hue of the picture (fewer bits) and our eyes will not notice it.
• This is a lossy compression scheme, because we have lost some of the original representation of the image and it cannot be reconstructed exactly.
JPEG Compression Scheme
• With JPEG we can get 20:1 compression ratio or more, without being able to see a difference.
• There are large areas of similar hues in pictures that can be lumped together without our noticing.
• Because of this, when run-length compression is used there is more compression, because there is less variation in the hue.
This is what happens when they leave me alone with my niece Emmalia…
MPEG Compression Scheme
• MPEG (Moving Picture Experts Group)
• MPEG compression is similar to JPEG, but applied to movies.
– JPEG compression is applied to each frame.
– Then interframe coherency is used, which only records and transmits the “differences” between frames.
Digitizing Sound
• There are many definitions of analog.
• (Our definition) An analog wave is a wave form analogous to the human voice.
• The telephone system uses an analog wave to transmit your voice over the telephone line to the Central Office.
Digitizing Sound
• Two parts of the wave:
– Amplitude – Height of the wave, which equates to volume.
– Frequency – Number of waves per second, which equates to pitch.
• Computers are digital devices, so the analog wave needs to be converted to a digital format.
Digitizing Sound
• Converting Analog to Digital requires three steps:
1. Sampling
2. Quantifying
3. Coding
Digitizing Sound
• Sampling – To take measurements at regular intervals.
• The more samples you take, the more accurately you represent the original wave, and the more accurately you can reproduce the original wave.
Digitizing Sound
• Nyquist’s Theorem states that sampling at two times the highest frequency to be captured is sufficient to reconstruct the analog wave from the digital data.
• Humans can hear frequencies up to about 20,000 Hz, or 20,000 cycles per second.
• Using Nyquist’s Theorem, this means we need to sample the analog wave 40,000 times per second of sound.
• In other words, each one second of sound gets sampled 40,000 times. (Audio CDs actually use 44,100 samples per second.)
1 second, 40,000 samples
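The sampling-rate rule is a one-line calculation. In this Python sketch the function and constant names are made up for illustration; the 20,000 Hz and 44,100 figures are the ones quoted on the slide.

```python
def minimum_sample_rate(highest_frequency_hz):
    """Nyquist: sample at two times the highest frequency to be captured."""
    return 2 * highest_frequency_hz


HUMAN_HEARING_LIMIT_HZ = 20_000
CD_SAMPLE_RATE_HZ = 44_100

needed = minimum_sample_rate(HUMAN_HEARING_LIMIT_HZ)
print(needed)                       # 40000
print(CD_SAMPLE_RATE_HZ >= needed)  # True
```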
Digitizing Sound
• Quantifying – This is the process of giving a value to each of the samples taken.
• The larger the range of numbers, the more detailed or specific you can be in your quantifying.
Digitizing Sound
• Coding – This is the process of taking the value quantified and representing it as a binary number.
• Audio CDs use 16 bits for coding.
• 16 bits gives 65,536 values, a range from 0 to 65,535.
• Actually:
– 15 bits are used for the range of numbers
– 1 bit is used for + (positive) or – (negative)
• 32,768 positive values and 32,768 negative values
• How many bits does it take to record one minute of digital audio?
Digitizing Sound
• How many bits does it take to record one minute of digital audio?
• 1 minute = 60 seconds
• 44,100 samples per second
• This equals 2,646,000 samples.
• Each sample requires 16 bits.
• 2,646,000 samples times 16 bits per sample equals 42,336,000 bits.
• 42,336,000 bits times 2 for stereo equals 84,672,000 bits for 1 minute of audio.
• 84,672,000 bits divided by 8 bits per byte equals 10,584,000 bytes for 1 minute of audio. (More than 10 megabytes!)
• One hour of audio equals 635,040,000 bytes or 635 MB (megabytes)!
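The same arithmetic as a small Python sketch, assuming uncompressed CD-style audio. The function name `audio_bytes` and its default parameters are illustrative choices based on the figures above.

```python
def audio_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Return the number of bytes of uncompressed audio."""
    bits = seconds * sample_rate * bits_per_sample * channels
    return bits // 8


print(f"{audio_bytes(60):,} bytes per minute")  # 10,584,000
print(f"{audio_bytes(3600):,} bytes per hour")  # 635,040,000
```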
MP3 Compression
• Compressing digital audio means to reduce the number of bits needed to represent the information.
• There are many sounds, frequencies, that the human ear cannot hear, some too high, some too low.
• These waves can be removed without impacting the quality of the audio.
• MP3 uses this sort of compression for a typical compression ratio of 10:1, so one minute of MP3 music takes about 1 megabyte instead of 10 megabytes.
Suggested music for your enjoyment…
• Couple of the first concerts I ever went to…
Advantage of Digitizing Information
• A key advantage of the digital representation of information, images and sounds is that it can be reproduced exactly without losing a “bit” of the quality.