IZEE UNIT 2


  • 8/8/2019 IZEE UNIT 2

    DATA REPRESENTATION
    Unit 2

    By: Dipti Purohit

    Izee Business School, Bangalore

    UNIT COVERS

    Data Representation

    Binary Data Representation

    Binary Coding Schemes: EBCDIC, ASCII and Unicode

    Data Forms

    Human communication
        Includes language, images and sounds

    Computers
        Process and store all forms of data in binary format

    Conversion to a computer-usable representation using data formats
        Data formats define the different ways human data may be represented, stored and processed by a computer

    Data conversion and representation

    Data Storage

    Data formats

    Proprietary formats
        Unique to a product or company
        E.g., Microsoft Word, WordPerfect

    Standards (evolve in two ways):
        Proprietary formats become de facto standards (e.g., Adobe PostScript)
        Invented by an international standards organization (e.g., Moving Picture Experts Group, MPEG)

    Common Data Representations

    Type of Data                 Standard(s)
    Alphanumeric                 Unicode, ASCII, EBCDIC
    Image (bitmapped)            GIF (Graphics Interchange Format), TIF (Tagged Image File Format), PNG (Portable Network Graphics), JPEG
    Image (object)               PostScript, SWF (Macromedia Flash), SVG
    Outline graphics and fonts   PostScript, TrueType
    Sound                        WAV, AVI, MP3, MIDI, WMA
    Page description             PDF (Adobe Portable Document Format), HTML, XML
    Video                        QuickTime, MPEG-2, RealVideo, WMV

    Alphanumeric Data

    Characters (r, T), number digits (0..9), punctuation (!, ;), special-purpose characters ($, &)

    Four codes/standards to represent letters and numbers:
        BCD (Binary-Coded Decimal)
        Unicode
        ASCII (American Standard Code for Information Interchange)
        EBCDIC (Extended Binary Coded Decimal Interchange Code)

    Data Representation: How do computers represent data digitally?

    Another example is a light fixture

    A standard light switch is similar to digital
        It is either on or off: 1 or 0

    A dimmer light switch is similar to analog
        Its rotating dial can be turned to many different positions to make the light varying degrees of bright or dim

    How can a computer represent numbers?

    Unlike the decimal system (base 10), the binary number system (base 2) uses only two digits: 0 and 1

    The following table lists some decimal numbers and their binary equivalent:
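    A table of decimal numbers and their binary equivalents can be regenerated with a short Python sketch. The helper name `binary_table` and the range 0 to 10 are assumptions chosen for illustration, not taken from the slide.

```python
def binary_table(limit):
    """Return (decimal, binary string) pairs for 0..limit."""
    # format(n, "b") writes n in base 2 without any prefix
    return [(n, format(n, "b")) for n in range(limit + 1)]


print("Decimal  Binary")
for decimal, binary in binary_table(10):
    print(f"{decimal:>7}  {binary:>6}")
```

    Each row pairs a base-10 number with the same quantity written using only the digits 0 and 1.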

    How can a computer represent words and letters using bits?

    Character data is composed of letters, symbols, and numbers that will not be used in arithmetic operations

    Numeric data is used in arithmetic calculations, and is encoded differently

    ASCII (American Standard Code for Information Interchange) requires only 7 bits for each character

    Extended ASCII uses 8 bits for each character
        Used in most personal computers
        See the code on the next slide

    How can a computer represent words and letters using bits?
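    As a minimal sketch of how characters map to bits, the Python snippet below looks up each character's ASCII code with the built-in `ord` and writes it as a 7-bit or 8-bit pattern. The function name `ascii_bits` is hypothetical and only used for illustration.

```python
def ascii_bits(text, width=7):
    """Return the binary code of each ASCII character in text.

    width=7 gives plain ASCII; width=8 gives the same codes with a leading 0.
    """
    codes = []
    for ch in text:
        code = ord(ch)  # numeric code of the character
        if code > 127:
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        codes.append(format(code, f"0{width}b"))
    return codes


print(ascii_bits("Hi"))     # ['1001000', '1101001']
print(ascii_bits("Hi", 8))  # ['01001000', '01101001']
```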

    How can a computer represent words and letters using bits?

    EBCDIC (Extended Binary-Coded Decimal Interchange Code) is an alternative 8-bit code used by older IBM systems

    Unicode was designed around 16-bit codes, which provide codes for about 65,000 characters (current versions extend beyond this range), a bonus for representing alphabets of multiple languages
        Used for foreign language support

    How does a computer convert sounds and pictures into codes?

    Sounds and pictures must be transformed into a format the computer can understand

    A computer must digitize colors, notes, and instrument sounds into 1s and 0s

    For example, a red dot on your screen might be represented by 1100, a green dot by 1101

    How does a computer store all these codes?

    Data is stored on a computer in a file

    Data files might contain the text of a document, the numbers for a calculation, the contents of a web page, or the notes of a music clip as binary code

    Executable files contain the programs or instructions that tell the computer how to perform a specific task, for example how to display and print text

    Data files have a file header, which tells the computer how the binary code is used to represent the data. The header tells the computer if the binary code represents a music file, a graphic, a text document, etc.

    Quantifying Bits and bytes: How can I tell the difference between bits and bytes?

    A bit is one binary digit (b), e.g. 0

    A byte is 8 bits (B), e.g. 0010 0100

    A nibble is 4 bits, e.g. 0011

    Quantifying Bits and bytes: How can I tell the difference between bits and bytes?

    Prefixes
        Kilo- means a thousand
        Mega- means a million
        Giga- means a billion

    Kilobit (Kb) is approx. 1,000 bits (1,024)

    Kilobyte (KB) is approx. 1,000 bytes (1,024)

    Megabyte (MB) is approx. 1,000,000 bytes (1,048,576)

    Gigabyte (GB) is approx. 1,000,000,000 bytes (1,073,741,824)
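    The exact figures in parentheses are powers of two. The Python sketch below recomputes them and converts a size in bits to bytes; the constant names and the helper `bits_to_bytes` are illustrative assumptions.

```python
KILO = 2 ** 10  # 1,024
MEGA = 2 ** 20  # 1,048,576
GIGA = 2 ** 30  # 1,073,741,824


def bits_to_bytes(bits):
    """Convert a number of bits to bytes (8 bits per byte)."""
    return bits / 8


print(KILO, MEGA, GIGA)
# 100 megabits expressed in megabytes
print(bits_to_bytes(100 * MEGA) / MEGA)  # 12.5
```

    Because a byte is 8 bits, a quantity stated in bits (lowercase b) is one eighth of the same figure stated in bytes (uppercase B).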

    Self Quiz Questions

    1. A(n) _______ device works with discrete numbers, whereas a(n) _______ device works with continuous data.

    2. The _______ number system represents numeric data as a series of 0s and 1s.

    3. Most personal computers use the _______ code to represent character data.

    4. 100 Mb is larger than 100 MB. True or false?

    5. A prefix that means a million bytes is _______.

    Self Quiz Answers

    1. A(n) digital device works with discrete numbers, whereas a(n) analog device works with continuous data.

    2. The binary number system represents numeric data as a series of 0s and 1s.

    3. Most personal computers use the extended ASCII code to represent character data.

    4. 100 Mb is larger than 100 MB. False

    5. A prefix that means a million bytes is Mega.

    Binary Concepts

    0 -- OFF

    1 -- ON

    DATA (in binary digits)

    Data Representation

    000111000111010101

    011101000110100101010

    010010101010101

    0100110010001001001

    00111100110100101010

    100101011010101010000

    #include <stdio.h>

    int main(void)
    {
        /* Text a person writes; the machine stores it as binary digits */
        printf("Hello\n");
        printf("We are enjoying a world of alphabetical coding\n");
        return 0;
    }

    Data Representation

    Digital computers use binary code to represent characters.

    Binary code is made up of binary digits or bits.

    A string of "0s" and "1s" is used to represent characters.

    A byte is a sequence of 8 bits.

    Most computers have words that consist of 8 or 16 bits.

    In large computers the number of bits per word could be 16 or 32 bits.

    Data representation (Contd.)

    When data is keyed in, each keystroke is converted to a binary character code and transmitted to the computer.

    Each character sent to the printer, screen or disk is communicated in binary code.

    While displaying or printing, the character is converted back to human-readable form.

    Data Storage (Contd.)

    During calculation the decimal number is converted to its binary equivalent.

    After calculation the result is converted back to its decimal equivalent.

    Number systems

    The additive approach - Numbers earlier consisted of symbols, e.g. the Roman number system - I for 1, II for 2, III for 3 etc.

    Positional numbering - Symbols represent different values depending on the position they occupy, e.g. the decimal system

    Decimal Number System

    In the decimal number system the successive positions to the left of the decimal point represent units, tens, hundreds, thousands etc.

    (3 * 100) + (6 * 10) + (5 * 1) = 365

    The position of a digit in the number affects its value.

    Number systems of this kind are therefore called positional number systems.

    [Figure: the term (6 * 10), labeled "Base" and "Position number"]

    Decimal Number System (Contd.)

    The value of each digit in the number system is determined by:

    a) The digit itself

    b) The position of the digit in the number

    c) The base/radix of the system

    Binary Number System

    The binary number system has a base of two and the symbols used are 0 and 1.

    In this number system, as we move to the left, the value of each place is two times greater than its predecessor because the base is two.

    Thus the values of the places are: 64 32 16 8 4 2 1

    Binary number: 0001111001010111
    (most significant bit on the left, least significant bit on the right)

    Octal number systems

    Uses a base of 8

    Values increase from right to left: 1, 8, 64, 512 ...

    Binary   Octal
    000      0
    001      1
    010      2
    011      3
    100      4
    101      5
    110      6
    111      7

    Octal Number System

    To convert a number from binary to octal and vice versa, the following table must be kept in mind:

    Binary   Octal
    000      0
    001      1
    010      2
    011      3
    100      4
    101      5
    110      6
    111      7

    Hexadecimal Number Systems

    Hexadecimal   Decimal
    0             0
    1             1
    2             2
    3             3
    4             4
    5             5
    6             6
    7             7
    8             8
    9             9
    A             10
    B             11
    C             12
    D             13
    E             14
    F             15

    Hex. Number Systems (Contd.)

    Uses a base of 16

    The 16 symbols required for the hexadecimal number system are obtained by using the digits 0 to 9 and the letters A, B, C, D, E and F

    Converting hexadecimal to decimal: the decimal equivalent of the hexadecimal number A0119 is

    (10 * 65,536) + (0 * 4,096) + (1 * 256) + (1 * 16) + (9 * 1)

    = 655,360 + 0 + 256 + 16 + 9

    = 655,641
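    The same expansion can be sketched in Python. The helper `hex_to_decimal` is a hypothetical name used for illustration; the built-in `int(text, 16)` is shown alongside as a cross-check.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(hex_string):
    """Sum digit * 16**position, counting positions from the right."""
    total = 0
    for position, symbol in enumerate(reversed(hex_string.upper())):
        digit = HEX_DIGITS.index(symbol)  # A -> 10, B -> 11, ...
        total += digit * 16 ** position
    return total


print(hex_to_decimal("A0119"))  # 655641
print(int("A0119", 16))         # 655641, using the built-in conversion
```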

    Converting binary numbers to decimal value
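    A minimal Python sketch of binary-to-decimal conversion, assuming the number is given as a string of 0s and 1s: it walks the bits from right to left and adds the place value (1, 2, 4, 8, ...) wherever the bit is 1. The function name `binary_to_decimal` is illustrative.

```python
def binary_to_decimal(bits):
    """Add up the place values of every 1 bit, starting at the right."""
    total = 0
    place_value = 1  # value of the least significant position
    for bit in reversed(bits):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        if bit == "1":
            total += place_value
        place_value *= 2  # each place to the left is worth twice as much
    return total


print(binary_to_decimal("110100"))            # 52
print(binary_to_decimal("0001111001010111"))  # 7767
```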

    Decimal to Binary Conversion

    Divide the decimal number by the base of the required number system

    Note the remainder in one column and divide the quotient again by the base

    Keep repeating this process until the quotient is reduced to zero

    Reading the remainders in reverse order gives the binary equivalent

    Decimal to Binary Conversion

    E.g. converting the decimal number 52 to its binary equivalent.

               Remainder
    2 | 52
    2 | 26 | 0
    2 | 13 | 0
    2 |  6 | 1
    2 |  3 | 0
    2 |  1 | 1
      |  0 | 1

    Reading the remainders from bottom to top, the binary equivalent of the decimal number 52 is 110100
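    The repeated-division procedure can be sketched in Python as follows. The function name `decimal_to_base` and the support for bases up to 16 are assumptions for illustration; the slides only work through base 2.

```python
def decimal_to_base(number, base=2):
    """Convert a non-negative integer by repeated division by the base."""
    if number < 0 or not 2 <= base <= 16:
        raise ValueError("expected number >= 0 and 2 <= base <= 16")
    if number == 0:
        return "0"
    digits = "0123456789ABCDEF"
    remainders = []
    while number > 0:
        # divmod gives the next quotient and the remainder to note down
        number, remainder = divmod(number, base)
        remainders.append(digits[remainder])
    # remainders were collected last digit first, so read them in reverse
    return "".join(reversed(remainders))


print(decimal_to_base(52))  # 110100
```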

    Binary to Hexadecimal

    Each hexadecimal digit is represented by 4 binary digits.

    Binary   Hexadecimal
    0000     0
    0001     1
    0010     2
    0011     3
    0100     4
    0101     5
    0110     6
    0111     7
    1000     8
    1001     9
    1010     A
    1011     B
    1100     C
    1101     D
    1110     E
    1111     F

    Binary to Hexadecimal (Contd.)

    Split the quantity into groups of four, working from right to left

    Each group of four is directly converted into its hexadecimal equivalent

    Add zeros to the left of the number if necessary

    E.g. Binary 10101011000010 and its hexadecimal equivalent

    0010 1010 1100 0010

    2    A    C    2
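    The grouping steps above can be sketched in Python. The function name `binary_to_hex` is hypothetical; the input is assumed to be a string of 0s and 1s without a fractional part.

```python
def binary_to_hex(bits):
    """Pad to a multiple of four bits, then convert each group of four."""
    padding = (-len(bits)) % 4          # zeros needed on the left
    padded = "0" * padding + bits
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # each 4-bit group maps to exactly one hexadecimal digit
    return "".join(format(int(group, 2), "X") for group in groups)


print(binary_to_hex("10101011000010"))  # 2AC2
```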

    Hexadecimal to Binary

    Write the binary equivalent of each hexadecimal digit in groups of four

    E.g. hexadecimal 191A412C

    0001 1001 0001 1010 0100 0001 0010 1100

    Thus the required binary number can be written as:

    11001000110100100000100101100

    The leading zeroes are omitted
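    In the reverse direction, a minimal Python sketch (the name `hex_to_binary` is an assumption) expands each hexadecimal digit into its four-bit group and then drops the leading zeros.

```python
def hex_to_binary(hex_string):
    """Expand every hex digit to four bits and strip leading zeros."""
    groups = [format(int(digit, 16), "04b") for digit in hex_string]
    # keep a single "0" if the whole number is zero
    return "".join(groups).lstrip("0") or "0"


print(hex_to_binary("191A412C"))  # 11001000110100100000100101100
```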

    Converting from Binary to Octal

    The binary number must be divided into groups of three starting from the binary point: to the right in case of the fractional portion and to the left in case of the integer portion.

    Each group can then be replaced with its octal equivalent.

    We may add zeros to the left of the number if required.

    For example:

    Binary 101010101010100

    101 010 101 010 100

    5   2   5   2   4

    52524 is the octal equivalent of the given binary number.
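    A Python sketch of the same grouping, assuming the binary number is a string that may contain a point. The integer part is padded on the left and the fractional part on the right, so that groups of three line up with the point; the function name `binary_to_octal` is illustrative.

```python
def binary_to_octal(bits):
    """Convert a binary string (optionally with a point) to octal."""
    integer, _, fraction = bits.partition(".")
    integer = "0" * ((-len(integer)) % 3) + integer     # pad on the left
    fraction = fraction + "0" * ((-len(fraction)) % 3)  # pad on the right

    def convert(part):
        # replace each group of three bits with one octal digit
        return "".join(str(int(part[i:i + 3], 2)) for i in range(0, len(part), 3))

    result = convert(integer) or "0"
    if fraction:
        result += "." + convert(fraction)
    return result


print(binary_to_octal("101010101010100"))  # 52524
print(binary_to_octal("1101.1"))           # 15.4
```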

    Converting from Octal to Binary

    Each octal digit is replaced with the appropriate triple of binary digits.

    For example:

    6   5

    110 101

    Thus the binary equivalent of the octal number 65 is 110101.

    ENCODING SCHEMES

    In the previous slides, you learned how numbers are stored in computers as binary code. The following slides cover encoding schemes.

    Representing Characters

    To allow consistent data transfer among computer systems (such as using the ftp command), rules on how characters are assigned binary code combinations needed to be created.

    These rules or encoding standards have evolved over a period of time and are still evolving.

    We will discuss 5 popular encoding standards: ASCII, ISO 8859, EBCDIC, UNICODE and UTF-8.

    Encoding Scheme #1

    ASCII (American Standard Code for Information Interchange)

    A consistent set of rules in which series of 0s and 1s are used to represent characters. This allows uniformity in data transfer among computer systems.

    Evolved from computers that could only work on 7-bit codes at a time.

    Computers then evolved into 8-bit machines, so a leading 0 (zero) was placed at the beginning to keep the original ASCII codes, while the extra bit allowed for additional characters, which are often referred to as the extended ASCII character set.

    Programmers writing programs may need to access these ASCII characters (or control codes) by decimal, octal or hexadecimal number, so an ASCII table is available to provide assistance. You can issue the command man ascii to find out more information regarding this encoding scheme.
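    As an illustration of looking up a character by number, the Python sketch below builds one row of an ASCII table: the decimal, octal and hexadecimal forms of a character's code. The helper name `ascii_row` is an assumption.

```python
def ascii_row(character):
    """Return (decimal, octal, hexadecimal) for one ASCII character."""
    code = ord(character)
    if code > 127:
        raise ValueError(f"{character!r} is outside 7-bit ASCII")
    return (code, format(code, "o"), format(code, "X"))


print(ascii_row("A"))   # (65, '101', '41')
print(ascii_row("\n"))  # (10, '12', 'A'), the line feed control code
```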

    Encoding Scheme #2

    Extended ASCII: ISO 8859 (International Organization for Standardization, standard 8859)

    An encoding scheme that provides additional characters using the extra bit added to the existing 7-bit ASCII code.

    Set 1 (ISO 8859-1) and, more recently, set 15 (ISO 8859-15) are used to represent most western European symbols.

    Other sets in between include set 2 (ISO 8859-2), used to represent most eastern European symbols, and set 10 (ISO 8859-10), used to represent Lappish/Nordic/Eskimo symbols, and so forth.

    ISO 8859 tables are accessible from the internet by performing a net search on "ISO 8859 Tables". There are also links on the UNX122 webpage.

    You can issue the command man iso_8859_1, etc. to find out more information regarding these encoding schemes.
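    A small Python sketch of how the extra bit is used, relying on the `iso-8859-1` and `iso-8859-15` codecs that ship with Python. The helper `single_byte_code` and the sample characters are illustrative choices, not part of the slides.

```python
def single_byte_code(character, encoding):
    """Return the one-byte code of a character in the given encoding."""
    encoded = character.encode(encoding)
    if len(encoded) != 1:
        raise ValueError(f"{encoding} does not use one byte for {character!r}")
    return encoded[0]


samples = [("A", "ascii"), ("A", "iso-8859-1"),
           ("é", "iso-8859-1"), ("€", "iso-8859-15")]
for character, encoding in samples:
    code = single_byte_code(character, encoding)
    # codes above 127 have the eighth (leading) bit set to 1
    print(character, encoding, format(code, "08b"))
```

    The letter A keeps its 7-bit ASCII code with a leading 0, while the accented letter and the euro sign occupy codes whose leading bit is 1.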

    Encoding Scheme #3

    EBCDIC (Extended Binary Coded Decimal Interchange Code)

    An 8-bit binary code used on IBM mainframe computers.

    The rules for using 0s and 1s in a binary code to represent characters are different from ASCII, but there are programs (including the FTP command) that can convert ASCII files to EBCDIC files to allow transfer of data between different types of computers.

    An EBCDIC table also exists (see link on UNX122 webpage).
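    To show that the same text has different byte values in the two schemes, the Python sketch below converts ASCII bytes to EBCDIC. It assumes the `cp037` codec bundled with Python, which is one of several EBCDIC code pages; the function name `ascii_to_ebcdic` is hypothetical.

```python
def ascii_to_ebcdic(data, codec="cp037"):
    """Re-encode ASCII bytes as EBCDIC bytes (code page 037 by default)."""
    text = data.decode("ascii")  # bytes -> characters
    return text.encode(codec)    # characters -> EBCDIC bytes


ascii_bytes = b"HELLO"
ebcdic_bytes = ascii_to_ebcdic(ascii_bytes)
print(ascii_bytes.hex())   # 48454c4c4f
print(ebcdic_bytes.hex())  # c8c5d3d3d6
```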

    Encoding Schemes for the Future

    As economies move to a more global environment there is a move towards an encoding scheme that will simultaneously incorporate all the world language symbols into one large encoding scheme.

    This would avoid the fragmented encoding schemes previously discussed and allow programs to easily translate and transfer data among different countries.

    Encoding Scheme #4

    UNICODE (Universal Character Set / ISO 10646)

    Originally a 16-bit encoding scheme able to represent over 65,000 characters; the current standard extends the code space beyond 16 bits.

    This encoding scheme allows most world language symbols due to the additional 8 bits in the code.

    Unicode is in use on many PCs running Windows 98 and up, and is considered to be the latest trend in data representation to foster global communication.
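    A brief Python sketch of Unicode code points and their storage size, using the built-in `ord` and the `utf-8` and `utf-16-be` codecs. The helper name `describe` and the sample characters are assumptions for illustration.

```python
def describe(character):
    """Return (code point label, UTF-8 byte count, UTF-16 byte count)."""
    code_point = ord(character)
    utf8_length = len(character.encode("utf-8"))
    utf16_length = len(character.encode("utf-16-be"))  # no byte-order mark
    return (f"U+{code_point:04X}", utf8_length, utf16_length)


for character in ["A", "€", "\U0001F600"]:
    print(describe(character))
```

    Characters with code points up to U+FFFF fit in a single 16-bit unit, while code points above that range need two 16-bit units in UTF-16.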


    THANK YOU