Data Representation, Data Structures, and Multi-file compilation.

Page 1: Data Representation, Data Structures, and Multi-file compilation.

Data Representation, Data Structures, and Multi-file compilation

Page 2: Data Representation, Data Structures, and Multi-file compilation.

Data Representation:

Binary representation

Octal, Hexadecimal

Data types

Page 3: Data Representation, Data Structures, and Multi-file compilation.

Memory concepts

Every piece of information stored on a computer is encoded as a combination of ones and zeros.

These ones and zeros are called bits.

One byte is a sequence of eight consecutive bits.

A word is some number (typically 4) of consecutive bytes.

Page 4: Data Representation, Data Structures, and Multi-file compilation.

Binary representation

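A single (unsigned) byte of memory, drawn here with bit 0 (the least significant bit) on the left:

bit 0 bit 1 bit 2 bit 3 bit 4 bit 5 bit 6 bit 7

1 0 0 1 0 1 1 1

In decimal representation, this number is:

1*2^0 + 0*2^1 + 0*2^2 + 1*2^3 + 0*2^4 + 1*2^5 + 1*2^6 + 1*2^7 = 233

Written in the usual order, most significant bit first, the same byte is 11101001.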

Page 5: Data Representation, Data Structures, and Multi-file compilation.

Binary representation

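A single (signed) byte of memory, drawn with bit 0 (the least significant bit) on the left and bit 7 holding the sign:

bit 0 bit 1 bit 2 bit 3 bit 4 bit 5 bit 6 bit 7

1 0 0 1 0 1 1 +/-

In decimal representation, this number is:

1*2^0 + 0*2^1 + 0*2^2 + 1*2^3 + 0*2^4 + 1*2^5 + 1*2^6

= +/- 105

One bit must be used to store the sign of the number. (This sign-and-magnitude picture is a simplification; most machines actually store negative integers in two's complement.)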

Page 6: Data Representation, Data Structures, and Multi-file compilation.

Binary representation, cont.

What is the range of numbers that can be stored in a single signed/unsigned byte?

How would you write a program to convert an arbitrary base 10 number to binary?

How would you write a program to convert an arbitrary binary number to base 10?

What is the effect of right/left shifting bits (assuming the vacated bit is set to zero)?

Page 7: Data Representation, Data Structures, and Multi-file compilation.

Octal representation

Octal representation: base 8

Just a simple extension of binary and decimal, but using only the digits 0-7.

Best seen with an example:

What is the value of the octal number 711?

1*8^0 + 1*8^1 + 7*8^2 = 457

What is the octal representation of the number 64?

100 (since 0*8^0 + 0*8^1 + 1*8^2 = 64)

Try this in C using the "%o" format expression with printf: printf("%o\n", 457);

Page 8: Data Representation, Data Structures, and Multi-file compilation.

Hexadecimal representation

Hexadecimal representation: base 16

Just a simple extension of binary, octal, and decimal, but using 16 "digits": 0-9,a,b,c,d,e,f

Example: What is the value of the hexadecimal number 10ef?

15*16^0 + 14*16^1 + 0*16^2 + 1*16^3 = 4335

Try this in C using the "%x" format expression with printf: printf("%x\n", 4335);

Page 9: Data Representation, Data Structures, and Multi-file compilation.

Understanding datatypes at a more fundamental level

int and char

Page 10: Data Representation, Data Structures, and Multi-file compilation.

char revisited

Before doing some example bitwise operations, we first revisit our simple C datatypes to understand them at a deeper level.

Recall that we have just a few basic types:

char, int, float, double

Recall also that char represents a single byte of storage, while int is typically 4 bytes

Important: Do not be misled by the name "char" ; the char datatype is really no different from int (other than its storage capacity)

What do I mean by "no different from int"? We explore this with some examples on the next slide

Page 11: Data Representation, Data Structures, and Multi-file compilation.

Char vs. int

Consider the following declarations: int j = 4; char k = 4;

In memory, these appear as:

j: 00000000 00000000 00000000 00000100

k: 00000100

They are both perfectly valid ways to represent the number 4. In one case (int), there is much more "wasted" memory. In the other case (char), there is a much stricter limit on how large the number can be if you choose to change it.

Page 12: Data Representation, Data Structures, and Multi-file compilation.

Char, cont.

Why would you not always use char to represent a small number, such as 4?

Consider what happens in this case:

char j = 4;

j = j + 300; /* bad! Can't store 304 in a char! */

So, it is safer to use a larger type, such as int, unless you are 100% sure that the char limit will never be exceeded in the program!

Page 13: Data Representation, Data Structures, and Multi-file compilation.

Char as "character" storage

So, if char is just an abbreviated int, what does it have to do with characters?

The answer is twofold:

First, char can do nothing special with characters that int can't do.

Both store the equivalent ASCII integer code when single quotes are placed around a single character in an assignment.

Example: char c = 'e'; /* store the integer (ASCII) code for the character e in the byte c */

int c = 'e'; /* same as above, but store the integer in a 4-byte (i.e. int) sequence */

Page 14: Data Representation, Data Structures, and Multi-file compilation.

Char example

The best way to understand this is with a simple example.

/* char_int1.c */
#include <stdio.h>

int main(void)
{
    char c;
    int j;

    j = 100;
    c = 100;                      /* arbitrary choice, small enough for a char */
    printf("%d %d\n", j, c);      /* print j and c as decimal ints */
    printf("%c %c\n", j, c);      /* print j and c as characters */

    j = 'h';
    c = 'h';                      /* change assignment */
    printf("%c %c\n", j, c);      /* what is printed here? */
    printf("%d %d\n", j, c);      /* print ASCII code for 'h' */
    return 0;
}

Page 15: Data Representation, Data Structures, and Multi-file compilation.

#include <stdio.h>
#include <stdlib.h>               /* for atoi() and exit() */

int main(int argc, char* argv[])
{
    int input;

    if (argc != 2) {
        printf("%s\n", "Must enter a single argument");
        exit(1);
    }
    input = atoi(argv[1]);        /* grab input as integer */
    if (input > 255 || input < 0) {
        printf("%s\n", "Must enter a number from 0 to 255");
        exit(1);
    }
    printf("%s: %c\n", "The corresponding character is", input);
    return 0;
}

Page 16: Data Representation, Data Structures, and Multi-file compilation.

#include <stdio.h>
#include <stdlib.h>               /* for exit() */

int main(int argc, char* argv[])
{
    char input;

    if (argc != 2) {
        printf("%s\n", "Must enter a single argument");
        exit(1);
    }
    input = *argv[1];             /* grab single character from the command line */
    printf("%s %c: %d\n", "The ascii code for", input, input);
    return 0;
}

Note: We will not understand why the * needs to be here until we study pointers. However, you should be able to write an equivalent code using scanf.

Page 17: Data Representation, Data Structures, and Multi-file compilation.

Very low-level stuff

Bitwise operations in C

Page 18: Data Representation, Data Structures, and Multi-file compilation.

Bitwise operations

C contains six operators for performing bitwise operations on integers:

& Bitwise AND: if both bits are 1, the result is 1

| Bitwise OR: if either bit is 1, the result is 1

^ Bitwise XOR (exclusive OR): if one and only one bit equals 1, the result is 1

~ Bitwise invert (complement): if the bit is 1, the result is 0; if the bit is 0, the result is 1

<< n Left shift n places

>> n Right shift n places

Page 19: Data Representation, Data Structures, and Multi-file compilation.

Bitwise operations

Bitwise operations are considered "low-level" programming by today's standards. For many programs, manipulating individual bits is never necessary.

Sometimes, this level of control is needed for memory or performance optimization

In any case, it is very important for a conceptual understanding of programming

Page 20: Data Representation, Data Structures, and Multi-file compilation.

Bitwise examples: AND

Bitwise AND:

char j = 11; char k = 14;

j: 0 0 0 0 1 0 1 1

k: 0 0 0 0 1 1 1 0

---------------------

0 0 0 0 1 0 1 0 = 10

Page 21: Data Representation, Data Structures, and Multi-file compilation.

OR

Bitwise OR:

char j = 11; char k = 14;

j: 0 0 0 0 1 0 1 1

k: 0 0 0 0 1 1 1 0

---------------------

0 0 0 0 1 1 1 1 = 15

Page 22: Data Representation, Data Structures, and Multi-file compilation.

XOR

Bitwise XOR:

char j = 11; char k = 14;

j: 0 0 0 0 1 0 1 1

k: 0 0 0 0 1 1 1 0

---------------------

0 0 0 0 0 1 0 1 = 5

Page 23: Data Representation, Data Structures, and Multi-file compilation.

Shifting

Invert:

char j = 11;

j: 0 0 0 0 1 0 1 1

~j: 1 1 1 1 0 1 0 0 = 244 (reading the byte as unsigned)

Shifting:

char j = 11;

j << 1: 0 0 0 1 0 1 1 0 = 22

j >> 1: 0 0 0 0 0 1 0 1 = 5

Page 24: Data Representation, Data Structures, and Multi-file compilation.

Data Structures and Algorithms

Page 25: Data Representation, Data Structures, and Multi-file compilation.

Sorting

Comes up all the time

Demonstrates important techniques

Can be done many ways: different algorithms.

Page 26: Data Representation, Data Structures, and Multi-file compilation.

Bubble Sort

Very simple

Terrible

Go through list, swapping out-of-order neighbors

Continue until no more swaps

Page 27: Data Representation, Data Structures, and Multi-file compilation.

Bubble Sort

N = number of items

If the number that belongs first is initially at the bottom of the list, have to go through list N times

Each time, looking/maybe swapping N times

Total of N^2 operations

S..L..O..W.. for long lists

But if list is very nearly sorted, can be quick.

No one would really use this algorithm.

Page 28: Data Representation, Data Structures, and Multi-file compilation.

Insertion sort

About as simple, but better

Way most people sort cards

Keep inserting in order

Still ~N^2, but faster on average

Page 29: Data Representation, Data Structures, and Multi-file compilation.

Data Structures

Both these methods very array-based

Have to look through half/most/all of list each iteration

Definitely need ~N iterations

Doomed to be fairly slow

For faster techniques, need different ways of looking at data.

Page 30: Data Representation, Data Structures, and Multi-file compilation.

Binary Trees

A binary tree is either empty, or consists of a node with a left and a right child.

Left and right children are binary trees

Page 31: Data Representation, Data Structures, and Multi-file compilation.

Complete Binary Trees

In a complete binary tree, every node has either 2 or 0 children, and all nodes with 0 children (`leaf nodes') are on the bottom level.

A complete binary tree with L levels has 2^L - 1 nodes;

One with N nodes has log_2(N+1) levels

Page 32: Data Representation, Data Structures, and Multi-file compilation.

Heaps

A binary tree with values (`keys') stored at each node.

Almost complete binary tree

Partial ordering: root's key is less than (or equal to) the key of either child, and both children are roots of heaps

Page 33: Data Representation, Data Structures, and Multi-file compilation.

Storing a heap in an array

Can easily store a heap in an array

Parent node i has left child (2*i+1) and right child (2*i+2), with the root stored at index 0.

Page 34: Data Representation, Data Structures, and Multi-file compilation.

Why bother?

Putting things in this partial order easier than sorting

Very easy to find lowest value in data once data is in heap

This is useful:

Priority queue

Sorting!

Page 35: Data Representation, Data Structures, and Multi-file compilation.

Heap Sort teaser

Get data into heap

Top value is lowest value.

Delete top value; re-heap

Repeat until no more data

Results are sorted list!

Page 36: Data Representation, Data Structures, and Multi-file compilation.

Heap Operations: insert

Put # into existing heap:

Put number in first available leaf node.

If parent tree no longer a heap, swap.

Then repeat this process until you hit the root.

Page 37: Data Representation, Data Structures, and Multi-file compilation.

Heap Operations: delete root

Take bottom-most value from the tree, put it where root used to be

Remove that node.

Go down heap, swapping with the smaller child if node is larger than either of its children.

Page 38: Data Representation, Data Structures, and Multi-file compilation.

Heap Ops: build heap from data

It's much easier to insert into an existing heap than build one at once.

Single nodes are always heaps!

Start from bottom, working up, inserting parents into heaps.

Repeat until no more data

Page 39: Data Representation, Data Structures, and Multi-file compilation.

Notice:

Heap insert/delete operations take ~lg(N) operations (one per level of the tree).

To build heap, each piece of data needs to be put in; ~ N lg N operations

To pull out sorted list, need to do N operations of a delete which takes ~lg N steps; another N lg N operations.

N lg N is much less than N^2 for large N!!

Page 40: Data Representation, Data Structures, and Multi-file compilation.

Heapsort Algorithm:

Build heap from scratch

For each piece of data,

Get root value

Delete from heap

Page 41: Data Representation, Data Structures, and Multi-file compilation.

Multiple-File compilation

Page 42: Data Representation, Data Structures, and Multi-file compilation.

Why more than one file?

As a program gets bigger, having the whole program in one file quickly gets awkward.

File hard to read

Takes forever to edit a 1M line file!

Hard to re-use code

Have to re-compile entire program even if just small change in one routine

Page 43: Data Representation, Data Structures, and Multi-file compilation.

Compilation vs. Linking

Compilation: compile source code into machine language.

Generates object file (.o)

Linking: bring in code from other libraries that we might need

Link in code for printf() from std. C library; link in code for sin() from math library, etc.

Generates an executable

Page 44: Data Representation, Data Structures, and Multi-file compilation.

Compilation vs. Linking

If all of program is in one file, the distinction isn't important, and gcc will do the compile/link in one step.

Otherwise, do it separately

Running Average Example

Sort Example