The Machine Model Memory - Clemson Universitywestall/texnh/courses/215.f05/notes… · This is the...

The Machine Model

Memory A one dimensional array of individually addressable storage elements.

Each element, called a byte, holds 8 binary digits (bits) of information.

It is extremely important to be able to understand and distinguish between:

the address of a storage element
the contents of a storage element

Addresses

Addresses begin at 0 and increase in unit steps to N - 1, where N is the total number of bytes in the address space. A pointer variable holds an address.

Contents

Since each byte consists of only 8 bits, there are only 256 different values that can be contained in a storage element. These range from binary 00000000 to binary 11111111 which corresponds to decimal numbers 0 to 255.

Aggregation of basic memory elements

More than 8 bits are needed for useful application to numerical problems. Thus it is common to group adjacent bytes into units commonly called words.

Multiple word lengths are common, and common word lengths include 2 bytes, 4 bytes, and 8 bytes.

In some architectures (Sun SPARC) it is required that the address of each word be a multiple of the word length. That is, the only valid addresses of 4-byte words are 0, 4, 8, 12, 16, ...
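The alignment rule can be expressed as a one-line test. This is a minimal sketch: the name is_word_aligned is a hypothetical helper introduced here for illustration and is not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of word_len,
 * and 0 otherwise. word_len must not be 0. */
int is_word_aligned(uintptr_t addr, size_t word_len)
{
    return (addr % word_len) == 0;
}
```

For 4-byte words, the addresses 0, 4, 8, and 12 pass the test, while 6 does not.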


C Program Structure

The Basic Block

{
   declaration of variables
   executable code
}

Historically, unlike in Java and C++, all variable declarations must precede the first line of executable code within the block. With newer compilers this restriction may not be true, but in any case, scattering variable declarations throughout a program has adverse effects on the readability and maintainability of a program.

Nesting of blocks is legal and common. Each interior block may include variable declarations.

Declaration of variables

Two generic types of basic (unstructured) variable exist:

integer
floating point

Integer variables may be declared as follows:

char a;        /* 8 bits            */
short b;       /* (usually) 16 bits */
int c;         /* (usually) 32 bits */
long d;        /* 32 or 64 bits     */
long long e;   /* (usually) 64 bits */

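These declarations implicitly create signed integers (strictly speaking, whether a plain char is signed is implementation-defined). An 8 bit signed integer can represent values in the range

$[-2^7, \ldots, 0, \ldots, 2^7 - 1]$, that is, -128 to 127.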

Signed integers are represented internally using 2's complement representation.
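To make the 2's complement encoding concrete, here is a small sketch of two ways to obtain the 8 bit pattern of a negative value. Both function names are hypothetical helpers introduced for illustration.

```c
/* Hypothetical helper: returns the 8 bit pattern used to store v.
 * Converting to unsigned char reduces the value modulo 256, which
 * yields exactly the 2's complement bit pattern. */
unsigned char twos_complement_byte(signed char v)
{
    return (unsigned char)v;
}

/* Hypothetical helper: builds the pattern for -magnitude by hand,
 * by inverting every bit of magnitude and adding 1.
 * Intended for magnitude in the range 1..128. */
unsigned char negate_by_hand(unsigned char magnitude)
{
    return (unsigned char)(~magnitude + 1);
}
```

Under these assumptions -1 is stored as 11111111 (0xFF) and -128 as 10000000 (0x80), and both routes give the same byte.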


Unsigned integers

Each of these declarations may also be preceded by the qualifier unsigned.

unsigned char a; /* 8 bits */

unsigned short b; /* (usually) 16 bits */

An 8 bit unsigned integer can represent values in the range

$[0, \ldots, 2^8 - 1]$, that is, 0 to 255.

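In all modern computer systems different hardware instructions are used for some signed and unsigned operations (for example comparison, division, and right shift). With 2's complement representation, addition and subtraction use the same instruction for both.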

Encoding of integer constants:

Integer constants may be expressed in several ways

decimal number             65
hexadecimal number         0x41
octal number               0101
ASCII encoded character    'A'

ALL of the above values are equivalent ways to represent the 8 bit byte whose value is:

01000001

Constants of different representation may be freely intermixed in expressions.

x = 11 + 'b' - 055 + '\t';

x = 0xb + 0x62 - 0x2d + 0x9;

x = 11 + 98 - 45 + 9;

x = 73;


Floating point data

Two variations on floating point variables

float     32 bits
double    64 bits

Example

float a, b;
double c;

Floating point constants can be expressed in two ways

Decimal number         1024.123
Scientific notation    1.024123e+3

avogadro = 6.02214199e+23;


Executable code

Expressions consist of (legal combinations of):

constants
variables
operators
function calls

Operators

Arithmetic: +, -, *, /, %

Comparative: ==, !=, <, <=, >, >=

Logical: !, &&, ||

Bitwise: &, |, ~, ^

Shift: <<, >>

Special types of expression

A statement consists of an expression followed by a semicolon

An assignment expression consists of

lvalue = expression;

lvalue is short for "left-value", which in turn represents any entity that may legitimately be assigned a value.

The two most common examples are:

A simple variable
A pointer dereference


Warnings:

Non-assignment statements (with the exception of function calls) are syntactically legal but (generally) semantically useless:

x + y - 3;
x <= 2;

Use parentheses to avoid problems with:

operator precedence (the order in which the operators are evaluated)

y = x + 5 & z - 5;

and operator associativity (the direction in which operations of equal precedence are evaluated)

y = 10 / 5.0 * 4.0;
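The sketch below shows how a C compiler groups the two expressions, assuming the first one reads x + 5 & z - 5 (the minus sign is an assumption about the original text). All function names are hypothetical and exist only for illustration.

```c
/* + and - bind more tightly than &, so this is (x + 5) & (z - 5).
 * Some compilers warn about the missing parentheses. */
int unparenthesized(int x, int z)
{
    return x + 5 & z - 5;
}

/* The same computation with the grouping written out. */
int explicit_grouping(int x, int z)
{
    return (x + 5) & (z - 5);
}

/* A different grouping that a reader might have intended. */
int other_grouping(int x, int z)
{
    return x + (5 & z) - 5;
}

/* / and * have equal precedence and associate left to right,
 * so this is (10 / 5.0) * 4.0 and not 10 / (5.0 * 4.0). */
double left_to_right(void)
{
    return 10 / 5.0 * 4.0;
}
```

With x = 1 and z = 7 the first two functions give 6 & 2 = 2, while other_grouping gives 1 + 5 - 5 = 1. left_to_right yields 8.0, whereas the right-to-left grouping would yield 0.5.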


Control flow: if and while

if (expression)
   statement | basic-block     <-- Executed if expression value is true
else
   statement | basic-block     <-- Executed if expression value is false

while (expression)
   statement | basic-block     <-- Executed while expression remains true

do
   statement | basic-block     <-- Executed until expression becomes false
while (expression);

There is no boolean type in classic (C89) C; C99 added _Bool and <stdbool.h>. An expression is false if and only if its value is 0; any nonzero value is true.

If the expression is an assignment, the value of the expression is the value that is assigned.

Be particularly careful not to confuse the following:

if (x = (a + b))

if (x == (a + b))
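The difference between the two forms can be seen in the following sketch. The function names are hypothetical and are used only to isolate each form.

```c
/* Assignment used as a condition: x receives a + b, and the
 * condition is true whenever that sum is nonzero. */
int assign_as_condition(int a, int b)
{
    int x = 0;

    if ((x = (a + b)))
        return 1;
    return 0;
}

/* Comparison: x is not modified, and the condition is true
 * only when x already equals a + b. */
int compare_as_condition(int x, int a, int b)
{
    if (x == (a + b))
        return 1;
    return 0;
}
```

With a = 2 and b = 3 the assignment form is true regardless of the previous value of x, while the comparison form is true only when x is 5.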

Be careful not to accidentally use:

while (x < y);


Function definitions

A C function is comprised of 4 components

1 the type of value returned by the function
2 the name of the function
3 parenthesized declaration of function parameters
4 at least one basic block containing local variable declarations and executable code

int main(int argc,       /* Number of cmd line parms        */
         char *argv[])   /* Array of ptrs to cmd line parms */
{
   --- basic block ---
}


Encoding of alphanumeric and special characters

As previously noted, a byte can contain a numeric value in the range 0-255. Computers don't understand Latin, Cyrillic, Hindi, or Arabic character sets!

Alphanumeric and special characters of the Latin alphabet are stored in memory as integer values encoded using the ASCII code (American Standard Code for Information Interchange).

Other codes requiring 16 bits per character have been developed to support languages having a large number of written symbols.

A simple program for displaying the ASCII encoding scheme

The correspondence between decimal, hexadecimal, and character representations can be readily generated via the following simple program.

#include <stdio.h>

int main(int argc, char *argv[])
{
   int c;

   c = ' ';     /* Same as c = 32 or c = 0x20 */

   while (c <= 'z')
   {
      printf("%3d %02x %c \n", c, c, c);
      c = c + 1;
   }
}


All C programs must have a main() function, and that is where execution begins.

The printf() function is used to produce formatted output.

The programmer uses format specifications to govern the form of the output.

Output of the ASCII table generator

class/215/examples ==> gcc -o p2 p2.c
class/215/examples ==> p2 | more
 32 20
 33 21 !
 34 22 "
 35 23 #
 36 24 $
 37 25 %
 38 26 &
 39 27 '
 40 28 (
 41 29 )
 42 2a *
 43 2b +
 44 2c ,
 45 2d -
 46 2e .
 47 2f /
 48 30 0
 49 31 1
  :
 57 39 9
 58 3a :
 59 3b ;
 60 3c <
 61 3d =
 62 3e >
 63 3f ?
 64 40 @
 65 41 A
 66 42 B
  :
 89 59 Y
 90 5a Z
 91 5b [
 92 5c \
 93 5d ]
 94 5e ^
 95 5f _
 96 60 `
 97 61 a
 98 62 b
  :
120 78 x
121 79 y
122 7a z

There are also a few special characters that follow z.


The ASCII encoding of the letter A is the byte having the value

0100 0001 = $2^6 + 2^0$

This value is written in decimal as 65 and in hexadecimal as 41

Control characters:

The ASCII encodings between decimal 0 and 31 are used to encode what are commonly called control characters. Control characters having decimal values in the range 1 through 26 can be entered from the keyboard by holding down the ctrl key while typing a letter in the set a through z.

Some control characters have "escaped code" representations in C, but all may be written in octal.

Dec  Keystroke  Name             Escaped code

  4  ctrl-D     end of file      '\004'
  8  ctrl-H     backspace        '\b'
  9  ctrl-I     tab              '\t'
 10  ctrl-J     newline          '\n'
 12  ctrl-L     page eject       '\f'
 13  ctrl-M     carriage return  '\r'


The EOF character has no \letter representation.

It may be expressed as '\004' or, for that matter, simply 4.

Formatted Output and Input

Printing integer values:

As previously noted, integer values are stored in computer memory using a binary representation. To communicate the value of an integer to a human, it is necessary to produce the string of ASCII characters that corresponds to the rendition of the integer in the Latin alphabet.

For example, consider the byte

1001 0100

This is the binary encoding of the hex number 0x94 which is the decimal number 9 * 16 + 4 = 148. Therefore, if this number is to be rendered as a decimal number on a printer or display, three bytes corresponding to the ASCII encodings of 1, 4, and 8 must be sent to the printer or display. These bytes are expressed in hexadecimal as:

31 34 38

In the C language, run time libraries provide functions that interface with the Operating System to provide this service. Some of these functions may actually be implemented as macros that call the actual RTL functions.
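The conversion from a binary value to ASCII digits can also be sketched by hand, which shows what the library must do internally. The function name to_decimal_ascii is a hypothetical helper, and the 10-digit scratch buffer assumes that unsigned int is at most 32 bits wide.

```c
#include <stddef.h>

/* Hypothetical helper: writes the decimal digits of value into out
 * as ASCII bytes, followed by a terminating zero byte.
 * out must have room for at least 11 bytes.
 * Returns the number of digits written. */
size_t to_decimal_ascii(unsigned int value, char *out)
{
    char tmp[10];      /* assumes unsigned int has at most 10 digits */
    size_t n = 0;
    size_t i;

    /* Peel off digits from least significant to most significant. */
    do {
        tmp[n] = (char)('0' + value % 10);
        n += 1;
        value /= 10;
    } while (value != 0);

    /* Copy them to the caller's buffer in reverse order. */
    for (i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

For the value 148 this produces the three bytes 0x31, 0x34, and 0x38.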

The printf() function (equivalent to fprintf(stdout, ...)) is used to produce the ASCII encoding of an integer and send it to an output device or file.

Format codes specify how you want to see the integer represented.

%c   Consider the integer to be the ASCII encoding of a character and render that character
%d   Produce the ASCII encoding of the integer expressed as a decimal number
%x   Produce the ASCII encoding of the integer expressed as a hexadecimal number
%o   Produce the ASCII encoding of the integer expressed as an octal number


Specifying field width

These may be preceded by an optional field width specifier. The code %02x shown below forces the field to be padded with leading 0's if necessary to generate the specified field width.

#include <stdio.h>

int main(int argc, char *argv[])
{
   int x;
   int y = 78;

   x = 'A' + 65 + 0101 + 0x41 + '\n';
   printf("X = %d \n", x);

   printf("Y = %c %3d %02x %4o \n", y, y, y, y);
}

/home/westall ==> gcc -o p1 p1.c
/home/westall ==> p1
X = 270
Y = N  78 4e  116

The number of values printed by printf() is determined by the number of distinct format codes.
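The effect of the field widths is easier to inspect when the formatted text is captured in a buffer. This sketch uses the standard snprintf() function for that purpose; the wrapper name format_four_ways is hypothetical.

```c
#include <stdio.h>

/* Hypothetical wrapper: formats y as a character, a decimal number
 * in a field of width 3, a hex number zero-padded to width 2, and
 * an octal number in a field of width 4.
 * Returns the number of characters snprintf() produced. */
int format_four_ways(char *buf, size_t size, int y)
{
    return snprintf(buf, size, "%c %3d %02x %4o",
                    y, y, (unsigned int)y, (unsigned int)y);
}
```

For y = 78 the buffer holds N, two spaces, 78, one space, 4e, two spaces, and 116; the extra spaces come from padding to the requested widths.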


Output redirection

The printf() function sends its output to a logical file commonly known as the standard output or simply stdout.

When a program is run in the Unix environment, the logical file stdout is by default associated with the screen being viewed by the person who started the program.

The > operator may be used on the command line to cause the standard output to be redirected to a file:

class/215/examples ==> p1 > p1.output

A file created in this way may be subsequently viewed using the cat command (or edited using a text editor).

class/215/examples ==> cat p1.output
X = 270
Y = N  78 4e  116


Input of integer data

When a human enters numeric data using the keyboard, the values passed to a program are the ASCII encodings of the keystrokes that were made.

For example, when I type:

123.45

6 bytes are produced by the keyboard. The hexadecimal encoding of these bytes is:

31 32 33 2E 34 35   - hex
 1  2  3  .  4  5   - ascii

To perform arithmetic using my number, it must be converted to internal floating point representation.

The scanf() function is used to

1 consume the ASCII encoding of a numeric value
2 convert the ASCII string to the proper internal representation
3 store the result in a memory location provided by the caller.

As with printf(), a format code controls the process. The format code specifies both the input encoding and the desired type of value to be produced:

%d    string of decimal characters          int
%x    string of hex characters              unsigned int
%o    string of octal characters            unsigned int
%c    ascii encoded character               char
%f    floating point number in decimal      float
%lf   floating point number in decimal      double
%e    floating pt in scientific notation    float

Also as with printf() the number of values that scanf() will attempt to read is determined by the number of format codes provided. For each format code provided, it is mandatory that a variable be provided to hold the data that is read.


Specifying the variables to receive the values:

It is extremely important to note that:

the value to be printed is passed to printf(), but
the address of the variable to receive the value must be passed to scanf()

The & operator in C is the "address of" operator.

Formatted input of integer values

/* p3.c */
#include <stdio.h>

int main(int argc, char *argv[])
{
   int a;
   int r;
   int b;

   r = scanf("%d %d", &a, &b);
   printf("Got %d items with values %d %d \n",
          r, a, b);
}


This call to scanf() will read two integer values from the standard input into the variables a and b.

Note that 2 format specifiers and 2 pointers (the addresses of a and b) are provided, as required.

The scanf() function returns the number of values that it obtains. In this case r should be set to 2.

Care and feeding of input format specifiers

Embedding extra spaces or including '\n' in scanf() format strings can lead to weird behavior and should be avoided. Specification of field widths is dangerous unless you really know what you are doing.

Effects of extra spaces in format specifier

/* p3.c */
#include <stdio.h>

int main(int argc, char *argv[])
{
   int a;
   int r;
   int b;

   r = scanf(" %d %d ", &a, &b);
   printf("Got %d items with values %d %d \n",
          r, a, b);
}

If you enter only the two values that scanf() is expecting to find, followed by Enter, nothing will happen. You will have to type control-D (which is the standard Unix end of file indicator) to get scanf() to return.

class/215/examples ==> gcc -o p3 p3.c
class/215/examples ==> p3
4 49

If you enter three values, it will not be necessary to enter control-D, but the third value will be ignored.

class/215/examples ==> p3
1 5 6
Got 2 items with values 1 5

This bad behavior can be fixed by changing the function call to:

r = scanf("%d %d", &a, &b);


Effect of invalid input

If you accidentally enter a non-numeric value, scanf() will stop at that point and report that it obtained only 1 value. The value 4927 represents the uninitialized value of b.

class/215/examples ==> p3
1 t 6
Got 1 items with values 1 4927
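The return value can be explored without typing at the keyboard by using sscanf(), the standard variant that reads from a string instead of from stdin. This is only a sketch; the wrapper name read_two_ints is hypothetical.

```c
#include <stdio.h>

/* Hypothetical wrapper: tries to convert two integers from text.
 * Returns the number of values actually converted (0, 1, or 2),
 * or EOF if the input ends before anything could be converted.
 * A variable whose conversion fails is left unchanged. */
int read_two_ints(const char *text, int *a, int *b)
{
    return sscanf(text, "%d %d", a, b);
}
```

Given "1 t 6" the function returns 1 and leaves b untouched, which is why the program above prints whatever happened to be in b.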

Pitfalls of field widths:

It is legal to specify field widths to scanf() but usually dangerous to do so!

class/215/examples ==> cat p4.c
#include <stdio.h>

int main(int argc, char *argv[])
{
   int a;
   int r;
   int b;

   r = scanf(" %2d %2d ", &a, &b);

   printf("Got %d items with values %d %d \n", r, a, b);
}

class/215/examples ==> p4
123 456
Got 2 items with values 12 3


Detecting end-of-file with scanf()

scanf() returns the number of values it actually obtained (which may be less than the number of values requested).

Therefore the proper way to test for end-of-file is to check that the number of values obtained equals the number of values requested.

/* p4b.c */
#include <stdio.h>

int main(int argc, char *argv[])
{
   int a;
   int r;
   int b;

   while ((r = scanf("%d %d", &a, &b)) == 2)
   {
      printf("Got %d items with values %d %d \n", r, a, b);
   }
}

Input redirection

Like stdout, stdin may also be redirected. To redirect both stdin and stdout use:

a.out < input.txt > output.txt

When the program a.out is invoked in this manner and reads from stdin via scanf() or fscanf(stdin, ...), the input will actually be read from a file named input.txt, and any data written to stdout will end up in the file output.txt.


The necessity of format conversion

As stated previously, the %d format causes scanf() to convert the ASCII text representation of a number to an internal binary integer value. One might think it would be possible to avoid this and be able to deal with mixed alphabetic and numeric data by simply switching to the %c format.

However, if we need to do arithmetic on what we read in, this approach does not work.

/* p11c.c */
#include <stdio.h>

int main(void)
{
   char i, j, k;

   scanf("%c %c %c", &i, &j, &k);

   printf("The first input was %c \n", i);
   printf("The third input was %c \n", k);
   printf("Their product is %c \n", i * k);
}

class/215/examples ==> p11c
2 C 4
The first input was 2
The third input was 4
Their product is (

Why do we get "(" and not 8 for the answer? Since no format conversion is done, the value stored in i is the ASCII code for the numeral 2, which is 0x32 = 50.

Similarly, the value stored in k is the ASCII code for the numeral 4, which is 0x34 = 52.

When they are multiplied, the result is 50 x 52 = 2600, which is too large to be held in 8 bits.

The 8 bit product is (2600 mod 256) (the remainder when 2600 is divided by 256).

That value is 2600 - 2560 = 40 = 0x28, which according to the ASCII table is the encoding of "(".
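The arithmetic can be checked with a short sketch, which also shows one common remedy: subtracting '0' from each digit character before multiplying. Both function names are hypothetical.

```c
/* Hypothetical helper: multiplies the two character codes and keeps
 * only the low 8 bits, as happens when the product is rendered
 * with the %c format. */
unsigned char product_of_codes(char i, char k)
{
    return (unsigned char)(i * k);
}

/* Hypothetical helper: converts each digit character to its numeric
 * value first. Only meaningful for characters '0' through '9'. */
int product_of_digits(char i, char k)
{
    return (i - '0') * (k - '0');
}
```

For the first and third inputs '2' and '4', product_of_codes yields 40, the code for "(", while product_of_digits yields 8.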


Floating point input and output

The %e and %f format codes are used to print floating point numbers in scientific and decimal format. The l modifier is required for doubles with scanf(); with printf() it is accepted but has no effect, because float arguments are promoted to double. Field size specification is of the form: field-width.number-of-digits-to-right-of-decimal

#include <stdio.h>

int main(int argc, char *argv[])
{
   float a;
   double b;
   float c;
   double d;

   a = 1024.123;
   b = 1.024123e+03;

   scanf("%f %le", &c, &d);

   printf("a = %10.2f b = %8.4lf c = %f d = %8le \n", a, b, c, d);
}

class/215/examples ==> gcc -o p5 p5.c
class/215/examples ==> p5
12.4356 1.4e22
a =    1024.12 b = 1024.1230 c = 12.435600 d = 1.400000e+22
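As a sketch of how the width and precision fields behave, the formatted text can be captured with the standard snprintf() function and inspected. The two wrapper names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical wrapper: formats value in a field of width 10 with
 * 2 digits to the right of the decimal point.
 * Returns the number of characters produced. */
int format_fixed(char *buf, size_t size, double value)
{
    return snprintf(buf, size, "%10.2f", value);
}

/* Hypothetical wrapper: formats value in scientific notation with
 * the default 6 digits to the right of the decimal point. */
int format_scientific(char *buf, size_t size, double value)
{
    return snprintf(buf, size, "%e", value);
}
```

Formatting 1024.123 with format_fixed gives 1024.12 preceded by three spaces of padding, so the result is exactly 10 characters wide.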


General Input and Output

The C language itself defines no facility for I/O operations. I/O support is provided through two collections of mutually incompatible function libraries:

Low level I/O

open(), close(), read(), write(), lseek(), ioctl()

Standard library I/O

fopen(), fclose()             opening and closing files
fprintf(), fscanf()           field at a time with data conversion
fgetc(), fputc()              character (byte) at a time
fgets(), fputs()              line at a time
fread(), fwrite(), fseek()    physical block at a time

Our focus will be on the use of the standard I/O library:

Function and constant definitions are obtained via #include facility. To use the standard library functions:

#include <stdio.h>

#include <errno.h>


Standard library I/O

The functions operate on ADT's of type FILE * (which is defined in stdio.h).

Three special FILE *'s are automatically opened when any process (program) starts:

stdin    Normally keyboard input (but may be redirected with < )

stdout   Normally terminal output (but may be redirected with > or 1> ) [1]

stderr   Normally terminal output (but may be redirected with 2> )

Since these files are predeclared and preopened they must not be declared nor opened in your program! The following example shows how to open, read, and write disk files by name. In our assignments we will almost always read from stdin and write to stdout or stderr.

[1] The 1> and 2> notation is supported by the sh family of shells including bash, but is not supported in csh or tcsh.


#include <stdio.h>
#include <stdlib.h>     /* for exit() */

int main()
{
   FILE *f1;
   FILE *f2;
   int x;

   f1 = fopen("in.txt", "r");
   if (f1 == 0)
   {
      perror("f1 failure");
      exit(1);
   }

   f2 = fopen("out.txt", "w");
   if (f2 == 0)
   {
      perror("f2 failure");
      exit(2);
   }

   if (fscanf(f1, "%d", &x) != 1)
   {
      perror("scanf failure");
      exit(2);
   }

   fprintf(f2, "%d", x);

   fclose(f1);
   fclose(f2);
}


Examples of the use of p6

In this example, we have not yet created in.txt

class/215/examples ==> gcc -o p6 p6.c

class/215/examples ==> p6

f1 failure:: No such file or directory

cat: in.txt: No such file or directory

Here, in.txt is created with the cat command:

class/215/examples ==> cat > in.txt

99

Now if we rerun p6 and cat out.txt we obtain the correct answer

class/215/examples ==> p6

class/215/examples ==> cat out.txt

99

Note: The scanf() and printf() functions are actually macros or abbreviations for:

fscanf(stdin, ...) and fprintf(stdout, ...)


In the first run of p6 above, the message beginning "f1 failure" was produced by the call to perror() in the program.

The message beginning "cat:" was produced by the cat program itself.

Character at a time input and output

The fgetc() function can be used to read an I/O stream one byte at a time.

The fputc() function can be used to write an I/O stream one byte at a time.

Here is an example of how to build a cat command using the two functions. The p10 program is being used here to copy its own source code from standard input to output.

class/215/examples ==> p10 < p10.c

/* p10.c */

#include <stdio.h>

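int main(void)
{
   int c;

   /* Compare against EOF so that a zero byte in the input */
   /* does not end the copy early                          */
   while ((c = fgetc(stdin)) != EOF)
      fputc(c, stdout);
}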

While fputc() and fgetc() are fine for building interactive applications, they are very inefficient and should never be used for reading or writing a large volume of data such as a photographic image file.


Even though fgetc() reads one byte at a time it returns an int whose low order byte is the byte that was read.

Line at a time input and character string output.

The fgets(buffer_address, buf_size, file) function can be used to read from a stream until either:

1 a newline character '\n' = 10 = 0x0a is encountered or
2 the specified number of characters minus 1 has been read or
3 end of file is reached.

There is no string data type in C, but a standard convention is that a string is an array of characters in which the end of the string is marked by the presence of a byte which has the value binary 00000000 (sometimes called the NULL character).

fgets() will append the NULL character to whatever it reads in.

Since fgets() will read in multiple characters it is not possible to assign what it reads to a single variable of type char. Thus a pointer to a buffer must be passed (as was the case with scanf()).

The fputs() function writes a NULL terminated string to a stream (after stripping off the NULL). The following example is yet another cat program, but this one works one line at a time.

class/215/examples ==> p11 < p11.c 2> countem

/* p11.c */

#include <stdio.h>
#include <string.h>

main()
{
   unsigned char *buff = NULL;
   int line = 0;

   buff = malloc(1024);     // alloc storage to hold data
   if (buff == 0)
      exit(1);

   while (fgets(buff, 1024, stdin) != 0)
   {
      fputs(buff, stdout);
      fprintf(stderr, "%d %d \n", line, strlen(buff));
      line += 1;
   }
   free(buff);              // release the storage malloc allocated
}

The variable buff is declared to be a pointer to a char variable.

Pointer variables should always be initialized when declared.

Here buff is declared a pointer and the storage which will hold the data is allocated with malloc(). Alternatively we could have declared unsigned char buff[1024] and not used malloc(). With current compilers, #include <stdlib.h> is also needed for malloc(), free(), and exit().

Here is the output that went to the standard error. Note that each line that appeared empty has length 1 (the newline character itself) and the lines that appear to have length 1 actually have length 2.

class/215/examples ==> cat countem
0 12
1 1
2 19
3 20
4 1
5 7
6 2
7 24
8 18
9 1
10 24
11 18
12 15
13 1
14 41
15 5
16 27
17 55
18 17
19 4
20 2
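Because fgets() keeps the newline, programs that want the text alone usually remove it. The following is a minimal sketch of that step; strip_newline is a hypothetical helper, not a library function.

```c
#include <string.h>

/* Hypothetical helper: removes one trailing '\n' from a string that
 * fgets() filled in, if one is present.
 * Returns the length of the string after the change. */
size_t strip_newline(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
        len -= 1;
    }
    return len;
}
```

After stripping, a line that looked empty has length 0 and a line such as "}" has length 1, matching what the eye sees.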


Processing Heterogenous Input Files

While scanf() provides a handy way to process input files consisting of only numeric data, the handling of mixed input can be more difficult.

An example of such an input is a .ppm image file. Such a file begins with a header of the following format:

P5
# CREATOR: XV Version 3.10a Rev: 12/29/94 (PNG patch 1.2)
# This is another comment
1024 814
# So is this
255

Items of useful information in this header include:

P      this is a .ppm file
5      this is a .ppm file containing a grayscale (as opposed to color (P6)) image
1024   the number of columns of pixels in each row of the image
814    the number of rows of pixels in the image
255    the maximum brightness value for a pixel

Comment lines

Lines beginning with # are comments.

Any number (including 0) of comment lines may be present.

It is possible for all the useful values to appear on one line:

P5 1024 814 255

Or they could be specified as follows:

P5
#
1024
# This is a comment
814 255

Your mission will be to write a program that can read and write .ppm files.


Reading .ppm headers with fgets() and sscanf()

Because of the arbitrary location and number of comment lines there is no good way to read the data using scanf(). A better approach is to use a combination of fgets() and the sscanf() function. The sscanf() function operates in a manner similar to fscanf() but instead of consuming data from a file it will consume data from a memory resident buffer. Like fscanf() it returns the number of items it successfully consumed from the buffer.

So if the .ppm header is nice and simple like:

P6 768 1024 255

the following program would suffice to read it:

13 int main()
14 {
15    char id[3] = {0, 0, 0};     /* Will hold P5 or P6 */
16    int vals[5];                /* The 3 ints         */
17    int count = 0;              /* # of vals so far   */
19    char *buf = malloc(256);    /* the line buffer    */
20
21    fgets(buf, 256, stdin);
22    id[0] = buf[0];
23    id[1] = buf[1];
24    count = sscanf(buf + 2, "%d %d %d",
25                   &vals[0], &vals[1], &vals[2]);
27    printf("Got %d vals \n", count);
28    printf("%4d %4d %4d \n", vals[0], vals[1], vals[2]);
29 }

Unfortunately life is rarely nice and simple and this program won't work if the .ppm header looks like:

P6
# comment
768 # another
1024 255

In this case the buffer will contain P6\n at line 24 and sscanf() will return 0. What is needed is a loop located at line 26 in which fgets() will be called to read a new line of input into buf and sscanf() will attempt to consume 3 integers from the buffer. The loop should end when a total of three integers have been placed in the vals array or when fgets() returns NULL (indicating end of file or an error).

Question: After each call to fgets() within the loop, is it necessary to explicitly test to see if the first character in buf is # ???
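One possible shape for the per-line work inside such a loop is sketched below. It is not the only approach: it swaps sscanf() for the standard strtol() function so that it can tell how far into the line it has read, and it cuts the line at the first # before parsing. The name consume_ints and its parameters are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: pulls integers out of one header line.
 *   line - the text read by fgets() (it is modified in place)
 *   vals - array receiving the integers
 *   have - how many integers are already stored in vals
 *   want - how many integers are needed in total
 * Returns the updated count of stored integers. */
int consume_ints(char *line, int *vals, int have, int want)
{
    char *hash = strchr(line, '#');
    char *p = line;
    char *end;

    if (hash != NULL)
        *hash = '\0';            /* discard the comment, if any */

    while (have < want) {
        long v = strtol(p, &end, 10);

        if (end == p)            /* no more numbers on this line */
            break;
        vals[have] = (int)v;
        have += 1;
        p = end;
    }
    return have;
}
```

A caller would invoke this after each successful fgets() until the returned count reaches 3. For the first line it would pass buf + 2 so that the P6 identifier is skipped, as the program above does.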


Block input and output

The fread() and fwrite() functions are the most efficient way to read or write large amounts of data. The second parameter passed to the function is the size of a basic data element and the third parameter is the number of elements. Here the basic data element is a single byte so 1 is used. The fread() function returns the number of elements that it read.

Here is a still more efficient implementation of the catlike program that copies standard input to the standard output.

class/215/examples ==> p12 < p12.c
/* p12.c */

#include <stdio.h>

main()
{
   unsigned char *buff;
   int len = 0;
   int iter = 0;

   buff = malloc(1024);
   if (buff == 0)
      exit(1);

   while ((len = fread(buff, 1, 1024, stdin)) != 0)
   {
      fwrite(buff, 1, len, stdout);
      fprintf(stderr, "%d %d \n", iter, len);
      iter += 1;
   }
}
0 322

Questions: Removing the parentheses surrounding

(len = fread(buff, 1, 1024, stdin))

will break the program. Explain exactly how and why things will go wrong in this case.

What will happen if len in the fwrite() is replaced by 1024?


More on Pointers

As previously defined, a pointer is a variable whose contents is the address of another variable.

The size of the pointer variable must be n bits where $2^n$ bytes is the size of the address space. For 32-bit Intel x86 and SPARC systems, the address space size is 4 GB and we have n = 32.

The amount of space required to hold a pointer variable is always 4 bytes on these hardware platforms and is not related to the size of the entity to which it points.

To declare a pointer in a program just use the type it points to followed by *.

int *a;
short *b;
unsigned char *c;

To refer to the value of the pointer itself just use the variable name:

a = &x; // assign the address of x to pointer a

To refer to the entity to which the pointer points prepend *

*a = 132; // assign the value 132 to x (since a now points to x)
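Put together, the two statements behave as in the following sketch. The function names are hypothetical and exist only to make the fragment self-contained.

```c
/* Hypothetical helper: stores value into the int that target
 * points to. target must hold the address of a real int. */
void store_through(int *target, int value)
{
    *target = value;
}

/* Mirrors the two statements above: a = &x; then *a = 132; */
int demo_pointer_assignment(void)
{
    int x = 0;
    int *a = &x;      /* a holds the address of x      */

    *a = 132;         /* writes 132 into x, not into a */
    return x;
}
```

demo_pointer_assignment() returns 132 even though x is never named on the left side of an assignment after its declaration.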


Initialization of pointers

Like all variables pointers must be initialized before they are used.

/* p17.c */

/* Example of a common error: failure to initialize */
/* a pointer before using it. This program is       */
/* FATALLY flawed....                                */

main()
{
   int* ptr;

   *ptr = 99;

   printf("val of *ptr = %d and ptr is %p \n", *ptr, ptr);
}

But unfortunately, on Linux this program appears to work!

class/215/examples ==> p17
val of *ptr = 99 and ptr is 0x40015360

The program appears to work because the address 0x40015360 just happens to be a legal address in the address space of this program. Unfortunately, it may be the address of some other variable whose value is now 99!!!

This situation is commonly referred to as a loose pointer and bugs such as these may be very hard to find.


OOPS

We can convert the bug from latent to active by changing the location of the variable ptr.

Here we move it down the stack by declaring an array of size 200.

class/215/examples ==> cat p18.c
/* p17.c */

/* Example of a common error: failure to initialize */
/* a pointer before using it                        */

main()
{
   int a[200];     // force ptr to live down in uninit turf
   int* ptr;

   printf("val of ptr is %p \n", ptr);

   *ptr = 99;

   printf("val of *ptr = %d and ptr is %p \n", *ptr, ptr);
}

class/215/examples ==> p18
val of ptr is (nil)
class/215/examples ==>

Note that in this case the second printf() is not reached because the program segfaulted and died when it illegally attempted to assign the value 99 to memory location 0 (nil).

Minimizing latent loose pointer problems

Never declare a pointer without initializing it in the declaration.

int *ptr = NULL;
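Initializing to NULL pays off because the pointer can then be tested before it is used. A minimal sketch of that habit follows; checked_store is a hypothetical helper.

```c
#include <stddef.h>

/* Hypothetical helper: stores value through ptr only if ptr is
 * not NULL. Returns 0 on success and -1 if ptr was NULL. */
int checked_store(int *ptr, int value)
{
    if (ptr == NULL)
        return -1;
    *ptr = value;
    return 0;
}
```

A pointer that was never given a real address is reported as an error here, instead of silently overwriting some unrelated location.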


Using gdb to find the point of failure:

The gdb debugger is a handy tool for identifying the location at which a program failed.

To use the debugger it is necessary to compile with the -g option.

class/215/examples ==> gcc -g -o p18 p18.c

To start the debugger use the gdb command and specify the program name

class/215/examples ==> gdb p18

At the (gdb) prompt you will usually want to tell the debugger to halt the program when it reaches the start of the main() function. The b command is short for breakpoint and tells the debugger where to stop. After a function is entered source code line numbers can be used to specify breakpoints.

(gdb) b main
Breakpoint 1 at 0x804833b: file p18.c, line 11.

To start the program use the run command:

(gdb) run
Starting program: /local/westall/class/215/examples/p18

When the program reaches a breakpoint gdb will tell you and display the next line of code to be executed.

Breakpoint 1, main () at p18.c:11
11 printf("val of ptr is %p \n", ptr);

Use the next command to execute a single source statement. The next command will treat a function call as a single statement and not single step into the function being called. If you want to single step through the function use the step command to step into it. The output of the printf() is intermixed with gdb's output: below, the line "val of ptr is (nil)" comes from the program.

(gdb) next
val of ptr is (nil)
13 *ptr = 99;


Saying next again will cause the program to execute the flawed assignment. gdb will show you the line that caused the error (line # 13).

(gdb) next

Program received signal SIGSEGV, Segmentation fault.
0x08048357 in main () at p18.c:13
13 *ptr = 99;

The where command can show you where the failure occurred (along with a complete function activation trace).

(gdb) where
#0 0x08048357 in main () at p18.c:13
#1 0x40049917 in __libc_start_main () from /lib/libc.so.6
(gdb)

To print the value of a variable use the print command.

(gdb) print ptr
$1 = (int *) 0x0

Attempting to print what ptr points to reaffirms what the problem is:

(gdb) print *ptr
Cannot access memory at address 0x0

Use the q (quit) command to terminate gdb

(gdb) quit
The program is running. Exit anyway? (y or n) y
class/215/examples ==>


Correct use of the pointer

In the C language, variables that are declared within any basic block are allocated on the runtime stack at the time the basic block is activated.

/* p19.c */

main()
{
   int y;
   int* ptr;
   static int a;

   ptr = &y;      // assign the address of y to the pointer

   *ptr = 99;     // assign 99 to what the pointer points to (y)

   printf("y = %d ptr = %p addr of ptr = %p \n", y, ptr, &ptr);

   ptr = &a;

   printf("The address of a is %p \n", ptr);
}

class/215/examples ==> p17
y = 99 ptr = 0xbffff894 addr of ptr = 0xbffff890
The address of a is 0x804958c

In this output 0xbffff894 is the address of y and 0xbffff890 is the address of ptr.

Note that a, being declared static, resides in the program's static data area rather than on the stack, and so has an address far removed from the stack resident y and ptr.


Use of pointers in processing C strings.

Recall that a "C string" is an array of characters whose logical end is denoted by a zero valued byte. The C standard library has a number of functions designed to work with C strings. The strtol() function is one of them. You can see most of them on a Solaris system via the command:

man -s 3C string

string, strcasecmp, strncasecmp, strcat, strncat, strlcat, strchr, strrchr, strcmp, strncmp, strcpy, strncpy, strlcpy, strcspn, strspn, strdup, strlen, strpbrk, strstr, strtok, strtok_r - string operations

In this example we will see how strcat might be implemented. Its prototype is shown below.

char *strcat(char *s1, const char *s2);


An implementation of strcat

The name zstrcat is used to avoid potential nastygrams and name conflicts with the “real” strcat.

/* p12b.c */

#include <stdio.h>
#include <stdlib.h>

/* The mission of this function is to catenate */
/* the string pointed to by p2 to the end of   */
/* the string pointed to by p1                 */

char *zstrcat(
char *p1,       /* string to be extended */
char *p2)       /* string to be appended */
{
   char *src;   /* source pointer      */
   char *dst;   /* destination pointer */

   dst = p1;
   src = p2;

/* Start by advancing dest to the end */
/* of the string to be extended       */

   while (*dst != 0)
      dst = dst + 1;

/* Now tack on the string to be appended */

   while (*src != 0)
   {
      *dst = *src;
      dst = dst + 1;
      src = src + 1;
   }

/* Complete the job by NUL terminating the new string */

   *dst = 0;
   return(p1);
}

An alternative implementation

It is possible to shrink and obfuscate the code in an attempt to demonstrate one's C language machismo. This approach produces code that is difficult to read and painful to maintain, and it may (or may not) produce a trivial improvement in performance. When tempted, just say no!

char *zstrcat(char *p1, char *p2)
{
   char *r = p1;
   while (*p1++);
   p1--;
   while (*p1++ = *p2++);
   return(r);
}

int main(void)
{
   char *s1;
   char *s2;
   char *result;

   s1 = (char *)malloc(40);
   s2 = (char *)malloc(40);
   result = (char *)malloc(81);
   *result = 0;

   fgets(s1, 40, stdin);
   fgets(s2, 40, stdin);

   zstrcat(result, s1);
   zstrcat(result, " ");
   zstrcat(result, s2);

   fputs(result, stdout);
   return 0;
}

class/215/examples ==> p12b
Hello
World
Hello
 World

Using a pointer to process an array of numbers

This program demonstrates

(1) how to process command line parameters and
(2) how to process an array of ints using a pointer.

/* p20.c */

/* This program illustrates the use of pointers in */
/* reading and processing an array of integers     */
/* It also shows how to access command line args   */

/* An upper bound on the number of ints to be read */
/* must be specified on the command line:          */
/*    p20 1000                                     */

#include <stdio.h>
#include <stdlib.h>

int main(
int argc,        /* number of command line args */
char* argv[])    /* array of pointers to args   */
{
   int* base;      // points to start of the array
   int* loc;       // points to the current array element
   int  max;       // maximum number of values to read in
   int  count;     // actual number of values read in
   int  largest;   // largest number in the array
   int  i;

Processing the command line argument

The argc parameter contains the number of command line parameters including the program name itself. Thus a command such as

a.out 300

will cause argc to be set to 2. It's very important to ensure that the user has provided the number of parameters you need before you attempt to process them!

   if (argc < 2)    /* Make sure at least one arg was given */
   {
      printf("Usage is p20 upper-bound \n");
      exit(1);
   }

This program expects that a numeric value representing the maximum number of values that are present in the standard input will be present on the command line.

/* The pointer argv[0] points to the name of the program (p20) */
/* argv[1] points to the first command line argument           */
/* The atoi() function converts the ascii character            */
/* representation of an integer to a binary int value.         */

   max = 0;
   max = atoi(argv[1]);

   if (max <= 0)
   {
      printf("upper-bound must be a positive integer \n");
      exit(2);
   }

Allocating storage for the array of ints.

Note that the size of the area allocated must be specified in bytes. A better way to do this would be malloc(max * sizeof(int));

   count = 0;
   base = (int *)malloc(4 * max);
   loc = base;

Reading the input values

Note that loc and not &loc is passed to scanf(). What would happen if a programmer “accidentally” passed &loc?

Also note that as each integer is read loc is incremented by 1 and not by 4. The C language automagically takes into account the size of the element pointed to when doing pointer arithmetic! If you were to printf() the value of loc using the %p format code, you would see that the actual value does increase by 4 each time it is incremented.

/* Read in the integers from standard input making sure */
/* not to overrun the size of the array                 */

   while (scanf("%d", loc) == 1)
   {
      loc = loc + 1;
      count = count + 1;
      if (count == max)
         break;
   }
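The pointer arithmetic rule described above can be checked directly. The following is a minimal sketch and is not part of p20.c; the helper names int_step_in_bytes() and char_step_in_bytes() are hypothetical. Each one measures how many bytes separate p and p + 1.

```c
#include <stddef.h>

/* Hypothetical helper (not part of p20.c): returns the number of */
/* bytes that an int pointer advances when 1 is added to it.      */
size_t int_step_in_bytes(void)
{
   int  vals[2] = {0, 0};
   int *p = vals;           /* points to vals[0] */
   int *q = p + 1;          /* points to vals[1] */

   /* Casting to char * makes the subtraction count bytes */
   return (size_t)((char *)q - (char *)p);
}

/* The same measurement for a char pointer always yields 1 */
size_t char_step_in_bytes(void)
{
   char  vals[2] = {0, 0};
   char *p = vals;
   char *q = p + 1;

   return (size_t)(q - p);
}
```

On a machine with 32 bit ints the first function returns 4, which matches the step you would see when printing loc with %p.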

Identifying the largest value in the array

Before starting the search for the largest number the value of loc is reset to point to the base of the array.

   loc = base;        // point loc back to the start of the array
   largest = *loc;    // init largest to the first value in the array
   loc = loc + 1;

   for (i = i; i < count; i++)
   {
      if (*loc > largest)
         *loc = largest;

      loc = loc + 1;
   }
   printf("Largest was %d \n", largest);
}

Exercise: this program actually has a couple of nasty bugs in it. Use gdb to find and fix them.


Other ways to consume command line parameters

The standard runtime library of functions that is normally distributed with C compilers provides a variety of ways to consume command line parameters. In some ways they are better than atoi() because they better indicate that the user entered incorrect data. Nevertheless, atoi() is probably the most widely used.

The sscanf() function

As we saw earlier, the sscanf() function may be used to attempt to convert ASCII strings in a memory resident buffer to a numeric value. Since argv[1] is a pointer to a memory resident buffer containing the string entered as parameter 1 we could replace the atoi() call in the previous example by:

   code = sscanf(argv[1], "%d", &max);
   if (code != 1)
   {
      fprintf(stderr, "Yeow! bad string in parm 1 \n");
   }

Since sscanf() returns the number of values it converted, the variable code will be 1 if it was successful.

The strtol() function

The strtol() function is more powerful still. It will fill in a pointer to the first illegal character it encounters in the string. If it was successful in producing a valid value, badchar will point to the NUL character that terminates the string.

   char *badchar = NULL;
   long max = 0;

   max = strtol(argv[1], &badchar, 10);
   if (*badchar != 0)
   {
      fprintf(stderr, "Yeow! bad character %c in value\n", *badchar);
   }
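The strtol() check above can be packaged as a small function. This is a sketch only; the name parse_upper_bound() and its return convention are assumptions made for illustration. In addition to the badchar test it rejects an empty string and values that do not fit in a positive int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert str to a positive int.        */
/* Returns 0 on success and stores the value in *out.         */
/* Returns -1 if str is empty, contains an illegal character, */
/* is out of range, or is not positive.                       */
int parse_upper_bound(const char *str, int *out)
{
   char *badchar = NULL;
   long  val;

   errno = 0;
   val = strtol(str, &badchar, 10);

   if (badchar == str)       /* no digits were consumed     */
      return -1;
   if (*badchar != 0)        /* trailing illegal character  */
      return -1;
   if (errno == ERANGE || val > INT_MAX || val <= 0)
      return -1;

   *out = (int)val;
   return 0;
}
```

A caller in the style of p20.c could then write if (parse_upper_bound(argv[1], &max) != 0) and print the usage message.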


Representation of multidimensional data

Some data, for example a grayscale image, is most naturally represented as a two dimensional array of the form:

#define NUM_ROWS 768
#define NUM_COLS 1024
unsigned char pixmap[NUM_ROWS][NUM_COLS];

In this representation each byte represents a single pixel. A value of 0 represents completely black and a value of 255 represents the brightest possible white. Intermediate values provide a more or less linear brightness ramp.

To access a specific pixel in such an array one would use:

pixval = pixmap[row][col];

where row and col identify the location of the target pixel within the image.

As with all arrays in C, legal values of row range from 0 to NUM_ROWS - 1.

A disadvantage of this representation is that the value of NUM_ROWS and NUM_COLS must generally be established at compile time.

We would like to be able to read in the dimensions of the image from the .ppm header and then declare the pixmap array using the actual dimensions of the image.

There is no good way to do that in C because the dimensions of a statically declared array must be compile-time constants; they are not allowed to be variables.


Possible approaches to the unknown dimension problem

Thus there are two ways to approach the problem:

1 The naive way:

unsigned char pixmap[MAX_ROWS][MAX_COLS];

where MAX_ROWS and MAX_COLS represent the dimensions of the largest image our program is able to handle.

This approach

wastes space
forces us to read or write the image one row at a time
constrains the program to work only with images having size within the predefined limits

2 The correct way:

Use a single dimensional malloc'd array and handle the indexing ourselves

Using a single dimension array to represent 2-D data

Suppose the integer variables numrows and numcols represent the number of rows and columns in the image and that they have been correctly set using information in the .ppm header.

Grayscale images

Then space for a grayscale image encoded in binary can be allocated by:

unsigned char* imageloc;
imageloc = (unsigned char *)malloc(numrows * numcols);

To read in the grayscale image from the standard input:

pixcount = fread(imageloc, 1, numrows * numcols, stdin);
if (pixcount != numrows * numcols)
{
   fprintf(stderr, "pix count err - wanted %d got %d \n",
           numrows * numcols, pixcount);
   exit(1);
}

Color images

A color image is often called an rgb image because the red, green, and blue intensities of each pixel are stored together. Space for a color image in binary rgb format can be allocated by:

unsigned char* imageloc;
imageloc = (unsigned char *)malloc(3 * numrows * numcols);

To read in the rgb image:

pixcount = fread(imageloc, 3, numrows * numcols, stdin);
if (pixcount != numrows * numcols)
{
   fprintf(stderr, "pix count err - wanted %d got %d \n",
           numrows * numcols, pixcount);
   exit(1);
}

Accessing a specific element in malloc'd two dimensional data

The value of any grayscale pixel at location (row, col) within the image is accessed in the following way:

*(imageloc + row * numcols + col)

or equivalently

imageloc[row * numcols + col]

For example, if the value of numcols is 10, then there are 10 pixels per image row. To reach the pixel whose (row, column) address is (3, 5) it is necessary to pass over three complete rows (row 0, row 1, and row 2) and 5 pixels in row three (pixels 0, 1, 2, 3, and 4).

Thus, the offset of the pixel at (3, 5) is 3 * 10 + 5 as shown above.
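The offset computation can be written as a pair of small functions. This is a sketch for illustration; the names pix_offset() and get_gray() are hypothetical and do not appear elsewhere in these notes.

```c
#include <stddef.h>

/* Hypothetical helper: offset of pixel (row, col) in a one */
/* dimensional array holding numcols pixels per row.        */
size_t pix_offset(size_t row, size_t col, size_t numcols)
{
   return row * numcols + col;
}

/* Fetch the grayscale pixel at (row, col) */
unsigned char get_gray(const unsigned char *imageloc,
                       size_t numcols, size_t row, size_t col)
{
   return *(imageloc + pix_offset(row, col, numcols));
}
```

With numcols equal to 10, pix_offset(3, 5, 10) evaluates to 3 * 10 + 5 = 35, which agrees with the example above.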

Processing an image one row at a time

In this example we print pixel values of an entire image with one line of output per row of pixel data:

for (row = 0; row < numrows; row = row + 1)
{
   unsigned char* loc;
   loc = imageloc + row * numcols;    // first pix in row
   printf("\n");
   for (col = 0; col < numcols; col = col + 1)
   {
      printf("%03x", *loc);
      loc = loc + 1;
   }
}

Floating point images

Even though the pixels of a binary grayscale image require one byte of storage and those of a floating point image require four bytes of storage, the value of a floating point pixel at location (row, col) is also

*(imageloc + row * numcols + col)

provided that imageloc is declared as a pointer to float.

Why? Because pointer arithmetic automagically takes into account the size of the object pointed to and the C compiler knows that unsigned chars require one byte and floats require 4.

Accessing the individual pixels of the binary rgb image

Here the process is slightly grubbier because each pixel is represented by three constituent components (r, g, and b) where each of (r, g, and b) is represented by a single unsigned char.

Nevertheless a bit of reflection yields:

red   = *(imageloc + 3 * row * numcols + 3 * col);
green = *(imageloc + 3 * row * numcols + 3 * col + 1);
blue  = *(imageloc + 3 * row * numcols + 3 * col + 2);

It is necessary to multiply by 3 because each pixel occupies three bytes of memory. Since our pointer is a pointer to type unsigned char which is a single byte, there is no magic available from the compiler to help us out here.
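The three expressions share a common base address, so they can be folded into one helper. This is a minimal sketch; the function name get_rgb() and its parameter order are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: copy the r, g, b bytes of the pixel at */
/* (row, col) from a packed rgb image into rgb[0..2].          */
void get_rgb(const unsigned char *imageloc, size_t numcols,
             size_t row, size_t col, unsigned char *rgb)
{
   /* Each pixel occupies 3 bytes, so scale the pixel offset by 3 */
   const unsigned char *pix = imageloc + 3 * (row * numcols + col);

   rgb[0] = *(pix + 0);    /* red   */
   rgb[1] = *(pix + 1);    /* green */
   rgb[2] = *(pix + 2);    /* blue  */
}
```

Note that 3 * (row * numcols + col) is the same quantity as 3 * row * numcols + 3 * col used above.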


Basic elements of 3D coordinate systems and linear algebra

The location of a point P in 3D Euclidean space is given by a triple (px, py, pz).

The x, y, and z coordinates specify the distance you must travel in directions parallel to the x, y, and z axes starting from the origin (0, 0, 0) to arrive at the point.

A vector in 3D space is sometimes called a directed distance because it represents both

a direction and
a magnitude or distance

In this context the triple (px, py, pz) can also be considered to represent the direction from the origin (0, 0, 0) to (px, py, pz), and its length sqrt(px^2 + py^2 + pz^2) is the Euclidean (straight line) distance from the origin to (px, py, pz).

Given two points P and Q in 3D Euclidean space, the vector

R = P - Q = (px - qx, py - qy, pz - qz)

represents the direction from Q to P, and its length as defined above is the distance between P and Q. Note that the direction is a signed quantity. The direction from P to Q is the negative of the direction from Q to P. However, the distance from P to Q is always the same as the distance from Q to P.

Example: Let V = (8, 6, 5) and P = (3, 2, 0).

Then the vector direction from V to P is: (3 - 8, 2 - 6, 0 - 5) = (-5, -4, -5)
The vector direction from P to V is: (5, 4, 5)
The distance between V and P is: sqrt(25 + 16 + 25) = sqrt(66) = 8.12.

The geometric interpretation of vector arithmetic

Here we work with 2 dimensional vectors to simplify the visual interpretation, but in 3d the principles are the same.

P = (5, 1) => +5 in the x direction and then +1 in the y direction

Q = (2, 4) => +2 in the x direction and +4 in the y direction.

R = P + Q = (7, 5)
P = R - Q

Useful operations on vectors:

We can define the sum of two vectors P and Q as follows:

R = P + Q = (px + qx, py + qy, pz + qz)
(3, 4, 5) + (1, 2, 6) = (4, 6, 11)

The difference of two vectors is computed as:

R = P - Q = (px - qx, py - qy, pz - qz)
(3, 4, 5) - (1, 2, 6) = (2, 2, -1)

We also define multiplication (or scaling) of a vector by a scalar number a:

S = aP = (a px, a py, a pz)
3 * (1, 2, 3) = (3, 6, 9)

The length of a vector P is a scalar whose value is denoted:

|| P || = sqrt(px^2 + py^2 + pz^2)

|| (3, 4, 5) || = sqrt(9 + 16 + 25) = sqrt(50) = 7.071

A unit vector is a vector whose length is 1. Therefore an arbitrary vector P may be converted to a unit vector by scaling it by 1 / its own length. Here U is a unit vector in the same direction as P.

U = ( 1 / || P || ) P

The inner product or dot product of two vectors P and Q is a scalar number

x = P dot Q = (px qx + py qy + pz qz)
(2, 3, 4) dot (3, 2, 1) = 6 + 6 + 4 = 16

Thus || P || = sqrt(P dot P)

If U and V are unit vectors and θ is the angle between them then:

cos(θ) = U dot V = V dot U
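The sum, difference, scaling, and dot product operations defined above translate almost directly into C. The following is a minimal sketch using hypothetical names (sum3, diff3, scale3, dot3); it is not the veclib3d interface, and the argument order is an assumption chosen to mirror the formulas.

```c
/* R = P + Q */
void sum3(const double *p, const double *q, double *r)
{
   r[0] = p[0] + q[0];
   r[1] = p[1] + q[1];
   r[2] = p[2] + q[2];
}

/* R = P - Q */
void diff3(const double *p, const double *q, double *r)
{
   r[0] = p[0] - q[0];
   r[1] = p[1] - q[1];
   r[2] = p[2] - q[2];
}

/* S = aP */
void scale3(double a, const double *p, double *s)
{
   s[0] = a * p[0];
   s[1] = a * p[1];
   s[2] = a * p[2];
}

/* x = P dot Q */
double dot3(const double *p, const double *q)
{
   return p[0] * q[0] + p[1] * q[1] + p[2] * q[2];
}
```

Each output component depends only on the input components with the same index, so these versions still give the right answer when the output array is the same array as one of the inputs.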


A library for 3D vector operations

Since the above operations will be commonly required in the raytracer, you will build a library of functions which we will call veclib3d.c to perform them. Here are the function prototypes that must be employed.

/* Return the inner product of two input vectors */
double vl_dot3(
double *v1,      /* Input vector 1 */
double *v2);     /* Input vector 2 */

/* Scale a 3d vector */
void vl_scale3(
double fact,     /* Scale factor   */
double *v1,      /* Input vector   */
double *v2);     /* Output vector  */

/* Return length of a 3d vector */
double vl_length3(
double *v1);     /* Vector whose length is desired */

/* Compute the difference of two vectors */
/* v3 = v2 - v1                          */
void vl_diff3(
double *v1,      /* subtrahend */
double *v2,      /* minuend    */
double *v3);     /* result     */

/* Compute the sum of two vectors */
void vl_sum3(
double *v1,      /* addend */
double *v2,      /* addend */
double *v3);     /* result */

/* Construct a unit vector in direction of input */
void vl_unitvec3(
double *v1,      /* Input vector  */
double *v2);     /* Output vector */

/* Print a label and the contents of a vector */
void vl_vecprn3(
char *label,
double *vec);

Warning regarding aliased parameters

When parameters are passed using pointers a potentially destructive phenomenon known as aliasing may occur. Here the caller of vl_unitvec3() is requesting that a vector be converted to a unit vector in place.

vl_unitvec3(v1, v1);

Now suppose the implementation of vl_unitvec3() is as follows:

void vl_unitvec3(
double *vin,
double *vout)
{
   *(vout + 0) = *(vin + 0) / vl_length3(vin);
   *(vout + 1) = *(vin + 1) / vl_length3(vin);
   *(vout + 2) = *(vin + 2) / vl_length3(vin);
}

This looks correct and (assuming vl_length3() is working properly) it will work correctly as long as the parameters vin and vout point to different locations in memory. However, if they point to the same location in memory an incorrect computation will result. If vin and vout point to the same location the assignment

*(vout + 0) = *(vin + 0) / vl_length3(vin);

also changes *(vin + 0). Therefore, in the second step of the computation

*(vout + 1) = *(vin + 1) / vl_length3(vin);

vl_length3() will return a different value than in the first step.

The function can be written correctly (and more efficiently) as:

void vl_unitvec3(
double *vin,
double *vout)
{
   double scale = 1.0 / vl_length3(vin);
   vl_scale3(scale, vin, vout);
}

ALL veclib3d functions should work correctly with aliased parameters.


A sample test driver for veclib3d.c

/* p33.c */

#include <stdio.h>
#include <math.h>
#include "veclib3d.h"

double v1[3] = {3.0, 4.0, 5.0};
double v2[3] = {4.0, -1.0, 2.0};

int main(void)
{
   double v3[3];
   double v4[3];
   double v;

   vl_vecprn3("v1", v1);
   vl_vecprn3("v2", v2);
   vl_diff3(v1, v2, v3);
   vl_vecprn3("v2 - v1 = ", v3);

   v = vl_dot3(v1, v2);
   printf("v1 dot v2 is %8.3lf \n", v);

   v = vl_length3(v1);
   printf("Length of v1 is %8.3lf \n", v);

   vl_scale3(1 / v, v1, v3);
   vl_vecprn3("v1 scaled by its 1/ length:", v3);

   vl_unitvec3(v1, v4);
   vl_vecprn3("unit vector in v1 direction:", v4);
   return 0;
}

v1 3.000 4.000 5.000
v2 4.000 -1.000 2.000
v2 - v1 = 1.000 -5.000 -3.000
v1 dot v2 is 18.000
Length of v1 is 7.071
v1 scaled by its 1/ length: 0.424 0.566 0.707
unit vector in v1 direction: 0.424 0.566 0.707

Structured data types

A more elegant way to deal with the rgb image is to employ structured data types. A Java class is a generalization of a C struct. A structure is an aggregation of basic and structured data elements possibly including pointers, but unlike a class it contains no embedded methods (functions).

A C structure is declared as follows:

struct pix_type
{
   unsigned char r;
   unsigned char g;
   unsigned char b;
};

As with Java it is important to be aware that struct pix_type is the name of a user defined structure type. It is not the name of a variable. To declare an instance (variable) of type struct pix_type use:

struct pix_type pixel;

struct pix_type is the name of the type
pixel is the name of a variable of type struct pix_type

To set or reference components of the pixel use:

pixel.r = 250;    // make Mr. Pixel yellow
pixel.g = 250;
pixel.b = 0;

Pointers to structures:

To declare a pointer to a struct pix_type use:

struct pix_type *pixptr;

Before using the pointer we must always make it point to something:

pixptr = (struct pix_type *)malloc(sizeof(struct pix_type));

To set or reference components of the pix_type to which it points use:

(*pixptr).r = 250;    // make Mr. *pixptr magenta
(*pixptr).g = 0;
(*pixptr).b = 250;

Warning: the C compiler is very picky about the location of the parentheses here.

Perhaps because of the painful nature of the above syntax, an alternative “short hand” notation has evolved for accessing elements of structures through a pointer:

pixptr->r = 0;    // make Mr. pixptr-> cyan
pixptr->g = 250;
pixptr->b = 250;

This shorthand form is almost universally used.

Revisiting the color image

Space for a color image in binary rgb format can be allocated by:

struct pix_type *pixptr;
:
pixptr = (struct pix_type *)malloc(sizeof(struct pix_type) * numrows * numcols);

To read the image data

pixcount = fread(pixptr, sizeof(struct pix_type), numrows * numcols, stdin);

if (pixcount != numrows * numcols)
{
   fprintf(stderr, "pix count err - wanted %d got %d \n",
           numrows * numcols, pixcount);
   exit(1);
}

To access a specific pixel

red   = (*(pixptr + row * numcols + col)).r;
green = (*(pixptr + row * numcols + col)).g;
blue  = (*(pixptr + row * numcols + col)).b;

or

red   = (pixptr + row * numcols + col)->r;
green = (pixptr + row * numcols + col)->g;
blue  = (pixptr + row * numcols + col)->b;

or even (but not in class assignments)

red = pixptr[row * numcols + col].r;
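The access patterns above can be wrapped in helpers so that the offset arithmetic appears in only one place. This is a sketch; the function names pix_at() and pix_set() are hypothetical.

```c
#include <stddef.h>

struct pix_type
{
   unsigned char r;
   unsigned char g;
   unsigned char b;
};

/* Hypothetical helper: return a pointer to the pixel at (row, col) */
struct pix_type *pix_at(struct pix_type *pixptr, size_t numcols,
                        size_t row, size_t col)
{
   /* Pointer arithmetic is in units of sizeof(struct pix_type), */
   /* so no explicit multiplication by 3 is needed               */
   return pixptr + row * numcols + col;
}

/* Set the pixel at (row, col) to the given color */
void pix_set(struct pix_type *pixptr, size_t numcols,
             size_t row, size_t col,
             unsigned char r, unsigned char g, unsigned char b)
{
   struct pix_type *p = pix_at(pixptr, numcols, row, col);

   p->r = r;
   p->g = g;
   p->b = b;
}
```

Compare this with the unsigned char version of the rgb image, where every offset had to be multiplied by 3 by hand.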


Structures and Arrays

An element of a structure may be an array:

struct img_type
{
   int numrows;
   int numcols;
   unsigned char pixval[1024][768];
};

struct img_type image;

Elements of the array are accessed in the usual way:

image.pixval[5][10] = 125;

It is also possible to create an array of structures

struct pixel_type
{
   unsigned char r;
   unsigned char g;
   unsigned char b;
};

struct pixel_type pixmap[100];

To access an individual element of the array place the subscript next to the name it indexes

pixmap[15].r = 250;

It is also possible to create an array of structures containing arrays

struct img_type images[20];

Elements of the array are accessed in the usual way:

images[4].pixval[5][10] = 125;

Structures containing structures

It is common for structures to contain elements which are themselves structures or arrays of structures. In these cases the structure definitions should appear in “inside-out” order.

This is done to comply with the usual rule of not referencing a name before it is defined.

struct pixel_type
{
   unsigned char r;
   unsigned char g;
   unsigned char b;
};

struct img_type
{
   int numrows;
   int numcols;
   struct pixel_type pixdata[1024][768];
};

struct img_type image;

image.pixdata[4][14].r = 222;

Ray tracing introduction

The objective of a ray tracing program is to render a photorealistic image of a virtual scene in 3 dimensional space. There are three major elements involved in the process:

1 The viewpoint
This is the location in 3d space at which the viewer of the scene is located.

2 The screen
This defines a virtual window through which the viewer observes the scene. The window can be viewed as a discrete 2D pixel array (pixmap). The raytracing procedure computes the color of each pixel. When all pixels have been computed, the pixmap is written out as a .ppm file.

3 The scene
The scene consists of objects and light sources.

[Diagram: the viewpt lies on one side of the screen; the objects (obj) and lights lie on the other side.]

Two coordinate systems will be involved and it will be necessary to map between them:

1 Window coordinates: the coordinates of individual pixels in the window. These are two dimensional (x, y) integer numbers. For example, if a 400 cols x 300 rows image is being created the window x coordinates range from 0 to 399 and the window y coordinates range from 0 to 299.

2 World coordinates: the “natural” coordinates of the scene measured in feet/meters etc. Since world coordinates describe the entire scene these coordinates are three dimensional (x, y, z) floating point numbers.

For the sake of simplicity we will assume that

the screen lies in the z = 0.0 plane
the center of the window has world coordinates (0.0, 0.0, 0.0)
the lower left corner of the window has window (pixel) coordinates (0, 0)
the location of the viewpt has a positive z coordinate
all objects have negative z coordinates.

The projection data structure

The typedef facility can be used to create a new identifier for a user defined type. The following example creates a new type name, proj_t, which is 100% equivalent to struct projection_type. You may either use or not use typedef as you see fit.

A structure of the following type can be used to hold the view point and coordinate mapping data that defines the projection onto the virtual window:

typedef struct projection_type
{
   int    win_size_pixel[2];    /* Projection screen size in pix */
   double win_size_world[2];    /* Screen size in world coords   */
   double view_point[3];        /* Viewpt Loc in world coords    */
} proj_t;

To map a pixel coordinate to a world coordinate a function such as the following could be used:

void map_pix_to_world(
proj_t *proj,      /* projection definition */
int    x,          /* x and y pixel         */
int    y,          /* coordinates           */
double *world)     /* pointer to 3 doubles  */
{
   *(world + 0) = (double)x / (proj->win_size_pixel[0] - 1) *
                  proj->win_size_world[0];
   *(world + 0) -= proj->win_size_world[0] / 2.0;
   *(world + 1) = ???;
   *(world + 2) = 0.0;
}

Example:

Suppose win_size_pixel[0] = 800 pixels
Suppose win_size_world[0] = 20 units

Then

the world x coordinate of the pixel with x pixel coordinate 400 is approximately 0,
the world x coordinate of the pixel with x pixel coordinate 0 is -10.0,
the world x coordinate of the pixel with x pixel coordinate 799 is 10.0.
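The numbers in this example can be reproduced with a stand-alone version of the x mapping. This is a sketch only; the function name pix_to_world_x() is hypothetical, and it takes the two sizes as plain parameters instead of a proj_t so that it can be compiled by itself.

```c
/* Hypothetical helper: world x coordinate of the pixel whose    */
/* x pixel coordinate is x, for a window that is size_pixel      */
/* pixels wide and size_world world units wide.                  */
double pix_to_world_x(int x, int size_pixel, double size_world)
{
   double w;

   /* Map 0 .. size_pixel - 1 onto 0.0 .. size_world */
   w = (double)x / (size_pixel - 1) * size_world;

   /* Shift so that the center of the window is at 0.0 */
   w -= size_world / 2.0;

   return w;
}
```

With size_pixel = 800 and size_world = 20.0, pixel 0 maps to -10.0, pixel 799 maps to 10.0, and pixel 400 maps to roughly 0.0125.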


The raytracing algorithm

The complete raytracing algorithm is summarized below:

Phase 1: Initialization

acquire window pixel dimensions from command line
read world coordinate dimensions of the window from the stdin
read world coordinates of the view point from stdin
print projection data to the stderr

load object and light descriptions from the stdin
dump object and light descriptions to the stderr

Phase 2: The raytracing procedure for building the pixmap

for each pixel in the window
{
   compute the world coordinates of the pixel
   compute the direction in 3d space of a ray from the viewpt through the pixel
   compute the color of the pixel based upon the illumination of the object(s) hit by the ray
}

Phase 3: Writing out the pixmap as a .ppm file

write .ppm header to stdout
write the image to stdout

Data structures: the big picture

WARNING: Some elements of the definitions have been abbreviated and/or assume the use of the typedef construct. See the examples on other pages for these details.

struct model_type
{
   proj_t *proj;
   list_t *lights;
   list_t *scene;
};

struct projection_type
{
   int    win_size_pixel[2];
   double win_size_world[2];
   double view_point[3];
};

struct list_type
{
   obj_t *head;
   obj_t *tail;
};

struct object_type
{
   obj_t *next;
   void  *priv;
};

[Diagram: each list holds a chain of object_type structures linked through next; the priv pointer of each object refers to its type-specific data.]


List management functions

The characteristics of the object lists used by the raytracer include the following:

1 newly created objects are always added to the end of the list
2 objects are never deleted from the list
3 lists are always processed sequentially from beginning to end

Because of these restrictions a singly linked list suffices nicely, and a list may be associated with a list header structure of the type shown below.

typedef struct list_type
{
   obj_t *head;    /* pointer to first object in list */
   obj_t *tail;    /* pointer to last object in list  */
} list_t;

The list management module requires only two functions. The list_init() function is used to create a new list. Its mission is to:

1 malloc() a new list_t structure.
2 set the head and tail elements of the structure to NULL.
3 return a pointer to the list_t to the caller.

list_t *list_init(void)
{

}

The list_add() function must add the object structure pointed to by new to the list structure pointed to by list. Two cases must be distinguished:

1 the list is empty (list->head == NULL)
2 the list is not empty

void list_add(
list_t *list,
obj_t  *new)
{

}
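One possible way to fill in the two function bodies is sketched below. It is an illustration under stated assumptions, not the required solution: the obj_t shown here is a minimal stand-in that carries only the next pointer, and returning NULL when malloc() fails is a choice made for this sketch.

```c
#include <stdlib.h>

/* Minimal stand-in for the real obj_t; only the link field is shown */
typedef struct obj_type
{
   struct obj_type *next;
} obj_t;

typedef struct list_type
{
   obj_t *head;    /* pointer to first object in list */
   obj_t *tail;    /* pointer to last object in list  */
} list_t;

list_t *list_init(void)
{
   list_t *list = (list_t *)malloc(sizeof(list_t));

   if (list == NULL)
      return NULL;

   list->head = NULL;
   list->tail = NULL;
   return list;
}

void list_add(list_t *list, obj_t *new)
{
   new->next = NULL;

   if (list->head == NULL)
      list->head = new;          /* case 1: the list is empty     */
   else
      list->tail->next = new;    /* case 2: append after old tail */

   list->tail = new;             /* new object is always the tail */
}
```

Because the header keeps a tail pointer, adding to the end of the list never requires walking the list.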

The model data structure

This structure is a container used to reduce the number of parameters that must be passed through the raytracing system.

typedef struct model_type
{
   proj_t *proj;
   list_t *lights;
   list_t *scene;
} model_t;

The main function

A properly designed and constructed program is necessarily modular in nature. Modularity is somewhat automatically enforced in OO languages, but new C programmers often revert to an ugly pack-it-all-into-one-main-function approach.

To discourage this in the raytracing program, deductions will be made for:

1 Functions that are too long (greater than 30 lines)
2 Nesting of code greater than 2 deep
3 Lines that are too long (greater than 72 characters)

Here is the main function for the final version of the ray tracer.

int main(
int  argc,
char **argv)
{
   model_t *model = (model_t *)malloc(sizeof(model_t));
   int rc;

   model->proj = projection_init(argc, argv, stdin);
   projection_dump(stderr, model->proj);

   model->lights = list_init();
   model->scene  = list_init();

   rc = model_init(stdin, model);
   model_dump(stderr, model);

   if (rc == 0)
      make_image(model);

   return(0);
}

The generic object structure

Even though C is technically not an Object Oriented language it is possible to employ mechanisms that emulate both the inheritance and polymorphism found in true Object Oriented languages.

The obj_t structure serves as the generic “base class” from which the esoteric objects such as planes and spheres are derived. As such, it carries only the attributes that are common to the derived objects.

Polymorphic behavior is achieved by the use of function pointers embedded in the obj_t. These can be initialized to point to functions that provide a default behavior but may be overridden as needed when an esoteric object such as a tiled plane must substitute its own method. Fields shown in blue are required for the first version of the ray tracer.

typedef struct obj_type
{
   struct obj_type *next;     /* Next object in list         */
   int    objid;              /* Numeric serial # for debug  */
   int    objtype;            /* Type code (14 -> Plane)     */
   double (*hits)(double *base, double *dir, struct obj_type *);
                              /* Hits function.              */

/* Optional plugins for retrieval of reflectivity */
/* useful for the ever-popular tiled floor        */
   void   (*getamb)(struct obj_type *, double *);
   void   (*getdiff)(struct obj_type *, double *);
   void   (*getspec)(struct obj_type *, double *);

/* Reflectivity for reflective objects */
   material_t material;

/* These fields used only in illuminating objects (lights) */
   void   (*getemiss)(struct obj_type *, double *);
   double emissivity[3];      /* For lights                  */
   void   *priv;              /* Private type-dependent data */

   double hitloc[3];          /* Last hit point              */
   double normal[3];          /* Normal at hit point         */
} obj_t;
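The way function pointers provide a default behavior that can be overridden is sketched below. The structure demo_obj_t is a deliberately simplified stand-in for obj_t, and the functions default_getamb(), red_getamb(), and query_ambient() are hypothetical names used only for this illustration.

```c
/* Simplified stand-in for obj_t: one color and one plugin */
typedef struct demo_obj_type
{
   double ambient[3];
   void (*getamb)(struct demo_obj_type *, double *);
} demo_obj_t;

/* Default behavior: report the stored ambient reflectivity */
void default_getamb(demo_obj_t *obj, double *out)
{
   out[0] = obj->ambient[0];
   out[1] = obj->ambient[1];
   out[2] = obj->ambient[2];
}

/* Override: an object that always reports pure red */
void red_getamb(demo_obj_t *obj, double *out)
{
   (void)obj;        /* stored color is ignored */
   out[0] = 5.0;
   out[1] = 0.0;
   out[2] = 0.0;
}

/* The caller does not need to know which function is installed */
void query_ambient(demo_obj_t *obj, double *out)
{
   obj->getamb(obj, out);
}
```

Installing a different function in the getamb field changes what query_ambient() returns without changing the caller, which is the behavior a tiled plane would rely on.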


Declaration of derived object types

The esoteric characteristics of specific object types must be carried by structures that are specific to the object type being described. The priv pointer of the base class obj_t is used to connect the base class instance to the derived class instance. This connection is automatic and invisible in a true OO language but is manual and visible in C.

/* Infinite plane */
typedef struct plane_type
{
   double point[3];     /* A point on the plane           */
   double normal[3];    /* A normal vector to the plane   */
} plane_t;

/* Sphere */
typedef struct sphere_type
{
   double center[3];
   double radius;
} sphere_t;

/* Point light source */
typedef struct light_type
{
   double location[3];
} light_t;

Note: The term normal vector is used to refer to a vector perpendicular to the surface of an object.

Loading the model description

The raytracer must be able to read model descriptions of the format shown below. This format is designed for easy digestion. All numeric values are readable with scanf(). After reading the required values from each line fgets(buff, 256, stdin); should be called to consume the descriptive text.

Model definitions will begin with the projection data as previously described.

Following the view point will be an arbitrary and unknown number of object descriptions. Each object description will begin with an object type (10 = light, 13 = sphere, 14 = plane; other new types will follow). The remainder of the parameters will be dependent upon the type of object being loaded. Therefore you must create a separate object loader (and dumper) for each object type. Here you will need routines plane_init(), plane_dump(), sphere_init(), sphere_dump().

8 6          world x and y dims
0 0 3        viewpoint (x, y, z)

14           plane
5 5 2        r g b ambient
0 0 0        r g b diffuse
0 0 0        r g b specular

1 0 1        normal
-4 -1 0      point

14           plane
5 2 5        r g b ambient
0 0 0        r g b diffuse
0 0 0        r g b specular

-1 0 1       normal
4 -1 0       point

13           sphere
2 5 2        r g b ambient
0 0 0        r g b diffuse
0 0 0        r g b specular

0 1 -3       center
1.5          radius

The model_init function

The model_init() function should consist of a single loop that reads an object type code from the standard input and then invokes an object type specific function to read in the data describing the sphere, plane, or light. This function should abort the program by calling exit if errors are encountered in the input.

Associating numeric identifiers with symbolic names

When numeric identifiers are used in a C program they should always be equated to a symbolic name and only the symbolic name should be used in executable code.

/* Object types */

#define FIRST_TYPE  10
#define LIGHT       10
#define SPOTLIGHT   11
#define PROJECTOR   12
#define SPHERE      13
#define PLANE       14

int model_init(
FILE    *in,
model_t *model)
{
   char  buf[256];
   int   objtype;
   int   rc = 0;
   obj_t *new;

/* now load the objects in the scene */

   while (fscanf(in, "%d", &objtype) == 1)
   {
      fgets(buf, 256, in);    /* consume rest of line */
      switch (objtype)
      {
      case SPHERE:
         new = sphere_init(in, objtype);
         break;
      case PLANE:
         new = plane_init(in, objtype);
         break;
       :
      default:
         fprintf(stderr, "Bad code %d \n", objtype);
         exit(1);
      }

      if (new == NULL)
      {
         fprintf(stderr, "failed to load type %d \n", objtype);
         model_dump(stderr, model);
         exit(2);
      }
      else
      {
         list_add(model->scene, new);
      }
   }
   return(rc);
}

It will be shown later that it is possible to replace this rather messy construction with a table of function pointers.


Object creation and initialization

In true object oriented languages instances of derived classes are “automagically” bound to an instance of the base class at the time the object is created. In C it will be necessary to manually invoke a constructor for the derived class. The constructor for the sphere_t is:

obj_t *sphere_init(FILE *in, int objtype);

Derived class constructors must:

1. Explicitly invoke the constructor for the obj_t base class.
2. malloc() an instance of the structure describing the derived class.
3. Fill in the attributes of the instance of the derived class.
4. Fill in required function pointers in the obj_t structure.
5. Link the obj_t structure to the derived class structure using the obj->priv pointer in the obj_t.

The sphere_init(), plane_init() functions are responsible for creating the required structures and reading in attribute data from the model definition file.

/**/
obj_t *sphere_init(FILE *in, int objtype)
{
   obj_t *obj;
   sphere_t *new;
   int pcount = 0;

All object-specific loaders begin by creating the generic object type.

   obj = object_init(in, objtype);

   malloc a sphere_t structure
   link it to the obj_t structure
   read the location of the center and the radius into the sphere_t

}


Initialization of the obj_t

The obj_t constructor object_init() is responsible for allocating an instance of the obj_t and initializing it. The reflective properties of visible objects in the scene are carried in the material_t structure that is embedded within the obj_t.

The material_t structures carry the red, green, and blue reflectivity of the object to ambient, diffuse and specular light. Larger values make the object brighter. An ambient reflectivity of (5, 0, 0) makes the object appear as red, while (5, 5, 0) is yellow, and (5, 5, 5) white. For the first milestone only the ambient reflectivity will be used but we will go ahead and read in all components.

typedef struct material_type
{
   double ambient[3];     /* Reflectivity for materials */
   double diffuse[3];
   double specular[3];
} material_t;

obj_t *object_init(FILE *in, int objtype)
{
   obj_t *obj;

   malloc a structure of type obj_t
   fill in objtype and objid fields
   if (the object is not a light)
   {
      call material_load to read ambient, diffuse, specular reflectivity
   }
   return(obj);
}
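The notes name material_load() but do not show it, so the following is only a minimal sketch. The signature, the return convention (0 on success, -1 on failure), and the helper read_triple() are assumptions made for illustration. Each line of the model file carries three numbers followed by free-text commentary, which is discarded with fgets() in the same way model_init() consumes the rest of the type line.

```c
#include <stdio.h>

typedef struct material_type
{
   double ambient[3];     /* Reflectivity for materials */
   double diffuse[3];
   double specular[3];
} material_t;

/* Hypothetical helper: read three doubles, then discard the rest of the line */
static int read_triple(FILE *in, double v[3])
{
   char buf[256];

   if (fscanf(in, "%lf %lf %lf", &v[0], &v[1], &v[2]) != 3)
      return(-1);
   fgets(buf, sizeof(buf), in);       /* consume trailing comment text */
   return(0);
}

/* Assumed signature: returns 0 on success, -1 if any triple is malformed */
int material_load(FILE *in, material_t *mat)
{
   if (read_triple(in, mat->ambient) != 0)
      return(-1);
   if (read_triple(in, mat->diffuse) != 0)
      return(-1);
   if (read_triple(in, mat->specular) != 0)
      return(-1);
   return(0);
}
```

With this convention object_init() could treat a nonzero return as a load failure and hand NULL back to model_init().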


Object dumpers

For each object type you must also provide an object dumper that provides a reasonably formatted report describing the input data. The following example is acceptable:

Dumping object of type Plane
Material data -
Ambient  -    5.000    5.000    2.000
Diffuse  -    0.000    0.000    0.000
Specular -    0.000    0.000    0.000

Plane data
normal -    1.000    0.000    1.000
point  -   -4.000   -1.000    0.000

The recommended form of an object dumper is shown below. All should rely on a common material_dump() function rather than each embedding its own material dumper.

int plane_dump(FILE *out, obj_t *obj)
{
   plane_t *plane;

   material_dump(out, &obj->material);

   plane = (plane_t *)obj->priv;

   print plane specific data
}


Ray tracer design (continued)

Now we are finally ready to build an image. This will be a very crude image because it will support ambient lighting only. Nevertheless this is a significant milestone because the 3D geometry problem must be addressed.

Overview of the make_image() function

The make_image() function should live in a separate module named image.c

void make_image(model_t *model)
{
   unsigned char *pixmap;

   compute size of output image and malloc() pixmap.

   for y = 0 to window size in pixels
   {
      for x = 0 to window size in pixels
      {
         make_pixel(model, x, y, pixmap_location);
      }
   }
   write .ppm P6 header
   write pixmap
}
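The last two steps of make_image() can be sketched as follows. The function name write_ppm() and its parameters are assumptions, not part of the notes. The sketch relies on the standard binary PPM layout: the magic number P6, the width and height, the maximum sample value 255, and then one (r, g, b) byte triple per pixel.

```c
#include <stdio.h>

/* Hypothetical helper: write a P6 header followed by the raw pixmap. */
/* Returns 0 on success and -1 if either write fails.                 */
int write_ppm(FILE *out, const unsigned char *pixmap, int cols, int rows)
{
   size_t count = (size_t)cols * (size_t)rows * 3;   /* 3 bytes per pixel */

   if (fprintf(out, "P6\n%d %d\n255\n", cols, rows) < 0)
      return(-1);
   if (fwrite(pixmap, 1, count, out) != count)
      return(-1);
   return(0);
}
```

Since the example run later in these notes redirects the image with >, make_image() would presumably pass stdout as the output stream.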


The make_pixel function

This function is responsible for driving the construction of the (r, g, b) components of a single pixel. Within the ray tracing process pixel colors are represented as

1. double precision values in the range [0.0, 1.0] where
2. 0.0 represents black and 1.0 the brightest level of the corresponding color.

However, depending upon input values it is possible for the raytracing algorithm to compute intensities that exceed 1.0. When this happens this module must clamp them back to the allowable range [0.0, 1.0].

void make_pixel(
model_t *model,
int x,                    /* Pixel x coord */
int y,                    /* Pixel y coord */
unsigned char *pixval)    /* -> to (r, g, b) in pixmap */
{
   double *world = alloca(3 * sizeof(double));
   double *intensity = alloca(3 * sizeof(double));

   map_pix_to_world(x, y, world);

   initialize intensity to (0.0, 0.0, 0.0)

   compute unit vector dir in the direction from the view_point to world;

   ray_trace(model, model->proj->view_point, dir, intensity, 0.0, NULL);

   clamp each element of intensity to the range [0.0, 1.0]

   set (r, g, b) components of vector pointed to by pixval to 255 * corresponding intensity

}
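The final two steps of make_pixel() might be written as below. The helper names clamp01() and store_pixel() are hypothetical. The conversion truncates 255 * intensity toward zero, which is one reasonable reading of the pseudocode; rounding to nearest would be an equally defensible choice.

```c
/* Clamp one intensity component to the allowable range [0.0, 1.0] */
double clamp01(double v)
{
   if (v < 0.0)
      return(0.0);
   if (v > 1.0)
      return(1.0);
   return(v);
}

/* Store the clamped intensity as three bytes (r, g, b) at pixval */
void store_pixel(const double intensity[3], unsigned char *pixval)
{
   int i;

   for (i = 0; i < 3; i++)
      pixval[i] = (unsigned char)(255.0 * clamp01(intensity[i]));
}
```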


The ray_trace function

The ray_trace function is responsible for tracing a single ray. It should reside in ray.c

/**/
/* This function traces a single ray and returns the composite */
/* intensity of the light it encounters. It is recursive and   */
/* so the start of the ray cannot be assumed to be the viewpt. */
/* Recursion won't be involved until we take on specular light */

void ray_trace(
model_t *model,         /* pointer to model container          */
double base[3],         /* location of viewer or previous hit  */
double dir[3],          /* unit vector in direction of object  */
double intensity[3],    /* intensity return location           */
double total_dist,      /* distance ray has traveled so far    */
obj_t *last_hit)        /* obj that reflected this ray or NULL */
{

The ray trace function should rely upon find_closest_object to identify the nearest object that is hit by the ray. If none of the objects in the scene is hit, NULL is returned. The distance from the base of the ray to the nearest hitpoint is returned in mindist.

   closest = find_closest_obj(model->scene, base, dir, NULL, &mindist);

   if (closest == NULL)
      return;

   add mindist to total_dist
   set intensity to the ambient reflectivity of closest
   divide intensity by total_dist

}
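The three pseudocode steps at the end of ray_trace() reduce to a per-component division. The function below is a sketch with a hypothetical name, and it takes the ambient reflectivity and the already updated total distance as plain arguments rather than digging them out of the obj_t.

```c
/* Hypothetical helper: intensity = ambient reflectivity / distance traveled. */
/* total_dist must already include mindist and is assumed to be positive.     */
void ambient_intensity(const double ambient[3], double total_dist,
                       double intensity[3])
{
   int i;

   for (i = 0; i < 3; i++)
      intensity[i] = ambient[i] / total_dist;
}
```

For an ambient reflectivity of (5, 5, 0) and a total distance of 8.0 this gives (0.625, 0.625, 0.0), which agrees with the center pixel of the sample debug output shown later in these notes once it is printed to one decimal place.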


Hit functions

Given the viewpoint, ray direction and a pointer to an obj_t the mission of a hit function is to determine if the ray hits the object. If it does, the hit point and the normal vector at the hitpoint should be returned.

Given V, D and an object structure O the mission of a hit function is to determine if a ray based at V traveling in direction D hits O.

All points on the ray may be expressed as $V + tD$ for $-\infty < t < \infty$.

Ray direction: $D = (P - V) / \|P - V\|$
Distance to hit point: $t_h = \|H - V\|$
Location of hit point: $H = V + t_h D$

[Figure: a ray from the viewpoint V through pixel P on the screen to the hit point H]

Prototype for the hits functions

To determine if a ray hits an object you must add a hits_objtype function to your sphere.c, plane.c, etc. modules. These modules should also contain the loading and dumping code for the specific object type. A sample prototype is shown below:

double hits_plane(
double *base,    /* the (x, y, z) coords of origin of the ray */
double *dir,     /* the (x, y, z) direction of the ray        */
obj_t *obj);     /* the object to be tested for the hit.      */

A pointer to the hits function should be stored in the object structure at the time it is created:

obj->hits = hits_plane;


Determining if a ray hits a plane

This basic strategy will be used in all hits functions:

0 Assume that V represents the start of the ray and D is a unit vector in its direction.
1 Derive an equation for an arbitrary point P on the surface of the object.
2 Recall that all points on the ray are expressed as V + tD.
3 Substitute V + tD for P in the equation derived in (1).
4 Attempt to solve the equation for t.
5 If a solution th can be found, then H = V + th D.

A plane in three dimensional space is defined by two parameters:

A normal vector $N = (n_x, n_y, n_z)$
A point $Q = (q_x, q_y, q_z)$ through which the plane passes.

A point $P = (p_x, p_y, p_z)$ is on the plane if and only if:

$$N \cdot (P - Q) = 0$$

because, if the two points P, Q lie in the plane, then the vector from one to the other, $P - Q$, also lies in the plane and is necessarily perpendicular to the plane's normal.

We can rearrange this expression to get:

$$N \cdot P - N \cdot Q = 0$$
$$N \cdot P = N \cdot Q \qquad (1)$$

Recall that the location of any point on a ray based at V with direction D is given by:

$$V + tD$$

Therefore we may replace the P in equation (1) by V + tD and get:

$$N \cdot (V + tD) = N \cdot Q \qquad (2)$$


Some algebraic simplification allows us to solve this for t:

$$N \cdot (V + tD) = N \cdot Q \qquad (2)$$
$$N \cdot V + N \cdot tD = N \cdot Q$$
$$N \cdot tD = N \cdot Q - N \cdot V$$
$$t\,(N \cdot D) = N \cdot Q - N \cdot V$$
$$t_h = (N \cdot Q - N \cdot V) / (N \cdot D)$$

The location of the hitpoint that should be stored in the obj_t is thus:

$$H = V + t_h D$$

The normal at the hitpoint, which must also be saved in the obj_t, is just N.

Unlike quadric surfaces, a plane is intercepted by a ray at only a single point. Therefore, unlike the equations we will see later, this one is not quadratic. There are some special cases we must consider:

(1) $N \cdot D = 0$. In this case the direction of the ray is perpendicular to the normal to the plane. This means the ray is parallel to the plane. Either the ray lies in the plane or misses the plane entirely. We will always consider this case a miss and return -1. Attempting to divide by 0 will cause your program to either fault and die or return a meaningless value.

(2) $t_h < 0$. In this case the hit lies behind the viewpoint rather than in the direction of the screen. This should also be considered a miss and -1 should be returned.

(3) The hit lies on the view point side of the screen. If $H = (h_x, h_y, h_z)$ and $h_z > 0$, the hit is on the wrong side and -1 should be returned.
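The derivation and the three special cases can be collected into a single routine. This is a sketch only: the name plane_hit_dist() and its flat argument list are assumptions, whereas the real hits_plane() receives an obj_t pointer and stores the hit point in that structure. The exact comparison with 0.0 follows the text; a small tolerance might be preferable in practice.

```c
/* Dot product of two 3-vectors */
static double dot3(const double a[3], const double b[3])
{
   return(a[0] * b[0] + a[1] * b[1] + a[2] * b[2]);
}

/* Hypothetical helper: returns the distance th to the hit, or -1.0 on a miss. */
/* On a hit, hit[] receives H = V + th D.                                      */
double plane_hit_dist(const double base[3], const double dir[3],
                      const double normal[3], const double point[3],
                      double hit[3])
{
   double ndotd = dot3(normal, dir);
   double t;
   int i;

   if (ndotd == 0.0)                    /* case 1: ray parallel to plane */
      return(-1.0);

   t = (dot3(normal, point) - dot3(normal, base)) / ndotd;
   if (t < 0.0)                         /* case 2: hit behind the viewpoint */
      return(-1.0);

   for (i = 0; i < 3; i++)
      hit[i] = base[i] + t * dir[i];

   if (hit[2] > 0.0)                    /* case 3: hit on viewer side of screen */
      return(-1.0);

   return(t);
}
```

For a viewpoint at (0, 0, 3), a ray direction of (0, 0, -1), and a plane through (0, 0, -5) with normal (0, 0, -1), the routine returns 8.0, matching the center pixel of the sample debug output later in these notes.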


Determining if a ray hits a sphere.

Assume the following:

V = viewpoint or start of the ray

D = a unit vector in the direction the ray is traveling

C = center of the sphere

r = radius of the sphere.

The arithmetic is much simpler if the center of the sphere is at the origin. So we start by moving it there! To do so we must make a compensating adjustment to the base of the ray.

$$C' = C - C = (0, 0, 0) \quad \text{(new center of sphere)}$$
$$V' = V - C \quad \text{(new base of ray)}$$

D does not change.

A point P on the sphere whose center is (0, 0, 0) necessarily satisfies the following equation:

$$p_x^2 + p_y^2 + p_z^2 = r^2 \qquad (1)$$

All points on the ray may be expressed in the form

$$P = V' + tD = (v'_x + t d_x,\ v'_y + t d_y,\ v'_z + t d_z) \qquad (2)$$

where t is the Euclidean distance from V' to P.

Thus we need to find a value of t which yields a point that satisfies the two equations. To do that we take the (x, y, z) coordinates from equation (2) and plug them into equation (1). We will show that this leads to a quadratic equation in t which can be solved via the quadratic formula.

$$(v'_x + t d_x)^2 + (v'_y + t d_y)^2 + (v'_z + t d_z)^2 = r^2$$


Expanding this expression by squaring the three binomials yields:

$$({v'_x}^2 + 2t v'_x d_x + t^2 d_x^2) + ({v'_y}^2 + 2t v'_y d_y + t^2 d_y^2) + ({v'_z}^2 + 2t v'_z d_z + t^2 d_z^2) = r^2$$

Next we collect the terms associated with common powers of t:

$$({v'_x}^2 + {v'_y}^2 + {v'_z}^2) + 2t\,(v'_x d_x + v'_y d_y + v'_z d_z) + t^2 (d_x^2 + d_y^2 + d_z^2) = r^2$$

Now we reorder terms as decreasing powers of t and note that all three of the parenthesized trinomials represent dot products.

$$(D \cdot D)\,t^2 + 2\,(V' \cdot D)\,t + V' \cdot V' - r^2 = 0$$

We now make the notational changes:

$$a = D \cdot D$$
$$b = 2\,(V' \cdot D)$$
$$c = V' \cdot V' - r^2$$

to obtain the following equation

$$a t^2 + b t + c = 0$$

whose solution is the standard form of the quadratic formula:

$$t_h = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$


Recall that quadratic equations may have 0, 1, or 2 real roots depending upon whether the discriminant

$$b^2 - 4ac$$

is negative, zero, or positive. These three cases have the following physical implications:

negative => ray doesn't hit the sphere
zero => ray is tangent to the sphere, hitting it at one point (we will consider this a miss)
positive => ray does hit the sphere and would pass through its interior (this is the only case we consider a hit)

Furthermore, the two values of t are the distances from the base of the ray to the point(s) of contact with the sphere. We always seek the smaller of the two values since we seek to find the “entry wound” not the “exit wound”.

Therefore, the hits_sphere() function should return

$$t_h = \frac{-b - \sqrt{b^2 - 4ac}}{2a}$$

if the discriminant is positive and

$$t_h = -1$$

otherwise.


Determining the coordinates of the hit point on a sphere.

The hits_sphere() function should also fill in the coordinates of the hit in the obj_t structure.

The (x, y, z) coordinates are computed as follows.

H = V + th D

Important items to note are:

The actual base of the ray V and not the translated base V' must be used

The vector D must be a unit vector in the direction of the ray.

Determining the surface normal at the hit point.

The normal at any point P on the surface of a sphere is a vector from the center to the point. Thus

$$N = P - C$$

(note that N will be a unit vector if and only if r = 1).

Therefore a unit normal may be constructed as follows:

$$N_u = (H - C) / \|H - C\|$$


Functions and parameters

Functions provide a useful way to partition a large program into a collection of small manageable entities.

Each function should:
1 be small (ideally less than 20 lines of code)
2 perform a single, well defined task (e.g., print the attributes of a sphere)
3 be designed so as to minimize information flow and shared data

Proper use of functions
1 speeds development
2 reduces the difficulty of the debugging process
3 facilitates the process of adding new functionality to a program.

A C language function is defined as follows:

return-type function-name(parameter-1-type parameter-1-name,
                          :
                          parameter-n-type parameter-n-name)
{
   local variables;
   executable code;
}

A specific example is:

int adder(int a, int b)
{
   int sum;

   sum = a + b;
   return(sum);
}


Parameter passing

There are many possible mechanisms by which parameters may be passed.

The standard C language uses call-by-value.

In this approach a copy of the parameter is placed on the stack. The called function is free to modify the copy as it wishes, but this will have no effect on the value held by the caller.

/* p23.c */

/* Parameter passing 1 */

#include <stdio.h>

void try_to_mod(int a)
{
   printf("The address of try's a is %p\n", (void *)&a);
   a = 15;
}

int main()
{
   int a = 20;

   printf("The address of main's a is %p\n", (void *)&a);
   try_to_mod(a);
   printf("a = %d \n", a);
   return(0);
}

class/215/examples ==> a.out
The address of main's a is 0xbffff884
The address of try's a is 0xbffff870
a = 20


Allowing functions to modify their parameters

When it is necessary for a function to directly modify a variable or structure that is owned by the calling program, it is necessary to pass a pointer to the entity to the called function.

/* p24.c */
/* Parameter passing 2 */

#include <stdio.h>

void try_to_mod(int *a)
{
   printf("The address of try's a is %p\n", (void *)a);
   *a = 15;
}

int main()
{
   int a = 20;

   printf("The address of main's a is %p\n", (void *)&a);
   try_to_mod(&a);
   printf("a = %d \n", a);
   return(0);
}

The address of main's a is 0xbffff884
The address of try's a is 0xbffff884
a = 15


Passing structures as parameters

Passing of large structures by value should never be done because of the inefficiency of copying large amounts of data onto the stack.

The call shown below requires copying 512K bytes!!!

/* p25.c */

#include <stdio.h>

struct large
{
   double tab[64 * 1024];
} table;

void try_to_mod(struct large t)
{
   t.tab[1000] = 100.0;
}

int main()
{
   table.tab[1000] = 52.0;
   try_to_mod(table);
   printf("%lf \n", table.tab[1000]);
   return(0);
}

Note that the value of table.tab[1000] does not change, demonstrating that try_to_mod() altered a copy.

class/215/examples ==> a.out
52.000000


Passing pointers to structures

The correct way to handle this is with a pointer as before.

/* p25.c */

#include <stdio.h>

struct large
{
   double tab[64 * 1024];
} table;

void try_to_mod(struct large *t)
{
   t->tab[1000] = 100.0;
}

int main()
{
   table.tab[1000] = 52.0;
   try_to_mod(&table);
   printf("%lf \n", table.tab[1000]);
   return(0);
}

class/215/examples ==> a.out
100.000000


Multimodule programs

Large programs should be broken into multiple source modules. Each source module should contain a collection of functions and possibly data structures that work on a particular aspect of the problem at hand.

Scope of function definitions

Functions defined in one source module (p1.c) may be invoked by functions defined in another source module (p2.c), if the object files p1.o and p2.o are combined by the Unix linker ld.

Under most circumstances ld is automagically invoked by gcc whenever an executable program is being constructed.

class/215/examples ==> cat p27.c

/* p27.c */

int adder(int a, int b)
{
   return(a + b);
}

/* p28.c */

#include <stdio.h>

int main()
{
   int sum;

   sum = adder(2, 5);
   printf("sum = %d \n", sum);
   return(0);
}


Building a program consisting of multiple source modules:

One approach is to use gcc with the -c option to build p27.o and p28.o. The files p27.o and p28.o are object files but they are not executable.

class/215/examples ==> gcc -c -g p27.c
class/215/examples ==> gcc -c -g p28.c

Then gcc is used to invoke ld to link p27.o, p28.o and the standard C library files. The file addem is an executable object file.

class/215/examples ==> gcc -o addem -g p27.o p28.o
class/215/examples ==> addem
sum = 7

An alternative approach is to perform compilation and linking all in one step:

class/215/examples ==> gcc -o addem -g p27.c p28.c
class/215/examples ==> addem
sum = 7

The disadvantage of this method is that it is necessary to recompile all of the source files making up the program each time any one source module changes.


Function prototypes and the proper use of .h files

Note that in the above example there is no prototype definition of adder available in p28.c. This is illegal in Java and C++ and will produce a compile-time error. It was legal in C89 (C99 removed implicit function declarations, and current compilers warn about or reject such a call), but it is dangerous. The program works correctly only because:

1 The default type returned by an unknown function is int.
2 The parameters being passed are also assumed to be of the correct type and are never type converted.

When these conditions are not true, radically wrong computations may result:

class/215/examples ==> cat p29.c p30.c

/* p29.c */

double adder(double a, double b)
{
   return(a + b);
}

/* p30.c */

#include <stdio.h>

int main()
{
   double sum;

   sum = adder(2, 5);
   printf("sum = %lf \n", sum);
   sum = adder(2.0, 5.0);
   printf("sum = %lf \n", sum);
   return(0);
}

class/215/examples ==> gcc -g p29.c p30.c
class/215/examples ==> a.out
sum = -1073743736.000000
sum = 0.000000

In the first call both the parameter and the return values are incorrect. In the second call the parameters are correct but the return value is still wrong. Results are badly, but differently, broken in both cases.

Proper use of header files

To correct this problem create a file p29.h

/* p29.h */

double adder(double a,double b);

and include it in p31.c


/* p31.c */

#include <stdio.h>
#include "p29.h"

int main()
{
   double sum;

   sum = adder(2, 5);
   printf("sum = %lf \n", sum);
   sum = adder(2.0, 5.0);
   printf("sum = %lf \n", sum);
   return(0);
}

The availability of the prototype causes:

1 The return value to be properly treated as double rather than the default int.
2 The integer parameters in the first call to be converted to double before the call.

class/215/examples ==> gcc -g p29.c p31.c
class/215/examples ==> a.out
sum = 7.000000
sum = 7.000000


A makefile for a multimodule program:

The Unix make program is a handy utility that can be used to build things ranging from programs to documents. Elements of significance include:

targets: labels that appear in column 1 and are followed by the character “:”. The make command can take a target as an operand, as in make ray.

dependencies: files that are enumerated following the name of the target. If any dependency is newer than the target, the target will be rebuilt.

rules: lines following the target that specify the procedure for building the target. Rules must start with a tab character. In the example below the tab has been expanded as spaces but you may not enter spaces.

The following makefile can be used to build the executable ray tracer named ray (assuming that it requires only the .o files enumerated in the command).

ray: sphere.o light.o model.o veclib3d.o main.o ray.h veclib3d.h
        gcc -o ray -g sphere.o veclib3d.o light.o model.o main.o -lm

.c.o: $<
        -gcc -c -Wall -g $< 2> $(@:.o=.err)
        cat $*.err

The dependency .c.o: is a special dependency called the suffix rule. It is telling make to use the commands that follow whenever it needs to make a .o file from a .c file. There are a number of predefined macro based names:

$@ -- the current target's full name
$? -- a list of the target's changed dependencies
$< -- similar to $? but identifies a single file dependency and is used only in suffix rules
$* -- the target file's name without a suffix

Another handy macro based facility permits one to change suffixes on the fly. The macro $(@:.o=.err) says use the target name but change the .o to .err.

The same effect may be obtained using $*.err as is done in the subsequent cat command.


Using macros in makefiles

Make macros are similar in spirit to Unix environment variables. In fact environment variables can be accessed in make files via macro calls. However, it is typically the case that the macros are defined within the makefile. Here is a makefile that is used to build a complete raytracer. A macro is defined by using the syntax MACRONAME = macro value. Many people use the convention of making names all capital but that is not required.

All of the .o files necessary to build it are defined using the macro name RAYOBJS. The \ character at the end of all but the last line is the standard Unix continuation character. The # character at the start of a line turns the line into a comment.

A macro is invoked using the syntax $(MACRONAME). The result of the invocation is that the string $(MACRONAME) is replaced by the current value of the macro.

RAYOBJS = main.o projection.o list.o model.o \
          object.o plane.o material.o veclib3d.o \
          image.o raytrace.o tplane.o psphere.o pplane.o \
          sphere.o refsphere.o fplane.o \
          projplane.o illum.o light.o texplane.o texture.o

INCLUDE = ray.h rayhdrs.h

# CFLAGS = -DAA_SAMPLES=12

ray: $(RAYOBJS) ray.h veclib3d.h rayhdrs.h makefile
        gcc -Wall -o ray -g $(RAYOBJS) -lm

$(RAYOBJS): $(INCLUDE)

DEBUG = -DDBG_PIX -DDBG_HIT -DDBG_WORLD -DDBG_AMB -DDBG_FIND

.c.o: $<
        -gcc -c -Wall $(CFLAGS) $(DEBUG) -g $< 2> $(@:.o=.err)
        cat $*.err


Executing make

Therefore when this makefile is invoked after modifying main.c, the commands that are actually executed are:

gcc -c -Wall -DDBG_PIX -DDBG_HIT -DDBG_WORLD -DDBG_AMB -DDBG_FIND -g main.c 2> main.err

cat main.err

gcc -Wall -o ray -g -pg main.o projection.o list.o model.o object.o plane.o material.o veclib3d.o image.o raytrace.o tplane.o psphere.o pplane.o sphere.o refsphere.o fplane.o projplane.o illum.o light.o texplane.o texture.o -lm


Debugging large programs

Although gdb is a powerful tool that should always be used in unit testing and when trying to identify the location of a fatal fault, it is not always the best tool for identifying errors during integrated system tests for large, complex programs.

Here it is often the case that having the program print out crucial elements of its internal state for offline analysis by the programmer may be required. After an error is identified from the print file, then gdb often is the best tool to pinpoint the source of the problem.

Finally, it is generally true that very large programs are never truly bug free. Testing usually stops when the failure rate of the program becomes acceptably low in someone's point of view. There are good economic reasons for this.

Therefore it is common to build debugging code into large programs. This code is never removed because the chances are good that additional failures will occur as the program is used and certainly when it is modified.

The make facility in conjunction with the C preprocessor provides a good mechanism for selective activation and deactivation of debugging code. It is required that you make use of this facility.

Recall the macro:

DEBUG = -DDBG_PIX -DDBG_HIT -DDBG_WORLD -DDBG_AMB -DDBG_FIND

that was present in the makefile. The -D flag allows one to specify to gcc that a particular symbol is defined in a sense that will be described later. In its current state the macro arms many debugging statements. If that leads to information overload the debugging data could be reduced by selective arming:

DEBUG = -DDBG_PIX -DDBG_HIT -DDBG_AMB

or, when it is thought that the program is ready for prime time, debugging can be completely disabled by simply commenting out the macro definition and rebuilding the system.

# DEBUG = -DDBG_PIX -DDBG_HIT -DDBG_WORLD -DDBG_AMB -DDBG_FIND


Including debugging code in your program

When confronted with a problem in your program, since you don't know what the problem is, it's hard to know where to put the debugging code. Hence the best technique is to put it everywhere. Just write it in as you go. To paraphrase Prof. Brooks: “You will anyway”.

Here are some typical examples that I have used. Note: The only newline character \n you will see is in the first fprintf(). This is done because you want all of the data associated with a single ray to appear on one line. You also want to carefully format so that the columns line up. The human brain has awesome pattern recognition skills but you need to show it a pattern!

Debug code should be enclosed in #ifdef / #endif blocks. In this way it is easy to enable or disable debug prints by simply recompiling with the desired set of symbols defined.

in make_image(): (This one should generally always be on when debug is active since the pixel coords are key to all that we see)

   for (y = 0; ... )
   {
      for (x = 0; x < proj->win_size_pixel[0]; x++)
      {
#ifdef DBG_PIX
         fprintf(stderr, "\nPIX %4d %4d - ", x, y);
#endif

in make_pixel(): (This one you might want to turn off once you are REAL SURE map_pix_to_world() is functional.)

   map_pix_to_world(x, y, world);
#ifdef DBG_WORLD
   fprintf(stderr, "WRL (%5.1lf, %5.1lf) - ", world[0], world[1]);
#endif


in ray_trace(): If find_closest_obj() believes it hit something it's useful to see what it thought was the nearest hit, how far away it was and where the hit occurred.

   closest = find_closest_obj(model->scene, base, dir, NULL, &mindist);

   if (closest == NULL)
      return;

#ifdef DBG_HIT
   fprintf(stderr, " HIT %4d: %5.1lf (%5.1lf, %5.1lf, %5.1lf) - ",
           closest->objid, mindist,
           closest->hitloc[0], closest->hitloc[1], closest->hitloc[2]);
#endif

and after computing the ambient intensity of the pixel value

#ifdef DBG_AMB
   fprintf(stderr, "AMB (%5.1lf, %5.1lf, %5.1lf) - ",
           intensity[0], intensity[1], intensity[2]);
#endif

in find_closest_obj(): After firing the ray at each object print the id of the object and the distance to the hit (which of course may be -1 if there was a miss). DBG_FIND should typically be used in initial testing with only one or two objects. If you have too many this will give you info overload.

   if (obj != last_hit)
   {
      dist = obj->hits(base, dir, obj);
#ifdef DBG_FIND
      fprintf(stderr, "FND %4d: %5.1lf - ", obj->objid, dist);
#endif


Running the program:

It's a good idea to start with a small pixmap. Here is one that is 5 x 3 pixels. This is way too small to produce a useful image but if you have a broken program, it is not going to produce one anyhow so this is where to start:

ray 5 3 < a2t1.txt > a2t1.px
Window size in pixels 5 3

Window size in world coordinates 8 6

Location of Viewpoint 0.0 0.0 3.0

Dumping object of type Plane
Material data -
Ambient  -    5.000    5.000    0.000
Diffuse  -    0.000    0.000    0.000
Specular -    0.000    0.000    0.000

Plane data
normal -    0.000    0.000   -1.000
point  -    0.000    0.000   -5.000

PIX    0    0 - WRL ( -4.0,  -3.0) - FND  100:  15.5 -  HIT  100:  15.5 (-10.7,  -8.0,  -5.0) - AMB (  0.3,   0.3,   0.0) -
PIX    1    0 - WRL ( -2.0,  -3.0) - FND  100:  12.5 -  HIT  100:  12.5 ( -5.3,  -8.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    2    0 - WRL (  0.0,  -3.0) - FND  100:  11.3 -  HIT  100:  11.3 (  0.0,  -8.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    3    0 - WRL (  2.0,  -3.0) - FND  100:  12.5 -  HIT  100:  12.5 (  5.3,  -8.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    4    0 - WRL (  4.0,  -3.0) - FND  100:  15.5 -  HIT  100:  15.5 ( 10.7,  -8.0,  -5.0) - AMB (  0.3,   0.3,   0.0) -
PIX    0    1 - WRL ( -4.0,   0.0) - FND  100:  13.3 -  HIT  100:  13.3 (-10.7,   0.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    1    1 - WRL ( -2.0,   0.0) - FND  100:   9.6 -  HIT  100:   9.6 ( -5.3,   0.0,  -5.0) - AMB (  0.5,   0.5,   0.0) -
PIX    2    1 - WRL (  0.0,   0.0) - FND  100:   8.0 -  HIT  100:   8.0 (  0.0,   0.0,  -5.0) - AMB (  0.6,   0.6,   0.0) -
PIX    3    1 - WRL (  2.0,   0.0) - FND  100:   9.6 -  HIT  100:   9.6 (  5.3,   0.0,  -5.0) - AMB (  0.5,   0.5,   0.0) -
PIX    4    1 - WRL (  4.0,   0.0) - FND  100:  13.3 -  HIT  100:  13.3 ( 10.7,   0.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    0    2 - WRL ( -4.0,   3.0) - FND  100:  15.5 -  HIT  100:  15.5 (-10.7,   8.0,  -5.0) - AMB (  0.3,   0.3,   0.0) -
PIX    1    2 - WRL ( -2.0,   3.0) - FND  100:  12.5 -  HIT  100:  12.5 ( -5.3,   8.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    2    2 - WRL (  0.0,   3.0) - FND  100:  11.3 -  HIT  100:  11.3 (  0.0,   8.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    3    2 - WRL (  2.0,   3.0) - FND  100:  12.5 -  HIT  100:  12.5 (  5.3,   8.0,  -5.0) - AMB (  0.4,   0.4,   0.0) -
PIX    4    2 - WRL (  4.0,   3.0) - FND  100:  15.5 -  HIT  100:  15.5 ( 10.7,   8.0,  -5.0) - AMB (  0.3,   0.3,   0.0) -


Pointers to functions

Pointer variables may also hold the address of a function and be used to invoke the function indirectly:

class/215/examples ==> gcc p32.c
class/215/examples ==> cat p32.c

/* p32.c */

#include <stdio.h>

int adder(int a, int b)
{
   return(a + b);
}

int main()
{
   int (*ptrf)(int, int);    // declare pointer to function
   int sum;

   ptrf = adder;             // point it to adder (note no &)

   sum = (*ptrf)(3, 4);      // invoke it
   printf("sum = %d \n", sum);
   return(0);
}
class/215/examples ==> a.out
sum = 7
class/215/examples ==>


Function pointers as do-it-yourself polymorphism

Recall that the obj_t structure contained a function pointer:

typedef struct obj_type
{
   struct obj_type *next;     /* objects are linked */
   int objtype;               /* light, sphere, floor, etc */
   double (*hits)(double *base, double *dir,
                  struct obj_type *obj);
   :

The hits pointer must be set in the initialization module. In the sphere_init function this pointer should be set as follows:

   obj->hits = sphere_hits;

where the sphere_hits() function is declared as follows.

double sphere_hits(
double *base,    /* the (x, y, z) coords of origin of the ray */
double *dir,     /* the (x, y, z) direction of the ray        */
obj_t *obj)      /* the object to be tested for the hit.      */
{

}


The find_closest_object() function.

When a ray is fired from the viewpoint through a pixel, depending upon the nature of the objects in the scene, the ray may pass through several objects or it may hit none at all. The color of the pixel will be derived from the material properties of the first object the ray hits.

The ray trace function should rely upon find_closest_object to return a pointer to the closest object that is hit by the ray. If none of the objects in the scene is hit, NULL must be returned. The distance from the base of the ray to the nearest hitpoint is returned in mindist.

   closest = find_closest_object(model->scene, base, dir, NULL, &mindist);

The find_closest_object() function just processes the complete object list and calls the appropriate hits function for each object found.

   obj = lst->head;
   while ((obj != NULL))
   {
      if (obj != last_hit)
      {
         dist = obj->hits(base, dir, obj);
         :
         :

This approach provides a much cleaner and more efficient mechanism than the functionally equivalent way:

   obj = lst->head;
   while ((obj != NULL))
   {
      switch (obj->objtype)
      {
      case SPHERE:
         dist = sphere_hits(base, dir, obj);
         break;
      case PLANE:
         etc

From the software reliability perspective an important advantage of this approach is that when a new object type is added it is not necessary to modify find_closest_object().
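A complete version of the loop might look like the sketch below. The obj_t and list_t shown here are cut-down stand-ins containing only the fields the loop needs, and fixed_hits() is a hypothetical hits function that exists purely so the example can be exercised; none of these definitions are the ones used in the actual ray tracer. A negative return from a hits function is treated as a miss, as described earlier.

```c
#include <stddef.h>

/* Simplified stand-ins for the real structures (assumed layout) */
typedef struct obj_type
{
   struct obj_type *next;
   int    objid;
   double (*hits)(double *base, double *dir, struct obj_type *obj);
   void   *priv;
} obj_t;

typedef struct list_type
{
   obj_t *head;
} list_t;

/* Hypothetical hits function: reports the distance stored behind priv */
double fixed_hits(double *base, double *dir, obj_t *obj)
{
   (void)base;
   (void)dir;
   return(*(double *)obj->priv);
}

obj_t *find_closest_object(list_t *lst, double *base, double *dir,
                           obj_t *last_hit, double *retdist)
{
   obj_t *obj;
   obj_t *closest = NULL;
   double dist;
   double mindist = -1.0;

   for (obj = lst->head; obj != NULL; obj = obj->next)
   {
      if (obj == last_hit)
         continue;                    /* skip the object the ray left from */
      dist = obj->hits(base, dir, obj);
      if (dist < 0.0)
         continue;                    /* negative distance means a miss */
      if (closest == NULL || dist < mindist)
      {
         closest = obj;
         mindist = dist;
      }
   }
   if (closest != NULL)
      *retdist = mindist;
   return(closest);
}
```

Nothing in the loop mentions spheres or planes, so adding a new object type only requires that its init function store the right pointer in obj->hits.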


Procedural surfaces

Procedural surfaces are those in which an object's reflectivity properties are modulated as a function of the location of the hit point on the surface of the object.

There are literally an infinite number of ways to do this. In the next few pages we propose a framework for incorporating procedurally shaded surfaces into raytraced images.


Implementation of procedural shaders

Construction of such shaders is facilitated by the use of both inheritance and polymorphism within a C language framework. The procedurally shaded plane is an extremely lightweight refinement of the plane_t. In fact it is such a lightweight refinement that there is no need for a pplane_t data structure.

The distinction between a standard plane and a procedurally shaded plane is made at object initialization time by the pplane_init() function when it establishes a single function pointer that provides the polymorphic behavior.

That function pointer is taken from a table of pointers to programmer-provided functions that are contained in the module pplane.c and perform the procedural shading. These procedural shading functions are passed pointers to the obj_t structure and to the intensity vector whose (r, g, b) components are filled in procedurally. Here is an example in which there are three possible shaders.

static void (*plane_shaders[])(obj_t *obj, double *intensity) =
{
   pplane0_amb,
   pplane1_amb,
   pplane2_amb
};
#define NUM_SHADERS sizeof(plane_shaders)/sizeof(void *)

Note that:
1. The number of elements in the array is not explicitly specified.
2. The value NUM_SHADERS can be computed by dividing the size of the table by the size of a single pointer.

The index of the shader to be used is supplied in the model description as shown below.

8 6          world x and y dims
0 0 3        viewpoint (x, y, z)
20           pplane
8 8 8        ambient
0 0 0        diffuse
0 0 0        specular
0 0 1        plane is a "back wall"
0 0 -6       located 6 units behind the screen
1            shader selector index


The pplane_init() function

As shown below, the pplane_init() function simply invokes the plane_init() function to construct the object and then overrides the default getamb() function, replacing it with the shader function whose index is provided in the model description file.

obj_t *pplane_init(
FILE *in,
int objtype)
{
   obj_t *new;
   double dndx;
   int ndx;

   new = plane_init(in, objtype);

   ndx = vl_get1(in, &dndx);
   if (ndx != 1)
      return(0);

   ndx = dndx;
   if (ndx >= NUM_SHADERS)
      return(0);

   new->getamb = plane_shaders[ndx];
   return(new);
}

The getamb element of the obj_t structure is a pointer to a void function which is passed pointers to the object structure and the intensity vector.

typedef struct obj_type
{
   struct obj_type *next;   /* Next object in list */
   int objid;               /* Numeric serial # for debug */
   int objtype;             /* Type code (14 -> Plane ) */

/* Optional plugins for retrieval of reflectivity */
/* useful for the ever-popular tiled floor */

   void (*getamb)(struct obj_type *, double *intensity);


The getamb() function

Recall that in the ambient only raytracer the last steps of the operation are:

1. add mindist to total_dist
2. set intensity to the ambient reflectivity of closest
3. divide intensity by total_dist

The first inclination is to implement the small amount of code in step 2 in the obvious way:

   intensity[0] = closest->material.ambient[0];
   intensity[1] = closest->material.ambient[1];
   intensity[2] = closest->material.ambient[2];

However, that approach would make it difficult to override the default behavior. Thus a better approach is to replace the three lines above by:

   closest->getamb(closest, intensity);

During object creation, the object_init() constructor sets the getamb() function pointer to the default_getamb() function, which contains the three lines of code we just replaced.

While this adds a slight bit of run time overhead, it also provides us with an easy hook with which we may override the default_getamb() with a custom shader.
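A minimal sketch of the hook follows. The structures are reduced to the fields needed here, and red_only_getamb() is a hypothetical custom shader invented for illustration; only default_getamb() and the getamb pointer come from the notes.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the real structures (assumption: the real ones
 * carry many more fields). */
typedef struct material_type {
    double ambient[3];
} material_t;

typedef struct obj_type {
    material_t material;
    void (*getamb)(struct obj_type *obj, double *intensity);
} obj_t;

/* Default behavior: copy the object's ambient reflectivity. */
void default_getamb(obj_t *obj, double *intensity)
{
    assert(obj != NULL && intensity != NULL);
    intensity[0] = obj->material.ambient[0];
    intensity[1] = obj->material.ambient[1];
    intensity[2] = obj->material.ambient[2];
}

/* Hypothetical override: start from the default, then keep only red. */
void red_only_getamb(obj_t *obj, double *intensity)
{
    default_getamb(obj, intensity);
    intensity[1] = 0.0;
    intensity[2] = 0.0;
}
```

The caller's code is the same in both cases: it invokes obj->getamb(obj, intensity) and never needs to know which function is installed.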


Examples of procedural shaders

Alternating stripes of various shapes appear on the back wall in the figure previously shown. The algorithm used to generate that part of the image is given here. A vector V pointing from the point defining the location of the plane to the hitpoint is computed first. Then the following function of V is computed.

$$f(V) = 1000 + \frac{v_x v_y^2}{100} + \frac{v_x v_y}{100}$$

Although f(V) is clearly a continuous function from $\mathbb{R}^3$ to $\mathbb{R}$, the striping effect is obtained by converting the value to an integer and then altering the color value that is returned based upon whether the integer is even or odd.

The constant value 100 affects the width of the stripe. The value 1000 (known as "Westall's hack") is used to avoid a nasty "doublewide" stripe where the value of the function ranges between -1 and +1, since all values in that range are integerized to 0. Note that this procedure is designed specifically for a back wall, as the x and y coordinates of the hitpoint vary but the z coordinate does not. Applying it to a floor would produce a solid color (why?).

void pplane1_amb(
obj_t *obj,
double *value)
{
   double vec[3];
   plane_t *p = (plane_t *)(obj->priv);
   int isum;
   double sum;

   vl_copy3(obj->material.ambient, value);
   vl_diff3(p->point, obj->hitloc, vec);
   sum = 1000 + vec[0] * vec[1] * vec[1] / 100 + vec[0] * vec[1] / 100;
   isum = sum;
   if (isum & 1)
      value[0] = 0;   // zap red
   else
      value[2] = 0;   // zap blue
}
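The double-wide stripe comes from the way C converts a double to an int: the fractional part is discarded, so the value moves toward zero from either side. The two helper functions below are hypothetical and exist only to make the effect visible.

```c
#include <assert.h>

/* Stripe index with no offset: (int) truncates toward zero, so every
 * value strictly between -1 and +1 lands in stripe 0. */
int stripe_plain(double f)
{
    return (int)f;
}

/* Stripe index with the large positive offset used in the shader above.
 * The trick only helps while 1000 + f stays positive. */
int stripe_offset(double f)
{
    assert(f > -1000.0);
    return (int)(1000 + f);
}
```

With stripe_plain(), -0.5 and 0.5 both give 0, so the stripes on either side of zero merge. With stripe_offset() they give 999 and 1000, which differ in parity, so the stripes alternate as intended.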

The tiling effect seen on the floor can be achieved (on the back wall) by computing separate functions fx(vx) and fy(vy), integerizing them separately, adding the integers and selecting color based upon whether the sum is even or odd.
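One way to sketch the tiling rule is shown here. The function name tile_parity(), the tile parameter (the edge length of one square tile), and the choice of fx(vx) = 1000 + vx / tile are assumptions made for illustration; the notes do not specify the functions used.

```c
#include <assert.h>

/* Returns 0 or 1 according to which checkerboard tile (vx, vy) falls in.
 * fx and fy are integerized separately, then the integers are summed.
 * The offset 1000 plays the same role as in the stripe shader. */
int tile_parity(double vx, double vy, double tile)
{
    int ix;
    int iy;

    assert(tile > 0.0);
    ix = (int)(1000 + vx / tile);   /* integerize fx(vx) */
    iy = (int)(1000 + vy / tile);   /* integerize fy(vy) */
    return (ix + iy) & 1;           /* even or odd sum selects the color */
}
```

A shader would then zero out one color component when tile_parity() returns 1 and a different one when it returns 0, just as pplane1_amb() does with isum.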


Continuously modulated shading

The image shown below is produced by a procedural shader that continuously modulates the ambient reflectivity.

The modulation function is shown below. A vector V in the direction from the point defining the plane location to the hitpoint is computed first. Then the angle t that the vector makes with the positive X axis is computed. Finally the red, green and blue components are modulated using the function 1 + cos(ωt + φ), where the angular frequency ω is 2 for all three colors and the phase angles φ are 0, 2π/3, and 4π/3 respectively. Different effects may be obtained by using different frequencies and phase angles for each color, and, as shown in the example images, it is also possible to combine continuous modulation with striping or tiling.

   vl_diff3(p->point, obj->hitloc, vec);

   v1 = (vec[0] / sqrt(vec[0] * vec[0] + vec[1] * vec[1]));
   t1 = acos(v1);

   if (vec[1] < 0)             // acos() returns values in [0,PI]
      t1 = 2 * M_PI - t1;      // extend to [0, 2PI] here

   value[0] *= (1 + cos(2 * t1));
   value[1] *= (1 + cos(2 * t1 + 2 * M_PI / 3));
   value[2] *= (1 + cos(2 * t1 + 4 * M_PI / 3));


Tables of function pointers

The model dumping and loading operation provides another example in which tables of function pointers can replace a big messy switch construct and simplify the adding of new object types. The following example

1 declares an array of pointers to object constructor functions
2 initializes the pointers to point to the appropriate functions

obj_t *dummy_init(FILE *in, int objtype)
{
   return(NULL);
}

obj_t *(*obj_loaders[])(FILE *in, int objtype) =
{
   dummy_init,    /* placeholder for type 10 (light) */
   dummy_init,    /* placeholder for type 11 */
   dummy_init,    /* placeholder for type 12 */
   sphere_init,   /* object type 13 */
   plane_init     /* object type 14 */
};

int model_init(
FILE *in,
list_t *lst)
{
   char buf[256];
   int objtype;
   obj_t *new;

   /* now load the objects in the scene */
   while (fscanf(in, "%d", &objtype) == 1)
   {
      fgets(buf, 256, in);   /* consume rest of line */
      if ((objtype >= FIRST_TYPE) && (objtype <= LAST_TYPE))
      {
         if ((new = (*obj_loaders[objtype - FIRST_TYPE])(in, objtype)) == NULL)
            return(-2);

         /* add new object to proper list (scene or lights) */
      }
      else   /* invalid object id */
      {
         fprintf(stderr, "Invalid object type %d \n", objtype);
         return(-1);
      }
   }
   return(0);
}


Malloc'd objects, garbage collection, and memory leaks

The Java language provides for "automagic" garbage collection in which storage for an object is magically reclaimed when all references to an object have gone out of existence.

C provides no such mechanism.

A memory leak is said to have occurred when:

1. the last pointer to a malloc'd object is reset, or
2. the last pointer to a malloc'd object is a local variable in a function from which a return is made.

In these cases the malloc'd memory is no longer accessible. Excessive leaking can lead to poor performance and, in the extreme, program failure.

Therefore C programmers must recognize when the last pointer to malloc'd storage is about to be lost and use the free() function call to release the storage before it becomes impossible to do so.

Several examples of incorrect pointer use and memory leaking have been observed in student programs.


Problem 1: The instant leak.

This is an example of an instant leak. The memory is allocated at the time temp is declared and leaked when temp is reassigned the address of the first object in the list.

obj_t *temp = malloc(sizeof(obj_t));

temp = list->head;
while (temp != NULL)
   -- process list of objects --

Two possible solutions:

Insert free(temp) before temp = list->head;

This eliminates the leak, but what benefit is there to allocating storage and instantly freeing it???

Change the declaration to obj_t *temp = NULL;

This is the correct solution.

A rational rule of thumb is never malloc memory unless you are going to write into it!

Another good rule of thumb is to never declare a pointer without initializing it.


Problem 2: The traditional leak

Here storage is also allocated for univec at the time it is declared.

The call to vl_unitvec3() writes into that storage.

If the storage is not malloc'd, then univec will not point to anything useful and the call to vl_unitvec3() will produce a segfault or will overwrite some other part of the program's data. So this malloc() is necessary.

{
   double *univec = malloc(3 * sizeof(double));

   vl_unitvec3(dir, univec);
   :
   more stuff involving univec
   :
   return;
}

However, the instant the return statement is executed, the value of univec becomes no longer accessible and the memory has been leaked.

Here the correct solution is to add

free(univec);

just before the return;

A rational rule of thumb is: malloc'd storage must be freed before the last pointer to it is lost.


Problem 3: Overcompensation

The concern about leakage might lead to an overcompensation. For example, an object loader might do the following:

{
   obj_t *new_obj;

   :
   new_obj = malloc(sizeof(obj_t));
   :
   if (list->head == NULL)
   {
      list->head = list->tail = new_obj;
   }
   else
   {
      list->tail->next = new_obj;
      list->tail = new_obj;
   }

   free(new_obj);

   return(0);
}

This problem is the reverse of a memory leak. A live pointer to the object exists through the list structure, but the storage has been freed.

The results of this are:
1. Usually attempts to reference the freed storage will succeed.
2. The storage will eventually be assigned to another object in a later call to malloc().
3. Then "both" objects will occupy the same storage.

Rational rule of thumb: Never free an object while live pointers to the object exist. Any pointers to the freed storage that exist after the return from free() should be set to NULL.

To fix this problem the free(new_obj) must be deleted from the code. If the objects in the object list are to be freed, it is safe to do so only at the end of the raytrace.

It is not imperative to do so at that point because the Operating System will reclaim all memory used by the process when the program exits.


Problem 3b: Overcompensation revisited

The free() function must be used only to free memory previously allocated by malloc().

   unsigned char buf[256];
   :
   :
   free(buf);

is a fatal error.

The free() function must not be used to free the same area twice.

   buf = (unsigned char *)malloc(256);
   :
   free(buf);
   :
   free(buf);

is also fatal.


The general solution: Reference counting

For programs even as complicated as the raytracer it is usually easy for an experienced programmer to know when to free() dynamically allocated storage.

In programs as complicated as the Linux kernel it is not.

A technique known as reference counting is used.

typedef struct obj_type
{
   int refcount;
   :
   :
} obj_t;

At object creation time:

   new_obj = malloc(sizeof(obj_t));
   new_obj->refcount = 1;

When a new reference to the object is created:

   my_new_ptr = new_obj;
   my_new_ptr->refcount += 1;

When a reference is about to be reused or lost:

   my_new_ptr->refcount -= 1;
   if (my_new_ptr->refcount == 0)
      free(my_new_ptr);
   my_new_ptr = NULL;

In a multithreaded environment such as in an OS kernel it is mandatory that the testing and update of the reference counter be done atomically. This issue will be addressed in CPSC 322.
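The three reference counting steps can be wrapped in small helper functions so that the bookkeeping is never written by hand. The names obj_new(), obj_get() and obj_put() are hypothetical, the obj_t is reduced to two fields, and the sketch is for single-threaded use only (the updates are not atomic).

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced object type for this sketch. */
typedef struct obj_type {
    int refcount;
    int objid;
} obj_t;

/* Create an object holding exactly one reference (the caller's). */
obj_t *obj_new(int objid)
{
    obj_t *obj = malloc(sizeof(obj_t));

    if (obj != NULL) {
        obj->refcount = 1;
        obj->objid = objid;
    }
    return obj;
}

/* Record a new reference and hand back the pointer to store. */
obj_t *obj_get(obj_t *obj)
{
    assert(obj != NULL);
    obj->refcount += 1;
    return obj;
}

/* Give up the reference held in *ref. The storage is freed when the
 * count reaches zero, and *ref is always set to NULL.
 * Returns 1 if the storage was freed, otherwise 0. */
int obj_put(obj_t **ref)
{
    int freed = 0;
    obj_t *obj;

    assert(ref != NULL && *ref != NULL);
    obj = *ref;
    obj->refcount -= 1;
    if (obj->refcount == 0) {
        free(obj);
        freed = 1;
    }
    *ref = NULL;
    return freed;
}
```

Passing the address of the pointer to obj_put() lets the helper clear it, which enforces the earlier rule that no live pointer to freed storage should remain.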


An alternate approach to short-term memory allocation

As discussed earlier, the malloc() function allocates memory from the heap, which typically grows upward from the bottom of memory. The alloca() function may be used to dynamically allocate memory from the stack, which grows downward. On some systems the runtime environment is set up in such a way that there is considerably more memory available in the heap than in the stack. On others it may be the case that neither the size of the heap nor the size of the stack is constrained.

The alloca() function should be used instead of malloc() when allocating

1 relatively small objects (such as a single 3 component double precision vector) that
2 should be freed at the end of the function in which they are created.

The syntax of the call to alloca() is identical to calls to malloc().

Example:

double *unitvec = (double *)alloca(3 * sizeof(double));

Reasons why alloca() is preferable to malloc() include

1 alloca() operates by simply decrementing the stack pointer by the amount of storage requested and then returning the new value of the stack pointer. Thus it is much more efficient than malloc().

2 Because the entire stack frame of a function is popped when the function returns:

   a. there is no need to try to remember to free the storage,
   b. there is no way that a memory leak can occur, and
   c. the implicit free operation, unlike the free() function, is computationally free!


Special cautions

1 It is illegal to free() storage allocated by alloca(). If you try to do this and are lucky, you will get an instant segfault. If you are not so lucky, you will corrupt the malloc()/free() system and cause the program to fail later in some weird and wondrous way that will be very hard to diagnose.

2 A function may never return a pointer to storage allocated with alloca(). If you should try to do the following you will generally not get an instant segfault. But you will get all manner of weird and wondrous behavior when the stack resident storage the object occupies is "recycled" by functions whose stack frames later overlay the memory where the object resides!

   obj_t *newobj = alloca(sizeof(obj_t));

   newobj->objectid = nextid++;
   newobj->objnext = NULL;

-- etc --

return(newobj);


Diffuse illumination

Diffuse illumination is associated with specific light sources but is reflected uniformly in all directions. A white sheet of paper has a high degree of diffuse reflectivity. It reflects light but it also scatters it so that you cannot see the reflection of other objects when looking at the paper.

To model diffuse reflectivity requires the presence of one or more light sources. The first type of light source that we will consider is the point light source. This idealized light source emits light uniformly in all directions. It may be located on either side of the virtual screen but is, itself, not visible.

The lights will be managed in a way analogous to the visible objects, but will reside on a separate list.

typedef struct model_type
{
   proj_t *proj;
   list_t *lights;
   list_t *scene;
} model_t;


Properties of point light sources

The location of the light is carried in the light_t structure.

typedef struct light_type
{
   double center[3];
} light_t;

The brightness of the light is carried in the emissivity vector of the obj_t. By using r, g, b components of different magnitudes it is possible to create lights of any color. The color of a pixel that is illuminated by a light is the componentwise product of the emissivity and the diffuse reflectivity of the visible object. Therefore, in a raytraced scene consisting of a single plane and a single light it is not possible to determine from the image whether a red material is being illuminated by a white light or a white material is being illuminated by a red light! Furthermore, if a pure red light illuminates a pure green sphere, the result is purely invisible! This effect is counter to our experience because in the “real world” monochromatic lights and objects with pure monochromatic reflectivity are not commonly encountered.

It could be argued that the emissivity should properly reside in either the material_t or the light_t.

typedef struct obj_type
{
   struct obj_type *next;   /* Next object in list */
   int objid;               /* Numeric serial # for debug */

/* Reflectivity for reflective objects */

material_t material;

/* These fields used only in illuminating objects (lights) */

double emissivity[3]; /* For lights */

Specifying a light in a model description file

10           code for light
4 4 4        emissivity (a white light)
8 6 -2       location


The diffuse illumination procedure

Diffuse illumination should not be computed within the ray_trace() function itself. The procedure is sufficiently complicated that it should be implemented in a separate module that we will call illuminate.c. It will contain a function called diffuse_illumination() that will be called by the ray_trace() function.

/* This function traces a single ray and returns the composite */
/* intensity of the light it encounters. It is recursive and   */
/* so the start of the ray cannot be assumed to be the viewpt  */

void ray_trace(
list_t *objs,        /* object list */
double *base,        /* world coord of ray start pt */
double *dir,         /* vector direction of ray */
double *pix,         /* floating point [r,g,b] */
double total_dist)   /* total distance so far */
{
   Test each non-light object to see if it is hit by the ray and set "closest" to point to the nearest such object hit.

   If closest is NULL
      return;

   Add the distance from the base of the ray to the hit point to total_dist
   Add the ambient reflectivity of the object to the intensity vector
   Add the diffuse reflectivity of the object at the hitpoint to the intensity vector
   Scale the intensity vector by 1 / total_dist.
}

Important notes:

The ambient reflectivity is a property of the object as a whole.
The diffuse reflectivity is a function of the specific hit point. The contribution of all lights that are visible from the hit point must be summed.


Computing the diffuse reflectivity of an object

A function having the following prototype provides a useful front end for the diffuse computation. It should be called from ray_trace() as shown in the pseudocode above.

void diffuse_illumination(
model_t *model,      /* pointer to model structure */
obj_t *hitobj,       /* object that was hit by the ray */
double *intensity)   /* where to add intensity */
{
   for all lights on the light list
   {
      process_light()
   }
}

Determining if a light illuminates the hit point

We use idealized point light sources which are themselves invisible, but do emit illumination. Thus lights themselves will not be visible in the scene but the effect of the light will appear.

The process_light() function performs this portion of the algorithm.

int process_light(
list_t *lst,     /* List of all objects */
obj_t *hitobj,   /* The object hit by the ray */
obj_t *lobj,     /* the current light source */
double *ivec)    /* [r, g, b] intensity vector */
{
   if the hitobj occludes itself
      return

   find_closest_object() along a ray from hitloc to the center of the light
   if one exists and is closer than the light   // the light is occluded by the object
      return

   compute the illumination and add it to *ivec;
}


Testing for occlusion

The diagram below illustrates the ways in which objects may occlude lights. The small yellow spheres represent lights and the large blue ones visible objects.

We will assume convex objects. An object is self-occluding if the angle between the surface normal and a vector from the hit point toward the light is larger than 90 degrees.

A simple test for this condition is that an object is not self-occluding if the dot product of a vector from the hit point to that light with the surface normal is positive.

To see if a light is occluded by another object, it is necessary to see if a ray fired from the hitpoint to the light hits another object before it reaches the light. This can be accomplished via a call to find_closest_object(). The light is occluded if and only if

(1) the ray hits some new object
AND
(2) the distance to the hit point on the new object is less than the distance to the light.
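Both occlusion tests reduce to a comparison or two. The sketch below uses hypothetical function names and assumes that the caller has already built the vector toward the light and, for the second test, has already fired the ray with find_closest_object().

```c
#include <assert.h>
#include <stddef.h>

/* Plain 3-component dot product (local helper for this sketch). */
double dot3(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Returns 1 when the surface at the hit point faces away from the light,
 * that is, when the dot product of the normal with the vector toward the
 * light is not positive. */
int self_occluded(const double *normal, const double *to_light)
{
    assert(normal != NULL && to_light != NULL);
    return dot3(normal, to_light) <= 0.0;
}

/* Returns 1 when some other object was hit AND that hit is nearer
 * than the light itself. */
int occluded_by_other(int hit_something, double dist_to_blocker,
                      double dist_to_light)
{
    assert(dist_to_light > 0.0);
    return hit_something && dist_to_blocker < dist_to_light;
}
```

Neither vector needs to be normalized for the self-occlusion test, because only the sign of the dot product matters.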


[Figure: occlusion cases. Labels: hit point, surface normal, angle θ; lights marked "self occluded", "not occluded", and "occluded by" another object.]

Computing the illumination

If the light does illuminate the hitpoint, its effect must be added to the intensity vector *ivec.

   *(ivec + 0) += diffuse[0] * lobj->emissivity[0] * cos(θ) / dist_from_hit_to_light;
   *(ivec + 1) += diffuse[1] * lobj->emissivity[1] * cos(θ) / dist_from_hit_to_light;
   *(ivec + 2) += diffuse[2] * lobj->emissivity[2] * cos(θ) / dist_from_hit_to_light;

The above computation assumes that the diffuse[] vector was filled in by a call to hitobj->getdiff(hitobj, diffuse). Such a mechanism will allow procedural shading to work properly in conjunction with diffuse illumination.

If procedural shading is not supported then diffuse[] should be replaced in the assignments shown above by hitobj->material.diffuse[].
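The three assignments can be folded into a loop. In this sketch add_diffuse() is a hypothetical helper; it assumes the caller supplies cos(θ) directly, for example as the dot product of the unit surface normal with the unit vector toward the light.

```c
#include <assert.h>
#include <stddef.h>

/* Add one light's diffuse contribution to the [r, g, b] vector ivec.
 * cos_theta is the cosine of the angle between the surface normal and
 * the direction to the light; dist is the hit-to-light distance. */
void add_diffuse(const double *diffuse, const double *emissivity,
                 double cos_theta, double dist, double *ivec)
{
    int i;

    assert(ivec != NULL);
    assert(dist > 0.0);
    for (i = 0; i < 3; i++)
        ivec[i] += diffuse[i] * emissivity[i] * cos_theta / dist;
}
```

Because the helper adds to ivec rather than overwriting it, calling it once per visible light sums the contributions as required.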


Debugging diffuse illumination

Though the procedure is straightforward, there are many places where a minor error may render a light completely inoperative. Attempting to do problem diagnosis by looking at an all black image is not very productive.

There are so many things going on here that I recommend a different form of debugging than was used in the simpler environment earlier. In the process_light() function you should dump relevant elements of the computation.

#ifdef DBG_DIFFUSE
   light_t *lt = lobj->priv;
   vl_vecprn1("hit object id was ", &hitobj->objid);
   vl_vecprn3("hit point was ", hitobj->hitloc);
   vl_vecprn3("normal at hitpoint ", hitobj->normal);
   vl_vecprn1("light object id was ", &lobj->objid);
   vl_vecprn3("light center was ", lt->center);
   vl_vecprn3("unit vector to light is ", dir);
   vl_vecprn1("distance to light is ", &dist);
   vl_vecprn1("cosine is ", &cos);
#endif

When another object occludes the light:

#ifdef DBG_DIFFUSE
      /* If occluded by another object */
      vl_vecprn1("hit object occluded by ", &obj->objid);
      vl_vecprn1("distance was ", &close);
#endif
      return(0);
   }

When the illumination is finally computed:

#ifdef DBG_DIFFUSE
   vl_vecprn3("Emissivity of the light ", emiss);
   vl_vecprn3("Diffuse reflectivity ", diffuse);
   vl_vecprn3("Current ivec ", intensity);
#endif


Two dimensional arrays and matrix operations in C

Two dimensional arrays are declared as in Java:

double x[4][5];

The declaration above creates 4 x 5 = 20 double precision values. The values are stored in the following order:

x[0][0], x[0][1], ... x[0][4], x[1][0], ...., x[3][4]

This is consistent with subscripting techniques commonly used in mathematics in which the first subscript represents a row and the second represents a column.
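The storage order can be expressed as an offset calculation: element [row][col] of an array with ncols columns sits row * ncols + col elements from the start. The helper names below are hypothetical and serve only to illustrate the rule.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, counted in elements, of x[row][col] from the start of a
 * two dimensional array that has ncols columns in each row. */
size_t rowmajor_offset(size_t row, size_t col, size_t ncols)
{
    assert(col < ncols);
    return row * ncols + col;
}

/* Fetch element [row][col] through a pointer to the first element. */
double flat_get(const double *base, size_t row, size_t col, size_t ncols)
{
    assert(base != NULL);
    return base[rowmajor_offset(row, col, ncols)];
}
```

For double x[4][5], the last element x[3][4] is at offset 3 * 5 + 4 = 19, which is the twentieth and final value. Note that the number of rows never enters the calculation, only the number of columns.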


Passing two dimensional arrays as parameters

/* multiply two three-D matrices together */
void vl_matmult3(
double inleft[][],
double inright[][],
double out[][])
{
   double x;

   x = inleft[2][1];
}

==> gcc -c twod.c
twod.c: In function `vl_matmult3':
twod.c:11: invalid use of array with unspecified bounds

The problem here is that the compiler has no way to know how many columns there are in each row of the matrix. Recall the way that we implicitly handled two dimensional pixmaps.


To obtain the offset of the start of a particular row we had to know how many columns were in the pixmap.

Defining arrays passed as parameters

The correct way is to specify the number of columns (or both the number of rows and the number of columns).

void vl_matmult3(
double inleft[][3],
double inright[][3],
double out[3][3])
{
   double x;

   x = inleft[2][1];
}

==> gcc -c -Wall twod3.c
==>


Passing matrices as actual arguments in the calling function

For the parameter passing to work correctly, all that is technically required is for the caller to provide the address of the first element in the matrix. As is shown below there are many ways to accomplish this mission, but many of them produce undesirable compiler warnings.

 3 void vl_matmul3(
 4 double inleft[][3],
 5 double inright[][3],
 6 double out[][3])
 7 {

10 }

12 int main()
13 {
14    double m1[3][3];
15    double m2[3][3];
16    double *mp;
21    mp = m1[0];
23    vl_matmul3(m1, m1, m2);
24    vl_matmul3(&m1[0][0], m1, m2);
25    vl_matmul3(&m1[0], m1, m2);
26    vl_matmul3(mp, m1, m2);
27    vl_matmul3(&m1, m1, m2);
28    return(0);
29 }

==> gcc -Wall -c twod4.c
twod4.c: In function `main':
twod4.c:24: warning: passing arg 1 of `vl_matmul3' from incompatible pointer type
twod4.c:26: warning: passing arg 1 of `vl_matmul3' from incompatible pointer type
twod4.c:27: warning: passing arg 1 of `vl_matmul3' from incompatible pointer type
==>


These examples show that:

For a 2D array, use the name of the first row (m1[0] above) if you wish to set a pointer to the start of the array.

To pass the array as a parameter, use either its name or the address of its first row.

However, if the receiving function understands what the structure of the array is, it can in fact recover from all of the above faux pas, as shown in the following example.

void vl_matprn3(
char *label,
double mat[3][3]);

void vl_idmat3(double mtx[3][3]);

int main()
{
   double m1[3][3];
   double *mp;
   double v1[3];
   double *vp;

   vp = v1;
   mp = m1[0];

   vl_idmat3(m1);
   m1[2][1] = 4;
   m1[1][2] = 7;
   vl_matprn3("17 ", m1);
   vl_matprn3("18 ", &m1[0][0]);
   vl_matprn3("19 ", &m1[0]);
   vl_matprn3("20 ", mp);
   vl_matprn3("21 ", &m1);
   return(0);
}


17
   1.000   0.000   0.000
   0.000   1.000   7.000
   0.000   4.000   1.000

18
   1.000   0.000   0.000
   0.000   1.000   7.000
   0.000   4.000   1.000

19
   1.000   0.000   0.000
   0.000   1.000   7.000
   0.000   4.000   1.000

20
   1.000   0.000   0.000
   0.000   1.000   7.000
   0.000   4.000   1.000

21
   1.000   0.000   0.000
   0.000   1.000   7.000
   0.000   4.000   1.000


Matrix multiplication

The matrix product of two 3 x 3 matrices is also a 3 x 3 matrix. The multiplication rule is as follows:

product[i][j] = the dot product of the ith row of the left matrix with the jth column of the right matrix.

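$$
\begin{bmatrix} 1.0 & 1.0 & 0.0 \\ -1.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{bmatrix}
\times
\begin{bmatrix} 1.0 & 0.0 & 2.0 \\ 0.0 & 1.0 & 0.0 \\ -2.0 & 0.0 & 1.0 \end{bmatrix}
=
\begin{bmatrix} 1.0 & 1.0 & 2.0 \\ -1.0 & 1.0 & -2.0 \\ -2.0 & 0.0 & 1.0 \end{bmatrix}
$$

Notes:
1. Matrix multiplication is not commutative: A x B != B x A in general.
2. Since the elements of a column of a matrix don't occupy adjacent locations in memory, you can't use vl_dot3 directly in this computation.

The identity matrix is given by:

$$
\begin{bmatrix} 1.0 & 0.0 & 0.0 \\ 0.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{bmatrix}
$$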

Multiplying any matrix on the left or on the right by the identity matrix yields the original matrix


Multiplication of a matrix times a vector.

The product of a 3 x 3 matrix with a 3d column vector is a 3d vector. The multiplication rule is as follows:

product[i] = the dot product of the ith row of the matrix with the vector.

$$
\begin{bmatrix} 1.0 & 1.0 & 0.0 \\ -1.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{bmatrix}
\times
\begin{bmatrix} 1.0 \\ 0.0 \\ -2.0 \end{bmatrix}
=
\begin{bmatrix} 1.0 \\ -1.0 \\ -2.0 \end{bmatrix}
$$

Notes:

The product is defined only with the matrix on the left and the column vector on the right.
If the (logical) column vector is actually stored as a 3 element array in C, then vl_dot3 may be used in the computation.
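The row-by-row rule can be sketched as follows. The names dot3() and xform3() are local to this sketch and are not claimed to match the library's vl_dot3 or vl_xform3; the temporary vector is an added precaution so that the input and output may be the same array.

```c
#include <assert.h>
#include <stddef.h>

/* Plain 3-component dot product (local helper for this sketch). */
double dot3(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* z = m * x, where x is treated as a column vector.
 * Each m[i] is a row stored contiguously, so it can be passed to dot3()
 * directly. The result is built in tmp so that z may alias x. */
void xform3(double m[][3], const double *x, double *z)
{
    double tmp[3];
    int i;

    assert(m != NULL && x != NULL && z != NULL);
    for (i = 0; i < 3; i++)
        tmp[i] = dot3(m[i], x);
    for (i = 0; i < 3; i++)
        z[i] = tmp[i];
}
```

Applied to the matrix and vector in the example above, this produces (1, -1, -2).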


The transpose of a matrix:

The transpose of a three by three matrix is also a three by three matrix. Its elements are given by a simple rule:

transpose[i][j] = original[j][i]

$$
\begin{bmatrix} 1.0 & 3.0 & 2.0 \\ 1.0 & 2.0 & -2.0 \\ -2.0 & 0.0 & 1.0 \end{bmatrix}^{T}
=
\begin{bmatrix} 1.0 & 1.0 & -2.0 \\ 3.0 & 2.0 & 0.0 \\ 2.0 & -2.0 & 1.0 \end{bmatrix}
$$

Notes:
The diagonal elements of a matrix and its transpose are identical. Off diagonal elements are interchanged in a symmetrical way.

The transpose of a matrix is in general not the same as the inverse of a matrix.

Safely transposing a matrix in place

We have emphasized the danger of in place transpose when the input and output parameters might be aliases to the same location. However, there is a safe and efficient way to transpose in place: swap each element above the diagonal with its mirror image below the diagonal. (In this example matrix is assumed to be a 3 x 3 array of int.)

   for (i = 0; i < 3; i++)
   {
      for (j = i + 1; j < 3; j++)
      {
         swap_int(matrix[i] + j, matrix[j] + i);
      }
   }

void swap_int(
int *i1,
int *i2)
{
   int tmp;

   tmp = *i1;
   *i1 = *i2;
   *i2 = tmp;
}


The cross product of two vectors

Given two linearly independent (not parallel) vectors:

V = (vx, vy, vz)W = (wx, wy, wz)

The cross product, sometimes called the outer product, is a vector which is orthogonal (perpendicular) to both of the original vectors.

$$V \times W = (v_y w_z - v_z w_y,\; v_z w_x - v_x w_z,\; v_x w_y - v_y w_x)$$

(1, 1, 1) x (0, -1, 0) = (1, 0, -1)

Notes: The vector (0, -1, 0) is the negative y axis. Therefore, any vector that is perpendicular to it must lie in the y = 0 plane. The projection of the vector (1, 1, 1) onto the y = 0 plane is the vector (1, 0, 1). The vector (1, 0, -1) is then perpendicular to this vector and lies in the y = 0 plane.

In a right-handed coordinate system

X x Y = Z
Y x Z = X
Z x X = Y
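The component formula translates directly into C. The function name cross3() is local to this sketch; the temporary array is an added precaution so that the output may safely be the same array as either input.

```c
#include <assert.h>
#include <stddef.h>

/* out = v x w. The result is built in tmp first so that out may
 * alias v or w without corrupting the computation. */
void cross3(const double *v, const double *w, double *out)
{
    double tmp[3];
    int i;

    assert(v != NULL && w != NULL && out != NULL);
    tmp[0] = v[1] * w[2] - v[2] * w[1];
    tmp[1] = v[2] * w[0] - v[0] * w[2];
    tmp[2] = v[0] * w[1] - v[1] * w[0];
    for (i = 0; i < 3; i++)
        out[i] = tmp[i];
}
```

Swapping the arguments negates the result, so X x Z gives the negative Y axis while Z x X gives Y.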


Projection

Assume that V and N are unit vectors. The projection Q of V on N is shown in red. It is a vector in the same direction as N but having length cos(θ ). Therefore

Q = (N dot V) N

Now assume that N is a normal to a plane shown as a yellow line. The projection P of V onto the plane is shown in magenta and is given by V + G where G is the vector shown in green.

Since G and Q clearly have the same length but point in opposite directions, G = -Q.

Therefore the projection of a vector V onto a plane with normal N is given by:

P = V - (N dot V) N

or (possibly)

vl_diff3(vl_scale3(vl_dot3(N, V), N), V, P);

In building your new linear algebra routines it is desirable to build upon existing ones where possible but extreme levels of nesting of function calls as shown here can complicate debugging.
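An un-nested version, with each intermediate value in a named variable that can be printed while debugging, might look like the sketch below. It computes P = V - (N dot V) N. The name project3() is hypothetical, and N is assumed to be a unit vector.

```c
#include <assert.h>
#include <stddef.h>

/* p = projection of v onto the plane whose unit normal is n.
 * Intermediate results are kept in local variables so that each step
 * can be inspected, and so that p may alias v. */
void project3(const double *n, const double *v, double *p)
{
    double ndotv;
    double q[3];    /* Q = (N dot V) N, the part of V along N */
    int i;

    assert(n != NULL && v != NULL && p != NULL);
    ndotv = n[0] * v[0] + n[1] * v[1] + n[2] * v[2];
    for (i = 0; i < 3; i++)
        q[i] = ndotv * n[i];
    for (i = 0; i < 3; i++)
        p[i] = v[i] - q[i];
}
```

For a plane with normal (0, 0, 1), the projection simply zeroes the z component, which makes a convenient first test case.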


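[Figure: unit vectors V and N separated by angle θ, showing the projection Q of V on N and the projection P of V onto the plane.]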

Rotation matrices

Rotation matrices are used to rotate coordinate systems in 3-space. They have some special properties:

The three rows are mutually orthogonal unit vectors. That is, the dot product of any pair of distinct rows is 0, and the dot product of any row with itself is 1.

The three columns are also mutually orthogonal unit vectors.

The inverse of a rotation matrix is its transpose.

The 1st row of a rotation matrix is a vector which will be mapped to [1, 0, 0] under the rotation. The 2nd row is a vector that will be mapped to [0, 1, 0], and the 3rd row is a vector that will be mapped to [0, 0, 1].

This example shows that the middle row is mapped to (0, 1, 0):

$$
\begin{bmatrix} r_{0,0} & r_{0,1} & r_{0,2} \\ r_{1,0} & r_{1,1} & r_{1,2} \\ r_{2,0} & r_{2,1} & r_{2,2} \end{bmatrix}
\begin{bmatrix} r_{1,0} \\ r_{1,1} \\ r_{1,2} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
$$
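These properties are easy to check numerically. The sketch below is illustrative only: rows_orthonormal() and map_row() are hypothetical helpers, and the tolerance parameter is an assumption added to cope with floating point rounding.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the rows of r are mutually orthogonal unit vectors,
 * that is, row i dot row j is 1 when i == j and 0 otherwise,
 * to within tol. Returns 0 if any pair fails the test. */
int rows_orthonormal(double r[][3], double tol)
{
    int i, j, k;

    assert(r != NULL);
    for (i = 0; i < 3; i++) {
        for (j = 0; j < 3; j++) {
            double d = 0.0;
            double want = (i == j) ? 1.0 : 0.0;

            for (k = 0; k < 3; k++)
                d += r[i][k] * r[j][k];
            if (d - want > tol || want - d > tol)
                return 0;
        }
    }
    return 1;
}

/* out = r times (row k of r written as a column vector).
 * For a rotation matrix this should be the k-th coordinate axis. */
void map_row(double r[][3], int k, double *out)
{
    int i, j;

    assert(r != NULL && out != NULL);
    assert(k >= 0 && k < 3);
    for (i = 0; i < 3; i++) {
        out[i] = 0.0;
        for (j = 0; j < 3; j++)
            out[i] += r[i][j] * r[k][j];
    }
}
```

A matrix whose rows are (0, 1, 0), (-1, 0, 0), (0, 0, 1) passes the first check, and map_row() with k = 1 returns (0, 1, 0), matching the example above.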


Operations on matrices and vectors:

It will be necessary to add the following functions to your vector algebra library. It is crucial that you support aliasing of parameters in a nondestructive way. For example, a call of the following format must work correctly:

vl_matmul3(m1, m2, m1);

For this to work correctly, you will need to compute the answer in a local matrix and then copy it back to the output matrix after it has been computed.
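A minimal sketch of the local-matrix technique follows. The function is named matmul3() here to make clear that it is an illustration and not the required library routine.

```c
#include <assert.h>
#include <stddef.h>

/* z = x * y for 3 x 3 matrices. The product is accumulated in a local
 * matrix and copied out at the end, so z may be the same array as
 * x or y without destroying inputs that are still needed. */
void matmul3(double x[][3], double y[][3], double z[][3])
{
    double tmp[3][3];
    int i, j, k;

    assert(x != NULL && y != NULL && z != NULL);
    for (i = 0; i < 3; i++) {
        for (j = 0; j < 3; j++) {
            tmp[i][j] = 0.0;
            for (k = 0; k < 3; k++)
                tmp[i][j] += x[i][k] * y[k][j];   /* row i dot column j */
        }
    }
    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            z[i][j] = tmp[i][j];
}
```

Writing directly into z inside the triple loop would overwrite elements of the left matrix that later iterations still need to read whenever z and x are the same array.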

/* Construct an identity matrix */

void vl_idmat3(
double mtx[][3]);

/* Compute the cross (outer) product of two input vectors */

void vl_cross3(
double *v1,   /* Left input vector */
double *v2,   /* Right input vector */
double *v3);  /* Output vector */

/* 3 x 3 matrix multiplier */

void vl_matmul3(
double x[][3],   /* Left input matrix */
double y[][3],   /* Right input matrix */
double z[][3]);  /* Result matrix */


/* 3 x 3 matrix transpose */

void vl_xpose3(
double x[][3],   /* Input matrix */
double z[][3]);  /* Output matrix */

/* Perform a linear transform in three dimensional space */
/* by applying a 3 x 3 matrix to a 3 x 1 column vector   */

void vl_xform3(
double y[][3],   /* Xform matrix */
double *x,       /* Input vector */
double *z);      /* Output vector */

/* project a vector onto a plane */

void vl_project3(
double *n,    /* plane normal */
double *v,    /* input vector */
double *w);   /* projected vector */


Examples of usage

#include "ray.h"

int main()
{
   double m1[3][3];
   double m2[3][3];
   double m3[3][3];
   double v1[3];
   double v2[3];
   double v3[3];
   double xaxis[3] = {1.0, 0.0, 0.0};
   double yaxis[3] = {0.0, 1.0, 0.0};
   double zaxis[3] = {0.0, 0.0, 1.0};

   vl_idmat3(m1); // create an identity matrix
   m1[0][1] = 2;
   m1[0][2] = 3;
   m1[1][0] = 15;
   m1[1][2] = 16;
   m1[2][0] = 22;
   m1[2][1] = 23;
   vl_matprn3("Original", m1);
   vl_xpose3(m1, m2);
   vl_matprn3("Transpose", m2);
   vl_matmul3(m1, m2, m3);
   vl_matprn3("Product", m3);

Original
    1.000    2.000    3.000
   15.000    1.000   16.000
   22.000   23.000    1.000

Transpose
    1.000   15.000   22.000
    2.000    1.000   23.000
    3.000   16.000    1.000

Product
   14.000   65.000   71.000
   65.000  482.000  369.000
   71.000  369.000 1014.000


   v1[0] = 1;
   v1[1] = 2;
   v1[2] = 3;

   vl_xform3(m2, v1, v2);
   vl_vecprn3("Transformed vector", v2);

   vl_cross3(xaxis, yaxis, v2);
   vl_vecprn3("X x Y is:", v2);

   vl_cross3(yaxis, zaxis, v2);
   vl_vecprn3("Y x Z is:", v2);

   vl_cross3(zaxis, xaxis, v2);
   vl_vecprn3("Z x X is:", v2);

   vl_cross3(xaxis, zaxis, v2);
   vl_vecprn3("X x Z is:", v2);

   vl_cross3(xaxis, xaxis, v2);
   vl_vecprn3("X x X is:", v2);
}

Transformed vector 97.000 73.000 38.000

X x Y is: 0.000 0.000 1.000

Y x Z is: 1.000 0.000 0.000

Z x X is: 0.000 1.000 0.000

X x Z is: 0.000 -1.000 0.000

X x X is: 0.000 0.000 0.000


Finite rectangular planes.

The planes that we have previously used have all been of unbounded size. We can also define a finite or bounded plane object.

The point field used in the definition of an unbounded plane represents the location of the “lower left” corner of the finite, rectangular plane. The plane normal plays its usual role. Additional quantities are required:

The vector xdir[3], when projected into the plane, represents the direction of increase of the x coordinate. The vector should be projected and converted into a unit vector either at the time the finite plane definition is loaded or at the time the finite plane description is dumped to the stderr log.

The vector size[2] provides the width and height of the rectangular plane in the x and y directions respectively.


Data structures for the finite plane

In the C language, there is no “built in” support for derived classes and inheritance, but, as seen earlier, we can build it in ourselves by adding a priv pointer to the plane type.

typedef struct plane_type
{
   double normal[3];
   :
   void *priv;          /* Data for specialized types */
} plane_t;

typedef struct fplane_type
{
   double xdir[3];      /* x axis direction  */
   double size[2];      /* width x height    */
   double rotmat[3][3]; /* Rotation matrix   */
   double lasthit[2];   /* used for textures */
} fplane_t;

Alternatively (but more easily) one could just junk up the plane_t structure with finite plane and/or textured plane attributes.

Input Data for the finite plane

15 finite plane
0 0 0 r g b ambient
0 6 6 r g b diffuse
0 0 0 r g b specular

1 0 3 normal
4 1 -3 point

1 2 0 x direction
5 5 size


Initializing the fplane_t

In a true object oriented language we would have the following inheritance structure

When a new fplane_t was created, constructors for obj_t, plane_t, and fplane_t would be automatically activated in that order. As we have seen, there are no constructors for C structures, but we can emulate the behavior through explicit calls and avoid needless duplication of code.

obj_t *obj_init(...)
{
   allocate new obj_t;
   link it into the object list;
   return(obj);
}

obj_t *plane_init(...)
{
   obj = obj_init(...);
   allocate new plane_t;
   link it to obj->priv;
   load material properties
   load plane geometry
   return(obj);
}

obj_t *fplane_init(...)
{
   obj = plane_init(...);
   pln = obj->priv;
   allocate new fplane_t;
   link it to pln->priv;
   read xdir, size;
   project xdir onto infinite plane
   compute required rotation matrix
   return(obj);
}

An analogous strategy can be used in fplane_dump().
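The chaining of “constructors” can be sketched in compilable form as follows. This is an illustration only: the sk_ types are stripped-down, hypothetical stand-ins for obj_t, plane_t, and fplane_t, and the sketch omits the object list, the input file, and the geometry calculations.

```c
#include <stdlib.h>

/* Hypothetical, stripped-down stand-ins for the real structures */
typedef struct { int objtype; void *priv; } sk_obj_t;
typedef struct { double normal[3]; void *priv; } sk_plane_t;
typedef struct { double xdir[3]; double size[2]; } sk_fplane_t;

/* "Base class constructor" */
sk_obj_t *sk_obj_init(int objtype)
{
   sk_obj_t *obj = calloc(1, sizeof(*obj));

   if (obj != NULL)
      obj->objtype = objtype;
   return obj;
}

/* Invokes the parent constructor first and then adds its own part */
sk_obj_t *sk_plane_init(int objtype)
{
   sk_obj_t *obj = sk_obj_init(objtype);
   sk_plane_t *pln;

   if (obj == NULL)
      return NULL;
   pln = calloc(1, sizeof(*pln));
   if (pln == NULL) {
      free(obj);
      return NULL;
   }
   obj->priv = pln;
   return obj;
}

/* Same pattern one level further down */
sk_obj_t *sk_fplane_init(int objtype)
{
   sk_obj_t *obj = sk_plane_init(objtype);
   sk_plane_t *pln;
   sk_fplane_t *fp;

   if (obj == NULL)
      return NULL;
   pln = obj->priv;
   fp = calloc(1, sizeof(*fp));
   if (fp == NULL) {
      free(pln);
      free(obj);
      return NULL;
   }
   pln->priv = fp;
   return obj;
}
```

Each level touches only the structure it owns, so a change to how planes are loaded does not have to be repeated in the finite plane code.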


[Figure: inheritance structure, base to derived: obj_t, plane_t, fplane_t]

The fplane_hits() function

The obj->hits() pointer for a finite plane should point to fplane_hits().

double fplane_hits(
double *base,    /* the (x, y, z) coords of origin of the ray */
double *dir,     /* the (x, y, z) direction of the ray        */
obj_t *obj)      /* the object to be tested for the hit.      */
{

Even though an fplane_hits() function is required, it would be a very bad idea to paste the internals of plane_hits() inline here. Instead, plane_hits() should be invoked to determine if and where the ray hits the infinite plane in which the finite plane is contained.

   t = plane_hits(base, dir, obj);
   if (t < 0)
      return(t);

Arrival here means that the ray hit the infinite plane and that the location of the hit has been stored in obj->hitloc[]. The next task is to determine whether the hit location is within the bounds of the prescribed rectangular area.

In general, this seems like a very difficult problem. But there is a case for which the answer is simple. Suppose the base of the rectangle happened to be at (0, 0, 0), the xdir[] vector was (1, 0, 0) and the plane normal was (0, 0, 1). In that case, the rectangular finite plane is based at the origin and lies in the (x, y) plane. Therefore the following test could be applied.

   if ((obj->hitloc[0] > fp->size[0]) || (obj->hitloc[0] < 0.0))
      return(-1);

   if ((obj->hitloc[1] > fp->size[1]) || (obj->hitloc[1] < 0.0))
      return(-1);


Transforming the coordinates of the finite plane.

A two-step coordinate system transformation may be applied to the original obj->hitloc[] to permit use of the simple test on the previous page:

(1) translate (move) the lower left corner of the finite plane to the origin, and

(2) rotate the coordinate system so that

   the plane normal rotates into the positive z-axis and
   the xdir[] vector rotates into the x-axis.

Step 1 can be accomplished via a simple:

vl_diff3(pln->point, obj->hitloc, newhit);

Constructing the rotation is slightly more complicated. Once the rotation has been constructed the second step may be accomplished via:

vl_xform3(rot, newhit, newhit);

After this is done, the simple test on the previous page may be applied to newhit.
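Putting the translation, the rotation, and the simple test together gives something like the sketch below. It is self-contained for illustration and therefore does not use the obj_t and fplane_t structures or the vl_ functions. The name fplane_inside_sketch() and its parameter list are hypothetical.

```c
/* Sketch: return 1 if hit lies within the finite plane, else 0.  */
/*    point - lower left corner of the finite plane               */
/*    rot   - rotation taking xdir to the x-axis and the plane    */
/*            normal to the z-axis                                */
/*    size  - width and height of the rectangle                   */
/*    hit   - hit location on the infinite plane                  */
int fplane_inside_sketch(double *point, double rot[][3],
                         double *size, double *hit)
{
   double t[3];     /* translated hit location */
   double x, y;     /* rotated x and y coordinates */
   int i;

   /* Step 1: move the lower left corner to the origin */
   for (i = 0; i < 3; i++)
      t[i] = hit[i] - point[i];

   /* Step 2: rows 0 and 1 of rot produce the rotated x and y */
   x = rot[0][0] * t[0] + rot[0][1] * t[1] + rot[0][2] * t[2];
   y = rot[1][0] * t[0] + rot[1][1] * t[1] + rot[1][2] * t[2];

   /* The simple test in the rotated coordinate system */
   if (x < 0.0 || x > size[0])
      return 0;
   if (y < 0.0 || y > size[1])
      return 0;
   return 1;
}
```

The rotated z coordinate is not computed because, for a point on the plane, it should be zero.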


Rotating an arbitrary pair of orthogonal vectors into the x and z axes

int main()
{
   double rot[3][3];
   double irot[3][3];
   double norm[3];
   double xdir[3];

   double v1[3];
   double v2[3];
   double v3[3];

/* plane normal */

   norm[0] = 1.0;
   norm[1] = 1.0;
   norm[2] = 1.0;

/* the x direction */

   xdir[0] = 1.0;
   xdir[1] = 0.0;
   xdir[2] = -1.0;

/* The first row of the rotation matrix will rotate into */
/* the x-axis so that's where we put xdir                */

   vl_unitvec3(xdir, rot[0]);

/* We want the normal to end up on the z-axis so we make it */
/* the third row of the rotation matrix                     */

   vl_unitvec3(norm, rot[2]);

/* Once two rows of a rotation are set, the third one */
/* is automatic... it has to be orthogonal to the     */
/* other two.                                         */

   vl_cross3(rot[2], rot[0], rot[1]);
   vl_matprn3("Rotation matrix", rot);

/* Demonstrate that the normal does indeed rotate into the z-axis */

   vl_xform3(rot, norm, v2);
   vl_vecprn3("\nRotated normal ", v2);

/* and that xdir rotates into the x-axis */

   vl_xform3(rot, xdir, v3);
   vl_vecprn3("Rotated xdir ", v3);

/* The inverse of a rotation is its transpose */

   vl_xpose3(rot, irot);
   vl_xform3(irot, v3, v1);
   vl_vecprn3("Rotated back xdir", v1);
}

Rotation matrix

    0.707    0.000   -0.707
   -0.408    0.816   -0.408
    0.577    0.577    0.577

Rotated normal        0.000    0.000    1.732
Rotated xdir          1.414    0.000    0.000
Rotated back xdir     1.000    0.000   -1.000


Scope and storage class of variables

The scope of a variable refers to those portions of a program wherein it may be accessed.

Failure to understand scoping rules can lead to two problems:

(1) Syntax errors (easy to find and fix)

(2) Accidentally using the wrong instance of a variable (sometimes very hard to find).

Two general rules apply

(1) The declaration of a variable must precede any use of it.

(2) If a particular line of code is in the scope of multiple variables of the same name the innermost declaration of the variable is the one that is used.

Specific refinements of these rules include:

(1) the scope of any variables declared outside any function is all code in the source module that appears after the definition.

(2) the scope of any variable declared inside a basic block is all code in that block and any blocks nested within that block that appears after the definition.


Improper definition location

 1 /* p13.c */
 2
 3 /* this program demonstrates some of the characteristics */
 4 /* of variable scoping in C. */
 5
 6 /* The scope of y and z is all lines that follow their */
 7 /* definitions. Thus z may be used in f1 and f2 */
 8 /* but y may be used only in f2 */
 9
10 int z = 12;
11
12 int f1(
13 int x)
14 {
15    x = x + y + z;
16    return(x);
17 }
18
19 int y = 11;
20
21 int f2(
22 int x)
23 {
24    x = x + y + z;
25
26 }
27

class/215/examples ==> gcc p13.c
p13.c: In function `f1':
p13.c:15: `y' undeclared (first use in this function)
p13.c:15: (Each undeclared identifier is reported only once
p13.c:15: for each function it appears in.)
p13.c: At top level:
p13.c:19: `y' used prior to declaration


Overlapping scope

Example program p14.c illustrates that multiple declarations of variables having the same name are legal and result in overlapping scope.

In this program there exist three different variables named y. When the program accesses y, which y is used is governed by the innermost definition rule.

/* p14.c */

/* This program illustrates that multiple different */
/* declarations of a variable having the same name  */
/* may have overlapping scope.                      */

#include <stdio.h>

int y = 11;

int main( )
{
   int y = 12;

   if (1)
   {
      int y, z;
      y = 92;

      printf("inner y = %d \n", y);
   }
   printf("middle y = %d \n", y);
}

class/215/examples ==> p14
inner y = 92
middle y = 12

For sane debugging never use multiple variables with the same name and overlapping scope.

Note that if you always call your loop counter variable i and you always declare it at the start of each function, you are not violating this guideline. In this case there are multiple i's but their scopes don't overlap.


Storage class

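The storage class of a variable refers to the area of memory in which the variable is stored.

The two available areas are commonly referred to as the heap and the stack. (Strictly speaking, variables whose storage is assigned at load time live in the program's static data area and only dynamically allocated memory lives in the heap proper. These notes use “heap” loosely to cover both.)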

Stack resident variables include:

parameters passed to functions
variables declared inside basic blocks that are not declared static
memory areas dynamically allocated with alloca()

Heap resident variables include:

variables declared outside all functions
variables declared inside basic blocks that are declared static
memory areas dynamically allocated with malloc()

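Storage for heap resident variables that are declared in the program (the first two categories above) is assigned at the time a program is loaded and remains assigned for the life of the program. Storage obtained from malloc() is assigned when malloc() is called and remains assigned until it is released with free().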

Stack resident variables are created at entry to the basic block that contains them and deleted at exit from the block.


Non-persistence of stack resident variables

The example below appears to contradict the claim that storage is assigned to a stack resident variable only for the time in which the block is active.

At first entry to f1() the variable x is uninitialized.
The variable x is set to 55 before returning from f1().
The variable is still 55 at entry on the second call.

class/215/examples ==> cat p15.c
/* p15.c */

#include <stdio.h>

void f1(void)
{
   int x;

   printf("At entry to f1 x = %d \n", x);

   x = 55;
}

int main()
{
   f1();
   f1();
}

class/215/examples ==> p15
At entry to f1 x = -1073743532
At entry to f1 x = 55


Example p16.c shows that the claim was indeed true and it was only bad luck that made it appear otherwise.

/* p16.c */

#include <stdio.h>

void f1(void)
{
   int x;

   printf("At entry to f1 x = %d \n", x);
   x = 55;
}

void f2(void)
{
   int z;

   printf("At entry to f2 z = %d \n", z);
   z = 102;
}

int main()
{
   f1();
   f2();
   f1();
}

class/215/examples ==> p16
At entry to f1 x = -1073743644
At entry to f2 z = 55
At entry to f1 x = 102

It can be observed from the output above that in this particular case the variable x in f1() and the variable z in f2() are in fact occupying the same physical storage.


Details of stack allocation

/* p27.c */

int adder(
int a,
int b)
{
   int d;
   int e;
   d = a + b;
   e = d - a;
   return(d);
}

At entry to the function adder the stack is organized as follows:

Parm - 2 (b)
Parm - 1 (a)
Return address    <- SP

The compiler produces a prologue to the body of the function which looks like the code shown below. The ebp register is known as the base pointer or the frame pointer. All stack resident variables are addressed using base/displacement addressing with ebp serving as the base.

adder:
   pushl %ebp             ;save caller's frame ptr
   movl  %esp, %ebp       ;set up my frame pointer
   subl  $8, %esp         ;allocate local vars


After the prologue completes the stack looks as follows:

(ebp + 12)  Parm - 2 (b)
(ebp + 8)   Parm - 1 (a)
(ebp + 4)   Return address
(ebp + 0)   Saved ebp        <- BP
(ebp - 4)   local var (d)
(ebp - 8)   local var (e)    <- SP

d = a + b;

   movl  12(%ebp), %eax   ;load b into eax
   addl  8(%ebp), %eax    ;add a to eax
   movl  %eax, -4(%ebp)   ;store sum at d

e = d - a;

   movl  8(%ebp), %edx    ;load a into edx
   movl  -4(%ebp), %eax   ;load d into eax
   subl  %edx, %eax       ;subtract a from d
   movl  %eax, -8(%ebp)   ;save result in e

return(d);

   movl  -4(%ebp), %eax   ;copy d to return reg
   leave                  ;Copies EBP to ESP then POPS EBP

After the leave executes:

(ebp + 12)  Parm - 2 (b)
(ebp + 8)   Parm - 1 (a)
(ebp + 4)   Return address   <- SP

   ret                    ;POPS EIP


The static and extern modifiers

The static and extern modifiers are used as follows:

static int num;
extern int extnum;

The action of the static modifier is dependent upon the location of the declaration. When used inside the body of a function, static forces the variable to

1 reside on the heap instead of the stack and thus

2 safely retain its value across function calls

int public_val;

int adder(
int a)
{
   static int sum;
   sum += a;
   return(sum);
}

The extern modifier can be used to access a public variable that is declared in another source module. If, in another module, I need to access the variable public_val that is declared in the module above, I can declare it as follows:

extern int public_val;

main()
{
   public_val = 15;
}


Use of static on variables declared outside function bodies

The action of the static modifier is dependent upon the location of the declaration. When used outside the body of a function as in declaration of private_val, static

1 limits the scope of the variable to this source module.

2 but the variable still remains on the heap.

static int private_val;

int adder(
int a)
{
   static int sum;
   sum += a;
   return(sum);
}

This will defeat the ability of the extern modifier to access the variable.

extern int private_val;

main()
{
   private_val = 15;
}

The two source files will compile correctly but the linker ld will fail because p34.c no longer publishes the address of private_val.

class/215/examples ==> gcc p34.c p35.c
/tmp/ccNrTn6K.o(.text+0x12): In function `main':
: undefined reference to `private_val'
collect2: ld returned 1 exit status
class/215/examples ==>


The tiled plane

The tiled plane is a new object that has characteristics of both the infinite and finite planes. Like the original infinite plane it is of unbounded size. Thus, the plane_hits() function in your original plane.c can be used to determine where the object is hit.

As can be observed, the plane is comprised of a collection of tiles. The tiles share characteristics of the finite plane. The tiles have x and y dimensions and a vector which, when projected onto the plane, determines the x-axis direction of the tiling.

Unlike both the basic plane and the finite plane, the tiled plane has two sets of colors. The foreground color may be stored as usual in the material_t component of the obj_t structure, but we need to save the background color in the tplane_t structure itself.


Implementation of the tiled plane

Like the finite plane, the tiled plane is derived from the infinite plane:

The tplane_t structure

typedef struct tplane_type
{
   double xdir[3];        /* orientation of the tiling */
   double size[2];        /* size of the tiling        */
   material_t background; /* background color          */
} tplane_t;


[Figure: inheritance structure, base to derived: obj_t, plane_t, tplane_t]

Using the function pointers of the obj_t to provide polymorphic behavior

In the discussion of the procedural plane, it was shown how the getamb(), getdiff(), and getspec() functions, which can optionally be overridden by various objects, provide a useful way to emulate the polymorphism of a true object oriented language.

typedef struct obj_type
{
   struct obj_type *next;  /* Next object in list         */
   int objid;              /* Numeric serial # for debug  */
   int objtype;            /* Type code (14 -> Plane )    */
   double (*hits)(double *base, double *dir,
                  struct obj_type *);  /* Hits function.  */

/* Optional plugins for retrieval of reflectivity */
/* useful for the ever-popular tiled floor        */

   void (*getamb)(struct obj_type *, double *);
   void (*getdiff)(struct obj_type *, double *);
   void (*getspec)(struct obj_type *, double *);

   material_t material;

   double emissivity[3];   /* For lights                  */

   void *priv;             /* Private type-dependent data */
   double hitloc[3];       /* Last hit point              */
   double normal[3];       /* Normal at hit point         */
} obj_t;


Invoking the polymorphic methods from raytrace() and illuminate()

Recall that the main benefit of the polymorphic approach is that it allows us to add new object types having specialized reflectivity models without having to modify and junk up existing functions such as raytrace() and illuminate() with constructs such as:

if (closest->objtype == TILED_PLANE)
   do this
else if (closest->objtype == TEXTURED_PLANE)
   do that
else if (closest->objtype == PROCEDURAL_PLANE)
   do something else still
else
   provide default behavior

Instead the ambient reflectivity should be recovered in raytrace() using:

/* Hit something not a light */

   closest->getamb(closest, intensity);
   diffuse_illumination(lst, closest, intensity);
   vl_scale3(1 / total_dist, intensity, intensity);

and the diffuse reflectivity in illuminate() using:

   hitobj->getdiff(hitobj, diffuse);
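The mechanism can be seen in miniature in the sketch below. Everything in it is hypothetical and simplified for illustration: sk_shape_t stands in for obj_t, and the “specialized” version simply halves the stored reflectivity. The caller never inspects a type code. It just calls through the pointer.

```c
/* Hypothetical miniature of obj_t with one overridable function */
typedef struct sk_shape
{
   void (*getdiff)(struct sk_shape *, double *);
   double diffuse[3];
} sk_shape_t;

/* Default behavior: return the stored reflectivity unchanged */
void sk_default_getdiff(sk_shape_t *s, double *value)
{
   int i;

   for (i = 0; i < 3; i++)
      value[i] = s->diffuse[i];
}

/* A specialized version that an object may install instead */
void sk_half_getdiff(sk_shape_t *s, double *value)
{
   int i;

   for (i = 0; i < 3; i++)
      value[i] = 0.5 * s->diffuse[i];
}

/* The caller is identical for every object type */
void sk_illuminate(sk_shape_t *hitobj, double *diffuse)
{
   hitobj->getdiff(hitobj, diffuse);
}
```

Adding another object type means writing one more function and storing its address in the structure. sk_illuminate() does not change.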


Loading a tiled plane object

As can be seen, the tiled plane specification is comprised of a finite plane specification followed by the alternate tile coloring.

14 tiled plane
0 0 0 r g b ambient (foreground tiles)
6 1 8 r g b diffuse
0 0 0 r g b specular

1 0 1 normal
0 0 0 point is the lower left corner of a foreground tile
1 1 -1 x direction
1.25 0.5 size of a tile

0 0 0 r g b ambient (background tiles)
8 4 0 r g b diffuse
0 0 0 r g b specular

Thus it would be reasonable to either:

derive the tplane_t from the plane_t and copy the inner workings of fplane_init() to tplane_init() or

derive the tplane_t from the fplane_t and have tplane_init() invoke fplane_init().

I elected to derive it from plane_t and thus my code duplicates the xdir, size, and rotmat elements of fplane_t.

typedef struct tplane_type
{
   double xdir[3];        /* orientation of the tiling */
   double size[2];        /* size of the tiling        */
   double rotmat[3][3];   /* Rotation matrix           */
   material_t background; /* background color          */
} tplane_t;


Loading the tiled plane description

Here is the bulk of what is required to set up a tplane_t object.

obj_t *tplane_init(
FILE *in,
list_t *lst,
int objtype)
{
   int pcount;
   tplane_t *tp;
   plane_t *p;
   obj_t *obj;

/* Invoke "constructor" for "parent class" */

   Need to invoke plane_init() here

/* Create the tplane_t object and point the plane_t to it */

   Allocate a tplane_t and set the priv pointer in the plane_t

/* override the default reflectivity functions */

   obj->getamb = tp_amb;    /* have to write these */
   obj->getdiff = tp_diff;
   obj->getspec = tp_spec;

/* Load xdir and size fields as done in fplane */

   Have to copy in code from fplane here

/* Finally load the background material reflectivity */

   Need to call material load here

   return(obj);
}


The reflectivity functions tp_amb, tp_diff, and tp_spec

From a high level perspective, the mission of these fellows is easy:

Determine if the obj->hitloc lies in a “foreground” tile.
If so, copy the reflectivity stored in the obj_t.
If not, copy the “background” reflectivity stored in the tplane_t.

The hard part is the first step so we abstract that out to the tp_select() function which will return 1 for “foreground” and 0 for “background”.

void tp_diff(
obj_t *obj,
double *value)
{
   plane_t *pln = (plane_t *)obj->priv;
   tplane_t *tp = (tplane_t *)pln->priv;

   if (tp_select(obj))
      vl_copy3(obj->material.diffuse, value);
   else
      vl_copy3(tp->background.diffuse, value);
}

The tp_amb() and tp_spec() functions are clearly trivial modifications to tp_diff().


Determining if a foreground or background tile has been hit

First consider the simple case:

The plane normal is the positive z-axis.
The plane point (lower left corner of a foreground tile) is the origin.
obj->hitloc[] contains the hit point location (note obj->hitloc[2] == 0).

The relative tile numbers in the x and y directions of the tile that contains obj->hitloc[] are then

relx = (int)(obj->hitloc[0] / tp->size[0]);

and

rely = (int)(obj->hitloc[1] / tp->size[1]);

The parentheses around the quotient matter: the cast must be applied to the result of the division and not to obj->hitloc[] alone.

For example

suppose tp->size = {2, 3} and obj->hitloc = {7.2, 6.5, 0.0}

then

relx = 3;
rely = 2;

Having done this, the tp_select() function simply returns 0 if relx + rely is even and 1 if relx + rely is odd. This condition may be expressed as:

(relx + rely) % 2;


Complicating factors

While the preceding algorithm was simple it has a couple of holes that remain to be filled.

Suppose tp->size = {1, 1};

Consider the two hitloc's {0.5, 0.5, 0.0} and {-0.5, 0.5, 0.0}

The two locations are clearly in different but adjacent tile squares. The dividing line between the two tiles is the y-axis. Points in adjacent squares must have different colors.

Unfortunately the algorithm just described will yield relx = rely = 0 for both points, because the (int) cast truncates toward zero.

This will create ugly “double wide” strips of tiles along the x and y axes.

There are various hacks that can be used to prevent this. A particularly ugly one (Westall's hack) is:

relx = (int)(10000 + obj->hitloc[0] / tp->size[0]);

and

rely = (int)(10000 + obj->hitloc[1] / tp->size[1]);
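Another way to plug the hole, offered here only as an alternative sketch and not as the method used in the course code, is to round toward negative infinity instead of toward zero. The names floor_int() and tile_select_sketch() are hypothetical, and the hit location and tile size are passed as plain numbers to keep the example self-contained. Following the rule stated earlier, the function returns 0 when relx + rely is even and 1 when it is odd.

```c
/* Largest integer <= q, for q within the range of int */
static int floor_int(double q)
{
   int t = (int)q;          /* the cast truncates toward zero */

   if ((double)t > q)       /* true only for negative non-integers */
      t--;
   return t;
}

/* Sketch: parity of the tile containing (hx, hy) when the tiles */
/* have size sx by sy and a tile corner is at the origin.        */
int tile_select_sketch(double hx, double hy, double sx, double sy)
{
   int relx = floor_int(hx / sx);
   int rely = floor_int(hy / sy);
   int sum = relx + rely;

   /* In C the % operator can yield -1 for a negative sum, */
   /* so the result is folded back into the range 0..1.    */
   return ((sum % 2) + 2) % 2;
}
```

With this version {0.5, 0.5} and {-0.5, 0.5} land in tiles of different parity, and no assumption is needed about how far the plane extends in the negative direction.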


Planes of arbitrary orientation

In the general case

the plane normal is not aligned with the z-axis
the xdir vector is not aligned with the x-axis
the point at the base of a tile square is not at the origin.

In this case the solution is analogous to the one used in the fplane_hits() function.

Subtract the coordinates of the plane->point which defines the lower left corner of a foreground tile square from obj->hitloc[] obtaining a translated hit position called newhit[].

Construct a rotation matrix that will rotate

the plane normal into the z-axis
the xdir vector into the x-axis

Apply the rotation to newhit[]. Note: if newhit[2] is not (very nearly) 0 at this point something is fatally flawed!!

Compute relx and rely using newhit[] and proceed as previously described.


Unions

A union is a structured data type that can be used to overlay different data types upon the same storage.

#include <stdio.h>

union fp_type
{
   unsigned char bvals[4];
   float fval;
} x;

main()
{
   x.fval = 34.25;
   printf("%02x %02x %02x %02x \n", x.bvals[0], x.bvals[1],
          x.bvals[2], x.bvals[3]);
}

acad/cs215/examples/mwray18 ==> gcc union.c
acad/cs215/examples/mwray18 ==> a.out
00 00 09 42

The amount of storage allocated for a union is the size of the largest component. In this example both have the same size but that is not necessary.
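The sketch below illustrates the sizing rule with members of different sizes. The union sk_mixed and the function sk_union_checks() are hypothetical names invented for this example. One caveat that is easy to miss: the compiler may pad a union beyond its largest member to satisfy alignment, so the check uses >= rather than ==.

```c
#include <stddef.h>

/* Hypothetical union whose members have different sizes */
union sk_mixed
{
   unsigned char bytes[3];
   double        dval;
   short         sval;
};

/* Return 1 if the union is at least as large as its largest  */
/* member and every member starts at offset 0, 0 otherwise.   */
int sk_union_checks(void)
{
   return sizeof(union sk_mixed) >= sizeof(double)
       && sizeof(union sk_mixed) >= sizeof(unsigned char[3])
       && offsetof(union sk_mixed, bytes) == 0
       && offsetof(union sk_mixed, dval) == 0
       && offsetof(union sk_mixed, sval) == 0;
}
```

Because all members start at offset 0, storing into one member overwrites some or all of the bytes of the others.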


Unnamed unions

Another possible use for a union would be to embed the definition of all specific object types within the obj structure:

typedef struct obj_type
{
   int objid;
   int objtype;
   union
   {
      sphere_t sphere;
      plane_t plane;
   };
} obj_t;

obj_t ob;

main()
{
   ob.sphere.radius = 5;
}

Note that the union was not given a name in the above code. We can name it but if we do, the name must be used in referencing the internal objects.

typedef struct obj_type
{
   int objid;
   int objtype;
   union
   {
      sphere_t sphere;
      plane_t plane;
   } u;
} obj_t;

obj_t ob;

main()
{
   ob.u.sphere.radius = 5;
}


Bitfields

It's possible to subdivide words into bitfields having individual names.

#include <stdio.h>

struct bf_type
{
   unsigned int p1:4;
   unsigned int p2:8;
   unsigned int p3:4;
} bf;

main()
{
   bf.p1 = 12;
   bf.p2 = 7;
   bf.p3 = 4;
   printf("%04x\n", bf);     /* passing the whole struct relies on */
                             /* compiler-specific behavior         */
   printf("%04x\n", bf.p1);
   printf("%04x\n", bf.p2);
   printf("%04x\n", bf.p3);
}

407c
000c
0007
0004

Because of endian issues bitfields are inherently not portable.
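When the layout must be the same on every machine, a common alternative is to do the packing explicitly with shifts and masks. The sketch below is one such arrangement. The function names are hypothetical, and the bit positions (p1 in bits 0-3, p2 in bits 4-11, p3 in bits 12-15) were chosen to reproduce the 407c value shown above.

```c
/* Pack three fields into the low 16 bits of an unsigned int: */
/*    p1 -> bits 0-3, p2 -> bits 4-11, p3 -> bits 12-15       */
unsigned int pack_fields(unsigned int p1, unsigned int p2,
                         unsigned int p3)
{
   return (p1 & 0xFu) | ((p2 & 0xFFu) << 4) | ((p3 & 0xFu) << 12);
}

/* Extract the individual fields again */
unsigned int get_p1(unsigned int w) { return w & 0xFu; }
unsigned int get_p2(unsigned int w) { return (w >> 4) & 0xFFu; }
unsigned int get_p3(unsigned int w) { return (w >> 12) & 0xFu; }
```

Here the position of each field within the word is fixed by the code itself rather than by the compiler.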


Pointers to pointers

Pointers to pointers are declared using **.

If a variable is declared as

item_t **double_p;

Then

   double_p is an item_t ** which is a pointer to an item_t *
   *double_p is an item_t * which is a pointer to an item_t
   **double_p is an item_t

In making (incorrect) assignments involving multiple levels of indirection it is common to see compiler warnings regarding “different levels of indirection”. These should not be ignored.

Exercise: given the following declarations, which of the following assignments will not generate compiler warnings?

item_t **dp;
item_t *sp;
item_t i;

sp = i;
dp = &i;
dp = &sp;
sp = *dp;
i = *sp;


Passing the address of a table of pointers

In practice double pointers are occasionally useful in two contexts. One is in passing the address of a table of pointers as a parameter.

main(
int argc,
char **argv)
{
   **argv is the first character of the program name
   *(*(argv + 1)) is the first character of the first parameter
   *(*(argv + 2) + 1) is the second character of the second parameter
}

Because the syntax is a bit on the opaque side you are strongly encouraged to run offline tests such as the one shown below to convince yourself you really know what you are doing.

#include <stdio.h>

main(
int argc,
char **argv)
{
   printf("%c %c %c \n", **argv, *(*(argv + 1)), *(*(argv + 2) + 1));
}

acad/cs215/examples/mwray18 ==> a.out Hello World
a H o


Allowing a function to fill in the address of allocated storage

Double pointers can also be used if a subroutine is used to allocate a data structure and return its address to the caller. So far we have recommended the following approach:

obj_t *obj_load()
{
   obj_t *new;
   new = malloc(sizeof (obj_t));
   return(new);
}

main()
{
   obj_t *newobj;
   newobj = obj_load();
}

An alternative approach is:

int obj_load(
obj_t **new)
{
   *new = malloc(sizeof (obj_t));
   return(0);
}

main()
{
   obj_t *newobj;
   int rc;
   rc = obj_load(&newobj);
}
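One attraction of the second approach is that the return value is free to carry a status code. The sketch below shows this with a failure check added. The type sk_item_t and the function sk_item_load() are hypothetical stand-ins so that the example compiles on its own, and the convention of returning 0 for success and -1 for failure is an assumption.

```c
#include <stdlib.h>

/* Hypothetical stand-in for obj_t */
typedef struct { int objid; } sk_item_t;

/* Returns 0 on success and -1 on failure. On success *out   */
/* holds the address of the new item; on failure it is NULL. */
int sk_item_load(sk_item_t **out, int id)
{
   sk_item_t *p = malloc(sizeof(*p));

   if (p == NULL) {
      *out = NULL;
      return -1;
   }
   p->objid = id;
   *out = p;       /* fill in the caller's pointer */
   return 0;
}
```

The caller passes the address of its own pointer, as in rc = sk_item_load(&item, 7), and should test rc before using item.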
