
COMP 6761 Advanced Computer Graphics

Lecture Notes

Peter Grogono

These notes may be photocopied for students taking COMP 6761 at Concordia University.

© Peter Grogono 2002, 2003

Department of Computer Science
Concordia University

Montreal, Quebec


Contents

1 Introduction
  1.1 Getting Started
  1.2 Callbacks
    1.2.1 Display
    1.2.2 Reshaping Events
    1.2.3 Keyboard Events
    1.2.4 Mouse Events
    1.2.5 Idle
  1.3 OpenGL Naming Conventions
    1.3.1 Type Names
    1.3.2 Function Names
  1.4 General Features of OpenGL
    1.4.1 States
    1.4.2 Coordinates
  1.5 Drawing Objects
    1.5.1 Primitive Objects
    1.5.2 GLUT Objects
    1.5.3 Quadric Objects
  1.6 Hidden Surface Elimination
  1.7 Animation

2 Transformations and Projections
  2.1 Matrices in OpenGL
  2.2 Projection Matrices
    2.2.1 Orthogonal Projections
    2.2.2 Perspective Projections
  2.3 Model View Transformations

3 Building Models and Scenes
  3.1 A Digression on Global Variables
  3.2 Matrix Stacks
    3.2.1 Pops and Pushes Don't Cancel!
    3.2.2 Animation with Stacks
  3.3 Viewing the Model

4 Lighting
  4.1 Lighting Basics
    4.1.1 Hiding Surfaces and Enabling Lights
    4.1.2 Kinds of Light
  4.2 Material Properties
  4.3 Light Properties
  4.4 Lighting Model
  4.5 Lighting in Practice
  4.6 Normal Vectors

5 Special Effects
  5.1 Blending
  5.2 Fog
  5.3 Reflection
  5.4 Display Lists
  5.5 Bezier Curves and Surfaces
    5.5.1 Curves
    5.5.2 Surfaces
  5.6 Menus
  5.7 Text
  5.8 Other Features of OpenGL
    5.8.1 Textures
    5.8.2 NURBS
    5.8.3 Antialiasing
    5.8.4 Picking
    5.8.5 Error Handling
  5.9 Program Development

6 Organization of a Graphics System
  6.1 The Graphics Pipeline
    6.1.1 Per Vertex Operations
    6.1.2 Primitive Assembly
    6.1.3 Rasterization
    6.1.4 Pixel Operations
    6.1.5 Fragment Operations
  6.2 Rasterization
    6.2.1 Drawing a Straight Line
    6.2.2 Drawing a Circle
    6.2.3 Clipping

7 Transformations — Again
  7.1 Scalar Spaces
  7.2 Vector Spaces
  7.3 Affine Spaces
  7.4 Transformations
    7.4.1 Translation
    7.4.2 Scaling
    7.4.3 Rotation
  7.5 Non-Affine Transformations
    7.5.1 Perspective Transformations
    7.5.2 Shadows
    7.5.3 Reflection
  7.6 Working with Matrices

8 Rotation
  8.1 Groups
  8.2 2D Rotation
    8.2.1 Representing 2D Rotations with Matrices
    8.2.2 Representing 2D Rotations with Complex Numbers
  8.3 3D Rotation
    8.3.1 Representing 3D Rotations with Matrices
    8.3.2 Representing 3D Rotations with Quaternions
    8.3.3 A Proof that Unit Quaternions Represent Rotations
    8.3.4 Quaternions and Matrices
    8.3.5 Quaternion Interpolation
  8.4 Quaternions in Practice
    8.4.1 Imitating a Trackball
    8.4.2 Moving the Camera
    8.4.3 Flying

9 Theory of Illumination
  9.1 Steps to Realistic Illumination
    9.1.1 Intrinsic Brightness
    9.1.2 Ambient Light
    9.1.3 Diffuse Lighting
    9.1.4 Attenuation of Light
    9.1.5 Coloured Light
    9.1.6 Specular Reflection
    9.1.7 Specular Colours
    9.1.8 Multiple Light Sources
  9.2 Polygon Shading
    9.2.1 Flat Shading
    9.2.2 Smooth Shading

10 The Theory of Light and Colour
  10.1 Physiology of the Eye
  10.2 Achromatic Light
  10.3 Coloured Light
  10.4 The CIE System
    10.4.1 Using Gamuts
  10.5 Other Colour Systems
    10.5.1 RGB
    10.5.2 CMY
    10.5.3 YIQ
  10.6 Gamma Correction

11 Advanced Techniques
  11.1 Ray-Tracing
    11.1.1 Recursive Ray Tracing
    11.1.2 Summary
  11.2 Radiosity
    11.2.1 Computing Form Factors
    11.2.2 Choosing Patches
    11.2.3 Improvements
  11.3 Bump Mapping
  11.4 Environment Mapping
  11.5 The Accumulation Buffer

References

List of Figures

1  A simple display function
2  OpenGL Types
3  A simple reshape function
4  Primitive specifiers
5  Drawing primitives
6  A coloured triangle
7  Perspective projection using gluPerspective()
8  An OpenGL program with a perspective transformation
9  Translation followed by rotation
10 Rotation followed by translation
11 Programming with global variables
12 Programming with fewer global variables
13 Programming with fewer global variables, continued
14 Drawing an arm — first version
15 Drawing an arm — improved version
16 Drawing a Maltese Cross
17 Pushing and popping
18 Zooming
19 Using gluLookAt() and gluPerspective()
20 Parameters for glMaterialfv()
21 Using glMaterial()
22 Parameters for glLightfv()
23 Parameters for glLightModel()
24 Computing normals
25 Computing average normals on a square grid
26 Fog Formulas
27 Parameters for glFog
28 Two three-point Bezier curves with their control points
29 Control parameters for Bezier curves
30 Using Bezier surfaces for the body of a plane
31 Points generated by the code of Figure 30
32 Functions for menu callback and creation
33 A C function for lines with slope less than 1
34 Drawing a circle
35 Computing points in the first octant
36 Plotting eight symmetrical points
37 Sutherland-Hodgman Polygon Clipping
38 Labelling the regions
39 Varieties of Transformation
40 Perspective
41 Quaternion multiplication
42 Mouse callback function for trackball simulation
43 Projecting the mouse position
44 Updating the trackball quaternion
45 Translating the camera
46 Unit vectors
47 Auxiliary function for translating the camera
48 Rotating the camera
49 Callback function for flying the plane
50 Illuminating an object
51 Calculating R
52 Gouraud Shading
53 Phong Shading
54 Dynamic range and perceptible steps for various devices
55 CIE Chromaticity Coordinates
56 Ray Tracing
57 Lighting in the ray-tracing model
58 Effect of glAccum(op, val)


1 Introduction

The course covers both practical and theoretical aspects of graphics programming at a fairly advanced level. It starts with practice: specifically, writing graphics programs with OpenGL. During the second part of the course, we will study the theory on which OpenGL and other graphics libraries are based.

OpenGL is an industry-standard graphics library. OpenGL programs run on most platforms. All modern graphics cards contain hardware that handles OpenGL primitive operations, which makes OpenGL programs run fast on most platforms. OpenGL provides a high-level interface, making it easy to learn and use. However, graphics libraries have many common features and, having learned OpenGL, you should find it relatively easy to learn another graphics system.

OpenGL is the basic library: it includes GL (Graphics Library) and GLU (Graphics Library Utilities). GLU does not contain primitives; all of its functions make use of GL functions.

GL does not know anything about the windows system of the computer on which it is running. It has a frame buffer, which it fills with appropriate data, but displaying the frame buffer on the screen is the responsibility of the user.

GLUT (Graphics Library Utility Toolkit) provides the functionality necessary to transfer the OpenGL frame buffer to a window on the screen. GLUT programs are platform-independent: a GLUT program can be compiled and run on a unix workstation, a PC running Windows, or a Mac, with substantially the same results. Lectures in this course will be based on GLUT programming.

If you don't use GLUT, you will have to understand how to program using windows. On a PC, this means learning either the Windows API or MFC (although interfacing OpenGL and MFC is not particularly easy). A good source of information for the Windows API is The OpenGL SuperBible by Richard S. Wright, Jr. and Michael Sweet (Waite Group Press, 2000). On a unix workstation, you will need a good understanding of X Windows.

The OpenGL hardware accelerators on graphics cards require special drivers. At Concordia, these drivers have been installed on Windows systems but not on linux systems (usually because linux drivers have not been written for the newest graphics cards). Consequently, OpenGL programs run much faster (often 5 or 10 times faster) under Windows than under linux.

1.1 Getting Started

These notes follow roughly the same sequence as Getting Started with OpenGL by Peter Grogono, obtainable from the university Copy Centre. As mentioned above, we assume the use of GLUT. The programming language is C or C++ (OpenGL functions are written in C but can be called from a C++ program).

Any program that uses GLUT must start with the directive

#include <GL/glut.h>

This assumes that header files are stored in .../include/GL, which is where they are supposed to be. In Concordia labs, the header files may be in .../include, in which case you should use the directive

#include <glut.h>


In most GLUT programs, the main function consists largely of GLUT "boiler plate" code and will look something like this:

int main (int argc, char *argv[])
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_SINGLE | GLUT_RGBA);
    glutInitWindowSize(800, 600);
    glutInitWindowPosition(100, 50);
    glutCreateWindow("window title");
    glutDisplayFunc(display);
    glutMainLoop();
}

All of these functions come from GLUT, as indicated by the prefix glut-. Their effects are as follows:

• glutInit initializes GLUT state. It is conventional to pass the command line arguments to this function. This is useful if you are using X but rather pointless for Windows. An X user can pass parameters for window size, etc., to a GLUT program.

• glutInitDisplayMode initializes the display mode by setting various bits. In this example, the bit GLUT_SINGLE requests a single buffer (the alternative is double buffers, which are needed for animation) and the bit GLUT_RGBA requests colours (red, green, blue, and "alpha", which we will discuss later). Note the use of | (not ||) to OR the bits.

• glutInitWindowSize sets the width and height of the graphics window in pixels. If this call is omitted, GLUT will use the system default values.

• glutInitWindowPosition sets the position of the window. The arguments are the position of the left edge and the top edge, in screen coordinates. Note that, in screen coordinates, 0 is the top of the screen, not the bottom. If this call is omitted, GLUT will use the system default values.

• glutCreateWindow creates the window but does not display it. The window has the size and position specified by the previous calls and a title given as an argument to this function.

• glutDisplayFunc registers a callback function to update the display. Callbacks are explained below.

• glutMainLoop enters a loop in which GLUT handles events generated by the user and the system and responds to them. Events include: key strokes, mouse movements, mouse clicks, and window reshaping operations.

As soon as the "main loop" starts, GLUT will respond to events and call the functions that you have registered appropriately. The only callback registered above is display. In order for this program to compile, you must have written a function of the form shown in Figure 1, which GLUT will call whenever it needs to update the display.

The function display calls OpenGL functions that we will look at in more detail later. For now we note that OpenGL functions have the prefix gl- and:

• glClear clears various bits. This call sets all pixels in the frame buffer to the default value.

• glColor3f sets the current colour. The parameters are the values for red, green, and blue. This call asks for bright red.


void display()
{
    glClear(GL_COLOR_BUFFER_BIT);
    glColor3f(1.0, 0.0, 0.0);
    glBegin(GL_LINES);
    glVertex2f(-1.0, 0.0);
    glVertex2f(1.0, 0.0);
    glEnd();
    glFlush();
}

Figure 1: A simple display function

• glBegin starts a block in which OpenGL expects calls to functions that construct primitives. In this case, the mode GL_LINES specifies line drawing, and we provide two vertexes for each line.

• glVertex2f specifies the position of a vertex in 2D.

• glFlush forces the window to be refreshed. This call typically has no effect if you are running OpenGL on a PC. It is needed when the program is running on a server and the client screen must be refreshed.

1.2 Callbacks

For each callback function, you need to know: how to register the callback, how to declare it, and what it does. The following sections provide this information. The callback functions can have any name; the names used here (e.g., display) are typical.

1.2.1 Display

Registration

glutDisplayFunc(display);

Declaration

void display();

Use The display function is called by GLUT whenever it thinks the graphics window needs updating. Since no arguments are passed, the display function often uses global variables or calls other functions to obtain its data.

1.2.2 Reshaping Events

Registration

glutReshapeFunc(reshape);

Declaration

void reshape(int width, int height);

3

Page 11: COMP 6761 Advanced Computer Graphics - Concordia University

1 INTRODUCTION 1.2 Callbacks

Use The reshape function is called whenever the user reshapes the graphics window. The arguments give the width and height of the reshaped window in pixels.

1.2.3 Keyboard Events

GLUT provides two callback functions for keyboard events: one for "ordinary" keys (technically: ASCII graphic characters); and one for "special" keys, such as function (F) keys and arrow keys.

Registration

glutKeyboardFunc(keyboard);

Declaration

void keyboard(unsigned char key, int x, int y);

Use The keyboard function is called when the user presses a "graphic" key. These are the keys for characters that are visible on the screen: letters, digits, symbols, and space. The esc character is also recognized (with code 27).

The values of x and y give the position of the mouse cursor at the time when the key was pressed.
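For example, a keyboard callback often inspects the key code directly. The sketch below is ours, not from the notes; exit() is declared in <stdlib.h>.

    // A minimal sketch: quit when the user presses esc (code 27).
    void keyboard (unsigned char key, int x, int y)
    {
        if (key == 27)   // the esc character
            exit(0);
    }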

Registration

glutSpecialFunc(special);

Declaration

void special(int key, int x, int y);

Use The special function is similar to the keyboard function but is called when the user presses a non-graphic character key. The key is identified by comparing it to a GLUT constant. The constants are:

GLUT_KEY_F1    GLUT_KEY_F8     GLUT_KEY_LEFT
GLUT_KEY_F2    GLUT_KEY_F9     GLUT_KEY_RIGHT
GLUT_KEY_F3    GLUT_KEY_F10    GLUT_KEY_UP
GLUT_KEY_F4    GLUT_KEY_F11    GLUT_KEY_DOWN
GLUT_KEY_F5    GLUT_KEY_F12    GLUT_KEY_PAGE_UP
GLUT_KEY_F6    GLUT_KEY_HOME   GLUT_KEY_PAGE_DOWN
GLUT_KEY_F7    GLUT_KEY_END    GLUT_KEY_INSERT
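As an illustration (a sketch of ours, not from the notes), a special callback typically compares key with these constants in a switch statement:

    // A minimal sketch: respond to the arrow keys.
    void special (int key, int x, int y)
    {
        switch (key)
        {
        case GLUT_KEY_LEFT:
            // e.g., decrease a rotation angle
            break;
        case GLUT_KEY_RIGHT:
            // e.g., increase a rotation angle
            break;
        }
        glutPostRedisplay();   // ask GLUT to redraw with the new state
    }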

1.2.4 Mouse Events

Registration

glutMouseFunc(mouse);

Declaration

void mouse(int button, int state, int x, int y);

Use This function is called when the user presses or releases a mouse button. The value of button is one of GLUT_LEFT_BUTTON, GLUT_MIDDLE_BUTTON, or GLUT_RIGHT_BUTTON. The value of state is GLUT_UP or GLUT_DOWN. The values of x and y give the position of the mouse cursor at the time of the press or release.

4

Page 12: COMP 6761 Advanced Computer Graphics - Concordia University

1 INTRODUCTION 1.2 Callbacks

The values x and y are measured in pixels and are relative to the graphics window. The top left corner of the window gives x = 0 and y = 0; the bottom right corner gives x = width and y = height, where width and height are the values given during initialization or by reshaping. Note that y values increase downwards.
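A small example (ours; the global variables are an assumption, not part of GLUT) shows how these coordinates are typically used:

    // A minimal sketch: record where the left button was pressed.
    int lastX = 0, lastY = 0;

    void mouse (int button, int state, int x, int y)
    {
        if (button == GLUT_LEFT_BUTTON && state == GLUT_DOWN)
        {
            lastX = x;   // pixels from the left edge of the window
            lastY = y;   // pixels from the top edge: y increases downwards
        }
    }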

Registration

glutMotionFunc(motion);
glutPassiveMotionFunc(passiveMotion);

Declaration

void motion(int x, int y);
void passiveMotion(int x, int y);

Use These functions are called when the mouse is moved within the graphics window. If any mouse button is pressed, motion is called. If no buttons are pressed, passiveMotion is called.

The values of x and y are the same as for the mouse callback. However, if you press a button down while the mouse is in the graphics window, and then drag the mouse out of the window, the values of x and y may go outside their respective ranges — that is, x may become negative or greater than width, and y may become negative or greater than height.

1.2.5 Idle

Registration

glutIdleFunc(idle);

Declaration

void idle();

Use The idle function is called whenever OpenGL has nothing else to do. It is a very important callback, because it enables animated graphics. A typical idle callback function looks like this:

void idle()
{
    // Update model values.
    ....
    glutPostRedisplay();
}

The effect of glutPostRedisplay() is to inform GLUT that the graphics window needs refreshing — that is, that GLUT should invoke the display callback function.

You could call the display function from within the idle function. Although this usually works, it is not recommended. The reason is that GLUT handles many events and can postpone refreshing the display until there are no outstanding events. For example, if the user is dragging the mouse or reshaping the window during an animation, the graphics window should not be redisplayed until the operation is completed.

This section includes all of the GLUT functions that you need to get started. Later, we will look at GLUT functions for more advanced applications, such as menu management.


Suffix   Data type                 C Type           OpenGL Type
b        8-bit integer             signed char      GLbyte
s        16-bit integer            short            GLshort
i        32-bit integer            int or long      GLint, GLsizei
f        32-bit floating point     float            GLfloat, GLclampf
d        64-bit floating point     double           GLdouble, GLclampd
ub       8-bit unsigned integer    unsigned char    GLubyte, GLboolean
us       16-bit unsigned integer   unsigned short   GLushort
ui       32-bit unsigned integer   unsigned int     GLuint, GLenum, GLbitfield
         Nothing                   void             GLvoid

Figure 2: OpenGL Types

1.3 OpenGL Naming Conventions

1.3.1 Type Names

OpenGL uses typedefs to give its own names to C types, as shown in Figure 2 (see also Table 1 of Getting Started with OpenGL on page 7). You don't have to use these types, but using them makes your programs portable. The suffixes in the left column are used in function names, as described in the next section.

1.3.2 Function Names

OpenGL provides an API for C, not C++. This means that function names cannot be overloaded and, consequently, naming conventions are required to distinguish similar functions with different parameter types. The structure of an OpenGL function name is as follows:

• The name begins with the prefix gl (primitive library functions) or glu (utility library functions).

• The prefix is followed by the function name. The first letter of the function name is in upper case.

• There may be a digit to indicate the number of parameters required. E.g., 3 indicates that 3 parameters are expected.

• There may be a letter or letter pair indicating the type of the parameters. The codes are given in the left column of Figure 2.

• There may be a v indicating that the arguments are pointers to arrays rather than the actual values.

For example, in the call

glVertex3f(x, y, z);

there must be three arguments of type GLfloat. The same effect could be achieved by calling

glVertex3fv(pc);

provided that pc is a pointer to an array of (at least) three floats.
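As a concrete illustration (the array contents here are ours):

    GLfloat pc[3] = { 1.0, 0.5, 0.0 };   // x, y, z coordinates
    glVertex3fv(pc);                     // equivalent to glVertex3f(1.0, 0.5, 0.0)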


The official OpenGL documentation sometimes lists all the allowed forms (the OpenGL Reference Manual does this) and sometimes uses an abbreviated form (the OpenGL Programming Guide does this). The form

void glVertex{234}{sifd}[v](TYPE coords);

stands for 24 different function prototypes. The functions may have 2, 3, or 4 parameters of type short, integer, float, or double, and they may be passed as individual values or as a pointer to a suitable array.

In these notes, we will normally use particular functions that are appropriate for the application. However, you should always be aware that a different form might be better for your application.

1.4 General Features of OpenGL

1.4.1 States

It is best to think of OpenGL as a Finite State Machine (with a lot of states!). The effect of many functions is to change a state variable. The state is restored or changed again by another call, not by default.

For example, if you call glColor3f(0,0,1), then everything you draw will be coloured bright blue until you call glColor again with different arguments.

The state-machine concept seems simple enough, but it can give a lot of trouble in program development. It often happens that your program is doing something unexpected because OpenGL is in the wrong state, but it is hard to find out which state variable is wrong. Problems are even worse when you work with multiple windows, because some parts of the OpenGL state apply to individual windows and other parts apply to all windows.

A partial solution is to modify state in a systematic and structured way. For example, if a feature is turned on somewhere, it is a good idea to turn it off in the same scope. You may use lighting for some parts of a scene and not others. In this case, your display function should look like this:

void display()
{
    // initialization
    ....
    // display parts of the scene that require lighting
    glEnable(GL_LIGHTING);
    ....
    // display parts of the scene that do not require lighting
    glDisable(GL_LIGHTING);
    ....
}

If the calls to glEnable and glDisable had been hidden in functions called by display, we could not tell when lighting was in effect by looking at the display function.


1.4.2 Coordinates

Graphics programs make heavy use of coordinate systems and it is easy to get confused. There are two important sets of coordinates that are fundamental to all applications.

Window Coordinates The window in which the graphics scene is displayed has width w and height h, measured in pixels. Coordinates are given as pairs (x, y). The top left corner of the window is (0, 0) and the bottom right corner is (w, h). X values increase from left to right and Y values increase from top to bottom.

Model Coordinates The model, or scene, that we are displaying is three-dimensional. (We can use OpenGL for 2D graphics but most of the applications in this course assume 3D.) By default:

• the origin is at the centre of the window, in the plane of the window

• the X axis points to the right of the window

• the Y axis points to the top of the window

• the Z axis points towards the viewer

Note that:

• The Y axis of the model is inverted with respect to the Y axis of the window. In the model, Y points upwards, in accordance with engineering and mathematical conventions.

• The model coordinates are right-handed. Since the X and Y directions are fixed by convention, the Z axis must point towards the viewer. (Hold your right hand so that your thumb, first finger, and second finger are at right-angles to one another. Point your thumb (X axis) to the right and your first finger (Y axis) upwards; then your second finger (Z axis) is pointing towards you.)

Knowing the direction of the coordinates is not enough: OpenGL displays things only if they are in the viewing volume. By default, the viewing volume contains points (x, y, z) such that −1 ≤ x ≤ 1, −1 ≤ y ≤ 1, and −1 ≤ z ≤ 1. (We will see later how to alter these values. Note that the coordinates in Figure 1 satisfy the conditions for visibility.) Objects outside the viewing volume are not visible on the screen.

OpenGL has to transform model coordinates to window coordinates. The mapping takes (x, y, z) in the model to (w(x + 1)/2, h(1 − y)/2). The origin of the model, (0, 0, 0), is mapped to the centre of the window, (w/2, h/2). Z coordinates in the model are ignored (this is not always true, as we will see later).
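The default mapping can be written out directly; the helper below is a sketch of ours, not an OpenGL function:

    // Map model coordinates (x, y) in the default viewing volume to
    // window coordinates, for a window of width w and height h.
    void modelToWindow (GLfloat x, GLfloat y, int w, int h, int *wx, int *wy)
    {
        *wx = (int) (w * (x + 1) / 2);   // x = -1 maps to 0, x = +1 maps to w
        *wy = (int) (h * (1 - y) / 2);   // y = +1 maps to 0, y = -1 maps to h
    }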

Figure 3 shows a simple reshape callback function. It assumes that the model is contained in a box bounded by |x| ≤ 5, |y| ≤ 5, and |z| ≤ 5. It is designed so that, whatever the shape of the new window, the whole of the model is visible and the model is not distorted. This implies that the shorter side of the window must be 10 units long.

1.5 Drawing Objects

A scene in a graphics program is composed of various objects. At the lowest level, there are primitive objects, such as vertexes, lines, and polygons. From primitive objects, we can build common objects, such as cubes, cones, and spheres. We can also construct special objects such as Bezier curves and surfaces.


void reshape (int width, int height)
{
    GLfloat w, h;
    if (width > height)
    {
        w = (5.0 * width) / height;
        h = 5.0;
    }
    else
    {
        w = 5.0;
        h = (5.0 * height) / width;
    }
    glViewport(0, 0, width, height);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(-w, w, -h, h, -5, 5);
    glutPostRedisplay();
}

Figure 3: A simple reshape function


All drawing (the technical term is “rendering”) is performed inside the display function.

1.5.1 Primitive Objects

The code for rendering OpenGL primitives has the form:

glBegin(mode);
....
glEnd();

where mode has one of the values shown in Figure 4. The explanations in Figure 4 are sufficient for most modes but the three modes shown in Figure 5 need some care for correct use.

Note that, for GL_QUAD_STRIP, the order in which the vertexes are given is not the same as the order that is used to determine the front face. In the program, the vertexes appear in the sequence v0, v1, v2, .... For display purposes (see Figure 5(c)), the quadrilaterals generated are v0 v2 v3 v1, v2 v4 v5 v3, and so on.
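For example, the following sketch (ours, with coordinates chosen for illustration) produces the two quadrilaterals just described:

    // A minimal sketch: two quadrilaterals from six vertexes.
    glBegin(GL_QUAD_STRIP);
    glVertex2f(0.0, 0.0);   // v0
    glVertex2f(0.0, 1.0);   // v1
    glVertex2f(0.5, 0.0);   // v2
    glVertex2f(0.5, 1.0);   // v3
    glVertex2f(1.0, 0.0);   // v4
    glVertex2f(1.0, 1.0);   // v5
    glEnd();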

There are some obvious constraints on the number of vertexes in a sequence of calls. However, OpenGL is tolerant: it simply ignores extra vertexes. For example, if you set the mode to GL_TRIANGLES and provide eight vertexes, OpenGL will draw two triangles and ignore the last two vertexes.

A polygon with more than three sides may be convex or concave. OpenGL can draw convex polygons only. In practice, surfaces are usually constructed using polygons with a small number of edges, either triangles or quadrilaterals. For triangles, the problem of convexity does not arise; if the quadrilaterals are approximately rectangular, they will be convex.

Another problem that arises with more than four vertexes is planarity. Whereas a set of three points defines a unique plane, a set of four or more points may or may not lie in a plane. Since the calculations that OpenGL performs assume planarity, you may get funny results if you provide non-planar polygons.

A polygon has a front face and a back face. The faces can have different colours, and the distinction between front and back is important for lighting. The rule for deciding which is the front face is:

    If the order in which the vertexes are displayed appears to be counter-clockwise in the viewing window, we are looking at the front of the polygon.

There are two functions that affect the way in which polygons are displayed.

glPolygonMode(face, mode);

accepts the arguments shown in the following table.

face                          mode
GL_FRONT_AND_BACK (default)   GL_FILL (default)
GL_FRONT                      GL_LINE
GL_BACK                       GL_POINT

GL_FILL means that the entire polygon will be shaded with the current colour; GL_LINE means that its outline will be drawn; and GL_POINT means that only the vertexes will be drawn.

If the mode is GL_FILL, then the function

glShadeModel(shading);

Mode Value          Effect
GL_POINTS           Draw a point at each of the n vertices.
GL_LINES            Draw the unconnected line segments v0v1, v2v3, ..., v(n-2)v(n-1).
GL_LINE_STRIP       Draw the connected line segments v0v1, v1v2, ..., v(n-2)v(n-1).
GL_LINE_LOOP        Draw a closed loop of lines v0v1, v1v2, ..., v(n-2)v(n-1), v(n-1)v0.
GL_TRIANGLES        Draw the triangle v0v1v2, then the triangle v3v4v5, and so on.
GL_TRIANGLE_STRIP   Draw the triangle v0v1v2, then use v3 to draw a second triangle, and so on (see Figure 5(a)).
GL_TRIANGLE_FAN     Draw the triangle v0v1v2, then use v3 to draw a second triangle, and so on (see Figure 5(b)).
GL_QUADS            Draw the quadrilateral v0v1v2v3, then the quadrilateral v4v5v6v7, and so on.
GL_QUAD_STRIP       Draw the quadrilateral v0v1v3v2, then the quadrilateral v2v3v5v4, and so on (see Figure 5(c)).
GL_POLYGON          Draw a single polygon using v0, v1, ..., v(n-1) as vertices (n ≥ 3).

Figure 4: Primitive specifiers


[Diagram omitted: vertex layouts for (a) GL_TRIANGLE_STRIP, (b) GL_TRIANGLE_FAN, and (c) GL_QUAD_STRIP.]

Figure 5: Drawing primitives

glBegin(GL_TRIANGLES);
glColor3f(1, 0, 0);
glVertex3f(0, 0.732, 0);
glColor3f(0, 1, 0);
glVertex3f(-0.5, 0, 0);
glColor3f(0, 0, 1);
glVertex3f(0.5, 0, 0);
glEnd();

Figure 6: A coloured triangle

determines how the face will be coloured. By default, the shading mode is GL_SMOOTH. OpenGL colours each vertex as specified by the user, and then interpolates colours in between. Figure 6 shows code that will draw a triangle which has red, green, and blue vertexes, and intermediate colours in between. If the shading mode is GL_FLAT, the entire triangle will be given the colour that is in effect when its first vertex is drawn.

1.5.2 GLUT Objects

The GLUT library provides several objects that are rather tedious to build "by hand". These objects are fully defined, with normals, and they are useful for experimenting with lighting. A "wire" object displays as a wire frame; this is not very pretty but may be useful during debugging. A "solid" object looks like a solid, but you should define its material properties yourself, as we will see later. For a sphere, "slices" are lines of longitude and "stacks" are lines of latitude. Cones are analogous. For a torus, "sides" run around the torus and "rings" run around the "tube". The prototypes for the GLUT objects are:

void glutWireSphere (double radius, int slices, int stacks);
void glutSolidSphere (double radius, int slices, int stacks);
void glutWireCube (double size);
void glutSolidCube (double size);
void glutWireTorus (double inner, double outer, int sides, int rings);
void glutSolidTorus (double inner, double outer, int sides, int rings);
void glutWireIcosahedron ();
void glutSolidIcosahedron ();


void glutWireOctahedron ();
void glutSolidOctahedron ();
void glutWireTetrahedron ();
void glutSolidTetrahedron ();
void glutWireDodecahedron ();
void glutSolidDodecahedron ();
void glutWireCone (double radius, double height, int slices, int stacks);
void glutSolidCone (double radius, double height, int slices, int stacks);
void glutWireTeapot (double size);
void glutSolidTeapot (double size);
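For example (a usage sketch of ours; the particular dimensions are an assumption), a display function can draw one of these objects with a single call:

    // A wire torus with tube radius 0.3 and ring radius 1.0,
    // drawn with 20 sides and 30 rings.
    glutWireTorus(0.3, 1.0, 20, 30);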

1.5.3 Quadric Objects

Quadrics are surfaces that can be generated by an equation of the second degree: that is, by choosing values of a_i in

    a_1 x^2 + a_2 y^2 + a_3 z^2 + a_4 y z + a_5 z x + a_6 x y + a_7 x + a_8 y + a_9 z + a_{10} = 0.

Quadrics provided by OpenGL include spheres, cylinders, and disks. Since the top and bottom diameters of a cylinder can be set independently, cones can be drawn as well. Drawing a quadric requires several steps. These steps usually occur during initialization:

• Declare a pointer to a quadric descriptor:

    GLUquadricObj *pq;

• Allocate a default descriptor object:

    pq = gluNewQuadric();

• Set the drawing style for the quadric:

    gluQuadricDrawStyle(pq, GLU_FILL);

  Possible values for the second argument are: GLU_POINT, GLU_LINE, GLU_SILHOUETTE, and GLU_FILL.

Within the display function:

• Draw the quadric using one of:

    gluSphere(pq, radius, slices, stacks);
    gluCylinder(pq, baseRadius, topRadius, height, slices, stacks);
    gluDisk(pq, innerRadius, outerRadius, slices, rings);
    gluPartialDisk(pq, innerRadius, outerRadius, slices, rings, startAngle, sweepAngle);

  The first argument, pq, is the pointer returned by gluNewQuadric. The dimensions have type double. The arguments slices, stacks, and rings indicate the number of segments used to draw the figure, and are integers. Larger values mean slower displays: values from 15 to 25 are adequate for most purposes. The angles for a partial disk must be given in degrees, not radians.

A sphere has its centre at the origin. The base of the cylinder is at the origin, and the cylinder points in the +Z direction. A disk has its centre at the origin and lies in the z = 0 plane.

When you have finished using quadrics:


• Delete the quadric descriptor:

gluDeleteQuadric(pq);

The quadric descriptor contains the information about how to draw the quadric, as set by gluQuadricDrawStyle, etc. Once you have created a descriptor, you can draw as many quadrics as you like with it.
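Putting the steps together, a program might look like the sketch below (ours; the function names init and display and the particular dimensions are assumptions, and error handling is omitted):

    // One descriptor used for two quadrics.
    GLUquadricObj *pq;

    void init ()
    {
        pq = gluNewQuadric();               // allocate a default descriptor
        gluQuadricDrawStyle(pq, GLU_LINE);  // draw outlines only
    }

    void display ()
    {
        glClear(GL_COLOR_BUFFER_BIT);
        gluSphere(pq, 0.5, 20, 20);               // radius, slices, stacks
        gluCylinder(pq, 0.5, 0.0, 1.0, 20, 20);   // top radius 0 gives a cone
        glFlush();
    }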

1.6 Hidden Surface Elimination

To obtain a realistic view of a collection of primitive objects, the graphics system must display only the objects that the viewer can see. Since the components of the model are typically surfaces (triangles, polygons, etc.), the step that ensures that invisible surfaces are not rendered is called hidden surface elimination. There are various ways of eliminating hidden surfaces; OpenGL uses a depth buffer.

The depth buffer is a two-dimensional array of numbers; each component of the array corresponds to a pixel in the viewing window. In general, several points in the model will map to a single pixel. The depth buffer is used to ensure that only the point closest to the viewer is actually displayed.

To enable hidden surface elimination, modify your graphics program as follows:

• When you initialize the display mode, include the depth buffer bit:

    glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH);

• During initialization and after creating the graphics window, execute the following statement to enable the depth-buffer test:

    glEnable(GL_DEPTH_TEST);

• In the display() function, modify the call to glClear() so that it clears the depth buffer as well as the colour buffer:

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

1.7 Animation

If your program has an idle callback function that changes the values of some global variables, OpenGL will display your model repeatedly, giving the effect of animation. The display will probably flicker, however, because images will be alternately drawn and erased. To avoid flicker, modify your program to use double buffering. In this mode, OpenGL renders the image into one buffer while displaying the contents of the other buffer.

• When you initialize the display mode, include the double buffer bit:

    glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);

• At the end of the display() function, include the call

    glutSwapBuffers();

If you want to eliminate hidden surfaces and animate, you will have to use this call during initialization:

    glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH | GLUT_DOUBLE);
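Putting these pieces together, an animated program might be organized as in the sketch below (ours; the global angle is an assumption, not part of GLUT):

    // Double-buffered animation of a rotating cube.
    GLfloat angle = 0;

    void display ()
    {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        glRotatef(angle, 0, 1, 0);   // rotate about the Y axis
        glutWireCube(1);
        glutSwapBuffers();           // display the buffer just drawn
    }

    void idle ()
    {
        angle += 1;                  // update model values
        glutPostRedisplay();
    }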


2 Transformations and Projections

Graphics programming makes heavy use of coordinate transformations. Suppose we want to display a scene with two houses. It doesn't make sense to specify the coordinates of all of the vertices of one house and then repeat the process all over again for the other house. Instead, we would define a house, translate all of its coordinates to one part of the scene, and then translate the same set of points to another part of the scene. We might also rotate or scale the house. In fact, we might translate, rotate, or scale the entire scene. All of these operations require coordinate transformations.

Coordinate transformations change the coordinates of a point from (x, y, z) to (x', y', z'). The common kinds of transformation are:

• Translation:

    x' = x + a
    y' = y + b
    z' = z + c

• Scaling:

    x' = r x
    y' = s y
    z' = t z

• Rotation:

    x' = x cos θ − y sin θ
    y' = x sin θ + y cos θ
    z' = z

  (This is a rotation about the Z axis. Equations for rotations about the other axes are similar.)

Scaling and rotation can be represented as matrix transformations. For example, the rotation above can be written

\[
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

We cannot represent translation as a matrix transformation in this way. However, if we use 4 × 4 matrices, we can represent all three transformations because

\[
\begin{bmatrix} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} =
\begin{bmatrix} x + a \\ y + b \\ z + c \\ 1 \end{bmatrix}
\]

We can view the use of 4 × 4 matrices simply as a trick for making translation work or as a move to four-dimensional affine space. The graphics programming is the same in either case.


2.1 Matrices in OpenGL

OpenGL maintains several matrices, of which the most important are the projection matrix and the model view matrix. Since the transformation from model coordinates to window coordinates is achieved by multiplying these two matrices together, one matrix would actually suffice. Splitting the transformation into two parts is convenient because

• the projection determines our view of the model and

• the model view matrix determines our position and orientation within the model.

Thus we can think of the projection matrix as applying to the entire model and the model view matrix as applying to parts of the model.

To manipulate matrices:

˘ Call glMatrixMode(mode) to choose a matrix. The value of mode is either GL_PROJECTIONor GL_MODELVIEW.

˘ Call glLoadIdentity() to set this matrix to the identity matrix.

˘ Call a projection function to set the value of the projection matrix.

˘ Call transformation functions to change the value of the model view matrix. The most fre-quently used transformation functions are glTranslatef(x, y, z), glRotatef(� , x, y, z),and glScalef(r, s, t).

We consider projection matrices first and then model view matrices.

2.2 Projection Matrices

2.2.1 Orthogonal Projections

The simplest projection matrix gives an orthogonal projection. This is done very simply by ignoring Z values and scaling X and Y values to fit the current viewport. The call

glOrtho(left, right, bottom, top, near, far)

defines an orthogonal transformation for which the point at (x, y, z) in the model will be visible if left ≤ x ≤ right, bottom ≤ y ≤ top, and near ≤ z ≤ far. These positions define the viewing volume: a point is visible only if it is within the viewing volume. Note that values of Z are constrained even though they do not affect the position of the projected point in the window. The matrix constructed by this call is

\[
\text{ortho} =
\begin{bmatrix}
\frac{2}{right - left} & 0 & 0 & -\frac{right + left}{right - left} \\
0 & \frac{2}{top - bottom} & 0 & -\frac{top + bottom}{top - bottom} \\
0 & 0 & -\frac{2}{far - near} & -\frac{far + near}{far - near} \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

Note that if any pair of arguments are equal, the matrix will have infinite elements and your program will probably crash.

The default orthogonal projection is equivalent to

glOrtho(-1, 1, -1, 1, -1, 1);


[Diagram omitted: the viewer at the origin O looking along the −Z axis, with the near and far clipping planes, the window of height h and width w, and the viewing angle between the top and bottom of the scene marked.]

Figure 7: Perspective projection using gluPerspective()

In other words, the default viewing volume is defined by |x| ≤ 1, |y| ≤ 1, and |z| ≤ 1.

A blank window is a common problem during OpenGL program development. In many cases, the window is blank because your model is not inside the viewing volume.

2.2.2 Perspective Projections

A perspective transformation makes distant objects appear smaller. One way to visualize a perspective transformation is to imagine yourself looking out of a window. If you copy the scene outside onto the glass, without moving your head, the image on the glass will be a perspective transformation of the scene.

The simplest way to obtain a perspective transformation in OpenGL is to call

gluPerspective(angle, aspectRatio, near, far);

The effect of this call is shown in Figure 7. The angle between the top and bottom of the scene, as seen by the viewer, is angle. The value of aspectRatio is the width of the window divided by its height. The values of near and far determine the closest and furthest points in the viewing volume.

Here is a typical reshape function using gluPerspective:

void reshape(int width, int height)
{
    glViewport(0, 0, width, height);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 40);
    glutPostRedisplay();
}


The new window has size width × height. The first line of the function makes the viewport fit the window. The second and third lines set the projection matrix to the identity matrix. The fourth line turns it into a perspective projection with a vertical angle of 30° and the same aspect ratio as the new viewport. The viewing volume is bounded in the Z direction by −40 ≤ z ≤ −5.

The arguments near and far must satisfy far > near > 0. The model coordinates are negative because the Z axis is pointing towards the viewer.

A perspective projection will appear correct only if the window subtends the given angle at the viewer's eye. If the value of angle is θ, the height of the window is h, and the distance from the viewer to the window is d, then

    θ = 2 tan⁻¹(h / 2d)    or, equivalently,    2 tan(θ/2) = h / d

For example, the value of angle in the reshape function above is 30°. If the scene is viewed in a window 6 inches high, the viewer should place his or her eyes about 11 inches from the screen (d = 6 / (2 tan 15°) ≈ 11.2 inches).

Changing the value of angle gives the same effect as zooming a camera lens: the viewpoint remains the same, but the angle of view changes.

The function gluPerspective is not an OpenGL primitive (its name begins with glu, so it is an OpenGL utility function). The OpenGL primitive function for perspective projection is glFrustum, which we provide with the boundaries of the viewing volume:

glFrustum(left, right, bottom, top, near, far);

The arguments look (and, in fact, are) the same as those for glOrtho. The difference is that near determines the distance of the viewer from the near plane and the shape of the viewing volume is a "frustum" (truncated rectangular pyramid) rather than a cuboid (brick-shaped solid). The matrix constructed by the call is shown below — compare this to the matrix generated by glOrtho in Section 2.2.1.

\[
\text{frustum} =
\begin{bmatrix}
\frac{2\,near}{right - left} & 0 & -\frac{right + left}{right - left} & 0 \\
0 & \frac{2\,near}{top - bottom} & -\frac{top + bottom}{top - bottom} & 0 \\
0 & 0 & -\frac{far + near}{far - near} & -\frac{2\,far\,near}{far - near} \\
0 & 0 & -1 & 0
\end{bmatrix}
\]

Let v = [X, Y, Z, 1]ᵀ. When we multiply this vector by ortho, the transformed X and Y coordinates are independent of Z: with an orthogonal transformation, distance from the viewer does not affect size. (We have abbreviated right to r, left to l, etc.)

\[
\text{ortho} \cdot v =
\begin{bmatrix}
\frac{2X - r - l}{r - l} \\
\frac{2Y - t - b}{t - b} \\
-\frac{2Z + f + n}{f - n} \\
1
\end{bmatrix}
\]

However, if we multiply frustum by v, we obtain the transformed coordinates below after normalization (that is, scaling so that the fourth component is 1). Note that the X and Y coordinates depend on the value of Z, moving towards the origin as Z gets larger.

\[
\text{frustum} \cdot v =
\begin{bmatrix}
\frac{-2nX + Z(r + l)}{Z(r - l)} \\
\frac{-2nY + Z(t + b)}{Z(t - b)} \\
\frac{2nf + Z(f + n)}{Z(f - n)} \\
1
\end{bmatrix}
\]

It might seem easiest to put the near plane very close and the far plane very far away because this reduces the chance that the model will be outside the viewing volume. The drawback is that precision may be lost in depth comparison calculations. The number of bits lost is about log₂(far/near). For example, if you set near = 1 and far = 1000, you will lose about 10 bits of precision.
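The relationship between gluPerspective and glFrustum can be made explicit: the field-of-view angle and the aspect ratio determine the bounds of the near plane. The helper below is a sketch of ours, not a GLU function; tan() is declared in <math.h> and the angle is in degrees.

    // How a gluPerspective-style call can be expressed with glFrustum.
    void perspective (GLdouble angle, GLdouble aspect, GLdouble near, GLdouble far)
    {
        GLdouble top = near * tan(angle * 3.14159265 / 360.0);  // angle/2, in radians
        GLdouble right = top * aspect;
        glFrustum(-right, right, -top, top, near, far);
    }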

Figure 8 shows a very simple OpenGL program that uses a perspective transformation. The display function includes a translation that moves the model 10 units in the negative z direction, placing it comfortably in the viewing volume, which extends from z = −5 to z = −20. The output statement in function reshape reveals when and how often the function is called; when I ran this program under Windows, reshape was called once, with width = height = 300.

2.3 Model View Transformations

Getting the projection right is the easy part. Manipulating the model view matrix is harder because there is more work to do.

As we have seen, there are transformations for translating, rotating, and scaling. The order in which these transformations are applied is important.

The graphics software simulates a camera, transforming a three-dimensional object, viewed from a certain angle, into a rectangular, two-dimensional image. There are two ways of thinking about a viewing transformation, and it is helpful to be able to think using both.

• A viewing transformation has the effect of moving the model with respect to the camera.

• A viewing transformation has the effect of moving the camera with respect to the model.


void display()
{
    glClearColor(0, 0.1, 0.4, 0);
    glClear(GL_COLOR_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(0, 0, -10);
    glutWireCube(1);
    glFlush();
}

void reshape(int width, int height)
{
    cout << "Reshape " << width << " " << height << endl;
    glViewport(0, 0, width, height);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
    glutPostRedisplay();
}

void main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutCreateWindow("Perspective Transformation");
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutMainLoop();
}

Figure 8: An OpenGL program with a perspective transformation

Naturally, the two approaches are inverses. We can think of the transformation

glTranslatef(0, 0, -10);

used in the program of Figure 8 as either moving the model 10 units in the −Z direction or moving the camera 10 units in the +Z direction.

Initially, the camera and the model are both situated at the origin, (0, 0, 0), with the camera looking in the −Z direction. If we want to see the model, we have either to move it away from the camera, or to move the camera away from the model.

For most purposes, it is easiest to visualize transformations like this: the camera remains fixed, and the transformation moves the origin to a new location. All drawing takes place at the current origin. For example, when we call glutWireCube, the cube is drawn with its centre at the current origin. Viewed in this way, the translation above moves the origin to (0, 0, −10), and drawing continues there.

Physicists use the term frame of reference, or frame for short, for a coordinate system with an origin and axes. With this terminology, the effect of a transformation is to move the frame, leaving the camera where it is, and to draw objects with respect to the new frame.

void display()
{
    glClearColor(0, 0.1, 0.4, 0);
    glClear(GL_COLOR_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(0, 0, -10);
    glRotatef(15, 0, 1, 0);
    glutWireCube(1);
    glFlush();
}

Figure 9: Translation followed by rotation

void display()
{
    glClearColor(0, 0.1, 0.4, 0);
    glClear(GL_COLOR_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glRotatef(15, 0, 1, 0);
    glTranslatef(0, 0, -10);
    glutWireCube(1);
    glFlush();
}

Figure 10: Rotation followed by translation

Although particular kinds of transformations commute, transformations in general do not.

Consider the two versions of the display function shown in Figures 9 and 10.

• In Figure 9, we first translate the frame of reference 10 units in the −Z direction and then rotate it 15° about the Y axis. Finally, we draw the cube. The effect is that the cube appears in the middle of the window, rotated 15° about the vertical axis.

• In Figure 10, we first rotate the frame 15° about the Y axis and then translate it 10 units in the −Z direction. Since the axes have been rotated, the direction of the Z axis has changed, and the cube moves to the side of the window.


3 Building Models and Scenes

3.1 A Digression on Global Variables

A common criticism of OpenGL is that programs depend too much on global variables. The criticism is valid in the sense that most small OpenGL programs, especially example programs, do make heavy use of global variables. To some extent, this is inevitable, because the callback functions have fixed parameter lists and do not return results: the only way they can communicate is with global variables.

For example, suppose we want the mouse callback function to affect the display. Since the mouse callback function receives the mouse coordinates and returns void, and the display function receives nothing and returns void, these functions can communicate only by means of global variables.

It is impossible to avoid having a few global variables. However, the number of global variables can be made quite small by following standard encapsulation practices. For example, the current GLUT context (current window, its width and height, position of the mouse, etc.) can be put inside an object. Then there needs to be only one global variable, probably a pointer to this object, and the callback functions reference this object.

Figure 11 shows a small program written using global variables. It is typical of small OpenGL example programs. Figures 12 and 13 show an equivalent program written with fewer global variables. In fact, the only global variable in the second version is ps, which is a pointer to an object that contains all the state necessary for this application. The display, reshape, and mouse functions communicate by "sending messages" to this unique object. This technique extends nicely to larger programs. For example, multiple windows can be handled by associating one object with each window.


int mwWidth;
int mwHeight;

GLfloat xPos;
GLfloat yPos;

void display()
{
    glClearColor(0, 0.1, 0.4, 0);
    glClear(GL_COLOR_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(0, 0, -10);
    glRotatef(360 * xPos, 0, 1, 0);
    glutWireCube(1);
    glFlush();
}

void reshape(int width, int height)
{
    mwWidth = width;
    mwHeight = height;
    glViewport(0, 0, width, height);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
    glutPostRedisplay();
}

void mouse(int x, int y)
{
    xPos = (GLfloat)x / (GLfloat)mwWidth;
    yPos = (GLfloat)y / (GLfloat)mwHeight;
    glutPostRedisplay();
}

void main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutCreateWindow("Perspective Transformation");
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutMotionFunc(mouse);
    glutMainLoop();
}

Figure 11: Programming with global variables


class State
{
public:
    State(int w, int h) : mwWidth(w), mwHeight(h) {}
    int mwWidth;
    int mwHeight;
    GLfloat xPos;
    GLfloat yPos;
};

State *ps;

void display()
{
    glClearColor(0, 0.1, 0.4, 0);
    glClear(GL_COLOR_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(0, 0, -10);
    glRotatef(360 * ps->xPos, 0, 1, 0);
    glutWireCube(1);
    glFlush();
}

void reshape(int width, int height)
{
    ps->mwWidth = width;
    ps->mwHeight = height;
    glViewport(0, 0, width, height);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
    glutPostRedisplay();
}

void mouse(int x, int y)
{
    ps->xPos = (GLfloat)x / (GLfloat)ps->mwWidth;
    ps->yPos = (GLfloat)y / (GLfloat)ps->mwHeight;
    glutPostRedisplay();
}

Figure 12: Programming with fewer global variables


void main(int argc, char **argv)
{
    const int WIDTH = 600;
    const int HEIGHT = 400;
    glutInit(&argc, argv);
    glutInitWindowSize(WIDTH, HEIGHT);  // set the size before creating the window
    glutCreateWindow("Perspective Transformation");
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutMotionFunc(mouse);
    ps = new State(WIDTH, HEIGHT);
    glutMainLoop();
}

Figure 13: Programming with fewer global variables, continued

glTranslatef(0, 0, LEN / 2);
glScalef(DIAM, DIAM, LEN);
glutWireCube(1);
glScalef(1/DIAM, 1/DIAM, 1/LEN);
glTranslatef(0, 0, LEN / 2 + RAD);
glutWireSphere(RAD, 15, 15);
glTranslatef(0, 0, - LEN - RAD);

Figure 14: Drawing an arm — first version

3.2 Matrix Stacks

In this section, we develop a program that draws a Maltese Cross. The cross has six arms, pointing in the directions ±X, ±Y, and ±Z. Each arm has a square cross-section and ends with a sphere.

Figure 14 shows the code for one arm of the cross. The arm has diameter DIAM and length LEN; the sphere has radius RAD. The origin is at the base of the arm. The arm is obtained by changing the scale, drawing a cube, and resetting the original scale. When this code has finished, the origin is restored by a translation that reverses the effect of the earlier translations.

There are two undesirable features of the code in Figure 14. First, there is the need to "undo" the scaling transformation. (In particular, note that the user might want to obtain "flat" arms by setting one diameter to zero; in this case, the scaling cannot be reversed.) Second, the same problem applies to the frame of reference: we have to reverse the effect of translation to maintain the position of the origin.

OpenGL matrices are implemented as matrix stacks. To avoid reversing transformations, we can stack the matrices that we need using glPushMatrix and restore them when we have finished using glPopMatrix. Figure 15 is the revised code for drawing an arm. This code leaves the frame of reference unchanged. The indentation is not required, of course, but it helps the reader to understand the effect of the transformations.


glPushMatrix();
    glTranslatef(0, 0, LEN / 2);
    glPushMatrix();
        glScalef(DIAM, DIAM, LEN);
        glutWireCube(1);
    glPopMatrix();
    glTranslatef(0, 0, LEN / 2 + RAD);
    glutWireSphere(RAD, 15, 15);
glPopMatrix();

Figure 15: Drawing an arm — improved version

void display()
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    arm();
    glRotatef(90, 1, 0, 0);
    arm();
    glRotatef(180, 1, 0, 0);
    arm();
    glRotatef(270, 1, 0, 0);
    arm();
    glRotatef(90, 0, 1, 0);
    arm();
    glRotatef(270, 0, 1, 0);
    arm();
    glFlush();
}

Figure 16: Drawing a Maltese Cross

We can put the code of Figure 15 into a function called arm, sketched below. This function has an important property that should be respected by all drawing functions: it leaves the reference frame unchanged. Drawing scenes with functions that do not have this property can be very confusing! The display function calls arm six times to draw the Maltese Cross: see Figure 16.
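Here is a sketch of arm; the values of DIAM, LEN, and RAD are assumptions:

    const GLfloat DIAM = 0.4f;  // diameter of the arm (assumed value)
    const GLfloat LEN = 2.0f;   // length of the arm (assumed value)
    const GLfloat RAD = 0.3f;   // radius of the sphere (assumed value)

    // Draw one arm along +Z, leaving the frame of reference unchanged.
    void arm()
    {
        glPushMatrix();
        glTranslatef(0, 0, LEN / 2);
        glPushMatrix();
        glScalef(DIAM, DIAM, LEN);
        glutWireCube(1);
        glPopMatrix();
        glTranslatef(0, 0, LEN / 2 + RAD);
        glutWireSphere(RAD, 15, 15);
        glPopMatrix();
    }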

3.2.1 Pops and Pushes Don’t Cancel!

When pushing and popping matrices, it is important to realize that the sequence

glPopMatrix();
glPushMatrix();

does have an effect: the two calls do not cancel each other out. To see this, look at Figure 17. The left column contains line numbers for identification, the second column contains code, the third column shows the matrix at the top of the stack after each function has executed, and the other columns show the matrices lower in the stack. The matrices are shown as I (the identity), T (the translation), and R (the rotation). At line 4, there are three matrices on the stack, with T occupying the top two places. Line 5 post-multiplies the top matrix by R. Line 6 pops the product T·R off the stack, restoring the stack to its value at line 3. Line 7 pushes the stack and copies the top matrix. Note the difference between the stack entries at line 5 and line 7.

#  Code                             Stack
1  glLoadIdentity();                I
2  glPushMatrix();                  I   I
3  glTranslatef(1.0, 0.0, 0.0);     T   I
4  glPushMatrix();                  T   T   I
5  glRotatef(10.0, 0.0, 1.0, 0.0);  T·R T   I
6  glPopMatrix();                   T   I
7  glPushMatrix();                  T   T   I

Figure 17: Pushing and popping

3.2.2 Animation with Stacks

The ability to stack matrices is particularly important for animation. Suppose that we want to draw a sequence of images showing a robot waving its arms. There will be at least two angles that change with time: let us say that θ is the angle between the body and the upper arm, and φ is the angle between the upper arm and the forearm. Then the animation will include roughly the following steps; a code sketch follows the list.

1 Push frame 1.
  1.1 Draw the body.
  1.2 Translate to the left shoulder.
  1.3 Push frame 2.
    1.3.1 Rotate through θ.
    1.3.2 Draw the upper arm and translate along it.
    1.3.3 Rotate through φ.
    1.3.4 Draw the forearm and hand.
  1.4 Pop, restoring frame 2 (the left shoulder).
  1.5 Translate to the right shoulder.
  1.6 Push frame 3.
    1.6.1 Rotate through θ.
    1.6.2 Draw the upper arm and translate along it.
    1.6.3 Rotate through φ.
    1.6.4 Draw the forearm and hand.
  1.7 Pop, restoring frame 3 (the right shoulder).
2 Pop, restoring frame 1 (the body).
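A minimal code sketch of these steps follows; the drawing helpers and the SHOULDER offset are hypothetical, and theta and phi are assumed to be updated elsewhere, for example by an idle callback:

    // Hypothetical helpers, assumed to be defined elsewhere.
    void drawBody();
    void drawUpperArm();        // draws the upper arm and translates along it
    void drawForearmAndHand();

    const GLfloat SHOULDER = 1.0f;  // assumed body-centre-to-shoulder distance

    void drawRobot(GLfloat theta, GLfloat phi)
    {
        glPushMatrix();                    // 1: push frame 1
        drawBody();                        // 1.1
        glTranslatef(-SHOULDER, 0, 0);     // 1.2: to the left shoulder
        glPushMatrix();                    // 1.3: push frame 2
        glRotatef(theta, 0, 0, 1);         // 1.3.1
        drawUpperArm();                    // 1.3.2
        glRotatef(phi, 0, 0, 1);           // 1.3.3
        drawForearmAndHand();              // 1.3.4
        glPopMatrix();                     // 1.4: restore the shoulder frame
        glTranslatef(2 * SHOULDER, 0, 0);  // 1.5: to the right shoulder
        glPushMatrix();                    // 1.6: push frame 3
        glRotatef(theta, 0, 0, 1);         // 1.6.1
        drawUpperArm();                    // 1.6.2
        glRotatef(phi, 0, 0, 1);           // 1.6.3
        drawForearmAndHand();              // 1.6.4
        glPopMatrix();                     // 1.7: restore the shoulder frame
        glPopMatrix();                     // 2: restore the body frame
    }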


3.3 Viewing the Model

So far, we have used a convention that is quite common in OpenGL programs: the reshape callback function sets the projection matrix and the display callback function sets the model view matrix. The partial program shown in Figure 18 shows a variation on this theme: the display function sets both the projection and the model view matrices. The reshape function updates the viewport and sets the global variables width and height, which are needed for gluPerspective. The Y mouse coordinate is used to set the angle of view in gluPerspective; this has the effect that moving the mouse down (up) in the window zooms the image towards (away from) the viewer.

The function gluLookAt() defines a model view transformation that simulates viewing ("looking at") the scene from a particular viewpoint. It takes nine arguments of type GLdouble. The first three arguments define the camera (or eye) position with respect to the origin; the next three arguments are the coordinates of a point in the model towards which the camera is directed; and the last three arguments are the components of a vector pointing upwards. In the call

gluLookAt(0.0, 0.0, 10.0,
          0.0, 0.0, 0.0,
          0.0, 1.0, 0.0);

the point of interest in the model is at (0, 0, 0), the position of the camera relative to this point is (0, 0, 10), and the vector (0, 1, 0) (that is, the Y axis) is pointing upwards.

Although the idea of gluLookAt() seems simple, the function is tricky to use in practice. Sometimes, introducing a call to gluLookAt() has the undesirable effect of making the image disappear altogether! In the following code, the effect of the call to gluLookAt() is to move the origin to (0, 0, −10); but the near and far planes defined by gluPerspective() are at z = −1 and z = −5, respectively. Consequently, the cube is beyond the far plane and is invisible.

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(30, 1, 1, 5);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(0, 0, 10, 0, 0, 0, 0, 1, 0);
glutWireCube(1.0);

Figure 19 demonstrates how to use gluLookAt() and gluPerspective() together. The two important variables are alpha and dist. The idea is that the extension of the object in the Z direction is less than 2; consequently, it can be enclosed completely by planes at z = dist − 1 and z = dist + 1. To ensure that the object is visible, gluLookAt() sets the camera position to (0, 0, dist).

Changing the value of alpha in Figure 19 changes the size of the object (see Section 2.2.2). The height of the viewing window is 2 (dist − 1) tan(alpha/2); increasing alpha makes the viewing window larger and the object smaller.

Changing the value of dist also changes the size of the image in the viewport, but in a different way. The perspective changes, giving the effect of approaching (if dist gets smaller and the object gets larger) or going away (if dist gets larger and the object gets smaller).


const int WIDTH = 600;
const int HEIGHT = 400;
int width = WIDTH;
int height = HEIGHT;
GLfloat xMouse = 0.5;
GLfloat yMouse = 0.5;
GLfloat nearPlane = 10;
GLfloat farPlane = 100;
GLfloat distance = 80;

void display()
{
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(20 + 60 * yMouse, GLfloat(width) / GLfloat(height),
                   nearPlane, farPlane);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(0, 0, -distance);
    // Display scene
    glutSwapBuffers();
}

void mouseMovement(int mx, int my)
{
    xMouse = GLfloat(mx) / GLfloat(width);
    yMouse = 1 - GLfloat(my) / GLfloat(height);
    glutPostRedisplay();
}

void reshapeMainWindow(int newWidth, int newHeight)
{
    width = newWidth;
    height = newHeight;
    glViewport(0, 0, width, height);
}

Figure 18: Zooming


const int SIZE = 500;
float alpha = 60.0;
float dist = 5.0;

void display(void)
{
    glClear(GL_COLOR_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0.0, 0.0, dist,
              0.0, 0.0, 0.0,
              0.0, 1.0, 0.0);
    glutWireCube(1.0);
}

int main(int argc, char *argv[])
{
    glutInit(&argc, argv);
    glutInitWindowSize(SIZE, SIZE);
    glutInitWindowPosition(100, 50);
    glutCreateWindow("A Perspective View");
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(alpha, 1.0, dist - 1.0, dist + 1.0);
    glutDisplayFunc(display);
    glutMainLoop();
}

Figure 19: Using gluLookAt() and gluPerspective()

It is possible to change alpha and dist together in such a way that the size of a key object in the model stays the same while the perspective changes. This is a rather simple technique in OpenGL, but it is an expensive effect in movies or television because the zoom control of the lens must be coupled to the tracking motion of the camera. Hitchcock used this trick to good effect in his movies Vertigo and Marnie.
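The sketch below shows one way to program the effect, using the variables alpha and dist of Figure 19; the function name dollyZoom is an assumption, and it presumes that the projection matrix is recomputed from alpha on every redisplay (as in Figure 18) rather than once in main:

    #include <cmath>

    // Move the camera to newDist and adjust alpha so that an object at the
    // look-at point keeps the same apparent size while the perspective changes.
    void dollyZoom(float newDist)
    {
        const float DEG2RAD = 3.14159265f / 180.0f;
        float w = 2.0f * dist * tanf(0.5f * alpha * DEG2RAD);  // window width at the object
        dist = newDist;
        alpha = 2.0f * atanf(w / (2.0f * dist)) / DEG2RAD;     // new vertical angle, in degrees
        glutPostRedisplay();
    }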

The code extracts in the following example are taken from a program called viewpoints.cpp. You can obtain the complete source code for this program from

http://www.cs.concordia.ca/~faculty/grogono/viewpoints.cpp

The display function of this program consists mainly of a switch statement for which the cases are determined by keys pressed by the user. There is also an idle function callback that performs the following computation:

carDirection += 0.05f;
if (carDirection > TWOPI)
    carDirection -= TWOPI;
carXPos = TRACKMIDDLE * sin(carDirection);
carYPos = TRACKMIDDLE * cos(carDirection);

In the simplest case, the camera stands still and we watch the car going around the track.

case DISTANT:
    gluLookAt(
        250.0, 0.0, 20.0 * height,
        0.0, 0.0, 0.0,
        0.0, 0.0, 1.0);
    drawScenery();
    glTranslatef(carXPos, carYPos, carZPos);
    glRotatef(RAD2DEG * carDirection, 0.0, 0.0, -1.0);
    drawCar();
    break;

In the next mode, the camera stays in the same position but pans to follow the car. This is easily done by using the car's position as the point of interest of gluLookAt.

case INSIDE:
    gluLookAt(
        85.0, 0.0, height,
        carXPos, carYPos, 0.0,
        0.0, 0.0, 1.0);
    drawScenery();
    glTranslatef(carXPos, carYPos, carZPos);
    glRotatef(RAD2DEG * carDirection, 0.0, 0.0, -1.0);
    drawCar();
    break;

In the next mode, we see the scene from the driver's point of view. The call to gluLookAt establishes an appropriate point of view, assuming the car is at the origin and the car is drawn without any further transformation. We then apply inverse transformations to show the scenery moving with respect to the car.

case DRIVER:
    gluLookAt(
        2.0, 0.0, height,
        12.0, 0.0, 2.0,
        0.0, 0.0, 1.0);
    drawCar();
    glRotatef(RAD2DEG * carDirection, 0.0, 0.0, 1.0);
    glTranslatef(- carXPos, - carYPos, carZPos);
    drawScenery();
    break;

The next mode is the hardest to get right. We are in the car but looking at a fixed object in the scene. The first rotation counteracts the rotation of the car. Then we call gluLookAt to look from the driver's position towards the house at (40, 120). We then draw the car (which is now rotating with respect to the camera), reverse the rotation and translation transformations, and draw the scenery.

case HOUSE:
    glRotatef(RAD2DEG * carDirection, 0.0, -1.0, 0.0);
    gluLookAt(
        2.0, 0.0, height,
        40.0 - carXPos, 120.0 - carYPos, carZPos,
        0.0, 0.0, 1.0);
    drawCar();
    glRotatef(RAD2DEG * carDirection, 0.0, 0.0, 1.0);
    glTranslatef(- carXPos, - carYPos, carZPos);
    drawScenery();
    break;


4 Lighting

The techniques that we have developed so far enable us to draw various shapes, to view them in perspective, and to move them around. With these techniques alone, however, it is hard to create the illusion of reality. The missing dimension is lighting: with skillful use of lighting, we can turn a primitive computer graphic into a realistic scene.

OpenGL provides simple but effective facilities for lighting. It compromises between realism and efficiency: there are more sophisticated algorithms for lighting than those that OpenGL uses, but they require significantly more processing time. OpenGL is good enough for most purposes and fast enough to animate fairly complex scenes on today's PCs.

4.1 Lighting Basics

4.1.1 Hiding Surfaces and Enabling Lights

Since lighting does not make much sense without hidden surface removal, we will assume in this section that initialization includes these calls:

glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);  // GLUT_DEPTH requests a depth buffer
glEnable(GL_DEPTH_TEST);

and that the display function clears the depth buffer bit:

glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

OpenGL performs lighting calculations only if GL_LIGHTING is enabled. There are eight lights, named GL_LIGHTn, where n = 0, 1, 2, ..., 7. For simple applications, we need to use only the first light. During initialization, we execute

glEnable(GL_LIGHTING);
glEnable(GL_LIGHT0);

As usual, OpenGL works with states: we can turn lighting on or off at any time by calling glEnable(GL_LIGHTING) and glDisable(GL_LIGHTING). This is useful for scenes which are partly lit and partly unlit.
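For example, a scene that combines a lit terrain with an unlit, self-luminous sky might be drawn like this (a sketch; drawSky and drawTerrain are hypothetical functions):

    glDisable(GL_LIGHTING);   // the sky is drawn with plain colours
    drawSky();
    glEnable(GL_LIGHTING);    // the terrain is drawn with lighting calculations
    drawTerrain();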

4.1.2 Kinds of Light

Nature provides only one kind of light, in the form of photons. We can simulate photons, but only with large amounts of computation. Research has shown that we can obtain realistic illumination in a reasonably efficient way by dividing light into four categories. Note that we never see the light itself, but only the surfaces that are illuminated by it. Even when we see a "beam" of light, for example in a movie theatre, we are actually seeing illuminated dust particles, not light.

Ambient light is light that pervades every part of the scene. Ambient light has no direction and it illuminates every object in the same way, regardless of its position or orientation. If there is no ambient light, the scene has a harsh, "outer space" quality, in which a surface that is not directly illuminated appears completely black, and therefore invisible.

Diffuse light comes from a particular direction but is scattered in all directions by the surface it hits. The brightness of a diffusely lit surface varies with the direction of the light: a simple assumption is that the brightness is proportional to cos θ, where θ is the angle between the light rays and the normal to the surface. The colour of a surface lit by diffuse light is the colour of the surface: a red object appears red, etc.

Specular light comes from a particular direction and is reflected in a cone. A mirror is an almost perfect specular surface: a light ray is reflected as a light ray (the angle of the cone is zero). Glossy objects reflect specular light in a cone whose angle depends on the shininess of the material. If you think of a sphere illuminated by a small, bright light, it will have a circular highlight: the size of the highlight will be small for a very shiny sphere and larger for a matte sphere. Unlike diffuse light, the colour of specular light depends more on the light source than on the illuminated surface. A red sphere lit by a white light has a white highlight.

Emissive light is light that appears to be coming from the object. In computer graphics, we cannot actually construct objects that emit light; instead, we create the illusion that they are emitting light by making their colour independent of the other lighting in the scene.

4.2 Material Properties

The function that we have been using, glColor, does not provide enough information for lighting. In fact, it has no effect when lighting is enabled. Instead, we must define the colour material properties of each surface by calling glMaterial{if}[v].

Each call to glMaterial has three arguments: a face, a property, and a value for that property. Vector properties are usually passed by reference, as in this example:

GLfloat deepBlue[] = { 0.1, 0.5, 0.8, 1.0 };
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, deepBlue);

The first argument must be one of GL_FRONT, GL_BACK, or GL_FRONT_AND_BACK. It determines which face of each polygon the property will be applied to. The most common and efficient form is GL_FRONT. Figure 20 gives possible values of the second argument and corresponding default values of the third argument. Figure 21 provides examples of the use of glMaterialfv.

The default specular colour is black, which means that objects have matte surfaces. To create a shiny object, set GL_SPECULAR to white and specify the shininess as a single number between 0 and 128. The object will look as if it is made of plastic; creating other materials, such as metals, requires a careful choice of values.

Parameter                Meaning                        Default
GL_DIFFUSE               diffuse colour of material     (0.8, 0.8, 0.8, 1.0)
GL_AMBIENT               ambient colour of material     (0.2, 0.2, 0.2, 1.0)
GL_AMBIENT_AND_DIFFUSE   ambient and diffuse colours    —
GL_SPECULAR              specular colour of material    (0.0, 0.0, 0.0, 1.0)
GL_SHININESS             specular exponent              0.0
GL_EMISSION              emissive colour of material    (0.0, 0.0, 0.0, 1.0)

Figure 20: Parameters for glMaterialfv()


/* Data declarations */
GLfloat off[] = { 0.0, 0.0, 0.0, 0.0 };
GLfloat white[] = { 1.0, 1.0, 1.0, 1.0 };
GLfloat red[] = { 1.0, 0.0, 0.0, 1.0 };
GLfloat deep_blue[] = { 0.1, 0.5, 0.8, 1.0 };
GLfloat shiny[] = { 50.0 };
GLfloat dull[] = { 0.0 };

/* Draw a small, dark blue sphere with shiny highlights */
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, deep_blue);
glMaterialfv(GL_FRONT, GL_SPECULAR, white);
glMaterialfv(GL_FRONT, GL_SHININESS, shiny);
glutSolidSphere(0.2, 10, 10);

/* Draw a large, red cube made of non-reflective material */
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, red);
glMaterialfv(GL_FRONT, GL_SPECULAR, off);
glMaterialfv(GL_FRONT, GL_SHININESS, dull);
glutSolidCube(10.0);

/* Draw a white, glowing sphere */
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, off);
glMaterialfv(GL_FRONT, GL_SPECULAR, off);
glMaterialfv(GL_FRONT, GL_SHININESS, dull);
glMaterialfv(GL_FRONT, GL_EMISSION, white);
glutSolidSphere(10.0, 20, 20);

Figure 21: Using glMaterial()

4.3 Light Properties

As mentioned above, OpenGL provides up to eight lights, each with its own properties. Since lighting calculations must be performed for each light, using a large number of lights will slow down the program. For most applications, it is best to use one or two lights only, to obtain acceptable performance. However, the realism of a scene can be greatly enhanced by multiple lights, and there are occasions where a rich image is more important than fast animation.

Light properties are set by calling glLight{if}[v] with three arguments: the light (GL_LIGHT0, GL_LIGHT1, etc.); the property name; and the property value. Figure 22 describes each property and gives the default value for GL_LIGHT0. The default values for other lights are all zero. This means that if you enable GL_LIGHT0 and do nothing else, you will see something; but if you enable any other light, you won't see anything unless you specify its properties.

A light has three of the four colour components: diffuse, ambient, and specular, but not emissive. (We have seen why a surface should have these colours, but it is not obvious why a light needs them as well. We will discuss this later, in the theory part of the course.)

A light has a position specified in four-dimensional coordinates (x, y, z, w). The fourth coordinate, w, has a special significance: if it is zero, the light is at infinity and the other three coordinates give its direction; if it is 1, the light is at the position specified. For example, the default position is (0, 0, 1, 0), which specifies a light in the positive Z direction (behind the viewer in the default coordinate system) and infinitely far away. The position (5, 1, 0, 1) defines a light on the left, slightly raised, and in the Z plane of the viewer. If w = 0, we have a directional light and, if w = 1, we have a positional light.

Parameter                  Meaning                      Default
GL_DIFFUSE                 diffuse colour               (1.0, 1.0, 1.0, 1.0)
GL_AMBIENT                 ambient colour               (0.0, 0.0, 0.0, 1.0)
GL_SPECULAR                specular colour              (1.0, 1.0, 1.0, 1.0)
GL_POSITION                position                     (0.0, 0.0, 1.0, 0.0)
GL_CONSTANT_ATTENUATION    constant attenuation         1.0
GL_LINEAR_ATTENUATION      linear attenuation           0.0
GL_QUADRATIC_ATTENUATION   quadratic attenuation        0.0
GL_SPOT_CUTOFF             cutoff angle of spotlight    180.0
GL_SPOT_DIRECTION          direction of spotlight       (0.0, 0.0, −1.0)
GL_SPOT_EXPONENT           exponent of spotlight        0.0

Figure 22: Parameters for glLightfv()

Once again, the choice of light position trades realism against efficiency. If the light is at infinity, its rays are parallel and lighting computations are fast. If the light is local, its rays hit each object at a different angle and lighting computations take longer.

The attenuation factors determine how the brightness of the light decreases with distance. OpenGL computes attenuation with the formula

\[ a = \frac{1}{c + \ell\,d + q\,d^2} \]

in which a is the attenuation factor, d is the distance from the light to the object, and c, ℓ, and q are the constant, linear, and quadratic attenuation coefficients, respectively. The default values are c = 1, ℓ = 0, and q = 0. Clearly, ℓ and q must be zero for a directional light because d = ∞; in practice, OpenGL ignores these values for directional lights.

Physics tells us that the intensity of light decreases as the inverse square of the distance from the source, and therefore suggests setting c = ℓ = 0 and giving q some non-zero value. However, the inverse-square law applies only to point sources of light, which are rather rare in everyday life. For most purposes, the default values, c = 1 and ℓ = q = 0, are adequate. Giving non-zero values to ℓ and q may give somewhat more realistic effects but will make your program slower.
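For example, a positional light could be attenuated as follows; the coefficient values here are illustrative, not defaults:

    // a = 1 / (c + l*d + q*d^2)
    glLightf(GL_LIGHT0, GL_CONSTANT_ATTENUATION, 1.0f);   // c
    glLightf(GL_LIGHT0, GL_LINEAR_ATTENUATION, 0.05f);    // l
    glLightf(GL_LIGHT0, GL_QUADRATIC_ATTENUATION, 0.01f); // q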

Every light is a spotlight that emits light in a cone. The angle of the cone is set by GL_SPOT_CUTOFF and its default value is 180°, which means that the light emits in all directions. Changing the value of GL_SPOT_CUTOFF gives the effect of a light that emits in a particular direction. For example, a value of 5° simulates a highly directional beam such as the headlight of a car.
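As a sketch, a headlight-style beam could be set up as follows; the values are illustrative, and the direction and exponent properties used here are described in the next paragraph:

    GLfloat beamDir[] = { 0.0f, 0.0f, -1.0f };
    glLightf(GL_LIGHT0, GL_SPOT_CUTOFF, 5.0f);         // narrow 5 degree cone
    glLightfv(GL_LIGHT0, GL_SPOT_DIRECTION, beamDir);  // beam points along -Z
    glLightf(GL_LIGHT0, GL_SPOT_EXPONENT, 2.0f);       // slightly focused beam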

If you give GL_SPOT_CUTOFF a value other than 180°, you should also give appropriate values to GL_SPOT_DIRECTION and GL_SPOT_EXPONENT. The direction is simply a vector specified by the values (x, y, z). The exponent determines how focused the beam is. The light intensity at an angle θ from the centre of the beam is cosˣ θ, where x is the exponent value. The default value of the exponent is 0; since cos⁰ θ = 1, the illumination is even across the cone.

Parameter                       Meaning                       Default
GL_LIGHT_MODEL_AMBIENT          ambient light intensity       (0.2, 0.2, 0.2, 1.0)
GL_LIGHT_MODEL_LOCAL_VIEWER     simulate a close viewpoint    GL_FALSE
GL_LIGHT_MODEL_TWO_SIDE         select two-sided lighting     GL_FALSE
GL_LIGHT_MODEL_COLOR_CONTROL    colour calculations           GL_SINGLE_COLOR

Figure 23: Parameters for glLightModel()

4.4 Lighting Model

The lighting model determines overall features of the lighting. The lighting model is selected by glLightModel{if}[v], which takes two arguments: a property and a value. Figure 23 shows the property names and their default values.

The ambient light in a scene is light that pervades the scene uniformly, coming from all directions. A moderate quantity of ambient light, such as provided by the default setting, makes the quality of light softer and ensures that every object is visible.

The other three properties allow you to choose between realism and speed. The local viewer property determines the way in which OpenGL calculates specular reflections. (Roughly, specular reflections come from shiny or glossy objects. We discuss this in more detail later.) If the viewer is very distant, rays coming from the scene to the eye are roughly parallel, and specular reflection calculations can be simplified by assuming that they actually are parallel (GL_FALSE, the default setting). If we want to model lighting accurately from the point of view of a close viewer, we have to make more detailed calculations by choosing GL_TRUE for this parameter.

Two-sided lighting illuminates both sides of each polygon; single-sided lighting, the default, illuminates only front surfaces and is therefore much faster. Suppose we are lighting a sphere: all of the polygons face outwards, and single-sided lighting is all we need. Suppose, however, that we cut a hole in the sphere so that we can see inside. The inside of the sphere consists of the back faces of the polygons and, in order to see them, we would need two-sided lighting.

As Figure 23 shows, the default colour calculation is GL_SINGLE_COLOR, and it causes OpenGL to calculate a single colour for each vertex. The call

glLightModeli(GL_LIGHT_MODEL_COLOR_CONTROL, GL_SEPARATE_SPECULAR_COLOR);

makes OpenGL calculate two colours for each vertex. The two colours are used when texturing, to ensure that textured objects are illuminated realistically.

4.5 Lighting in Practice

Lighting a scene is fairly straightforward; the hardest part is to get the light(s) in the right position. Positioning is done in two steps. First, the position is defined as a value:

GLfloat pos[] = { 0, 0, 3, 1 };

Second, glLight is called with the property GL_POSITION:


glLightfv(GL_LIGHT0, GL_POSITION, pos);

When this call is executed, the position given by pos is transformed by the current model view matrix. This can be a bit confusing, because the coordinate frame is moved by the model view matrix and the light is positioned with respect to the new coordinate frame. You may find it easier to set pos to (0, 0, 0, 1) and then move the frame to wherever you want the light.

Assuming that you have set pos as above, the following code in the display function will draw a stationary object with a fixed light.

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
glLightfv(GL_LIGHT0, GL_POSITION, pos);
// Draw model

In the next version, the light rotates around the stationary object. Assume that angle is continuously updated by the idle function.

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
glPushMatrix();
    glRotatef(angle, 1, 0, 0);
    glLightfv(GL_LIGHT0, GL_POSITION, pos);
glPopMatrix();
// Draw model

A third possibility is that you want the light to move with the viewer, as if you were watching the scene with a miner's light attached to your hardhat. For this purpose, it is best to put the light at the origin (where the camera is) and to set its position before doing any other viewing transformation. The position is set by

GLfloat pos[] = { 0, 0, 0, 1 };

and the display function contains the code

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glLightfv(GL_LIGHT0, GL_POSITION, pos);
gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
// Draw model

In each case, changing the fourth component of the light position to zero will give directional (rather than positional) light, which is faster to compute but less realistic.

4.6 Normal Vectors

In order to perform lighting calculations, OpenGL needs to know the direction of the normal at each point of the surface that it is rendering. The normal is a vector, usually a unit vector, that "sticks out" of the surface at right angles to it. The normals of a sphere, for example, would pass through the centre of the sphere if extended far enough backwards.

OpenGL requires a normal to be associated with each vertex. This might seem odd, because normals belong to surfaces, not to vertexes. There are two ways of using normals.


• If we want to draw an object with clearly distinguished faces, such as a cube, then each vertex will have several normals associated with it. In the case of a cube, each corner vertex will have three normals, one for each face.

• In order to create the illusion of a smooth surface, we compute the average of the normals of the surfaces that meet at a vertex. For example, if three surfaces meet at a vertex, and their normals are v₁, v₂, and v₃, then the normal for that vertex is calculated as

\[ \frac{v_1 + v_2 + v_3}{\|v_1 + v_2 + v_3\|} \]

The effect of this calculation is to smooth out the corners and edges of the object.

The function that sets a normal is glNormal3{bsidf}[v]() and it must be called before the vertex it applies to. Once the normal is set, it can be applied to any number of vertexes. For example, to draw a flat triangle in the XY plane, we could execute:

glBegin(GL_TRIANGLES);
    glNormal3i(0, 0, 1);
    glVertex3f(-0.5, 0, 0);
    glVertex3f(0.5, 0, 0);
    glVertex3f(0, 0.866, 0);
glEnd();

Here are three ways of computing normals.

1. Normal to a triangle. The vector normal to a triangle with vertices (x₁, y₁, z₁), (x₂, y₂, z₂), and (x₃, y₃, z₃) is (a, b, c), where

\[
a = +\begin{vmatrix} y_2 - y_1 & z_2 - z_1 \\ y_3 - y_1 & z_3 - z_1 \end{vmatrix}
\qquad
b = -\begin{vmatrix} x_2 - x_1 & z_2 - z_1 \\ x_3 - x_1 & z_3 - z_1 \end{vmatrix}
\qquad
c = +\begin{vmatrix} x_2 - x_1 & y_2 - y_1 \\ x_3 - x_1 & y_3 - y_1 \end{vmatrix}
\]

Figure 24 shows a simple function that computes the vector normal to a plane defined by three points p, q, and r, chosen from an array of points.

2. Normal to a polygon. There is a simple algorithm, invented by Martin Newell, for finding the normal to a polygon with N vertexes. For good results, the vertexes should lie approximately in a plane, but the algorithm does not depend on this. If the vertexes have coordinates (xᵢ, yᵢ, zᵢ) for i = 0, 1, 2, ..., N − 1, the normal n = (nₓ, n_y, n_z) is computed as

\[
\begin{aligned}
n_x &= \sum_{0 \le i < N} (y_i - y_{i+1})(z_i + z_{i+1}) \\
n_y &= \sum_{0 \le i < N} (z_i - z_{i+1})(x_i + x_{i+1}) \\
n_z &= \sum_{0 \le i < N} (x_i - x_{i+1})(y_i + y_{i+1})
\end{aligned}
\]

The subscript i + 1 is computed "mod N": if i = N − 1, then i + 1 = 0. The result n must be divided by \( \|n\| = \sqrt{n_x^2 + n_y^2 + n_z^2} \) to obtain a unit vector. (A code sketch of this method appears after this list.)


enum { X, Y, Z };
typedef float Point[3];
typedef float Vector[3];

Point points[MAX_POINTS];

void find_normal(int p, int q, int r, Vector v)
{
    float x1 = points[p][X];
    float y1 = points[p][Y];
    float z1 = points[p][Z];
    float x2 = points[q][X];
    float y2 = points[q][Y];
    float z2 = points[q][Z];
    float x3 = points[r][X];
    float y3 = points[r][Y];
    float z3 = points[r][Z];
    v[X] = + (y2-y1)*(z3-z1) - (z2-z1)*(y3-y1);
    v[Y] = - (x2-x1)*(z3-z1) + (z2-z1)*(x3-x1);
    v[Z] = + (x2-x1)*(y3-y1) - (y2-y1)*(x3-x1);
}

Figure 24: Computing normals

3. Normals for a square grid. A general purpose formula can often be simplified for special cases. Suppose that we are constructing a terrain using squares and that the X and Y coordinates are integer multiples of the grid spacing, d. The height of the terrain at x = i and y = j is z_{i,j}. Figure 25 shows 9 points of the terrain, centred at z_{i,j}. The X coordinates in this view are i − 1, i, and i + 1, and the Y coordinates are j − 1, j, and j + 1. The appropriate normal for the point (xᵢ, y_j) is the average of the normals to the quadrilaterals A, B, C, and D. Using Newell's formula to compute these four normals and adding the resulting vectors gives a vector n with components:

\[
\begin{aligned}
n_x &= d\,(z_{i-1,j+1} - z_{i+1,j+1} + 2z_{i-1,j} - 2z_{i+1,j} + z_{i-1,j-1} - z_{i+1,j-1}) \\
n_y &= d\,(-z_{i-1,j+1} - 2z_{i,j+1} - z_{i+1,j+1} + z_{i-1,j-1} + 2z_{i,j-1} + z_{i+1,j-1}) \\
n_z &= 8\,d^2
\end{aligned}
\]

Note that we do not need to include the factor d in the calculation of n, since a scalar multiple does not affect the direction of a vector. The correct normal vector is then obtained by normalizing n.
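The sketch below implements Newell's method (item 2 above), using the Point and Vector types of Figure 24; the function name newell_normal is an assumption, and sqrtf comes from <cmath>:

    #include <cmath>

    // Compute the unit normal of an N-vertex polygon by Newell's method.
    void newell_normal(const Point poly[], int N, Vector n)
    {
        n[X] = n[Y] = n[Z] = 0;
        for (int i = 0; i < N; i++)
        {
            int j = (i + 1) % N;  // the subscript i+1 is computed "mod N"
            n[X] += (poly[i][Y] - poly[j][Y]) * (poly[i][Z] + poly[j][Z]);
            n[Y] += (poly[i][Z] - poly[j][Z]) * (poly[i][X] + poly[j][X]);
            n[Z] += (poly[i][X] - poly[j][X]) * (poly[i][Y] + poly[j][Y]);
        }
        float s = sqrtf(n[X] * n[X] + n[Y] * n[Y] + n[Z] * n[Z]);
        if (s > 0)  // divide by ||n|| to obtain a unit vector
        {
            n[X] /= s;
            n[Y] /= s;
            n[Z] /= s;
        }
    }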

Figure 25: Computing average normals on a square grid (diagram not reproduced: nine grid points at spacing d, with quadrilaterals A, B, C, and D surrounding the centre point)

The average of n normal vectors can be calculated by adding them and dividing by n. In practice, the division by n is not usually necessary, because we can simply add the normal vectors at a vertex and then normalize the resulting vector. Formally, the normalized average vector of a set of vectors \( \{ (x_i, y_i, z_i) \mid i = 1, 2, \ldots, n \} \) is (X/S, Y/S, Z/S), where

\[ X = \sum_{i=1}^{n} x_i, \quad Y = \sum_{i=1}^{n} y_i, \quad Z = \sum_{i=1}^{n} z_i, \quad\text{and}\quad S = \sqrt{X^2 + Y^2 + Z^2}. \]

Normalizing Vectors If you include the statement glEnable(GL_NORMALIZE) in your initialization code, OpenGL will normalize vectors for you. You have to do this if, for example, you import a model with vertexes and normals pre-calculated and you then scale this model. However, it is usually more efficient to normalize vectors yourself if you can.


5 Special Effects

5.1 Blending

Blending is a very powerful and general feature of OpenGL, and we will describe only a few special cases of it. For complete details, see the "red book" (OpenGL Programming Guide, Third Edition, by Mason Woo et al.).

The fourth component of a colour vector, usually referred to as "alpha" (α), makes the colour partially transparent, allowing it to be "blended" with another colour.

Normally, when OpenGL has computed the colour of a vertex, it stores the colour at the corresponding pixel location unless the depth buffer information says that the vertex is invisible, in which case the pixel is left unchanged. When blending is being used, the computed colour is combined with the colour that is already at the pixel, and the new colour is stored. The new colour is called the source and the existing colour at the pixel is called the destination. If

\[
\begin{aligned}
\text{source colour} &= (R_s, G_s, B_s, A_s) \\
\text{source blending factors} &= (S_r, S_g, S_b, S_a) \\
\text{destination colour} &= (R_d, G_d, B_d, A_d) \\
\text{destination blending factors} &= (D_r, D_g, D_b, D_a)
\end{aligned}
\]

then the final colour of the pixel is

\[ (R_s S_r + R_d D_r,\; G_s S_g + G_d D_g,\; B_s S_b + B_d D_b,\; A_s S_a + A_d D_a) \]

The order in which we draw opaque objects does not usually matter much, because the depth buffer takes care of hidden surfaces. With blending, however, the order is important, because the order in which OpenGL processes the source and destination colours affects the result.

Blending is enabled by calling glEnable(GL_BLEND) and disabled by calling glDisable(GL_BLEND). The blending process is determined by calls to

glBlendFunc(GLenum src, GLenum dst);

There are many possible values for these two arguments. Their uses are suggested by the following examples.

• To blend two images: draw the first image with src = GL_ONE and dst = GL_ZERO. Then set α = 0.5 and draw the second image with src = GL_SRC_ALPHA and dst = GL_ONE_MINUS_SRC_ALPHA.

• To achieve a "painting" effect, in which each brush stroke adds a little more colour, use α = 0.1 and draw each brush stroke with src = GL_SRC_ALPHA and dst = GL_ONE_MINUS_SRC_ALPHA.

Here are some extracts from a program that achieves a glass-like effect by blending. During initialization, call

glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

Set up various colours. Note the use of α values less than 1. The vector glass is not const because the program allows the transparency of glass to be changed.


GLfloat glass[] = { 0.4f, 0.4f, 0.4f, 0.6f };
const GLfloat blue[] = { 0.2f, 0.2f, 1.0f, 0.8f };
const GLfloat white[] = { 1.0, 1.0, 1.0, 1.0 };
const GLfloat polished[] = { 100.0 };

In the display function, display the more opaque object first, then the transparent object.

glPushMatrix();
    glTranslatef(1.0, 0.0, 0.0);
    glRotatef(45.0, 1.0, 0.0, 0.0);
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, blue);
    glMaterialfv(GL_FRONT, GL_SPECULAR, white);
    glMaterialfv(GL_FRONT, GL_SHININESS, polished);
    glutSolidIcosahedron();
glPopMatrix();

glPushMatrix();
    glRotatef(30.0, 0.0, 1.0, 0.0);
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, glass);
    glMaterialfv(GL_FRONT, GL_SPECULAR, white);
    glMaterialfv(GL_FRONT, GL_SHININESS, polished);
    glutSolidCube(3.0);
glPopMatrix();

See also (Hill Jr. 2001, pages 545–549).

5.2 Fog

Fog is an easy effect to create, and it can be quite useful. A common problem that occurs in creating landscapes is that the edge of the terrain looks like the end of the world rather than a smooth horizon; we can use fog to hide such anomalies.

Fog is actually a special case of blending. The fog effect is obtained by blending the desired colour of a vertex with the fog colour. The degree of blending is determined by the distance of the vertex from the viewer. OpenGL provides three modes: in Figure 26, the left column shows the modes and the right column shows f, the "fog factor".

To use fog, you have to call glEnable(GL_FOG) and then set the parameters in the formulas by calling glFog:

glFog{if}[v](GLenum param, TYPE value);

Figure 27 shows the values of the arguments of glFog. As the formulas show, you set GL_FOG_DENSITY for modes GL_EXP and GL_EXP2, and you set GL_FOG_START and GL_FOG_END for mode GL_LINEAR. The default mode is GL_EXP.

You can control the efficiency of fog generation by providing hints. If you call

glHint(GL_FOG_HINT, GL_NICEST);

then OpenGL will calculate fog for every pixel. If you call

glHint(GL_FOG_HINT, GL_FASTEST);


GL_LINEAR    f = (end − z) / (end − start)
GL_EXP       f = e^(−d·z)
GL_EXP2      f = e^(−(d·z)²)

Figure 26: Fog Formulas

param            value
GL_FOG_MODE      GL_LINEAR, GL_EXP, or GL_EXP2
GL_FOG_DENSITY   d
GL_FOG_START     start
GL_FOG_END       end
GL_FOG_COLOR     colour

Figure 27: Parameters for glFog

then OpenGL will calculate fog for every vertex, which is usually faster but does not look as nice. If you want OpenGL to decide which mode to use by itself, you write

glHint(GL_FOG_HINT, GL_DONT_CARE);

Naturally, you can call glDisable(GL_FOG) to turn the fog effect off.
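For example, a minimal linear-fog setup during initialization might look like this; the grey colour and the start and end distances are illustrative values:

    GLfloat fogColour[] = { 0.7f, 0.7f, 0.7f, 1.0f };
    glEnable(GL_FOG);
    glFogi(GL_FOG_MODE, GL_LINEAR);
    glFogf(GL_FOG_START, 10.0f);
    glFogf(GL_FOG_END, 60.0f);
    glFogfv(GL_FOG_COLOR, fogColour);
    glClearColor(0.7f, 0.7f, 0.7f, 1.0f);  // matching the background to the fog helps the illusion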

5.3 Reflection

Reflection is one of several effects that you can obtain with the stencil buffer. A "stencil" is a plastic sheet with holes cut into it. The holes have particular shapes (for example, an architect's stencil has shapes of furniture) and the stencil is used as a guide when drawing those shapes. A stencil in a graphics program is an area of the window that is used to draw something different from the main image.

Stencils can be used for a variety of effects. The following extracts are from a program that draws a scene and its reflection in a mirror. Using a stencil for the mirror allows us to draw a scene in which objects in the mirror are transformed differently from other objects.

During initialization:

glClearStencil(0);
glEnable(GL_STENCIL_TEST);

Define a mirror in a plane normal to the X axis. There are two ways of drawing the mirror: if p is true, draw a filled quadrilateral or, if p is false, draw a hollow outline.

void mirror(bool p)
{
    if (p)
        glBegin(GL_QUADS);
    else
        glBegin(GL_LINE_LOOP);
    glVertex3f(cmx, cmy - 0.5, cmz - 2.0);
    glVertex3f(cmx, cmy - 0.5, cmz + 2.0);
    glVertex3f(cmx, cmy + 0.5, cmz + 2.0);
    glVertex3f(cmx, cmy + 0.5, cmz - 2.0);
    glEnd();
}

Display the scene like this. First, store the shape of the mirror in the stencil buffer:

glClear(GL_STENCIL_BUFFER_BIT);
glStencilFunc(GL_ALWAYS, 1, 1);
glStencilOp(GL_REPLACE, GL_REPLACE, GL_REPLACE);
mirror(true);

As usual, clear the colour buffer and depth buffer bits:

glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

Draw the mirror frame:

glColor3f(0.7f, 0.7f, 0.7f);
mirror(false);

Draw the scene outside the mirror:

glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
glStencilFunc(GL_NOTEQUAL, 1, 1);
scene();

Finally, draw the reflected scene in the mirror. To obtain the mirror image, translate to the centre of the mirror, reflect the scene in the X direction (remember that the plane of the mirror is normal to the X axis), and then reverse the translation.

glStencilFunc(GL_EQUAL, 1, 1);
glTranslatef(cmx, cmy, cmz);
glScalef(-1.0, 1.0, 1.0);
glTranslatef(-cmx, -cmy, -cmz);
scene();

5.4 Display Lists

Display lists do not provide any new graphical features; they are used to improve the performance of graphics programs. The idea is to perform as many calculations as possible in advance and store the results, instead of performing the calculations every time the scene is displayed.

The following code illustrates the use of display lists. To create a list:

GLuint pic = glGenLists(1);
glNewList(pic, GL_COMPILE);


// draw the picture
glEndList();

The argument given to glGenLists specifies the number of lists, N, that we want. glGenLists returns the first number, f, in a range. When we call glNewList, the first argument must be a number ℓ in the range f ≤ ℓ < f + N.

The second argument given to glNewList is either GL_COMPILE, if we want to do the calculations only, or GL_COMPILE_AND_EXECUTE, if we want to do the calculations and draw the picture.

After creating a list, we draw the picture stored in it by calling

glCallList(pic);

where pic is the same number that we gave to glNewList.

A display list can include transformations (translate, rotate, scale) and drawing functions. As a rule of thumb, you can do something in a display list if it would make sense to do the same thing in the display function. However, the values used when the display list is created are fixed. For example, you could include a rotation

glRotatef(angle, 0, 1, 0);

in a display list, but the current value of angle would be "frozen" in the list; changing the value of angle would have no effect on the stored image.

A single list can be used many times. For example, if you created a display list with index person, you could create a crowd scene like this (assume that rdm returns a random floating point value in a suitable range):

for (p = 0; p < CROWDSIZE; p++)
{
    glPushMatrix();
    glTranslatef(rdm(), 0, rdm());
    glRotatef(rand() % 360, 0, 1, 0);
    glCallList(person);
    glPopMatrix();
}

Each person would be translated to some point in the XZ plane (but not levitated, since there is no Y component) and rotated by a random amount. A problem with this simple algorithm is that some people might overlap with others.

You can create hierarchical display lists. That is, you can use glCallList between glNewList and glEndList, provided that the list you call has already been defined. The following code is valid, provided that the index couple has been allocated by glGenLists:

glNewList(couple, GL_COMPILE);
glCallList(person);
glTranslatef(5, 0, 0);
glCallList(person);
glEndList();

As usual, it is highly advisable to use glPushMatrix and glPopMatrix to ensure that calling a list does not change the frame of reference.


Figure 28: Two three-point Bezier curves with their control points (diagram not reproduced)

5.5 Bezier Curves and Surfaces

We can do a limited amount of modelling with the built-in objects provided by GLUT (Section 1.5.2) and quadrics (Section 1.5.3), but we need more general techniques to build complex models. Bezier formulas¹ are an example of such a technique.

It might seem that we can use any polynomial (or even any function) to generate a curve. This is not so, for the following reason. We would like to represent the modelling function using only a small number of parameters. For example, we can represent a second degree polynomial a x² + b x + c with just three numbers, a, b, and c. More practically, we could represent the function by a small collection of points that it passes through or that control its direction. We also want to perform graphics transformations on these points without changing the curve or surface generated. Consequently, Bezier and other formulas have the following important property: the relationship between the control points and the generated curve or surface is unaffected by affine transformations.

5.5.1 Curves

We consider curves in 2D space first because they are easier to understand than surfaces in 3D space. A Bezier curve is defined by a set of control points. The curve starts at the first point and ends at the last point, but it does not pass through the intermediate points. Figure 28 shows two Bezier curves, each with three control points. Tangents to the curve at its end points pass through the middle control point.

A general Bezier curve can have any number of control points. OpenGL provides evaluators to draw Bezier curves. The minimal steps are as follows:

• Define an array of control points.

• During initialization, pass information about control points to an evaluator and enable the evaluator.

• In the display function, compute the points that you need.

The evaluator defines a parametric curve: that is, points on the curve depend on values of a parameter, u. It is often convenient to allow u to vary from 0 to 1, but OpenGL does not require this.

Here is a simple example that generates a four-point Bezier curve.

¹Bezier formulas were in fact invented by two researchers, both car designers: Pierre Bezier (1910–1999) at Renault found the formulas, and Paul de Casteljau at Citroen developed an algorithm for calculating the coefficients.


Parameter          Meaning
GL_MAP1_VERTEX_3   (x, y, z) coordinates
GL_MAP1_VERTEX_4   (x, y, z, w) coordinates
GL_MAP1_COLOR_4    (r, g, b, a) colour values
GL_MAP1_NORMAL     (x, y, z) normal direction

Figure 29: Control parameters for Bezier curves

• Define an array of four 3D points. The points lie in the XY plane.

GLfloat pts[4][3] = { {-4, -4, 0}, {-2, 4, 0}, {2, -4, 0}, {4, 4, 0} };

• Define the evaluator:

glMap1f(GL_MAP1_VERTEX_3, 0, 1, 3, 4, &pts[0][0]);

The first argument determines the type of the control points (see below). The next two arguments specify the range of the parameter u: in this example, 0 ≤ u ≤ 1. The argument "3" is the stride: it tells the function how many floating-point values to step over between each point. The argument "4" is the order of the spline, which is equal to the number of points specified. The last argument is the address of the first control point.

• Enable the evaluator:

glEnable(GL_MAP1_VERTEX_3);

• In the display function, draw the curve as a sequence of line segments:

glBegin(GL_LINE_STRIP);
for (int i = 0; i <= 50; i++)
    glEvalCoord1f((GLfloat) i / 50.0);
glEnd();

• The same effect can be achieved by calling:

glMapGrid1f(50, 0.0, 1.0);
glEvalMesh1(GL_LINE, 0, 50);

An evaluator can generate vertex coordinates, normal coordinates, colour values, or texture coordinates; Figure 29 shows some of the possible values.

5.5.2 Surfaces

Generating a Bezier surface is similar, but two parameters, u and v, are needed for the surface coordinates, and a rectangular array of control points is required. The functions are specified below in a general way rather than by a specific example as above.

glMap2{fd}(target, u1, u2, ustride, uorder, v1, v2, vstride, vorder, points);

where

target D The control parameter: as Figure 29 but with MAP2

u1 D Minimum value for u

u2 D Maximum value for u

47

Page 55: COMP 6761 Advanced Computer Graphics - Concordia University

5 SPECIAL EFFECTS 5.5 Bezier Curves and Surfaces

ustride D Address difference for u valuesuorder D Number of u values

v1 D Minimum value for vv2 D Maximum value for v

vstride D Address difference between v valuesvorder D Number of v valuespoints D Address of first point

For example, suppose we define the control points with

    GLfloat pts[4][4][3] = { ... };

and the points specify vertices. Assume that we want 0 ≤ u ≤ 1 and 0 ≤ v ≤ 1. Then we would call:

    glMap2f(GL_MAP2_VERTEX_3, 0, 1, 3, 4, 0, 1, 12, 4, &pts[0][0][0]);

because the u entries are three floats apart and the v entries are twelve floats (four points) apart.

To obtain a point on the surface, call

glEvalCoord2f(u, v);

where (u, v) are the surface coordinates. The vertexes can be calculated four at a time and drawn as GL_QUADS to obtain a surface.

Alternatively, you can use the grid and mesh functions to draw the entire surface. An important advantage of using these functions is that they generate normals.

glMapGrid2{fd}(nu, u1, u2, nv, v1, v2);

where

    nu = Number of u control points to evaluate
    u1 = Minimum value for u
    u2 = Maximum value for u
    nv = Number of v control points to evaluate
    v1 = Minimum value for v
    v2 = Maximum value for v

glEvalMesh2(mode, i1, i2, j1, j2);

where mode is one of GL_POINT, GL_LINE, or GL_FILL; i1 and i2 specify the range of u values; and j1 and j2 specify the range of v values.

Although there can be any number of control points in principle, using very large numbers can be problematic. The functions generate polynomials of high degree that require time to compute and may be unstable.

Figure 30 shows a concrete example of Bezier surface generation. The shape generated is one side of the body of an aircraft; the surface is rendered twice to obtain both sides of the aircraft. Figure 31 shows the 3D coordinates generated by this code.

See also (Hill Jr. 2001, Chapter 11).


glEnable(GL_MAP2_VERTEX_3);
glEnable(GL_AUTO_NORMAL);
setMaterial(METAL);

// Fuselage
const int fuWidth = 4;
const int fuLength = 6;
const int fuLoops = 20;
const int fuSlices = 20;
const GLfloat fuShapeFactor = 0.9f;
GLfloat fuPoints[fuLength][fuWidth][3];

struct { GLfloat len; GLfloat size; } fuParameters[fuLength] =
{
    { -10,    0    },
    {  -9.6f, 1.4f },
    {  -9,    1.6f },
    {   8,    1.4f },
    {   9.9f, 1    },
    {  10,    0    }
};

for (int p = 0; p < fuLength; p++)
{
    for (int y = 0; y < fuWidth; y++)
        fuPoints[p][y][2] = fuParameters[p].len;

    fuPoints[p][0][0] = 0;
    fuPoints[p][1][0] = fuParameters[p].size;
    fuPoints[p][2][0] = fuParameters[p].size;
    fuPoints[p][3][0] = 0;

    fuPoints[p][0][1] = - fuShapeFactor * fuParameters[p].size;
    fuPoints[p][1][1] = - fuShapeFactor * fuParameters[p].size;
    fuPoints[p][2][1] =   fuShapeFactor * fuParameters[p].size;
    fuPoints[p][3][1] =   fuShapeFactor * fuParameters[p].size;
}

glMap2f(GL_MAP2_VERTEX_3,
        0, 1, 3, fuWidth,
        0, 1, 3 * fuWidth, fuLength,
        &fuPoints[0][0][0]);

glMapGrid2f(fuLoops, 0, 1, fuSlices, 0, 1);

glEvalMesh2(GL_FILL, 0, fuLoops, 0, fuSlices);
glScalef(-1, 1, 1);
glEvalMesh2(GL_FILL, 0, fuLoops, 0, fuSlices);

Figure 30: Using Bezier surfaces for the body of a plane


 0.00  0.00 -10.00    0.00  0.00 -10.00    0.00  0.00 -10.00    0.00  0.00 -10.00
 0.00 -1.26  -9.60    1.40 -1.26  -9.60    1.40  1.26  -9.60    0.00  1.26  -9.60
 0.00 -1.44  -9.00    1.60 -1.44  -9.00    1.60  1.44  -9.00    0.00  1.44  -9.00
 0.00 -1.26   8.00    1.40 -1.26   8.00    1.40  1.26   8.00    0.00  1.26   8.00
 0.00 -0.90   9.90    1.00 -0.90   9.90    1.00  0.90   9.90    0.00  0.90   9.90
 0.00  0.00  10.00    0.00  0.00  10.00    0.00  0.00  10.00    0.00  0.00  10.00

Figure 31: Points generated by the code of Figure 30

void menu(int code)
{
    cout << "Menu selection: " << code << endl;
}

void initMenus()
{
    int sub = glutCreateMenu(menu);
    glutAddMenuEntry("Orange", 5);
    glutAddMenuEntry("Pear", 6);
    glutAddMenuEntry("Quince", 7);
    glutAddMenuEntry("Raspberry", 8);

    glutCreateMenu(menu);
    glutAddMenuEntry("Apple", 1);
    glutAddMenuEntry("Banana", 2);
    glutAddMenuEntry("Carrot", 3);
    glutAddMenuEntry("Damson", 4);
    glutAddSubMenu("More...", sub);

    glutAttachMenu(GLUT_RIGHT_BUTTON);
}

Figure 32: Functions for menu callback and creation

5.6 Menus

GLUT provides menus: the menus are not very beautiful but have the advantage that they are easy to create. Figure 32 shows the idea. The function initMenus is called once only and sets up the menus. The callback function menu is invoked whenever the user makes a menu selection. The argument passed to menu depends on the selection: if the user selects "Orange", the value passed is 5, and so on.

It is also easy to create sub-menus and to attach them to the main menu. In Figure 32, the sub-menu sub displays the four entries with codes 5, 6, 7, and 8, and it appears as the fifth option on the main menu, which handles the codes 1, 2, 3, and 4.


5.7 Text

GLUT provides text in two forms:

• Bit-mapped characters are displayed in the plane of the screen in a fixed orientation; and

• stroked characters are 3D objects that can be drawn anywhere in the model and can be scaled and rotated.

The call

glRasterPos(x, y);

sets the initial position for a bit-mapped string. The coordinates (x, y) are model coordinates and transformations (e.g., glTranslatef) are applied to them.

To write one character, c, and set the raster position for the next character, call

glutBitmapCharacter(GLUT_BITMAP_TIMES_ROMAN_24, c);

The call

glutStrokeCharacter(font, c);

draws a character, c, in the given font, which should be either GLUT_STROKE_MONO_ROMAN (fixed width) or GLUT_STROKE_ROMAN (proportional width). Since the height of a character is about 120 units, you may want to scale them down to suit your model. glutStrokeCharacter applies a translation to the right, so that a sequence of calls displays a string with correct spacing between letters.
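For example, a string could be drawn with stroke characters like this (a sketch, not from the notes; the name drawLabel and the scale factor are assumptions):

    void drawLabel(const char * s)
    {
        glPushMatrix();
        glScalef(0.01f, 0.01f, 0.01f);   // characters are about 120 units high
        while (*s)
            glutStrokeCharacter(GLUT_STROKE_ROMAN, *s++);   // also moves right
        glPopMatrix();                   // undo scaling and accumulated translation
    }

Calling glTranslatef before drawLabel positions the string in the model; because the characters are 3D objects, the usual rotations and scalings apply to them.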

5.8 Other Features of OpenGL

There are a number of other features of OpenGL that will not be discussed in detail here. Instead, we describe the idea and omit details of the implementation.

5.8.1 Textures

A texture is a 1D or, more usually, a 2D pattern that is "wrapped" onto an object. The object may be a plane, such as the side of a cube, or a curved surface such as a cylinder or sphere. OpenGL needs two pieces of information in order to apply a texture: the texture image, and the mapping of texture coordinates to object coordinates. The coordinate mapping is simple in the case of a plane or cylinder, but more complicated in the case of a sphere or other object.

We have seen that when an object is drawn by the low-level functions, the user must provide the coordinates and normal direction of each vertex. When textures are used, texture coordinates must be provided as well. For the "standard objects", texture coordinates are provided by the corresponding functions. Consequently, if you want to texture a sphere or a teapot, all you have to do is to tell OpenGL where to find the texture data. This data must be stored as an RGB array: OpenGL does not handle BMP or JPG files directly.

OpenGL provides a lot of support for textures: the chapter on textures in the "Red Book" (Woo, Neider, Davis, and Shreiner 2000) occupies 77 pages! In this section, we provide just a brief overview of simple texturing techniques. Here are the key steps.

1. Create a texture object. This is simply an array of data corresponding to a 1D, 2D, or 3D image. Each pixel is represented by anything from one to four bytes. The most common case is a 2D texture with RGBA values (1 byte each) at each pixel. The dimensions of the array must be powers of 2. Consequently, such an array occupies 4 × 2^m × 2^n bytes of memory.

You can generate the texture yourself, by calculation, or you can read data from a file. Pictures usually come in an encoded format, such as .jpg or .bmp, and must be converted to raw binary form before OpenGL can use them. Utility programs for performing the conversions can be downloaded from the internet.

2. You have to tell OpenGL how the texture is to be applied. The most common cases are:

Replace: the final colour in the scene is the texture colour.

Modulate: the texture provides a value that changes the current colour. This mode is suitable if the texture is being used as a shadow, for example.

Blend: the original colour and the texture colour are combined according to a blending function.

3. Enable texture mapping by executing

glEnable(GL_TEXTURE_2D);

The available constants are GL_TEXTURE_1D, GL_TEXTURE_2D, and GL_TEXTURE_3D, depending on the dimensionality of the texture. If more than one dimension is enabled, the higher one is used.

4. Draw the scene, providing texture coordinates for each vertex:

glTexCoord3f(....);
glVertex3f(....);

OpenGL uses 0 and 1 as the limits of the texture in each direction. This means that, if the texture coordinates are between 0 and 1, OpenGL will use a value within the texture. If the texture coordinates are outside this range, then what happens depends on the texturing mode. If you use GL_REPEAT, then the texture will be repeated as often as necessary. For example, if you texture a square using coordinates that run from 0 to 5, you will obtain 5 × 5 = 25 copies of the texture. This technique is useful for rendering tiles on a floor, for example.

If you use GL_CLAMP, the borders of the texture will be extended as far as necessary.

The following example shows the steps required to apply a simple texture. It is taken from the "red book" (Woo, Neider, Davis, and Shreiner 2000). Like all programs in the "red book", this program is written in C rather than C++.

The first step is to create a texture. This texture is computed; the more common case is for the texture to be read from a file and perhaps converted to the appropriate format for OpenGL. Note that the size of the image is 2^6 × 2^6 (that is, 64 × 64).

#define checkImageWidth 64
#define checkImageHeight 64
static GLubyte checkImage[checkImageHeight][checkImageWidth][4];

void makeCheckImage(void)
{
    int i, j, c;

    for (i = 0; i < checkImageHeight; i++) {
        for (j = 0; j < checkImageWidth; j++) {
            c = (((i & 0x8) == 0) ^ ((j & 0x8) == 0)) * 255;
            checkImage[i][j][0] = (GLubyte) c;
            checkImage[i][j][1] = (GLubyte) c;
            checkImage[i][j][2] = (GLubyte) c;
            checkImage[i][j][3] = (GLubyte) 255;
        }
    }
}

The texture object must have a "name" which is actually a small integer. The initialization function constructs the texture image and then:

• glPixelStorei tells OpenGL that the texture data is aligned on a one-byte boundary.

• glGenTextures obtains one name for the texture and stores it in texName.

• glBindTexture tells OpenGL that texName will be a 2D texture.

• The position on the texture is defined by coordinates (s, t). The various calls to glTexParameteri instruct OpenGL to repeat the texture in both dimensions and to use GL_NEAREST for both magnification and minification filters. (The wrap-mode calls merely restate the defaults; the filter calls do change the defaults, which are GL_LINEAR for magnification and GL_NEAREST_MIPMAP_LINEAR for minification.)

• glTexImage2D passes information about the texture to OpenGL, including the address of the texture itself.

static GLuint texName;

void init(void)
{
    glClearColor(0.0, 0.0, 0.0, 0.0);
    glShadeModel(GL_FLAT);
    glEnable(GL_DEPTH_TEST);

    makeCheckImage();
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
    glGenTextures(1, &texName);
    glBindTexture(GL_TEXTURE_2D, texName);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, checkImageWidth, checkImageHeight,
                 0, GL_RGBA, GL_UNSIGNED_BYTE, checkImage);
}


The display function enables 2D texturing and specifies GL_DECAL mode for the texture. It then displays two squares, one in the XY plane and the other at an angle.

void display(void)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glEnable(GL_TEXTURE_2D);
    glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL);

    glBegin(GL_QUADS);
    glTexCoord2f(0.0, 0.0); glVertex3f(-2.0, -1.0, 0.0);
    glTexCoord2f(0.0, 1.0); glVertex3f(-2.0, 1.0, 0.0);
    glTexCoord2f(1.0, 1.0); glVertex3f(0.0, 1.0, 0.0);
    glTexCoord2f(1.0, 0.0); glVertex3f(0.0, -1.0, 0.0);

    glTexCoord2f(0.0, 0.0); glVertex3f(1.0, -1.0, 0.0);
    glTexCoord2f(0.0, 1.0); glVertex3f(1.0, 1.0, 0.0);
    glTexCoord2f(1.0, 1.0); glVertex3f(2.41421, 1.0, -1.41421);
    glTexCoord2f(1.0, 0.0); glVertex3f(2.41421, -1.0, -1.41421);
    glEnd();
    glDisable(GL_TEXTURE_2D);
}

The remainder of the program is conventional OpenGL.

This basic pattern can be varied in a number of ways.

• The function glTexImage2D should be replaced by glTexImage1D for 1D textures or by glTexImage3D for 3D textures.

• The internal format is GL_RGBA in the call to glTexImage2D, indicating that each texel consists of four bytes containing red, green, blue, and alpha values. There are 37 other constants, each specifying different storage conventions.

• In the call to glTexEnv, the final argument determines how the texture is applied. It can be GL_DECAL, GL_REPLACE, GL_MODULATE, or GL_BLEND. When GL_BLEND is used, the blending function must also be set to an appropriate value.

• It is inefficient to use the full detail of a texture if the image of the texture on the screen is very small. To avoid this inefficiency, a number of texture images are stored at different levels of detail, and OpenGL selects the appropriate image for the application (this is called mipmapping). You can ask OpenGL to compute mipmaps or you can provide your own.
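For example, GLU can compute the mipmap images from a single texture (a sketch, reusing the checkerboard image above; GL_LINEAR_MIPMAP_LINEAR selects smooth blending between levels):

    gluBuild2DMipmaps(GL_TEXTURE_2D, GL_RGBA,
                      checkImageWidth, checkImageHeight,
                      GL_RGBA, GL_UNSIGNED_BYTE, checkImage);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
                    GL_LINEAR_MIPMAP_LINEAR);   // use the mipmap levels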

5.8.2 NURBS

NURBS are Non-Uniform Rational B-SplineS, or curves generated from sets of points. "Non-uniform" means that the points do not have to be evenly spaced; "rational" means that the equations have the form P(x, y)/Q(x, y), where P and Q are polynomials; a "spline" is a continuous curve formed from a set of curves; and "B" stands for "basis", where the "basis" is a set of functions that is suitable for building spline curves.

In OpenGL, NURBS are built from Bezier curves; in fact, the NURBS functions form a high-level interface to the functions that we have already seen in Section 5.5.

5.8.3 Antialiasing

You have probably noticed at one time or another that lines on a display do not have smooth edges. The effect is particularly pronounced for lines that are almost parallel to one of the axes: such lines look like staircases instead of straight lines. This phenomenon is called aliasing and it is due to the finite size of the pixels on the screen. The same phenomenon makes the edges of squares and rectangles look jagged when the edges are not parallel to the axes.

The avoidance of aliasing effects is called antialiasing. There are various ways of implementing antialiasing, and OpenGL provides a few of them.

Antialiasing Lines and Points  A simple method of reducing aliasing effects is to adjust the colour of pixels close to the line by blending. The default way of rendering a line is to divide pixels into two classes: pixels "on" the line and pixels "not on" the line. A better way is to decide, for each pixel, how much it contributes to the line, and to use this quantity for blending. To achieve antialiasing in this way, your program should execute the statements

glEnable(GL_LINE_SMOOTH);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

before drawing lines or points.

Antialiasing Polygons  The above technique is not very helpful because we are usually more interested in drawing polygons than in drawing points and lines. We can use a similar technique, but it works best only if the polygons are drawn in order of their distance from the viewer, with close polygons first and distant polygons last. Sorting the polygons, unfortunately, is a non-trivial task. Sorting can be omitted, but the results are not so good. Execute the code

glEnable(GL_POLYGON_SMOOTH);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA_SATURATE, GL_ONE);

before drawing polygons.

Jittering  Another way of reducing aliasing effects is to render the scene several times in slightly different places. The movements should be very small, usually less than one pixel, and the technique is called jittering.

As usual, OpenGL coding involves trade-offs. An antialiased scene will look better than a scene with aliasing effects, but it will take longer to render. If jittering is used, it may take much longer to render.


5.8.4 Picking

A question that is often asked is "how do I select an object with the mouse?" In general, this is hard to do, because the coordinates we use to draw the model have a complicated relationship with the coordinates of the mouse relative to the window. Furthermore, in a 3D scene, there may be several objects that correspond to a single mouse position, one being behind another. The solution that OpenGL provides is a special rendering mode. Here is what happens:

• The user provides each object of interest with a "name" (in fact, the name is an integer).

• The user defines a small region of the screen. Typically, this is a rectangle that includes the mouse position and a few pixels either side.

• OpenGL then renders the scene in selection mode. Whenever a named object is displayed inside the selected region, a "hit record", containing its name and some other information, is added to the "selection list".

• The user examines the hit records to determine the object — or objects — selected.
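A sketch of the skeleton is shown below; the names mouseX, mouseY, drawSceneWithNames, and the projection parameters are assumptions, while the GL and GLU calls are the standard selection-mode functions.

    void pick(int mouseX, int mouseY)    /* hypothetical mouse-callback helper */
    {
        GLuint selectBuf[512];
        GLint viewport[4];

        glGetIntegerv(GL_VIEWPORT, viewport);
        glSelectBuffer(512, selectBuf);          /* where hit records go */
        glRenderMode(GL_SELECT);                 /* enter selection mode */

        glInitNames();
        glPushName(0);                           /* room on the name stack */

        glMatrixMode(GL_PROJECTION);
        glPushMatrix();
        glLoadIdentity();
        gluPickMatrix(mouseX, viewport[3] - mouseY,   /* region around mouse */
                      5, 5, viewport);
        gluPerspective(60, 1, 1, 100);           /* must match the normal projection */
        glMatrixMode(GL_MODELVIEW);

        drawSceneWithNames();                    /* calls glLoadName(n) per object */

        glMatrixMode(GL_PROJECTION);
        glPopMatrix();
        glMatrixMode(GL_MODELVIEW);

        int hits = glRenderMode(GL_RENDER);      /* leave selection mode */
        /* each hit record in selectBuf: name count, min depth, max depth, names */
    }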

5.8.5 Error Handling

OpenGL never prints a message, raises an exception, or causes your program to crash. However, that doesn't mean that nothing can go wrong: many operations cause errors and it is your responsibility to discover that they have occurred. If you are using an OpenGL feature and it doesn't seem to be working properly, or is not working at all, it is possible that you have performed an invalid operation. To find out, call glGetError with no arguments: the value returned is the current error code. You can pass the value to gluErrorString to obtain an intelligible message:

GLenum error = glGetError();
if (error != GL_NO_ERROR)
    cout << "GL error: " << gluErrorString(error) << endl;
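More than one error may be pending, so it is worth draining the error flags in a loop. A small helper (a sketch; the name checkGL is an assumption):

    void checkGL(const char * where)
    {
        GLenum error;
        while ((error = glGetError()) != GL_NO_ERROR)   // several may be queued
            cout << where << ": " << gluErrorString(error) << endl;
    }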

5.9 Program Development

Most of the techniques for program development work for graphics programs but there are a few additional points to note.

• Incremental development works best. Start by getting something simple in the graphics window and then work on refining it.

• The "blank window" problem occurs often. To avoid it, start with a simple model at the origin and straightforward projections. When you can see something, elaborate the program in small steps.

• Don't do everything at once. Get the shapes right before working on colour and lighting.

• Use "graphical debugging". For example, a function that draws a set of axes (e.g., as three coloured lines) can be very useful for finding out where you are in the model (see the sketch after this list).

• During the early stages, having a mouse function that applies simple movements — for example, rotations about two axes — can be very helpful.
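As an illustration of graphical debugging, an axis-drawing function might look like the following sketch (the name drawAxes is an assumption; disable lighting before calling it, or the colours will not appear):

    void drawAxes(GLfloat len)
    {
        glBegin(GL_LINES);
        glColor3f(1, 0, 0);                          // X axis in red
        glVertex3f(0, 0, 0);  glVertex3f(len, 0, 0);
        glColor3f(0, 1, 0);                          // Y axis in green
        glVertex3f(0, 0, 0);  glVertex3f(0, len, 0);
        glColor3f(0, 0, 1);                          // Z axis in blue
        glVertex3f(0, 0, 0);  glVertex3f(0, 0, len);
        glEnd();
    }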


6 Organization of a Graphics System

6.1 The Graphics Pipeline

The path from vertex definition to visible pixel is long and complicated. When we are using a high-level API such as OpenGL, it is not necessary to understand all of the details of this path. Nevertheless, a rough idea of the process is helpful because it enables us to avoid obvious mistakes in graphics programming.

There is a diagram called The OpenGL Machine that describes the way in which OpenGL processes data. A particular version of OpenGL does not have to implement the machine precisely, but it must provide the same effect. You can obtain a diagram of the OpenGL Machine either directly from www.3dlabs.com/support/developer/state.pdf or from the course web page (in either case you will have to magnify it to make it readable).

The pipeline has two inputs: vertexes and pixels. Most of the information is typically in the form of vertexes; pixels are used for special-purpose applications such as displaying text in particular positions in the viewing window.

A unit of geometric data is called a primitive and consists of an object type and a list of vertexes. For example, a triangle object has three vertexes and a quad object has four vertexes. Vertex data includes:

• 4D coordinates, (x, y, z, w)

• normal vector

• texture coordinates

• colour values, (r, g, b, a) (or a colour index)

• material properties

• edge-flag data

The data associated with a vertex is assembled by a call to glVertex. This implies that all of the information that OpenGL needs for a vertex must be established before the call to glVertex. All of the data except the (x, y, z, w) coordinates have default values, and OpenGL will use these if you have not provided any information. The default values are usually fairly sensible, and this enables you to get a rough picture with a minimum of work and refine it afterwards.
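For example, in the following sketch each call to glVertex3f captures the normal, texture coordinates, and colour most recently specified; the second and third vertexes reuse the normal and colour unchanged:

    glBegin(GL_TRIANGLES);
    glNormal3f(0, 0, 1);        // establish state first...
    glTexCoord2f(0, 0);
    glColor3f(1, 0, 0);
    glVertex3f(0, 0, 0);        // ...then the vertex captures it
    glTexCoord2f(1, 0);
    glVertex3f(1, 0, 0);        // same normal and colour as before
    glTexCoord2f(0, 1);
    glVertex3f(0, 1, 0);
    glEnd();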

6.1.1 Per Vertex Operations

A number of operations are performed on each vertex: these are called per vertex operations and they are important because the time that OpenGL requires to render a scene is the product of the time taken to process one vertex and the number of vertexes in the scene.

• The position of each vertex is transformed by the model view matrix

• The normal vector at each vertex is transformed by the inverse transpose of the model view matrix

• The normal is renormalized (made a unit vector) if specified by the user

• Texture coordinates are transformed by the texture matrix

• The normal vectors, material properties, and lighting model are used to perform lighting calculations that determine the final colour of the vertex


6.1.2 Primitive Assembly

When all of the vertexes have been processed, the primitives are assembled. At this stage, there is an object — triangle, quad, or general polygon — and information about each of its vertexes. Each object of this kind is called a primitive.

The primitive is clipped by any clipping planes defined by the user. The clipping planes are transformed by the model view matrix and determine whether a primitive is removed, wholly or partially, by the clipping plane. A primitive that is partially clipped may acquire new vertexes: for example, a clipped triangle may become a quadrilateral.

The spatial coordinates of each vertex are then transformed by the projection matrix. A second round of clipping occurs: this time, anything outside the viewing volume is clipped. At this stage, the viewing volume is usually bounded by x = ±1, y = ±1, z = ±1, regardless of the kind of projection (orthogonal or perspective). Again, primitives at the edges of the viewing volume may gain new vertices.

Each primitive has a front face and a back face. Culling is applied, eliminating back face (or front face) data if it is not required.

6.1.3 Rasterization

The primitives are then converted to fragments in a process called rasterization. The graphics window is considered to be a collection of small squares called pixels. For example, a typical window on a high resolution screen might have 800 × 600 = 480,000 pixels. A primitive usually occupies several pixels, but some primitives (for example, small triangles in a part of the model that is distant from the viewer) might occupy only part of a pixel.

If the primitive occupies one pixel or more, the shading model determines how the pixels are coloured. In flat shading, the colour at one vertex determines the colour of all pixels; in smooth shading, the pixel colours are obtained by interpolating between the vertex colours. In all cases, the boundaries between primitives must be considered: a pixel may be part of more than one primitive, in which case its colour must be averaged.

6.1.4 Pixel Operations

Meanwhile, some data has been specified directly in terms of pixels. Such data is usually in the wrong format for OpenGL and must be packed, unpacked, realigned, or otherwise converted. Pixel data is then rasterized and converted into fragments and combined with vertex data.

6.1.5 Fragment Operations

Several further operations are performed on the fragments:

• Texture data is mapped from the texture image source to the fragment

• If fog is enabled, it is applied to the fragment

• Antialiasing may be applied to reduce the jaggedness of lines and boundaries

• Scissor tests that may exclude part of the viewing window are applied

• Alpha computations are performed for overlapping fragments

• Stencils are applied to eliminate some parts of the view


• The depth-buffer test is applied to choose the closest fragment at each pixel

• Blending, dithering, and logical operations are performed

• Colour masking is applied, if necessary, to reduce the amount of colour information to the number of bits provided by the frame buffer

• The pixels for the fragment are written to the frame buffer

6.2 Rasterization

Rasterization, which is the process of turning the scene into pixels, is not a particularly glamorous part of graphics programming but it is nonetheless one of the most important. However many strange and wonderful effects your graphics engine can create, they will all be spoiled by poor rasterization.

It is worth noting that, at this level, operations may be performed by either software or hardware. One of the factors which distinguishes a high-performance graphics workstation from a simple PC is that the workstation has more sophisticated hardware. For example, the following quotation is taken from a description of the new Silicon Graphics Onyx 3000 workstation:

    The new graphics system is built on the shared memory SGI NUMAflex architecture of the SGI Onyx 3000 series systems, which allows it to deliver industry-leading interactive graphics performance of up to 283 million triangles per second of sustained performance and 7.7 billion pixels per second.

A "triangle" in this context means that the system can take three vertexes and produce a smooth-shaded triangle in 3.5 nanoseconds.

Consider the edge between two fragments. Mathematically, it is a perfectly straight line with no width. In practice, it is formed of pixels. A pixel either belongs to one fragment or the other, or is shared by both fragments. The pixel must be coloured accordingly; if the colour is wrong, our extremely sensitive eyes will detect imperfections in the image, even if we cannot see precisely what causes them.

It follows that the low-level primitives of the graphics system must be extremely robust and reliable. For example, the pixels that form a line must not depend on the direction in which we draw the line. Furthermore, operations at this level must be extremely efficient, because they are used very frequently.

Finally, at this stage, we are working with discrete entities (pixels), not smoothly changing quantities. Not only are the pixels themselves discrete, but colour is quantized into a fixed and limited number of bits. Consequently, the best algorithms will avoid floating-point calculations as much as possible and work with integers only.

We will consider just two of the problems that arise during rasterization: drawing a straight line (which is fundamental) and drawing a circle (which is useful but not quite so fundamental).

6.2.1 Drawing a Straight Line

The midpoint algorithm scan converts a straight line using only integer addition. Bresenham (1965) had the original idea, Pitteway (1967) gave the midpoint formulation, and the version given here is due to Van Aken (1984).


The problem is to scan convert a line with end points (x0, y0) and (x1, y1). We assume that the end points have integer coordinates (that is, they lie on the pixel grid). Let

    dx = x1 - x0
    dy = y1 - y0

We assume that dx > 0, dy > 0, and dy/dx < 1. (Note that dx and dy are integers, not differentials.) Since the slope is less than one, we will need a pixel at every X-ordinate. The problem is to choose the Y coordinates of the pixels.

Assume that we have plotted a pixel at (xp, yp). We have two choices for the pixel at xp + 1: it should be at either H or L in the diagram below. Suppose that M is midway between L and H. If the line passes below M, we plot pixel L; if the line passes above M (as in the diagram), we plot pixel H.

[Diagram: candidate pixels L at (xp + 1, yp) and H at (xp + 1, yp + 1); the midpoint M lies between them, and the line shown passes above M]

The equation of the line is

    y = x (dy/dx) + B                                    (1)

where B = y0 - x0 (dy/dx) is the intercept with the axis x = 0 (we do not actually need B in the subsequent calculations). We can rewrite (1) as x dy - y dx + B dx = 0 and we define

    F(x, y) ≡ x dy - y dx + B dx.                        (2)

If P is a point on the line, clearly F(P) = 0. We can also show that

    F(P) < 0, if P is above the line;
    F(P) > 0, if P is below the line.

Consequently, we can use F to decide which pixel to plot. If F(M) < 0, then M is above the line and we plot L; if F(M) > 0, then M is below the line and we plot H.

We can easily compute F(M), using definition (2), as

    F(M) = (xp + 1) dy - (yp + 1/2) dx + B dx.

What happens next? Suppose we plot pixel L. Then the next midpoint, M', is one step "east" of M at (xp + 2, yp + 1/2). We have

    F(M') = F(xp + 2, yp + 1/2)
          = (xp + 2) dy - (yp + 1/2) dx + B dx
          = F(M) + dy.


If, instead, we plot pixel H, the next midpoint is one step "northeast" of M at (xp + 2, yp + 3/2), as in the diagram. In this case,

    F(M') = F(xp + 2, yp + 3/2)
          = (xp + 2) dy - (yp + 3/2) dx + B dx
          = F(M) + dy - dx.

Using these results, we need to compute d = F(M) only once, during initialization. For the first point on the line, xp = x0 and yp = y0. Consequently (using B dx = y0 dx - x0 dy):

    F(M) = (xp + 1) dy - (yp + 1/2) dx + B dx
         = (x0 + 1) dy - (y0 + 1/2) dx + y0 dx - x0 dy
         = dy - (1/2) dx

In subsequent iterations, we: increment x; if d < 0, add dy to d; otherwise, increment y and add dy - dx to d.

There are three points to note:

• In the last step, dy < dx, and so dy - dx < 0 and d gets smaller.

• We have implicitly dealt with the case F(M) = 0 in the same way as F(M) > 0. In some situations, we might need a more careful choice.

• The algorithm still has fractions (with denominator 2). Since we need only the sign of F, not its value, we can use 2F(M) instead of F(M).

Figure 33 shows a simple C version of the algorithm. Remember that this handles only the case 0 < dy < dx: a complete function would have code for all cases.

6.2.2 Drawing a Circle

We can use the ideas that we used to draw a straight line to draw a circle. First, we use symmetry to reduce the amount of work eightfold. Assume that the centre of the circle is at the origin (0, 0). If (x, y) is a point on the circle, then the following seven points are also on the circle: (-x, y), (x, -y), (-x, -y), (y, x), (-y, x), (y, -x), and (-y, -x).

The equation of a circle with radius R and centre at the origin is x^2 + y^2 = R^2. Let

    F(x, y) = x^2 + y^2 - R^2.

Then, for any point P:

    F(P) < 0, if P is inside the circle;
    F(P) = 0, if P is on the circle; and
    F(P) > 0, if P is outside the circle.


void line (int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0;
    int dy = y1 - y0;
    int d = 2 * dy - dx;
    int L = 2 * dy;
    int H = 2 * (dy - dx);
    int x = x0;
    int y = y0;
    for (; x < x1; x++)
    {
        pixel(x, y);
        if (d < 0)
            d += L;
        else
        {
            d += H;
            y++;
        }
    }
    pixel(x1, y1);
}

Figure 33: A C function for lines with slope less than 1.

[Diagram: pixel (xp, yp) with candidate pixels H at (xp + 1, yp) and L at (xp + 1, yp - 1); the midpoint M is at (xp + 1, yp - 1/2), and the next midpoints are M1 and M2]

Figure 34: Drawing a circle


Assume we have plotted a pixel at (xp, yp) (see Figure 34). The decision variable d is given by

    d = F(xp + 1, yp - 1/2)
      = (xp + 1)^2 + (yp - 1/2)^2 - R^2

If d ≥ 0, as in Figure 34, we plot L, the next decision point is M1, and

    d' = F(xp + 2, yp - 3/2)
       = (xp + 2)^2 + (yp - 3/2)^2 - R^2
       = d + 2xp - 2yp + 5.

If d < 0, we plot H, the next decision point is M2, and

    d' = F(xp + 2, yp - 1/2)
       = (xp + 2)^2 + (yp - 1/2)^2 - R^2
       = d + 2xp + 3.

For the first pixel, x0 = 0, y0 = R, and

    M = (x0 + 1, R - 1/2)
      = (1, R - 1/2),

and

    F(M) = F(1, R - 1/2)
         = 1^2 + (R - 1/2)^2 - R^2
         = 5/4 - R.

From this algorithm, it is straightforward to derive the code shown in Figure 35. For each coordinate computed by the algorithm, the function circlepoints in Figure 36 plots eight pixels at the points of symmetry.

6.2.3 Clipping

Clipping means removing part or all of an object because it is invisible. The simplest example is a straight line joining two points A and B. The easy cases are when A and B are both inside or both outside the window; the harder case is when one point is inside and the other is outside, because we have to find out where the line meets the edge of the window and "clip" it at that point.

We consider just one clipping technique: the Sutherland-Hodgman polygon-clipping algorithm. The general algorithm can clip a polygon in 3D against a polyhedral volume defined by planes. We will consider the simple case of clipping against planes parallel to the principal axes, such as the boundaries of the viewing volume (VV).

The algorithm moves around the polygon, considering the vertexes one at a time, and adding vertexes to its output. We assume that it has processed vertex u and is moving to vertex v. The following cases must be considered:


void circle (int radius)
{
    int x = 0;
    int y = radius;
    double d = 1.25 - radius;
    circlepoints(x, y);
    while (y > x)
    {
        if (d < 0)
            d += 2.0 * x + 3.0;
        else
        {
            d += 2.0 * (x - y) + 5.0;
            y--;
        }
        x++;
        circlepoints(x, y);
    }
}

Figure 35: Computing points in the first octant

void circlepoints (int x, int y)
{
    pixel(x, y);
    pixel(-x, y);
    pixel(x, -y);
    pixel(-x, -y);
    pixel(y, x);
    pixel(-y, x);
    pixel(y, -x);
    pixel(-y, -x);
}

Figure 36: Plotting eight symmetrical points

1. u and v are both inside the VV: output the vertex v.

2. u is inside the VV and v is outside the VV: find the intersection w of the line uv with the VV and output w.

3. u and v are both outside the VV: no output.

4. u is outside the VV and v is inside: find the intersection w of the line uv with the VV; output w and then v.

Sutherland and Hodgman showed how to implement this algorithm recursively so that it can be performed in hardware without intermediate storage. Figure 37 gives C-like pseudocode for the algorithm.


void PolyClip(Vertex inVertexArray[],
              Vertex outVertexArray[],
              int inLength,
              int & outLength,
              Edge clipBoundary)
{
    Vertex s, p, i;
    int j;
    outLength = 0;
    s = inVertexArray[inLength - 1];
    for (j = 0; j < inLength; j++)
    {
        p = inVertexArray[j];
        if (Inside(p, clipBoundary))
        {
            if (Inside(s, clipBoundary))
                Output(p);
            else
            {
                i = Intersect(s, p, clipBoundary);
                Output(i);
                Output(p);
            }
        }
        else
            if (Inside(s, clipBoundary))
            {
                i = Intersect(s, p, clipBoundary);
                Output(i);
            }
        s = p;
    }
}

Figure 37: Sutherland-Hodgman Polygon Clipping

Note that PolyClip is called four times: once for each side of the enclosing rectangle.

In Figure 37, Output(p) is a kind of macro with the effect:

outVertexArray[outLength++] = p;

The function Inside checks whether the given point is inside the clipping boundary and returns true if it is. The function Intersect returns the vertex where the line joining the two given vertexes crosses the clipping boundary.


[Diagram: the rectangle ABCD sits in the central region of a 3 × 3 grid; each of the nine regions is labelled with a four-bit code]

    1001   1000   1010
    0001   0000   0010
    0101   0100   0110

Figure 38: Labelling the regions

To make algorithms like this one efficient, it is important that low-level operations such as Inside and Intersect are performed quickly. For example, we need a fast way of deciding whether a line is partly or fully outside the VV. Here is one way of doing this for rectangles. We extend the rectangle ABCD to define nine regions, as shown in Figure 38.

Suppose that the bottom left corner of the rectangle ABCD is at (Xmin, Ymin) and the top right hand corner is at (Xmax, Ymax). Then the four coding bits have the following values:

• Bit 1: y > Ymax

• Bit 2: y < Ymin

• Bit 3: x > Xmax

• Bit 4: x < Xmin

We can assign these bits quickly. Now assume that we have assigned bits to the end points of a line, giving values A and B.

• If A = B = 0, the line is within the rectangle.

• If A & B ≠ 0, the line is entirely outside the rectangle.

In the other cases, the line may be partly inside and partly outside the rectangle. Note, however, that the bit values tell us which edges it crosses; this information speeds up the calculation of the point of intersection.
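A sketch of the bit assignment in C (the Point type and the Xmin, Xmax, Ymin, Ymax globals are assumptions, not from the notes):

    typedef struct { double x, y; } Point;
    double Xmin, Xmax, Ymin, Ymax;      /* the rectangle ABCD */

    unsigned outcode(Point p)
    {
        unsigned code = 0;
        if (p.y > Ymax) code |= 8;      /* bit 1 */
        if (p.y < Ymin) code |= 4;      /* bit 2 */
        if (p.x > Xmax) code |= 2;      /* bit 3 */
        if (p.x < Xmin) code |= 1;      /* bit 4 */
        return code;
    }

    /* Trivial accept: outcode(a) == 0 && outcode(b) == 0.
       Trivial reject: (outcode(a) & outcode(b)) != 0.     */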


7 Transformations — Again

This section provides the mathematical background for 3D graphics systems. We begin by motivating the discussion: what is wrong with "ordinary" 3D Euclidean space?

It turns out that there are several problems:

• In physics and mechanics, we use a coordinate system with a fixed origin. This is convenient for simple problems but, in graphics, the choice of origin is often not obvious. We need a more flexible kind of geometry.

• Graphics requires several kinds of transformations (translation, scaling, rotation, etc.) and projections. Although these can all be performed in 3D Euclidean space, there is no simple and uniform technique that works for all of the transformations that we need.

The appropriate mathematical tools have existed for a long time: they are scalar, vector, and affine spaces.

7.1 Scalar Spaces

A scalar space, also known as a field, is a set of values and operations on those values. There are two special values, the zero and the unit, and the operations are addition, subtraction, multiplication, and division. Although there are many scalar spaces, we need only one: the real numbers. In this section, we will use lower case Greek letters (α, β, γ, ...) to denote real numbers and R to denote the set of all real numbers. Naturally, the zero of this system is 0 and the unit is 1.

7.2 Vector Spaces

A vector space is a collection of vectors that satisfies a set of axioms. We write vectors as bold face Roman letters such as u, v, w, etc. The axioms of a vector space are as follows; we denote the vector space itself by V.

• There is a zero vector that we will write as 0.

• Vectors can be multiplied by scalars. If α ∈ R and v ∈ V then α v ∈ V.

• Vectors can be added and subtracted. If u ∈ V and v ∈ V then u + v ∈ V and u - v ∈ V.

• The following identities hold for all α, β ∈ R and u, v ∈ V:

    0 v = 0                          (3)
    1 v = v                          (4)
    (α β) v = α (β v)                (5)
    (α + β) v = α v + β v            (6)
    α (u + v) = α u + α v            (7)

There are a number of important properties of vector spaces that we will not explore here because they can be found in any book on linear algebra. In particular, a basis for a vector space V is a set of vectors u1, u2, ..., un such that every vector v ∈ V can be put in the form α1 u1 + α2 u2 + ··· + αn un. The vector space has dimension d if it has no basis consisting of fewer than d vectors.


Vector spaces have a particular vector, 0, that plays a special role. Affine spaces, which we discuss next, are a way of avoiding the existence of an element with special properties.

Example  The standard model for a vector space is the set of n-tuples of real numbers. A vector (v1, v2, ..., vn) is a tuple with n real components. This vector space is usually denoted by R^n.

For example, R^2 consists of pairs like (x, y). The zero vector 0 is represented by (0, 0). Multiplication by a scalar, and addition and subtraction of vectors are defined by

    α (x, y) = (α x, α y)
    (x, y) + (x', y') = (x + x', y + y')
    (x, y) - (x', y') = (x - x', y - y')

7.3 Affine Spaces

An affine space consists of: a set of points; an associated vector space; and two operations (in addition to the operations of the vector space). The two operations are difference of points (giving a vector) and addition of a vector and a point (giving a point). In the following formal definitions, we write P for the set of points and V for the associated vector space.

• If P ∈ P and Q ∈ P then P - Q ∈ V.

• If P ∈ P and v ∈ V then P + v ∈ P.

The affine operations must satisfy certain properties:

    P - P = 0                                      (8)
    (P + u) + v = P + (u + v)                      (9)
    (P - Q) + v = (P + v) - Q                      (10)
    P + v = P if and only if v = 0                 (11)

Consider the expression L ≡ P + α (Q - P). We note first of all that this expression is well-formed: since P and Q are points, Q - P is a vector. We can multiply the vector Q - P by a scalar, α, obtaining the vector α (Q - P), which we can add to the point P. If we think of P and Q as fixed points and α as a variable scalar, then L corresponds to a set of points. This set includes the points P and Q; clearly, when α = 0, we have L ≡ P. Less obviously, when α = 1, we have L ≡ P + (Q - P) = Q.

We define { P + α (Q - P) | α ∈ R } to be the line joining the points P and Q. Similarly, we define

    { R + β ((P + α (Q - P)) - R) | α ∈ R, β ∈ R }

to be the plane through the points P, Q, and R.

Two lines, L ≡ P + α (Q - P) and L' ≡ P' + α (Q' - P'), are parallel if Q - P = Q' - P' (vector equality).

The expression α P + β Q is currently undefined, because we cannot multiply a point by a scalar. If α + β = 1, however, we define this expression to mean P + β (Q - P) and we refer to α P + β Q as an affine combination.


                            Euclidean   similarity   affine   projective
Transformations
  rotation                      •           •           •          •
  translation                   •           •           •          •
  uniform scaling                           •           •          •
  nonuniform scaling                                    •          •
  shear                                                 •          •
  perspective projection                                           •
Invariants
  length                        •
  angle                         •           •
  ratio of lengths              •           •
  parallelism                   •           •           •
  incidence                     •           •           •          •
  cross-ratio                   •           •           •          •

Figure 39: Varieties of Transformation (adapted from Birchfield (1998))

Example  A typical member of R^4 is the 4-tuple (x, y, z, w). Define a point to be a member of R^4 with the form (x, y, z, 1). Then the difference of two points (computed by vector subtraction in R^4) is

    (x, y, z, 1) - (x', y', z', 1) = (x - x', y - y', z - z', 0)

If we interpret (x - x', y - y', z - z', 0) as a vector in R^3, then the set of points is an affine space with R^3 as its associated vector space. It is easy to see that the axioms (8)–(11) are satisfied.

The space we have defined is called the standard affine 3-space in R. We will use this affine space for the rest of this section, referring to it as S. The four coordinates used to describe a point in S are called homogeneous coordinates.

We can take an arbitrary member of R^4, such as (x, y, z, w), and transform it to the point (x/w, y/w, z/w, 1) in S; this transformation is called homogenization. We can perform calculations in R^4 but, before we interpret the results, we must homogenize all of the points.

7.4 Transformations

There are many kinds of transformations. Transformations are classified according to the properties that they preserve. For example, a rotation is a Euclidean transformation because it does not distort objects in Euclidean space, but a projection is non-Euclidean because it loses information about one dimension and distorts objects in the other dimensions. An important feature of a transformation is the properties that it preserves: these are called the invariants of the transformation. Figure 39 shows various kinds of transformations and their properties.

An affine transformation is a function that maps every point of an affine space to another point in the same affine space. An affine transformation T must satisfy certain properties:


• Affine combinations are preserved:

    T(α P + β Q) = α T(P) + β T(Q)

• If L is a line, then T(L) is a line. (Note that the line L is a set of points; T(L) is an abbreviation for { T(P) | P ∈ L }.)

• If L ∥ L' (L and L' are parallel lines) then T(L) ∥ T(L').

• If M is a plane, then T(M) is a plane.

• If M ∥ M' (M and M' are parallel planes) then T(M) ∥ T(M').

There is a convenient representation of affine transformations on S: we can use 4 × 4 matrices of the form

        [ ·  ·  ·  · ]
    M = [ ·  ·  ·  · ]
        [ ·  ·  ·  · ]
        [ 0  0  0  1 ]

where the dots indicate any value. To transform a point, we treat the point as a 4 × 1 matrix and premultiply it by the transformation matrix.
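As a concrete sketch (not part of the notes), applying a 4 × 4 matrix to a point stored as a 4 × 1 column, followed by homogenization:

    typedef double Mat4[4][4];
    typedef double Vec4[4];

    /* q = M * p, premultiplying the point by the matrix */
    void transform(const Mat4 M, const Vec4 p, Vec4 q)
    {
        for (int i = 0; i < 4; i++)
            q[i] = M[i][0] * p[0] + M[i][1] * p[1]
                 + M[i][2] * p[2] + M[i][3] * p[3];
    }

    /* divide through by w to recover a point of the form (x, y, z, 1) */
    void homogenize(Vec4 p)
    {
        for (int i = 0; i < 4; i++)
            p[i] /= p[3];
    }

For an affine matrix, whose bottom row is [0, 0, 0, 1], the w component remains 1 and homogenization changes nothing; it matters for the non-affine matrices of Section 7.5.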

7.4.1 Translation

A translation transforms the point (x, y, z, 1) to (x + a, y + b, z + c, 1). The matrix and its effect are described by the following equation:

    [ 1  0  0  a ]   [ x ]   [ x + a ]
    [ 0  1  0  b ] · [ y ] = [ y + b ]
    [ 0  0  1  c ]   [ z ]   [ z + c ]
    [ 0  0  0  1 ]   [ 1 ]   [   1   ]

7.4.2 Scaling

A scaling transformation scales the coordinates of the point by given factors along each of the principal axes. The matrix and its effect are described by the following equation:

    [ r  0  0  0 ]   [ x ]   [ r x ]
    [ 0  s  0  0 ] · [ y ] = [ s y ]
    [ 0  0  t  0 ]   [ z ]   [ t z ]
    [ 0  0  0  1 ]   [ 1 ]   [  1  ]

7.4.3 Rotation

Rotations about the principal axes are defined by the equations below. We assume that the rotations are counter-clockwise in a right-handed coordinate system. (To visualize a right-handed coordinate system, extend the thumb and first two fingers of your right hand so they are roughly at right-angles to one another. Then your thumb points along the X-axis, your first finger points along the Y-axis, and your second finger points along the Z-axis.)

    Rx = [ 1     0       0      0 ]
         [ 0   cos θ   -sin θ   0 ]
         [ 0   sin θ    cos θ   0 ]
         [ 0     0       0      1 ]

    Ry = [  cos θ   0   sin θ   0 ]
         [    0     1     0     0 ]
         [ -sin θ   0   cos θ   0 ]
         [    0     0     0     1 ]

    Rz = [ cos θ   -sin θ   0   0 ]
         [ sin θ    cos θ   0   0 ]
         [   0        0     1   0 ]
         [   0        0     0   1 ]

A general rotation through an angle θ about an axis in the direction of the unit vector u = (ux, uy, uz) is given by the matrix

    Ru(θ) = [ c + (1-c) ux^2        (1-c) uy ux - s uz    (1-c) uz ux + s uy    0 ]
            [ (1-c) ux uy + s uz    c + (1-c) uy^2        (1-c) uz uy - s ux    0 ]
            [ (1-c) ux uz - s uy    (1-c) uy uz + s ux    c + (1-c) uz^2        0 ]
            [ 0                     0                     0                     1 ]

where s = sin θ and c = cos θ.

7.5 Non-Affine Transformations

We can use matrices to define transformations that are not affine but are nevertheless useful in graphics. The bottom row of such a matrix is not [0, 0, 0, 1] and, in fact, we can use this fact to achieve interesting results.

7.5.1 Perspective Transformations

The first non-affine transformation that we will consider is the perspective transformation. During the middle ages, the concept of perspective was developed by imagining a ray of light passing through a window; the ray originates at a point in the scene and arrives at the painter's eye. The point at which it passes through the window is the point where that part of the scene should be painted.

Figure 40 shows the origin at O, with the Z axis extending to the right and the Y axis extending upwards (the X axis, which comes out of the paper towards you, is not shown). The "window" on which the scene is to be projected is at z = -n. The point in the scene is at P, and its projection P' is obtained by drawing a line (corresponding to a light ray) from P to O. By similar triangles,

    y' = -y n / z


[Diagram: the origin O with the Z axis horizontal and the Y axis vertical; the projection plane at distance n from O; the point P at depth z and height y, and its projection P' at height y' on the plane]

Figure 40: Perspective

and, in the XZ plane,

    x' = -x n / z

It might appear that we cannot achieve a transformation of this kind with a matrix, because matrix transformations are supposed to be linear. But consider the following equation:

    [ n  0   0  0 ]   [ x ]   [ n x ]
    [ 0  n   0  0 ] · [ y ] = [ n y ]
    [ 0  0   0  0 ]   [ z ]   [  0  ]
    [ 0  0  -1  0 ]   [ 1 ]   [ -z  ]

If we homogenize the transformed point by dividing each of its components by -z, we obtain P' = (-x n/z, -y n/z, 0, 1), with the same X and Y values as above.

A transformation, as explained above, is a mapping from a space into the same space. A projection is a mapping from a space to a space with fewer dimensions. If we discard the last two components of P', we obtain the projection

    (x, y, z, 1) ↦ (-x n/z, -y n/z)

We have mapped the point P in S to a point in a two-dimensional plane using a perspective transformation followed by a projection.

In practical situations, it is useful to have a value for the Z coordinate: for example, we can use this value for depth buffer comparisons. To obtain a suitable value for Z, we apply the following transformation

    [ n  0   0  0 ]   [ x ]   [   n x   ]
    [ 0  n   0  0 ] · [ y ] = [   n y   ]
    [ 0  0   a  b ]   [ z ]   [ a z + b ]
    [ 0  0  -1  0 ]   [ 1 ]   [   -z    ]

After homogenization, we obtain the 3D point

    ( -x n/z, -y n/z, -(a z + b)/z ).

In this expression, the Z value is called the pseudodepth. It increases with distance, and can be used for depth buffer comparisons.


For clipping purposes, it is convenient to restrict the pseudodepth values to the range ±1, as in the other directions. Assume the near and far planes are at z = -n and z = -f respectively. Then we have

    -(a (-n) + b) / (-n) = -1
    -(a (-f) + b) / (-f) = 1

and therefore

    a = (n + f) / (n - f)
    b = 2 n f / (n - f)

and the perspective transformation that OpenGL uses is indeed

    [ n  0        0                 0           ]
    [ 0  n        0                 0           ]
    [ 0  0  (n + f)/(n - f)   2 n f/(n - f)     ]
    [ 0  0       -1                 0           ]

The pseudodepth is

    z' = -(a z + b) / z
       = -((n + f)/(n - f) z + 2 n f/(n - f)) / z
       = -(z (n + f) + 2 f n) / (z (n - f)).
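A quick numerical check of these formulas (a sketch; the near and far values are arbitrary) confirms that the pseudodepth is -1 at z = -n, +1 at z = -f, and increases with distance:

    #include <stdio.h>

    int main(void)
    {
        double n = 1.0, f = 100.0;
        double a = (n + f) / (n - f);
        double b = 2 * n * f / (n - f);
        for (double z = -1.0; z >= -100.0; z *= 10.0)   /* z = -1, -10, -100 */
            printf("z = %7.1f   pseudodepth = %+f\n", z, -(a * z + b) / z);
        return 0;
    }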

7.5.2 Shadows

Suppose that there is a light source at L, a point at P, and a plane given by the equation A x + B y + C z + D = 0. The point P will cast a shadow on the plane. To find the position of the shadow, we have to find where the line through L and P meets the plane. The parametric equation of a line through L and P is given by:

    Qx = Lx + (Px - Lx) t        (12)
    Qy = Ly + (Py - Ly) t        (13)
    Qz = Lz + (Pz - Lz) t        (14)

To find where this line meets the plane, we solve the equation

    A Qx + B Qy + C Qz + D = 0

for t. The result is

    t = -(A Lx + B Ly + C Lz + D) / (A Px - A Lx + B Py - B Ly + C Pz - C Lz)
      = (A Lx + B Ly + C Lz + D) / ((A Lx + B Ly + C Lz) - (A Px + B Py + C Pz))


Substituting this value into (12)–(14) gives the coordinates for the shadow point as

    Qx = -(Lx (B Py + C Pz + D) - Px (B Ly + C Lz + D)) / ((A Lx + B Ly + C Lz) - (A Px + B Py + C Pz))

    Qy = -(Ly (A Px + C Pz + D) - Py (A Lx + C Lz + D)) / ((A Lx + B Ly + C Lz) - (A Px + B Py + C Pz))

    Qz = -(Lz (A Px + B Py + D) - Pz (A Lx + B Ly + D)) / ((A Lx + B Ly + C Lz) - (A Px + B Py + C Pz))

The matrix that takes a point P onto its projection Q on the plane is

    [ -B Ly - C Lz - D    Lx B                Lx C                Lx D                ]
    [ Ly A                -A Lx - C Lz - D    Ly C                Ly D                ]
    [ Lz A                Lz B                -A Lx - B Ly - D    Lz D                ]
    [ A                   B                   C                   -A Lx - B Ly - C Lz ]

As an example, we can choose y = 0 as the plane and (0, 1, 0) as the position of the light source. Then B = 1, A = C = D = 0, Lx = 0, Ly = 1, and Lz = 0. The matrix is

    [ -1  0   0   0 ]
    [  0  0   0   0 ]
    [  0  0  -1   0 ]
    [  0  1   0  -1 ]

If we use this matrix to transform the point (x, y, z, 1), we obtain (-x, 0, -z, y - 1). After homogenization, this point becomes

    ( x/(1 - y), 0, z/(1 - y) ).

We can see that this is correct for points with 0 ≤ y < 1. First, note that the shadow lies entirely in the plane y = 0. Next, points with y = 0 transform to themselves, because they are on the plane and so is their shadow. Points on the plane y = 1/2 have their coordinates doubled. As a point moves closer to the plane y = 1, its shadow approaches infinity.
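To render the shadow with OpenGL, the matrix can be multiplied into the model view matrix before drawing the object a second time. The sketch below uses the matrix derived above; note that glMultMatrixf expects column-major order, so the matrix is stored column by column (drawObject and setMaterial(DARK) are assumptions):

    /* Shadow matrix for light (0, 1, 0) and plane y = 0, column-major */
    GLfloat shadowMat[16] = {
        -1,  0,  0,  0,     /* column 1 */
         0,  0,  0,  1,     /* column 2 */
         0,  0, -1,  0,     /* column 3 */
         0,  0,  0, -1      /* column 4 */
    };

    glPushMatrix();
    glMultMatrixf(shadowMat);
    setMaterial(DARK);      /* draw the shadow in a dark colour */
    drawObject();           /* the same object that casts the shadow */
    glPopMatrix();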

7.5.3 Reflection

Again, we consider a point P at (Px, Py, Pz) and the plane A x + B y + C z + D = 0. Here are the parametric equations of a line through P perpendicular to the plane:

    x = Px + A t        (15)
    y = Py + B t        (16)
    z = Pz + C t        (17)

To find where this line meets the plane, we solve these equations together with A x + B y + C z + D = 0 for t, giving

    t0 = -(A Px + B Py + C Pz + D) / (A^2 + B^2 + C^2)        (18)


The reflection Q of P in the plane is the point obtained by substituting t = 2 t0 in equations (15)–(17):

    Qx = Px + 2 A t0
    Qy = Py + 2 B t0
    Qz = Pz + 2 C t0

Using the value of t0 given by (18), we have:

    Qx = Px - 2 A (A Px + B Py + C Pz + D) / Δ
    Qy = Py - 2 B (A Px + B Py + C Pz + D) / Δ
    Qz = Pz - 2 C (A Px + B Py + C Pz + D) / Δ

where Δ = A^2 + B^2 + C^2. The matrix which maps P to Q is

    [ Δ - 2 A^2    -2 A B       -2 A C       -2 A D ]
    [ -2 A B       Δ - 2 B^2    -2 B C       -2 B D ]
    [ -2 A C       -2 B C       Δ - 2 C^2    -2 C D ]
    [ 0            0            0            Δ      ]

As a test of this matrix, we consider reflections in the YZ plane, which has equation x = 0. The coefficients in the plane equation are A = 1 and B = C = D = 0. The following matrix is clearly correct:

    [ -1  0  0  0 ]
    [  0  1  0  0 ]
    [  0  0  1  0 ]
    [  0  0  0  1 ]

7.6 Working with Matrices

Finding the equations for projections is fairly straightforward: it is usually a matter of solving linear equations. The algebra can get a bit heavy but problems of this kind are easily solved by a package such as Maple, Matlab, or Mathematica.

The hard part is converting the solution from a set of equations into a matrix. The packages are not much use here. The following notes may help.

The following equation illustrates the effect of terms in the bottom row of a 4 × 4 transformation matrix:

    [ 1  0  0  0 ]   [ x ]   [ x ]
    [ 0  1  0  0 ] · [ y ] = [ y ]
    [ 0  0  1  0 ]   [ z ]   [ z ]
    [ a  b  c  d ]   [ 1 ]   [ a x + b y + c z + d ]

After normalizing and dropping the fourth coordinate, we obtain the 3D point

    ( x/(a x + b y + c z + d), y/(a x + b y + c z + d), z/(a x + b y + c z + d) ).


That is, entries in the fourth row act as divisors for the coordinates of the output point. The first three columns divide by factors proportional to x, y, and z respectively, and the fourth column can be used to divide by a constant factor.

As usual, entries in the right column correspond to translations:

    [ 1  0  0  r ]   [ x ]   [ x + r ]
    [ 0  1  0  s ] · [ y ] = [ y + s ]
    [ 0  0  1  t ]   [ z ]   [ z + t ]
    [ 0  0  0  1 ]   [ 1 ]   [   1   ]

Thus these entries in the matrix can be used to add constant quantities (that is, quantities independent of x, y, and z) to the output point.

We can apply these ideas to the equations for the shadow point Q derived in Section 7.5.2. The denominator of each coordinate is (A Lx + B Ly + C Lz) - (A Px + B Py + C Pz). From this, we can immediately infer that the matrix must have the form

    [ ·  ·  ·  · ]
    [ ·  ·  ·  · ]
    [ ·  ·  ·  · ]
    [ A  B  C  -A Lx - B Ly - C Lz ]

where the dots indicate values we don't know yet and we have changed the sign because Qx is negative.

Looking at the numerator of

    Qx = -(Lx (B Py + C Pz + D) - Px (B Ly + C Lz + D)) / ((A Lx + B Ly + C Lz) - (A Px + B Py + C Pz))

the Px term tells us that the top-left corner of the matrix must be -(B Ly + C Lz + D). Similarly, the components of Lx (B Py + C Pz + D) give the other components of the first row of the matrix. We now have:

\[
\begin{bmatrix}
-(B L_y + C L_z + D) & L_x B & L_x C & L_x D \\
\cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot \\
A & B & C & -A L_x - B L_y - C L_z
\end{bmatrix}
\]

It is now safe to make a guess about the other entries based on symmetry. Filling them in gives:

\[
\begin{bmatrix}
-B L_y - C L_z - D & L_x B & L_x C & L_x D \\
L_y A & -A L_x - C L_z - D & L_y C & L_y D \\
L_z A & L_z B & -A L_x - B L_y - D & L_z D \\
A & B & C & -A L_x - B L_y - C L_z
\end{bmatrix}
\]

The final step is to see if this works. First, try some very simple examples. For instance, put the light source at (0, 1, 0) and use the plane y = 0. This gives the matrix shown at the end of Section 7.5.2. Then we can try more complicated examples and, finally, try it out with OpenGL.
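In OpenGL, trying it out means multiplying the matrix onto the model-view stack. The following is a minimal sketch, not code from the course: it assumes the column-major order that glMultMatrixf requires, and the names shadowMatrix and drawObject are illustrative.

#include <GL/glut.h>

// Build the shadow matrix for a light at L and the plane
// Ax + By + Cz + D = 0. OpenGL stores matrices column-major,
// so entry (row, col) goes to m[col * 4 + row].
void shadowMatrix(GLfloat m[16], const GLfloat L[3],
                  GLfloat A, GLfloat B, GLfloat C, GLfloat D)
{
    m[0]  = -(B * L[1] + C * L[2] + D);   // row 0
    m[4]  = L[0] * B;
    m[8]  = L[0] * C;
    m[12] = L[0] * D;
    m[1]  = L[1] * A;                     // row 1
    m[5]  = -(A * L[0] + C * L[2] + D);
    m[9]  = L[1] * C;
    m[13] = L[1] * D;
    m[2]  = L[2] * A;                     // row 2
    m[6]  = L[2] * B;
    m[10] = -(A * L[0] + B * L[1] + D);
    m[14] = L[2] * D;
    m[3]  = A;                            // row 3
    m[7]  = B;
    m[11] = C;
    m[15] = -(A * L[0] + B * L[1] + C * L[2]);
}

A typical use draws the object once normally and once flattened onto the plane:

GLfloat m[16];
GLfloat light[3] = { 0, 1, 0 };
shadowMatrix(m, light, 0, 1, 0, 0);   // plane y = 0, light at (0, 1, 0)
glPushMatrix();
glMultMatrixf(m);
drawObject();                         // drawn as a shadow
glPopMatrix();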


8 Rotation

Rotation in three dimensions is quite complicated but is easier to understand in relation to rotation in two dimensions. Consequently, we discuss rotation in general first; then 2D rotation, although much of the material should be revision; and finally 3D rotation.

8.1 Groups

A group G = (S, ∘) is an algebraic structure consisting of a set S and a binary operation ∘ on elements of the set. A group must have the following properties:

Closure: The set S is closed under the operation ∘: if x ∈ S and y ∈ S, then x ∘ y ∈ S.

Associativity: The operation ∘ is associative: for all x, y, z ∈ S, x ∘ (y ∘ z) = (x ∘ y) ∘ z.

Unit: There is a unit element u ∈ S with the property that, for any x ∈ S, x ∘ u = u ∘ x = x.

Inverse: For every element x ∈ S, there is an inverse element y ∈ S such that x ∘ y = u.

Note that the group properties do not include commutativity. In general, x ∘ y ≠ y ∘ x. A group with a commutative operator is called a commutative group or an Abelian group.

We will write x⁻¹ for the inverse of x.

Since S is a set, it has subsets. If the elements of a subset and the group operation form a group H, then H is a subgroup of G. Here is the formal definition of subgroup:

Let G = (S, ∘) be a group and suppose the set T ⊆ S has the following properties (note that we do not need to mention associativity):

Closure: The set T is closed under the group operation ∘ of G: if x ∈ T and y ∈ T, then x ∘ y ∈ T.

Unit: T contains the unit element u of G.

Inverse: If x ∈ T, then x⁻¹ ∈ T.

Then H = (T, ∘) is a subgroup of G.

Groups are often used to model operations on a set of objects. For example, graphics transformations operate on vertex coordinates.

Assume that there is a group G and a set of objects O, and that it is meaningful to apply a member of G to a member of O. If f ∈ G and x ∈ O, we write f(x) for this operation. We require:

Closure: the group operations are closed over O: if p ∈ G and x ∈ O, then p(x) ∈ O;

Unit element: if e is the unit element of G and x ∈ O, then e(x) = x.

A binary relation ∼ on a set S is an equivalence relation iff for all x, y, z ∈ S:

Reflexivity: x ∼ x;

Symmetry: x ∼ y if and only if y ∼ x;

Transitivity: if x ∼ y and y ∼ z, then x ∼ z.


Do not confuse the application of a group element to another group element (p ∘ q ∈ G) with the application of a group element to a member of O (p(x) ∈ O).

The familiar concept of “symmetry” is formally defined in terms of subgroups and equivalence relations.

Lemma: Let ∼ be an equivalence relation on O and let H be the subset of G defined by

p ∈ H ⟺ ∀x ∈ O . p(x) ∼ x.

Then H is a subgroup of G.

Proof: We assume that ∼ is an equivalence relation and show that H is a subgroup of G.

Subset: H is a subset of G by definition.

Unit element: for any x ∈ O:

x ∼ x (reflexivity)
e(x) ∼ x (unit element)
e ∈ H (definition of H)

Closure: Assume p, q ∈ H and q(x) = y and p(y) = z. Then (p ∘ q)(x) = p(y) = z and

x ∼ y (q ∈ H and q(x) = y)
y ∼ z (p ∈ H and p(y) = z)
x ∼ z (transitivity of ∼)
x ∼ (p ∘ q)(x) (by construction of z)
p ∘ q ∈ H (definition of H)

Inverse: Assume p ∈ H and let y = p(x).

p(x) ∼ x (p ∈ H)
y ∼ x (y = p(x))
x ∼ y (symmetry of ∼)
p⁻¹(y) ∼ y (p⁻¹(y) = p⁻¹(p(x)) = x)
p⁻¹ ∈ H (definition of H)

As an example, suppose that the elements of O are images of squares and the elements of G are rotations p_θ. If x ∈ O, then p_θ(x) is the image x rotated by θ degrees, where θ is a whole number satisfying 0 ≤ θ < 360. We define

p_θ ∘ p_φ = p_[θ+φ]
p_θ⁻¹ = p_[−θ]

where [θ] stands for θ adjusted by adding or subtracting multiples of 360 so that 0 ≤ [θ] < 360.

Suppose that x ∼ y if “x looks the same as y”. Since x and y are images of squares, they will look the same if they are rotated through a multiple of 90°. The subgroup of G induced by ∼ consists of the operators p_θ with θ ∈ { 0, 90, 180, 270 }.


8.2 2D Rotation

Rotations in 2D are rotations in a plane about a fixed point. A rotation has one parameter, which is an angle: we rotate through an angle θ.

If we rotate through θ and then through φ, we have rotated through θ + φ. Rotating through 0 has no effect. Rotating through θ and then through −θ also has no effect.

Thus we can summarize the rotation group in 2D as follows:

◦ The elements are angles. We will assume that angles are computed mod 360° — that is, if an angle is greater than 360° (or less than zero), we subtract (or add) 360° to it until it is in the range [0, 360).

◦ The binary operation is addition of angles: given θ and φ, we can calculate θ + φ.

◦ The unit is 0°.

◦ The inverse of θ is −θ.

The rotation group in 2D is called SO2, short for the special orthogonal group in two dimensions. We note that it is a commutative group.

We need a way of calculating how points on the plane are transformed by elements of SO2. With simple coordinate geometry, we can show that, if a point P has coordinates (x, y), then, after rotation through θ, it has coordinates (x′, y′) where

x′ = x cos θ + y sin θ
y′ = −x sin θ + y cos θ
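These formulas translate directly into code. A minimal sketch (the function name is illustrative, not part of any course library):

#include <cmath>

// Rotate the point (x, y) through theta radians about the origin:
//   x' =  x cos(theta) + y sin(theta)
//   y' = -x sin(theta) + y cos(theta)
void rotate2D(double theta, double& x, double& y)
{
    double x1 =  x * std::cos(theta) + y * std::sin(theta);
    double y1 = -x * std::sin(theta) + y * std::cos(theta);
    x = x1;
    y = y1;
}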

8.2.1 Representing 2D Rotations with Matrices

We can use 2 × 2 matrices to represent these rotations. The matrix for a rotation through θ is

\[
\begin{bmatrix}
\cos\theta & \sin\theta \\
-\sin\theta & \cos\theta
\end{bmatrix}
\]

and the unit is obtained by substituting 0 for θ:

\[
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}
\]

The group operation with this representation is matrix multiplication. For example:

\[
\begin{bmatrix}
\cos\theta & \sin\theta \\
-\sin\theta & \cos\theta
\end{bmatrix}
\begin{bmatrix}
\cos\phi & \sin\phi \\
-\sin\phi & \cos\phi
\end{bmatrix}
=
\begin{bmatrix}
\cos\theta\cos\phi - \sin\theta\sin\phi & \cos\theta\sin\phi + \sin\theta\cos\phi \\
-(\sin\theta\cos\phi + \cos\theta\sin\phi) & \cos\theta\cos\phi - \sin\theta\sin\phi
\end{bmatrix}
=
\begin{bmatrix}
\cos(\theta+\phi) & \sin(\theta+\phi) \\
-\sin(\theta+\phi) & \cos(\theta+\phi)
\end{bmatrix}
\]

Although matrix multiplication in general is not commutative, these particular matrices do commute.


8.2.2 Representing 2D Rotations with Complex Numbers

There is an alternative representation for 2D rotations: we can use complex numbers.

A complex number has the form x + iy in which i = √−1. The norm² of a complex number z = x + iy is

‖z‖ = x² + y²

We are interested only in complex numbers z with ‖z‖ = 1. If we write z = x + iy, then x² + y² = 1 and so these numbers lie on the unit circle in the complex plane. We can write all such numbers in the form cos θ + i sin θ for some value 0 ≤ θ < 2π.

If we multiply two numbers on the unit circle, we get another number on the unit circle:

(cos θ + i sin θ)(cos φ + i sin φ) = cos θ cos φ − sin θ sin φ + i(sin θ cos φ + cos θ sin φ)
                                  = cos(θ + φ) + i sin(θ + φ)

Thus the effect of multiplying by cos θ + i sin θ is to rotate through an angle θ. In this system, the rotation group is represented as follows:

◦ Group elements are complex numbers of the form cos θ + i sin θ.

◦ The group operation is complex multiplication.

◦ Complex multiplication is associative.

◦ The unit element is 1 + i 0, corresponding to θ = 0.

◦ The inverse of cos θ + i sin θ is cos θ − i sin θ because

(cos θ + i sin θ)(cos θ − i sin θ) = cos²θ + sin²θ + i(sin θ cos θ − cos θ sin θ) = 1

8.3 3D Rotation

Although there are analogies between 2D rotations and 3D rotations, 3D rotations are considerably more complicated. We first note a couple of non-intuitive trivia involving 3D rotation.

1. Rotations interfere with one another in unexpected ways. Let R_x(θ) stand for a rotation through θ about the X axis, with similar conventions for the Y and Z axes. Then you can verify by drawing pictures that

R_y(180°) ∘ R_x(180°) = R_z(180°)

2. Although you cannot rotate an object through 360° without letting go of it, you can rotate it through 720°. Hold something flat on the palm of your right hand, facing upwards. Begin turning it in a counter-clockwise direction. After turning it about 360°, your arm will be twisted but you can continue turning, raising your hand above your head. Eventually, you will return to the starting position but the object will have rotated through 720°. (This is called “the plate trick”.)

²The norm of a vector is the same as its length, or √(x_1² + x_2² + ⋯), where x_1, x_2, . . . are the components of the vector. In algebraic work, the square root is usually omitted. Thus we will use the squared norm for complex numbers here and for quaternions in Section 8.3.2.


The rotation group in 3D is called SO3. As with SO2, there are two important representations, the first using matrices and the second using a generalized form of complex numbers called quaternions (Shoemake 1994a; Shoemake 1994d; Shoemake 1994b; Shoemake 1994c). We discuss each in turn.

8.3.1 Representing 3D Rotations with Matrices

We need a 3 × 3 matrix to represent a 3D rotation. In graphics, we use a 4 × 4 matrix with the form

\[
\begin{bmatrix}
? & ? & ? & 0 \\
? & ? & ? & 0 \\
? & ? & ? & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

in which the ? entries form the 3D rotation matrix. From now on, we will ignore the 0 and 1 entries of the 4 × 4 matrix because they do not contribute anything useful.

When we try to use matrices to represent rotations, we run into several difficulties.

1. The first difficulty is that the general form of the matrix, representing a rotation through an arbitrary angle about an arbitrary axis, is rather complicated. However, Leonhard Euler³ discovered that any rotation can be expressed as the product of three rotations about the principal axes. (The form of the matrix for any one of these rotations is rather simple.)

Using the notation above, an arbitrary rotation R can be written R_x(θ) ∘ R_y(φ) ∘ R_z(ψ), in which θ, φ, and ψ are called Euler angles. Because 3D rotations do not commute, we must use a consistent order: R_x(θ) ∘ R_y(φ) ∘ R_z(ψ) is not in general the same as R_z(ψ) ∘ R_y(φ) ∘ R_x(θ) or any other permutation.

2. Euler angles do not solve all of our problems. Suppose φ = 90° and the rotation has the form R_x(θ) ∘ R_y(90°) ∘ R_z(ψ). In this situation, R_x(θ) and R_z(ψ) have the same effect and there is no way of rotating about the third axis! We have lost one degree of freedom. This phenomenon is called gimbal lock because it occurs in gyroscopes: they are supported by three sets of bearings (called “gimbals”) so that they can rotate in any direction. However, if one set of bearings is rotated through 90°, the other two sets of bearings become parallel and the gyroscope can no longer rotate freely.

3. In computer graphics, we frequently want to animate a rotational movement. Suppose R_1 and R_2 are matrices representing rotations that describe the initial and final orientations of an object. What are the intermediate orientations? One obvious idea is to compute the matrix

R = (1 − μ) R_1 + μ R_2

because R = R_1 when μ = 0 and R = R_2 when μ = 1. Unfortunately, the values of R do not produce a smooth rotation and, even worse, R may not even be a rotation matrix!

³Euler, a Swiss mathematician, lived from 1707 to 1783. The name is pronounced roughly as “Oiler”, following the German pronunciation of “eu”. “We may sum up Euler's work by saying that he created a good deal of analysis, and revised almost all the branches of pure mathematics which were then known, filling up the details, adding proofs, and arranging the whole in a consistent form. Such work is very important, and it is fortunate for science when it falls into hands as competent as those of Euler.”


Actually, this is quite easy to see. Suppose R_1 is the identity matrix and R_2 represents a rotation of 180° about the X axis. Then

\[
R_1 =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\quad \text{and} \quad
R_2 =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

The intermediate matrix corresponding to μ = 1/2 is

\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

which reduces any 3D object to a 1D line.

4. Given two rotations described by Euler angles, how do you find a rotation that corresponds to their difference? Put concretely, suppose

R_1 = R_x(θ_1) ∘ R_y(φ_1) ∘ R_z(ψ_1)
and R_2 = R_x(θ_2) ∘ R_y(φ_2) ∘ R_z(ψ_2);

how do we find a rotation

R = R_x(θ) ∘ R_y(φ) ∘ R_z(ψ)

such that R_1 ∘ R = R_2? This is a difficult problem to solve with Euler angles.

5. Since rotation matrices contain more information than is necessary to define a rotation, rounding errors create problems. After a sequence of operations on rotation matrices, accumulated errors may give a matrix that is not exactly a rotation. The consequence, in a graphics program, is distortion of the image.

Another potential problem is that we often need inverses of rotations. Computing the inverse of a matrix is expensive and subject to rounding errors. Fortunately, however, the inverse of a rotation matrix is its transpose and is therefore easy to calculate. Although we can exploit this fact when we are doing our own rotation calculations, a graphics package such as OpenGL cannot distinguish rotation matrices from other matrices and must use general techniques for inversion.

In summary, although it is possible to use matrices to represent 3D rotations, there are a number of problems, and a better representation is highly desirable. Fortunately, there is one.

8.3.2 Representing 3D Rotations with Quaternions

One solution of the problem of representing 3D rotations was discovered by Hamilton⁴ when he introduced quaternions.

⁴Sir William Rowan Hamilton (1805–1865), Irish mathematician who also introduced vectors and matrices and made important contributions to geometrical optics, dynamics (including the “Hamiltonian”), geometry, complex numbers, the theory of equations, real analysis, and linear operators.


      i    j    k
i    −1    k   −j
j    −k   −1    i
k     j   −i   −1

Figure 41: Quaternion multiplication: when a quaternion is represented as s + ix + jy + kz, multiplication uses this table for products of i, j, and k. Note that i² = j² = k² = −1.

Hamilton reasoned that, since 2D rotations can be represented by two numbers (x + iy, see above), it should be possible to represent 3D rotations with three numbers. For eight years, he experimented with 3D vectors but was unsuccessful, because the rotation group cannot be represented as a vector space. Eventually, he tried four numbers and succeeded very quickly: he called the new objects quaternions (“quater” is Latin for “four”).

We can write a quaternion in several different ways:

◦ as a tuple of four real numbers: (s, x, y, z);

◦ as s + ix + jy + kz, by analogy to x + iy (see Figure 41);

◦ as a scalar/vector pair: (s, v).

We will use the last of these representations, which is also the most modern.

If the vector part of a quaternion is the zero vector (so we have (s, 0)), the quaternion behaves exactly like a real number. If the scalar part is zero (so we have (0, v)), the quaternion behaves exactly like a vector. We use this fact to make implicit conversions:

◦ the vector v can be converted to the quaternion (0, v);

◦ the quaternion (0, v) can be converted to the vector v.

In particular, the unit quaternion (1, 0) is essentially the same as the real number 1.

The quaternions are a number system in which all of the standard operations (addition, subtraction, multiplication, and division) are defined. We do not need addition and subtraction and we will ignore them for now (there is one application, which comes later).

Multiplication is defined like this:

(s_1, v_1)(s_2, v_2) = (s_1 s_2 − v_1 · v_2, s_1 v_2 + s_2 v_1 + v_1 × v_2)

in which v_1 · v_2 is the dot product or inner product of the vectors v_1 and v_2, and v_1 × v_2 is their outer product or cross product.

The unit (1, 0) behaves as it should:

(1, 0)(s, v) = (1 s − 0 · v, 1 v + s 0 + 0 × v) = (s, v)
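The multiplication rule is easy to code. Here is a minimal sketch in C++; the Vec3 and Quat types and their helpers are assumptions for the examples in this section, not the CUGL classes used later:

struct Vec3 { double x, y, z; };

// Standard dot and cross products.
double dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator*(double s, const Vec3& v) { return { s * v.x, s * v.y, s * v.z }; }

// A quaternion as a scalar/vector pair (s, v).
struct Quat { double s; Vec3 v; };

// (s1, v1)(s2, v2) = (s1 s2 - v1.v2, s1 v2 + s2 v1 + v1 x v2)
Quat operator*(const Quat& q1, const Quat& q2)
{
    return { q1.s * q2.s - dot(q1.v, q2.v),
             q1.s * q2.v + q2.s * q1.v + cross(q1.v, q2.v) };
}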

The conjugate of the quaternion q = (s, v) is the quaternion q* = (s, −v). (Compare: the conjugate of the complex number x + iy is the complex number x − iy.)

The norm of the quaternion q = (s, v) is

‖q‖ = q q*
    = (s, v)(s, −v)
    = (s² + v · v, s(−v) + s v + v × (−v))
    = (s² + v · v, 0)
    = s² + v · v

Recall that, for any vector v, v × v = 0. The norm of a quaternion is a real number. If we write out the components of the vector part of the quaternion, we have

‖(s, (x, y, z))‖ = s² + x² + y² + z²

(Compare: the norm of a complex number z = x + iy is the real number x² + y².)

We can rearrange

‖q‖ = q q*

into the form

q (q* / ‖q‖) = 1

which suggests

q⁻¹ = q* / ‖q‖

Of all the quaternions, only the zero quaternion, (0, 0), does not have an inverse.

A unit quaternion is a quaternion q with ‖q‖ = 1. Note that:

◦ the unit (1, 0) is an example of a unit quaternion but is not the only one;

◦ if q is a unit quaternion, then q⁻¹ = q*.

Let Q be the set of unit quaternions. Then Q, with multiplication as the operation, is a group:

◦ multiplication is closed and associative (easy to prove, although we haven't done so here);

◦ there is a unit, (1, 0) = 1;

◦ every unit quaternion has an inverse.

Consider the quaternion q = (cos θ, u sin θ) in which u is a unit vector and so u · u = 1. Since

‖q‖ = cos²θ + (u · u) sin²θ = 1

q is a unit quaternion. In general, we can write any unit quaternion in the form (cos θ, u sin θ). (Compare: if z is a complex number and ‖z‖ = 1, we can write z in the form cos θ + i sin θ.)

Consider the product of unit quaternions with their vector components in the same direction (recall that u × u = 0 for any vector u):

(cos θ, u sin θ)(cos φ, u sin φ) = (cos θ cos φ − (u · u) sin θ sin φ, u cos θ sin φ + u sin θ cos φ + (u × u) sin θ sin φ)
                                = (cos(θ + φ), u sin(θ + φ))


Multiplying unit quaternions is the same as adding angles. (Compare:

(cos θ + i sin θ)(cos φ + i sin φ) = cos(θ + φ) + i sin(θ + φ) )

We have at last reached the interesting part. Suppose q = (cos θ, u sin θ) is a unit quaternion and v is a vector. Then

v′ = q v q⁻¹

is the vector v rotated through 2θ about an axis in the direction u. This is the sense in which quaternions represent 3D rotations.
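In code, this is a single conjugation. A sketch building on the Quat type above (for a unit quaternion, the inverse is just the conjugate):

#include <cmath>

Quat conjugate(const Quat& q)
{
    return { q.s, { -q.v.x, -q.v.y, -q.v.z } };
}

// Rotate v through 'angle' radians about the unit axis u by computing
// q (0, v) q*, where q = (cos(angle/2), u sin(angle/2)).
Vec3 rotate(const Vec3& v, const Vec3& u, double angle)
{
    double h = angle / 2;
    Quat q = { std::cos(h), std::sin(h) * u };
    Quat p = { 0.0, v };              // embed v as a pure quaternion
    Quat r = q * p * conjugate(q);    // the scalar part of r is zero
    return r.v;
}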

8.3.3 A Proof that Unit Quaternions Represent Rotations

Define R(θ, u) as an operation on vectors: its effect is to rotate a vector through an angle θ about an axis defined by the unit vector u. We compute the vector R(θ, u) v. Below, we shorten this expression to R v.

The first step is to resolve v into components parallel to and orthogonal to u:

v_p = (u · v)u
v_o = v − (u · v)u

Since R does not affect v_p, we have

R v = R(v_p + v_o)
    = v_p + R v_o

Let w be a vector perpendicular to v_o and lying in the plane of the rotation. Then w must be orthogonal to u and v_o, and:

w = u × v_o
  = u × (v − (u · v)u)
  = u × v − (u · v)(u × u)
  = u × v

since u × u = 0.

We can resolve R v_o into components parallel to v_o and w. In fact:

R v_o = v_o cos θ + w sin θ

and hence

R v = v_p + R v_o
    = v_p + v_o cos θ + w sin θ
    = (u · v)u + (v − (u · v)u) cos θ + (u × v) sin θ
    = v cos θ + u(u · v)(1 − cos θ) + (u × v) sin θ    (19)


The next step is to see how quaternions achieve the same effect. Let p = (0, v) be a pure quaternion and q = (cos φ, u sin φ). Then

q p = (cos φ, u sin φ)(0, v)
    = (−(u · v) sin φ, v cos φ + (u × v) sin φ)

and

q p q⁻¹ = (−(u · v) sin φ, v cos φ + (u × v) sin φ)(cos φ, −u sin φ)    (20)
        = (−(u · v) sin φ cos φ − (v cos φ + (u × v) sin φ) · (−u sin φ),    (21)
           u(u · v) sin²φ + v cos²φ + (u × v) sin φ cos φ − (v cos φ + (u × v) sin φ) × (u sin φ))
        = (−(u · v) sin φ cos φ + (u · v) sin φ cos φ + ((u × v) · u) sin²φ,    (22)
           u(u · v) sin²φ + v cos²φ + (u × v) sin φ cos φ + (u × v) sin φ cos φ − ((u × v) × u) sin²φ)
        = (0, (cos²φ − sin²φ)v + 2 sin²φ (u · v)u + 2 sin φ cos φ (u × v))    (23)
        = (0, v cos 2φ + u(u · v)(1 − cos 2φ) + (u × v) sin 2φ)    (24)

In (23), the scalar part becomes zero, because the first two terms cancel and the third term is zero: (u × v) · u = 0 because u × v is orthogonal to u. In the vector part, we use the general fact that (b × c) × d = (d · b)c − (d · c)b which, in this case, gives (u × v) × u = (u · u)v − (u · v)u.

Comparing (19) and (24), we see that they are the same if we substitute θ = 2φ.

To gain familiarity with unit quaternions, we consider a few simple examples. We will use the form (cos θ, u sin θ) for the general unit quaternion.

◦ First, assume θ = 0. Then the quaternion is (1, 0 u), or simply (1, 0). The direction of the unit vector makes no difference if the amount of rotation is zero.

◦ Next, suppose θ = 90°. The quaternion then has the form (0, u). Since (cos θ, u sin θ) rotates through 2θ, this means that a pure unit quaternion represents a rotation through 180° about the unit vector component of the quaternion.

8.3.4 Quaternions and Matrices

We can think of the quaternion product q q′ as an operation: q is applied to q′. The operation q can be represented as a matrix. We call it a “left operation” because q is on the left of q′ (this is important because quaternion multiplication is not commutative). If q = (s, (x, y, z)), we can calculate the matrix from the definition of the quaternion product and obtain:

\[
L_q =
\begin{bmatrix}
s & -z & y & x \\
z & s & -x & y \\
-y & x & s & z \\
-x & -y & -z & s
\end{bmatrix}
\]

Symmetrically, we can consider q′ q*, in which q* is a right operator acting on q′; the corresponding matrix is:

\[
R_{q^*} =
\begin{bmatrix}
s & -z & y & -x \\
z & s & -x & -y \\
-y & x & s & -z \\
x & y & z & s
\end{bmatrix}
\]


Since matrix multiplication is associative, the matrix L_q R_{q*} represents the effect of q v q* on the vector v. In other words, it is the rotation matrix corresponding to the quaternion q:

\[
L_q R_{q^*} =
\begin{bmatrix}
s^2 + x^2 - y^2 - z^2 & 2(xy - sz) & 2(xz + sy) & 0 \\
2(xy + sz) & s^2 - x^2 + y^2 - z^2 & 2(yz - sx) & 0 \\
2(xz - sy) & 2(yz + sx) & s^2 - x^2 - y^2 + z^2 & 0 \\
0 & 0 & 0 & s^2 + x^2 + y^2 + z^2
\end{bmatrix}
\]

In the cases we are interested in, ‖q‖ = 1, and this matrix simplifies to

\[
Q =
\begin{bmatrix}
1 - 2(y^2 + z^2) & 2(xy - sz) & 2(xz + sy) & 0 \\
2(xy + sz) & 1 - 2(x^2 + z^2) & 2(yz - sx) & 0 \\
2(xz - sy) & 2(yz + sx) & 1 - 2(x^2 + y^2) & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

It is also possible, of course, to convert a rotation matrix into a quaternion. The algebra is rather heavy and will not be presented here.
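The matrix Q is exactly what OpenGL needs, so a common trick is to write the quaternion into a column-major array and multiply it onto the current matrix stack. A sketch using the same names s, x, y, z (the function name is illustrative):

#include <GL/glut.h>

// Apply the rotation of the *unit* quaternion q = (s, (x, y, z)) by
// loading Q in column-major order and calling glMultMatrixf.
void quatRotate(double s, double x, double y, double z)
{
    GLfloat m[16] = {
        GLfloat(1 - 2*(y*y + z*z)), GLfloat(2*(x*y + s*z)), GLfloat(2*(x*z - s*y)), 0,  // column 0
        GLfloat(2*(x*y - s*z)), GLfloat(1 - 2*(x*x + z*z)), GLfloat(2*(y*z + s*x)), 0,  // column 1
        GLfloat(2*(x*z + s*y)), GLfloat(2*(y*z - s*x)), GLfloat(1 - 2*(x*x + y*y)), 0,  // column 2
        0, 0, 0, 1                                                                      // column 3
    };
    glMultMatrixf(m);
}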

8.3.5 Quaternion Interpolation

One of the advantages of the quaternion representation is that we can interpolate smoothly between two orientations.

The problem that we wish to solve is this: given two rotations, R and R′, how do we construct a sequence of rotations R_1, R_2, . . . , R_n, with R_1 = R and R_n = R′, such that, when we apply these rotations to a graphical object, it appears to rotate smoothly? Note that if R and R′ are represented by matrices, this is not an easy problem to solve.

If we represent the rotations R and R′ as quaternions q and q′, respectively, there is a fairly simple solution — although its derivation is tricky.

Let ψ = cos⁻¹(q · q′) and define

slerp(t, q, q′) = (q sin((1 − t)ψ) + q′ sin(tψ)) / sin ψ

in which “slerp” stands for spherical linear interpolation. Then

slerp(0, q, q′) = q
slerp(1, q, q′) = q′

and, for 0 < t < 1, slerp(t, q, q′) is a quaternion that smoothly interpolates from q to q′.
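The formula is short enough to code directly. A sketch using the Quat type from the earlier examples; it assumes q and q2 are unit quaternions, clamps the dot product against rounding error, and falls back to q when the two quaternions are nearly identical (sin ψ ≈ 0):

#include <cmath>

// Dot product of quaternions viewed as 4-vectors.
double dot4(const Quat& a, const Quat& b)
{
    return a.s * b.s + dot(a.v, b.v);
}

Quat slerp(double t, const Quat& q, const Quat& q2)
{
    double c = dot4(q, q2);
    if (c > 1) c = 1;
    if (c < -1) c = -1;
    double psi = std::acos(c);
    if (std::sin(psi) < 1e-6)
        return q;                  // nearly parallel: interpolation unnecessary
    double w1 = std::sin((1 - t) * psi) / std::sin(psi);
    double w2 = std::sin(t * psi) / std::sin(psi);
    return { w1 * q.s + w2 * q2.s, w1 * q.v + w2 * q2.v };
}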

8.4 Quaternions in Practice

The following three programs (available on the web site) illustrate applications of quaternions to graphics programming. In each case, the same effect is hard to achieve with matrices although, of course, anything is possible.


void mouseMovement (int xNew, int yNew)
{
    const int MSTART = -10000;
    static int xOld = MSTART;
    static int yOld = MSTART;
    // On the first call, invent a previous position close to the
    // new one so that the model does not jump.
    if (xOld == MSTART && yOld == MSTART)
    {
        xOld = xNew + 1;
        yOld = yNew + 1;
    }
    // Map both positions to [-1, 1] and update the quaternion.
    quat.trackball(
        float(2 * xOld - width) / float(width),
        float(height - 2 * yOld) / float(height),
        float(2 * xNew - width) / float(width),
        float(height - 2 * yNew) / float(height));
    xOld = xNew;
    yOld = yNew;
    glutPostRedisplay();
}

Figure 42: Mouse callback function for trackball simulation

8.4.1 Imitating a Trackball

A trackball is a device used for motion input in professional CAD workstations and also for some games. It is a sphere with only the top accessible; by moving your hand over the sphere, you send rotational information to the computer. The purpose of this program is to use the mouse to simulate a trackball: when you move the mouse, it is as if you were moving your hand over a trackball.

The hard part is performed by CUGL. We describe the API first and then the underlying implementation. The first step is to declare a quaternion:

Quaternion quat;

In the display function, the quaternion is used to rotate the model:

quat.rotate();
buildPlane();

Most of the work is done by the mouse callback function, shown in Figure 42. This function stores the previous mouse position, (xOld, yOld), and the most recent mouse position, (xNew, yNew).

To avoid a sudden jump when the program is started, the first block of code ensures that the old and new values are close together initially.


double project(double x, double y)
{
    double dsq = x * x + y * y;
    double d = sqrt(dsq);
    // Near the centre, project onto the sphere;
    // further out, project onto a hyperboloid.
    if (d < BALLRADIUS * 0.5 * sqrt(2.0))
        return sqrt(BRADSQ - dsq);
    else
        return BRADSQ / (2 * d);
}

Figure 43: Projecting the mouse position

The function passes transformed values of the old and new mouse positions to the function Quaternion::trackball. The transformation ensures that the arguments are all in the range [−1, 1] (assuming that the mouse stays inside the window).

Finally, the old mouse position is updated to become the new mouse position and the function calls glutPostRedisplay to refresh the view.

From the user's point of view, that's all there is to do. We can look behind the scenes and see what CUGL is doing.

The implementation uses a couple of constants: the radius, r, of the simulated trackball and r²:

const double BALLRADIUS = 0.8f;
const double BRADSQ = BALLRADIUS * BALLRADIUS;

The mouse position (x, y) is projected onto a sphere to obtain a 3D point (x, y, z) such that x² + y² + z² = r². In practice, the effect of a mouse movement becomes too extreme as the mouse approaches the edge of the ball, and we project onto a hyperboloid instead if x² + y² ≥ r²/2. All of this is done by the auxiliary function project shown in Figure 43.

The real work is done by Quaternion::trackball, shown in Figure 44. First, the vectors v_1 and v_2 are initialized to the projected positions of the mouse on the sphere or hyperboloid. The idea is to compute a quaternion r that represents the rotation corresponding to these two points and to multiply the current quaternion by r.

The function computes the vectors a = v_2 × v_1 and d = v_1 − v_2 and the real number t = ‖d‖ / (2r). The value of t is then clipped if necessary to ensure that −1 ≤ t ≤ 1, and then θ is set to sin⁻¹ t.

We require a rotation about an axis parallel to a. The corresponding quaternion is (cos θ, â sin θ), where â = a / ‖a‖, which rotates through 2θ.

8.4.2 Moving the Camera

The problem solved by the next program is to provide a set of commands that move the camera in a consistent way. The program provides six ways of translating the camera (left, right, up, down, forwards, and backwards) and four ways of rotating it (pan left, pan right,


void Quaternion::trackball(double x1, double y1, double x2, double y2)
{
    Vector v1(x1, y1, project(x1, y1));
    Vector v2(x2, y2, project(x2, y2));
    Vector a = cross(v2, v1);          // axis of rotation
    Vector d = v1 - v2;
    double t = d.length() / (2 * BALLRADIUS);
    if (t > 1) t = 1;                  // clip to the domain of asin
    if (t < -1) t = -1;
    double theta = asin(t);
    (*this) *= Quaternion(cos(theta), a.normalize() * sin(theta));
}

Figure 44: Updating the trackball quaternion

tilt up, and tilt down). We know that this is difficult to do with Euler angles because of the “gimbal lock” problem described in Section 8.3.1. Moreover, the matrix solution requires computing matrix inverses, as we will see. The program shows how to do it with quaternions.

The program uses a vector to store the position of the camera. The initial value of the vector is (0, −h, 0), where h is the height of the camera. The height is negative because OpenGL translates the scene, not the camera.

const Vector STARTPOS = Vector(0, - INITIAL_HEIGHT, 0);
Vector pos = STARTPOS;

The orientation of the camera is stored in a quaternion. The initial value of the quaternion is (1, 0), which is established by the default constructor.

const Quaternion STARTQUAT;
Quaternion quat = STARTQUAT;

In the display function, the quaternion and the vector are used to rotate and position the camera:

quat.rotate();
pos.translate();
scene();

The user translates the camera by pressing one of the keys 'f', 'b', 'l', 'r', 'u', or 'd'. Each key invokes the function move with a unit vector, as shown in Figure 45. The unit vectors are defined in Figure 46.

Figure 47 shows the function move. We cannot simply update the position vector, because the direction of movement would depend on the orientation, which is not what we want. Suppose that the quaternion controlling the orientation is q. Then we apply q⁻¹ to the current position, obtaining the vector w giving the initial orientation of the camera. We then apply the translation u to this vector and apply q to the result.


void graphicKeys (unsigned char key, int x, int y)
{
    switch (key)
    {
    case 'f':
        move(K);
        break;
    case 'b':
        move(- K);
        break;
    case 'l':
        move(I);
        break;
    case 'r':
        move(- I);
        break;
    case 'u':
        move(- J);
        break;
    case 'd':
        move(J);
        break;
    case 's':
        pos = STARTPOS;
        quat = STARTQUAT;
        break;
    case 27:
        exit(0);
    default:
        break;
    }
    glutPostRedisplay();
}

Figure 45: Translating the camera

const Vector I = Vector(1, 0, 0);
const Vector J = Vector(0, 1, 0);
const Vector K = Vector(0, 0, 1);

Figure 46: Unit vectors


void move(Vector u)
{
    pos += quat.apply(u);
}

Figure 47: Auxiliary function for translating the camera

void functionKeys (int key, int x, int y)
{
    const double DELTA = radians(5);
    switch (key)
    {
    case GLUT_KEY_UP:
        quat *= Quaternion(I, DELTA);
        break;
    case GLUT_KEY_DOWN:
        quat *= Quaternion(I, - DELTA);
        break;
    case GLUT_KEY_LEFT:
        quat *= Quaternion(J, DELTA);
        break;
    case GLUT_KEY_RIGHT:
        quat *= Quaternion(J, - DELTA);
        break;
    }
    glutPostRedisplay();
}

Figure 48: Rotating the camera

Figure 48 shows the callback function for rotating the camera. The up and down arrow keys tilt the camera up and down, and the left and right arrow keys pan left or right. In each case, the current quaternion is multiplied by a quaternion that rotates 5° about the given axis — I for tilts and J for pans. Note that the call radians(5) converts 5° to radians as required by the quaternion constructor.

8.4.3 Flying

The final program allows the user to “fly” a plane by using the arrow keys. The plane moves forwards with uniform speed and the arrow keys change its orientation. The problem is to update the position of the plane in a way that is consistent with its orientation. As before, this is difficult to do with matrices, and Euler angles could lock up. The following solution uses quaternions.

The speed of the plane is constant and in the direction −Z, because this is the way the plane's coordinates are set up.


const Vector VEL(0, 0, -100);

The plane has a current velocity, position, and orientation. Note that velocity is not in general equal to VEL but rather VEL rotated by orientation.

Vector velocity;
Vector position;
Quaternion orientation;

When the user presses 'r', the velocity, position, and orientation are reset to their initial values. The order of the statements is important because orientation is used to set velocity.

void reset()
{
    orientation = Quaternion(J, radians(90));
    velocity = orientation.apply(VEL);
    position = Vector();
}

The display function translates the plane and then uses the inverse of the orientation quaternion to set its direction. (The inversion could be avoided by reversing the effect of the arrow keys in the special key callback function.)

position.translate();
orientation.inv().rotate();
glCallList(plane);

The idle callback function performs a simple integration, adding v dt to the current position.

void idle ()
{
    position += velocity * DELTA_TIME;
    glutPostRedisplay();
}

Figure 49 shows the callback function that handles the arrow keys for controlling the plane. Each key changes the orientation by a small amount and re-computes the velocity by applying the new orientation to the initial velocity VEL. The small orientation changes are defined by constant quaternions:

const Quaternion climb(I, DELTA_TURN);
const Quaternion left(J, DELTA_TURN);
const Quaternion roll(K, DELTA_TURN);
const Quaternion climbInv = climb.inv();
const Quaternion leftInv = left.inv();
const Quaternion rollInv = roll.inv();


void functionKeys (int key, int x, int y)
{
    switch (key)
    {
    case GLUT_KEY_UP:
        orientation *= climb;
        velocity = orientation.apply(VEL);
        break;
    case GLUT_KEY_DOWN:
        orientation *= climbInv;
        velocity = orientation.apply(VEL);
        break;
    case GLUT_KEY_LEFT:
        orientation *= left;
        velocity = orientation.apply(VEL);
        break;
    case GLUT_KEY_RIGHT:
        orientation *= leftInv;
        velocity = orientation.apply(VEL);
        break;
    case GLUT_KEY_END:
        orientation *= roll;
        velocity = orientation.apply(VEL);
        break;
    case GLUT_KEY_PAGE_DOWN:
        orientation *= rollInv;
        velocity = orientation.apply(VEL);
        break;
    }
}

Figure 49: Callback function for flying the plane


9 Theory of Illumination

This section is an expanded version of Appendix B from Getting Started with OpenGL.

To obtain realistic images in computer graphics, we need to know not only about light but also what happens when light is reflected from an object into our eyes. The nature of this reflection determines the appearance of the object. The general problem is to use the properties of the light sources and the materials to compute the apparent colour at each pixel that corresponds to part of an object on the screen.

9.1 Steps to Realistic Illumination

We discuss various techniques for solving this problem, increasing the realism at each step. In each case, we define the intensity, I, of a pixel in terms of a formula. The first few techniques ignore colour.

9.1.1 Intrinsic Brightness

We assume that each object has an intrinsic brightness k_i. Then

I = k_i

This technique can be used for simple graphics, and is essentially the technique that OpenGL uses when lighting is disabled, but it is clearly unsatisfactory. There is no attempt to model properties of the light or its effect on the objects.

9.1.2 Ambient Light

We assume that there is ambient light (light from all directions) with intensity I_a and that each object has an ambient reflection coefficient k_a. This gives

I = I_a k_a

In practice, the ambient light technique looks a lot like the intrinsic brightness technique.

9.1.3 Diffuse Lighting

We assume that there is a single, point source of light and that the object has diffuse or Lambertian reflective properties. This means that the light reflected from the object depends only on the incidence angle of the light, not the direction of the viewer.

More precisely, suppose that: N is a vector normal to the surface of the object; L is a vector corresponding to the direction of the light; and V is a vector corresponding to the direction of the viewer. Figure 50 shows these vectors: note that V is not necessarily in the plane defined by L and N. Assume that all vectors have unit length (‖N‖ = ‖L‖ = ‖V‖ = 1). Then

I = N · L

Note that:


[Diagram: the vectors N (surface normal), L (light direction), R (reflection direction), and V (viewer direction) at a point on the object, with the light source and the viewer]

Figure 50: Illuminating an object

◦ V does not appear in this expression and so the brightness does not depend on the viewer's position;

◦ the brightness is greatest when N and L are parallel (N · L = 1); and

◦ the brightness is smallest when N and L are orthogonal (N · L = 0).

We can account for Lambertian reflection in the following way. Suppose that the beam of light has cross-section area A and it strikes the surface of the object at an angle θ. Then the area illuminated is approximately A / cos θ. After striking the object, the light is scattered uniformly in all directions. The apparent brightness to the viewer is inversely proportional to the area illuminated, which means that it is proportional to cos θ, the inner product of the light vector and the surface normal.

We introduce I_p, the incident light intensity from a point source, and k_d, the diffuse reflection coefficient of the object. Then

I = I_p k_d (N · L)

The value of N · L can be negative: this will be the case if the light is underneath the surface of the object. We usually assume that such light does not contribute to the illumination of the surface. In calculations, we should use max(N · L, 0) to keep negative contributions out of our results.

If we include some ambient light, this equation becomes

I = I_a k_a + I_p k_d (N · L)
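As a sketch, the model so far is only a few lines of C++ (the names are illustrative; N and L are assumed to hold unit vectors):

#include <algorithm>  // std::max

double dot3(const double a[3], const double b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// I = Ia ka + Ip kd (N.L), clamping N.L at zero so that light
// below the surface makes no contribution.
double diffuseIntensity(const double N[3], const double L[3],
                        double Ia, double ka, double Ip, double kd)
{
    double nDotL = std::max(dot3(N, L), 0.0);
    return Ia * ka + Ip * kd * nDotL;
}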

9.1.4 Attenuation of Light

Light attenuates (gets weaker) with distance from the source. The theoretical rate of attenuation for a point source of light is quadratic. In practice, sources are not true points and there is always some ambient light from reflecting surfaces (although ambient light is very weak in outer space). Consequently, we assume that attenuation, f, is given by

f = 1 / (C + L d + Q d²)


where:

◦ d is the distance between the light and the object;

◦ C (constant attenuation) ensures that a close light source does not give an infinite amount of light;

◦ L (linear term) allows for the fact that the source is not a point; and

◦ Q (quadratic term) models the theoretical attenuation from a point source.

Then we have

I = I_a k_a + f I_p k_d (N · L)

9.1.5 Coloured Light

The previous calculations ignore colour. In this section, we assume that:

◦ the object has diffuse colour factors O_dr (red), O_dg (green), and O_db (blue);

◦ the light has intensity colour factors corresponding to ambient sources (I_ar, I_ag, and I_ab); and

◦ point sources (I_pr, I_pg, and I_pb).

All of these numbers are in the range [0, 1]. We now have three intensity equations (with λ = r, g, b) of the form

I_λ = I_aλ k_a O_dλ + f I_pλ k_d O_dλ (N · L)

9.1.6 Specular Reflection

Lambertian reflection is a property of dull objects such as cloth or chalk. Many objects exhibit degrees of shininess: polished wood has some shininess and a mirror is the ultimate in shininess. The technical name for shininess is specular reflection. A characteristic feature of specular reflection is that it has a colour closer to the colour of the light source than the colour of the object. For example, a brown table made of polished wood that is illuminated by a white light will have specular highlights that are white, not brown.

Specular reflection depends on the direction of the viewer as well as the light. We introduce a new vector, R (the reflection vector), which is the direction in which the light would be reflected if the object was a mirror (see Figure 50). The brightness of specular reflection depends on the angle between R and V (the angle of the viewer). For Phong shading (developed by Bui-Tuong Phong), we assume that the brightness is proportional to (R · V)ⁿ, where n = 1 corresponds to a slightly glossy surface and n = ∞ corresponds to a perfect mirror. We now have

I_λ = I_aλ k_a O_dλ + f I_pλ (k_d O_dλ (N · L) + k_s (R · V)ⁿ)

where k_s is the specular reflection coefficient and n is the specular reflection exponent.


[Diagram: the incident vector L and the reflection vector R on either side of the normal N, each making angle θ with N; S is the component perpendicular to N, and N cos θ is the projection of L onto N]

Figure 51: Calculating R

Calculating the Reflection Vector The reflection vector R is the mirror image of the incident vector L relative to the normal vector N. We assume that L and N are unit vectors. Consequently, the projection of L onto N is a vector with length cos θ in the direction of N, or N cos θ. As we can see from the right side of Figure 51:

R = N cos θ + S

Similarly, from the left side of Figure 51:

S = N cos θ − L

Adding these equations gives

R + S = N cos θ + S + N cos θ − L

which simplifies to

R = 2 N cos θ − L

Since cos θ = N · L, we can calculate R from

R = 2 N (N · L) − L
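A sketch of this formula in code, reusing the dot3 helper from the diffuse lighting example (N and L must be unit vectors):

// R = 2 N (N.L) - L
void reflect(const double N[3], const double L[3], double R[3])
{
    double nDotL = dot3(N, L);
    for (int i = 0; i < 3; ++i)
        R[i] = 2.0 * N[i] * nDotL - L[i];
}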

To calculate specular reflection, we actually need R · V. The time needed for this calculation depends on the assumptions made about the light source and the viewer:

◦ If the light source and the viewer are assumed to be at infinity, R and V are both constant across the polygon and it is necessary to calculate R · V only once for the polygon.

◦ If the light source is assumed to be at infinity (a directional light) but the viewer is nearby, R is constant across the polygon but V varies, and we must calculate V and R · V for each pixel.

◦ If the light source is nearby (a positional light) and the viewer is nearby, both R and V vary across the polygon, and we must calculate R, V, and R · V for each pixel.


9.1.7 Specular Colours

In practice, the colour of specular reflection is not completely independent of the colour of the object. To allow for this, we can give the object a specular colour O_sλ. Then we have

I_λ = I_aλ k_a O_dλ + f I_pλ (k_d O_dλ (N · L) + k_s O_sλ (R · V)ⁿ)

This equation represents our final technique for lighting an object and is a close approximation to what OpenGL actually does.
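Putting the pieces together, here is a per-channel sketch of this final equation, built on the dot3 and reflect helpers above (all names illustrative; N, L, and V are unit vectors and f is the attenuation factor):

#include <algorithm>
#include <cmath>

// I = Ia ka Od + f Ip (kd Od (N.L) + ks Os (R.V)^n), per channel.
void shade(const double N[3], const double L[3], const double V[3],
           const double Ia[3], const double Ip[3],
           const double Od[3], const double Os[3],
           double ka, double kd, double ks, double n, double f,
           double I[3])
{
    double nDotL = std::max(dot3(N, L), 0.0);
    double R[3];
    reflect(N, L, R);
    double spec = std::pow(std::max(dot3(R, V), 0.0), n);
    for (int c = 0; c < 3; ++c)   // c = red, green, blue
        I[c] = Ia[c] * ka * Od[c]
             + f * Ip[c] * (kd * Od[c] * nDotL + ks * Os[c] * spec);
}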

9.1.8 Multiple Light Sources

If there are several light sources, we simply add their contributions. If the sum of the contributions exceeds 1, we can either “clamp” the value (that is, use 1 instead of the actual result) or reduce all values in proportion so that the greatest value is 1. Clamping is cheaper computationally and usually sufficient.

The actual calculation performed by OpenGL is:

V_λ = O_eλ + M_aλ O_aλ + Σ_{i=0}^{n−1} [1 / (k_c + k_l d + k_q d²)]_i s_i [I_aλ O_aλ + (N · L) I_dλ O_dλ + (R · V)^σ I_sλ O_sλ]_i

where

V_λ = Vertex brightness
M_aλ = Ambient light model
k_c = Constant attenuation coefficient
k_l = Linear attenuation coefficient
k_q = Quadratic attenuation coefficient
d = Distance of light source from vertex
s_i = Spotlight effect
I_aλ = Ambient light
I_dλ = Diffuse light
I_sλ = Specular light
O_eλ = Emissive brightness of material
O_aλ = Ambient brightness of material
O_dλ = Diffuse brightness of material
O_sλ = Specular brightness of material
σ = Shininess of material

and the subscript λ indicates colour components, and the subscript i denotes one of the lights.

9.2 Polygon Shading

The objects in graphical models are usually defined as many small polygons, typically triangles or rectangles. We must choose a suitable colour for each visible pixel of a polygon: this is called polygon shading.


9.2.1 Flat Shading

In flat shading, we compute a vector normal to the polygon and use it to compute the colour for every pixel of the polygon. The computation implicitly assumes that:

◦ the polygon is really flat (not an approximation to a curved surface);

◦ N · L is constant (the light source is infinitely far away); and

◦ N · V is constant (the viewer is infinitely far away).

Flat shading is efficient computationally but not very satisfactory: the edges of the polygons tend to be visible and we see a polyhedron rather than the surface we are trying to approximate. (The edges are even more visible than we might expect, due to the subjective Mach effect, which exaggerates a change of colour along a line.)

9.2.2 Smooth Shading

In smooth shading, we compute normals at the vertices of the polygons, averaging over the polygons that meet at the vertex. If we are using polygons to approximate a smooth surface, these vectors approximate the true surface normals at the vertices. We compute the colour at each vertex and then colour the polygons by interpolating the colours at interior pixels.

Smooth shading of coloured objects requires interpolating colour values. (Suppose we have a line AB and we know the colour at A and the colour at B. Then interpolation enables us to calculate all the colours between A and B.) It is not clear that interpolation of colours is even possible. However, in Section 10 we will discover that interpolation is indeed possible and not even very difficult.

There are several varieties of smooth shading. The most important are Gouraud shading⁵ and Phong shading.⁶

Gouraud Shading is a form of smooth shading that uses a particular kind of interpolation for efficiency.

1. Compute the normal at each vertex of the polygon mesh. For analytical surfaces, such as spheres and cones, we can compute the normals exactly. For surfaces approximated by polygons, we use the average of the surface normals at each vertex.

2. Compute the light intensity for each colour at each vertex using a lighting model (e.g., Section 9.1.7 above).

3. Interpolate intensities along the edges of the polygons.

4. Interpolate intensities along scan lines within the polygons.

Figure 52 illustrates Gouraud shading. We assume a rectangular viewing window, W, and a polygon with vertices v_1, v_2, v_3 to be displayed. The first step calculates the colours at the vertices; the second step interpolates between the vertices to obtain the colours along the edges of the polygon. When a scan line s crosses the polygon, find the points p_1 and

⁵Henri Gouraud. Continuous Shading of Curved Surfaces. IEEE Trans. Computers, C-20(6), June 1971, 623–9.

⁶Bui-Tuong Phong. Illumination for Computer Generated Pictures. Comm. ACM, 18(6), June 1975, 311–7.


[Diagram: a window W containing a polygon with vertices v_1, v_2, and v_3, crossed by a scan line s at the points p_1 and p_2]

Figure 52: Gouraud Shading

[Diagram: a scan line s crossing a polygon at p_1 and p_2, with the averaged normal vectors v_1 and v_2 at those points and interpolated normals between them]

Figure 53: Phong Shading

p_2 where it crosses the edges of the polygon and find the colours there. Finally, interpolate colours between p_1 and p_2 to obtain the correct colour for each pixel on the scan line.
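The interpolation step is ordinary linear interpolation applied per channel. A minimal sketch:

// Interpolate RGB colours along a scan line: t = 0 gives the colour
// at p1, t = 1 gives the colour at p2.
void lerpColour(const double c1[3], const double c2[3], double t, double out[3])
{
    for (int i = 0; i < 3; ++i)
        out[i] = (1.0 - t) * c1[i] + t * c2[i];
}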

Phong Shading is similar to Gouraud shading but interpolates the normals rather than the intensities. Phong shading requires more computation than Gouraud shading but gives better results, especially for specular highlights.

Figure 53 illustrates Phong shading. The scan line is s and the edges of the polygon are at p_1 and p_2. The averaged normal vectors at these points are v_1 and v_2. The algorithm moves to each pixel between p_1 and p_2 and computes the normal vector there by interpolating between v_1 and v_2.


10 The Theory of Light and Colour

It is possible to include diagrams in these notes (e.g., the CIE chromaticity diagram) but this has the effect of making the files much larger (megabytes rather than kilobytes). Consequently, these notes are mainly text and the diagrams can be downloaded from the course website.

The purpose of this section is to provide partial answers to the questions:

◦ What is light?

◦ How do we perceive light?

◦ How do we create the illusion of light and colour on a computer screen?

The first two questions have simple answers that are not very useful: light consists of photons with various energy levels; and we perceive light when photons cause chemical changes in the retinas of our eyes. We need to know a bit more in order to understand how it is possible to create quite good illusions with relatively simple equipment.

10.1 Physiology of the Eye

The eye has many parts; the most important for this discussion are:

◦ The lens and cornea at the front of the eye, which focus light onto the retina at the back of the eye.

◦ The iris diaphragm, which enables the eye to control the size of the lens aperture and hence the amount of light reaching the retina.

◦ The retina, which consists of light-sensitive cells.

The light-sensitive cells of the retina are of two kinds: rods and cones. (The names “rod” and “cone” come from the shape of the cells.)

◦ Rods are very sensitive to intensity but do not distinguish colours well. Most rods are positioned away from the centre of the retina. At low levels of illumination (e.g., at night), we see mainly with our rods: we don't see much colour and very dim lights are best seen by looking to one side of them.

◦ Cones are about a thousand times less sensitive to light than rods, but they do perceive colours. The centre of the retina has the highest density of cones, and that is where our vision is sharpest and we are most aware of colour. There is a small region, called the fovea, where the density of cones is highest. When we “look at” an object, we have adjusted our eyes so that the image of that object falls onto the fovea.

◦ There are three kinds of cones: roughly speaking, they respond to red, green, and blue light. In reality, there is a lot of overlap, and each cone responds to some extent to all colours.

If ears were like eyes, we could distinguish only low-, medium-, and high-pitched sounds; we could not use speech to communicate and we could not enjoy music. The ear achieves this with the aid of approximately 16,000 receptors (the “hair cells”), each responding to a slightly different frequency. The trade-off, of course, is that ears cannot determine the direction of the source of sound precisely.


Device                Dynamic Range    Perceptible levels
Cathode-ray tube      50 – 200         400 – 550
Photographic print    100              450
Photographic slide    1000             700
Newsprint             10               200

Figure 54: Dynamic range and perceptible steps for various devices

10.2 Achromatic Light

Before considering coloured light, we will have a brief look at achromatic light — literally, light without colour. More precisely, in this section, we will ignore the coloured components that all light has and consider only the brightness, or intensity, of the light.

The eye responds to a wide range of brightnesses. The dimmest light that we can respond to is about 10⁻⁶ cd/m² (where ‘cd’ stands for candelas and ‘m’ stands for metres). At this intensity, each visual receptor in the eye is receiving about one photon every 10 minutes; the reason we can see anything at all is that the eye can integrate the responses of many receptors. The brightest light that we can respond to without damaging the retina is about 10⁸ cd/m² — or 10¹⁴ times as much light.

Not surprisingly, given this range, our response to light intensity is not linear but logarithmic. A trilight typically can be set to emit 50, 100, or 150 watts. We see the step from 50 to 100 watts as being greater than the step from 100 to 150 watts. To achieve even steps, the trilight should have settings of 50, 100, and 200 watts — so that each setting doubles the intensity.

Suppose that we have a device, such as a computer monitor, that can emit light at various intensities. There will be a minimum intensity I_min and a maximum intensity I_max, where I_min should be set so that we can just see the effect and I_max is probably the highest intensity that the device is capable of. The intermediate intensities should be

I_min rⁿ

where 0 ≤ n ≤ N and r is chosen so that I_min r^N = I_max, or r = (I_max / I_min)^(1/N). The ratio I_max / I_min is called the dynamic range of the device.

Ideally, the steps should be small enough that we cannot detect the step from I_min rⁿ to I_min rⁿ⁺¹. The number of steps needed depends on the dynamic range. Figure 54 shows the dynamic range and number of perceptible steps for some common devices. If we use one byte (8 bits) to encode intensity, we have 2⁸ = 256 levels. From Figure 54, we can see that this is enough for newsprint but not for the other devices, CRTs in particular. However, if we use three bytes (24 bits) to encode three colours, we have 2²⁴ ≈ 16 million distinct codes, which should provide enough levels of brightness for most purposes.
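A quick sketch of the level computation (the function name is illustrative):

#include <cmath>
#include <cstdio>

// Generate N+1 intensity levels from Imin to Imax with perceptually
// even steps: level n is Imin * r^n with r = (Imax/Imin)^(1/N).
void printLevels(double Imin, double Imax, int N)
{
    double r = std::pow(Imax / Imin, 1.0 / N);
    for (int n = 0; n <= N; ++n)
        std::printf("level %3d: %f\n", n, Imin * std::pow(r, n));
}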

10.3 Coloured Light

As mentioned in Section 10.1, the cones in our eyes distinguish colours. “Colour” is a sensation; the physical cause of colour is the wavelength of photons. Most light consists of a large number of photons with different wavelengths. (An important exception is the photons emitted from


a laser, which all have the same wavelength; laser light is called monochromatic.) We see photons as light if they have a wavelength between 400 nm and 700 nm (‘nm’ stands for nanometre and 1 nm = 10⁻⁹ metre). Photons with long wavelengths look red and photons with short wavelengths look blue. We can detect photons with longer wavelength than red light, but we call the effect “heat” rather than “light”.

A light source has a power spectrum which associates a power, or intensity, with each wavelength. It is common practice to use λ to stand for a wavelength. We describe a source of light as a function, P(λ), which gives the power of the source at each wavelength.

Similarly, the response of a receptor is also a function of the wavelength of the light. We can write R(λ), G(λ), and B(λ) for the response of the red, green, and blue receptors respectively. The corresponding curves have a single hump that corresponds to the wavelength of greatest sensitivity.

Our response to a light source is a triple of three real numbers (r, g, b) where

r = ∫₄₀₀⁷⁰⁰ R(λ) P(λ) dλ
g = ∫₄₀₀⁷⁰⁰ G(λ) P(λ) dλ
b = ∫₄₀₀⁷⁰⁰ B(λ) P(λ) dλ
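In a program, these integrals are just sums over sampled curves. A sketch, assuming R and P are available as functions sampled at 1 nm steps (both names are placeholders):

// Approximate r = integral of R(lambda) P(lambda) d(lambda)
// over 400-700 nm by a Riemann sum with 1 nm steps.
double response(double (*R)(double), double (*P)(double))
{
    double sum = 0;
    for (int lambda = 400; lambda <= 700; ++lambda)
        sum += R(lambda) * P(lambda);   // d(lambda) = 1 nm
    return sum;
}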

It looks from these equations as if the analysis of colour vision would be alarmingly complex. Fortunately, the eye has interesting properties that enable us to make significant simplifications. The first thing to notice is that although there are very many possible power spectra — essentially any set of values defined between λ = 400 and λ = 700 defines a power spectrum — our perception is confined to the three numbers (r, g, b). This means that if two different sources of light, with different power spectra, give the same values of (r, g, b), then we cannot distinguish them.

Two light sources that appear to have the same colour are called metamers.

When we use the common expression “adding red and green gives yellow”, what we are really saying is that a light source with red and green components, and another light source with a single yellow component, are metamers.

The following results are based on experiments performed by asking people what they see. To obtain objectivity, the subjects are not asked questions like “Does this look green or blue to you?” but instead are asked “Are these colours the same or different?” Since people provide highly consistent answers to this question, colour theory has become an objective science.

The first phenomenon is this: suppose X and Y are metamers. Then, for any colour Z, X + Z and Y + Z are metamers. The “+” sign here stands for adding light. For example, we might have two projectors projecting pools of light X and Y onto a screen. Viewers state that the two pools of light have the same colour. A third projector, emitting light Z, is then switched on. The viewers will agree that the areas where light from both projectors X and Z hits the screen are indistinguishable from the area where light from projectors Y and Z hits the screen.


Next, light of any colour can be obtained by mixing light from three colours in the right proportions. (This statement is not precisely true: the exceptions will be discussed shortly.) The three colours are called primaries.

Suppose that we use colours $R$, $G$, and $B$ as primaries (although the names suggest red, green, and blue, we do not have to use these colours as primaries). Then the claim is that, for any colour $X$, we can choose factors $\alpha$, $\beta$, and $\gamma$ such that

$$X = \alpha R + \beta G + \gamma B$$

This in itself is not very interesting. The interesting part is that the linear formula is not just a convenient way of writing: colour mixing really is linear. Suppose we have two colours $X$ and $X'$ and we have found appropriate weightings of the primary colours for them:

$$X = \alpha R + \beta G + \gamma B$$
$$X' = \alpha' R + \beta' G + \gamma' B$$

Then we can obtain $X + X'$ simply by summing the primary weights:

$$X + X' = (\alpha + \alpha') R + (\beta + \beta') G + (\gamma + \gamma') B$$

This implies that colour has the properties of a three-dimensional vector space and that any three colours form a basis for this space.

As mentioned, there are problems with this representation. The first one is obvious: the three primaries must be linearly independent. The system would not work if we could find $\alpha$ and $\beta$ such that $B = \alpha R + \beta G$.

There is a more serious problem. It is true that we can represent all colours in the form $X = \alpha R + \beta G + \gamma B$, where now $R$, $G$, and $B$ actually do stand for red, green, and blue. The problem is that some of the coefficients are negative! In fact, if we use any three visible colours as a basis, we will need negative coefficients to obtain some colours.

“Negative colours”, of course, do not exist in reality. We can interpret an equation of the form

$$C = 0.2R - 0.1G + 0.8B$$

which apparently defines a colour $C$ with “negative green”, as

$$C + 0.1G = 0.2R + 0.8B.$$

That is, we must add green to $C$ to match the equivalent colour $0.2R + 0.8B$.

There are three technical terms that we use when discussing colours.

1. The brightness of a colour is called its luminance. Colours that we consider different may in fact be the “same” colour with a different luminance. For example, olive brown is dark yellow.

2. The “colour” of a colour is called its hue. Red and green are different hues, but yellow and olive brown (see above) have the same hue and differ only in luminance.

3. Suppose that we compare a colour with a grey of the same luminance. Their closeness depends on the saturation of the colour. A highly saturated colour, such as pure red, is far from grey, but an unsaturated colour, such as pink, is close to a light shade of grey. The deep blue that we see in Finnish pottery and Islamic mosques is saturated, whereas sky blue is unsaturated.


10.4 The CIE System

The problem of negative coefficients was recognized many years ago and, in 1931, the Commission Internationale de l'Eclairage (CIE for short) introduced a system of tristimulus values for defining colour. The CIE primaries are called X, Y, and Z. Their properties are:

• They are super-saturated colours that cannot actually be perceived.

• All of the colours that we can perceive can be expressed as $x\,X + y\,Y + z\,Z$ with positive values of $x$, $y$, and $z$.

• The Y curve matches the sensitivity of the eye: it is low at the ends of the spectrum and has a peak corresponding to the dominant colour of the sun, to which our eyes are adapted. Consequently, the CIE $Y$ component of a light source is equivalent to the luminance of the source.

The CIE System enables us to describe a light source in two ways.

Tristimulus values are the actual values of the three components: $X$, $Y$, and $Z$.

Chromaticity values are the normalized versions of the tristimulus values:

$$x = \frac{X}{X + Y + Z} \qquad y = \frac{Y}{X + Y + Z} \qquad z = \frac{Z}{X + Y + Z}$$

The tristimulus values describe what we see (hence the term “tristimulus”): the colour of the light and its intensity. The chromaticity values are normalized and do not describe the brightness. However, since $x + y + z = 1$, only two are needed and, in practice, we describe a light source using $(x, y, Y)$ coordinates. From $x$, $y$, and $Y$, we can recover the other values as

$$z = 1 - x - y \qquad X = \frac{xY}{y} \qquad Z = \frac{zY}{y}$$
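As a minimal sketch (the function name is mine, not from the notes), recovering tristimulus values from $(x, y, Y)$ coordinates is a direct transcription of these formulas:

#include <array>

// Recover CIE tristimulus values (X, Y, Z) from chromaticity-plus-
// luminance coordinates (x, y, Y), using z = 1 - x - y, X = xY/y,
// Z = zY/y. Assumes y is nonzero.
std::array<double, 3> xyY_to_XYZ(double x, double y, double Y) {
    double z = 1.0 - x - y;
    return { x * Y / y,    // X
             Y,            // Y
             z * Y / y };  // Z
}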

Now imagine all of the values of $(x, y, z)$ that correspond to visible colours. (Note that “every visible colour can be represented by suitable values of $(x, y, z)$” is not the same as the converse “every value of $(x, y, z)$ corresponds to a visible colour”.) These values form a cone with very dim colours near the origin and brighter colours further out. The chromaticity values, for which $x + y + z = 1$, form a plane that intersects the cone. The appearance of the visible colours on this plane is called the CIE Chromaticity Diagram.7

The linearity property says that, if we take two points (i.e., colours) on the CIE diagram, we can obtain the points (i.e., colours) on the line between them by mixing the colours. If we take three colours forming a triangle, we can obtain any colour inside the triangle by mixing the colours. The triangle is called a gamut.

7 The course web site has links to the chromaticity diagram and an applet that allows you to experiment with it.


Since the sides of the CIE diagram are curved, we cannot find three points (corresponding to visible colours) that enclose the entire area. Consequently, any physical device that relies on three primary colours cannot generate all perceptible colours. Nevertheless, some “devices”, such as slide film, cover a large proportion of the CIE diagram.

Corresponding to each wavelength in the visible spectrum, there is a pure colour. CIE Chromaticity Coordinate tables give the XYZ components for discrete wavelengths: Figure 55 shows a typical table. For a single wavelength, we can read the XYZ components directly. For a general light source, we compute a sum. (Strictly, we should use continuous functions and integrate; the summation provides a good enough approximation and is much easier to compute.)

Suppose that our light source has a power spectrum $P(\lambda)$ for values of $\lambda$ (the wavelength) in the visible spectrum. In practice, we would measure $P(\lambda)$ at discrete values, such as $\lambda = 380, 385, \ldots, 825$ if we were using the table in Figure 55. To obtain the tristimulus values corresponding to this light source, we compute

$$X = \sum_\lambda P(\lambda)\,x_\lambda \qquad
Y = \sum_\lambda P(\lambda)\,y_\lambda \qquad
Z = \sum_\lambda P(\lambda)\,z_\lambda$$

where $x_\lambda$, $y_\lambda$, and $z_\lambda$ are taken from the table in Figure 55.
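A minimal sketch of this summation (the types and names are mine): given the spectrum sampled at the same wavelengths as the table rows, the tristimulus values are three running sums.

#include <vector>

struct CIERow { double lambda, x, y, z; };  // one row of Figure 55

// Accumulate X = sum P(lambda) x_lambda, and similarly Y and Z.
// P must be sampled at the same wavelengths as the table.
void tristimulus(const std::vector<CIERow>& table,
                 const std::vector<double>& P,
                 double& X, double& Y, double& Z) {
    X = Y = Z = 0.0;
    for (std::size_t i = 0; i < table.size(); ++i) {
        X += P[i] * table[i].x;
        Y += P[i] * table[i].y;
        Z += P[i] * table[i].z;
    }
}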

10.4.1 Using Gamuts

Devices for printing or displaying colour use three or more sources of colour in combination. We will consider only computer monitors, which have three guns firing electrons at phosphors (it is the phosphors that create the colours, not the electrons) or use LCDs to display colour. (Colour television uses the same principle.) Both technologies are based on red, green, and blue (RGB) primaries.

We can represent the colours of the primaries (without their luminance) as coordinates on the CIE Chromaticity Diagram. Suppose that the coordinates are $(x_r, y_r)$, $(x_g, y_g)$, and $(x_b, y_b)$. Let

$$C_i = X_i + Y_i + Z_i \qquad z_i = 1 - x_i - y_i \qquad X_i = x_i C_i \qquad Y_i = y_i C_i \qquad Z_i = z_i C_i$$

where $i \in \{r, g, b\}$ and $(X_i, Y_i, Z_i)$ are the XYZ coordinates of the colours that the monitor can display. Then the relationship between the XYZ coordinates and RGB values (that is, the signals we send to the device) is:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
\begin{bmatrix} X_r & X_g & X_b \\ Y_r & Y_g & Y_b \\ Z_r & Z_g & Z_b \end{bmatrix}
\cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix}
\cdot
\begin{bmatrix} C_r & 0 & 0 \\ 0 & C_g & 0 \\ 0 & 0 & C_b \end{bmatrix}
\cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix}$$

[Figure 55: CIE Chromaticity Coordinates — a table of $x_\lambda$, $y_\lambda$, $z_\lambda$ values for wavelengths 380 nm to 825 nm in 5 nm steps; the full data are not reproduced here.]

(To see how this relationship is derived, consider the RGB signals $(1, 0, 0)$ (pure red), $(0, 1, 0)$ (pure green), and $(0, 0, 1)$ (pure blue).)

The characteristics of a particular device are defined by its values of $C_r$, $C_g$, and $C_b$. These can be obtained in either of two ways:

1. We can use a photometer to measure the luminance levels $Y_r$, $Y_g$, and $Y_b$ directly with the monitor set to maximum brightness for the corresponding colour. Then:

$$C_r = \frac{Y_r}{y_r} \qquad C_g = \frac{Y_g}{y_g} \qquad C_b = \frac{Y_b}{y_b}$$

2. A more common method is to measure the XYZ coordinates $(X_w, Y_w, Z_w)$ of the monitor's white (that is, RGB coordinates $(1, 1, 1)$) and then solve the following equation for the $C_i$'s:

$$\begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} =
\begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix}
\cdot
\begin{bmatrix} C_r \\ C_g \\ C_b \end{bmatrix}$$

In most cases, the value we know is actually $(x_w, y_w, Y_w)$ — that is, the $(x, y)$ position on the Chromaticity Diagram and the luminance of white. In this case, the equation above has the following solution:

$$C_r = k\,[x_w(y_g - y_b) - y_w(x_g - x_b) + x_g y_b - x_b y_g]$$
$$C_g = k\,[x_w(y_b - y_r) - y_w(x_b - x_r) - x_r y_b + x_b y_r]$$
$$C_b = k\,[x_w(y_r - y_g) - y_w(x_r - x_g) + x_r y_g - x_g y_r]$$

where

$$k = \frac{Y_w}{y_w\,[x_r(y_g - y_b) + x_g(y_b - y_r) + x_b(y_r - y_g)]}.$$
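A minimal sketch of this calibration step (the struct and function names are mine): given the primaries' chromaticities and the white point $(x_w, y_w, Y_w)$, the closed-form solution above transcribes directly.

struct Chroma { double x, y; };  // chromaticity coordinates of a primary

// Solve for the monitor constants C_r, C_g, C_b from the primaries
// (r, g, b) and the white point (xw, yw, Yw), using the closed-form
// solution given above.
void monitorConstants(Chroma r, Chroma g, Chroma b,
                      double xw, double yw, double Yw,
                      double& Cr, double& Cg, double& Cb) {
    double k = Yw / (yw * (r.x * (g.y - b.y)
                         + g.x * (b.y - r.y)
                         + b.x * (r.y - g.y)));
    Cr = k * (xw * (g.y - b.y) - yw * (g.x - b.x) + g.x * b.y - b.x * g.y);
    Cg = k * (xw * (b.y - r.y) - yw * (b.x - r.x) - r.x * b.y + b.x * r.y);
    Cb = k * (xw * (r.y - g.y) - yw * (r.x - g.x) + r.x * g.y - g.x * r.y);
}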

The International Electrotechnical Commission has a standard (IEC 61966-2-1) that defines the “D65 white point” (so called because it corresponds to a black body radiating at 6500 K) with chromaticity values (0.3127, 0.3290, 0.3583). The following table shows chromaticity values for a typical monitor:

Colour   Coordinates          x      y      z
Red      $(x_r, y_r, z_r)$    0.628  0.346  0.026
Green    $(x_g, y_g, z_g)$    0.268  0.588  0.144
Blue     $(x_b, y_b, z_b)$    0.150  0.070  0.780

The RGB/XYZ mappings for this monitor are:

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix}
3.240479 & -1.537150 & -0.498535 \\
-0.969256 & 1.875992 & 0.041556 \\
0.055648 & -0.204043 & 1.057311
\end{bmatrix}
\cdot
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$

and

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
\begin{bmatrix}
0.412453 & 0.357580 & 0.180423 \\
0.212671 & 0.715160 & 0.072169 \\
0.019334 & 0.119193 & 0.950227
\end{bmatrix}
\cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
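A minimal sketch of applying such a mapping (helper names are mine; the coefficients are the XYZ-to-RGB matrix tabulated above):

#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Multiply a 3x3 colour matrix by a column vector.
Vec3 mul(const Mat3& m, const Vec3& v) {
    Vec3 out{0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            out[i] += m[i][j] * v[j];
    return out;
}

// XYZ -> RGB matrix for the typical monitor above.
const Mat3 XYZ_to_RGB = {{
    {{ 3.240479, -1.537150, -0.498535}},
    {{-0.969256,  1.875992,  0.041556}},
    {{ 0.055648, -0.204043,  1.057311}},
}};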

10.5 Other Colour Systems

There are several other systems for representing colour. All of those that we describe here are linearly related to the CIE system. In other words, we can transform from one system to any other system using a $3 \times 3$ matrix.

10.5.1 RGB

The RGB system uses red, green, and blue as primaries, with coefficients between 0 and 1. We can visualize RGB colours as a cube with black ($(0, 0, 0)$) at one corner and white ($(1, 1, 1)$) at the opposite corner. Other corners are coloured red, green, blue, cyan (green + blue), magenta (red + blue), and yellow (red + green). RGB is used for computer graphics because cathode-ray monitors have red, green, and blue phosphors, and LCD monitors have been designed for compatibility.

10.5.2 CMY

The CMY system is the inverse of RGB and it is used for printing, where colours are subtractive rather than additive. The letters stand for cyan, magenta, and yellow. The relationship is simply

$$\begin{bmatrix} C \\ M \\ Y \end{bmatrix} =
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} -
\begin{bmatrix} R \\ G \\ B \end{bmatrix}$$

CMY is often extended to CMYK, where K is black. This is simply an economy, because CMYK is used for printers and it is cheaper to print black with black ink than to achieve an approximate black by mixing cyan, magenta, and yellow.
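A minimal sketch (names are mine): CMY is the complement of RGB, and one common convention for choosing K — pulling the shared grey component out as black — is shown below. The K extraction rule is a standard trick, not something the notes specify.

#include <algorithm>

struct CMYK { double c, m, y, k; };

// Convert RGB in [0, 1] to CMYK. K is taken as the common grey
// component of C, M, Y (a common convention, not from the notes).
CMYK rgbToCmyk(double r, double g, double b) {
    double c = 1.0 - r, m = 1.0 - g, y = 1.0 - b;
    double k = std::min({c, m, y});   // printable as pure black ink
    return { c - k, m - k, y - k, k };
}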

High quality colour printers extend the range still further, usually by adding light cyan and light magenta inks, giving a total of six different inks.

10.5.3 YIQ

When colour television was introduced, there were many users with monochrome (so called “black and white” or b/w) receivers. Broadcast engineers had to solve three problems:

• Transmit a colour signal

• Ensure compatibility: a b/w receiver must be able to produce a reasonable picture from a colour signal

• Ensure recompatibility: a colour receiver must be able to reproduce a b/w signal (e.g., an old movie)


Transmitting an RGB signal does not work because it is not compatible: a b/w receiver shows RGB as shades of grey, with no deep blacks or bright whites. The YIQ system was adopted by the US National Television System Committee (NTSC) for colour TV. The YIQ colour solid is a linear transformation of the RGB cube. Its purpose is to exploit certain characteristics of the human eye to maximize the utilization of a fixed bandwidth. The human visual system is more sensitive to changes in luminance than to changes in hue or saturation, and thus a wider bandwidth should be dedicated to luminance than to colour information. Y is similar to perceived luminance; I and Q carry colour information and some luminance information. The Y signal usually has 4.2 MHz bandwidth in a 525 line system. Originally, I and Q had different bandwidths (1.5 and 0.6 MHz), but now they commonly have the same bandwidth of 1 MHz. [Adapted from information on Nan C. Schaller's web page.]

The CIE values for the standard NTSC phosphors are $(0.67, 0.33)$ for red, $(0.21, 0.71)$ for green, and $(0.14, 0.08)$ for blue. The white point is at $(x_w, y_w, Y_w) = (0.31, 0.316, 1.0)$. The equations for converting between YIQ and RGB are

$$\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} =
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
0.596 & -0.275 & -0.321 \\
0.212 & -0.523 & 0.311
\end{bmatrix}
\cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix}$$

and

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix}
1 & 0.956 & 0.621 \\
1 & -0.272 & -0.647 \\
1 & -1.105 & 1.702
\end{bmatrix}
\cdot
\begin{bmatrix} Y \\ I \\ Q \end{bmatrix}.$$

The ranges are $0 \le Y \le 1$, $-1 \le I \le 1$ and $-1 \le Q \le 1$.
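A minimal sketch of the forward conversion (the function name is mine; the coefficients are the NTSC matrix above):

// RGB -> YIQ using the NTSC matrix given above.
void rgbToYiq(double r, double g, double b,
              double& y, double& i, double& q) {
    y = 0.299 * r + 0.587 * g + 0.114 * b;
    i = 0.596 * r - 0.275 * g - 0.321 * b;
    q = 0.212 * r - 0.523 * g + 0.311 * b;
}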

10.6 Gamma Correction

The response of a monitor is nonlinear. In the approximation

$$I = k V^\gamma$$

$I$ is the light intensity seen by the viewer, $k$ is a constant, $V$ is the voltage at the electron gun, and $\gamma$ is another constant. Since it is $\gamma$ that causes the nonlinearity, we need gamma correction to compensate for it. Typical values are $2.3 < \gamma < 2.6$, with $\gamma = 2.2$ often being assumed.
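A minimal sketch (the function name is mine): to compensate, we raise the desired intensity to the power $1/\gamma$ before sending it to the display, so that $I = kV^\gamma$ comes out proportional to the intended value.

#include <cmath>

// Gamma-correct a desired intensity in [0, 1] before display,
// using the commonly assumed gamma = 2.2.
double gammaCorrect(double intensity, double gamma = 2.2) {
    return std::pow(intensity, 1.0 / gamma);
}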


11 Advanced Techniques

Previous sections have focused mainly on rendering techniques that are provided by OpenGL. In this section, we look briefly at two techniques that are not provided directly by OpenGL, although they can be simulated. Amongst serious graphics programmers, OpenGL is considered to be a rather simple system with rather limited capabilities. The important advantage of OpenGL, and the reason for its popularity, is that it is simple and fast. The techniques described in this section are very slow by comparison. Although they are acceptably fast with appropriate simplifications and modern hardware, the earliest implementations required hundreds, or in some cases thousands, of hours of mainframe computer time to produce good images.

11.1 Ray-Tracing

OpenGL computes the colour of each vertex of the scene. This is a potentially wasteful process, since many vertexes never reach the viewing window: they may be clipped because they are outside the viewing volume or invisible because there is an object between them and the viewer.

Ray-tracing avoids this source of inefficiency by computing the colour of each pixel on the screen. This avoids wasting time by computing the colour of invisible objects, but ray-tracing introduces inefficiencies of its own.

[Figure 56: Ray Tracing — the viewer V, a screen pixel P, and the points O1, O2, O3, O4 where the ray through P meets objects in the scene.]

Figure 56 illustrates the basic ideas of ray-tracing. The viewer is at $V$ and $P$ is a pixel on the screen. A line drawn from $V$ to $P$ and extended into the scene meets objects in the scene at points $O_1$, $O_2$, $O_3$, and $O_4$. The colour at pixel $P$, from the point of view of the observer at $V$, must come from point $O_1$, since the other points on the line are hidden.

The line $PV$ is called a ray. Although the light actually travels from the object $O_1$ to the viewer $V$, the calculation traces the ray backwards, from the viewer to the object. Thus the basic ray-tracing algorithm, in pseudocode, is:

for each pixel P:
    Construct the line VP from the viewer V through P.
    Find the points O1, O2, O3, ..., On where VP meets surfaces in the scene.
    Find Omin, the closest Oi to the viewer.
    Compute the colour at Omin and set the colour of pixel P to this value.

To make this more precise, we introduce some vectors:

e = the displacement of the viewer's eye from the origin
n = the direction of the viewer with respect to the centre of the scene
v = the vertical direction (“up vector”)
u = the direction to the right on the viewing screen

The unit vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{n}$ form a right-handed coordinate system corresponding to XYZ in OpenGL camera coordinates.

Suppose the screen has width $W$, with number of columns $N_c$, and height $H$ with number of rows $N_r$. The horizontal displacement depends only on the column number and the vertical displacement depends only on the row number. Thus for the pixel at column $c$ and row $r$, we have

$$u_c = W\left(\frac{2c}{N_c} - 1\right) \qquad v_r = H\left(\frac{2r}{N_r} - 1\right)$$

If we assume that the position of the screen with respect to the origin is $N$ in the $-\mathbf{n}$ direction (as in OpenGL) then the pixel with screen coordinates $(r, c)$ has position

$$\mathbf{p} = \mathbf{e} - N\,\mathbf{n} + u_c\,\mathbf{u} + v_r\,\mathbf{v}$$

and the parametric equation of the line joining the viewer's eye to this point is

$$L(t) = \mathbf{e}(1 - t) + (\mathbf{e} - N\,\mathbf{n} + u_c\,\mathbf{u} + v_r\,\mathbf{v})\,t \tag{25}$$

In this equation, $t = 0$ corresponds to the eye position, $\mathbf{e}$, and $t = 1$ corresponds to the pixel on the screen. Consequently, points in the scene on this line will have values $t > 1$, and larger values of $t$ correspond to greater distance from the viewer.

We can write (25) in the form

$$L(t) = \mathbf{e} + \mathbf{d}\,t \tag{26}$$

where

$$\mathbf{d} = -N\,\mathbf{n} + W\left(\frac{2c}{N_c} - 1\right)\mathbf{u} + H\left(\frac{2r}{N_r} - 1\right)\mathbf{v}$$

Since $\mathbf{e}$ is fixed, it is necessary only to calculate $\mathbf{d}$ to find the line equation for each pixel.
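A minimal sketch of this per-pixel setup (the vector type and function name are mine): compute $\mathbf{d}$ for the pixel at column $c$ and row $r$ directly from the formula above; the ray is then $L(t) = \mathbf{e} + \mathbf{d}\,t$.

struct Vec3 { double x, y, z; };

Vec3 operator*(double s, const Vec3& v) { return {s * v.x, s * v.y, s * v.z}; }
Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

// Ray direction for the pixel at column c, row r:
//   d = -N n + W (2c/Nc - 1) u + H (2r/Nr - 1) v
// where u, v, n are the camera basis vectors, W and H the screen
// dimensions, and N the screen distance.
Vec3 rayDirection(int c, int r, int Nc, int Nr,
                  double W, double H, double N,
                  Vec3 u, Vec3 v, Vec3 n) {
    return (-N) * n
         + (W * (2.0 * c / Nc - 1.0)) * u
         + (H * (2.0 * r / Nr - 1.0)) * v;
}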

We can now write the ray-tracing algorithm more precisely:

for (r = 0; r < Nr; r++)
    for (c = 0; c < Nc; c++)
        find L(t)
        for each surface S in the scene such that L(ts) ∈ S
            store min{ts}


Finding the Intersections The next problem is computing the intersections. If the surface is a simple mathematical object, such as a cube (or general rectangular shape), a cylinder, or a sphere (or general ellipsoid), we can compute the intersection using a formula, as we will show below. This means that, for simple objects, ray-tracing is an exact method, in contrast to OpenGL, in which objects are modelled by a polygonal mesh. This is one reason why ray-traced scenes look sharper and clearer than scenes rendered by OpenGL.

Here is one approach to computing the intersections. We assume that the surface is the set of points that satisfy an equation of the form $F(\mathbf{p}) = 0$. That is, the surface $S$ consists of the points $S = \{\, \mathbf{p} \mid F(\mathbf{p}) = 0 \,\}$. To find the intersection of the ray $L(t)$ with the surface $F(\mathbf{p}) = 0$, we solve the equation

$$F(L(t)) = 0$$

for $t$. From (26) above, this is equivalent to solving

$$F(\mathbf{e} + \mathbf{d}\,t) = 0$$

for $t$.

Transformations It is only on rare occasions that we will need to draw a unit sphere at the origin. How should we handle the case of a general sphere with an equation like $(x - a)^2 + (y - b)^2 + (z - c)^2 = r^2$? We could set up the equation and solve it as above, but there is an alternative way. Just as in OpenGL, we can use translating, rotating, and scaling transformations to transform the unit sphere into a general ellipsoid.

Suppose that we have a canonical surface (e.g., a unit sphere at the origin) $F(\mathbf{p}) = 0$ and a transformation $T$ that transforms it into something more general (e.g., a football flying through a goal mouth). Suppose also that $\mathbf{q}$ is a point on the transformed object, so that $T(\mathbf{p}) = \mathbf{q}$. Given $\mathbf{q}$, we can find $\mathbf{p}$ by inverting the transformation: $\mathbf{p} = T^{-1}\mathbf{q}$. Since $F(\mathbf{p}) = 0$, it follows that $F(T^{-1}\mathbf{q}) = 0$. In other words, the transformed surface is the set of points

$$\{\, \mathbf{q} \mid F(T^{-1}\mathbf{q}) = 0 \,\}.$$

The method for finding the intersection of a ray with a canonical object $F$ that has been transformed by $T$ is therefore to solve

$$F(T^{-1}(\mathbf{e} + \mathbf{d}\,t)) = 0.$$

As an optimization we note that, since the transform and its inverse are usually linear, the equation can be written

$$F(T^{-1}(\mathbf{e}) + T^{-1}(\mathbf{d})\,t) = 0$$

in which $T^{-1}(\mathbf{e})$ is a constant (that is, independent of $t$).

Examples

• Suppose that the object we are viewing is a plane. The general equation of a plane is $Ax + By + Cz + D = 0$ for particular values of the constants $A$, $B$, $C$, and $D$. Thus

$$F(\mathbf{p}) = A p_x + B p_y + C p_z + D$$

and, with $\mathbf{p} = L(t) = \mathbf{e} + \mathbf{d}\,t$, the equation $F(L(t)) = 0$ becomes

$$A(e_x + d_x t) + B(e_y + d_y t) + C(e_z + d_z t) + D = 0.$$

This is a linear equation and its solution is

$$t = -\frac{A e_x + B e_y + C e_z + D}{A d_x + B d_y + C d_z}.$$
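A minimal sketch of the ray-plane case (types and names are mine): solve the linear equation above, rejecting rays parallel to the plane and hits that are not beyond the screen ($t \le 1$, by the convention established earlier).

#include <optional>

struct Vec3 { double x, y, z; };

// Intersect the ray e + d t with the plane Ax + By + Cz + D = 0.
std::optional<double> hitPlane(double A, double B, double C, double D,
                               Vec3 e, Vec3 d) {
    double denom = A * d.x + B * d.y + C * d.z;
    if (denom == 0.0) return std::nullopt;   // ray parallel to plane
    double t = -(A * e.x + B * e.y + C * e.z + D) / denom;
    if (t <= 1.0) return std::nullopt;       // not in front of the screen
    return t;
}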

• If $\mathbf{p}$ has components $(x, y, z)$ and

$$F(\mathbf{p}) = x^2 + y^2 + z^2 - 1$$

then the set $\{\, \mathbf{p} \mid F(\mathbf{p}) = 0 \,\}$ is a unit sphere centered at the origin. We call this the canonical sphere and we obtain other spheres by scaling and translating the canonical sphere.

For example, suppose we want a sphere with radius 3 at $(2, 4, 6)$. The required transformations are

$$T = \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\qquad \text{and} \qquad
S = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Their product is

$$T \cdot S = \begin{bmatrix} 3 & 0 & 0 & 2 \\ 0 & 3 & 0 & 4 \\ 0 & 0 & 3 & 6 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

and the inverse of this matrix is

$$(T \cdot S)^{-1} = \begin{bmatrix}
\frac{1}{3} & 0 & 0 & -\frac{2}{3} \\
0 & \frac{1}{3} & 0 & -\frac{4}{3} \\
0 & 0 & \frac{1}{3} & -2 \\
0 & 0 & 0 & 1
\end{bmatrix}$$

Applying this matrix to the (homogenized) point $\mathbf{e} + \mathbf{d}\,t$ gives

$$\mathbf{q} = \left(
\tfrac{1}{3}e_x + \tfrac{1}{3}d_x t - \tfrac{2}{3},\;
-\tfrac{4}{3} + \tfrac{1}{3}e_y + \tfrac{1}{3}d_y t,\;
-2 + \tfrac{1}{3}e_z + \tfrac{1}{3}d_z t
\right)$$


and the equation $F(\mathbf{q}) = 0$ is obtained as

$$\left(\tfrac{1}{3}e_x + \tfrac{1}{3}d_x t - \tfrac{2}{3}\right)^2
+ \left(-\tfrac{4}{3} + \tfrac{1}{3}e_y + \tfrac{1}{3}d_y t\right)^2
+ \left(-2 + \tfrac{1}{3}e_z + \tfrac{1}{3}d_z t\right)^2 - 1 = 0$$

which is a quadratic in $t$:

$$\tfrac{1}{9}\left(d_x^2 + d_y^2 + d_z^2\right) t^2
+ \left(\tfrac{2}{9}e_x d_x - \tfrac{4}{9}d_x + \tfrac{2}{9}e_y d_y - \tfrac{8}{9}d_y + \tfrac{2}{9}e_z d_z - \tfrac{4}{3}d_z\right) t
+ \tfrac{1}{9}\left(e_x^2 + e_y^2 + e_z^2\right) - \tfrac{4}{9}e_x - \tfrac{8}{9}e_y - \tfrac{4}{3}e_z + \tfrac{47}{9} = 0.$$

This equation may have no roots (the ray misses the sphere), one root (the ray is tangent to the sphere), or two roots (the ray intersects the sphere twice and we take the smallest root).
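A minimal sketch of the general pattern (types and names are mine): rather than expanding the quadratic for each transformed sphere by hand, transform $\mathbf{e}$ (as a point) and $\mathbf{d}$ (as a direction, by the linear part of the inverse, since a homogenized direction has $w = 0$) and intersect with the canonical unit sphere.

#include <cmath>
#include <optional>

struct Vec3 { double x, y, z; };

// Intersect the ray e + d t with the canonical sphere x^2+y^2+z^2 = 1.
// For a transformed sphere, pass e and d already mapped through the
// inverse transformation, as described above.
std::optional<double> hitUnitSphere(Vec3 e, Vec3 d) {
    double a = d.x * d.x + d.y * d.y + d.z * d.z;
    double b = 2.0 * (e.x * d.x + e.y * d.y + e.z * d.z);
    double c = e.x * e.x + e.y * e.y + e.z * e.z - 1.0;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return std::nullopt;            // ray misses sphere
    double t = (-b - std::sqrt(disc)) / (2.0 * a);  // smaller root = nearer hit
    if (t <= 1.0) return std::nullopt;              // behind the screen
    return t;
}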

We have computed the point where the ray meets the object, but we still have to compute the illumination at that point. This means that we must be able to find the normal to the surface of the object at the point of intersection with the ray. This normal is in fact the normal to the generic surface transformed by the inverse transpose of $T$. If the generic surface has a simple normal calculation — as many of them do — the normal is easy to find.

11.1.1 Recursive Ray Tracing

For detailed lighting calculations, including shadows, reflection, and refraction, we can apply the ray-tracing algorithm recursively.

[Figure 57: Lighting in the ray-tracing model — a ray from the viewer V through pixel P meets a surface at O; light source L1 is blocked by an intervening object, while L2 illuminates O.]

In Figure 57, the ray meets a surface at $O$. There are two light sources, $L_1$ and $L_2$. The source $L_1$ does not in fact illuminate the surface at $O$ because there is another object between $L_1$ and $O$. The source $L_2$, however, does illuminate $O$. If we take into account objects that block the light, we will achieve the effect of shadows without further effort.

In order to find out whether a light source is blocked by an obstacle, we apply the ray-tracing algorithm recursively. We emit a ray from $O$ in the direction of the source $L_1$ and determine whether it meets any surfaces before $L_1$. Repeating this for each light source enables us to calculate the contribution of each source to the colour at $O$. If there is a possibility that the obstacle might be reflective, we can recurse again to find out how much light it contributes.

We can use a similar technique for refraction by a transparent object. When the ray emitted from $O$ meets a transparent surface, it sends out two further rays, one for reflected and one for refracted light. (As usual, the rays are going backwards, in the opposite direction to the simulated light.) In a complete ray-tracing calculation, the intensity at a point is the sum of:

• ambient light,

• diffuse light,

• specular light,

• reflected light, and

• refracted light.

Here is simplified pseudocode for a complete ray-tracing system. The function shade is passed a Ray and returns the colour of a single pixel. The function hit finds the first surface that the ray intersects.

Colour shade(Ray r)
    obj = r.hit()
    Colour col
    col.set(emissive light)
    col.set(ambient light)
    for each light source
        col.add(diffuse and specular contribution, if the source is not blocked)
    if obj is shiny
        ref = reflected ray
        col.add(shininess * shade(ref))
    if obj is transparent
        trn = transmitted ray
        col.add(transparency * shade(trn))
    return col

11.1.2 Summary

We can use the techniques that we have seen before (Gouraud and Phong shading) but the precision of ray-tracing makes it desirable to use more sophisticated lighting models. We will describe briefly one of these models, due to Cook, with a later improvement by Torrance, and usually called the Cook-Torrance lighting model.

The Cook-Torrance model assumes that a rough surface consists of many small facets and that each facet is almost mirror-like. Surfaces at an angle $\delta$ reflect light back to the viewer.


The distribution of $\delta$ is given by

$$D(\delta) = \frac{e^{-\left(\frac{\tan\delta}{m}\right)^2}}{4\,m^2 \cos^4\delta}$$

where $m$ is a roughness factor.
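A minimal sketch (the function name is mine): the facet distribution transcribes directly.

#include <cmath>

// Facet-angle distribution D(delta) from the Cook-Torrance model:
// D = exp(-(tan(delta)/m)^2) / (4 m^2 cos^4(delta)),
// where m is the roughness factor.
double facetDistribution(double delta, double m) {
    double t = std::tan(delta) / m;
    double c = std::cos(delta);
    return std::exp(-t * t) / (4.0 * m * m * c * c * c * c);
}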

It is easy to apply textures in a ray-tracing system: the hit coordinates are mapped directly to texture coordinates.

Ray tracing is good for:

• specular reflection

• refraction

• transparency

• ambient lighting

• shadows

It is not so good for diffuse lighting. However, radiosity does a good job of diffuse lighting.

11.2 Radiosity

There are some similarities between radiosity and ray-tracing. Both models are based on physical principles; both are based on simple ideas, yet a good implementation is quite complex; both require large amounts of computation. In other ways they are quite different: radiosity handles diffuse light best, whereas ray-tracing is best for specular light; radiosity computes illumination independently of the point of view, whereas ray-tracing is completely determined by the point of view.

The basic assumption of radiosity is that all surfaces emit light. If this seems surprising, look around you. There are very few objects in direct view that you cannot see, which implies that they are all emitting light.

Radiosity techniques divide the scene into patches. A patch is simply an area of a surface: it can be large or small, flat or curved. It is helpful to think of a patch as being a small, flat area, but this is not necessarily the case. We assume that each patch is an opaque, diffuse emitter and reflector of light.

Then:

$$B_i = E_i + \rho_i \sum_j B_j F_{ji} \frac{A_j}{A_i} \tag{27}$$

In this equation:

$B_i$ = the radiosity of patch $i$ = light reflected by patch $i$ (W/m²)
$E_i$ = the emissivity of patch $i$ = light directly emitted by patch $i$ (W/m²)
$\rho_i$ = reflection coefficient of patch $i$ (a dimensionless number)
$A_i$ = the area of patch $i$
$F_{ji}$ = the form factor coupling patch $j$ to patch $i$, defined below

Equation (27) says that the amount of light emitted by patch $i$ consists of the amount it emits directly, $E_i$, plus the sum of the amount of light it receives from other patches and reflects.

For most patches, $E_i = 0$. If $E_i > 0$, then patch $i$ is a light source.

The amount of light that patch $i$ reflects is proportional to its reflection coefficient, $\rho_i$. Then, for each other patch $j$, the amount of light depends on: the emission from patch $j$, $B_j$; the form factor, $F_{ji}$; and the relative areas, $A_i$ and $A_j$.

$B_j F_{ji}$ is defined to be the amount of light leaving a unit area of patch $j$ that reaches all of patch $i$. What we actually need is the amount of light arriving at a unit area of patch $i$ from all of patch $j$. This explains the factor $\frac{A_j}{A_i}$.

$F_{ji}$ expresses the optical “coupling” between patches $j$ and $i$. It is high if the patches are close together and parallel, and small if they are far away or not parallel. If one patch faces away from the other, the coupling is zero. In most cases, $F_{ii} = 0$ (a patch is not coupled to itself) but, if a patch is a concave curved surface, a small amount of self-coupling is possible.

For diffuse light, we can show that $A_i F_{ij} = A_j F_{ji}$ or

$$F_{ji} \frac{A_j}{A_i} = F_{ij} \tag{28}$$

Substituting (28) into (27) gives

$$B_i = E_i + \rho_i \sum_j B_j F_{ij}$$

which we can rearrange to give the radiosity equation:

$$B_i - \rho_i \sum_j B_j F_{ij} = E_i \tag{29}$$

The important feature of the radiosity equation (29) is that it is simply a set of linear simultaneous equations in $B_1, B_2, \ldots, B_n$. If we know the constants $\rho_i$, $E_i$, and $F_{ij}$, we can solve these equations and determine the radiosity of each patch.
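A minimal sketch of one way to solve this system (names are mine): the fixed-point iteration below starts from the emitters only and repeatedly applies $B_i = E_i + \rho_i \sum_j F_{ij} B_j$, mirroring the incremental scheme described in Section 11.2.3.

#include <vector>

// Iteratively solve B_i - rho_i * sum_j F_ij B_j = E_i.
// Each pass adds one more bounce of reflected light; the values
// stabilize quickly because higher-order reflections carry little energy.
std::vector<double> solveRadiosity(const std::vector<std::vector<double>>& F,
                                   const std::vector<double>& rho,
                                   const std::vector<double>& E,
                                   int iterations = 50) {
    std::size_t n = E.size();
    std::vector<double> B = E;   // first approximation: emitters only
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> next(n);
        for (std::size_t i = 0; i < n; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < n; ++j)
                sum += F[i][j] * B[j];
            next[i] = E[i] + rho[i] * sum;
        }
        B = next;
    }
    return B;
}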

So far, we have considered only the intensity of light. In practice, the constants $\rho_i$ and $E_i$ will have different values at different wavelengths and we will have to perform the calculation at several (typically three) wavelengths to obtain a result with colours. (Note that $F_{ji}$ does not depend on the wavelength but only on the geometry of the scene.)

11.2.1 Computing Form Factors

The hardest part of radiosity is the calculation of the form factors $F_{ij}$. We consider a small part of each patch, $dA_i$ and $dA_j$, which we can assume to be flat. Let:

$L$ = the line joining these areas
$r$ = the length of $L$
$\theta_i$ = the angle between $L$ and the normal to $dA_i$
$\theta_j$ = the angle between $L$ and the normal to $dA_j$
$H_{ij}$ = 1 if $dA_i$ is visible from $dA_j$ and 0 otherwise

Then

$$dF_{di,dj} = \frac{\cos\theta_i \cos\theta_j}{\pi r^2} H_{ij}\, dA_j$$

Integrating over $A_j$ gives

$$F_{di,j} = \int_{A_j} \frac{\cos\theta_i \cos\theta_j}{\pi r^2} H_{ij}\, dA_j$$

and a second integration over $A_i$ gives

$$F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j} \frac{\cos\theta_i \cos\theta_j}{\pi r^2} H_{ij}\, dA_j\, dA_i$$

A naive implementation is likely to be inefficient. If there are $N$ patches, we have to compute $N^2$ double integrals. Since $N$ may be in the hundreds or even thousands, this will be time consuming.

The Cohen-Greenberg algorithm computes fairly good approximations to $F_{ij}$ with much less work. The idea of the algorithm is to pre-compute values for small squares on a semi-cube that encloses the patch and to use these values in the computation of $F_{ij}$.

11.2.2 Choosing Patches

Since the computation time increases as $N^2$ for $N$ patches, choosing the right patches is a crucial step. To obtain reasonable efficiency, we would like to have large patches where the light is uniform and smaller patches where it is non-uniform. Once again, recursion comes to the rescue. A practical radiosity algorithm works like this:

1. Divide the scene into a small number of large patches.

2. Calculate the radiosity of each patch.

3. Estimate the radiosity variation across each patch (e.g., by looking at the radiosity of its neighbours).

4. If the radiosity gradient across a patch is larger than a given threshold, split the patch into two smaller patches.

5. If any patches were split in step 4, repeat from step 2.

In practice, the efficiency of the algorithm is improved by not repeating all the calculations in step 2, because some values will not have been changed significantly by splitting patches.


11.2.3 Improvements

Practical versions of radiosity algorithms are incremental. Initially, all $B_i$ are assumed to be zero. Equation (29) is used to calculate the first approximation to $B_i$, using only the non-zero $E_i$s. The second iteration takes into account one reflection, the third iteration takes into account two reflections, and so on. In practice, the equations stabilize fairly quickly, because third and higher order reflections have very little energy. The procedure stops when an iteration produces very little change.

Radiosity does a good job of diffuse lighting, and it automatically provides ambient lighting. It is possible to account for specular lighting by including directional effects in the calculation of $F_{ij}$, but the computational overhead usually makes this impractical.

However, as we have seen, ray-tracing does a good job of specular light. It is possible to combine radiosity and ray-tracing to get the best of both models. A typical calculation goes like this:

1. Compute radiosity values for the scene, giving ambient and diffuse lighting that is independent of the view point.

2. Project the scene onto a viewing window.

3. Perform ray-tracing to obtain specular lights, reflection, and refraction.

11.3 Bump Mapping

Consider an orange. It is approximately a sphere, but has an irregular surface. It would be expensive to model an orange exactly, because a very large number of polygons would be needed to model the surface accurately. It is possible, however, to obtain the illusion of an orange by drawing a sphere with incorrect normals. The normals interact with the lighting calculations to give the effect of a dimpled surface.

The technique is called bump mapping and it was introduced by James Blinn in 1978.

Suppose that we have a surface defined with parametric coordinates. Points on the surface are $\mathbf{p}(u, v)$ for various values of $u$ and $v$ (which are essentially the same as texture coordinates). For example, we can define a sphere with radius $r$ centered at the origin as

$$\mathbf{p}(u, v) = (r \sin u \cos v,\; r \sin u \sin v,\; r \cos u).$$

Bump mapping perturbs the true surface by adding a bump function to it:

$$\mathbf{p}'(u, v) = \mathbf{p}(u, v) + b(u, v)\,\mathbf{n}$$

where

$$\mathbf{N} = \frac{\partial \mathbf{p}}{\partial u} \times \frac{\partial \mathbf{p}}{\partial v}
\qquad \text{and} \qquad
\mathbf{n} = \frac{\mathbf{N}}{|\mathbf{N}|}$$

is the unit normal vector at $\mathbf{p}$. The perturbed normal vector is

$$\mathbf{N}' = \frac{\partial \mathbf{p}'}{\partial u} \times \frac{\partial \mathbf{p}'}{\partial v}.$$

We have

$$\frac{\partial \mathbf{p}'}{\partial u}
= \frac{\partial}{\partial u}(\mathbf{p} + b\,\mathbf{n})
= \frac{\partial \mathbf{p}}{\partial u} + \mathbf{n}\frac{\partial b}{\partial u} + b\frac{\partial \mathbf{n}}{\partial u}.$$

If we assume that $b$ is small (that is the idea of a “bump” mapping), we can neglect the last term, giving for $u$ and $v$:

$$\frac{\partial \mathbf{p}'}{\partial u} \approx \frac{\partial \mathbf{p}}{\partial u} + \mathbf{n}\frac{\partial b}{\partial u}
\qquad
\frac{\partial \mathbf{p}'}{\partial v} \approx \frac{\partial \mathbf{p}}{\partial v} + \mathbf{n}\frac{\partial b}{\partial v}.$$

The perturbed surface normal vector is

$$\mathbf{N}' = \frac{\partial \mathbf{p}'}{\partial u} \times \frac{\partial \mathbf{p}'}{\partial v}
= \frac{\partial \mathbf{p}}{\partial u} \times \frac{\partial \mathbf{p}}{\partial v}
+ \frac{\partial b}{\partial v}\left(\frac{\partial \mathbf{p}}{\partial u} \times \mathbf{n}\right)
+ \frac{\partial b}{\partial u}\left(\mathbf{n} \times \frac{\partial \mathbf{p}}{\partial v}\right)
+ \frac{\partial b}{\partial u}\frac{\partial b}{\partial v}\,(\mathbf{n} \times \mathbf{n}).$$

But $\mathbf{n} \times \mathbf{n} = \mathbf{0}$ and so

$$\mathbf{N}' = \mathbf{N}
+ \frac{\partial b}{\partial v}\left(\frac{\partial \mathbf{p}}{\partial u} \times \mathbf{n}\right)
+ \frac{\partial b}{\partial u}\left(\mathbf{n} \times \frac{\partial \mathbf{p}}{\partial v}\right).$$

We obtain the perturbed unit normal by normalizing N0.

Although it is possible to do all the calculations analytically, the usual practice is to approximate the bump function $b$ with a look-up table and to estimate the derivatives with finite differences:

$$\frac{\partial b}{\partial u} \approx b_{i,j} - b_{i-1,j}
\qquad
\frac{\partial b}{\partial v} \approx b_{i,j} - b_{i,j-1}$$

Values of $\frac{\partial b}{\partial u}$ and $\frac{\partial b}{\partial v}$ can be tabulated, and the values of $\frac{\partial \mathbf{p}}{\partial u}$ and $\frac{\partial \mathbf{p}}{\partial v}$ are needed anyway for normal calculation. Consequently, bump mapping is quite an efficient operation.
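A minimal sketch of the table-driven version (types and names are mine): given the surface derivatives, the unit normal, and the bump table, the perturbed normal follows the final formula above with finite-difference derivatives.

#include <vector>

struct Vec3 { double x, y, z; };

Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}
Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 scale(double s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

// Perturbed normal N' = N + (db/dv)(dp/du x n) + (db/du)(n x dp/dv),
// with db/du and db/dv estimated from the bump table b at cell (i, j).
Vec3 perturbedNormal(Vec3 dpdu, Vec3 dpdv, Vec3 n,
                     const std::vector<std::vector<double>>& b,
                     int i, int j) {
    double dbdu = b[i][j] - b[i - 1][j];
    double dbdv = b[i][j] - b[i][j - 1];
    Vec3 N = cross(dpdu, dpdv);
    return add(N, add(scale(dbdv, cross(dpdu, n)),
                      scale(dbdu, cross(n, dpdv))));
}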

11.4 Environment Mapping

Real-life scenes often contain highly reflective objects. If these objects are to look realistic, they should reflect the scenery around them. In general, this is hard to do, because it requires calculating the reflected ray for each point on the surface, and then finding out where that ray comes from in the scene. (This is a problem that is solved neatly by recursive ray-tracing, of course.)

There are two ways of simplifying the rendering of reflective objects:


1. We can work with simple objects, such as cylinders and spheres. It is easier to compute reflections from these objects than from, say, a shiny car.

2. We can use texture mapping to render the object.

The second method, which is called environment mapping, was introduced by Blinn and Newell. We will consider a simple and standard case that happens to have direct support from OpenGL: the problem of rendering a reflecting sphere.

Environment mapping depends to some extent on the fact that people are rather like raccoons: we recognize shiny or reflecting objects easily, and we are not too fussy about the precise details of the reflection. Imagine a reflecting sphere that is moving around. Strictly, the reflection on the sphere should exactly match the surroundings; in practice, if the match is fairly good, our eyes accept it as a reflection.

Environment mapping works in two steps: the first step is to obtain a suitable reflected image, and the second step is to use that image to texture a sphere. A photograph taken with a fish-eye lens provides a usable image. Alternatively, we can take a regular image and distort it.

The following code is extracted from a program that performs texturing in sphere-map mode.

The following code is used during initialization. PixelMap is a class defined in CUGL. The image should be a fish-eye view, as described above, but it does not have to be.

PixelMap tex;
tex.read("image.bmp");
GLuint name;
glGenTextures(1, &name);
tex.setTexture(name);
glTexGenf(GL_S, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);
glTexGenf(GL_T, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);

The following code is used in the display function. The displayed object must have texture coordinates $0 \le s \le 1$ and $0 \le t \le 1$.

glEnable(GL_TEXTURE_2D);
glEnable(GL_TEXTURE_GEN_S);
glEnable(GL_TEXTURE_GEN_T);
// Display the object
glDisable(GL_TEXTURE_GEN_S);
glDisable(GL_TEXTURE_GEN_T);
glDisable(GL_TEXTURE_2D);

The mathematical justification of sphere mapping follows.

Let $\mathbf{u}$ be a unit vector from the eye to a vertex on the sphere. Let $\mathbf{r}$ be the corresponding reflection vector computed as

$$\mathbf{r} = \mathbf{u} - 2(\mathbf{n} \cdot \mathbf{u})\,\mathbf{n}$$

Then the texture coordinates are calculated as

$$s = \frac{1}{2}\left(\frac{r_x}{p} + 1\right) \qquad
t = \frac{1}{2}\left(\frac{r_y}{p} + 1\right)$$

where

$$p = \sqrt{r_x^2 + r_y^2 + (r_z + 1)^2}.$$
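A minimal sketch of this computation (the type and function name are mine):

#include <cmath>

struct Vec3 { double x, y, z; };

// Sphere-map texture coordinates for a reflection vector r:
//   p = sqrt(rx^2 + ry^2 + (rz + 1)^2)
//   s = (rx/p + 1)/2,  t = (ry/p + 1)/2
void sphereMapCoords(Vec3 r, double& s, double& t) {
    double p = std::sqrt(r.x * r.x + r.y * r.y + (r.z + 1.0) * (r.z + 1.0));
    s = 0.5 * (r.x / p + 1.0);
    t = 0.5 * (r.y / p + 1.0);
}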

11.5 The Accumulation Buffer

Although we have seen various ways of making graphics images realistic, it is usually easy to distinguish a computer graphic image from a photograph or a real scene. There are several reasons for this: one important reason is that a graphics image is sharp and bright everywhere. We are used to seeing real scenes in which distant objects are less brightly coloured than nearby objects, and photographs in which one part is in sharp focus and the rest is blurred. We can use OpenGL fog to give the effect of distance. This section discusses blurring.

Blurring can be simulated by drawing the scene several times in slightly different positions. OpenGL provides the accumulation buffer for this and other purposes. This buffer is used to “accumulate” several different images before displaying a final image. Most applications of the accumulation buffer are quite slow because the image must be rendered several times.

Here are some of the applications of the accumulation buffer:

• A camera focuses at a particular distance. In theory, there is a plane that is sharp and everything else is blurred. In practice, there is a depth of field, defined by two distances between which everything is sharp enough (for example, the blurring might be less than the grain size of the film).

To achieve a photographic effect, we can render the scene several times into the accumulation buffer. One point in the scene, the centre of the image at the focal plane, is kept fixed, and the scene is randomly rotated through a very small angle around this point. The effect is that this point in the scene is sharp and everything else is blurred.

• If a fast-moving object is photographed with a slow shutter speed, it appears blurred. Skilled photographers sometimes pan the camera to follow the object, in which case the object is sharp but the background is blurred. In either case, the effect is called motion blur. It is used (often with exaggeration) in comics and animated films to emphasize the feeling of motion.

To achieve the effect in OpenGL, the scene is again rendered several times into the accumulation buffer. Stationary objects stay in the same place, and moving objects are moved slightly. The “exposure” (explained below) can be varied. For example, a moving object might be rendered in five different positions with exposure 1/2, 1/4, 1/8, 1/16, and 1/32, to give the effect of fading.

• Jagged edges and other artefacts of polygonal decomposition can be smoothed by rendering the image several times in slightly different positions — this is a form of antialiasing. The movements should be random and very small, typically less than a pixel.


The function that controls the accumulation buffer is glAccum(). It requires two arguments, an operation and a value. Figure 58 explains the effect of the various operations. The “buffer currently selected for reading” is set by glReadBuffer() and the “buffer currently selected for writing” is selected by glDrawBuffer(). By default, the current colour buffer is used for reading and writing, so it is not actually necessary to call these functions.

Operation (op)  Effect (val)

GL_ACCUM    Read each pixel of the buffer currently selected for reading, multiply the RGB values by val, and add the result to the accumulation buffer.

GL_LOAD     Read each pixel of the buffer currently selected for reading, multiply the RGB values by val, and store the result in the accumulation buffer.

GL_RETURN   Take each pixel from the accumulation buffer, multiply the RGB values by val, and store the result in the colour buffer currently selected for writing.

GL_ADD      Add val to each pixel in the accumulation buffer.

GL_MULT     Multiply each pixel in the accumulation buffer by val.

Figure 58: Effect of glAccum(op, val)

In a typical application, the accumulation buffer is used as follows:

• Call glClear(GL_ACCUM_BUFFER_BIT) to clear the accumulation buffer.

• Render the image n times into the colour buffer (as usual). After each rendering, call glAccum(GL_ACCUM, x) with x = 1/n.

• Call glAccum(GL_RETURN, 1.0) to copy the accumulated information back to the colour buffer.
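A minimal sketch of these steps (jitter() and drawScene() are hypothetical helpers standing in for application code):

#include <GL/gl.h>

void jitter(int i);    // hypothetical: perturb the scene slightly
void drawScene();      // hypothetical: render into the colour buffer

// Accumulate n slightly different renderings and average them.
void renderBlurred(int n) {
    glClear(GL_ACCUM_BUFFER_BIT);
    for (int i = 0; i < n; ++i) {
        jitter(i);
        drawScene();
        glAccum(GL_ACCUM, 1.0f / n);
    }
    glAccum(GL_RETURN, 1.0f);
}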
