
GMCh1 12/30/99 1-1

GEOMETRIC MODELING: A First Course

Copyright © 1995-2000 by Aristides A. G. Requicha
Permission is hereby granted to copy this document for individual student use at USC, provided that this notice is included in each copy.

1. Introduction

1.1 Preamble

Dazzling cinematic special effects... Realistic simulations of machines in motion... Analyses of the stresses in aerospace structures during flight... Programs that automatically drive robots along collision-free paths in cluttered environments... Rapid prototyping machines that behave like three-dimensional printers... Underlying these and many other applications are geometric models of the objects under study. Geometric models are computational (symbol) structures that capture the spatial aspects of the objects of interest for an application. This course is primarily concerned with geometric models for three-dimensional objects, and with the associated computer algorithms for constructing and querying the models.

We are interested in modeling both real (i.e., physical) and virtual objects. Virtual objects often correspond to physical objects that are being designed but have not yet been built. But they may also correspond to objects that do not obey the laws of Physics, and therefore are purely imaginary. In addition, some of the objects we encounter in the applications are themselves mathematical abstractions, for example, the set of accessible directions along which a touch probe may approach a given surface.

Geometric information is pervasive in many engineering and scientific fields, such as (i) VLSI layout, (ii) geographic information systems, (iii) electronic packaging, (iv) computer graphics and visualization, (v) computer vision, (vi) architectural and structural design, and (vii) design and manufacture of electromechanical products, to name a few. The first two examples just cited are primarily two dimensional (2-D), whereas the last two are intrinsically 3-D; examples (iii) through (v) may be either 2-D or 3-D. This course emphasizes modeling of 3-D objects, and attempts to strike a balance between applications in two areas: graphics and multimedia, and robotics and automation.

3-D computer graphics is becoming ubiquitous. Most desktop computer systems are expected to bundle support for 3-D applications in the near future, and the use of 3-D in multimedia and the World Wide Web is burgeoning. Display techniques are relatively well-developed and are implemented in hardware accelerators and in software browsers. Construction of the geometric models of the objects to be displayed, however, is becoming a bottleneck for 3-D graphics. In this text we attempt to complement, rather than compete with, Computer Graphics textbooks. We focus on modeling, and de-emphasize display and visualization, because they are well covered in the graphics texts, e.g. [Foley et al. 1990]. Models for free-form curves and surfaces are also treated briefly, because they constitute a large subject on their own, and are discussed in several recent texts, e.g. [Bartels et al. 1987, Farin 1997, Rockwood & Chambers 1996]. Much of the course deals primarily with polyhedra and objects bounded by surfaces of low degree, which suffice to illustrate the main combinatorial ideas of the field and its numerical difficulties.


The intended audience consists of undergraduates in Computer Science or Engineering disciplines at the junior or senior level, and practicing engineers. The material corresponds to a typical semester course, and has been class-tested at the University of Southern California. Mathematical prerequisites are the usual calculus courses, including analytic geometry. Knowledge of linear algebra is helpful, but we use little of it beyond elementary concepts such as matrix multiplication.

The sections titled Mathematical Underpinnings, which typically appear towards the ends of chapters, are written tersely and require a higher level of mathematical maturity than the remainder of the text. They are intended to provide supplementary mathematical material, and entries into the mathematical literature. They should be ignored in standard undergraduate courses. End-of-chapter sections titled Further Explorations provide a glimpse of geometric modeling material not covered in the text, with references to the literature. These sections should also be ignored in undergraduate courses. Further Explorations are not meant to be exhaustive. The geometric modeling literature has exploded over the last few years, and it is not possible to cover it completely in a text of reasonable size.

A graduate course can be designed around the material presented here by expanding the Mathematical Underpinnings sections, and adding a few advanced topics from the current literature referenced in the Further Explorations sections, or at the discretion of the instructor. Graduate versions of this course have been taught by the author at USC and earlier at the University of Rochester since the mid 1970s. The material has now become sufficiently well codified to be taught to undergraduates.

Readers are assumed to have a certain computational maturity. They should be familiar with basic programming notions such as data structures and recursion, and be proficient in C or related object-oriented programming languages such as C++ or Java. Students with good C skills can learn C++ or Java in parallel with the course material.

Learning about computation without doing computation is not very sensible. But interesting exercises in geometric modeling require a fair amount of software, which cannot be written within the time constraints of typical university or self-study courses. Early versions of this course were taught in Pascal, and then in C, with custom packages for vector and matrix manipulation, and for display. In the late 1990s the language used was C++ and a Geometric Tool Kit (GTK) was provided to deal with much of the low-level computation and graphical display involved in typical geometric modeling problems. Display was done by using a VRML browser. The current version uses Java and the Java 3D Application Programming Interface (API). By building upon available low-level facilities students can tackle significant problems, and construct prototype geometric modeling systems.

Programming assignments can include small graphic user interfaces to the core geometric computations that are the main subject of the course. With modern interface design tools, such as the Java Swing API, it is possible to implement attractive interfaces with a reasonable effort.

The remainder of this text is organized as follows. Sections 1.2 and 1.3 discuss the role of geometric information in two specific application domains. Section 1.4 proposes a systematic approach to the study of geometric modeling. The final section of this introductory chapter presents a historical summary of the field. Chapter 2 presents basic concepts from Euclidean and projective geometry and uses them to model motions and projections. Chapter 3 deals with fundamental issues in the computer representation of physical objects and presents the main approaches for representing object geometry. Chapter 4 addresses representations for curves and surfaces, and Chapter 5 representations for solids. The next two chapters are devoted to algorithms. Chapter 6 presents some of the fundamental algorithms that serve as building blocks for the application algorithms discussed in Chapter 7. Finally, Chapter 8 addresses geometric modeling systems issues.

1.2 The Role of Geometry in CAD/CAM

In this section we focus on a specific application area, mechanical and electromechanical Computer-Aided Design and Manufacture (CAD/CAM), and discuss briefly the role that 3-D geometry plays in it.

Figure 1.2.1 shows the traditional organization of activities in the life-cycle of a product. Functional requirements, constraints, and optimization criteria are derived from market considerations, and serve as input to the design process. Designers generate detailed specifications for the parts and assemblies to be produced. These specifications consist primarily of geometrical information about the objects, augmented with non-geometric data such as material, hardness, and so forth. Traditionally, the geometry was defined through engineering drawings, but these have major drawbacks, as we shall discuss later, and are being replaced by the computer-based models studied in this course.

Figure 1.2.1 - Product development life cycle

The proposed designs are analyzed for compliance with the specifications, and modified if necessary. This may be done by trial and error, or by using computer-aided optimization techniques. Part specifications are passed on to manufacturing, where decisions are made about the processes to be used. These decisions take into account not only the product specifications, but also knowledge about the capabilities, costs and availability of equipment and manufacturing processes (e.g., milling, drilling, injection molding, layered fabrication). The results are plans and programs at several levels, plus schedules that provide timing information. High-level plans are usually called process plans, and consist of totally or partially-ordered sets of process descriptions. Their lower-level counterparts are sometimes called operation plans, and may extend all the way down to specific instructions to machine tools and robots. All the activities discussed thus far are information processing tasks. In contrast, manufacturing-plan execution is a physical process that uses energy and materials to perform the specified operations.

Manufactured parts are inspected, assembled, packaged, and shipped. Assembly and inspection are similar to manufacturing, in that process and operation plans must be generated, and these plans depend critically on part geometries and process capabilities. A complete life cycle would also include service, maintenance, disposal, and perhaps other activities. Finally, the entire process must interface smoothly with the business and management activities of the enterprise, which are not shown in Figure 1.2.1.

[Figure 1.2.1 labels: Requirements; Geometric Information + ...; Design; Analysis; Manufacture; Inspection; Assembly; Packaging, Shipping, ...; Process Information; Material; Energy]

The traditional design and manufacturing process just described is sequential and has severe drawbacks. For example, a designer may unnecessarily specify geometric features that are costly to manufacture. Modern manufacturing engineering espouses the principles of concurrent engineering, which have been shown to lead to better quality, lower cost, and faster time to market. In essence, all of the activities shown in Figure 1.2.1 are still performed, but in parallel, rather than sequentially. This ensures that designers have timely feedback on the downstream consequences of their design decisions. Figure 1.2.2 depicts what we call an Engineering Environment, by analogy with the Programming Environments that are commonplace in Software Engineering. Engineering Environments are well suited for supporting the concurrent engineering paradigm.

Figure 1.2.2 - Engineering Environments

A designer interacts with the product requirements, sometimes negotiating them with marketing and management. He or she also interacts with a product design subsystem through a suitable human-computer interface. Computer models for evolving designs are stored in a data base. Several specialists, which may be humans or programs, reason about the product design and produce two kinds of information: feedback to designers, and process plans and programs for driving the actual manufacturing machinery. Plans are also stored in the data base. The computer-based specialists and other computational processes shown in the figure invoke a set of fundamental computational tools of generic applicability. These include the geometric modeling tools studied in this course, plus optimization procedures, algebraic routines, constraint-maintenance subsystems, and so forth. Engineering Environments that support collaborative work, distributed in time and space, raise interesting design and research questions, and are still embryonic. System architectures based on intelligent agents offer potential solutions.

[Figure 1.2.2 labels: MACHINERY; SPECIALISTS; KNOWLEDGE BASE (Product & Process Designs, Libraries, Handbook Data, ...); REQUIREMENTS (Function, Constraints, Other Goals); PLANS & PROGRAMS; NC, Robots, ...; PROCESS DESIGNS; COMPUTATIONAL TOOLS (Geometry, Optimization, Equation Solvers, ...); ADVICE TO DESIGNERS; PRODUCT DESIGN; Design Aids; Analysis; Manufacturing; Assembly; Inspection; ...; DESIGNER]

1.3 The Role of Geometry in 3-D Graphics

Computer Graphics in its early years was 2-D, and objects were defined by sequences of drawing commands. But it soon became clear that it was important to distinguish between the model of the object to be displayed, and the display primitives themselves. This distinction is especially important in 3-D applications. Today, a typical graphic application provides a Graphic User Interface (GUI) with which a (human) user defines graphic models of the objects to be displayed. The objects are then rendered by image synthesis software and hardware, producing the desired output, as shown in Figure 1.3.1. Rendering techniques and systems have evolved significantly over the past two decades, and are now capable of photo-realistic displays with texture and reflections, Virtual Reality (VR) immersive displays, and so on.

Figure 1.3.1 - Graphics modeling and rendering

Graphic models contain the shape or geometry of the objects, i.e., their geometric models, but they often require additional information such as color, texture, and so on. For applications that involve animation, the temporal behavior of the objects must also be described. Our brief discussion of graphics and CAD/CAM applications (in the previous section) shows that geometric models normally must be augmented with application-specific data to be practically useful.

The geometric models needed in 3-D graphics are not always built directly by a user, as shown in Figure 1.3.1. Sometimes they are constructed by Computer Vision techniques from existing physical objects, or are acquired by medical imaging equipment such as CAT (Computer Aided Tomography) scanners, or are the result of scientific computations that simulate physical phenomena. Some rendering techniques do not even require a 3-D model, and have reasonable success in synthesizing images of 3-D scenes from a sequence of 2-D images of existing objects. This is a relatively new area of graphics, called image-based rendering.

Geometric models used in CAD/CAM must be sufficiently accurate and faithful to permit manufacture of the desired objects within tolerances, to determine if there are collisions between objects, and so on. In graphics, however, models do not necessarily have to be realistic, provided that the images generated from them look realistic to human observers. Thus, Computer Graphics can use certain modeling techniques that are not adequate for CAD/CAM. For example, natural objects such as trees, waves or clouds, are modeled by graphics-specific techniques [Foley et al. 1990]. In this text we focus on models that are suitable for both graphics and CAD/CAM. Discussions of graphics-specific modeling techniques can be found in graphics textbooks and in the graphics research literature, for example, in the proceedings of the annual SIGGRAPH conferences. As graphics moves towards more realistic simulations of objects in motion, it has an increasing need for collision detection and other capabilities normally associated with the CAD/CAM domain. Therefore, we are likely to see an increased overlap in modeling methodologies for these two domains in the future.

[Figure 1.3.1 labels: USER; Modeler; GUI; Image Synthesizer; DISPLAY]

Sophisticated Virtual Environments are contributing to a more intimate connection between the Graphics and the Robotics and Automation fields through sharing of spatial reasoning tools. These are used in Robotics and Automation to accomplish tasks without a need for human intervention to plan and program the required actions. For example, to assemble two objects one must determine which approach directions can be used. This knowledge can then be used to drive a robot that assembles the physical objects. Suppose now that we want to display in a Virtual Environment a human joining two objects together. We can program the operation manually, by supplying directly to the Virtual Environment all the low-level motion information. But this is tedious, error-prone and time-consuming. High-level programming of actions in the Virtual Environment is possible if we deploy planning tools that use geometric reasoning and are similar to those developed in the Robotics and Automation field.

1.4 Models, Representations, Algorithms and Systems

Mathematics and Computer Science do not deal directly with physical objects and phenomena. They deal with models that capture the relevant aspects of the entities under study. We distinguish between mathematical and computational models, and usually refer to the latter as computational representations (or simply representations). The following non-geometric example illustrates the distinction. The decimal and Roman strings ‘115’ and ‘CXV’ are computational representations that can easily be stored and manipulated using standard programming language constructs. They clearly represent the “same thing”, but what is it that they represent? We could say that they represent physical entities such as collections of pebbles, but then we would have to explain that the color of the pebbles did not matter, nor did their material, nor did the fact that they were pebbles, and so forth. It is more reasonable to say that the strings represent natural numbers, which are abstract mathematical entities that capture the aspects of reality relevant to counting. Mathematics provides us with a rich theory of natural numbers that we can use to study the properties of decimal and Roman string representations and associated computations.

[Figure 1.4.1 labels: Physical World; Abstractions; Natural Numbers; Roman Strings ('CXV'); Decimal Strings ('115')]

Figure 1.4.1 - Physical entities, mathematical models and representations
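The distinction between a natural number and its string representations can be made concrete with a small sketch. The Python fragment below is only an illustration under our own assumptions: the function names `from_decimal` and `from_roman` are hypothetical, and the Roman parser handles only well-formed strings built from the standard seven symbols. Both functions map a representation onto the same abstract entity, here modeled by a Python integer.

```python
# Hypothetical helpers (not from the text): two representations, one natural number.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def from_decimal(s: str) -> int:
    """Interpret a decimal string such as '115' as a natural number."""
    return int(s, 10)


def from_roman(s: str) -> int:
    """Interpret a well-formed Roman string such as 'CXV' as a natural number."""
    total = 0
    for i, symbol in enumerate(s):
        value = ROMAN_VALUES[symbol]
        # Subtractive notation: a smaller symbol placed before a larger one
        # (as in 'IV') is subtracted instead of added.
        if i + 1 < len(s) and value < ROMAN_VALUES[s[i + 1]]:
            total -= value
        else:
            total += value
    return total


# Different symbol structures, same mathematical model.
same_number = from_decimal("115") == from_roman("CXV")
print(same_number)
```

The two strings share no characters, yet the comparison succeeds because it is made on the numbers they denote, not on the strings themselves.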

The ultimate validity of a model of a physical entity must be ascertained experimentally. The models must be able to predict the behavior of the corresponding physical entities. The results of measurements performed in the real world (the answers A in Figure 1.4.2) must agree with the values predicted by the model (the answers Am in the figure). Centuries of experimentation have shown that natural numbers are good mathematical models. Similarly, the 3-D Euclidean space (E3) of analytic geometry has proven to be an excellent model for the spatial aspects of reality, provided that we do not study phenomena at the galactic spatial scale (which requires the curved spaces of general relativity) or involving velocities comparable to the speed of light (which fall under the purview of special relativity).

[Figure 1.4.2 labels: Physical System; Model; Q; Qm; A; Am; ?]

Figure 1.4.2 - Models and reality

Let us turn now to a simple geometric example. Suppose that we press the tip of a sharp pencil on the top of a desk and make a mark. We can easily conceive a “perfectly sharp” pencil whose tip has zero diameter, and which produces a mark also of zero diameter. This situation may be modeled mathematically as follows. The desktop is modeled by a subset of 2-D Euclidean space (E2), and the tip by a point in that space. If we move the tip on the desktop we can make another mark, which corresponds to another point. Mathematics lets us pose well-defined questions such as “what is the distance between the two points?”. The mathematical theory of Euclidean spaces has been developed with great rigor and we can rely upon its theorems and methods.

If we want to compute the distance between two points we need means to designate the points unambiguously. This is the role played by representations. A point in E2 may be represented by two real numbers, its x and y coordinates measured with respect to an agreed set of axes, corresponding, for example, to two orthogonal sides of the desk top. Computationally we can use a pair (a 2-element array) of floating point numbers as the representation for the point. (Note, however, that floating point numbers must have finite exponents and mantissas and therefore cannot represent all the real numbers; this is the source of many numerical robustness problems in geometric modeling.) With this representation, the distance can be computed by a very simple algorithm that produces a non-negative floating point number by evaluating the familiar expression from analytic geometry involving the square root of a sum of squares.
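As a minimal sketch of the distance algorithm just described, the Python fragment below represents each point by a pair of floating point numbers and evaluates the square root of a sum of squares. The function name `distance` and the sample coordinates are our own choices for illustration. The last lines show, on a standard example, why finite floating point representations cause trouble: the decimal values 0.1, 0.2 and 0.3 have no exact binary representation.

```python
import math


def distance(p, q):
    """Euclidean distance between two points of E2, each given as an (x, y) pair."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    # Square root of a sum of squares; the result is a non-negative float.
    return math.sqrt(dx * dx + dy * dy)


# Two pencil marks, with coordinates measured along two sides of the desktop.
mark_a = (0.0, 0.0)
mark_b = (3.0, 4.0)
print(distance(mark_a, mark_b))  # 5.0

# Floating point numbers cannot represent every real number exactly,
# so an identity that holds for real numbers may fail in the representation.
print(0.1 + 0.2 == 0.3)  # False
```

Geometric code that compares computed quantities for exact equality inherits this problem, which is one reason tolerances are commonly introduced in such comparisons.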

The methodology embodied in this desktop example may be summarized and generalized with the help of Figure 1.4.3. There are objects of interest in the physical world, such as a desktop and pencil marks. One often needs to answer questions about properties of these objects, for example: “how far apart are two pencil marks?”. To do this we first replace the objects by abstractions which we call mathematical models. In our example, the desktop becomes a subset of the Euclidean plane, and the pencil marks are modeled by points. Observe that modeling involves abstraction, and ignores many aspects of reality, which are judged irrelevant for the issues to be studied. For example, the Euclidean model of the desktop captures its geometric or spatial aspects, but says nothing about many macroscopic properties of the desk such as its material and color. In addition, the atomic nature of the desk is entirely ignored. This implies that each model has a specific domain of applicability. For example, the E2 model for the desktop is appropriate for studying the stability of solid objects lying on a table, but is clearly unsuitable for dealing with robotic operations at the nanometer scale that move individual atoms or molecules at the solid/gas interface between table and air.

[Figure 1.4.3 labels: Physical World; Abstractions; Physical Objects; Physical Properties; Math Models of Objects; Math Models of Prop Values; Representations of Objects; Representations of Prop Values; f; a]

Figure 1.4.3 - A systematic view of modeling

Object properties are modeled by mathematical functions, denoted by f in Figure 1.4.3, which map (mathematical models of) objects into the values of their properties. Thus, physical “nearness” is measured by the distance between points, which is a mathematical function that maps two points onto a non-negative real number.

Computation requires a further step. Mathematical models must be associated with corresponding symbol structures, i.e. representations, that can be constructed within computers. Thus, we need representations for objects, and for the values of their properties. In our example, points on the plane were represented by pairs of floating point numbers, and distance values were represented by single non-negative floating point numbers. We also need algorithms for doing the actual computation, i.e., for converting the input representations into their output counterparts. In our example, the distance computation evaluates a simple expression containing the coordinates of the input points. Algorithms are sets of computer-intelligible instructions, usually arranged sequentially. Therefore algorithms are not mathematical functions. We often say that they implement functions, because correct algorithms produce the (representations of the) values of the corresponding mathematical functions.

Our discussions of representations and algorithms, and of Engineering Environments, suggest the following high-level architecture of a generic system for geometric computation; see Figure 1.4.4. The essential components are (i) geometric models, i.e., representations for geometric objects, (ii) algorithmic processes that use such representations to answer geometric queries, such as “what is the distance between two points?”, (iii) input facilities for creating and editing object representations, and for invoking processes, and (iv) output facilities and representations for results. The subsystem that provides facilities for entering, storing, and modifying object representations is usually called a geometric modeler or geometric modeling system. Sometimes a geometric modeler is more broadly defined to include some of the facilities, e.g. graphic rendering, that are needed in most applications.

[Figure 1.4.4 labels: Geometric Models; Definition Translator; Query Translator; Proc; Proc; Object Definitions; Geometric Queries; Rep; Rep; Results; ...]

Figure 1.4.4 - A generic system for geometric computation
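The four components of the generic architecture can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not the design of any actual modeler: the class name `TinyModeler` and its methods `define` and `query_distance` are hypothetical, and the only objects supported are points in the plane.

```python
import math


class TinyModeler:
    """Toy geometric modeler: stores point representations and answers queries."""

    def __init__(self):
        # (i) Geometric models: object name -> representation (a pair of floats).
        self.models = {}

    def define(self, name, x, y):
        # (iii) Input facility: create or edit an object representation.
        self.models[name] = (float(x), float(y))

    def query_distance(self, name_a, name_b):
        # (ii) Algorithmic process: answers a geometric query from the stored
        # representations. (iv) The result is itself a representation, a float.
        ax, ay = self.models[name_a]
        bx, by = self.models[name_b]
        return math.sqrt((bx - ax) ** 2 + (by - ay) ** 2)


modeler = TinyModeler()
modeler.define("mark1", 0, 0)
modeler.define("mark2", 3, 4)
print(modeler.query_distance("mark1", "mark2"))  # 5.0
```

Keeping the stored representations separate from the processes that query them lets new queries be added without changing how objects are defined.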

This course covers all the components shown in Figure 1.4.4, with a special focus on representations and algorithms. We distinguish between fundamental and application-specific algorithms. Fundamental algorithms are useful for constructing and maintaining representations of geometric objects, and for a range of applications. They include intersection computations, distances, and so on. Application algorithms are relevant primarily to specific applications such as graphics or the computation of volumes. They often use fundamental algorithms to perform lower-level calculations.

The mathematics that underlies much of the material in this course is discussed very briefly, mainly in the sections titled “Mathematical Underpinnings”. A more thorough discussion is appropriate only at the graduate level, because it involves a variety of non-trivial subjects, ranging from general and algebraic topology to differential and algebraic geometry.

1.5 Historical Summary

This section provides a brief, non-exhaustive summary of the historical development of geometric modeling. Articles by many of the pioneers of the field appear in [Piegl 1993], and contain references to the work cited below. A series of survey articles also contains many references and traces the history of the field [Requicha & Voelcker 1982, 1983; Requicha 1988; Requicha & Rossignac 1992].

The beginnings of geometric modeling can be traced to the 1950s, when several related technologies were launched. Computer graphics saw the development of Sketchpad in Sutherland’s Ph.D. thesis at MIT, and of the DAC-1 system at General Motors. Sculptured, or free-form curves and surfaces were introduced at MIT by Coons and in France by Bezier and de Casteljau. The APT programming language for numerically-controlled (NC) machining was initially developed by Ross’ group at MIT. Work on modeling of solid objects included Roberts’ thesis on modeling for vision at MIT, Gutterman’s proposals for a solid modeling system at Sandia, and the research of Luh and Krolak at IBM.

From this spate of initial activity emerged four main streams of work that evolved largely independently for some two or three decades. The computer graphics stream focused on rendering and interaction. Initially, graphics was 2-D and the notion of a model was not acknowledged explicitly. The models were essentially the display lists used to drive the hardware. Later on, computer graphics models became object approximations through unstructured nets of polygons. Today, polygonal models are still prevalent, and associated rendering algorithms and hardware are well developed and capable of producing highly realistic results.

The wireframe stream led to the commercial CAD systems of the 1970s and 80s. Initially these systems were simple drafting aids that represented objects by 2-D views composed of lines and arcs. This had unpleasant consequences. For example, when a view was modified the changes were not reflected in the others. Later, wireframe systems adopted representations consisting of unstructured collections of curve segments (i.e. edges) in 3-D. As we shall see later, wireframes do not designate solid objects unambiguously, and cannot guarantee the validity of their data.

The free-form curve and surface stream found important applications in computer-aided design and manufacture of car bodies, aircraft fuselages and in other tasks in the automotive and aerospace industries. The technology evolved towards B-spline representations (discussed later in this course), which were pioneered by the University of Utah group led by Riesenfeld and Barnhill. (Barnhill moved later to Arizona State University.) NURBS (Non-Uniform Rational B-Splines) are becoming a de-facto standard for free-form surface representation. Free-form surfaces are increasingly being used in animation and other applications in the entertainment industry. Curve and surface modeling is sometimes called Computer-Aided Geometric Design, and has strong ties with numerical analysis and differential geometry.

Solid modeling is distinguished by the use of unambiguous representations for solids. The initial forays cited above were ambitious but met with limited success, and were abandoned. The field resumed progress in the early 1970s when a flurry of activity took place in most of the industrialized countries. Braid’s Ph.D. thesis at the University of Cambridge in England introduced the BUILD system. Braid and co-workers became a major force in the commercial solid modeling arena, and were responsible for the ROMULUS, Parasolid and ACIS modelers, the last two of which are still actively being developed. In Germany, work proceeded in the Compac system at the University of Berlin, and in Proren at the University of the Ruhr. Brun’s Euclid system was launched in France and Engeli’s Euklid in Switzerland. In Japan, Okino developed TIPS-1 at Hokkaido University, and Hosaka developed GeoMap at the University of Tokyo. In the U.S. two systems were built outside academia: the Shapes system at the Draper Labs, and the commercial Synthavision system, which had been developed initially for studying ballistic and nuclear effects and was adapted for CAD applications. In American universities Eastman’s group at Carnegie-Mellon produced the GLIDE system with emphasis on architectural applications and data base issues. (Eastman moved to Georgia Tech in the 90s.) At Stanford, Baumgart’s system, which was intended primarily for Computer Vision, introduced data structures that were influential in later developments. Finally, the Production Automation Project at the University of Rochester established much of the theoretical foundation for the field, and developed the PADL systems. These were disseminated widely, and several commercial systems were built upon them.

A related, fifth stream emerged in the early 1970s with Shamos’ Ph.D. thesis. This stream focuses on the theoretical aspects of design and analysis of geometric algorithms, and has become known as Computational Geometry. (The term had been coined earlier, in the 1960s, by Forrest in England, and also by Minsky at MIT, with a broader meaning.) Computational Geometry has dealt primarily with theoretical problems involving polygonal and polyhedral objects, often in 2-D, but is moving towards a broader domain, with an increased emphasis on applications.


Finally, spatial reasoning and other geometric aspects of Robotics may be considered a sixth related stream, which also has evolved independently, although it has strong intellectual ties with solid modeling and CAD/CAM.

In the late 1990s the computer graphics, sculptured surfaces, solid modeling, computational geometry, and robotics research communities were still largely distinct, published in separate journals and attended different conferences. However, a convergence of all these different aspects of Geometric Computation is becoming evident, and new systems use ideas from all of these subfields.

Geometric modeling technology has been slow to gain acceptance, because it requires substantial computational resources, which have not been available at low cost until very recently, and because it often has significant impact on how enterprises are organized and managed. 3-D graphics exploded into the marketplace in the second half of the 1990s, much like color 2-D graphics and windowing systems spread earlier in the decade. Many of the ideas in Sketchpad, e.g. constraint-based design, are coming into widespread use only now. Spline technology is now blossoming in many graphic packages for the personal computer market. Solid modeling is gaining acceptance steadily, but there are still many companies that use simpler wireframe systems or even manual drafting.

GMCh2 12/30/99 2-1



2. Motions and Projections

2.1 Points and Vectors

Imagine a small solid object and let its dimensions decrease indefinitely. The result of this conceptual experiment is modeled by a mathematical abstraction called a point. Modern mathematics rigorously defines Euclidean spaces as sets whose elements, called points, satisfy certain axioms. In everyday language we talk of “being at a point in space”, and in geometric modeling we use Euclidean points to define mathematically the locations of objects. In addition, sets of points serve to model more complicated objects, from trajectories to physical solids.

Consider now a solid object in straight-line motion. The object’s velocity has a direction and a magnitude, or speed, measured e.g. in meters per second. Velocities and other physical entities such as forces that can be characterized by a direction and a magnitude are modeled mathematically by vectors. Vectors may be added by using the familiar parallelogram rule of analytic geometry and elementary mechanics. They may also be multiplied by scalar numbers. Scalar multiplication changes the magnitude of the vector but not its direction (a negative scalar reverses its sense along the same line). In modern mathematics a vector space is a set of elements, called vectors, with two operations defined on them (vector addition and multiplication by a scalar) that have certain algebraic properties defined axiomatically. The vector-space axioms ensure that the usual Cartesian vectors of analytic geometry are a special case of abstract vectors. Interestingly, there are many other useful entities that are abstract vectors as well. Examples include polynomials of degree ≤n, the spline functions we will discuss later in this course, periodic functions with period T, continuous functions in a closed interval [a,b], and so on. The theory of vector spaces applies equally well to all of these entities. This is a good example of the power and elegance of abstraction in modern mathematics.

Points and vectors are intimately connected. In principle there are no privileged points or directions in space, i.e., space is homogeneous and isotropic. But let us pick some arbitrary point o and call it the origin. (Typically, the origin is selected for convenience in solving a specific problem.) Now each point p ≠ o, together with o, defines a direction and a length. Therefore, for a fixed origin o, each point p corresponds to a vector x, and conversely. That is, there is a one-to-one correspondence between points and vectors. We are arguing intuitively, but the argument can be made rigorous by defining point difference as an operation that produces a vector from two point arguments such that p – o = x. We use the notation

$$\mathbf{p} \overset{\mathbf{o}}{\longleftrightarrow} \boldsymbol{x}$$

to signify that point p corresponds to vector x for the fixed origin o. (We are using lower-case boldface for points and italic boldface for vectors.)


The correspondence between points and vectors depends on the choice of origin, as shown in Figure 2.1.1. Here point p corresponds to vector x when the origin is q. However, p corresponds to y when the origin is o. There is a simple relationship between x and y. Specifically, y = x + δ, where δ = q – o is the difference of the two origins. Therefore the effects of a change of origin are easy to evaluate.

Figure 2.1.1 Change of origin (point p; origins q and o; vectors x and y)

In the geometric modeling field it is customary not to distinguish between points and vectors, because a fixed “lab” or “master” origin is assumed to exist. By abuse of language one talks of operations such as point addition, although addition is defined only for vectors, and extending the vector operation to points produces results that depend on the choice of origin.

The vector-space elements introduced above are called in this course simply vectors or ordinary vectors, to distinguish them from free and applied vectors, which will be discussed further on.

In a vector space it is always possible to select a minimal set of vectors e1, e2, . . . , en such that any vector of the space can be expressed as a linear combination

$$\boldsymbol{x} = x_1 \boldsymbol{e}_1 + x_2 \boldsymbol{e}_2 + \cdots + x_n \boldsymbol{e}_n .$$

The e_i are called a basis, and the x_i are the components of the vector x in the given basis. There are infinitely many bases in a vector space, but they all have the same number n of vectors, and n is called the dimension of the space. In this course we are primarily interested in dimensions 2 and 3. The components of a vector in a fixed basis are unique, and, conversely, a set of components determines a unique vector.

Therefore, for a fixed basis, there is a one-to-one correspondence between vectors and arrays of components. We denote this correspondence as

$$\boldsymbol{x} \overset{E}{\longleftrightarrow} X .$$

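It is convenient to arrange the components of a vector in a column matrix, and the vectors of a basis in a row matrix:

$$X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad E = \begin{bmatrix} \boldsymbol{e}_1 & \boldsymbol{e}_2 & \cdots & \boldsymbol{e}_n \end{bmatrix}.$$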

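Using matrix notation a vector can be written as x = EX. The correspondence between vectors and matrices preserves addition and multiplication by a scalar. The matrix Z that corresponds to the sum of two vectors z = x + y is the sum

$$Z = X + Y = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}.$$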

For multiplication by a scalar, if z = a x, then Z = a X, or

$$Z = \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}.$$

The inner or dot product, denoted x · y, is another useful operation defined on vectors. It produces a scalar given two vector arguments. It is defined formally by a set of axioms. The square root of the inner product of a vector with itself is the norm or length of the vector, denoted

$$\|\boldsymbol{x}\| = \sqrt{\boldsymbol{x} \cdot \boldsymbol{x}} .$$

A unit vector is a vector whose length equals unity. Two vectors are orthogonal if their dot product is zero. The cosine of the angle θ between two vectors is given by

$$\cos\theta = \frac{\boldsymbol{x} \cdot \boldsymbol{y}}{\|\boldsymbol{x}\| \, \|\boldsymbol{y}\|} .$$

The most convenient bases are the orthonormal bases, composed of unit vectors that are pairwise orthogonal. In an orthonormal basis the inner product of two vectors is

$$\boldsymbol{x} \cdot \boldsymbol{y} = X^t Y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n ,$$

where the superscript t denotes matrix transposition, obtained by interchanging rows with columns. This is the familiar formula from analytic geometry. (Note that this formula is not valid in non-orthonormal bases.) The length in an orthonormal basis becomes

$$\|\boldsymbol{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} ,$$

which is also a well-known formula. In a Euclidean space we define the distance between two points p and q as the norm of the vector p – q.

Because points correspond to vectors, for a fixed origin, and vectors correspond to column matrices, for a fixed basis, there is also a one-to-one correspondence between points and column matrices. A pair (origin, basis) is called a frame or coordinate system. For a fixed frame, points correspond to column matrices:

$$\mathbf{p} \overset{(E,\mathbf{o})}{\longleftrightarrow} X .$$

The elements of the matrix associated with a point in a given frame are called the coordinates of the point in that frame. The correspondences between points, vectors and column matrices are very important, because they provide us with the computational tools we need to represent and manipulate these entities. Matrices are easy to represent as arrays in any modern programming language, and operations such as vector addition also are easy to implement. Point and vector properties are computed by using their coordinates or components in a convenient frame or basis. For example, the distance between two points p and q with coordinate matrices X and Y is evaluated by the familiar expression

$$d(\mathbf{p},\mathbf{q}) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2} .$$
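The component formulas above translate directly into array code. The following is a minimal sketch in Python, assuming an orthonormal basis and representing column matrices as plain lists; the function names `dot`, `norm`, `angle` and `distance` are illustrative choices, not part of the course notes.

```python
import math

def dot(x, y):
    """Inner product of two component lists (valid in an orthonormal basis only)."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length of a vector: square root of its inner product with itself."""
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle in radians between two non-zero vectors."""
    c = dot(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    return math.acos(max(-1.0, min(1.0, c)))

def distance(p, q):
    """Distance between two points: norm of the difference p - q."""
    return norm([a - b for a, b in zip(p, q)])

x = [3.0, 4.0]
y = [-4.0, 3.0]
print(norm(x))         # length of x
print(dot(x, y))       # zero here, so x and y are orthogonal
print(distance(x, y))  # distance between the points with these coordinates
```

Because the functions iterate over components, the same code works unchanged in 2-D, 3-D or n-D.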

Finally, there is an additional operation on vectors, called the vector product (also known as cross, or exterior product), that is very useful, especially in 3-D. Here we define it in terms of components in a right-handed, orthonormal, 3-D basis:

$$\boldsymbol{x} \times \boldsymbol{y} = (x_2 y_3 - x_3 y_2)\,\boldsymbol{e}_1 + (x_3 y_1 - x_1 y_3)\,\boldsymbol{e}_2 + (x_1 y_2 - x_2 y_1)\,\boldsymbol{e}_3 .$$

The result of a cross product is not truly a vector, and its definition depends on the orientation or handedness of a basis. Right-handed orthonormal bases in 2 and 3-D are shown in Figure 2.1.2. In this course we always use right-handed bases, and we can ignore the subtleties of vector-product definitions.

Figure 2.1.2 – Right-handed bases (2-D basis e1, e2; 3-D basis e1, e2, e3)

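The cross product of two parallel vectors is zero. For two non-parallel vectors, x and y, the cross-product x × y is perpendicular to both x and y. In particular, if E is a right-handed orthonormal basis in 3-D, then

$$\boldsymbol{e}_1 \times \boldsymbol{e}_2 = \boldsymbol{e}_3, \qquad \boldsymbol{e}_2 \times \boldsymbol{e}_3 = \boldsymbol{e}_1, \qquad \boldsymbol{e}_3 \times \boldsymbol{e}_1 = \boldsymbol{e}_2 .$$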

These equations are convenient for completing a basis when two of its vectors are known.
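As an illustration of completing a basis, here is a short Python sketch of the component formula for the cross product. It assumes components in a right-handed orthonormal 3-D basis, stored as three-element lists; the name `cross` is an illustrative choice.

```python
def cross(x, y):
    """Cross product of two 3-D component lists in a right-handed orthonormal basis."""
    return [
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    ]

# Complete a basis when two of its vectors are known.
e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]
e3 = cross(e1, e2)
print(e3)

# Parallel vectors have a zero cross product.
print(cross([2.0, 0.0, 0.0], [4.0, 0.0, 0.0]))
```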

2.2 Transformations

Moving, sizing, and deforming objects are fundamental operations in geometric modeling. Since objects are sets of points, what we need are transformations that map points onto other points. The following subsections discuss linear and affine transformations, which are the simplest and most commonly used in geometric modeling. For simplicity we assume a fixed origin, and make no distinction between points and vectors.

2.2.1 Linear Transformations

A transformation T from a vector space into itself is linear if it distributes over linear combinations, i.e.,

$$T(a\boldsymbol{x} + b\boldsymbol{y}) = a\,T(\boldsymbol{x}) + b\,T(\boldsymbol{y}).$$

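Suppose we have two bases

$$E = \begin{bmatrix} \boldsymbol{e}_1 & \cdots & \boldsymbol{e}_n \end{bmatrix}, \qquad F = \begin{bmatrix} \boldsymbol{f}_1 & \cdots & \boldsymbol{f}_n \end{bmatrix}$$

and we want a linear transformation that maps the vectors of E onto the vectors of F (see Figure 2.2.1.1):

$$T_{ef}(\boldsymbol{e}_i) = \boldsymbol{f}_i , \qquad i = 1, \ldots, n.$$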

Figure 2.2.1.1 - Transforming a basis E and a vector x (e1, e2 map onto f1, f2; x maps onto y).

What is the effect of such a transformation on an arbitrary vector? Let x be a vector and y its transformed version

$$\boldsymbol{y} = T_{ef}(\boldsymbol{x}).$$

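Write the vector x in terms of its components in basis E

$$\boldsymbol{x} = E X^e$$

and use linearity to obtain

$$\boldsymbol{y} = T_{ef}(\boldsymbol{x}) = \begin{bmatrix} T_{ef}(\boldsymbol{e}_1) & \cdots & T_{ef}(\boldsymbol{e}_n) \end{bmatrix} X^e = \begin{bmatrix} \boldsymbol{f}_1 & \cdots & \boldsymbol{f}_n \end{bmatrix} X^e .$$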

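Now replace the basis vectors F by their components in basis E

$$\boldsymbol{y} = \begin{bmatrix} E F_1^e & \cdots & E F_n^e \end{bmatrix} X^e = E \begin{bmatrix} F_1^e & \cdots & F_n^e \end{bmatrix} X^e .$$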

This shows that the components of y in basis E are

$$Y^e = M^e X^e$$

where M^e is an n by n matrix whose columns are the components of the vectors of F in the basis E:

$$M^e = \begin{bmatrix} F_1^e & \cdots & F_n^e \end{bmatrix}.$$

The last two equations are important for several reasons.

1. They show that, for a fixed basis E, each transformation corresponds to a square matrix. Earlier on we had a correspondence between vectors and column matrices, and now we have a correspondence between linear transformations and square matrices.

2. They give us computational tools for evaluating the effect of a transformation on a vector. We simply multiply the square matrix that corresponds to the transformation by the column matrix that corresponds to the vector. This is easily done via array operations in any modern programming language.

3. They tell us how to construct a matrix that maps the vectors of a basis onto another.

Let us illustrate this matrix construction procedure. Many geometric modeling systems attach coordinate frames to objects, and provide facilities for placing objects by aligning frames. Consider the rectangle shown on the left in Figure 2.2.1.2, and suppose we want to orient it such that it aligns with the rectangle on the right. All we need is a transformation with a matrix whose columns are the components of the vectors F in the basis E. (In 2-D it is easy to determine the required transformation by other means, but in a 3-D example it would not be as easy.)

Figure 2.2.1.2 - Orienting an object by frame alignment (frame e1, e2 on the left; frame f1, f2 on the right)

For a specific example, let us determine the matrix that corresponds to a counterclockwise rotation by an angle θ. The components of the vectors F are easy to calculate from elementary trigonometry, as shown in Figure 2.2.1.3.

Figure 2.2.1.3 - Derivation of a rotation matrix (f1 has components cos θ and sin θ, and f2 has components –sin θ and cos θ, in the basis e1, e2)

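We obtain

$$F_1^e = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \qquad F_2^e = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix},$$

and therefore the rotation matrix is

$$M^e = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$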
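The construction just described (place the components of the new basis vectors in the columns of a matrix, then multiply) can be sketched in a few lines of Python. This is only an illustration under the assumption that matrices are stored as lists of rows; `matrix_from_columns` and `mat_vec` are hypothetical helper names.

```python
import math

def matrix_from_columns(columns):
    """Build a square matrix (list of rows) whose j-th column is columns[j]."""
    n = len(columns)
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def mat_vec(m, x):
    """Multiply a square matrix by a column matrix stored as a list."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

theta = math.radians(90.0)
f1 = [math.cos(theta), math.sin(theta)]    # components of f1 in basis E
f2 = [-math.sin(theta), math.cos(theta)]   # components of f2 in basis E
M = matrix_from_columns([f1, f2])

# By construction the matrix maps e1 onto f1 and e2 onto f2.
print(mat_vec(M, [1.0, 0.0]))
print(mat_vec(M, [0.0, 1.0]))
```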

Composition of successive transformations corresponds to matrix multiplication

$$T_2(T_1(\boldsymbol{x})) \overset{E}{\longleftrightarrow} M_2^e M_1^e X^e .$$

And the inverse transformation, which maps a basis F onto a basis E, corresponds to the inverse matrix

$$M_{fe}^e = \left( M_{ef}^e \right)^{-1}.$$

Matrix multiplication is not commutative, i.e., in general AB ≠ BA for arbitrary square matrices A and B. The inverse of a matrix product reverses the order of the matrices:

$$(AB)^{-1} = B^{-1} A^{-1}$$
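A small numerical check may make these two facts concrete. The sketch below, in Python, multiplies a scaling and a shear in both orders and verifies the inverse-of-a-product rule; `mat_mul` and `inverse_2x2` are illustrative helpers written for 2 by 2 matrices only.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a 2 by 2 matrix; raises ValueError if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 0.0], [0.0, 1.0]]   # scaling by 2 along x
B = [[1.0, 1.0], [0.0, 1.0]]   # shear

print(mat_mul(A, B))   # differs from the next line: the order matters
print(mat_mul(B, A))

# The inverse of a product is the product of the inverses in reverse order.
print(inverse_2x2(mat_mul(A, B)))
print(mat_mul(inverse_2x2(B), inverse_2x2(A)))
```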

Note that some linear transformations do not map a basis onto another. We will see examples of these later.

2.2.2 Specific Linear Transformations

Here we investigate several interesting linear transformations in 2-D. The results apply also to 3-D, with minor and obvious modifications.

Scaling – Consider the transformation with matrix

$$\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}.$$

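To study its effect on a vector we multiply the corresponding matrices

$$\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax \\ by \end{bmatrix}.$$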

Here we are denoting the components of a 2-D vector by the customary x and y. The result is a scaling by factors a and b along the x and y axes. If a = b the scaling is uniform or isotropic and alters the size of an object but not its shape. If both scale factors equal unity, the transformation is the identity and does not modify the object.

Figure 2.2.2.1 illustrates anisotropic scaling by its effect on a square located at the origin.

Figure 2.2.2.1 – Non-uniform scaling (a = 3, b = 2)

Shear – Now let one of the off-diagonal elements of the matrix be non-zero. The result is a shear, with the following matrix, and with the effect shown in Figure 2.2.2.2.

$$\begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + ay \\ y \end{bmatrix}.$$

Figure 2.2.2.2 – Shear (a = 1)

Rotation – As we saw earlier, the matrix is

$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$

Figure 2.2.2.3 – Rotation (θ = –30°)

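Reflection – Scalings with negative factors produce reflections. A reflection about the y axis, which negates the x component, is shown below.

$$\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ y \end{bmatrix}.$$

Figure 2.2.2.4 – Reflection about the vertical axis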

Reflections about the horizontal axis, or about the origin can be constructed similarly.

Orthographic projection – Consider now

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ 0 \end{bmatrix}.$$

This transformation zeroes the y component and does not affect the x component. It corresponds to a perpendicular or orthographic projection on the x axis.

Figure 2.2.2.5 – Orthographic projection on the horizontal axis

Orthographic projection does not map a basis onto another basis. It is called a singular transformation, and cannot be inverted. The projection causes a loss of information about the y components of the vectors. Knowledge of the x component is insufficient to recover a vector, because many vectors project on the same point of the x axis.
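The transformations of this subsection can be tried out on the corners of a unit square, in the spirit of the figures. The Python sketch below is illustrative only: the helper names `apply` and `det2` are assumptions, and the parameter values (a = 3, b = 2 for scaling, a = 1 for shear) follow the figure captions. A zero determinant is used here as the test for a singular transformation.

```python
def apply(m, points):
    """Apply a 2 by 2 matrix to a list of (x, y) points."""
    return [(m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)
            for x, y in points]

def det2(m):
    """Determinant of a 2 by 2 matrix; zero means the transformation is singular."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # unit square located at the origin

scaling = [[3, 0], [0, 2]]       # non-uniform scaling, a = 3, b = 2
shear = [[1, 1], [0, 1]]         # shear with a = 1
reflection = [[-1, 0], [0, 1]]   # reflection about the vertical (y) axis
projection = [[1, 0], [0, 0]]    # orthographic projection on the x axis

for name, m in [("scaling", scaling), ("shear", shear),
                ("reflection", reflection), ("projection", projection)]:
    print(name, apply(m, square), "det =", det2(m))
```

Note that the projection sends two different corners to the same point, which is why it cannot be inverted.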

2.2.3 Rigid Motions

A translation is a mapping that associates to each vector x the sum x + δ, where δ is a constant vector. Translations are not linear transformations and cannot be computed by matrix multiplication as we have been doing (but see Section 2.5.1 below). The components of a translated vector y = x + δ are

$$Y = X + D ,$$

where D is the column matrix that corresponds to the translation vector δ. A translation is shown in Figure 2.2.3.1.

Figure 2.2.3.1 - Translation

Compositions of translations and linear transformations are called affine transformations. Both translations and linear transformations are practically important, and their non-uniform behavior with respect to components is computationally inconvenient. Separate procedures must be written for dealing with translations and linear transformations, and they cannot be composed by matrix multiplication. (We will see later that both transformations can be treated uniformly if we introduce homogeneous coordinates.)

Typically, in geometric modeling we do not want to change the shape of a transformed object. Transformations are applied primarily to locate and orient objects. Transformations that preserve distance are called isometries (from the Greek, meaning “same measure”). Isometries that also preserve the signed angles between vectors are called in this course rigid motions. (This is not entirely standard terminology; some texts consider “rigid motions” and “isometries” as synonyms.) It can be shown that rigid motions are affine and must be compositions of translations and rotations.

The matrix that corresponds to a rotation in an orthonormal basis is a special case of a so-called orthogonal matrix. These matrices can be inverted easily, by transposition:

$$M_{orth}^{-1} = M_{orth}^{\,t} .$$

General matrix inversion requires numerical procedures, which tend to be unstable when the matrix is almost singular, and always introduce numerical errors. But inversion of rotation matrices can be done swiftly and without round-off, by swapping rows with columns.
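To illustrate inversion by transposition, the following Python sketch builds a 2-D rotation matrix, transposes it, and checks that the product with the original is the identity up to floating-point round-off in the multiplication. The helper names and the 30-degree angle are arbitrary choices for this example.

```python
import math

def rotation(theta):
    """2-D counterclockwise rotation matrix in an orthonormal basis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    """Swap rows with columns; no arithmetic is performed on the entries."""
    return [list(row) for row in zip(*m)]

def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

R = rotation(math.radians(30.0))
R_inv = transpose(R)          # inverse obtained without any division
print(mat_mul(R_inv, R))      # approximately the identity matrix
```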

Sets that are related by a rigid motion are called congruent. In the geometric modeling jargon we often refer to an entire class of sets congruent to one another as a rigid object, and call each individual set an instance of the rigid object. The location and orientation of an instance, collectively called its pose, may be defined by a rigid motion that takes the set from an initial, standard pose to its final pose. Figure 2.2.3.2 shows several congruent triangles in the plane.

Figure 2.2.3.2 – Instances of a triangle

The notion of congruence is fundamental in Euclidean geometry. Euclid’s original formulation defined congruence informally: two figures were said to be congruent if they could be “superposed”. The rigorous definition in terms of rigid motions is only about one century old. In the spirit of Felix Klein’s famous “Erlangen program”, Euclidean geometry may be viewed as the study of those properties of geometric objects that are invariant under rigid motions, and two objects are considered equivalent if they are related by a rigid motion, i.e., if they are congruent. Rigid motion invariants (also called Euclidean properties) are such things as distances, angles, perpendicularity, and so on, which are the main subjects of study in high school geometry. We will see later that there are other kinds of geometries, each with its fundamental transformations, analogous to the rigid motions of Euclidean geometry, and with its notion of geometrical equivalence, analogous to congruence.

2.3 Free and Applied Vectors

We defined translation of a vector x as the addition to x of a vector δ, as shown in Figure 2.3.1. Vector translation does not correspond to the intuitive notion of translation of an “arrow” by translating its endpoints, without changing the length or the direction of the arrow. An alternative notion of vector that behaves more like an arrow also is useful in geometric modeling, as the following example illustrates.

Figure 2.3.1 – Translation by vector addition (x and x + δ)

Consider the right, circular cylinder shown on the left in Figure 2.3.2. The cylinder is characterized completely by two scalar parameters (its diameter D and height H), plus a point c (the center of a base) and a vector a along the cylinder’s axis. Suppose now that we want to move the cylinder to a different location and orientation, shown on the right in the figure. Mathematically, moving the cylinder corresponds to applying a rigid motion T to it. How can we compute the values c′ and a′ that characterize the cylinder after the application of T? Clearly c′ = T(c). But a′ ≠ T(a) because T has a translational component.

Figure 2.3.2 – Moving a cylinder (diameter D, height H, base center c and axis vector a; c′ and a′ after the motion)

This example, and many similar situations, can be handled conveniently if we introduce a new entity, called a free vector, that is not affected by translations. Free vectors and the ordinary vectors defined earlier differ only in their behavior with respect to translations.

The cylinder in our example can be described by scalars D and H, point c, and free vector a. Intuitively, it is helpful to think of a as being attached to the point c. This notion may be formalized by defining yet another entity, called an applied vector, which consists of a pair (p, x), where p is a point and x a free vector. Equivalently, we can define an applied vector as a pair of endpoints (p, q) with q = p + x. An applied vector is transformed by applying a transformation to both endpoints. In Figure 2.3.2 the pair (c, a) is an applied vector, which transforms as shown on the right in the figure.

Free and applied vectors are used extensively in geometric modeling. For example, the normal direction to a surface is often represented by a free vector plus the point at which the normal is calculated, i.e., by an applied vector. (Point information is unnecessary for planar surfaces, which have a single, constant normal.) Tangential directions for curves are treated similarly.
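The different behavior of points, free vectors and applied vectors under a rigid motion can be sketched in code. The Python example below is a 2-D illustration under stated assumptions: the rigid motion is represented as a rotation matrix r followed by a translation d, and the function names are hypothetical. It checks that transforming both endpoints of an applied vector gives the same result as rotating its free vector.

```python
def rotate(r, v):
    """Apply a 2 by 2 rotation matrix to a pair of components."""
    return (r[0][0] * v[0] + r[0][1] * v[1], r[1][0] * v[0] + r[1][1] * v[1])

def move_point(r, d, p):
    """Rigid motion of a point: rotation followed by translation."""
    q = rotate(r, p)
    return (q[0] + d[0], q[1] + d[1])

def move_free_vector(r, d, a):
    """Rigid motion of a free vector: the translation d is ignored."""
    return rotate(r, a)

def move_applied_vector(r, d, applied):
    """Transform both endpoints of an applied vector (p, a) and rebuild the pair."""
    p, a = applied
    q = (p[0] + a[0], p[1] + a[1])
    p2, q2 = move_point(r, d, p), move_point(r, d, q)
    return (p2, (q2[0] - p2[0], q2[1] - p2[1]))

r = [[0, -1], [1, 0]]   # rotation by 90 degrees counterclockwise
d = (5, 0)              # translation
c = (1, 0)              # center of the cylinder base
a = (0, 2)              # axis vector

print(move_point(r, d, c))                 # new center c'
print(move_free_vector(r, d, a))           # new axis a'
print(move_applied_vector(r, d, (c, a)))   # same c' and a', via the endpoints
print(move_point(r, d, a))                 # treating a as a point gives a different result
```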

2.4 Change of Basis

Let x be a vector with components $X^e$ in basis E. Consider a new basis F, obtained from E by a transformation $T_{ef}$, with a corresponding matrix $M_{ef}^e$ in basis E. Each vector of the new basis can be written in terms of its components as

$$\boldsymbol{f}_i = E F_i^e$$

and we can summarize these n equations by the matrix equality

$$F = \begin{bmatrix} \boldsymbol{f}_1 & \cdots & \boldsymbol{f}_n \end{bmatrix} = \begin{bmatrix} E F_1^e & \cdots & E F_n^e \end{bmatrix} = E M_{ef}^e .$$

What are the components of x in the new basis? Since

$$\boldsymbol{x} = E X^e = F X^f = E M_{ef}^e X^f$$

we conclude that

$$X^e = M_{ef}^e X^f$$

Recall that when we apply a transformation $T_{ef}$ to a vector x, the vector is transformed into a second vector y, and the column matrices of the two vectors are related by

$$Y^e = M_{ef}^e X^e$$

The last two equations are very similar but correspond to different procedures. The first governs the change of components of a fixed vector when a basis changes. The second gives the components of a transformed vector in a fixed basis. Note that the same matrix is involved in the two equations, but there is a “reversal of direction” between the two matrix actions. The change of basis equation can also be written in terms of the inverse matrix

$$X^f = \left( M_{ef}^e \right)^{-1} X^e .$$

The need for inversion has a simple geometric interpretation, illustrated by the example of Figure 2.4.1. Consider a vector x in basis E, on the left in the figure. If we rotate the basis by an angle θ to obtain basis F, as shown in the center of the figure, the components of x change. Now let us apply in basis E the inverse transformation to x, rotating it by the angle –θ, so as to obtain y, as shown on the right. The components of y in basis E are the same as the components of the original x in basis F.

Figure 2.4.1 – Vector transformation versus change of basis (left: x in basis e1, e2; center: x with the rotated basis f1, f2; right: y in basis e1, e2)
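The distinction between changing the basis and transforming the vector can be checked numerically. The Python sketch below assumes that F is obtained from E by a 90-degree counterclockwise rotation, so that the inverse of the matrix is simply its transpose; the helper names and the sample components are illustrative.

```python
def mat_vec(m, x):
    """Multiply a square matrix (list of rows) by a column matrix (list)."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def transpose(m):
    """Transpose; for a rotation matrix this is also the inverse."""
    return [list(row) for row in zip(*m)]

M = [[0, -1], [1, 0]]   # columns are the components of f1 and f2 in basis E
x_e = [2, 1]            # components of x in basis E

# Change of basis: same vector, new components (uses the inverse matrix).
x_f = mat_vec(transpose(M), x_e)

# Transformation: new vector y = T(x), components still in basis E.
y_e = mat_vec(M, x_e)

print(x_f)               # components of x in basis F
print(y_e)               # components of the rotated vector in basis E
print(mat_vec(M, x_f))   # recovers x_e, confirming X^e = M X^f
```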

2.5 Homogeneous Coordinates

We begin this section with a pragmatic view of homogeneous coordinate methods. We then explain them geometrically, and finally show how they can be used to compute perspective projections.

2.5.1 Transformations in Homogeneous Coordinates

Translations and linear transformations can be treated more uniformly if we introduce a different system of coordinates, called homogeneous coordinates. For simplicity we work in 2-D, but generalizations to 3-D or n-D are straightforward. We continue to make no distinction between points and ordinary vectors. Suppose that we have a vector x with components X, and want to apply to it a linear transformation with matrix M, so as to obtain another vector y with components Y. We introduce an additional component and associate with the vector x the column matrix

$$X^* = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ 1 \end{bmatrix}.$$

The elements of X* are called homogeneous coordinates. We also add a third row and column to the linear transformation matrices as follows

$$M^* = \begin{bmatrix} a & b & 0 \\ c & d & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

This matrix can be written in block format as

$$M^* = \begin{bmatrix} M & 0 \\ 0 & 1 \end{bmatrix},$$

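where M is the usual 2 by 2 linear transformation matrix, and the two-zero row and column are both denoted by 0. Multiplying the matrices

$$M^* X^* = \begin{bmatrix} M & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ 1 \end{bmatrix} = \begin{bmatrix} MX \\ 1 \end{bmatrix} = \begin{bmatrix} Y \\ 1 \end{bmatrix} = Y^*$$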

shows how to evaluate the effects of a linear transformation in the new, augmented-matrix format. Scalings, shears, rotations, and so on, can be achieved by replacing M in the 3 by 3 matrix above by the various matrices we discussed earlier.

Until now we have gained nothing with this approach, and we have lost some computational efficiency because 3 by 3 matrices require more storage and more computation than their 2-D counterparts. But let us now investigate what happens if the elements of the third column of the matrix become non-zero. Consider

$$M^* = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix},$$

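and apply it to a generic vector:

$$Y^* = M^* X^* = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + a \\ y + b \\ 1 \end{bmatrix}.$$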

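This is precisely the result of translating x by a vector with components (a, b). Therefore we have found a method for computing both translations and linear transformations by matrix multiplication. In particular, rigid motions in the plane are associated with 3 by 3 matrices. In 3-D they correspond to 4 by 4 matrices. For reference, the three matrices that correspond to rotations by an angle θ about the x, y and z axes are:

$$x:\ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad y:\ \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad z:\ \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$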

Uniform treatment of translations and rotations is computationally important. It implies that we only need one procedure to implement both, and that matrix-multiplication hardware can be used for both. We will see later that homogeneous coordinates also can deal with projections, which are needed for displaying objects.
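As a rough illustration of this uniform treatment, the Python sketch below composes a rotation and a translation in the plane with a single matrix-multiplication routine. The helper names are hypothetical, and a 90-degree rotation is used so that all entries are exact integers.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def translation(a, b):
    """3 by 3 homogeneous matrix of a 2-D translation by (a, b)."""
    return [[1, 0, a], [0, 1, b], [0, 0, 1]]

def rotation_90():
    """3 by 3 homogeneous matrix of a counterclockwise rotation by 90 degrees."""
    return [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

def transform(m, point):
    """Apply a homogeneous matrix to a 2-D point given as (x, y)."""
    x, y = point
    result = mat_mul(m, [[x], [y], [1]])
    return (result[0][0], result[1][0])

# The rightmost matrix acts first: rotate, then translate.
rotate_then_translate = mat_mul(translation(2, 3), rotation_90())
translate_then_rotate = mat_mul(rotation_90(), translation(2, 3))

print(transform(rotate_then_translate, (1, 0)))
print(transform(translate_then_rotate, (1, 0)))
```

The two results differ, which is the non-commutativity of matrix multiplication seen from the point of view of rigid motions.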


2.5.2 Geometric Interpretation

Homogeneous-coordinate methods were introduced above as convenient “recipes”. But they have a rich body of mathematics and geometric intuition underlying them. Here we explore it briefly. First we generalize slightly, and write the homogeneous coordinates of a Euclidean point p as

$$P^* = \begin{bmatrix} x \\ y \\ w \end{bmatrix}.$$

We have increased the dimension of our space by one. In addition, since we identify points at w=1 with Euclidean points, we have placed the standard Euclidean plane at w=1. Figure 2.5.2.1 illustrates this construction.

Figure 2.5.2.1 - The Euclidean plane imbedded in an auxiliary 3-space (axes x, y, w; plane w=1; line L through the origin and the points p, q, r)

Now we connect a Euclidean point p with the origin, obtaining a straight line L. It is clear that for each p there is a corresponding L. Furthermore, since L goes through the origin, it is uniquely determined by any point q lying on it, other than the origin itself. This implies that p also is uniquely determined by any such q of L. The coordinates of any such q are called the homogeneous coordinates of p. If we multiply the three coordinates by the same non-zero number k, we obtain another point r of the same L. Points q and r (or any other points of L) are equivalent, in that they all define the same line and also the same Euclidean point p. Therefore, the homogeneous coordinates of p may be multiplied by any non-zero number without affecting the point. In particular, if w≠1 and is not zero, we can always scale all the components so as to normalize the coordinates:

$$\begin{bmatrix} x/w \\ y/w \\ 1 \end{bmatrix}.$$
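Normalization is a one-line computation. The Python sketch below, with the illustrative name `normalize`, divides by the last coordinate and refuses to do so when w = 0, in which case there is no corresponding Euclidean point.

```python
def normalize(h):
    """Scale homogeneous coordinates so that the last one equals 1."""
    *coords, w = h
    if w == 0:
        raise ValueError("w = 0: no corresponding Euclidean point")
    return [c / w for c in coords] + [1.0]

# Two sets of homogeneous coordinates that differ by a factor k = -2.5
# lie on the same line through the origin and give the same Euclidean point.
print(normalize([4.0, 6.0, 2.0]))
print(normalize([-10.0, -15.0, -5.0]))
```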

The set of all lines through the origin of our auxiliary 3-D space is called the projective plane. The elements of the projective plane are called projective points. (This terminology can be confusing since projective points are actually lines.) For each Euclidean point p there is a corresponding line L and projective point p∗. Therefore we can manipulate Euclidean points through operations on their projective counterparts. This is precisely what we are doing when we multiply the homogeneous coordinates of a point by a 3 by 3 matrix. Figure 2.5.2.2 illustrates the procedure. Our goal is to apply an affine transformation T to a Euclidean point p. We do this in a roundabout fashion. First we imbed the Euclidean point in projective space, obtaining p∗. Then we apply a projective transformation T*, which corresponds to a matrix M*, and generate a transformed projective point q∗. Finally, we normalize, i.e. project the result on w=1, to get the desired q.

Figure 2.5.2.2 (diagram: T maps p to q directly; alternatively, Imbed maps p to p*, T* maps p* to q*, and Normalize maps q* to q)

A Euclidean line lying in w=1 corresponds to a plane through the origin of the auxiliary 3-space. Such planes are the projective lines. Therefore, each Euclidean line has a corresponding projective line.

The projective plane is “larger” than the Euclidean plane, because there are lines through the origin that do not intersect the w=1 plane. These are the lines that lie in the w=0 plane. That is, the projective plane has an additional projective line, which is the w=0 plane of the auxiliary 3-space. A projective point with homogeneous coordinates (a, b, 0) does not have a corresponding Euclidean point. What is the geometric meaning of such a point?

To answer this question let us consider a Euclidean point with homogeneous coordinates

$$\mathbf{p}(t) = \begin{bmatrix} at + c \\ bt + d \\ 1 \end{bmatrix}.$$

When t varies from minus infinity to plus infinity, this point traces in w=1 a Euclidean line with direction defined by (a, b). Because these are homogeneous coordinates, we can divide all of them by t (for t ≠ 0), without affecting the corresponding point:

$$\mathbf{p}(t) = \begin{bmatrix} a + c/t \\ b + d/t \\ 1/t \end{bmatrix}.$$

When t tends to either plus or minus infinity, the homogeneous coordinates of p tend to

$$\begin{bmatrix} a \\ b \\ 0 \end{bmatrix}.$$

Therefore this is a point at infinity along the line of direction (a, b). It is not a Euclidean point, but it is a projective point. We see that the projective plane consists of all the points of the Euclidean plane augmented by the points at infinity.

Note that all parallel lines with a given direction have the same point at infinity. More surprising is the fact that the same point at infinity is reached travelling towards both plus and minus infinity. This implies that a projective line is more like a circle than an ordinary line. The projective plane is a closed surface with interesting properties (which we will not investigate in this course).

2.5.3 Change of Frames

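Consider two frames E = (E, q) and F = (F, r), as shown in Figure 2.5.3.1. A generic point p corresponds to two different vectors in E and F, satisfying the following relations:

$$\mathbf{p} \overset{E}{\longleftrightarrow} \boldsymbol{x}, \qquad \mathbf{p} \overset{F}{\longleftrightarrow} \boldsymbol{y}, \qquad \boldsymbol{x} = \boldsymbol{\delta} + \boldsymbol{y}, \qquad \boldsymbol{\delta} = \mathbf{r} - \mathbf{q}$$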

Figure 2.5.3.1 – Change of frames (frame e1, e2 with origin q; frame f1, f2 with origin r; point p with vectors x and y)

The coordinate matrices of p in frames E and F, denoted $P^e$ and $P^f$, are given by

$$P^f = Y^f$$

$$P^e = X^e = Y^e + D^e = M_{ef}^e Y^f + D^e = M_{ef}^e P^f + D^e$$

where $D^e$ is the column matrix of the components of δ in basis E, and $M_{ef}^e$, as usual, denotes the matrix that corresponds in basis E to the linear transformation that maps the vectors of basis E onto those of basis F. The second equation immediately above follows from coordinate relations in vector translation and change of basis, derived earlier. The two equations may be summarized in matrix form as

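$$\begin{bmatrix} P^e \\ 1 \end{bmatrix} = \begin{bmatrix} M_{ef}^e & D^e \\ 0 & 1 \end{bmatrix} \begin{bmatrix} P^f \\ 1 \end{bmatrix}$$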

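or

$$\left( P^* \right)^e = \left( M_{ef}^* \right)^e \left( P^* \right)^f$$

with

$$\left( P^* \right)^e = \begin{bmatrix} P^e \\ 1 \end{bmatrix}, \qquad \left( P^* \right)^f = \begin{bmatrix} P^f \\ 1 \end{bmatrix}, \qquad \left( M_{ef}^* \right)^e = \begin{bmatrix} M_{ef}^e & D^e \\ 0 & 1 \end{bmatrix}.$$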

These equations show that the effect of a change of frame on the homogeneous coordinates of a point is analogous to the effect of a change of basis on the components of a vector. Whereas a change of basis is associated with multiplication by a matrix $M_{ef}^e$, a change of frame involves an augmented matrix. This matrix has a simple interpretation. First note that the frame F may be defined by an affine transformation that maps the origin q of frame E onto the new origin r, and maps the basis E onto the new basis F. This transformation may be decomposed into a rotation that maps E onto F, followed by a translation by the vector δ. In homogeneous coordinates the composition corresponds to the matrix product

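$$\begin{bmatrix} I & D^e \\ 0 & 1 \end{bmatrix} \begin{bmatrix} M_{ef}^e & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} M_{ef}^e & D^e \\ 0 & 1 \end{bmatrix} = \left( M_{ef}^* \right)^e .$$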

Therefore $\left( M_{ef}^* \right)^e$ is the matrix that corresponds to the projective transformation that maps frame E onto frame F. This is a direct analog of the situation we encountered in a change of basis.

We know that the columns of matrix $M_{ef}^e$ consist of the components of the vectors of F in the basis E. Therefore, the frame F = (F, r) corresponds to the homogeneous-coordinate matrix

$$\left( M_{ef}^* \right)^e = \begin{bmatrix} F_1^e & \cdots & F_n^e & D^e \\ 0 & \cdots & 0 & 1 \end{bmatrix}.$$

This matrix may also be interpreted as follows. The first n columns contain the coordinates of the points at infinity in the directions of the vectors of basis F. The last column is the set of coordinates of a Euclidean point, the origin r of the frame. All of these coordinates are relative to frame E.
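A frame can therefore be stored as a single augmented matrix. The Python sketch below assembles such a matrix from the basis vectors and origin of a frame F, all given in frame E, and uses it to convert coordinates from F to E. The function names and the sample frame (basis rotated by 90 degrees, origin at (2, 1)) are assumptions made for this example.

```python
def frame_matrix(basis_columns, origin):
    """Augmented matrix of a frame: basis vectors and origin as columns,
    with a last row of zeros followed by a one."""
    n = len(origin)
    rows = [[basis_columns[j][i] for j in range(n)] + [origin[i]]
            for i in range(n)]
    rows.append([0] * n + [1])
    return rows

def to_frame_e(m_star, p_f):
    """Coordinates in frame E of a point whose coordinates in frame F are p_f."""
    column = list(p_f) + [1]
    result = [sum(a * b for a, b in zip(row, column)) for row in m_star]
    return result[:-1]

f1 = [0, 1]     # components of f1 in basis E
f2 = [-1, 0]    # components of f2 in basis E
r = [2, 1]      # origin of frame F, in frame E

M_star = frame_matrix([f1, f2], r)
print(M_star)
print(to_frame_e(M_star, (0, 0)))   # the origin of F, expressed in E
print(to_frame_e(M_star, (1, 0)))   # the tip of f1 attached at r, expressed in E
```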

2.5.4 Perspective

Thus far we have only used homogeneous-coordinate matrices with a last row whose off-diagonal elements are null. Let us now investigate what happens when they are non-null. Consider the product

1 0 0

0 1 0

−1/ d 0 1

x

y

1

=x

y

1 − x / d

.

(We use a –1/d term for reasons that will be obvious soon.) The result is no longer on thew=1 plane. Normalizing it we obtain

x

1 − x /d

y1 − x /d

1

.

What is the physical meaning of this transformation? We will answer this question with the help of Figure 2.5.4.1, which shows how to project a point on the y axis of the Euclidean plane from a center of projection v lying on the x axis at x=d. By similarity of triangles

\[
\frac{y'}{y} = \frac{d}{d - x}, \qquad y' = \frac{y}{1 - x/d}.
\]

This is precisely the y coordinate we computed above by matrix multiplication.
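As a quick numerical check, the sketch below applies the 3x3 matrix, normalizes, and compares the result with the similar-triangles formula. The function names and the sample values of x, y and d are assumptions chosen for illustration.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def perspective_2d(x, y, d):
    """Apply the 2-D perspective matrix to (x, y, 1) and normalize."""
    m = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [-1.0 / d, 0.0, 1.0]]
    hx, hy, w = mat_vec(m, [x, y, 1.0])
    if w == 0.0:
        raise ValueError("the point maps to a point at infinity")
    return hx / w, hy / w


def similar_triangles_y(x, y, d):
    """The y' coordinate obtained directly from y'/y = d/(d - x)."""
    return y * d / (d - x)


x, y, d = 1.0, 2.0, 4.0          # hypothetical sample values
xp, yp = perspective_2d(x, y, d)
```

For these values both routes give y' = 8/3, up to rounding.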


Figure 2.5.4.1 – Central projection of a 2-D point on the vertical axis (labels: axes x and y, point p, projected coordinate y', center of projection v at distance d)

In 3-D, an analogous argument shows that multiplication by the matrix

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -1/d & 1 \end{bmatrix}
\]

provides us with the x and y coordinates of the projection of a point on the xy plane, from a center of projection on the z axis at z=d. Projection on a plane is a fundamental operation for the generation of a display (see Figure 2.5.4.2).

Figure 2.5.4.2 – Drawing an object by projecting it on a plane (labels: Eye, Object, Projection)

The two-dimensional perspective transformation discussed earlier affected both the x and y coordinates of a point. Figure 2.5.4.3 illustrates the effect of a perspective on a rectangle. The result is not a 1-D projection on the y axis. Rather, it is a deformed 2-D object. Orthographic projection of this deformed object on the y axis produces the desired 1-D image.


Figure 2.5.4.3 – Perspective transformation applied to a 2-D solid (labels: axis y, viewpoint v, points p and q, transformed points p' and q')

In 3-D the perspective transformation produces a deformed 3-D object, which must be projected orthographically onto the xy plane to generate the desired 2-D image. Computing a planar projection involves matrix multiplication, followed by normalization and orthographic projection. The latter involves essentially no computation, since it amounts to ignoring the z coordinate. But normalization is relatively expensive, because it requires a division.
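The three steps (matrix multiplication, normalization, orthographic projection) can be sketched as follows. This is a minimal illustration, not a production routine: the name `project_on_xy`, the guard against points at or behind the viewpoint, and the assumption d > 0 are ours.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def project_on_xy(point, d):
    """Project a 3-D point on the xy plane from a viewpoint at z = d (d > 0)."""
    x, y, z = point
    m = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, -1.0 / d, 1.0]]
    hx, hy, hz, w = mat_vec(m, [x, y, z, 1.0])   # 1. matrix multiplication
    if w <= 0.0:                                  # w = 1 - z/d <= 0 means z >= d
        raise ValueError("point is at or behind the viewpoint; clip it first")
    # 2. normalization (one division per coordinate);
    # 3. orthographic projection: the z coordinate is simply ignored.
    return hx / w, hy / w


far_point = project_on_xy((1.0, 2.0, -4.0), 4.0)   # twice as far from the eye as the plane
on_plane = project_on_xy((1.0, 2.0, 0.0), 4.0)     # points of the xy plane do not move
```

A point twice as far from the viewpoint as the projection plane is shrunk by a factor of two, as expected.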

It is easy to prove that a perspective maps lines onto lines. It also maps line segments onto line segments, but there is a subtlety, illustrated in Figure 2.5.4.4. Observe that the projection of the line segment pq on the horizontal axis and with viewpoint (i.e., center of projection) v extends to infinity on the right and on the left. (It is a segment of a projective line that goes through a point at infinity.) This happens because q is behind the viewpoint, and one of the projecting rays is horizontal and meets the horizontal “screen” at infinity.

Figure 2.5.4.4 – Projection of a line segment (labels: points p, q and r, viewpoint v, projections p', q' and r')

A line drawing of a polyhedral object may be produced by projecting the endpoints of its edges and connecting the projected endpoints with line segments, provided that the object does not extend behind the viewpoint. To ensure that this condition is satisfied in an arbitrary scene, one must clip it, i.e., remove those portions that lie behind the viewpoint, before computing the projection of the scene.


2.6 Applications in Robotics and Simulation

A robotic manipulator is a kinematic chain, i.e., a collection of solid bodies, called links, connected at joints. The most common joints are the revolute joint, which corresponds to rotational motion between two links, and the prismatic joint, which corresponds to a translation. Most of the industrial robot “arms” in use today have only revolute joints. Figure 2.6.1 shows an idealized robot with two links and two revolute joints. The first, vertical link rotates by an angle θ about its axis, and the second link rotates by an angle φ in the vertical plane defined by the two links. The angles θ and φ are called joint angles in robotics.

Figure 2.6.1 – Stick-figure model for a 2-link robot (labels: frames A, B and C, joint angles θ and φ, axes x, y and z)

A robot interacts with the objects involved in a task primarily through its “hand”, or end effector. The pose of the hand with respect to the “lab” frame is crucial, and one must be able to relate hand coordinates to lab coordinates. This can be done by rigidly attaching coordinate frames to each link, and then performing successive changes of frames along the kinematic chain. Frame A in the figure is the base, or lab frame. Frame B is attached to the first link, and therefore rotates about the y (vertical) axis of A. Frame C is attached to the second link and rotates about the z axis of B. C is the hand frame of the robot. The relationship between hand and base coordinates may be expressed in terms of the matrices that describe the relative motions of the two links in each joint:

\[
X^{a} = M_{ab}^{a} X^{b} = M_{ab}^{a} M_{bc}^{b} X^{c}.
\]

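Frame B may be constructed by first translating A by the length L1 of the first link along y, and then rotating by θ about the y axis in basis A. Therefore

\[
M_{ab}^{a}(\theta) =
\begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & L_1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & L_1 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]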

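Frame C may be obtained by a translation along the x axis of B by the length L2 of the second link, followed by a rotation by φ about the z axis of B:

\[
M_{bc}^{b}(\varphi) =
\begin{bmatrix} \cos\varphi & -\sin\varphi & 0 & 0 \\ \sin\varphi & \cos\varphi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & L_2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} \cos\varphi & -\sin\varphi & 0 & L_2\cos\varphi \\ \sin\varphi & \cos\varphi & 0 & L_2\sin\varphi \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]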

Note that these matrices depend on the joint angles. Let us fix some point in the hand, with coordinates \(X^{c}\). The coordinates of this point in the lab frame are functions of the joint angles, and can be computed by multiplying the matrices above:

\[
X^{a} = M_{ab}^{a}(\theta)\, M_{bc}^{b}(\varphi)\, X^{c}.
\]

The robot’s motors control the joint angles directly, and the relationship between hand coordinates and joint angles is fundamental in robotics. It gives us the trajectory in lab coordinates for any temporal evolution of the joint angles. The velocity and the acceleration are obtained by differentiating the positional relations. In robot programming typically one seeks to drive the hand along some specific trajectory in lab coordinates. To find the corresponding joint angles involves inverting the relationships derived above. This problem is often called “inverse kinematics”.
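The forward relationship (from joint angles to lab coordinates) is easy to sketch in code. The Python below is an illustration under stated assumptions: `theta` names the angle of the first joint and `phi` that of the second, and the function names and sample link lengths are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Product of a 4x4 matrix and a homogeneous column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def m_ab(theta, l1):
    """Frame B relative to A: rotation about y composed with a translation L1 along y."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, l1],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def m_bc(phi, l2):
    """Frame C relative to B: rotation about z composed with a translation L2 along x."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0, l2 * c],
            [s, c, 0.0, l2 * s],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def hand_to_lab(x_c, theta, phi, l1, l2):
    """Lab coordinates of a point given in hand (frame C) coordinates."""
    return mat_vec(mat_mul(m_ab(theta, l1), m_bc(phi, l2)), x_c)


hand_origin = [0.0, 0.0, 0.0, 1.0]                 # origin of the hand frame
L1, L2 = 1.0, 2.0                                  # hypothetical link lengths
rest = hand_to_lab(hand_origin, 0.0, 0.0, L1, L2)
raised = hand_to_lab(hand_origin, 0.0, math.pi / 2, L1, L2)
swung = hand_to_lab(hand_origin, math.pi / 2, 0.0, L1, L2)
```

With both angles at zero the hand origin sits at (L2, L1, 0). Raising the second joint by a quarter turn stacks the two links along y, and turning the first joint by a quarter turn swings the hand to the negative z side. Calling `hand_to_lab` over a sequence of joint angles yields the trajectory in lab coordinates.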

Suppose now that we want to create a graphic simulation of the robot motion. The first link is a fixed line segment in frame B, with endpoints on the y axis of B at y=0 and y=–L1. The positions of these endpoints in frame A may be computed by the equation

\[
X^{a} = M_{ab}^{a}(\theta)\, X^{b},
\]

by using the appropriate coordinates \(X^{b}\) for the endpoints. Now we repeatedly increment the angle θ by some small amount, compute the endpoints of the first link, and display it at its updated position. The result is an animation of the motion of the first link. A similar procedure may be used for the second link, since we also have an equation for computing the lab coordinates of fixed points in C.

This simulation procedure is not restricted to robotic problems or to simple stick objects. Suppose we have, for example, a model of a human figure with various articulations and “links” of complex shapes. We associate a frame with each link and compute the matrices that relate the frames in terms of some appropriate parameters. Then, we step along the parameters and display the objects in their updated poses.

Realistic simulation of machinery can be done by the techniques outlined above. However, smooth and realistic animation of the motion of humans and animals raises difficult problems. For example, how should we change the joint angles at the ankles, knees and hips to achieve a realistic walk? These issues are beyond the scope of this course. They are addressed in advanced courses in computer graphics and animation.

2.7 Applications in Rendering

Processing a model to generate a display is called rendering in computer graphics. There are many sophisticated techniques to produce realistic displays. Underlying all of these is the need to map points of the model in “world coordinates” onto points of the screen, and this can be done by homogeneous-coordinate transformations. Computationally, what we need is a function WorldToScreen that takes an argument WorldPoint in 3-D and produces a 2-D ScreenPoint. Given this function, a simple line drawing of a polyhedron can be generated as follows.

    Clip model to remove points behind view point
    for each edge of the model do
        DrawLine(WorldToScreen(EndPoint1),
                 WorldToScreen(EndPoint2))
    end

In this pseudo-code, DrawLine is a primitive drawing routine that operates in screen coordinates. Drawing packages often provide a more convenient primitive routine that draws a polyline, i.e., a connected set of line segments defined by an array of vertices. DrawLine is simply a more restricted version of DrawPolyline, operating only on two vertices. Clipping may be used also to select a region of the object to be rendered.

How is the WorldToScreen function specified and implemented? Figure 2.7.1 illustrates the geometry involved.

Figure 2.7.1 – World to screen transformation (labels: world axes Xw, Yw, Zw; viewplane axes Xv, Yv; screen axes Xs, Ys; Viewplane, Screen, Viewport, Viewpoint)


The world to screen transformation is also called the viewing transformation. It is the composition of a projection onto the 2-D viewplane with a 2-D transformation between the viewplane and the viewport, which is the region of the screen where the image is to appear. There are several ways of specifying a set of viewing parameters to define the viewing transformation. The specification should be easy to understand by users, and therefore should refer to entities whose geometrical meaning is clear. Computation of the various matrices involved should be transparent to users.

Standard or quasi-standard graphic packages such as PHIGS or OpenGL use sets of viewing parameters that provide a user with great flexibility in view specification. Here we discuss a simple set of parameters which is convenient for debugging geometric algorithms. It makes several assumptions about the relations between the geometric entities involved in view specification, and trades flexibility for ease of use.

The viewing transformation is specified by the following parameters.

1. The viewpoint v.
2. A sphere of radius R centered at a reference point r.

The user should ensure that the sphere he or she specifies encloses the object to be displayed, and that the sphere does not enclose the viewpoint. We make the following assumptions.

1. The reference point is the origin of the (xv, yv) coordinate system in the viewplane.

2. The viewpoint and the reference point define the line of sight. The line of sight is perpendicular to the viewplane.

3. The orientation of the viewplane coordinate system is as defined in Figure 2.7.2. In the figure we assume that the entire configuration (viewpoint, viewplane, sphere, and reference point) has been translated so that the reference point is at the origin. The frame (xt, yt, zt) is constructed as follows. Its z axis coincides with the line of sight rv. The x axis is tangent to the parallel of the sphere at the point where the line of sight intersects the sphere. And the y axis is tangent to the meridian at the same point. The frame (xv, yv, zv) is (xt, yt, zt) translated to the origin, and therefore the two frames have the same orientation.

4. The viewport coincides with the entire window, whose width W and height H are known to the system through interaction with the window manager. (The window need not cover the whole screen.)

Figure 2.7.2 – Orientation of the viewplane frame (labels: axes Xt, Yt, Zt; world axes Xw, Yw, Zw; viewpoint v; reference point r)


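When the viewpoint is on the positive yw axis (top view) by convention we set

\[
\mathbf{x}_v = \mathbf{x}_w, \qquad
\mathbf{y}_v = -\mathbf{z}_w, \qquad
\mathbf{z}_v = \mathbf{y}_w .
\]

And for a bottom view, when the viewpoint is on the negative yw axis, we set

\[
\mathbf{x}_v = \mathbf{x}_w, \qquad
\mathbf{y}_v = \mathbf{z}_w, \qquad
\mathbf{z}_v = -\mathbf{y}_w .
\]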

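When the viewpoint is not on the y axis of the world coordinate system, the geometry in Figure 2.7.2 implies that the viewplane frame can be computed as follows:

\[
\mathbf{z}_v = \frac{\mathbf{v} - \mathbf{r}}{\lVert \mathbf{v} - \mathbf{r} \rVert}, \qquad
\mathbf{x}_v = \frac{\mathbf{y}_w \times \mathbf{z}_v}{\lVert \mathbf{y}_w \times \mathbf{z}_v \rVert}, \qquad
\mathbf{y}_v = \mathbf{z}_v \times \mathbf{x}_v .
\]

Here we denote by \(\mathbf{x}_v\) the unit vector along the xv axis, and similarly for the other vectors.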
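The three cases (top view, bottom view, general position) can be gathered in one small function. This is a sketch under assumptions: the names are ours, the special cases are detected with a small tolerance `eps`, and "on the y axis" is interpreted relative to the reference point r, i.e., after the configuration has been translated so that r is at the origin.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(sum(x * x for x in a))


def unit(a):
    n = norm(a)
    return tuple(x / n for x in a)


def viewplane_basis(v, r, eps=1e-12):
    """Return the unit vectors (x_v, y_v, z_v) of the viewplane frame."""
    z_v = unit(sub(v, r))
    x_raw = cross((0.0, 1.0, 0.0), z_v)        # y_w x z_v
    if norm(x_raw) < eps:                       # line of sight along the world y axis
        if z_v[1] > 0.0:                        # top view convention
            return (1.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.0, 1.0, 0.0)
        return (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, -1.0, 0.0)   # bottom view
    x_v = unit(x_raw)
    y_v = cross(z_v, x_v)
    return x_v, y_v, z_v


front = viewplane_basis((0.0, 0.0, 5.0), (0.0, 0.0, 0.0))
top = viewplane_basis((0.0, 3.0, 0.0), (0.0, 0.0, 0.0))
```

A viewpoint on the positive world z axis reproduces the world basis, which is a convenient sanity check.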

The viewing transformation may be computed by moving the object, viewpoint, sphere, and viewplane so that the reference point moves to the origin, the viewpoint moves to the positive z axis of the world frame, and the x and y axes of the viewplane coordinate system become coincident with those of the world frame. Because we moved the entire configuration, the projection of the object on the viewplane, in viewplane coordinates, is not affected by the motion. This projection is easy to compute in the new position, simply by using the familiar perspective transformation matrix that corresponds to a viewpoint on the z axis.

How do we find the correct motion? It is a translation that takes the reference point to the origin, followed by a rotation that maps the viewplane basis vectors onto the world basis. This rotation is easy to compute, because it is the inverse of a transformation \(T_{wv}\) that takes the world basis \(E_w\) onto the viewplane basis \(E_v\). The latter has a corresponding matrix whose columns are the components of the vectors of \(E_v\) in basis \(E_w\), as we saw in Section 2.2.1.

After projection on the viewplane, we need to scale the result so as to fit in the available screen viewport. Figure 2.7.3 illustrates the geometry involved.


Figure 2.7.3 – Projecting a sphere on the viewplane (labels: V, d, R, θ, axes Zv and Yv)

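Note that the projection of the sphere is a disk of radius V slightly larger than R. By using simple trigonometry we find the value for V:

\[
d = \lVert \mathbf{v} - \mathbf{r} \rVert, \qquad
\sin\theta = \frac{R}{d}, \qquad
V = \frac{R}{\cos\theta} = \frac{R}{\sqrt{1 - \sin^{2}\theta}} = \frac{R}{\sqrt{1 - (R/d)^{2}}} .
\]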

The sphere projection is tightly enclosed by a square of side 2V, centered at the origin of the viewplane. Now we have to map it into a viewport of width W and height H, as shown in Figure 2.7.4. Recall that we assume that the viewport occupies the entire window. First we scale uniformly to ensure that the transformed square fits into the available viewport area. The scale factor k must be such that 2kV is less than or equal to the minimum dimension of the viewport, which in general is not square, and therefore:

\[
k = \frac{\min(W, H)}{2V} .
\]

Next, we translate the scaled square so that its center coincides with the center of the viewport. (We could also choose to place it elsewhere in the viewport, but the center is perhaps best.) This requires a translation by W/2 in x and H/2 in y.
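The quantities d, V and k follow directly from the viewing parameters. The sketch below is illustrative only: the function name `viewport_scale` and the sample numbers are assumptions, and the check that the sphere does not enclose the viewpoint mirrors the requirement stated earlier.

```python
import math


def viewport_scale(v, r, radius, width, height):
    """Return (d, V, k) for viewpoint v, reference point r and sphere radius R."""
    d = math.dist(v, r)                      # distance from viewpoint to reference point
    if d <= radius:
        raise ValueError("the sphere must not enclose the viewpoint")
    big_v = radius / math.sqrt(1.0 - (radius / d) ** 2)   # radius of the projected disk
    k = min(width, height) / (2.0 * big_v)   # largest uniform scale that still fits
    return d, big_v, k


# Hypothetical values: R = 3, d = 5, and an 800 x 600 window.
d, V, k = viewport_scale((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), 3.0, 800, 600)
```

Here V = 3 / 0.8 = 3.75, slightly larger than R, and k = 600 / 7.5 = 80 screen units per viewplane unit.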


Figure 2.7.4 – Scaling and translating the sphere projection into the viewport (labels: viewplane axes Xv, Yv; screen axes Xs, Ys; enclosure of sphere projection of side 2V; scaled square of side 2kV; viewport center at W/2, H/2)

We can summarize the entire procedure as follows.

1. Translate the center of the sphere to the origin. This is a translation by the vector –r, with corresponding matrix

\[
M_1 = \begin{bmatrix} 1 & 0 & 0 & -r_x \\ 0 & 1 & 0 & -r_y \\ 0 & 0 & 1 & -r_z \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

2. Rotate the axes so that the viewplane frame coincides with the world frame. To do this, first compute the viewplane basis vectors (xv, yv, zv) as explained earlier in this section. Then, construct the matrix that corresponds to the world to viewplane basis transformation:

\[
\begin{bmatrix} \mathbf{x}_v & \mathbf{y}_v & \mathbf{z}_v & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

Finally, invert it, which can be done by transposition because the transformation is a rotation.

\[
M_2 = \begin{bmatrix} \mathbf{x}_v^{T} & 0 \\ \mathbf{y}_v^{T} & 0 \\ \mathbf{z}_v^{T} & 0 \\ 0 & 1 \end{bmatrix}.
\]

3. Apply a perspective with viewpoint on the z axis at a distance \(d = \lVert \mathbf{v} - \mathbf{r} \rVert\). The corresponding matrix is

\[
M_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -1/d & 1 \end{bmatrix}.
\]

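4. Project orthographically on the xy plane:

\[
M_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

5. Scale to fit in the viewport. First compute V and k as explained above, and then construct the matrix:

\[
M_5 = \begin{bmatrix} k & 0 & 0 & 0 \\ 0 & k & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

6. Translate to the center of the viewport:

\[
M_6 = \begin{bmatrix} 1 & 0 & 0 & W/2 \\ 0 & 1 & 0 & H/2 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

7. Compute the viewing matrix by composing all the previous transformations:

\[
M_{view} = M_6 M_5 M_4 M_3 M_2 M_1 .
\]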

To transform a point from world to screen we multiply its world homogeneous coordinates by the viewing matrix, and then normalize the homogeneous coordinates of the result by dividing by the fourth, or w, coordinate.

We explained how to compute a viewing matrix for a simple set of viewing parameters by moving the entire scene (including the viewplane) to a convenient position, projecting, scaling, and translating to the screen. The same approach can be used with more elaborate viewing parameters. The viewing matrix is transparent to the users of a graphics package, but must be computed internally by the package whenever a user changes the viewing parameters.
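Putting the seven steps together gives a compact version of WorldToScreen. The Python below is a sketch under stated assumptions, not the course's implementation: it handles only the general case in which the line of sight is not along the world y axis, it does no clipping, and the function names and sample parameters are ours.

```python
import math


def mat_mul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)


def viewing_matrix(v, r, radius, width, height):
    """Compose M6 M5 M4 M3 M2 M1 for viewpoint v, reference point r, radius R."""
    z_v = unit(tuple(a - b for a, b in zip(v, r)))
    x_v = unit(cross((0.0, 1.0, 0.0), z_v))   # assumes v - r is not along world y
    y_v = cross(z_v, x_v)
    d = math.dist(v, r)
    big_v = radius / math.sqrt(1.0 - (radius / d) ** 2)
    k = min(width, height) / (2.0 * big_v)
    m1 = [[1, 0, 0, -r[0]], [0, 1, 0, -r[1]], [0, 0, 1, -r[2]], [0, 0, 0, 1]]
    m2 = [list(x_v) + [0], list(y_v) + [0], list(z_v) + [0], [0, 0, 0, 1]]
    m3 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1.0 / d, 1]]
    m4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
    m5 = [[k, 0, 0, 0], [0, k, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    m6 = [[1, 0, 0, width / 2.0], [0, 1, 0, height / 2.0],
          [0, 0, 1, 0], [0, 0, 0, 1]]
    m = m1
    for step in (m2, m3, m4, m5, m6):          # each new step multiplies on the left
        m = mat_mul(step, m)
    return m


def world_to_screen(m_view, world_point):
    """Multiply by the viewing matrix, then normalize by the w coordinate."""
    h = list(world_point) + [1.0]
    x, y, _, w = (sum(m_view[i][j] * h[j] for j in range(4)) for i in range(4))
    return x / w, y / w


# Hypothetical parameters: viewpoint on the world z axis, R = 3, 800 x 600 window.
m_view = viewing_matrix((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), 3.0, 800, 600)
center = world_to_screen(m_view, (0.0, 0.0, 0.0))
rim = world_to_screen(m_view, (3.75, 0.0, 0.0))
```

With these values the reference point maps to the center of the window, (400, 300), and a viewplane point at distance V = 3.75 from it maps to the edge of the scaled square, (700, 300). Because the viewing matrix depends only on the viewing parameters, it can be computed once and reused for every endpoint of the model.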


2.8 Mathematical Underpinnings

There are two major approaches for developing Euclidean geometry rigorously. They are the synthetic approach, based on a modern version of Euclid’s postulates, which were first formulated around 300 B.C., and the analytic approach, which can be traced back to Descartes in the 1600s and uses algebraic methods. Synthetic geometry is a more sophisticated version of high school geometry. The analytic approach is followed in this text because it is the most useful computationally. Most of the material summarized below may be found in Halmos’ classic text [Halmos 1958]. Many other good books cover linear algebra and geometry, for example [Bloom 1979], [Nomizu 1979], and [Porteous 1981].

A vector space defined over the real numbers is a set V together with two operations, called vector addition, denoted x + y, and scalar multiplication, denoted ax (or a.x), that have the following properties, for all vectors x, y, z of V, and for all reals a and b. The first four properties pertain to the vector sum, and the last four to the scalar multiplication.

1. Commutativity: x + y = y + x
2. Associativity: x + (y + z) = (x + y) + z
3. Existence of identity: There exists a unique vector 0, called the zero or null vector, such that x + 0 = x
4. Existence of inverse: For each x there is a unique vector –x such that x + (–x) = 0
5. Associativity: a(bx) = (ab)x
6. Existence of identity: 1x = x
7. Vector distributivity: a(x + y) = ax + ay
8. Scalar distributivity: (a + b)x = ax + bx

Vector spaces can be defined more generally over any set of scalars that constitute an algebraic field. (For basic notions of algebra see any text on discrete structures, e.g. [Preparata & Yeh 1973].) The same axioms apply, but real numbers are replaced by field elements.

An affine space is a set A of elements called points, together with a vector space V and a mapping, called point difference, that takes two points p, q of A and produces a vector x of V, and has the following properties. (Point difference is denoted simply by a minus sign.)

1. For all p, q, r of A, p – r = (q – r) + (p – q)
2. For every point q of A and for every vector x of V there is one and only one point p such that x = p – q

Property 2 implies that one can add a vector to a point and obtain another point. It also establishes a one-to-one correspondence between A and V for a fixed “origin” q. This correspondence can be used to extend operations defined in V to corresponding operations in A. But care must be taken to ensure that the results are not dependent on the selected origin. For example, one can show that p + q depends on the choice of origin, and therefore is an illicit operation, whereas the average (p + q)/2 is independent of origin. For other examples see [Goldman 1985].
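The dependence on the origin is easy to observe numerically. In the sketch below, points are tuples of coordinates and each operation is carried out on the vectors p – o and q – o before adding the result back to the chosen origin o. The function names and sample points are assumptions made for illustration.

```python
def add_points(p, q, origin):
    """'Sum' of two points, computed through vectors from a chosen origin."""
    return tuple(o + (a - o) + (b - o) for a, b, o in zip(p, q, origin))


def average_points(p, q, origin):
    """Average of two points, computed through vectors from a chosen origin."""
    return tuple(o + ((a - o) + (b - o)) / 2.0 for a, b, o in zip(p, q, origin))


p, q = (1.0, 2.0), (3.0, 6.0)
o1, o2 = (0.0, 0.0), (10.0, -4.0)       # two different choices of origin

sum_1, sum_2 = add_points(p, q, o1), add_points(p, q, o2)
avg_1, avg_2 = average_points(p, q, o1), average_points(p, q, o2)
```

The two sums differ, whereas the two averages coincide at the midpoint of p and q.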

A transformation T in a vector space V is affine if there exists a linear transformation T* on V such that, for all x, y of V

\[
T(x) - T(y) = T^{*}(x - y) .
\]


One can show that compositions of affine transformations also are affine, and that all affine transformations are compositions of translations and linear transformations.

Linear and affine transformations can be defined in an affine space A by using the one-to-one correspondence between A and the underlying vector space V. If a vector x corresponds to a point p (when q is selected as the origin for A), then we define T(p) as the point that corresponds to the vector T(x). For this definition to be meaningful for some specific T, one must show that it does not depend on the origin q.

The theory of determinants can be constructed rigorously by using vector space concepts. It can be shown that all the matrices that correspond to a linear transformation T (in different bases) have the same determinant. Transformations that have positive determinants are called direct, whereas transformations with negative determinants are called opposite. If a direct transformation maps a basis onto another, the two bases are said to be equally oriented, or to have the same orientation. If the transformation between two bases is opposite, the bases have different orientation. The bases of V can be divided into two equivalence classes, such that any two bases in one class have the same orientation, and any two bases from different classes have different orientations [Artzy 1978, Bloom 1979]. V is oriented if one of the two equivalence classes has been selected as positive. This selection of positive orientation is arbitrary, but it is customary to assign a positive orientation to “right-handed” bases.

The inner, or dot product, denoted x · y, is an operation that takes two vectors x, y of a vector space V and produces a scalar, and that satisfies the following properties. For any x, y, z in V and any scalar a:

1. Commutativity: x · y = y · x
2. Distributivity: x · (y + z) = x · y + x · z
3. Associativity: x · (ay) = a(x · y)
4. x · x ≥ 0
5. x · x = 0 if and only if x = 0

A transformation T in a vector space V is an isometry if it preserves the norms of vector differences, i.e., if for all x, y in V

\[
\lVert T(x) - T(y) \rVert = \lVert x - y \rVert .
\]

A general isometry is always the composition of a translation with another isometry that does not affect the origin. Isometries that leave the zero vector invariant are called orthogonal transformations. It can be shown that all orthogonal transformations are linear [Bloom 1979], and therefore a general isometry is an affine transformation. It can also be shown [Halmos 1958] that a linear transformation is orthogonal if and only if the following conditions are satisfied:

\[
\lVert T(x) \rVert = \lVert x \rVert
\]

\[
T(x) \cdot T(y) = x \cdot y
\]

\[
\lVert T(x) - T(y) \rVert = \lVert x - y \rVert
\]

These conditions are equivalent: any one of them implies the other two. Thus, orthogonal transformations are those linear transformations that preserve norms and inner products of vectors. They map orthonormal bases into orthonormal bases. A matrix that corresponds to an orthogonal transformation in an orthonormal basis is also called orthogonal, and has a determinant that equals either +1 or –1.
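The three conditions can be checked numerically for a plane rotation, which is one example of an orthogonal transformation. The sketch below is only a spot check on sample vectors, not a proof; the function names, the angle 0.7 and the two vectors are arbitrary choices.

```python
import math


def rotate(theta, v):
    """Apply the 2x2 rotation matrix of angle theta to the vector v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def norm(a):
    return math.sqrt(dot(a, a))


def diff(a, b):
    return (a[0] - b[0], a[1] - b[1])


theta = 0.7
x, y = (3.0, 4.0), (-1.0, 2.0)
tx, ty = rotate(theta, x), rotate(theta, y)

preserves_norm = math.isclose(norm(tx), norm(x))
preserves_dot = math.isclose(dot(tx, ty), dot(x, y))
preserves_distance = math.isclose(norm(diff(tx, ty)), norm(diff(x, y)))

# Determinant of the rotation matrix [[c, -s], [s, c]].
determinant = math.cos(theta) ** 2 + math.sin(theta) ** 2
```

All three flags come out true, and the determinant equals +1 up to rounding, consistent with the statement about orthogonal matrices.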

A rotation is defined rigorously as an orthogonal transformation with positive determinant. Rotations can be used to define precisely the notion of signed angle. We give here a brief outline of how this can be done, and refer the reader to texts such as [Artzy 1978], [Bloom 1979] or [Dieudonné 1969] for details. Consider the Euclidean plane and choose a positive orientation for it. Given two unit vectors x and y of the plane there is a unique rotation R that maps x to y. We associate with R an entity θ called the angle between x and y (in that order). We set θ = 0 when R = I, the identity transformation, and define the angle associated with the composition R2 • R1 to be the sum of the corresponding angles, θ1 + θ2. This suffices to derive most of the standard trigonometric notions and expressions. However, to associate a “measure”, i.e., a real number, with an angle one must venture outside of geometry. Thus, consider a circle of unit radius and a rotation that corresponds to a given angle and maps a unit vector onto another. The measure of the angle in radians is the real number obtained by computing the length of the arc of circle defined by the two vectors. Clearly this construction involves notions from integral calculus.

GMCh3 2/1/00 3-1



3. Representations

Representations were defined in the Introduction as symbol structures that correspond to (mathematical models of) physical entities. Our focus in this course is on mathematical models and computer representations that capture the shape of physical objects. This chapter discusses fundamental properties of representation schemes, and outlines the known approaches for representing geometric entities.

3.1 Representation Schemes

This section introduces basic notions and properties of representations through some simple examples. First we need a few definitions. A polygon is a 2-D region of the plane bounded by line segments called edges. Adjacent edges cannot be collinear. A vertex of a polygon is a point of the polygon’s boundary where two adjacent edges meet. Two edges may intersect only at a common vertex. That is, self-intersecting figures are not considered polygons in this course. A simple polygon is a polygon without holes. Figure 3.1.1 shows examples.

Figure 3.1.1 – A simple polygon with six vertices and six edges (left), a polygon that is not simple because it has a hole (center), and a figure that is not a polygon because it self-intersects (right).

A set X is convex if the line segment pq lies entirely within X for every pair of points p, q of X. Figure 3.1.2 shows a set that is not convex because it does not entirely contain a line segment that connects two of the points of the set.


Figure 3.1.2 – A polygon that is not convex (labels: points p and q)

Given any set X, one can always construct infinitely many other sets that enclose X and are convex. The smallest such set is called the convex hull of X. Figure 3.1.3 shows a set of discrete points in the plane and its convex hull. Note that the convex hull of a set of discrete points is a convex polygon whose vertices are a subset of the given points.

Figure 3.1.3 – A set of discrete points and its convex hull

Suppose now that we want to represent simple polygons in a computer. The set of objects to be represented (in our example, the set of all the simple polygons) is called the domain of the representation scheme. We propose a scheme, which we call Scheme 1, defined as follows.

1. For each polygon, construct the set of its vertices, in arbitrary order.
2. For each vertex, construct a pair of real numbers with the coordinates of the vertex in some agreed frame.
3. Make a list (i.e., a sequence of elements) containing all these pairs of reals.

Therefore the symbol structure used to represent a polygon in Scheme 1 is simply a list of pairs of real numbers:

\[
\big( (x_1, y_1)\ (x_2, y_2)\ \ldots\ (x_n, y_n) \big) .
\]

This defines the format or syntax of the representation. Any representation obeying this syntax is called syntactically correct.

The meaning or semantics of a representation must also be defined, by means of a mathematical rule that associates the symbol structures to geometric entities. Each pair of reals corresponds to a point. Since points are the lowest-level geometric entities in this representation, we refer to them as primitives. A specific point, also called a primitive instance, is represented by a pair of parameters, which are its coordinates.

The list of pairs is a structured representation, constructed with representations of primitive instances. Now we need to provide a rule for associating the list to a polygon. But this cannot be done unambiguously, as shown in Figure 3.1.4.

Figure 3.1.4 – Two distinct simple polygons with the same vertices (labels: vertices p1 through p5 in each polygon).

The figure shows that a representation in this scheme may correspond to several distinct objects. We say that the scheme is ambiguous or incomplete. Representational ambiguity has unpleasant consequences. Suppose that we wanted to write an algorithm for computing the area of a polygon represented by the vertices shown in Figure 3.1.4. Should the algorithm return the area of the polygon on the left of the figure, or the area of the polygon on the right? We cannot write an algorithm to reliably compute the area, because we do not know which polygon is being represented. An incomplete representation does not contain enough information to designate a unique geometric object, and cannot support the automatic computation of all the geometric properties of the object.

Now let us define Scheme 2 by modifying slightly Scheme 1. We keep the same rule for constructing vertex-list representations, but restrict the domain to convex polygons. Because the convex hull of a set of points is uniquely defined, a vertex list corresponds to a single convex polygon. Therefore Scheme 2 is unambiguous.

Suppose that we construct a syntactically correct representation in Scheme 2, i.e. a list of real-number pairs, that corresponds to the points shown in Figure 3.1.3. This list violates the representation construction rules given above, because some of the points are not vertices of the convex-hull polygon, and only vertices of a polygon should appear in its representation. The representation is semantically incorrect, or invalid. The validity of representations in Scheme 2 is not easy to establish. In essence, it requires that we compute the convex hull of the set of points, and then verify that all the given points are vertices of the hull.
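The validity test just described can be sketched as follows. The text does not prescribe a hull algorithm; the code below swaps in Andrew's monotone chain method as one common choice. The function names and sample point lists are assumptions, and points lying in the interior of a hull edge are treated as non-vertices, in line with the rule that adjacent edges cannot be collinear.

```python
def turn(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counterclockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half_hull(sequence):
        chain = []
        for p in sequence:
            # Drop the last vertex while it makes a right turn or is collinear.
            while len(chain) >= 2 and turn(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]


def is_valid_scheme2(vertex_list):
    """True if every listed point is a vertex of the hull of the list."""
    distinct = set(vertex_list)
    return (len(distinct) == len(vertex_list)
            and len(distinct) >= 3
            and set(convex_hull(vertex_list)) == distinct)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
with_interior_point = square + [(1, 1)]
valid = is_valid_scheme2(square)
invalid = is_valid_scheme2(with_interior_point)
```

The square is accepted, whereas the list with an extra interior point is rejected, like the point set of Figure 3.1.3.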

We consider again the entire domain of simple polygons, but change the rules for constructing representations, and define Scheme 3 as follows. We now require that the vertices be listed in consecutive order, as one follows the boundary of the polygon. The two polygons of Figure 3.1.4 have distinct representations in Scheme 3. The left polygon is represented by the list

\[
( p_1\ p_2\ p_3\ p_4\ p_5 ) ,
\]

whereas the representation for the right polygon is

\[
( p_1\ p_2\ p_4\ p_3\ p_5 ) .
\]

A list of consecutive vertices is equivalent to an edge list. For example, the vertex list for the left polygon of Figure 3.1.4 is equivalent to

\[
( p_1p_2\ \ p_2p_3\ \ p_3p_4\ \ p_4p_5\ \ p_5p_1 ) .
\]

A planar polygon is completely determined by its edges, and therefore Scheme 3 is unambiguous.

These examples show that minor modifications in the domain or in the representation construction rules may change drastically the properties of a representation scheme.
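Once the vertices are listed in boundary order, the area question raised earlier has a definite answer. The sketch below uses the standard shoelace formula. The coordinates assigned to p1 through p5 are hypothetical, chosen so that both orderings describe simple polygons in the spirit of Figure 3.1.4.

```python
def polygon_area(vertices):
    """Area of a simple polygon whose vertices are given in boundary order."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]     # wrap around to close the boundary
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


# Hypothetical coordinates for the five vertices.
p1, p2, p3, p4, p5 = (0, 0), (4, 0), (4, 3), (2, 1), (0, 3)

area_left = polygon_area([p1, p2, p3, p4, p5])    # ordering (p1 p2 p3 p4 p5)
area_right = polygon_area([p1, p2, p4, p3, p5])   # ordering (p1 p2 p4 p3 p5)
```

The same five points yield areas of 8 and 9 for the two orderings, which is why an unordered vertex set (Scheme 1) cannot support an area algorithm while an ordered list (Scheme 3) can.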

We have seen several schemes for representing polygons in terms of their vertices. Are there other significantly different schemes? The answer is yes. Scheme 4 provides an example. Its domain is the set of convex polygons. But now we represent a polygon by a list of planar half spaces. A half space is the region of a plane that lies on one side of an infinite straight line. Figure 3.1.5 shows a half space primitive and its representation by a three-tuple consisting of a distance d, an angle, and a sign s. Here d is the (perpendicular) distance between the origin and the line, or, equivalently, the length of the normal vector v shown in the figure. The second parameter is the angle the normal vector makes with the x axis. And s is a sign that indicates the side of the line on which the half space lies. A positive s implies that the “material” side is in the direction of v; a negative sign denotes material on the side opposite to v.

Figure 3.1.5 – Representation for a primitive half space (labels: Half Space, normal vector v, s = +).

The polygon that corresponds to a list of half spaces is simply the (set-theoretic) intersection of the half spaces. Figure 3.1.6 provides an example. The result is always a convex polygon because each half space is a convex set, and it can be shown that the intersection of convex sets also is convex. Scheme 4 is unambiguous, because the intersection defines a unique set.
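One attraction of Scheme 4 is that testing whether a point belongs to the polygon reduces to testing it against each half space in turn. The sketch below assumes a particular encoding of the three-tuple: the distance d, the angle in radians, and the sign s stored as +1 or -1, with the bounding line counted as part of the half space. These conventions, the function names, and the sample square are ours.

```python
import math


def in_half_space(point, d, angle, s):
    """Test a point against the half space (d, angle, s).

    The bounding line is n . p = d with unit normal n = (cos angle, sin angle).
    s = +1 puts the material on the side n points to; s = -1 on the other side.
    """
    n_dot_p = point[0] * math.cos(angle) + point[1] * math.sin(angle)
    return s * (n_dot_p - d) >= 0.0


def in_polygon(point, half_spaces):
    """A point is in the polygon if it lies in every half space of the list."""
    return all(in_half_space(point, d, angle, s) for d, angle, s in half_spaces)


# Hypothetical square 1 <= x <= 3, 1 <= y <= 3 as four half spaces.
square = [
    (1.0, 0.0, +1),             # x >= 1
    (3.0, 0.0, -1),             # x <= 3
    (1.0, math.pi / 2, +1),     # y >= 1
    (3.0, math.pi / 2, -1),     # y <= 3
]

inside = in_polygon((2.0, 2.0), square)
outside = in_polygon((0.0, 2.0), square)
```

No vertices are computed at any point: membership follows from the intersection semantics alone.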


Figure 3.1.6 – A convex polygon defined as the intersection of five half spaces.

The examples presented above show that geometric objects may be represented in many ways, and that the following are important properties of representations and schemes.

• Domain – Which objects can be represented in the scheme? The domain is the geometric coverage of a scheme.

• Validity – Does a representation correspond to (at least) one object in the domain? Computation on invalid data is meaningless, and often causes system crashes.

• Non-ambiguity – Does a valid representation correspond to only one object in the domain? Non-ambiguity, or completeness, is crucial for the automatic computation of properties of the represented objects.

The following additional properties also help to characterize representation schemes.

• Uniqueness – Does an object in the domain have a unique representation in the scheme? Uniqueness greatly facilitates testing objects for equality.

• Conciseness – How large are the representations? Representations in verbose schemes consume large amounts of memory, and are difficult to transmit rapidly in distributed environments.

• Ease of construction – How hard is it to construct a valid representation? Representations in verbose schemes with complex validity conditions are difficult to construct, especially by humans.

• Suitability for applications – Are there good application algorithms that operate on the representations of the scheme? Experience shows that representation schemes are not uniformly suitable for all applications. Many modeling systems use multiple representation schemes, and convert between them as needed, depending on the specific computations they support.

Representations often contain redundant data. For example, we could represent simple polygons in a Scheme 5 by listing the polygon’s edges, with each edge represented by two vertices. This representation contains two copies of each vertex, and therefore is redundant. Redundancies are often introduced in representations to facilitate certain computations. They can be considered as a manifestation of the trade-off between storing and recomputing, which is ubiquitous in Computer Science. If we find that certain data are often needed by our algorithms, we may decide to attach these data to the representations rather than to recompute them as needed. Redundancy implies relationships or constraints between the data stored, and complicates the analysis of the validity of a representation, because one has to ensure that the implied constraints are satisfied.

Storing several representations for each object, in different schemes, is an extreme form of redundancy. This is sometimes done in geometric modeling systems that support a variety of applications, to ensure that each application has access to the most suitable representation. Two unambiguous representations, in different schemes, are consistent if they correspond to the same object. In a multiple representation system it is crucial that representational consistency be maintained. Often this is achieved by invoking representation conversion algorithms. A conversion algorithm is called consistent if it guarantees that its output representation is consistent with its input representation. Consistency maintenance in systems that offer powerful object editing facilities is a major problem.

3.2 Methods for Representing Geometric Entities

Here we discuss known approaches for constructing representation schemes for geometric objects. It is useful to classify them into several categories that correspond to the subsections below, although there is some overlap between categories.

3.2.1 Primitive Instancing

Every scheme has primitives, which must be instantiated to construct structured representations. Primitives may be low level, e.g. the points used to represent polygons in the previous section. But they also may be high level. For example, many modeling systems have solid primitives such as blocks and cylinders.

A primitive is a parameterized geometric entity, typically represented by a tuple containing a type code (e.g. a string ‘point’ or ‘block’) followed by several real numbers that correspond to specific parameter values. For example, a solid block aligned with the principal axes can be represented by the 4-tuple

(‘block’, XSize, YSize, ZSize),

where the last three parameters are real numbers that define the dimensions of the block, as shown in Figure 3.2.1.1.

Figure 3.2.1.1 – A primitive block in standard pose.

To represent a block in a general pose one can add a fourth parameter, which is a transformation that takes the block from its standard pose, shown in Figure 3.2.1.1, to its actual pose.

Pure primitive instancing schemes, which have no structured representations, are not very attractive. They are akin to a language that has words but no sentences. They tend to have a small domain. In addition, each primitive type requires special-case algorithms for evaluating its properties. Primitive instancing is important primarily in the context of other schemes that not only instantiate primitives but also combine them into higher-level structures.
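The need for special-case algorithms can be seen in a small sketch of a property evaluator over primitive tuples. The 'block' tuple follows the text. The 'cylinder' tuple with parameters (radius, height) is a hypothetical second primitive added only for illustration, and the function name is an assumption.

```python
import math

def volume(primitive):
    """Volume of a primitive instance; each type code needs its own formula."""
    type_code, *params = primitive
    if type_code == 'block':
        x_size, y_size, z_size = params
        return x_size * y_size * z_size
    if type_code == 'cylinder':   # hypothetical parameters: radius, height
        radius, height = params
        return math.pi * radius ** 2 * height
    raise ValueError(f"unknown primitive type: {type_code!r}")
```

Every new primitive type adds another branch to every such evaluator, which is the drawback noted above.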

3.2.2 Spatial Decomposition

Here the idea is to partition space into regions called cells, and to enumerate those cells which are filled with material and therefore constitute the object being represented. Decompositions into regular, fixed-size cells are called spatial enumerations. In 2-D, the primitive square cells are sometimes called pixels (an abbreviation for “picture elements”), and in 3-D, cubical cells are called voxels (an abbreviation for “volume elements”). Regular decompositions usually represent only “staircase” approximations of the desired objects. For reasonable accuracy, the decompositions become very large.

Hierarchical decompositions with cells of varying sizes are smaller than their regular counterparts, because small cells are used only where required for an accurate approximation. 2-D quadtrees are used extensively in image processing. Figure 3.2.2.1 shows a simple 2-D polygon with sides parallel to the principal axes, and its quadtree.

Figure 3.2.2.1 – A 2-D object and its quadtree.

In the figure, quadrants are numbered 0, 1, 2, 3 in clockwise order, as shown at the top. The quadtree has three types of nodes. Grey nodes correspond to cells that are neither completely full nor completely empty. Such cells must be subdivided. Black nodes are full and white nodes are empty. The root node of the quadtree corresponds to the entire space, i.e., to a square box that encloses the object. The quadtree may be constructed by the following (conceptual) procedure. If a cell is full or empty, mark it black or white, respectively; otherwise mark it grey, subdivide it, and recurse.
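The conceptual procedure can be sketched for an object given as a spatial enumeration. This is an illustration under stated assumptions rather than a production algorithm: the input is a square grid of 0/1 pixels whose side is a power of two, nodes are encoded as the strings 'black' and 'white' or as a ('grey', children) pair, and the clockwise numbering is assumed to start at the top-left quadrant.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return 'black', 'white', or ('grey', [four children]) for a square cell.

    grid is a square list of lists of 0/1 values whose side is a power of two.
    """
    if size is None:
        size = len(grid)
    cells = [grid[r][c]
             for r in range(row, row + size)
             for c in range(col, col + size)]
    if all(cells):
        return 'black'          # completely full
    if not any(cells):
        return 'white'          # completely empty
    half = size // 2
    # Mixed cell: subdivide and recurse. Children are listed clockwise,
    # starting at the top-left quadrant (an assumption about the figure).
    children = [
        build_quadtree(grid, row, col, half),                # 0: top-left
        build_quadtree(grid, row, col + half, half),         # 1: top-right
        build_quadtree(grid, row + half, col + half, half),  # 2: bottom-right
        build_quadtree(grid, row + half, col, half),         # 3: bottom-left
    ]
    return ('grey', children)
```

A single pixel is always either full or empty, so the recursion terminates. Rescanning every pixel of a cell at each level is wasteful; the sketch favors clarity over speed.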

Quadtrees are the 2-D analogs of binary search trees, and can be used for spatial search. There is a large body of literature on algorithms for processing quadtrees [Samet 199?]. The 3-D analogs of quadtrees are called octrees. Both quadtrees and octrees are special cases of k-d trees, which are studied in the theoretical computational geometry literature. These are trees with k branches per grey node, and with cells of dimension d.

Issues of completeness and validity are trivial for most spatial decompositions. For example, a spatial enumeration is always valid and unambiguous.

Decompositions into cells with curved boundaries that follow closely the object’s shape also are useful, for example for finite-element analysis. (Finite element analysis, or FEA, is a numerical technique for solving partial differential equations over complicated geometric domains by subdividing the domain into cells, and approximating the solution within each cell by a polynomial function.) A set of cells that decomposes an object is called a mesh in the FEA field.

3.2.3 Constructive Methods

A constructive representation defines an object by a sequence of operations for constructing the object. The most common constructive representations are called CSG (for Constructive Solid Geometry), and use Boolean (set-theoretic) operations. The operation sequence is typically stored as a tree. Figure 3.2.3.1 shows a simple 2-D object and its CSG tree, built upon 2-D solid primitive rectangles and disks. The object in the figure is constructed as a union of two rectangles, from which a disk is subtracted.

Figure 3.2.3.1 – CSG in 2-D.
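A CSG tree can be queried recursively. The sketch below classifies a point against a 2-D tree of rectangles and disks. The tuple encoding, the function names, and the example dimensions are all assumptions made for illustration. It applies the ordinary set operations point by point, so its answers may differ from those of the modified Boolean operations used in CSG for points that lie exactly on a boundary.

```python
def rect(x0, y0, x1, y1):
    """Axis-aligned solid rectangle with corners (x0, y0) and (x1, y1)."""
    return ('rect', x0, y0, x1, y1)

def disk(cx, cy, r):
    """Solid disk of radius r centered at (cx, cy)."""
    return ('disk', cx, cy, r)

def contains(node, p):
    """Point membership for a CSG tree, by recursion on the tree structure."""
    op = node[0]
    x, y = p
    if op == 'rect':
        _, x0, y0, x1, y1 = node
        return x0 <= x <= x1 and y0 <= y <= y1
    if op == 'disk':
        _, cx, cy, r = node
        return (x - cx) ** 2 + (y - cy) ** 2 <= r * r
    left, right = node[1], node[2]
    if op == 'union':
        return contains(left, p) or contains(right, p)
    if op == 'intersection':
        return contains(left, p) and contains(right, p)
    if op == 'difference':
        return contains(left, p) and not contains(right, p)
    raise ValueError(f"unknown node type: {op!r}")

# A union of two rectangles from which a disk is subtracted.
l_shape = ('union', rect(0, 0, 3, 1), rect(0, 0, 1, 3))
part = ('difference', l_shape, disk(0.5, 0.5, 0.25))
```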

The Boolean operations used in CSG are slightly modified set-theoretic union, intersection and difference. To see why the modifications are needed, consider the example of Figure 3.2.3.2. The L-shaped object A is intersected with the rectangle B. Although both A and B are 2-D solids, their set-theoretic intersection shown on the right is not. It has a dangling edge.

Figure 3.2.3.2 – The intersection of two solids need not be a solid.

Solidity is preserved if we define a regularized intersection operation as follows. First we perform the conventional intersection. Then we take its interior and find the closure of the interior. Interior and closure are defined rigorously in a branch of mathematics called topology. Here an intuitive understanding suffices. The sequence of operations which consists of interior followed by closure is called regularization. Figure 3.2.3.3 illustrates the regularization procedure for the intersection of Figure 3.2.3.2. The dangling edge has no 2-D interior and therefore disappears, as shown at the center. The interior here is simply a square that does not include its borders. Closure adds the boundary to a set, and therefore produces the closed square shown on the right, which is a solid.

Figure 3.2.3.3 – Regularized intersection.

Regularized union, difference and complement are defined similarly. A set that equals the closure of its interior is called regular. Regular sets are used to model solids because they have no dangling edges or faces, or other non-solid characteristics. We see that regularized operations applied to regular sets produce other regular sets. In other words, regular sets are algebraically closed under regularized Boolean operations. This is important, because it implies that CSG representations for regular sets are always valid, since a sequence of regularized Boolean operations on regular primitives produces another regular set. Algebraic closure under the construction operations ensures that the result of an operation in a modeling system is valid, and can serve as input for further operations. Thus complex objects may be defined by successive operations on simpler ones.
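Regularization is easiest to see in one dimension, where the analog of a dangling edge is an isolated point. The following sketch is a 1-D illustration, not the 2-D case of the figures. It assumes that a bounded regular set on the real line is stored as a list of pairwise disjoint closed intervals (lo, hi) with lo < hi; the function name is hypothetical.

```python
def regularized_intersection(a, b):
    """Regularized intersection of two 1-D regular sets.

    Each set is a list of pairwise disjoint closed intervals (lo, hi), lo < hi.
    """
    result = []
    for a_lo, a_hi in a:
        for b_lo, b_hi in b:
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            # lo == hi is an isolated point of the ordinary intersection: it
            # has an empty interior, so regularization removes it.
            # lo > hi means the two intervals do not meet at all.
            if lo < hi:
                result.append((lo, hi))
    return sorted(result)
```

For instance, the intervals [0, 1] and [1, 2] share only the point 1, so their ordinary intersection is the set {1} while their regularized intersection is empty. The output is again a list of proper closed intervals, which illustrates the algebraic closure property discussed above.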

Figure 3.2.3.4 – Non-uniqueness of CSG

CSG representations are non-unique. For example, the L-shaped object on the right in Figure 3.2.3.4 may be defined as a union or as a difference of two primitive rectangles, as shown on the left. Deciding whether two representations correspond to the same object (sometimes called the same-object detection problem) is not easy. In particular, the null-object detection problem, which consists of checking if a representation corresponds to the null object, is nontrivial. Note that to detect a collision or interference between two objects A and B it suffices to check if the intersection of A and B is null. In CSG it is very easy to represent the intersection, but hard to determine if it is empty.

3.2.4 Sweeping

Sweeping could be considered a constructive method, but we will treat it separately because it is sufficiently distinctive. The Minkowski sum of two sets A and B, denoted A + B, is the region swept by the set B when its reference point takes all possible positions within A. The orientation of B is fixed during the sweep. Figure 3.2.4.1 illustrates the definition. Here A is a rectangle, and B is a disk with its reference point at the center. The Minkowski sum shown on the right is a larger rectangle with rounded corners. The additional region generated by the sweep is highlighted by darker shading. The center of the figure shows the disk B at several of the positions it takes during the sweep.

Figure 3.2.4.1 – Minkowski sum of a rectangle with a disk.

A Minkowski sum with a disk in 2-D or a ball in 3-D is also called a solid offsetting operation. Offsetting is useful for rounding and filleting objects, and for various other applications [Rossignac & Requicha 1984].

The Minkowski difference A – B is defined as

A − B = c(cA + B) .

(Some texts use a different definition, which amounts to adding to A the reflection of B about the origin.) The Minkowski sum is a growing operation, often called dilation, whereas the difference is a shrinking operation (because we grow the complement of A), often called erosion. These two operations are studied at length in a field called “mathematical morphology” [Matheron 1975, Serra 1982], and applied extensively in image processing [Haralick 19??].
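Both operations are easy to try on finite sets of integer grid points (pixels), as in image processing. The sketch below makes that discrete assumption, and its function names are hypothetical. For the difference it uses a consequence of the definition A − B = c(cA + B): a point x belongs to the difference exactly when x − b is in A for every b in B, which is the same as x lying in every translate A + b.

```python
def minkowski_sum(a, b):
    """Dilation: all sums of a point of A with a point of B."""
    return {(ax + bx, ay + by) for ax, ay in a for bx, by in b}

def minkowski_difference(a, b):
    """Erosion A - B = c(cA + B), computed as the intersection of the
    translates A + b over all b in B."""
    b = list(b)
    if not b:
        # cA + (empty set) is empty, so the result would be the whole plane.
        raise ValueError("B must be non-empty")
    result = None
    for bx, by in b:
        translate = {(ax + bx, ay + by) for ax, ay in a}
        result = translate if result is None else result & translate
    return result
```

With A a 3 × 3 block of pixels and B a plus-shaped set centered at its reference point, the sum grows A by one pixel on each side, and the difference shrinks it to its center pixel.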

When A is a 2-D planar set and B is a line segment normal to the plane of A, their Minkowski sum is called a translational sweep or extrusion. (Oblique sweeps can also be defined.) Figure 3.2.4.2 shows an example. Extrusions are used extensively in mechanical CAD systems.

Figure 3.2.4.2 – An extrusion operation.

A Minkowski sum is not the most general form of sweep, because the moving set must maintain a fixed orientation, and cannot change its size or shape. An interesting generalization involves a planar set, or cross-section, that translates along a possibly curved trajectory, or spine, while undergoing changes. The resulting sets are (usually) 3-D solids called generalized cylinders or generalized cones, and are popular in computer vision [Ballard & Brown 19??].

A planar set may also be described by sweeping a disk of continuously-varying radius along a planar spine. The spine plus a function that defines the changes in radius constitute the medial axis transform or skeleton of the set. The notion extends to 3-D, by using solid balls instead of disks. Medial axis transforms have a variety of applications, from image processing [????] to finite element mesh generation [Srinivasan et al 19??].

3.2.5 Interpolation and Approximation

Interpolation and approximation methods typically are used to define higher-dimensional entities from lower-dimensional primitives. For example, a set of points may define a curve segment or a subset of a surface (often called a patch in the modeling jargon). We will discuss specific schemes later in this course, when we study curve and surface modeling. Here we present some fundamental notions through a simple example.

Suppose, for concreteness, that we want to represent a planar, second-degree parametric polynomial curve whose generic point p(u) has coordinates

$$x(u) = a u^2 + b u + c$$

$$y(u) = d u^2 + e u + f$$

where u is the parameter, and the 6 coefficients a, b, c, d, e, f are initially unknown. If we require, for example, that

p(0.0) = p1

p(0.5) = p2

p(1.0) = p3

the curve will pass through 3 given points p1, p2, p3. We say that the curve interpolates the points. Writing the equations above in terms of point coordinates we obtain 6 linear equations in the 6 unknown coefficients, which, barring singularities, can be solved uniquely. Therefore the 3 points represent the curve unambiguously.
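For the parameter values 0, 0.5 and 1 used above, the linear system can be solved by hand, one coordinate at a time. Writing v1, v2, v3 for one coordinate of the three points, the conditions are c = v1, a/4 + b/2 + c = v2 and a + b + c = v3, which give a = 2 v1 − 4 v2 + 2 v3 and b = −3 v1 + 4 v2 − v3. The sketch below encodes this solution; the function names are hypothetical.

```python
def quadratic_through(p1, p2, p3):
    """Coefficients of the quadratic with p(0) = p1, p(0.5) = p2, p(1) = p3.

    Returns one (a, b, c) triple per coordinate, such that the coordinate
    equals a*u**2 + b*u + c. For planar points these are (a, b, c), (d, e, f).
    """
    coefficients = []
    for v1, v2, v3 in zip(p1, p2, p3):
        a = 2 * v1 - 4 * v2 + 2 * v3
        b = -3 * v1 + 4 * v2 - v3
        c = v1
        coefficients.append((a, b, c))
    return coefficients

def evaluate(coefficients, u):
    """Point of the curve at parameter value u."""
    return tuple(a * u * u + b * u + c for a, b, c in coefficients)
```

The three points are thus a complete description of the curve: the coefficients can always be recovered from them.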


An interpolating curve passes through the given points but may have large oscillations between the points. An alternative approach is to require that the curve approximate the points, i.e., pass near the points without necessarily containing them. If we select a specific approximation method that ensures a unique result, the points will represent the curve unambiguously. Approximation methods tend to produce smoother, better-behaved curves than interpolation, and are widely used.

In a similar vein, points or curves may represent parametric surface or solid patches through interpolation and approximation schemes.

3.2.6 Boundary Methods

An object may be represented by its boundary plus the host space (e.g., an unbounded cylindrical surface) in which it lies. We already saw an example of a boundary representation scheme. It was Scheme 3 for representing polygons by their edges, discussed in Section 3.1. Knowledge of the boundary of a set does not always determine the set uniquely. However, for the objects we normally encounter in geometric modeling, boundary information does suffice to determine the “inside” unambiguously.

Boundary representations lower the dimensionality of the entities we must represent. For example, for a 3-D solid we need only represent its 2-D faces. And for each face, we need only represent its 1-D edges (plus host surface information). Thus, one can “recurse in dimension”, eventually ending up with 0-D point primitives.

Additional data may be required for entities that lie in certain host spaces. For example, a closed contour of edges on a spherical surface cuts the sphere into two bounded subsets (see Figure 3.2.6.1). Therefore, the edge-set is an ambiguous representation. It must be augmented with neighborhood information that indicates which of the two possible faces is to be selected.

Figure 3.2.6.1 – Two faces lying in a spherical surface and having the same boundary.

The neighborhood of a point p with respect to a set S in Euclidean 3-space is the intersection of S with a solid ball centered on p and with a radius R, as R approaches zero. Figure 3.2.6.2 shows the neighborhoods of (i) a midpoint of an edge of a polygon and (ii) a vertex of the polygon. There are many ways of representing neighborhoods computationally. For polygons and other objects represented by edge-lists, neighborhood information is often encoded by ordering the edges, and orienting them consistently. For example, we may require that an observer moving along the edges in the specified direction see the material to his or her left. This convention is shown by the arrows in Figure 3.2.6.2.

Figure 3.2.6.2 – Neighborhoods for points on a polygon’s boundary.
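For a simple polygon stored as an ordered vertex list, the "material to the left" convention can be checked with the signed area given by the shoelace formula. This is one common way of doing it and not necessarily the one intended in these notes. The sketch assumes a simple (non-self-intersecting) polygon and a frame with the x axis pointing right and the y axis pointing up; the function names are hypothetical.

```python
def signed_area(vertices):
    """Shoelace formula: positive when the vertices run counterclockwise."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]   # wrap around to close the contour
        total += x0 * y1 - x1 * y0
    return total / 2.0

def material_on_left(vertices):
    """True if walking along the edges in order keeps the interior on the left."""
    return signed_area(vertices) > 0
```

Reversing the vertex order flips the sign of the area, which corresponds to selecting the other side of the contour as material.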

The validity of boundary representations is a relatively complex issue. The validity conditions depend on the domain of objects to be represented, and on the details of the specific boundary representation scheme. Validity will be discussed later in this course, when we study the various geometric entities of interest such as curves or solids.

3.2.7 Hybrid Methods

It is possible to construct hybrid schemes, which combine several of the methods described earlier. For example, one can define extrusions by means of translational sweeps, and then combine the extruded objects by using Boolean operations, thus defining a sweep/constructive hybrid.

The main problem with hybrid schemes is the difficulty of writing algorithms for processing such representations. One approach is to implement separate algorithms to deal with each of the individual schemes, and then combine the results. An often simpler approach consists of converting a hybrid representation into a single-scheme representation, and then writing algorithms for this scheme. For example, the sweep/constructive hybrid could be converted to a BRep, and this representation used for developing the required algorithms.

3.3 Mathematical Underpinnings

3.3.1 Mathematical Models and Computational Representations

The framework introduced in Chapter 1 may be formalized as follows [Requicha 1980]. Let M be a space whose elements are abstract entities called mathematical models. In geometric computation, mathematical models usually are point sets in Euclidean space. As we will see later in this course, sharper characterizations of models exist for the various entities we will consider, e.g., curves or solids. M is called a mathematical modeling space.

The set of all syntactically correct representations constitutes another space R, called representation space. A representation scheme is formally defined as a mathematical relation s from M to R, as shown in Figure 3.3.1.1. The relation has a domain D, called the domain of the scheme, and a range V. A representation is valid if it belongs to V. A valid representation is both syntactically and semantically correct.

Figure 3.3.1.1 – Representation schemes as relations (the scheme s relates the domain D, a subset of the mathematical modeling space M, to the range V, a subset of the representation space R).

A representation r in V is unambiguous or complete if it corresponds to a single model m in M, i.e., if the inverse image of r is a single-element set: $s^{-1}(r) = \{m\}$. It is unique if $s[s^{-1}(r)] = \{r\}$. A representation scheme is unambiguous or complete if the inverse relation $s^{-1}$ is a function. It is unique if s is a function.
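For a finite toy scheme these definitions can be checked mechanically. The sketch below assumes that the relation s is stored as a set of (model, representation) pairs; the function names and the example data are hypothetical.

```python
from collections import defaultdict

def _images(pairs):
    """Map each first component to the set of second components paired with it."""
    table = defaultdict(set)
    for first, second in pairs:
        table[first].add(second)
    return table

def is_unambiguous(scheme):
    """True if the inverse relation is a function: one model per representation."""
    inverse = {(r, m) for m, r in scheme}
    return all(len(models) == 1 for models in _images(inverse).values())

def is_unique(scheme):
    """True if s is a function: one representation per model in the domain."""
    return all(len(reps) == 1 for reps in _images(scheme).values())

# Vertex lists that may start at any vertex: unambiguous but not unique.
vertex_lists = {('triangle', 'abc'), ('triangle', 'bca'), ('square', 'abcd')}
# Two different models sharing one representation: unique but ambiguous.
shared = {('model 1', 'r'), ('model 2', 'r')}
```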

3.3.2 General Topology and Regular Sets

The notions of interior, boundary, and so on, are formally defined in a branch of mathematics called point-set, or general topology [Mendelson 1975]. The main concepts of general topology that are of interest in geometric modeling are summarized below.

Let W be a set of abstract elements (the “world”) and d : W × W → ℜ, where ℜ denotes the set of real numbers, a function called a metric or distance, that satisfies the following axioms, for all x, y, z of W.

1. d(x, y) ≥ 0
2. d(x, y) = 0 ⇔ x = y
3. d(x, y) = d(y, x)
4. d(x, z) ≤ d(x, y) + d(y, z)

A pair (W, d) with these properties is called a metric space. An important example of a metric space is the Euclidean space with its usual distance.

An open ball of radius R > 0 and centered at a point x of W is the set

B(x; R) = { y ∈ W : d(x, y) < R }.

A subset X of a metric space W is open if it contains an open ball about each of its points. If W is the real line with its usual distance, an open ball is an open interval. Similarly, an open ball in E2 is a disk (without its bounding circle), and in E3 it is a solid sphere (also without its bounding surface). Open sets in a metric space have the following properties:

1. The empty set Ø and the universe W are open.
2. The intersection of a finite number of open sets is open.
3. The union of any collection of open sets is open.


The intersection of an infinite collection of open sets need not be open. For example, the intersection of the open intervals (–1, 1), (–1/2, 1/2), (–1/3, 1/3), ..., is the single-point set {0}, which is not open.

A topological space is a pair (W, T) where W is a set (the universe) and T is a collection of subsets of W, called open sets, that satisfy properties 1-3 above (by definition). T is called a topology. It is clear that any metric space with its open sets defined via open balls is also a topological space. Its (metrically-defined) open sets constitute the space’s natural topology, which is sometimes also called the topology induced by the metric. However, the axiomatic definition of topological space given just above is more general, does not require any notion of distance, and includes spaces in which no metric can be defined. Usually, several topologies can be associated with a single set W, resulting in several different topological spaces.

A neighborhood N(x) of a point x in a topological space (W, T) is any subset of W that contains an open set that contains x. For example, the closed interval [0,2] contains the open ball (0.9, 1.1) and therefore is a neighborhood of the point 1. Note that in the geometric modeling jargon “neighborhood” often has a more restricted meaning, explained in Section 3.2.6. Topology provides a notion of “nearness” that does not depend on distance. Thus, a point y is near a point x if y belongs to a neighborhood of x. General topology can be developed in terms of this notion of nearness; see [Henle 19??].

A subset X of a topological space (W, T) is closed if its complement cX is open. Note that closed sets are not the opposite of open sets. Some sets may be both closed and open, e.g., Ø and W, whereas others may be neither closed nor open, e.g. the interval [0,1), which includes 0 but not 1.

A point x is a limit point of a subset X of a topological space (W, T) if each neighborhood of x contains at least one point of X different from x. Limit points of X need not belong to X. For example, 0 and 1 are limit points of the open interval (0,1) but do not belong to the interval.

The closure of a subset X of a topological space (W, T), denoted kX (this is not standard notation), is the union of X with all its limit points. It can be shown that kX is the smallest closed set that includes X, and that X is closed if and only if X = kX.

A point x of a topological space (W, T) is an interior point of a subset X of W if X is a neighborhood of x, i.e., if X contains an open set that contains x. The set of all interior points of X is called its interior, and denoted iX in this course. It can be shown that iX is the largest open set that is included in X. Thus, a set is always bracketed by an open and a closed set, which are the set’s interior and closure.

A point x of a topological space (W, T) is a boundary point of a subset X of W if every neighborhood of x intersects both X and its complement cX. The boundary of X, denoted ∂X, is the set of all boundary points of X. It can be shown that ∂X is a closed set, and that the boundaries of a set and of its complement are the same, i.e., ∂X = ∂cX. Also, any subset X decomposes W into three pair-wise disjoint subsets:

W = iX ∪ ∂X ∪ icX .

It can be shown that the collection of sets of the form

X′ = X ∩ W′,

where W′ is a subset of a topological space (W, T) and X is an open set in the topology of (W, T), is a topology T′ for W′. T′ is called the relative or induced topology, and (W′, T′) is called a topological subspace of (W, T). The sets that constitute the relative topology T′ are called relatively open, or open-W′. It is important to note that the definitions of closure, interior and boundary depend on the topology being considered. For example, let W be Euclidean 3-space and W′ a 2-D Euclidean plane. A solid disk X lying in W′ is closed both in W and W′; however, the relative boundary of X in the 2-D topology of W′ is a circle, whereas its boundary in the 3-D topology of W is the disk itself.

The geometric modeling notion of neighborhood of a point p with respect to a set X corresponds to an open ball centered at p in the relative topology induced in X by the underlying Euclidean space.

A function f from a topological space (W, T) to another topological space (V, S) is continuous if, for every open set X of V, the inverse image $f^{-1}(X)$ is an open set of W. It is a homeomorphism if it is continuous and has a continuous inverse. One can show that a homeomorphism establishes a one-to-one correspondence between the elements of the two sets W and V, and also between the open sets of the topologies T and S. Two sets related by a homeomorphism are called homeomorphic or topologically equivalent. Properties that are invariant under homeomorphisms are called topological properties.

A subset X of Euclidean space is bounded if it can be enclosed by a ball of finite radius. It is compact if it is both closed and bounded. Boundedness is not a topological property, but compactness is.

A space is connected if it is not the union of two disjoint non-empty open sets. Connectedness also is a topological property. A disconnected set can always be decomposed into a union of disjoint, connected open sets called its connected components. A space is path-connected if every pair of points of the space can be joined by a curve that lies entirely within the space. Path-connectedness and connectedness are equivalent for the sets we normally consider in geometric modeling, but not for general sets.

A subset X of a topological space (W, T) is a closed regular set [Kuratowski & Mostowski 1976], or simply a regular set, if it equals the closure of its interior:

X = kiX .

The operation that consists of taking the closure of the interior of a set is called regularization, and denoted by rX. The regularized set operators are defined by regularization of their standard counterparts:

X ∪* Y = r(X ∪Y )

X −* Y = r(X − Y)

X ∩* Y = r(X ∩Y )

c*X = r(cX )

It can be shown [Kuratowski & Mostowski 1976] that the regular sets with the regularized set operations are a Boolean algebra, and therefore have the same algebraic properties as general sets with the standard set operators.

GMCh4 2/15/00 4-1



4. Curves and Surfaces

This chapter discusses mathematical models and computational representations for curves and surfaces.

4.1 Mathematical Models for Curves and Surfaces

We all have an intuitive understanding of curves and surfaces. But can we answer mathematically these basic questions: What is a curve? What is a surface? It turns out that there are several acceptable answers, and that different branches of mathematics use different definitions. We first introduce some of the fundamental notions through the simplest possible examples: the straight line, which is a special case of a curve, and the plane, which is a special case of a surface.

4.1.1 Lines and Planes

A straight line, or simply a line, in Euclidean space is a set of points p that satisfy

p − p0 = u(p1 − p0), u ∈(−∞, +∞), p0 ≠ p1 .

Here p0 and p1 are arbitrary but distinct points of the line. The equation above contains a parameter u and is called a parametric equation. As the parameter u takes all possible values from minus infinity to plus infinity, the point p traces the entire line.

The parametric equation of the line can be written in a different format. Algebraic manipulation of the original equation yields

p = (1 − u)p0 + up1.

This is the fundamental equation of linear interpolation. It shows that

u = 0 ⇒ p = p0

u = 1⇒ p = p1

i.e., the line interpolates the two given points. Furthermore, for 0 ≤ u ≤ 1, the point p is at an intermediate location between the two endpoints (see Figure 4.1.1.1). Therefore the equation of a line segment is simply

p − p0 = u(p1 − p0), u ∈[0,1], p0 ≠ p1 .


Letting

a0 = 1− u

a1 = u

the interpolation equation can also be written as

p = a0p0 + a1p1

a0 + a1 = 1

Figure 4.1.1.1 – Linear interpolation
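The interpolation equation p = (1 − u) p0 + u p1 translates directly into code. The sketch below represents points as tuples of coordinates; the function name lerp is a common convention and an assumption here, not a name used in these notes.

```python
def lerp(p0, p1, u):
    """Point (1 - u) * p0 + u * p1 on the line through p0 and p1."""
    return tuple((1 - u) * c0 + u * c1 for c0, c1 in zip(p0, p1))
```

Values of u in [0, 1] give points of the segment from p0 to p1, and values outside that interval give the rest of the line. For example, u = 0.25 gives the point one quarter of the way from p0 to p1.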

Let us now consider a plane that passes through a point p0 and is normal to a unit vector n. The equation of the plane is

(p − p0 ).n = 0

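Figure 4.1.1.2 illustrates the geometry involved. If we denote the point coordinates and vector components in a given frame by

$$p \leftrightarrow \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad n \leftrightarrow \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$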

and let

p0 .n = −d ,

the equation of the plane takes the familiar form

ax + by + cz + d = 0 .

This is called an implicit equation. Note that –d is the (perpendicular) distance between the plane and the origin, measured along n.

Figure 4.1.1.2 – A plane defined by a point and a normal vector
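The implicit form is convenient for computation, because substituting a point into the left-hand side of the equation gives its signed distance to the plane when n is a unit vector. The sketch below is an illustration with hypothetical function names. It normalizes the given normal first, so callers need not supply a unit vector.

```python
import math

def plane_from_point_normal(p0, n):
    """Coefficients (a, b, c, d) of ax + by + cz + d = 0 for the plane
    through p0 with normal n; n is scaled to unit length first."""
    length = math.sqrt(sum(comp * comp for comp in n))
    a, b, c = (comp / length for comp in n)
    d = -(a * p0[0] + b * p0[1] + c * p0[2])   # from p0 . n = -d
    return a, b, c, d

def signed_distance(plane, p):
    """Signed distance from p to the plane, positive on the side n points to."""
    a, b, c, d = plane
    return a * p[0] + b * p[1] + c * p[2] + d
```

For the horizontal plane through (0, 0, 5) the coefficients are (0, 0, 1, −5), so −d = 5 is the distance from the origin to the plane, as stated above.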

We defined a line by its parametric equation and a plane by its implicit equation. Lines also have implicit equations and planes parametric equations. For a line we consider two non-parallel planes and write

a1x + b1y + c1z + d1 = 0

a2 x + b2y + c2z + d2 = 0

This system of equations defines a line implicitly. For a plane we take three non-collinear points and write the parametric equation as follows

p − p0 = u(p1 − p0) + v(p2 − p0), u, v ∈(−∞,+∞) .

This equation essentially states that p is an arbitrary point in a 2-D Euclidean space with a (generally non-orthonormal) frame defined by the three given points, with p0 as origin, and basis vectors p1 − p0 and p2 − p0. Because a plane is a two-dimensional entity, its parametric equation has two independent parameters u and v.

A rectangular subset of a plane, often called a rectangular patch, is easy to define, simply by restricting the values taken by u and v to a 2-D interval, e.g., a ≤ u ≤ b, c ≤ v ≤ d. Other subsets of a plane are not so easy to define. We will return to this issue later in this chapter.

4.1.2 Curves

Calculus textbooks and most of the literature on curve modeling define a curve as a set of points p that satisfy a set of parametric equations:

$$p_x = f_x(u)$$

$$p_y = f_y(u)$$

$$p_z = f_z(u)$$

These can be abbreviated as

p = f(u),

where f is usually called a vector-valued function, because it has three components. The equations of a straight line are of this form, with f linear. If we let u range over the entire set of reals we obtain the complete curve. If we restrict u to an interval, we define a curve segment.

It is important to understand that the same curve (i.e., set of points) can be defined by different parametric equations. The function f defines not only a curve but also a parameterization for it, and there are many ways of parameterizing the same curve.

The function f and some of its derivatives are usually required to be continuous. If f is continuous, the curve is called C^0 continuous. If both the function and its first derivative are continuous, we say the curve is C^1 continuous. In general, a curve is C^i continuous if f and its first i derivatives are continuous.

If a curve is parameterized by its arc length s, the derivatives of the generic point of thecurve p(s) are related to the tangent and normal to the curve as follows

dpds

= t

dtds

= 1n

Here t is the unit vector tangent to the curve, n is the unit normal, and is the radius ofcurvature, i.e., the radius of the circle that best approximates the curve at point p. Theseequations are derived in the standard texts on geometry, e.g. [do Carmo 1976]. Formulasfor the tangent and normal also exist for other parameterizations, but are more complicated.We can construct a frame for each point p by defining a third vector b , called the bi-normal, as

b = t × n .

The three vectors constitute an orthonormal frame, called the Frenet frame, that in generalvaries as p traces the curve.
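The Frenet frame can be approximated numerically. The following is a minimal sketch, assuming a circular helix parameterized by arc length as the test curve and central finite differences for the derivatives; the helix, the step size `h`, and all function names are illustrative choices rather than anything prescribed by the text.

```python
import math

def helix(s, a=2.0, b=1.0):
    """Circular helix of radius a and pitch parameter b, parameterized by arc length s."""
    c = math.hypot(a, b)
    return (a * math.cos(s / c), a * math.sin(s / c), b * s / c)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def frenet_frame(curve, s, h=1e-3):
    """Approximate t, n, b and the radius of curvature rho at arc length s."""
    pm, p0, pp = curve(s - h), curve(s), curve(s + h)
    t = tuple((x1 - x0) / (2 * h) for x0, x1 in zip(pm, pp))                   # dp/ds
    dt = tuple((x1 - 2 * x + x0) / (h * h) for x0, x, x1 in zip(pm, p0, pp))  # dt/ds
    rho = 1.0 / math.sqrt(sum(x * x for x in dt))                              # |dt/ds| = 1/rho
    n = tuple(x * rho for x in dt)
    return t, n, cross(t, n), rho

t, n, b, rho = frenet_frame(helix, 0.7)
print(rho)  # close to (a^2 + b^2) / a for this helix
```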

C continuity, defined above, is also known as parametric continuity, because it is expressed in terms of the parameterization. Alternatively, one can define geometric continuity, which is intrinsic to a curve itself, and does not depend on the parameterization. A curve is G^1 continuous if its unit tangent varies continuously, and is G^2 if both the unit tangent and normal are continuous. It can be shown that parametric and geometric continuity are slightly different notions, and neither one implies the other. G continuity is the more important concept for applications, but it is harder to establish, and one often settles for C continuity.

The parametric curves most commonly used in geometric modeling are defined by functions f whose three components are polynomials in u, or are the quotients of two polynomials in u. They are known as parametric polynomial or parametric rational curves, respectively. Cubic curves are especially important because they can exhibit C^2 continuity. The equations of a parametric cubic can be written as

$$x = a_3 u^3 + a_2 u^2 + a_1 u + a_0$$

$$y = b_3 u^3 + b_2 u^2 + b_1 u + b_0$$

$$z = c_3 u^3 + c_2 u^2 + c_1 u + c_0$$

If we interpret the coefficients as the coordinates of points

$$\mathbf{p}_3 \leftrightarrow \begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}, \quad \mathbf{p}_2 \leftrightarrow \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}, \quad \mathbf{p}_1 \leftrightarrow \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}, \quad \mathbf{p}_0 \leftrightarrow \begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}$$

the curve equations become

$$\mathbf{p} = \mathbf{p}_3 u^3 + \mathbf{p}_2 u^2 + \mathbf{p}_1 u + \mathbf{p}_0,$$

which shows that the generic point of a parametric polynomial curve can be expressed as a linear combination of other points, usually called control points.

The parametric definition of a curve has its limitations. For example, space-filling “curves” can be defined by continuous f functions. These “curves” contradict our intuition of a curve as a 1-D entity, because they actually correspond to 2-D or 3-D regions of space. In addition, parametric curves can self-intersect. If self-intersections are undesirable, we can model a curve by a different mathematical entity called a manifold.

A closed n-manifold is a set that is locally just like a Euclidean space. For example, each point in a closed 1-manifold has a small interval around it that can be elastically deformed into an interval of the real line. An elastic deformation, without cutting or gluing, is called in topology a homeomorphism. Two sets are called topologically equivalent if they are related by a homeomorphism. Topology, like Euclidean geometry, is the study of properties of objects that remain invariant under certain transformations. For Euclidean geometry the relevant transformations were the isometries, whereas for topology they are the homeomorphisms. The top three 1-D objects of Figure 4.1.2.1 are homeomorphic, and so are the three bottom objects. However, top objects are not homeomorphic to bottom ones, because a top object cannot be elastically deformed into a bottom one without gluing its endpoints.

Figure 4.1.2.1 – Compact, connected 1-manifolds

The bottom objects in the figure are examples of closed 1-manifolds. The top objects are bordered 1-manifolds. Imagine a small disk centered about each of the points p of a 1-manifold X. Recall that the intersection of a disk centered at p with the set X is called the neighborhood of the point with respect to the set (Section 3.2.6). If X is a manifold, a neighborhood of p must be homeomorphic either to an open interval of the real line (x − ε, x + ε) or to a “half interval” [x, x + ε). If there are no points that correspond to half-intervals, the manifold is closed; otherwise it is bordered. It can be shown that there are only two kinds of connected compact (i.e. closed and bounded) sets in Euclidean space that are 1-manifolds. They are the line segment and the circle. All other connected compact 1-manifolds are topologically equivalent to either a line segment or a circle. Therefore Figure 4.1.2.1 illustrates all the topologically distinct connected compact 1-manifolds.

In contrast, Figure 4.1.2.2 shows a non-manifold. The point of contact of the two ellipses violates the definition of manifold. Its neighborhood is not homeomorphic to either an interval or a half interval; rather, it is topologically equivalent to a “cross” formed by two intersecting line segments, as shown in the figure.

Figure 4.1.2.2 – Point p has a neighborhood homeomorphic to two intersecting segments, and therefore the two ellipses do not constitute a 1-manifold

There is yet another definition of curve, through implicit equations:

$$f_1(x, y, z) = 0$$

$$f_2(x, y, z) = 0$$

This is a generalization of the implicit equation of a straight line, discussed in the previous section. The most interesting class of curves defined implicitly arises when the functions f are algebraic, i.e., they are polynomials in the spatial variables x, y, z.

Figure 4.1.2.3 – A circle defined implicitly

A set defined by a system of algebraic equations is called an algebraic variety. There is a large body of mathematics, called algebraic geometry, that studies algebraic varieties. Algebraic curves in 3-space are special cases of algebraic varieties, defined by two simultaneous equations. Each of the equations per se corresponds to an algebraic surface, and the curve is defined as the intersection of the two surfaces. Figure 4.1.2.3 shows a circle defined by intersecting a sphere with a plane, with equations

$$x^2 + y^2 + z^2 = 1$$

$$z = 0.5$$

Defining a curve as the intersection of two algebraic surfaces can also lead to non-intuitive “curves”. For example, the intersection of two touching spheres is a point, not a 1-D entity. Therefore, it would be better to define a curve algebraically as the 1-D portion of the intersection variety. Unfortunately, computing the dimension of a general algebraic variety is a difficult problem.

All curves can be parameterized, but algebraic curves do not necessarily admit polynomial or rational parameterizations. On the other hand, parametric polynomial or rational curves can be implicitized and defined through algebraic varieties, but the degrees of the corresponding implicit equations generally are much higher than those of the original parametric equations.

4.1.3 Surfaces

We saw in the previous section that in mathematics there are several concepts of “curve”. For surfaces, the situation is analogous. In calculus and differential geometry a surface is defined by parametric equations:

$$x = f_x(u, v), \qquad y = f_y(u, v), \qquad z = f_z(u, v).$$

Now two parameters are needed, because surfaces are intrinsically two dimensional. These equations may be abbreviated as

$$\mathbf{p} = \mathbf{f}(u, v).$$

The vector-valued function f usually is required to satisfy certain continuity, or smoothness, conditions. If we restrict the parameters to a 2-D interval a ≤ u ≤ b, c ≤ v ≤ d, the resulting subset of the surface is a (“rectangular”) patch. Other subsets of a parametric surface are not as easy to define. Typically they are defined through boundary representations, as discussed in Chapter 5.

Parametric surfaces used in geometric modeling typically are either polynomial or rational, i.e., the components of the function f are either polynomials in the variables u, v or are quotients of two polynomials in these parameters.

At each point of a smooth surface there are infinitely many lines tangent to the surface. These lines define the tangent plane at the point. The tangent plane is perpendicular to the normal to the surface.

If we let

$$u = g_u(t), \qquad v = g_v(t)$$

and substitute in the parametric equation of a surface, we obtain

$$\mathbf{p} = \mathbf{f}[g_u(t), g_v(t)] = \mathbf{h}(t).$$

This is an equation in a single parameter, and therefore it defines a curve. Since the points of the curve also satisfy the equation of the surface, the curve must lie on the surface. Different choices of functions g correspond to different curves on the surface. If we fix one of the parameters and let the other vary,

$$u = u, \qquad v = v_0,$$

or

$$u = u_0, \qquad v = v,$$

the corresponding curves

$$\mathbf{p} = \mathbf{f}(u, v_0) = \mathbf{h}_{v_0}(u)$$

$$\mathbf{p} = \mathbf{f}(u_0, v) = \mathbf{h}_{u_0}(v)$$

are called constant-parameter curves. The tangents to the constant-parameter curves are the derivatives of the h functions. Therefore, the tangent vector to a constant-v curve is ∂p/∂u, and the tangent to a constant-u curve is ∂p/∂v. The normal to the surface must be perpendicular to both of these tangents, and therefore (assuming the two vectors are not parallel) the unit normal is given by

$$\mathbf{n} = \frac{\dfrac{\partial \mathbf{p}}{\partial u} \times \dfrac{\partial \mathbf{p}}{\partial v}}{\left\| \dfrac{\partial \mathbf{p}}{\partial u} \times \dfrac{\partial \mathbf{p}}{\partial v} \right\|}.$$

A planar surface has a fixed normal, but in general n varies from point to point on a curved surface. The set of normals to all the points of a surface constitutes the surface’s Gaussian image. One can think of the Gaussian image as a set of directions, or direction cone, or, equivalently, as the subset of the unit sphere where the normal directions intersect the sphere. The unit sphere is sometimes called the Gaussian sphere. For example, for a spherical surface the Gaussian image is the entire unit sphere; for a plane it is a single point; and for a cylinder it is a great circle on the Gaussian sphere. Gaussian images are used extensively in Computer Vision and in other applications, such as visibility studies.
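The unit normal of a parametric surface, computed as the normalized cross product of the tangents to the two constant-parameter curves, can be sketched as follows. The sphere parameterization, the finite-difference step `h`, and the function names are assumptions made for illustration; for a sphere centered at the origin the result should agree with the radial direction, which is why its Gaussian image covers the whole unit sphere.

```python
import math

def sphere(u, v, R=2.0):
    """Sphere of radius R: u is the longitude, v the latitude."""
    return (R * math.cos(u) * math.cos(v),
            R * math.sin(u) * math.cos(v),
            R * math.sin(v))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_normal(f, u, v, h=1e-6):
    """Normalized cross product of the partial derivatives of f, by central differences."""
    pu = tuple((a - b) / (2 * h) for a, b in zip(f(u + h, v), f(u - h, v)))
    pv = tuple((a - b) / (2 * h) for a, b in zip(f(u, v + h), f(u, v - h)))
    c = cross(pu, pv)
    length = math.sqrt(sum(x * x for x in c))
    return tuple(x / length for x in c)

n = unit_normal(sphere, 0.3, 0.4)
print(n)
```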

Let us now consider the topological point of view. In topology a surface is a 2-manifold, i.e., a set X such that the neighborhood with respect to X of each point p of X is topologically equivalent to either an open disk (i.e., a 2-D open ball) or to half a disk. If all neighborhoods are homeomorphic to disks, the manifold is closed, otherwise it is bordered. This definition is a direct generalization of the earlier notion of 1-manifold we encountered in the previous section, and can be generalized further, to spaces with higher dimensionality.

Figure 4.1.3.1 shows that the surface of a solid cube is an example of a piecewise-planar, or polyhedral, closed 2-manifold. It is a collection of polygonal faces such that the neighborhoods of vertices, or of points in the interior of an edge, or in the interior of a face, all can be deformed elastically so as to become disks.

Figure 4.1.3.1 – The surface of a solid cube is a closed 2-manifold

Figure 4.1.3.2 – The boundaries of two cubes glued at an edge or at a vertex are not 2-manifolds (inset: cross section of neighborhood)

In contrast, Figure 4.1.3.2 shows two polyhedral “surfaces” that are not manifolds. The object on the left fails to be a manifold because it has an edge that is shared by more than two faces. The neighborhoods of points on the common edge are made out of four half disks that cannot be flattened into a single disk without gluing. The object on the right has a vertex that is shared by two cones of faces. Its neighborhood can be deformed into two disks, but these have to be glued to obtain a single disk.

A surface that crosses itself has at least one edge of self-intersection, and therefore is analogous to the two cubes glued along an edge in Figure 4.1.3.2. It follows that self-intersecting surfaces are not 2-manifolds. It is also clear from this figure that the union of two 2-manifolds (the cube boundaries) may produce objects that are not manifolds. In other words, 2-manifolds are not algebraically closed under set operations.

We saw earlier that all connected, compact, closed 1-manifolds are topologically equivalent to a circle. For closed 2-manifolds the situation is much more complicated. First, there are so-called non-orientable 2-manifolds, which cannot be built in Euclidean 3-space, and therefore will not be discussed in this course. A general orientable 2-manifold that is closed, connected and compact must be homeomorphic to a sphere with n “handles” or “holes”, for n = 0, 1, ... Figure 4.1.3.3 shows on the left a polyhedral 2-manifold that is topologically equivalent to a sphere with one handle, shown on the right. The sphere with one handle is itself homeomorphic to a torus, i.e., to a doughnut-shaped surface. It is easy to see if two manifolds are equivalent by imagining one being deformed elastically into the other.

Figure 4.1.3.3 – A polyhedral surface and a topologically equivalent sphere with one handle

Finally, from the point of view of algebraic geometry a surface is an algebraic variety defined by a single implicit equation

$$f(x, y, z) = 0$$

where f is a polynomial in the spatial variables x, y, z. Algebraic surfaces may self-intersect, and sometimes are not truly two-dimensional.

A plane is the simplest example of an algebraic surface, and is defined by a linear implicit equation. Equations of degree higher than one correspond to curved surfaces. For example, a sphere of radius R, centered at the origin, is defined by

$$x^2 + y^2 + z^2 - R^2 = 0,$$

and a cylinder of radius R and axis coincident with the z coordinate axis is defined by

$$x^2 + y^2 - R^2 = 0.$$

The normal to a surface f(x, y, z) = 0 is parallel to its gradient vector, defined as

$$\nabla f \leftrightarrow \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \\ \partial f / \partial z \end{bmatrix}.$$

Parametric polynomial or rational surfaces can be implicitized, but the resulting algebraic varieties typically have much higher degrees.

4.2 Representations for Curves

We saw in Section 4.1.2 that there are at least three useful models for curves: images of parametric functions, 1-manifolds, and algebraic (implicit) curves. This section focuses on parametric curves, because parameterizations are required by most of the algorithms that process curves. Many geometric modeling systems adopt 1-manifold models for curves, but still represent them parametrically. We begin with mathematical preliminaries. First we show that vector space concepts are useful to study polynomials, and introduce multivariate polynomials, called blossoms, which are associated with single-variable polynomials and are very useful computationally. Then we focus on approximation schemes for representing curves. We discuss Bézier and B-spline methods, which are the most widely used. Finally, we look at parametric curve representation by primitive instancing. Most of our examples are second degree curves, for simplicity, although the most commonly used curves in practice are cubics, because they can be C^2 continuous.

4.2.1 Vector Spaces of Polynomials

Consider the set of all the polynomials f(u) in a single variable u that have degrees at most d. Addition of polynomials is defined as usual, by summing same-power coefficients. Multiplication by a real scalar corresponds to multiplying all the coefficients by the scalar. It is very easy to prove that this set with the two operations satisfies all the required axioms and is a vector space. The dimension of the space equals the maximum number of coefficients in a degree d polynomial, which is d + 1.

A second degree polynomial in u can be written as

$$f(u) = \begin{bmatrix} 1 & u & u^2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix},$$

where the a’s are the usual coefficients of the powers of u. Comparing this expression with the expansion of a vector as a linear combination of a basis

$$\mathbf{x} = \begin{bmatrix} \mathbf{e}_0 & \mathbf{e}_1 & \mathbf{e}_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}$$

we see that the monomial functions

$$\begin{bmatrix} 1 & u & u^2 \end{bmatrix}$$

constitute a basis for the space of polynomials of degree ≤ 2. The components of the vector (i.e., polynomial) f are the coefficients

$$\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}.$$

The monomial basis is also called the power basis.

If we select a basis, any polynomial in the vector space can be represented by the column matrix of its components. Any choice of basis leads to representations that are unambiguous, unique, and of the same size. However, these representations differ considerably in other characteristics. For example, the power basis has severe computational drawbacks, because it often introduces large numerical errors in computations. It is also ill-suited for curve design, because the coefficients do not have a simple, intuitive effect on the shape of the curve.

A quadratic polynomial, being a vector in a 3-D space, has 3 degrees of freedom. These can be constrained in a geometrically meaningful fashion so as to control the polynomial shape. For example, let us require that a polynomial be represented by the following three quantities:

$$b_0 = f(0)$$

$$b_1 = f(0) + \tfrac{1}{2} f'(0)$$

$$b_2 = f(1)$$

where the prime denotes the derivative. The geometric meaning of these constraints is shown in Figure 4.2.1.1.

Figure 4.2.1.1 – A quadratic polynomial and its three defining points.

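The derivative of f is

$$f'(u) = \begin{bmatrix} 0 & 1 & 2u \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}$$

and therefore the constraints above can be expressed in the power basis representation as

$$f(0) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = b_0$$

$$f(0) + \tfrac{1}{2} f'(0) = \begin{bmatrix} 1 & \tfrac{1}{2} & 0 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = b_1$$

$$f(1) = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = b_2$$

These three equations can be written together in matrix form as

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & \tfrac{1}{2} & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}.$$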

The column matrices contain the components of the vector f in two bases:

$$F_p = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}, \qquad F_b = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}$$

Here F_p corresponds to the power basis components. The other basis is called the Bernstein basis, and the corresponding polynomials are the Bernstein polynomials.

The matrix equation that expresses the constraints can be rewritten as

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & \tfrac{1}{2} & 0 \\ 1 & 1 & 1 \end{bmatrix} F_p = F_b$$

But we know from Section 3.4 that the components of a vector in the two bases are related by

$$F_p = M_{pb}^{b} F_b$$

where $M_{pb}^{b}$ is the matrix corresponding to the transformation that maps the vectors of the power basis into those of the Bernstein basis. It follows that

$$(M_{pb}^{p})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & \tfrac{1}{2} & 0 \\ 1 & 1 & 1 \end{bmatrix}.$$

Inverting this matrix gives us the desired transformation matrix

$$M_{pb}^{p} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 2 & 0 \\ 1 & -2 & 1 \end{bmatrix}.$$

We also know that the columns of this matrix contain the components of the new basis expressed in the old basis. Therefore, the Bernstein polynomials are the following

$$B_0^2(u) = 1 - 2u + u^2 = (1 - u)^2$$

$$B_1^2(u) = 2u - 2u^2 = 2u(1 - u)$$

$$B_2^2(u) = u^2.$$

Here the superscript in the basis functions denotes the maximum degree of the polynomials in the space.
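The change of basis can be checked numerically. The sketch below hard-codes the two 3×3 matrices from the derivation (the constraint matrix and its inverse) and confirms that a quadratic evaluates to the same value in either basis; the helper names and the sample coefficients are illustrative assumptions.

```python
# Matrix mapping power-basis coefficients (a0, a1, a2) to Bernstein
# components (b0, b1, b2), and its inverse, as derived in the text.
POWER_TO_BERNSTEIN = [[1.0, 0.0, 0.0],
                      [1.0, 0.5, 0.0],
                      [1.0, 1.0, 1.0]]
BERNSTEIN_TO_POWER = [[1.0, 0.0, 0.0],
                      [-2.0, 2.0, 0.0],
                      [1.0, -2.0, 1.0]]

def matvec(m, x):
    """Multiply a matrix (list of rows) by a column given as a list."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def eval_power(a, u):
    return a[0] + a[1] * u + a[2] * u * u

def eval_bernstein(b, u):
    return b[0] * (1 - u) ** 2 + b[1] * 2 * u * (1 - u) + b[2] * u * u

a = [1.0, -3.0, 4.0]                 # f(u) = 1 - 3u + 4u^2 (illustrative)
b = matvec(POWER_TO_BERNSTEIN, a)    # Bernstein components of the same polynomial
print(b, matvec(BERNSTEIN_TO_POWER, b))
```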

A similar analysis for an arbitrary degree d yields the general Bernstein polynomials:

$$B_i^d(u) = \binom{d}{i} u^i (1 - u)^{d - i}.$$

This is precisely the form of the binomial distribution, well known in probability and statistics. For example, if the probability of heads in a coin toss is u, the probability of exactly i heads in d tosses is the Bernstein polynomial $B_i^d(u)$. The probabilistic interpretation of the Bernstein basis can be exploited in the geometric modeling context; see [Goldman 1983].
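The general formula translates directly into code. This is a minimal sketch using `math.comb` from the Python standard library; the function name `bernstein` is an assumption. Because the values form a binomial distribution, they are non-negative on [0, 1] and sum to 1 for any fixed u.

```python
from math import comb

def bernstein(d, i, u):
    """Bernstein polynomial B_i^d(u) = C(d, i) * u^i * (1 - u)^(d - i)."""
    return comb(d, i) * u ** i * (1.0 - u) ** (d - i)

# Probability of exactly i heads in 3 tosses of a coin with P(heads) = 0.3.
probabilities = [bernstein(3, i, 0.3) for i in range(4)]
print(probabilities, sum(probabilities))
```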

Thus far we have considered only one polynomial function. Now let us define a point p(u) lying on a parametric curve in the Euclidean plane, with coordinates

$$\mathbf{p}(u) \leftrightarrow \begin{bmatrix} x(u) \\ y(u) \end{bmatrix},$$

and let these coordinates be second degree polynomials in u. Proceeding as in Section 4.1.2 we can write

$$\mathbf{p}(u) = \begin{bmatrix} B_0^2(u) & B_1^2(u) & B_2^2(u) \end{bmatrix} \begin{bmatrix} \mathbf{b}_0 \\ \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix}.$$

A curve in this representation is called a Bézier curve, and the b’s are called its Bézier control points, which together form the control polygon. It follows from our derivation for the single-function case that

$$\mathbf{p}(0) = \mathbf{b}_0, \qquad \mathbf{p}(0) + \tfrac{1}{2}\,\mathbf{t}(0) = \mathbf{b}_1, \qquad \mathbf{p}(1) = \mathbf{b}_2,$$

where t is the tangent to the curve.

4.2.2 Blossoms

We now discuss the blossoming approach, which provides very useful tools for dealing with Bézier and spline curves. The blossom $g(u_1, u_2, \ldots, u_d)$ that corresponds to a polynomial f(u) of degree d is a multivariate polynomial with the following properties:

1) $f(u) = g(u, u, \ldots, u)$

2) $g(u_1, u_2, \ldots, u_d) = g[\mathrm{Permutation}(u_1, u_2, \ldots, u_d)]$

3) $g(\ldots, u_i = a_0 q_0 + a_1 q_1, \ldots) = a_0\, g(\ldots, u_i = q_0, \ldots) + a_1\, g(\ldots, u_i = q_1, \ldots), \quad a_0 + a_1 = 1$

The first, or diagonal, property shows that the univariate polynomial can be recovered from its blossom by setting all the variables in the blossom to the same u value. Note that there are d variables in the blossom g for an f polynomial of degree d. The second property indicates that the order of the variables in the blossom is unimportant. The third provides an interpolation formula for computing blossom values, and will be used extensively in the algorithms described below. It can be shown [Ramshaw 1987] that the blossom of a polynomial f(u) is well-defined and always exists.
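For a quadratic given in the power basis, the blossom has a simple closed form, g(u1, u2) = a0 + a1(u1 + u2)/2 + a2·u1·u2. This formula is a standard result that the text does not state explicitly, so the sketch below should be read as an illustration under that assumption; it checks the three properties numerically for one sample polynomial.

```python
def blossom2(a, u1, u2):
    """Blossom of f(u) = a0 + a1*u + a2*u**2 (standard closed form, assumed here)."""
    a0, a1, a2 = a
    return a0 + a1 * (u1 + u2) / 2.0 + a2 * u1 * u2

a = (1.0, -3.0, 4.0)   # illustrative coefficients: f(u) = 1 - 3u + 4u^2

# 1) Diagonal property: g(u, u) = f(u).
u = 0.7
assert abs(blossom2(a, u, u) - (a[0] + a[1] * u + a[2] * u * u)) < 1e-12

# 2) Symmetry: the order of the arguments does not matter.
assert blossom2(a, 0.2, 0.9) == blossom2(a, 0.9, 0.2)

# 3) Interpolation: with a0 + a1 = 1, g is affine in each argument.
w0, w1, q0, q1 = 0.25, 0.75, 2.0, 4.0
lhs = blossom2(a, w0 * q0 + w1 * q1, 1.2)
rhs = w0 * blossom2(a, q0, 1.2) + w1 * blossom2(a, q1, 1.2)
assert abs(lhs - rhs) < 1e-12
```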

Let us begin our exploration of blossom algorithms. We will work with second degree polynomials for simplicity, but the generalization to higher degrees is obvious. We start with the blossom values g(0,0), g(0,1) and g(1,1) and compute f(u) = g(u,u) for any u between 0 and 1 as follows. We interpolate between g(0,0) and g(0,1) by using Property 3 of the blossom, setting

$$u_i = u_2, \qquad q_0 = 0, \quad q_1 = 1, \qquad a_0 = 1 - u, \quad a_1 = u$$

so as to obtain

$$g(0, u) = (1 - u)\, g(0, 0) + u\, g(0, 1).$$

We interpolate similarly between g(0,1) and g(1,1) to compute g(u,1). Then we interpolate once more between the intermediate results g(0,u) and g(u,1) to produce the final value g(u,u). This computation can be summarized as shown by the graph of Figure 4.2.2.1, where the arcs denote multiplication by the factors indicated, and the nodes denote sum.

Figure 4.2.2.1 – Blossom evaluation by successive interpolation

Doing the calculations in full yields

$$f(u) = g(u, u) = g(0,0)\,[(1 - u)(1 - u)] + g(0,1)\,[u(1 - u) + (1 - u)u] + g(1,1)\,u^2.$$

The functions of u that multiply the g values are the Bernstein basis functions of the previous section, and therefore

$$f(u) = g(u, u) = g(0,0)\,B_0^2(u) + g(0,1)\,B_1^2(u) + g(1,1)\,B_2^2(u).$$

It follows that the g’s must be the Bézier points of f, i.e., the components of the (polynomial) vector f in the Bernstein basis:

$$g(0, 0) = b_0, \qquad g(0, 1) = b_1, \qquad g(1, 1) = b_2$$

A similar computation can be used for each coordinate of a point p(u) on a Bézier curve. We have therefore discovered a computational scheme for evaluating points on a Bézier curve by successive interpolation of its Bézier control points. This procedure has a simple geometric interpretation shown in Figure 4.2.2.2. In the figure, and elsewhere in the sequel, we denote points by the sequence of arguments in the corresponding blossom values. For example, the point g(0,1) is denoted simply by 01.

The figure illustrates the computation of the point that corresponds to u = 1/3, given the Bézier points 00, 01 and 11. First we interpolate between 00 and 01. To do this, we construct a point 0u along the segment 00 to 01, at a distance from 00 that corresponds to one third (u = 1/3) of the distance between 00 and 01. Similarly, we construct a point u1 along the segment 01 to 11. Finally, we interpolate between 0u and u1 to find uu. Note that this is a purely geometric construction, involving only ratios of line segments. This computation is fast and numerically stable, and is known as de Casteljau’s algorithm, after its inventor, an engineer with the French car maker Citroën. (Interestingly, Bézier and de Casteljau, working independently and for competing companies, came up with essentially the same approach for defining free-form curves and surfaces. Bézier published his work, whereas de Casteljau did not; his manuscripts surfaced only recently, many years after the original work was done.) De Casteljau’s algorithm provides a very convenient method for tracing a curve, i.e., for computing successive points along the curve. Curve tracing is a fundamental capability, used for many applications such as curve display and computing intersections.

Figure 4.2.2.2 – De Casteljau’s algorithm (u = 1/3; control points b0 = 00, b1 = 01, b2 = 11; intermediate points 0u and u1; curve point uu)
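De Casteljau’s construction can be sketched as a short loop of repeated linear interpolations. The function name and the sample control points below are illustrative assumptions; the loop works for any degree, with the control points on [0, 1] given as tuples of coordinates.

```python
def de_casteljau(points, u):
    """Evaluate a Bézier curve at u by repeated linear interpolation of its control points."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # Each pass replaces n points by n - 1 interpolated points.
        pts = [tuple((1.0 - u) * a + u * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]

control = [(0.0, 0.0), (1.0, 2.0), (3.0, 0.0)]   # illustrative points 00, 01, 11
point = de_casteljau(control, 1.0 / 3.0)          # the point uu for u = 1/3
print(point)
```

Tracing the curve amounts to calling `de_casteljau` for a sequence of increasing u values.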

The computation diagrammed in Figure 4.2.2.1 can be generalized to higher degrees, to parameter intervals other than [0,1], and to unequal u arguments. A key requirement is that the successive control points in the top row, from left to right, be such that each pair in the sequence differs in only one of the u arguments. Otherwise, one cannot interpolate between successive points by using Property 3 of the blossoms. For example, for degree 3, the Bézier points are

$$g(0,0,0), \quad g(0,0,1), \quad g(0,1,1), \quad g(1,1,1).$$

For an interval [k0, k1], it can be shown that the Bézier points are obtained from the blossom by substituting 0 by k0 and 1 by k1 in the expressions above. For example, the Bézier points for a degree 2 polynomial are

$$g(k_0, k_0), \quad g(k_0, k_1), \quad g(k_1, k_1).$$

The blossom computation is still valid, provided that we use the correct interpolation formulae that correspond to the situation depicted in Figure 4.2.2.3.

Figure 4.2.2.3 – Linear interpolation between parameters other than 0 and 1 (endpoints p0 = p(k0) and p1 = p(k1), interior point p(u)).

Instead of multiplication by (1 − u) and u we now need slightly more complicated expressions:

$$\mathbf{p}(u) = \frac{k_1 - u}{k_1 - k_0}\,\mathbf{p}_0 + \frac{u - k_0}{k_1 - k_0}\,\mathbf{p}_1.$$

Note that the parameter expressions have changed, but the geometric construction embodied in de Casteljau’s algorithm has not. This may be more obvious if we rewrite the interpolation formula as

$$\mathbf{p}(u) = \mathbf{p}_0 + \frac{u - k_0}{k_1 - k_0}\,(\mathbf{p}_1 - \mathbf{p}_0),$$

which shows that we are adding to the initial point p0 a vector in the direction of the line segment, and with a magnitude determined by the ratio of the distance between p and p0 to the distance between p1 and p0. The result of the successive interpolation, since it implements de Casteljau’s algorithm, is still a Bézier curve, but now parameterized in the interval [k0, k1] instead of [0,1]. The parameter values k0 and k1 are usually called knots.

We can generalize further, and use a successive interpolation scheme to evaluate blossom points that are not in the curve, i.e., points that do not satisfy

$$u_1 = u_2 = \cdots = u_d = u.$$

All that is needed is to use the appropriate linear interpolation formulae, but with different u’s. Figure 4.2.2.4 shows an important example for a quadratic curve. The control points in this example are known as the de Boor points, which will be discussed in detail in the next sections.

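Figure 4.2.2.4 – Evaluating a general quadratic blossom from its de Boor points

The arcs of the diagram carry the following interpolation weights:

$$g(u_2, k_1) = \frac{k_2 - u_2}{k_2 - k_0}\, g(k_0, k_1) + \frac{u_2 - k_0}{k_2 - k_0}\, g(k_1, k_2)$$

$$g(u_2, k_2) = \frac{k_3 - u_2}{k_3 - k_1}\, g(k_1, k_2) + \frac{u_2 - k_1}{k_3 - k_1}\, g(k_2, k_3)$$

$$g(u_1, u_2) = \frac{k_2 - u_1}{k_2 - k_1}\, g(u_2, k_1) + \frac{u_1 - k_1}{k_2 - k_1}\, g(u_2, k_2)$$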

The first two points in the top row have the same k1. We interpolate between k0 and k2, using a ratio that corresponds to u2. The result is g(u2, k1). The second and third points in the first row have the same k2. We interpolate between k1 and k3, with ratios that correspond to u2. Finally, we interpolate the two intermediate results between k1 and k2, with ratios that correspond to u1. Observe in Figure 4.2.2.4 that we still have three control points, although they are not the Bézier points, but now we have four knots instead of two as in the previous examples.
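The three interpolation steps just described can be sketched for one coordinate of a quadratic curve. The function names, the knot values, and the sample polynomial are illustrative assumptions; the closed-form blossom `g_exact` is used only to manufacture de Boor points and to check the result.

```python
def lerp(p, q, lo, hi, u):
    """Interpolate between p (attained at parameter lo) and q (attained at hi)."""
    w = (u - lo) / (hi - lo)
    return (1.0 - w) * p + w * q

def blossom_from_de_boor(d, knots, u1, u2):
    """Evaluate g(u1, u2) from d = (g(k0,k1), g(k1,k2), g(k2,k3)) and knots k0..k3."""
    d0, d1, d2 = d
    k0, k1, k2, k3 = knots
    g_u2_k1 = lerp(d0, d1, k0, k2, u2)       # first two points share k1
    g_u2_k2 = lerp(d1, d2, k1, k3, u2)       # last two points share k2
    return lerp(g_u2_k1, g_u2_k2, k1, k2, u1)

def g_exact(u1, u2):
    """Blossom of the illustrative polynomial f(u) = 1 - 3u + 4u^2."""
    return 1.0 - 3.0 * (u1 + u2) / 2.0 + 4.0 * u1 * u2

knots = (0.0, 1.0, 3.0, 4.0)
de_boor = tuple(g_exact(knots[i], knots[i + 1]) for i in range(3))
value = blossom_from_de_boor(de_boor, knots, 1.5, 2.5)
print(value, g_exact(1.5, 2.5))
```

Setting u1 = u2 = u in `blossom_from_de_boor` yields a point on the curve itself.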

One might think that evaluating a blossom for points that do not lie on the curve serves no practical purpose, but we will see soon that this construction can be quite useful. The computation diagrammed in Figure 4.2.2.4 is a generalized de Casteljau construction, using different ratios and u values at each level.

4.2.3 Bézier Curves

We saw earlier that a Bézier curve of degree d is unambiguously defined by a control polygon with d + 1 vertices. A point on the curve is given by a linear combination of the control points b_i. For example, for degree 2:

$$\mathbf{p}(u) = \begin{bmatrix} B_0^2(u) & B_1^2(u) & B_2^2(u) \end{bmatrix} \begin{bmatrix} \mathbf{b}_0 \\ \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix}.$$

The Bernstein basis polynomials are all non-negative in the interval of interest, as shown in Figure 4.2.3.1.

Figure 4.2.3.1 – Bernstein basis in the interval [0,1].

Furthermore, the sum of the basis functions for any value of u equals 1:

$$(1 - u)^2 + 2u(1 - u) + u^2 = 1 - 2u + u^2 + 2u - 2u^2 + u^2 = 1.$$

This has two important consequences. First, we can move the curve simply by moving its control points. To see why, consider a general rigid motion T, which can always be decomposed into a rotation R about the origin, followed by a translation by a vector τ. Applying T to a point p(u) on the curve, and using the fact that R is linear and that the basis functions add to 1, yields

$$
\begin{aligned}
T[\mathbf{p}(u)] &= T\Big[\sum_{i=0}^{2} B_i^2(u)\,\mathbf{b}_i\Big] \\
&= R\Big[\sum_{i=0}^{2} B_i^2(u)\,\mathbf{b}_i\Big] + \boldsymbol{\tau} \\
&= \sum_{i=0}^{2} B_i^2(u)\,R(\mathbf{b}_i) + \Big[\sum_{i=0}^{2} B_i^2(u)\Big]\boldsymbol{\tau} \\
&= \sum_{i=0}^{2} B_i^2(u)\,[R(\mathbf{b}_i) + \boldsymbol{\tau}] \\
&= \sum_{i=0}^{2} B_i^2(u)\,T(\mathbf{b}_i)
\end{aligned}
$$

Therefore the transformed points T(b_i) are the Bézier points for the transformed curve T[p(u)]. The property we just derived is called affine invariance. Observe that an expansion in the power basis does not have this property, which is yet another reason not to use the power basis.
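The invariance property can be checked numerically for a planar quadratic. In this sketch the rigid motion is a rotation by `angle` about the origin followed by a translation by `shift`; the control points, the angle, the shift, and the function names are all illustrative assumptions.

```python
import math

def bezier2(b, u):
    """Point on a planar quadratic Bézier curve with control points b[0], b[1], b[2]."""
    w = ((1 - u) ** 2, 2 * u * (1 - u), u ** 2)
    return tuple(sum(wi * bi[c] for wi, bi in zip(w, b)) for c in range(2))

def rigid_motion(p, angle, shift):
    """Rotate p about the origin by angle, then translate by shift."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1] + shift[0], s * p[0] + c * p[1] + shift[1])

b = [(0.0, 0.0), (1.0, 2.0), (3.0, 0.0)]
angle, shift = 0.7, (5.0, -1.0)
moved_b = [rigid_motion(bi, angle, shift) for bi in b]

for u in (0.0, 0.25, 0.6, 1.0):
    moved_point = rigid_motion(bezier2(b, u), angle, shift)   # transform the curve point
    point_of_moved = bezier2(moved_b, u)                      # evaluate the moved polygon
    assert all(abs(x - y) < 1e-12 for x, y in zip(moved_point, point_of_moved))
```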

A second consequence is that p(u) is a convex combination of the control points, which implies that it lies in the convex hull of the control polygon. This is an important property of Bézier curves. It can be used to compute an enclosure for the curve. Enclosures are widely used in geometric modeling, as we will see later in this course. Figure 4.2.3.2 shows a cubic Bézier curve inside the convex hull of its control polygon.

Figure 4.2.3.2 – Cubic Bézier curve and the convex hull of its control points (000, 001, 011, 111).

By construction (see Section 4.2.1) the curve passes through the endpoints of the control polygon and is tangent to the adjoining edges. But the inner vertices of the polygon do not lie on the curve. This is evident from their blossom arguments, which are unequal. Therefore, Bézier methods represent curves through approximation, not interpolation.

The control points of a Bézier curve provide much better design handles than the coefficients in the power basis. It is relatively easy to shape a curve by moving the control points. Because curve tracing can be done at high speed, a curve design system can provide immediate graphic feedback to the user as he or she drags the control points. We shall see in the next section that B-spline curves are even better for shape design than their Bézier counterparts. The major advantages of the Bézier formulation are its numerical stability and the simplicity of the algorithms associated with it.

The blossom computation can also be used to evaluate the Bézier points for a different parameter interval in the curve. Suppose that we are given the cubic shown in Figure 4.2.3.3, with the control polygon defined by the points 000, 001, 011, and 111, as usual. We know that the Bézier points for the parameter interval [0,k] are 000, 00k, 0kk, and kkk. These points can be calculated by the blossom algorithm, and are also shown in the figure for 0 < k < 1. Now, we can similarly calculate the Bézier points for the interval [k,1], i.e., the points kkk, kk1, k11 and 111. We have not changed the geometry of the curve, because we are still working with the same blossom, and a blossom is uniquely associated with a curve. But we have replaced the original control polygon by two new polygons, one for the parameter interval [0,k] and the other for the interval [k,1]. Therefore we have subdivided the curve into two Bézier curve segments, each of which can be manipulated separately. Note, however, that if we move the newly-found control points independently, the two segments in general will no longer belong to a single cubic curve.

Figure 4.2.3.3 – Subdivision of a Bézier cubic curve (original points 000, 001, 011, 111; new points 00k, 0kk, kkk, kk1, k11)
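Subdivision falls out of the same interpolation loop as curve evaluation: the first point of each level gives the left polygon (000, 00k, 0kk, kkk) and the last point of each level gives the right polygon (kkk, kk1, k11, 111). The sketch below assumes control points on [0, 1] stored as tuples; function names and sample data are illustrative.

```python
def lerp(p, q, u):
    return tuple((1.0 - u) * a + u * b for a, b in zip(p, q))

def subdivide(points, k):
    """Split a Bézier curve at parameter k into left and right control polygons."""
    left, right = [points[0]], [points[-1]]
    pts = list(points)
    while len(pts) > 1:
        pts = [lerp(p, q, k) for p, q in zip(pts, pts[1:])]
        left.append(pts[0])      # 00k, 0kk, kkk for a cubic
        right.append(pts[-1])    # k11, kk1, kkk for a cubic
    return left, right[::-1]

def evaluate(points, u):
    """De Casteljau evaluation, used here only to check the subdivision."""
    pts = list(points)
    while len(pts) > 1:
        pts = [lerp(p, q, u) for p, q in zip(pts, pts[1:])]
    return pts[0]

cubic = [(0.0, 0.0), (1.0, 3.0), (3.0, 3.0), (4.0, 0.0)]
left, right = subdivide(cubic, 0.4)
print(left)
print(right)
```

Each returned polygon is parameterized over its own local interval [0, 1], so the left segment at local parameter s matches the original curve at 0.4·s.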

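The blossom algorithm can also be used to find the de Boor points of a curve, which are of the form

$$\mathbf{d}_0 = k_0 k_1, \quad \mathbf{d}_1 = k_1 k_2, \quad \mathbf{d}_2 = k_2 k_3$$

for a quadratic curve,

$$\mathbf{d}_0 = k_0 k_1 k_2, \quad \mathbf{d}_1 = k_1 k_2 k_3, \quad \mathbf{d}_2 = k_2 k_3 k_4, \quad \mathbf{d}_3 = k_3 k_4 k_5$$

for a cubic, and so on for higher degrees.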

Given the de Boor points, the curve can also be traced by using the blossom algorithm, aswe saw in the previous section. A point on the curve is a linear combination of the de Boorcontrol points. Therefore, we have changed the basis of the polynomial vector space, fromthe Bernstein basis to a new one, which is called the B-spline basis. (The B stands forbasis.) In the B-spline representation, the curve generally does not go through any of its deBoor control points. We will see in the next section, that the B-spline basis has significantadvantages from the point of view of shape design.


The blossom algorithm is all we need to convert between the two bases, and to trace a curve in either representation. To convert from the Bernstein basis to the B-spline basis, we use the blossom computation to find the de Boor points starting with the Bézier points. To convert from the B-spline representation to its Bézier counterpart we evaluate the Bézier points from the de Boor points. To trace the curve we evaluate successive points on the curve by the blossom algorithm, starting either with the Bézier or the de Boor points.

4.2.4 B-Spline Curves

A single polynomial curve of degree d has only d + 1 degrees of freedom. Suppose that we want to interpolate or approximate a large number of points, to gain a finer control of curve shape. There are two options. We can use a polynomial of high degree, or a piece-wise polynomial function of low degree, which amounts to several polynomial segments joined together smoothly. High degree polynomials often have undesirable oscillations that are difficult to control, and also have poor computational behavior because of numerical errors. Piece-wise polynomial functions, called splines, are usually preferable. Spline curves got their name from physical splines, which consist of flexible metal or wooden strips adjusted by means of weights, and have been used for many years for designing curves.

Figure 4.2.4.1 introduces some of the nomenclature. The u axis is broken into intervals, called spans, delimited by knots. Within each span the spline f(u) coincides with a polynomial of low degree, typically a cubic. Different spans correspond to different polynomials, and therefore also to different blossoms. At the knots certain continuity conditions are required. For degree d, we can obtain up to $C^{d-1}$ continuity. Note that $C^d$ continuity at a knot would force the two polynomials in adjacent spans to coincide, and we would no longer have a true spline. We will see below that additional knots are often introduced before and after the desired spans, and therefore the number of knots usually is not s + 1.

[Figure: a spline f(u) plotted against u, with the knots $k_0, k_1, k_2, \ldots, k_{s-1}, k_s$ marked on the u axis]

Figure 4.2.4.1 – A spline over s spans, and its knot sequence.

Within each span the spline is an ordinary polynomial, and these satisfy all the vector space axioms. Addition of two $C^m$ functions, or multiplication of a $C^m$ function by a scalar, produces another function with the same continuity properties. Therefore all the splines f(u) of degree at most d, defined over a knot sequence $\{k_i\}$, and with given continuity conditions at each knot, constitute a vector space. What is the dimension of this space? For concreteness, consider cubic splines, assume maximum $C^2$ continuity at all interior knots of the sequence, and no continuity requirements at the end knots. A cubic has 4 degrees of freedom, and therefore the first span contributes 4 dimensions. The second span polynomial, however, is constrained by 3 continuity conditions (for the function and its first two derivatives) at $k_1$, the first knot of the second span. Therefore the second span only contributes 4 – 3 = 1 additional dimension. The same reasoning applies to all the other remaining spans, including the last one. Hence, the total number of degrees of freedom, or the dimension of the vector space, is 4 + (s – 1). Similar arguments apply to arbitrary degrees. In general, for degree d, s spans, and maximal continuity at interior knots, the dimension of the space of splines is d + s.

To construct a spline one could use a Bézier polynomial for each span, and introduce constraints to ensure continuity at the knots. But this quickly becomes very complicated. A more elegant approach constructs the spline over all the spans simultaneously, and automatically guarantees continuity between spans. The key to this construction is to work with a specific basis for the spline space, called the B-spline basis. The basis vectors themselves have the desired continuity properties, and this ensures that all the other vectors of the space, which are linear combinations of the basis, have the same properties as well.

We already encountered the B-spline form of a single-span curve in Section 4.2.2, and we saw that its corresponding control points are the de Boor points. For a quadratic curve, with uniformly-spaced integer knots, the de Boor points are given by the blossom values 01, 12, 23. These points define the curve in the parametric interval [1,2]. What are the corresponding basis functions? They can be evaluated as usual, by the blossom algorithm, as shown in Figure 4.2.4.2.

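[Figure: blossom evaluation triangle. The de Boor points 01 and 12 are combined with weights (2 − u)/(2 − 0) and (u − 0)/(2 − 0) to give u1; the points 12 and 23 are combined with weights (3 − u)/(3 − 1) and (u − 1)/(3 − 1) to give u2; finally u1 and u2 are combined with weights (2 − u)/(2 − 1) and (u − 1)/(2 − 1) to give uu.]

Figure 4.2.4.2 – B-Spline evaluation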

The basis function that corresponds to the 01 component (control point) can be found easily from the figure by assuming that points 12 and 23 are zero and 01 is unity. The other basis vectors can be computed similarly. The results are:

$$\frac{2-u}{2-0}\cdot\frac{2-u}{2-1} = \tfrac{1}{2}(2-u)^2 = N_0^2(u)$$

$$\frac{u-0}{2-0}\cdot\frac{2-u}{2-1} + \frac{3-u}{3-1}\cdot\frac{u-1}{2-1} = \tfrac{1}{2}\left[u(2-u) + (3-u)(u-1)\right] = N_1^2(u)$$

$$\frac{u-1}{3-1}\cdot\frac{u-1}{2-1} = \tfrac{1}{2}(u-1)^2 = N_2^2(u)$$

where the N’s are the B-spline basis functions, shown in Figure 4.2.4.3.


The basis function $N_1^2(u)$ is shown in its entirety in the figure, whereas the other two are shown only in the interval [1,2]. If we drew these two functions completely, we would see that all the basis functions are shifted versions of one another. (This would not be true if the knots were non-uniform.) We still have 3 control points and 3 basis functions, although they are now the de Boor points and the B-spline basis, instead of the Bézier points and the Bernstein basis. However, we now have four knots instead of the two needed for a Bézier curve. To define the curve in the interval [1,2] we had to add an extra knot before [1,2] and another after it. In general, we need to add d – 1 knots before and after the desired interval. This gives a total of 2d knots that must be considered for each given interval.

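[Figure: the basis functions $N_0^2$, $N_1^2$ and $N_2^2$ plotted over the knots 0, 1, 2, 3; the marked ordinate values are 0.125, 0.5 and 0.75]

Figure 4.2.4.3 – B-Spline basis functions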

The basis functions are non-negative and add to unity for any u. Therefore a B-spline curve, like its Bézier counterpart, has the convex hull property: it lies entirely within the convex hull of its control polygon. It also has the affine invariance property: the curve can be moved in space simply by moving its control points. The support of a basis function, i.e., the region over which the function is non-zero, is precisely 3 spans. In general, a B-spline basis function of degree d has a support of d + 1 spans. This implies that changing the position of the control point that corresponds to a specific basis function only affects the curve over the support of the basis function, which is d + 1 spans. Thus, control point manipulation has a local effect. This is very convenient for curve design, since it allows local tuning of the curve’s shape without causing side effects in other regions.
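The non-negativity and partition-of-unity properties can be checked numerically for the quadratic case. The following Python sketch simply transcribes the three basis functions derived for the span [1,2] with knots 0, 1, 2, 3; the function names n0, n1, n2 are hypothetical, and the formulas are valid only for u in [1,2].

```python
def n0(u):
    """N_0^2(u) on the span [1, 2]: weight of the de Boor point 01."""
    return 0.5 * (2 - u) ** 2


def n1(u):
    """N_1^2(u) on the span [1, 2]: weight of the de Boor point 12."""
    return 0.5 * (u * (2 - u) + (3 - u) * (u - 1))


def n2(u):
    """N_2^2(u) on the span [1, 2]: weight of the de Boor point 23."""
    return 0.5 * (u - 1) ** 2


def check_partition_of_unity(samples=11):
    """Return True if the three weights are non-negative and sum to 1 on [1, 2]."""
    for step in range(samples):
        u = 1 + step / (samples - 1)
        weights = (n0(u), n1(u), n2(u))
        if min(weights) < 0 or abs(sum(weights) - 1) > 1e-12:
            return False
    return True
```

At u = 1.5 the three weights are 0.125, 0.75 and 0.125, which agrees with the ordinate values marked in Figure 4.2.4.3.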

B-Splines are especially interesting when there are several spans. We use the de Boor points for successive spans, and share some of them across adjacent spans. For example, a quadratic spline with 6 control points covers the 4 spans shown in Figure 4.2.4.4. Each triple of de Boor points, at the top of the figure, contributes to the 1-span interval shown.


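[Figure: the knots 0, 1, 2, 3, 4, 5, 6 and the de Boor points 01, 12, 23, 34, 45, 56; each triple of consecutive de Boor points controls one of the spans (1,2), (2,3), (3,4), (4,5)]

Figure 4.2.4.4 – A B-spline over the four-span interval [1,5].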

The vector space of quadratic splines that corresponds to this 4-span example has dimension s + d = 6. Therefore we should have 6 basis functions and 6 control points. The basis functions are translated versions of the bell-shaped curve shown in Figure 4.2.4.3. Each has a support of d + 1 = 3 spans. Note that we had to add d – 1 = 1 knot at the beginning and at the end of the desired 4-span interval [1,5]. This gives a total of

$$(d-1) + (s+1) + (d-1) = s + 2d - 1$$

knots for the general case, or 7 for our example. If the knots are a sequence of integers starting with 0, for s spans and degree d, the sequence is

$$\underbrace{0, \ldots, d-2}_{\text{initial } d-1 \text{ knots}},\quad \underbrace{d-1, \ldots, s+d-1}_{\text{desired } s \text{ spans}},\quad \underbrace{s+d, \ldots, s+2d-2}_{\text{final } d-1 \text{ knots}}.$$

To evaluate the spline we use the blossom algorithm with the de Boor points that correspond to each desired span. For example the points 12, 23, 34 correspond to the interval [2,3]. In general, for a spline of degree d and s spans, with knot sequence (0, 1, 2, . . . , s + 2d – 2), the d + 1 de Boor points

$$i(i+1)\cdots(i+d-1),\quad (i+1)(i+2)\cdots(i+d),\quad \ldots,\quad (i+d)(i+d+1)\cdots(i+2d-1)$$

correspond to the span [i + d – 1, i + d], where 0 ≤ i ≤ s – 1. (Recall that each blossom for degree d has d arguments, which here are knot values.) There are 2d knots that affect this calculation:

$$i,\; i+1,\; \ldots,\; i+2d-1.$$

It is convenient to index the de Boor points $\mathbf{d}$ by the value of the leading knot in the corresponding blossom. The sequence of de Boor points above is then simply

$$\mathbf{d}_i,\; \mathbf{d}_{i+1},\; \ldots,\; \mathbf{d}_{i+d},$$

with 0 ≤ i ≤ s – 1.
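As an illustration of the indexing convention, here is a minimal Python sketch of span evaluation by repeated affine combination. It assumes the convention used above, in which ctrl[m] is the de Boor point whose blossom starts at knots[m] and span i is the interval [knots[i + d - 1], knots[i + d]]. The function name de_boor and its argument layout are hypothetical choices for this sketch, and it assumes distinct knots so that no denominator vanishes.

```python
def de_boor(ctrl, knots, d, i, u):
    """Evaluate span i of a degree-d B-spline at u by the blossom recursion.

    ctrl[m] is the de Boor point with blossom arguments knots[m..m+d-1].
    The result is meaningful for knots[i + d - 1] <= u <= knots[i + d].
    """
    # Level 0: the d + 1 de Boor points d_i, ..., d_{i+d} of this span.
    pts = [ctrl[i + m] for m in range(d + 1)]
    for r in range(1, d + 1):
        new_pts = []
        for m in range(d - r + 1):
            # The two parents differ in one blossom argument: a versus b.
            a = knots[i + m + r - 1]
            b = knots[i + m + d]
            w = (u - a) / (b - a)
            new_pts.append(tuple((1 - w) * p + w * q
                                 for p, q in zip(pts[m], pts[m + 1])))
        pts = new_pts
    # After d levels a single point remains: the blossom value u u ... u.
    return pts[0]
```

For the quadratic example with knots 0, 1, 2, 3 and i = 0, the weights produced by the loop are exactly those of Figure 4.2.4.2. Evaluating two adjacent spans at their common knot gives the same point, which reflects the continuity that the B-spline form provides automatically.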


Each span of a spline has a corresponding polynomial and a blossom associated with it. Therefore, we can also represent the polynomial within a span by its Bézier points, and these can be computed by the blossom algorithm from their de Boor counterparts. Doing this for all the spans of a B-spline, we convert it into a sequence of Bézier curves, each represented by its Bézier points. The endpoints of the Bézier control polygons are shared between adjacent spans. This conversion is very useful, because many of the algorithms for curve evaluation and manipulation are significantly faster for Bézier curves than for B-splines. Some curve and surface modelers provide B-splines for curve and surface design, because they automatically ensure the desired continuity across spans, but convert them internally into the Bézier form for computational purposes.

Our discussion of B-splines can be generalized in several directions. First, we have tacitly assumed that all the knots are distinct. If there are coincidences, the previous results apply, with minor modifications. One can show that two coincident knots correspond to the loss of one order of continuity. For an intuitive feeling of why this happens, consider the span in Figure 4.2.4.5, and imagine that the width tends to zero, which implies that the two knots coincide in the limit. The spline originally is continuous over the span, but as the width approaches zero, the function has to vary more and more rapidly, and eventually jumps between the two values in the adjacent spans. We will return to the topic of multiple knots at the end of this section.

Figure 4.2.4.5 – When knots coincide, a spline loses continuity.

As a second generalization, assume that the knots are not the integer values 0, 1, ..., but rather take arbitrary values in a non-decreasing sequence

$$k_0,\; k_1,\; \ldots,\; k_{s+2d-2}.$$

All of our theory and algorithms are still valid, provided that we replace

$$0 \to k_0,\qquad 1 \to k_1,\qquad \ldots,\qquad s + 2d - 2 \to k_{s+2d-2}.$$

In general the knots are not evenly spaced and the resulting spline is called non-uniform.

Finally, one can work in homogeneous coordinates, with each coordinate being a non-uniform B-spline. Since normalization involves division by the w coordinate, the resulting curves are no longer polynomial in Euclidean space. Each coordinate is a ratio of polynomials, and therefore it is a rational function. The resulting splines are called Non-Uniform Rational B-Splines (NURBS), and are widely used for modeling free-form or sculptured objects. NURBS curves can represent conics such as circles and hyperbolas exactly, and can be manipulated in projective space, which is sometimes advantageous. (Note that parametric quadratic polynomials are parabolas.)

Let us return now to the multiple knot case. The de Boor points for a span of a quadratic B-spline that corresponds to a parametric interval [b, c] must be of the form

ab bc cd

Four knots, a, b, c, d, are involved, but they need not be all distinct. For example, if a = b we have a double knot at the beginning of the sequence. The blossom values of the de Boor points become

aa ac cd

and the parametric interval becomes [a, c].

If we let the last two knots coincide as well, we obtain the control points

aa ad dd

and the corresponding interval [a, d]. But these control points are actually the Bézier points for the same blossom (and hence curve). Therefore, a quadratic Bézier curve is simply a special case of a single-span B-spline with double knots at the beginning and end of the parametric interval. In general, a Bézier curve of degree d is a single-span B-spline of the same degree, with knots of multiplicity d at the beginning and end of the parametric interval.

Now we will see that the blossom algorithm for curve evaluation and the subdivision algorithm amount essentially to a knot insertion procedure. Consider the standard de Boor points for a quadratic span

01 12 23

and evaluate the spline at u = 1.5, as shown in Figure 4.2.4.6.

[Figure: the de Boor points 01, 12, 23 and the intermediate blossom points (1,1.5), (1.5,2) and (1.5,1.5)]

Figure 4.2.4.6 – Spline evaluation for u = 1.5.

Observe that the triple

(0,1) (1,1.5) (1.5,1.5)


is the set of de Boor points with knots 0, 1, 1.5, 1.5, and corresponding parametric interval [1, 1.5]. Similarly, the triple

(1.5,1.5) (1.5,2) (2,3)

is the set of de Boor points with knots 1.5, 1.5, 2, 3 and interval [1.5, 2]. Therefore our evaluation procedure has subdivided the original spline for the parameter interval [1,2] into two spline segments, one for the interval [1, 1.5] and the other for [1.5, 2], and has computed the de Boor points for each of the new spans. Each of the new curve segments has a double knot at the joint between the segments. Therefore we have inserted into the original knot sequence a double knot at 1.5. In general, for a spline of degree d, evaluation and subdivision are accomplished by the insertion of a knot of multiplicity d.

Continuing this line of reasoning, we can think of the conversion of a quadratic B-spline to the Bézier form as knot insertion, so as to obtain double knots at the beginning and end of the desired span.
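The view of evaluation as knot insertion can be made concrete for the quadratic span with knots 0, 1, 2, 3. The Python sketch below (the name split_quadratic_span is hypothetical) evaluates at u and returns the two triples of de Boor points for the new spans [1, u] and [u, 2]. It assumes 1 ≤ u ≤ 2 and points given as tuples.

```python
def split_quadratic_span(d01, d12, d23, u):
    """Insert a double knot at u into the quadratic span [1, 2] (knots 0, 1, 2, 3).

    Returns the de Boor triples (01, 1u, uu) and (uu, u2, 23).
    """
    def mix(p, q, w):
        # Affine combination (1 - w) * p + w * q.
        return tuple((1 - w) * a + w * b for a, b in zip(p, q))

    p1u = mix(d01, d12, (u - 0) / (2 - 0))   # blossom value (1, u)
    pu2 = mix(d12, d23, (u - 1) / (3 - 1))   # blossom value (u, 2)
    puu = mix(p1u, pu2, (u - 1) / (2 - 1))   # blossom value (u, u), on the curve
    return (d01, p1u, puu), (puu, pu2, d23)
```

With u = 1.5 the first triple corresponds to (0,1) (1,1.5) (1.5,1.5) and the second to (1.5,1.5) (1.5,2) (2,3), as in Figure 4.2.4.6.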

4.3 Representations for Surfaces

Both parametric and implicit surfaces are useful in geometric modeling. The most important examples of parametric surfaces are Bézier and B-spline bounded surfaces, or patches. In practice, implicit algebraic surfaces are typically of low degree, and the most commonly used are the quadrics, or second degree surfaces.

4.3.1 Bézier and B-Spline Patches

Both Bézier and B-spline curves of degree d can be written as

$$\mathbf{p}(u) = \sum_{i=0}^{n} \mathbf{v}_i\,\phi_i^d(u)$$

where the $\mathbf{v}$’s are the control points (sometimes also called control vertices), the $\phi(u)$ are the basis functions, and n is either d, for Bézier curves, or s + d – 1, for B-splines with s spans (recall that the spline space has dimension s + d). This expression generalizes readily to surface patches as

$$\mathbf{p}(u,v) = \sum_{i=0}^{n}\sum_{j=0}^{m} \mathbf{v}_{ij}\,\phi_{ij}^d(u,v).$$

Now we have two parameters u and v, and a grid of control vertices, and new basis functions that depend on both parameters. The most common patches are called tensor-product surfaces, and have basis functions that are the products of two of the familiar 1-dimensional Bernstein or B-spline basis functions:

$$\mathbf{p}(u,v) = \sum_{i=0}^{n}\sum_{j=0}^{m} \mathbf{v}_{ij}\,\phi_i^d(u)\,\phi_j^d(v).$$

This surface equation can be re-written as

$$\mathbf{p}(u,v) = \sum_{i=0}^{n}\left[\sum_{j=0}^{m} \mathbf{v}_{ij}\,\phi_j^d(v)\right]\phi_i^d(u) = \sum_{i=0}^{n} \mathbf{c}_i(v)\,\phi_i^d(u)$$

where

$$\mathbf{c}_i(v) = \sum_{j=0}^{m} \mathbf{v}_{ij}\,\phi_j^d(v).$$

These expressions provide us with a simple recipe for computing points on the surface. First fix v. For each i, compute the value $\mathbf{c}_i(v)$ as a point on a Bézier or B-spline curve defined by the control points $\mathbf{v}_{ij}$, j = 0, ..., m. Next, consider the c’s as new control points, and compute p(u,v) as a point on a curve with such control points. The procedure is illustrated in Figure 4.3.1 for a biquadratic Bézier patch.

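[Figure: the three control polygons i = 0, i = 1, i = 2 of a biquadratic patch, the intermediate points c0, c1, c2, and the surface point p(u,v)]

Figure 4.3.1 – Evaluating a point on a biquadratic Bézier patch.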

For each fixed i = 0, 1, 2 we apply the blossom algorithm with the desired value of v to the appropriate control polygon to find the c’s. These form a new control polygon. We apply the blossom procedure to the new control polygon with the desired u. The result is the point p(u,v) on the surface. Observe that we have a grid of 9 control points for the patch. The surface only interpolates the 4 corner points, and the c points obtained in the first phase of the computation generally do not lie on the surface.
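The two-phase recipe can be sketched in Python for a Bézier patch of any degree. The names de_casteljau and patch_point are hypothetical, and the layout grid[i][j] for the control vertex with indices i and j is an assumption of this sketch. Repeated affine combination is used for the curve evaluation, which for Bézier points over [0,1] is the blossom algorithm with all arguments equal.

```python
def de_casteljau(points, t):
    """Point at parameter t on the Bezier curve with the given control points."""
    pts = list(points)
    while len(pts) > 1:
        # One level of affine combinations between neighboring points.
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]


def patch_point(grid, u, v):
    """Evaluate a tensor-product Bezier patch; grid[i][j] is the vertex v_ij."""
    # Phase 1: for each fixed i, evaluate the curve in v to obtain c_i(v).
    c = [de_casteljau(row, v) for row in grid]
    # Phase 2: treat the c's as a new control polygon and evaluate it at u.
    return de_casteljau(c, u)
```

For a 3 by 3 grid this reproduces the biquadratic case of Figure 4.3.1: at (u, v) = (0, 0) the result is the corner vertex grid[0][0], while an interior vertex such as grid[1][1] is in general not on the surface.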

Other computations we studied for curves, for example, subdivision, also generalize easily to surfaces.

4.3.2 Quadrics

The quadrics are the algebraic surfaces of second degree. The implicit equation for a generic quadric may be written as

$$ax^2 + by^2 + cz^2 + 2dxy + 2eyz + 2fxz + 2gx + 2hy + 2jz + k = 0.$$

Therefore the surface can be represented by an array containing the coefficients of this implicit equation. More elegantly, the coefficients can be collected in the following symmetric matrix

$$Q = \begin{bmatrix} a & d & f & g \\ d & b & e & h \\ f & e & c & j \\ g & h & j & k \end{bmatrix}.$$

If we write the generic point of 3-space in homogeneous coordinates

$$X = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$$

the equation of a quadric in homogeneous coordinates becomes

$$X^t Q X = 0.$$

This equation is convenient for applying geometric transformations to a quadric surface, and for other practical and theoretical purposes.
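As a quick check of the matrix form, the following Python sketch builds Q from the ten coefficients and evaluates the quadratic form for a point with w = 1. The function names quadric_matrix and quadric_value are hypothetical, and plain nested lists stand in for a matrix type.

```python
def quadric_matrix(a, b, c, d, e, f, g, h, j, k):
    """Symmetric 4x4 matrix Q built from the implicit-equation coefficients."""
    return [[a, d, f, g],
            [d, b, e, h],
            [f, e, c, j],
            [g, h, j, k]]


def quadric_value(Q, x, y, z, w=1.0):
    """Value of X^t Q X for the homogeneous point X = (x, y, z, w)."""
    X = (x, y, z, w)
    return sum(X[r] * Q[r][s] * X[s] for r in range(4) for s in range(4))


# Unit sphere x^2 + y^2 + z^2 - 1 = 0: a = b = c = 1 and k = -1.
unit_sphere = quadric_matrix(1, 1, 1, 0, 0, 0, 0, 0, 0, -1)
```

With w = 1 the quadratic form expands to the left-hand side of the implicit equation, so it is zero on the surface; for the unit sphere above it is negative inside and positive outside.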

The most common quadric surfaces are the sphere, the cylinder and the cone, collectively known as the natural quadrics. These are typically represented by primitive instancing. For example, a sphere may be represented by a point (the center) and the radius, a cylinder by an applied vector (along the axis) and the radius, and the cone by an applied vector (along the axis and anchored at the apex) and the aperture angle. Alternatively, we may define the surfaces in an agreed standard pose, and represent them by size parameters and a rigid motion that takes the surface from its standard pose to its actual pose. For a cylinder, for example, the standard pose could be such that the cylinder’s axis coincides with the z-axis. Then, the radius of the cylinder plus a rigid motion would suffice to represent any of the cylinder’s instances. Representations that include a rigid motion are wasteful of storage, but are computationally convenient. (They were used extensively in the PADL systems developed at the University of Rochester in the 1970s and early 1980s.) Suppose, for example, that we want to intersect a line with a cylinder. This is straightforward in a frame in which the cylinder is in standard position, and we can easily transform the line to such a frame by using the rigid motion in the cylinder’s representation.
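The line-cylinder example can be sketched as follows. The assumptions are all choices of this sketch, not of the text: the cylinder is unbounded, its standard pose has the axis along z, and the rigid motion is stored as a 3x3 rotation matrix R (nested lists) and a translation t with p_world = R p_local + t. The function names are hypothetical.

```python
import math


def rotate_to_local(R, v):
    """Apply the inverse rotation R^t to a vector."""
    return [sum(R[r][c] * v[r] for r in range(3)) for c in range(3)]


def to_local(R, t, p):
    """Apply the inverse rigid motion R^t (p - t) to a point."""
    return rotate_to_local(R, [p[i] - t[i] for i in range(3)])


def line_cylinder(R, t, radius, origin, direction):
    """Parameters s at which origin + s * direction meets the cylinder."""
    o = to_local(R, t, origin)
    dv = rotate_to_local(R, direction)
    # In standard pose the cylinder is x^2 + y^2 = radius^2, so only the
    # x and y components of the line enter the quadratic A s^2 + B s + C = 0.
    A = dv[0] ** 2 + dv[1] ** 2
    B = 2 * (o[0] * dv[0] + o[1] * dv[1])
    C = o[0] ** 2 + o[1] ** 2 - radius ** 2
    if A == 0:
        return []  # line parallel to the axis: no isolated intersections
    disc = B * B - 4 * A * C
    if disc < 0:
        return []  # the line misses the cylinder
    root = math.sqrt(disc)
    return sorted({(-B - root) / (2 * A), (-B + root) / (2 * A)})
```

The intersection parameters are the same in both frames, so the points can be recovered in world coordinates directly from the original line.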

GMCh5 3/2/99 5-1



5. Solids

This chapter discusses 3-D solids. First we address mathematical modeling issues, and then representations. We consider only single objects that are rigid and made of homogeneous materials. Assemblies of several components and inhomogeneous objects raise additional issues, which are briefly discussed in the section on Further Explorations.

Solids may be represented by using the various methods discussed in Chapter 3. Solidity does not raise significant new issues for most of these methods. However, BReps of solids deserve a more detailed treatment than that provided in Chapter 3, and are the main focus of our representational discussion in this chapter.

5.1 Mathematical Models for Rigid and Homogeneous Solids

Most of the physical objects we encounter in real life are solids. Sometimes they can be modeled as surfaces or curves. For example, in stress analysis, thin shells are usually analyzed as if their thickness were effectively zero. But full 3-D models are required for many applications. In addition, we often need to model not only solid objects but also operations on them. For example, fabrication processes such as machining or welding are important in CAD/CAM. The geometrical aspects of machining may be modeled as the (regularized) difference between the initial state of the workpiece and the volume swept by the cutter in its motion. Welding and other additive processes may be modeled by set union.

Computationally-useful mathematical models for rigid solids should exhibit the following properties.

Rigidity – This is easily achieved since the distances and angles among points of a set in Euclidean space are fixed. Rigid motions preserve distances and angles. Therefore all the instances of a set obtained from one another by rigid motions can be used to model a rigid object in all its poses.

Finiteness – A physical object should have a finite extent. To ensure finiteness all we need is to require that our sets be bounded.

Solidity – A model for a solid should be homogeneously 3-D, without dangling faces or edges. We saw earlier that this requirement is met by regular sets in 3-space.

Closure under Boolean operations – Boolean operations applied to solids should produce other solids. This has two important advantages. First, the results of a Boolean operation can be used as inputs to other Booleans, and a solid model can therefore be incrementally constructed by successive Boolean operations (perhaps interspersed with other operations). Second, subtractive and additive manufacturing-process models are guaranteed to produce solids.

Finite describability – Point sets used to model solids should be describable by a finite amount of data, to ensure that they can be represented in computers, which have finite memories. (Here we assume that real numbers can be represented exactly in computers, i.e., we use the Real Random Access Machine model of computation.) Polyhedra can be described finitely by the coordinates of their vertices plus connectivity information that specifies how vertices define edges and faces. But polyhedral models have too small a domain. What we need are models akin to polyhedra, but with curved faces.

Boundary determinism – A solid should be modeled by a set that is unambiguously defined by its boundary. This may seem a truism, but actually there are examples, such as the Lakes of Wada discussed in [Hocking & Young 1988], in which three or more bounded sets all have the same boundary. Sets that exhibit such behavior are counter-intuitive and poor models for physical objects. In addition, without boundary determinism we will not be able to use BReps, which are one of the most popular schemes for representing solids.

It can be shown that sets that are bounded, regular and semi-algebraic possess all the desired properties, and therefore provide appropriate models for solids. These sets are usually called simply r-sets. Intuitively, r-sets are curved polyhedra with faces lying on algebraic surfaces. A more precise characterization follows.

A semi-algebraic half space is a set of points that satisfy an algebraic inequality

{p : f (p) ≤ 0}

where f is a polynomial. For example, the inequality

ax + by + cz + d ≤ 0

defines a planar half space, i.e., the portion of 3-space which lies to one side of the plane defined by the equation

ax + by + cz + d = 0 .

A semi-algebraic set is the result of a finite number of (standard, unregularized) set-theoretic operations on semi-algebraic half spaces. For example, a finite solid cylinder is the intersection of three semi-algebraic half spaces. One of these is cylindrical and the other two are planar, as shown schematically in Figure 5.1.1. (Each half space itself is unbounded; only bounded portions of the half space boundaries are shown in the figure.)
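The finite cylinder example translates directly into a point membership test. The Python sketch below assumes a particular placement chosen only for illustration: the axis is the z-axis, and the two planar half spaces are z ≥ 0 and z ≤ height. Each half space is written in the form f(p) ≤ 0, and the function names are hypothetical.

```python
def cylindrical_half_space(p, radius):
    """f(p) = x^2 + y^2 - radius^2 <= 0 (unbounded along z)."""
    return p[0] ** 2 + p[1] ** 2 - radius ** 2 <= 0


def bottom_half_space(p):
    """f(p) = -z <= 0, i.e. the half space above the plane z = 0."""
    return -p[2] <= 0


def top_half_space(p, height):
    """f(p) = z - height <= 0, i.e. the half space below the plane z = height."""
    return p[2] - height <= 0


def in_finite_cylinder(p, radius, height):
    """Membership in the intersection of the three half spaces."""
    return (cylindrical_half_space(p, radius)
            and bottom_half_space(p)
            and top_half_space(p, height))
```

Set intersection corresponds to the logical conjunction of the three inequalities; union and complement would correspond to disjunction and negation in the same way.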

Because –f is also a polynomial, and –f ≤ 0 is equivalent to f ≥ 0, we could have defined semi-algebraic half spaces with inequalities of the form f ≥ 0. Furthermore, the intersection of the two half spaces f ≥ 0 and f ≤ 0 is the set defined by the equation f = 0, and this is an algebraic set, or algebraic variety, defined earlier in Chapter 4. Therefore algebraic sets are special cases of semi-algebraic sets.

Figure 5.1.1 – A finite cylinder is the intersection of a cylindrical and two planar half spaces

It can be shown that the interior, boundary and closure of a semi-algebraic set are also semi-algebraic. Therefore, a finite number of regularized Boolean operations on semi-algebraic sets produces another semi-algebraic set. This implies that a set defined by CSG on semi-algebraic half space primitives also is semi-algebraic. Furthermore, if the primitives are r-sets, the result also is an r-set, because regularized Booleans preserve boundedness, regularity and semi-algebraicity. This implies that CSG representations in the domain of r-sets are always valid.

A polynomial has a finite number of coefficients, and a semi-algebraic set is the result of a finite number of (non-regularized) Boolean operations on a finite number of half spaces defined by polynomial inequalities. Therefore, a semi-algebraic set is always finitely describable.

It is also true, but not trivially proved, that a bounded semi-algebraic set in 3-space is determined uniquely by its boundary, which is semi-algebraic as well. In addition, semi-algebraic sets do not exhibit certain pathological behaviors found in more general classes of sets. For example, there are sets such as “Alexander’s horned sphere” or “Antoine’s necklace” [Hocking & Young 1988], that are homeomorphic to spheres, and yet the portion of 3-space which lies inside each of them is not a topological ball. Such sets are said to be “wildly imbedded” in Euclidean space. Semi-algebraic sets cannot be wildly imbedded; they are always “tamely imbedded”.

In summary, r-sets are bounded, regular, semi-algebraic sets. They are rigid, finite, solid, closed under Boolean operations, finitely-describable, and uniquely determined by their boundaries. Therefore, they provide suitable models for rigid, homogeneous solid objects.

Some authors and systems prefer a more restricted class of models: r-sets that are bordered, connected 3-manifolds. Such sets have boundaries that are themselves closed 2-manifolds. Manifold models have computational advantages. For example, one can be sure that an edge is shared by only two faces, and this simplifies certain algorithms. However, manifolds are not closed under Boolean operations, as shown by the glued-cube examples of Figure 4.1.3.2.

R-sets need not be connected. For example, two disjoint cubes are an r-set. These cubes are considered rigidly linked, and move together when rigid motions are applied to them. In typical manifold-based systems, the two cubes are considered two distinct objects that can be “assembled”, although they need not even be near each other. The two objects can be moved together by applying a rigid motion to the assembly. Solids or assemblies made out of components that are not in contact with each other are physically counter-intuitive, but cause no significant mathematical or computational difficulties.


5.2 Boundary Representations for Solids

Boundary schemes are the most widely used representations for solids. The following sections discuss basic concepts in boundary representations, and then focus on validity issues, and how to help construct valid representations by means of so-called Euler operators.

5.2.1 Boundary Graphs

Essentially, a BRep is a graph structure with nodes corresponding to faces, edges and vertices in a cell decomposition of the solid’s boundary. Links between the nodes express connectivity information. Figure 5.2.1.1 provides a simple example. The BRep graph is shown only partially.

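[Figure: (a) a pyramid; (b) part of its BRep graph. A Solid node is linked to the nodes Face1 to Face5; face nodes are linked to edge nodes (Edge2 and Edge5 are shown); edge nodes are linked to vertex nodes (Vertex2 and Vertex3 are shown); vertex nodes are linked to coordinates (x,y,z). The upper levels are labeled Combinatorial Structure and the coordinates Metric Information.]

Figure 5.2.1.1 – A pyramid (a) is represented by a graph (b) containing face, edge and vertex nodes.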


Observe that the top levels of the graph in the figure contain only connectivity information. Together they constitute the combinatorial structure of the representation. The vertex coordinates are the metric information associated with the representation. In the geometric modeling jargon, the combinatorial structure is often called the topology, and the metric information the geometry.

In the representation shown in the figure, the faces are all simple polygons, without holes. When the polygons are all of the same type, for example, all quadrangles, or all triangles, the BRep is called a tessellation. Tessellations with triangular faces are called triangulations. Tessellations are used extensively in hardware accelerators for rendering and other applications.

Faces with holes are used by many BRep modelers. They normally are represented by introducing an additional level in the combinatorial structure between faces and edges. Edges are grouped into closed circuits or loops, and face nodes point to loop nodes, which in turn are linked to their associated edge nodes.
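A boundary graph with the extra loop level might be laid out as in the following Python sketch. All class and field names are hypothetical, and the convention that the first loop of a face is its outer boundary is an assumption made here for illustration; actual modelers differ in the details.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    xyz: tuple              # metric information ("geometry")


@dataclass
class Edge:
    start: Vertex           # combinatorial links ("topology")
    end: Vertex


@dataclass
class Loop:
    edges: list             # closed circuit of Edge nodes


@dataclass
class Face:
    loops: list             # assumed: loops[0] is the outer boundary, the rest are holes


@dataclass
class Solid:
    faces: list = field(default_factory=list)


def face_vertices(face):
    """Collect the distinct vertex nodes reachable from a face, in traversal order."""
    found = []
    for loop in face.loops:
        for edge in loop.edges:
            for vertex in (edge.start, edge.end):
                # Compare by identity: two nodes with equal coordinates stay distinct.
                if not any(vertex is seen for seen in found):
                    found.append(vertex)
    return found
```

Only the Vertex nodes carry coordinates; everything above them is pure connectivity, which mirrors the split between combinatorial structure and metric information in Figure 5.2.1.1.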

For polyhedral objects it is usually clear what is meant by a face or an edge. But for curved objects intuition alone does not suffice. For example, what are the faces and edges of the object shown in Figure 5.2.1.2? For solids bounded by free form or sculptured surfaces, it is usually impossible to determine by inspection which portions of the boundary are the actual faces in the underlying representation.

Figure 5.2.1.2 – What are the faces?

Faces and edges are best viewed as representational artifacts that must be carefully defined by the designers of BRep schemes. First, let us assume that there is a set of primitive surfaces (for example, planes and natural quadrics) for a boundary scheme. Then, typical face definitions should satisfy the following conditions.

1. A face is a subset of the solid’s topological boundary.
2. The union of all the faces equals the boundary of the solid.
3. Each face is a subset of only one primitive surface instance.
4. A face is homogeneously 2-D; it has no dangling vertices or edges.
5. A face is a connected set.
6. Faces are quasi-disjoint, i.e., meet only at edges or vertices.
7. A face is the largest subset of the boundary that satisfies all the previous conditions.


It can be shown that, under mild conditions, faces that satisfy these conditions are defined uniquely [Silva 19??]. This does not imply that the corresponding BReps are unique. Two representations for the same object may still differ simply because of a different ordering of entities (permutational non-uniqueness), or different poses for the object (positional non-uniqueness). The faces just defined are sometimes called c-faces, because they are connected. Not all schemes use c-faces. For example, the PADL-1 system used so-called b-faces which could have overlaps, and PADL-2 used m-faces, which need not be connected.

Faces of curved objects are represented as discussed in Chapter 4. Typically this involves a representation for a host surface or patch, a set of bounding edges, and neighborhood information. Host surface representation involves additional metric information (or “geometry”) beyond vertex coordinates.

Edges also must be defined precisely. Typical BRep schemes use edges that satisfy the following conditions.

1. An edge is a subset of a face’s boundary.
2. The union of all the edges associated with a face equals the face’s boundary.
3. Each edge is a subset of the intersection of two primitive surface instances.
4. An edge is a compact, connected 1-manifold. (Some schemes require that an edge be homeomorphic to a line segment.)
5. Edges are quasi-disjoint, i.e., meet only at vertices.
6. An edge is the largest subset of the boundary of a face that satisfies all the previous conditions.

Often the intersection of two primitive surfaces crosses itself and therefore is not a 1-manifold. Then, the intersection must be segmented to satisfy the conditions just listed.

Normally, edges are parameterized and represented as explained in Chapter 4. Approximations, for example by splines, are often used. This can cause delicate numerical problems, because points on the edge do not necessarily lie in the surfaces whose intersection creates the edge. Approximate representations in the (u,v) parametric spaces of the surfaces involved are even more pernicious, because each edge is represented twice, and the two representations do not coincide exactly.

Many BReps contain additional information that is redundant but important for algorithm efficiency. For example, face normals may be stored, or edges may have back pointers to the faces which share them. Additional links are used to facilitate graph traversal. An example of a heavily linked BRep is the winged edge structure due to Baumgart in the mid 1970s, which is the intellectual root of many of the data structures in use today.

We describe the basic winged edge scheme for manifold objects with the help of the example of Figure 5.2.1.3. Edges are oriented by (arbitrarily) selecting for each a start vertex and an end vertex. Edges are depicted in the figure as vectors, to indicate their orientation. By convention, we orient each face clockwise, as seen by an observer outside the solid. The orientations of faces f1 and f2 are shown by the curved arrows in the figure. Since the solid is a manifold, each edge belongs to exactly two faces. In one of these faces, called the clockwise face (cw), the edge orientation agrees with the face orientation. In the other adjacent face, called the counter-clockwise face (ccw), the edge and face orientations are opposite. Therefore, the cw face of e1 is f1, and its ccw face is f2. Following the orientation of f1, the next edge to e1 in its cw face is e2. This edge is called the next clockwise edge (ncw) of e1. Similarly, the next counter-clockwise edge (nccw) of e1 in its ccw face f2 is e4. Each edge in the winged edge structure points to its ncw and nccw edges. It also points to its two adjacent faces (not shown in Figure 5.2.1.4).

Figure 5.2.1.3 – Edge and face orientations for a cube. [Labels: edges e1 to e6, faces f1 and f2.]

Figure 5.2.1.4 – A portion of the winged edge structure for a cube. [Labels: edges e1 to e6, faces f1 and f2, pointers cw, ccw, ncw and nccw.]

A face in the winged edge structure points to only one of its edges, as shown in Figure 5.2.1.4. The face-edge pointer is labelled to indicate if the face is the cw or ccw face of the edge. This information is sufficient to efficiently retrieve all the edges in a face. For example, start with edge e1 of face f1. The next edge, e2, of f1 is found by following the ncw pointer of e1. To get the next edge, observe that f1 is the ccw face of e2. Therefore, we follow the nccw pointer of e2 and obtain e6. Continuing this procedure we get all the edges of f1 in their correct order. Winged edge structures are attractive because they are relatively concise, and provide efficient means for traversing the boundary by following pointers.
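The traversal just described can be sketched in C++ as follows. This is a minimal illustration, not Baumgart's original layout: the names Edge, WingedEdgeMesh, buildWingedEdge and edgesOfFace are hypothetical, entities are referred to by integer indices rather than pointers, and the builder assumes that every face is given as a loop of vertex indices listed in the face's own orientation on a manifold surface. The cw/ccw label on the face-edge pointer is not stored; it is recovered by comparing the face with the cw field of the edge.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Basic winged-edge record; all fields are indices, -1 means "not set".
struct Edge {
    int start = -1, end = -1;  // arbitrary orientation: start -> end
    int cw = -1, ccw = -1;     // face agreeing / disagreeing with the edge orientation
    int ncw = -1, nccw = -1;   // next edge in the cw face / in the ccw face
};

struct WingedEdgeMesh {
    std::vector<Edge> edges;
    std::vector<int> faceEdge;  // one (arbitrary) edge per face
};

// Hypothetical builder: faces are vertex loops, consistently oriented.
WingedEdgeMesh buildWingedEdge(const std::vector<std::vector<int>>& faces) {
    WingedEdgeMesh m;
    std::map<std::pair<int, int>, int> index;  // undirected edge -> edge id
    for (int f = 0; f < static_cast<int>(faces.size()); ++f) {
        const std::vector<int>& loop = faces[f];
        const int n = static_cast<int>(loop.size());
        std::vector<int> ids;  // edges of face f, in loop order
        for (int i = 0; i < n; ++i) {
            const int a = loop[i], b = loop[(i + 1) % n];
            const std::pair<int, int> key(std::min(a, b), std::max(a, b));
            auto it = index.find(key);
            if (it == index.end()) {
                // First visit fixes the edge orientation; this face is its cw face.
                it = index.emplace(key, static_cast<int>(m.edges.size())).first;
                Edge e;
                e.start = a; e.end = b; e.cw = f;
                m.edges.push_back(e);
            } else {
                // Second visit must run the other way (manifold, consistent orientation).
                Edge& e = m.edges[it->second];
                assert(e.start == b && e.ccw == -1);
                e.ccw = f;
            }
            ids.push_back(it->second);
        }
        for (int i = 0; i < n; ++i) {
            Edge& e = m.edges[ids[i]];
            if (e.cw == f) e.ncw = ids[(i + 1) % n];
            else           e.nccw = ids[(i + 1) % n];
        }
        m.faceEdge.push_back(ids[0]);
    }
    return m;
}

// Retrieve all edges of face f by following ncw or nccw pointers.
std::vector<int> edgesOfFace(const WingedEdgeMesh& m, int f) {
    std::vector<int> result;
    const int first = m.faceEdge[f];
    int e = first;
    do {
        result.push_back(e);
        const Edge& rec = m.edges[e];
        e = (rec.cw == f) ? rec.ncw : rec.nccw;  // f is either the cw or the ccw face
    } while (e != first);
    return result;
}
```

For a tetrahedron given by the loops (0,1,2), (0,3,1), (1,3,2) and (0,2,3), the builder creates six edges, and edgesOfFace returns the three edges of each face in loop order.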

The winged edge structure was designed originally for manifold objects and for simply-connected faces (i.e., faces without holes). It can easily be generalized to multiply-connected faces by introducing in the structure another level that corresponds to the separate edge loops that bound the faces. It can also be generalized to non-manifold objects, although it becomes considerably more complicated.

5.2.2 Validity of BReps


The validity of BReps is a complicated issue. For concreteness and simplicity, we will consider a specific representation scheme: boundary triangulation for polyhedral objects. (The discussion is easy to generalize to other BRep schemes.) The domain of the scheme is the set of all polyhedral r-sets. The faces are triangles and the edges are line segments. In addition, faces have the face properties 1-7, and edges have the edge properties 1-6 of Section 5.2.1. This BRep scheme is a simplified version of the scheme illustrated in Figure 5.2.1.1. To convert the representation in the figure to the triangulation scheme, the base of the pyramid would have to be split into two triangles by introducing an additional edge. The validity conditions for the boundary triangulation scheme are the following:

1. Each face must have exactly 3 edges, otherwise it will not be a triangle.
2. Each edge must have exactly two vertices, otherwise it will not be a line segment.
3. The edges associated with a face must form a loop or closed circuit, to ensure that they enclose a 2-D area. This condition is satisfied if and only if each vertex in a face belongs exactly to two of the face’s edges.
4. The faces must form one or more closed surfaces or shells, to ensure that they enclose a 3-D volume. This condition is satisfied if and only if each edge belongs to an even number of faces. (Had we restricted the domain to manifold polyhedra, an edge would have to belong exactly to two faces.)
5. Each 3-tuple of coordinates must correspond to a distinct point in 3-space.
6. Edges must either be disjoint or intersect at a common vertex, otherwise there would be missing vertices in the representation.
7. Similarly, faces must either be disjoint or intersect at a common edge or vertex.

These conditions are easy to establish intuitively, and can be derived mathematically. Conditions 1-4 are combinatorial. They are easy to check algorithmically by counting nodes or links in the boundary graph. In contrast, conditions 5-7 are metric, i.e., they involve coordinates of vertices and equations of lines and planes. They are computationally expensive to check, because they require intersection tests.
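As an illustration of how cheap the combinatorial checks are, the following C++ sketch tests conditions 1-4 by counting. It assumes a simplified input that is not the boundary graph of the text: each triangle is stored directly as three vertex indices (the names Triangle and combinatoriallyValid are hypothetical). With this storage an edge is an unordered pair of distinct vertices, so conditions 1-3 reduce to requiring three distinct, in-range vertices per face, and condition 4 is the even-count test on edges. The metric conditions 5-7 are not examined.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

struct Triangle { int a, b, c; };  // three vertex indices

// Checks the combinatorial validity conditions only (no coordinates involved).
bool combinatoriallyValid(const std::vector<Triangle>& faces, int vertexCount) {
    std::map<std::pair<int, int>, int> facesPerEdge;  // undirected edge -> face count
    for (const Triangle& t : faces) {
        const int v[3] = { t.a, t.b, t.c };
        for (int i = 0; i < 3; ++i)
            if (v[i] < 0 || v[i] >= vertexCount) return false;
        // Three distinct vertices give three edges forming a closed loop.
        if (v[0] == v[1] || v[1] == v[2] || v[0] == v[2]) return false;
        for (int i = 0; i < 3; ++i) {
            const int p = v[i], q = v[(i + 1) % 3];
            ++facesPerEdge[std::make_pair(std::min(p, q), std::max(p, q))];
        }
    }
    // Closed shells: every edge must belong to an even number of faces.
    for (const auto& entry : facesPerEdge)
        if (entry.second % 2 != 0) return false;
    return true;
}
```

A tetrahedron passes the test; removing any one of its four triangles leaves three edges with a single face each, and the test fails.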

We conclude that validity checking for BReps is not computationally attractive, and should be avoided. Most geometric modeling systems attempt to embed the required validity conditions in the algorithms used to construct the representations, instead of testing representational validity after the BReps are built. In the next section we discuss a set of constructors that help ensure BRep validity.

Some modeling systems also provide users with operations, sometimes called tweaking, that manipulate boundaries directly. For example, in Figure 5.2.2.1 a sharp edge is chamfered, i.e., replaced by a face. This operation can be implemented by Boolean subtraction, but many BRep systems strive for higher processing speed, and simply modify the boundary graph directly, by adding a new face, with associated edges and vertices. The mathematical meaning of tweaking operations is not always well defined, and many systems do not ensure that such operations produce valid solids.


Figure 5.2.2.1 – A chamfering operation.

5.2.3 Euler Operators

The Euler operators for BRep manipulation were introduced in the geometric modeling literature by Baumgart in the mid 1970s. They were originally used in mathematics textbooks to prove Euler’s theorem, which states that the Euler characteristic χ of a closed, connected, compact, orientable 2-manifold satisfies the equation

χ = f − e + v = 2 − 2h ,

where f, e and v denote the number of faces, edges and vertices, and h is the number of holes or handles of the manifold. This expression assumes that faces are simply connected, i.e., have no holes. If there are s connected surfaces, or shells, Euler’s theorem can be applied to each connected component:

f1 − e1 + v1 = 2 − 2h1

f2 − e2 + v2 = 2 − 2h2

Summing all of these expressions yields

f − e + v = 2s − 2h ,

where f, e, v and h denote total numbers. The Euler characteristic is a topological property of the surface. It is invariant under homeomorphisms, independent of how the surface is decomposed (under mild conditions that decompositions must satisfy), and depends only on the number of shells and handles.
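As a quick arithmetic check of the two formulas, consider the usual decomposition of a cube into 6 faces, 12 edges and 8 vertices, and then two disjoint copies of that cube treated as one boundary with two shells. The counts for the second case are simply twice those of the first.

```latex
% One cube: s = 1 shell, h = 0 handles
f - e + v = 6 - 12 + 8 = 2 = 2 - 2 \cdot 0
% Two disjoint cubes: s = 2 shells, h = 0 handles
f - e + v = 12 - 24 + 16 = 4 = 2 \cdot 2 - 2 \cdot 0
```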

To discuss Euler operations we need to introduce a BRep scheme more general than the boundary triangulation scheme of the last section. Now faces and edges may be curved, and they are open, in the sense that a face does not include its bounding edges, and an edge does not include its bounding vertices. We require that open faces and edges be homeomorphic to open disks and open line segments, respectively. Otherwise, faces and edges have the properties listed in Section 5.2.1, just like their counterparts in the triangulation scheme. The new representation is called a cell decomposition of the boundary.

Euler operators manipulate cell decompositions. To describe what they do, it is convenient to draw a cell decomposition of a closed surface in the plane. The result is called a Schlegel diagram, and can be constructed as follows. Suppose that we have a cubical surface made of an elastic material. We can imagine a balloon on which we drew the edges of a cube. We cut a slit in the back face, between two points a and b, as shown in Figure 5.2.3.1, and then we open the surface and force it to lie in a plane.

Figure 5.2.3.1 – The surface of a cube with a slitted back face. [Labels: edges e1 to e4, slit endpoints a and b.]

Figure 5.2.3.2 – Schlegel diagram for the surface of a cube. [Labels: edges e1 to e4, points a, b, p, q and r, Back Face.]

The result is shown in Figure 5.2.3.2. Observe that the top and bottom curves both correspond to ab, and therefore must be identified. This means that points p and q in the figure are actually the same point. The arrows in the figure serve to indicate how point correspondences are established. Thus, if the arrows were drawn in opposite directions, p would be identified with r instead of with q. The back face of the cube is mapped onto the region bounded by the two circular arcs and the four labelled edges. After the cut, we deformed the surface elastically, and therefore the planar diagram is homeomorphic to the surface with a slit. If we identify points as indicated above, the Schlegel diagram is homeomorphic to the original, uncut surface. The outer, circular edges often are not drawn in a Schlegel diagram, and the back face is then understood to be the remainder of the plane.

Now we will successively remove elements of this cell decomposition, as shown in the following figures. In Figure 5.2.3.3 we remove the bottom, front edge and merge the front and the bottom face. Note that the resulting, merged face is no longer planar. Intuitively, we can think of these operations as erasing lines drawn on a balloon. The operation illustrated in this figure is called kfe, for “kill face and edge”.

Figure 5.2.3.3 – Removal of an edge and a face

Figure 5.2.3.4 – The result of another kfe (left) and a resulting face (right).

Another application of kfe yields the result shown on the left, in Figure 5.2.3.4. The resulting cell decomposition contains an unusual face, shown on the right in the figure. The points on the dashed edges do not belong to the face. We can open up the face at the vertical, intruding edge, and deform the face homeomorphically into an open disk. Therefore it is a valid cell.

Two more kfe operations produce the result shown on the left of Figure 5.2.3.5. Now we need a new operation, termed kev, for “kill edge, vertex”, which produces the decomposition shown on the right in the figure.

Figure 5.2.3.5 – Applying a kev operation.

Three more kev operations produce the result shown on the left of Figure 5.2.3.6. Observe that we are drawing explicitly the back face, to emphasize that there are two faces in the decomposition. Applying kfe we produce a decomposition with a single face, as shown on the right in the figure.

Figure 5.2.3.6 – Applying kfe to end up with a single face.

Now we remove two more edges and vertices via kev operations and generate the decomposition shown on the left of Figure 5.2.3.7. Here we show explicitly the two vertices associated with the remaining edge.


Figure 5.2.3.7 – Generating a single face and single vertex decomposition. [Labels: edge e3, vertices v1 and v2, Back Face; the kev operation leads from the left diagram to the right one.]

If we remove this edge and one vertex, via kev, we end up with a decomposition that contains only one face and one vertex, as shown on the right. This diagram corresponds to a spherical balloon in which only one vertex has been drawn. The face therefore is topologically equivalent to a sphere minus one point, which is a valid cell because it is homeomorphic to an open disk. To see that this is true, project a sphere onto a plane by using the North pole as the center of projection. Figure 5.2.3.8 illustrates the analogous procedure in 2-D. It is clear that points on the sphere (other than the North pole) map one-to-one onto points on the plane, and so do open sets. Therefore the mapping is a homeomorphism, and the sphere minus a point (the North pole) is topologically equivalent to the whole plane, which is itself equivalent to an open disk. If this last statement is not obvious, map the interval [0,1) through the function f(x) = x / (1 − x), which is a homeomorphism since it establishes a one-to-one correspondence between points and also between open sets (see Figure 5.2.3.9). The result is a semi-infinite line starting at the origin. Now, if we apply a similar mapping in every direction emanating from the origin, the image of the unit open disk centered at the origin is the entire plane.

Figure 5.2.3.8 – Projecting a sphere on a plane using the North pole as the viewpoint.
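The two maps used in this argument can be written down explicitly. The C++ sketch below makes several assumptions that the text leaves open: the sphere is the unit sphere centered at the origin, the North pole is (0,0,1), and the projection plane is the equatorial plane z = 0. The names Point2, Point3, project, unproject and diskToPlane are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { double x, y; };
struct Point3 { double x, y, z; };

// Projects a point of the unit sphere, other than the North pole (0,0,1),
// onto the plane z = 0 along the line through the North pole.
Point2 project(const Point3& p) {
    assert(p.z < 1.0);  // the North pole itself has no image
    return { p.x / (1.0 - p.z), p.y / (1.0 - p.z) };
}

// Inverse map: every point of the plane comes from exactly one sphere point.
Point3 unproject(const Point2& q) {
    const double d = q.x * q.x + q.y * q.y + 1.0;
    return { 2.0 * q.x / d, 2.0 * q.y / d, (d - 2.0) / d };
}

// Radial version of f(x) = x / (1 - x): maps the open unit disk onto the plane.
Point2 diskToPlane(const Point2& q) {
    const double r = std::sqrt(q.x * q.x + q.y * q.y);
    assert(r < 1.0);
    const double scale = 1.0 / (1.0 - r);  // new radius is r / (1 - r)
    return { q.x * scale, q.y * scale };
}
```

Composing unproject with diskToPlane carries the open unit disk one-to-one onto the sphere minus its North pole, which is the equivalence used above.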

Finally, we can make the single-face, single-vertex object disappear completely by invoking the ksfv operator, which kills a face, a vertex and a shell, i.e., a connected surface.

The destructive operators introduced above have constructive inverses. We reverse the destruction procedure described earlier, and build the cubic surface decomposition by beginning with the empty set and applying msfv, “make shell, face and vertex”. Then we apply mev, “make edge and vertex”, to obtain the decompositions shown in Figure 5.2.3.7. We continue with several applications of mev and mfe, “make face and edge”, until we obtain the original decomposition.


Figure 5.2.3.9 – Mapping a line segment into a semi-infinite line. [Plot of f(x) over the interval from 0 to 1.]

The make and kill operators are called Euler operators. Observe that the operators do not change the Euler characteristic of the surface. For example, mfe increases the number of faces and the number of edges by one. Therefore, the expression χ = f − e + v remains invariant. (It is this property that makes the operators useful in the proof of Euler’s theorem.)

To be able to construct tori and other surfaces with holes we need an additional pair of operators. Consider the two blocks shown on the left in Figure 5.2.3.10. We can glue them as shown on the right in the figure. To do this we kill the bottom face of the top block (but keep the edges that bound that face), and introduce an extra edge on the top face of the bottom block. The extra edge is needed to ensure that the new (open) face of the joined cubes is homeomorphic to an open disk, and therefore is a valid cell. In addition, we change the number of shells from 2 to 1, which causes a decrease of 2 in the generalized Euler’s formula χ = f − e + v = 2s − 2h. Increasing the number of holes by one has the same effect on the Euler characteristic, and the operator is usually called kfmeh, for “kill face and make edge and hole”, for reasons that will be apparent after we present below another example of its application. It corresponds to a mathematical operation on manifolds called the connected sum, which glues two manifolds to produce another manifold.

The operator kfmeh can also be applied to a single shell, for example to connect two opposite faces of a cube, as shown in Figure 5.2.3.11. Initially we draw the two small shaded faces within the top and bottom face of the cube (with the necessary extra edges). Then we kill the small faces and connect the associated edge loops by a new vertical face, analogous to a cylinder. We need to add an extra edge to the new face to ensure that it is homeomorphic to an open disk. The outcome of the operation has one less face than the original, one more edge, and a through hole. In this case, kfmeh does precisely what its name implies.
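The bookkeeping behind the operators of this section can be sketched by tracking only the counts f, e, v, s and h. The C++ code below is such a sketch: it carries no topology or geometry, the type Counts and the split of kfmeh into kfmehGlue and kfmehHole are hypothetical names introduced here for the two uses described above, and buildCube applies the make operators in a simplified order that reproduces only the final counts of the cube, not the intermediate diagrams.

```cpp
#include <cassert>

// Numbers of faces, edges, vertices, shells and handles of a decomposition.
struct Counts {
    int f = 0, e = 0, v = 0, s = 0, h = 0;
    // Generalized Euler formula: f - e + v = 2s - 2h.
    bool eulerHolds() const { return f - e + v == 2 * s - 2 * h; }
};

Counts msfv(Counts c) { ++c.s; ++c.f; ++c.v; return c; }  // make shell, face, vertex
Counts mev(Counts c)  { ++c.e; ++c.v; return c; }         // make edge, vertex
Counts mfe(Counts c)  { ++c.f; ++c.e; return c; }         // make face, edge
Counts ksfv(Counts c) { --c.s; --c.f; --c.v; return c; }  // kill shell, face, vertex
Counts kev(Counts c)  { --c.e; --c.v; return c; }         // kill edge, vertex
Counts kfe(Counts c)  { --c.f; --c.e; return c; }         // kill face, edge

// kfmeh used to glue two shells: one face fewer, one edge more, one shell fewer.
Counts kfmehGlue(Counts c) { --c.f; ++c.e; --c.s; return c; }
// kfmeh used to open a through hole: one face fewer, one edge more, one more handle.
Counts kfmehHole(Counts c) { --c.f; ++c.e; ++c.h; return c; }

// Cube counts from the empty set: 1 msfv, 7 mev and 5 mfe.
Counts buildCube() {
    Counts c = msfv(Counts{});
    for (int i = 0; i < 7; ++i) c = mev(c);
    for (int i = 0; i < 5; ++i) c = mfe(c);
    assert(c.eulerHolds());
    return c;
}
```

buildCube yields f = 6, e = 12, v = 8 with one shell. Every operator changes both sides of the generalized formula by the same amount, so eulerHolds stays true after any sequence of them; for two disjoint cubes (f = 12, e = 24, v = 16, s = 2), kfmehGlue gives f = 11, e = 25, s = 1, and 11 − 25 + 16 = 2.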


Figure 5.2.3.10 – Glueing two shells. [Labels: New Edge, kfmeh.]

Figure 5.2.3.11 – Making a through hole. [Labels: New Edge, kfmeh, Through Hole.]

The Euler operators provide convenient means for constructing complex BReps, and help ensure that they are valid. The early literature on geometric modeling stated erroneously that the Euler operators guarantee the validity of BReps. It is easy to see that this claim is false. Assume that the dimpled cube shown in Figure 5.2.3.12 was built through Euler operations. By simply changing the coordinates of one vertex we obtain the set of self-intersecting faces shown on the right, which does not correspond to a valid BRep. Therefore, a representation constructed with Euler operators may or may not be valid, depending on the specific values of the metric data assigned to the faces, edges and vertices.

The most important properties of Euler operators were established by Mäntylä [Mäntylä 1984]. He showed that it is always possible to assign metric data to a BRep constructed via Euler operators so that the BRep is valid in the domain of orientable manifold objects. Conversely, any valid manifold BRep can be constructed by a sequence of Euler operations. In essence, Euler operators ensure that the combinatorial conditions for validity are satisfied. The resulting BRep may or may not be valid, depending on the metric data associated with it. Unfortunately, metric conditions are expensive to check.

Figure 5.2.3.12 – Changing the coordinates of one vertex may invalidate a BRep.

GMCh6-1 3/29/99 6-1



6. Fundamental Algorithms

Fundamental algorithms are the building blocks we use to construct computational solutions to application problems. This chapter covers some of the fundamental algorithms that underlie many geometric modeling computations. We begin with a short introduction that emphasizes the connections between algorithms and the representations on which they operate.

6.1 Algorithms and Representations

We consider a very simple example, which is a lower-dimensional version of the familiar problem of computing the image of a 3-D object by orthographically projecting it onto a 2-D screen. Our objective is to design an algorithm for computing the orthographic projections of convex polygons on a 1-D screen. 1-D images are not visually very exciting (to put it mildly), but they are simple, and suffice to illustrate some important concepts.

A convex polygon may be defined as the convex hull of a set of non-collinear points (see Chapter 3). The orthographic projection of a set S on a line L may be defined as follows. First we choose a coordinate system such that L coincides with the x axis. Then, for each point p = (x, y) of S, we construct the projected point q = (x, 0). The set of all such q is the projection we want. With these definitions, we can state our problem in standard mathematical terms:

Given: A convex polygon S
Find: The orthographic projection of S on a line L

This is a well-defined mathematical problem, but it is not a well-posed computational problem, because we have not specified how the polygon is to be “given”, and what the format of the result is. In other words, we have not specified how the input and output are represented.

It is also interesting to note that the definitions of convex polygon and projection are mathematically correct but not computationally effective, in the sense that they cannot be directly embodied in algorithms. A convex hull is the smallest convex set that encloses the given points, and the projection is obtained by zeroing the y coordinate of every point of the polygon. Both of these are infinite processes that cannot be implemented directly in computers. We cannot compute all the enclosing convex sets to choose the smallest one, nor can we project an infinite set of points on a line. This does not mean we cannot compute projections of convex polygons. Although a polygon contains an infinite number of points, it can be represented by a finite number of vertices.


Let us specify the input and output representations as follows. The convex polygon is represented in Scheme 2 of Chapter 3, i.e., by a list of its vertex coordinates; the projection is represented by the x coordinates of its endpoints. This output representation is unambiguous because the orthographic projection of a convex polygon on a line is a line segment, as shown in Figure 6.1.1. Now we have a well defined computational problem, and can go ahead and design algorithms to solve it. The following are two possible solutions, expressed in a pseudo-C++ language. We assume the existence of classes PairOfReals and VertexList, with the obvious meanings, and functions SortX, which sorts the VertexList by x coordinate, FirstX and LastX, which return the x coordinates of the first and last element of a VertexList, and MinX and MaxX, which find the minimum and maximum x coordinates in a VertexList.

Figure 6.1.1 – Projecting a convex polygon on the x axis [Labels: axes x and y, projection endpoints Xmin and Xmax.]

Algorithm 1:

PairOfReals Project1 (VertexList vl) {
    VertexList Sorted = SortX (vl);   // sort by increasing x
    float Xmin = FirstX (Sorted);
    float Xmax = LastX (Sorted);
    return PairOfReals (Xmin, Xmax);
}

Algorithm 2:

PairOfReals Project2 (VertexList vl) {
    float Xmin = MinX (vl);   // one pass over the list
    float Xmax = MaxX (vl);   // one more pass
    return PairOfReals (Xmin, Xmax);
}

Algorithm 1 sorts the vertex list by increasing value of x coordinate, and returns the x coordinates of the first and last elements of the sorted list. Algorithm 2 traverses the list and extracts the maximum and minimum x coordinates. It is clear that both algorithms are correct, and therefore they are functionally equivalent. Is one better than the other? To answer this question we need criteria for comparing algorithms.
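For readers who want to run the two algorithms, here is one possible rendering in standard C++. The assumed classes are replaced by concrete stand-ins chosen for this sketch (a Vertex struct, std::vector for VertexList, std::pair for PairOfReals), and the helper functions are replaced by std::sort and std::minmax_element from the standard library; the lower-case function names are also ours.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Vertex { double x, y; };
using VertexList = std::vector<Vertex>;
using PairOfReals = std::pair<double, double>;

// Algorithm 1: sort by x, then read off the first and last elements.
PairOfReals project1(VertexList vl) {  // passed by value: the caller's list is not reordered
    assert(!vl.empty());
    std::sort(vl.begin(), vl.end(),
              [](const Vertex& a, const Vertex& b) { return a.x < b.x; });
    return { vl.front().x, vl.back().x };
}

// Algorithm 2: a single traversal that tracks the minimum and maximum x.
PairOfReals project2(const VertexList& vl) {
    assert(!vl.empty());
    auto mm = std::minmax_element(
        vl.begin(), vl.end(),
        [](const Vertex& a, const Vertex& b) { return a.x < b.x; });
    return { mm.first->x, mm.second->x };
}
```

For the quadrilateral with vertices (1,0), (4,1), (3,5) and (0,2), both functions return the pair (0, 4).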

The most commonly used criterion is efficiency, which is studied in the Computer Science field of analysis of algorithms. Typically, the worst-case running time for an input of size n, as n tends to infinity, is taken as the measure of efficiency. Asymptotic worst-case performance for geometric algorithms tends to be very pessimistic, and is of limited practical importance because the worst cases tend to be “pathological” and infrequently encountered. A more relevant measure of complexity should capture the cost of computation for “most” cases, or for “average” objects. Unfortunately, this is usually hard to assess because of a lack of meaningful statistical models for geometric entities (e.g., what is an average polygon?), or mathematical intractability of the models. (A promising approach is the use of randomized algorithms, in which randomness is injected in a controlled fashion into a deterministic computation; see e.g. [Mulmuley 1994].) For our example, it is well-known that the worst-case complexity of sorting n real numbers is of order n log n, whereas the complexity of minimum or maximum computations is only of order n. Therefore Algorithm 2 is asymptotically more efficient than Algorithm 1 in the worst case sense.

But efficiency is not the only relevant criterion. It is the easiest to characterize formally, and also the best understood. Other criteria include:

• Robustness in the presence of numerical errors, such as those introduced by floating point computation.
• Extensibility, for example, when the geometric domain increases.
• Suitability for hardware implementation.

The customary approach to efficiency analysis assumes that the representations for the input and output are given, and are fixed. However, a designer of geometric modeling systems has the freedom to choose which representations to use. This causes serious problems, because no theory is available to compare algorithms that are functionally equivalent from the mathematical viewpoint, but operate on different representations, e.g., two algorithms that compute the volume of a solid, one of them for CSG and the other for BReps. The difficulties arise primarily because of input-size considerations. For example, an object may be “small” when represented in CSG, but have a “large” BRep, while just the opposite may be true for a different object (see [Tilove 1981] for other examples and a general discussion).

For our example of convex-polygon projections we could also have represented the polygons in Scheme 4 of Section 3.1, i.e., as intersections of half-spaces. This representation for the input would lead to different algorithms for solving the same mathematical problem.

In summary, here we applied the methodology described in Section 1.3 to a simple problem. We began with an application (graphics in 1-D), formulated it mathematically, selected suitable representations for the input and output, and designed algorithms. We showed that there are many computational approaches that are functionally equivalent, in the sense that they all solve the same mathematical problem. We will continue to apply this methodology in the remainder of this course. In the next few sections we ignore specific applications, and focus on low-level algorithms that serve as computational utilities for the application algorithms discussed in Chapter 7.


6.2 Point-Solid Classification

6.2.1 Point Membership Classification

The characteristic function of a set S is usually defined as follows:

$$\chi(p) = \begin{cases} 1 & \text{if } p \in S \\ 0 & \text{otherwise} \end{cases}$$

This function is also called the indicator of the set, because it tells us whether or not a point p belongs to the set. It turns out that in geometric modeling we need a slightly different function, which provides us with more information. It is called point membership classification (abbreviated PMC) and defined thus:

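$$M(p) = \begin{cases} \textit{in} & \text{if } p \in iS \\ \textit{on} & \text{if } p \in \partial S \\ \textit{out} & \text{if } p \in icS \end{cases}$$

The strings in, on or out returned by M tell us whether the point p is inside, on the boundary, or outside of S.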
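To make the definition concrete, here is a sketch of M(p) for one very simple solid, a ball given by its center and radius. The types Pt and Ball, the function name classify and the tolerance eps are all assumptions of this sketch; in particular, deciding that a point is on by comparing a computed distance with the radius up to a tolerance is a practical device, not part of the mathematical definition.

```cpp
#include <cassert>
#include <cmath>

enum class PtClassResult { in, on, out };

struct Pt { double x, y, z; };
struct Ball { Pt center; double radius; };

// PMC for a ball: in the interior, on the boundary sphere, or in the exterior.
PtClassResult classify(const Pt& p, const Ball& b, double eps = 1e-9) {
    assert(b.radius > 0.0 && eps >= 0.0);
    const double dx = p.x - b.center.x;
    const double dy = p.y - b.center.y;
    const double dz = p.z - b.center.z;
    const double d = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (std::fabs(d - b.radius) <= eps) return PtClassResult::on;
    return d < b.radius ? PtClassResult::in : PtClassResult::out;
}
```

For the unit ball centered at the origin, the origin classifies in, the point (1,0,0) classifies on, and the point (2,0,0) classifies out.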

Point membership classification algorithms are closely related to unambiguous representations for sets. To see why, suppose that we have a representation r for a set S, and an algorithm that evaluates M(p) for the representation r. Then, for any point p in Euclidean space we can find whether the point belongs to S or not. In other words, we know which points constitute the set. This implies that r must be an unambiguous representation for S.

In the next two subsections we discuss PMC algorithms for the most common representations for solids, CSG and BReps.

6.2.2 Point Classification for CSG Solids

A CSG representation is a tree. This immediately suggests a divide and conquer, or recursive descent algorithm for computing PMC. A first cut at the design of such an algorithm is as follows. PtClassResult is an enumeration data type with string values in, on or out. ClassPtSolC is the procedure that classifies a point with respect to a solid represented in CSG (hence the C at the end).

Algorithm 6.2.2.1

PtClassResult ClassPtSolC(Pt p; Sol S) {
    if Prim(S) then return ClassPtPrimSolC(p,S)    // S is a leaf
    else return CombClassPtSolC(ClassPtSolC(p,S.Left),
                                ClassPtSolC(p,S.Right), ⊗);
}

In words: first we check if the node of the CSG tree that corresponds to S is a leaf (and hence a primitive solid) by evaluating the predicate Prim(S). If so, we call a primitive-specific procedure ClassPtPrimSolC. Since in practice a system has only a few primitives, and these are relatively simple, it is not hard to write the primitive classifiers. If S is not a primitive, it must be a Boolean operator. Then we call ClassPtSolC recursively on the left and right arguments, S.Left and S.Right, of the operator node, and combine the results by means of the procedure CombClassPtSolC. This combining procedure depends on the Boolean operator, denoted by ⊗.

The combining procedure is the crucial component of Algorithm 6.2.2.1. Let us construct a table to guide the design of CombClassPtSolC. Suppose that S = A ⊗ B, and the operator is regularized intersection. By examining Figure 6.2.2.1 we conclude that if a point p classifies inside A, or inA, and also inB, then it must be inside their intersection, i.e., inS. The other possible cases are shown in the following table.

S = A ∩* B    inB     onB     outB
inA           inS     onS     outS
onA           onS     ?       outS
outA          outS    outS    outS

Figure 6.2.2.1 – Intersection of two sets [Labels: A, B, S.]

There is an ambiguity when the point classifies both onA and onB. Figure 6.2.2.2 depicts the regularized intersection of an L-shaped polygon A with a rectangle B. In the figure, points p and q both are onA and onB, but p is onS, whereas q is outS.

Figure 6.2.2.2 – On/on ambiguity [Labels: sets A, B and S, points p and q.]

Analogous tables for the difference and union operators also exhibit on/on ambiguities:

S = A −* B    inB     onB     outB
inA           outS    onS     inS
onA           outS    ?       onS
outA          outS    outS    outS

S = A ∪* B    inB     onB     outB
inA           inS     inS     inS
onA           inS     ?       onS
outA          inS     onS     outS
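The three tables can be encoded directly as lookup arrays, which gives a first version of the combining procedure. In the C++ sketch below the names Cls, Op and combClass are hypothetical, and the "?" entries are represented by an explicit ambiguous value, since the two argument classifications alone cannot settle the on/on case.

```cpp
#include <cassert>

enum class Cls { in, on, out, ambiguous };  // ambiguous stands for the "?" entries
enum class Op { intersection, difference, unite };

// Rows: classification w.r.t. A (in, on, out). Columns: w.r.t. B (in, on, out).
constexpr Cls I = Cls::in, O = Cls::on, X = Cls::out, Q = Cls::ambiguous;
const Cls kIntersection[3][3] = { { I, O, X }, { O, Q, X }, { X, X, X } };
const Cls kDifference[3][3]   = { { X, O, I }, { X, Q, O }, { X, X, X } };
const Cls kUnion[3][3]        = { { I, I, I }, { I, Q, O }, { I, O, X } };

// Combines the classifications of p w.r.t. A and B into one w.r.t. S = A op B.
Cls combClass(Cls a, Cls b, Op op) {
    assert(a != Cls::ambiguous && b != Cls::ambiguous);
    const int r = static_cast<int>(a);
    const int c = static_cast<int>(b);
    switch (op) {
        case Op::intersection: return kIntersection[r][c];
        case Op::difference:   return kDifference[r][c];
        case Op::unite:        return kUnion[r][c];
    }
    return Cls::ambiguous;  // not reached for valid operators
}
```

For instance, a point that is inA and outB is inS for the difference, while an on/on point is reported as ambiguous for all three operators.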

This discussion shows that the classification values for the two arguments of a Boolean operator do not always provide sufficient information for determining the classification of a point with respect to the solid which results from the operation. Figure 6.2.2.2 shows that the result depends on whether the two solids locally are on the same or on opposite sides of the overlapping boundary. That is, the resulting classification depends on the neighborhoods of the point with respect to the two solid arguments A and B. The regularized intersection

N(p, A) ∩* N(p, B)

is a half disk, and therefore p is a boundary point of S , whereas

N(q, A) ∩* N(q, B) = ∅

and q is outside of S—see Figure 6.2.2.3.

Figure 6.2.2.3 – The regularized intersections of the neighborhoods of points p and q with respect to A and B are their neighborhoods with respect to S [Labels: N(p,A), N(p,B), N(p,S), N(q,A), N(q,B), N(q,S) = Ø.]

PMC can be done by divide and conquer, provided that we take into consideration neighborhood information in on/on cases. We define the augmented PMC function as

$$M^*(p) = \begin{cases} \textit{in} & \text{if } p \in iS \\ [\textit{on}, N(p,S)] & \text{if } p \in \partial S \\ \textit{out} & \text{if } p \in icS \end{cases}$$

To evaluate the augmented classification function it is convenient to first compute the neighborhood, and then infer the result, as shown by the following pseudo-code.

Algorithm 6.2.2.2

NbhdSol NbhdPtSolC(Pt p; Sol S) {
    if Prim(S) then return NbhdPtPrimSolC(p,S)
    else return CombNbhdPtSolC(NbhdPtSolC(p,S.Left),
                               NbhdPtSolC(p,S.Right), ⊗);
}

PtClassResult ClassPtSolC(Pt p; Sol S) {
    NbhdSol N = NbhdPtSolC(p,S);
    if Full(N) then return in
    else if Empty(N) then return out
    else return (on, N);
}

Observe that we did not change the names of the classification procedure and of the data type it returns, but their meanings in Algorithm 6.2.2.2 and Algorithm 6.2.2.1 are slightly different. Now ClassPtSolC computes the augmented point classification and PtClassResult contains a neighborhood for on points.

The classification procedure invokes the neighborhood evaluator, and then simply looks at the results and produces the correct output. The predicates Full and Empty check if the neighborhood is a complete ball or is null. When neither of these is true, the neighborhood contains a portion of the solid and of its complement, and therefore the point p is on the boundary of S.

The neighborhood evaluator takes the familiar divide and conquer form. It relies on a neighborhood combiner, CombNbhdPtSolC, which computes Boolean operations on neighborhoods.

Algorithm 6.2.2.2 provides a complete high-level procedure for classifying a point with respect to a solid represented in CSG. It uses a few predicates, which are very easy to implement, and NbhdPtPrimSolC and CombNbhdPtSolC, two procedures which depend strongly on how neighborhoods are represented. Issues of neighborhood representation and manipulation are discussed in a separate section in this chapter.
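Because the neighborhood procedures depend on the representation, a runnable version of Algorithm 6.2.2.2 needs a concrete choice. The C++ sketch below uses a 1-D analogue chosen purely for illustration: primitives are closed intervals on the real line, and the neighborhood of a point is a pair of flags saying whether the solid fills a small interval just to the left and just to the right of the point. All names are hypothetical, coordinates are compared exactly (no tolerances), and none of this is the neighborhood representation used for 3-D solids.

```cpp
#include <cassert>
#include <memory>

// 1-D neighborhood: is the solid present just left / just right of the point?
struct Nbhd { bool left, right; };
bool isFull(Nbhd n)  { return n.left && n.right; }
bool isEmpty(Nbhd n) { return !n.left && !n.right; }

enum class Op { unite, intersect, subtract };

struct Sol;
using SolPtr = std::shared_ptr<const Sol>;
struct Sol {               // CSG node: a leaf is the closed interval [lo, hi]
    bool prim;
    double lo, hi;
    Op op;
    SolPtr left, right;
};

SolPtr interval(double lo, double hi) {
    assert(lo < hi);
    return std::make_shared<Sol>(Sol{true, lo, hi, Op::unite, nullptr, nullptr});
}
SolPtr combine(Op op, SolPtr a, SolPtr b) {
    return std::make_shared<Sol>(Sol{false, 0.0, 0.0, op, a, b});
}

Nbhd nbhdPtPrim(double p, const Sol& s) {
    return { s.lo < p && p <= s.hi, s.lo <= p && p < s.hi };
}

// Boolean operations on neighborhoods, applied side by side.
Nbhd combNbhd(Nbhd a, Nbhd b, Op op) {
    switch (op) {
        case Op::unite:     return { a.left || b.left,  a.right || b.right };
        case Op::intersect: return { a.left && b.left,  a.right && b.right };
        case Op::subtract:  return { a.left && !b.left, a.right && !b.right };
    }
    return { false, false };  // not reached for valid operators
}

Nbhd nbhdPtSol(double p, const Sol& s) {
    if (s.prim) return nbhdPtPrim(p, s);
    return combNbhd(nbhdPtSol(p, *s.left), nbhdPtSol(p, *s.right), s.op);
}

enum class Cls { in, on, out };
struct PtClassResult { Cls cls; Nbhd nbhd; };  // nbhd is meaningful for on points

PtClassResult classPtSol(double p, const Sol& s) {
    const Nbhd n = nbhdPtSol(p, s);
    if (isFull(n))  return { Cls::in, n };
    if (isEmpty(n)) return { Cls::out, n };
    return { Cls::on, n };
}
```

The point 1 is on the boundary of both [0,1] and [1,2], which is exactly the on/on case. With neighborhoods the outcome is no longer ambiguous: the point classifies out with respect to the intersection of the two intervals and in with respect to their union, as expected for regularized operations.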

With minor modifications, Algorithm 6.2.2.2 can also be used in 2-D problems, to classify points with respect to polygons, or, more generally, with respect to surface segments that are defined constructively in terms of 2-D primitive patches. But constructive representations for surface segments are seldom used in geometric modeling.

6.2.3 Point Classification for BRep Solids

Classification algorithms for solids represented by their boundaries are very different from their CSG counterparts. Conceptually, the simplest algorithm for PMC with respect to a BRep solid consists of casting a ray (i.e., a semi-infinite line) from the point, and counting how many times it intersects the solid’s boundary. If the number of intersection points is odd, the point is in; if it is even, the point is out. Figure 6.2.3.1 shows a 2-D example. Note that the choice of ray used to test a point is arbitrary, and different rays may have different numbers of intersections with the boundary, but for each point all of these numbers have the same parity, i.e., they are all odd or all even. For example, if we had cast a ray from point p to the right in Figure 6.2.3.1, the number of intersections would be 1 instead of 5. Computationally, instead of a semi-infinite line we use a line segment that is sufficiently long to finish outside of a box that encloses the solid completely.

Figure 6.2.3.1 – The ray cast from in point p has 5 intersections with the polygon’s boundary, whereas the ray from out point q has 2.
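The parity rule can be illustrated with a minimal 2-D sketch. It is not the text's algorithm in full: it assumes a polygon given as a list of vertex pairs, always casts the ray in the +x direction, and ignores on points, singular intersections and numerical errors. The function names are hypothetical.

```python
def crossings_to_right(p, polygon):
    """Count boundary crossings of a horizontal ray cast from p toward +x."""
    px, py = p
    count = 0
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # An edge can be crossed only if its endpoints lie on opposite
        # sides of the horizontal line y = py.
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge meets the line y = py
            x_hit = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_hit > px:
                count += 1
    return count


def classify_by_parity(p, polygon):
    """Return 'in' for an odd crossing count and 'out' for an even one."""
    return "in" if crossings_to_right(p, polygon) % 2 == 1 else "out"


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(classify_by_parity((1.0, 1.5), square))
print(classify_by_parity((5.0, 1.5), square))
```

A point that lies exactly on the boundary is not detected by this sketch and would need a separate test.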

Despite its apparent simplicity, there are subtleties associated with this algorithm. First, we also need to consider points that are on the boundary. How do we count intersections when the endpoint of a ray is on? Second, and more perniciously, numerical errors associated with the representations or with the intersection calculations may produce wrong counts. Third, a ray may intersect an edge or a vertex, or partially lie in a face or an edge. How do we count intersections in such singular cases? Figure 6.2.3.2 illustrates some of the difficulties. Numerical errors in the endpoint coordinates for the edges of the square are grossly exaggerated in the figure, to show what can happen in actual computations. The point p is in the polygon, yet a ray emanating from p may not intersect any edge, or intersect two edges, or a vertex. In the first two cases the computed number of intersections is even, and the point will be erroneously classified out. When the ray goes through a vertex, it intersects two edges; should we count one or two intersections?

Figure 6.2.3.2 – Classifying a point with respect to a square whose edge representations have numerical errors.

It is possible to resolve all of these difficulties, but the algorithm becomes considerably more complicated. Instead, we can avoid many of them. Since the ray we cast is arbitrary, we choose a ray that does not pass near singularities, i.e., a ray that does not intersect any vertices or edges and does not lie (totally or partially) in any faces or edges. In case of doubt, e.g., when an intersection point is very close to a vertex, it is safer to assume that the intersection is singular. The endpoint of the ray, however, is the given point p to be classified, and cannot be moved. Therefore we need to check if p is itself a point on the boundary. In practice, it is difficult to select a ray that is a priori known to be free of singularities. It is easier to select a random ray and check if it is singularity-free; if not, we select another ray, and repeat the process. For complex objects, several iterations may be needed.

We discussed in some detail the intersection-counting PMC algorithm not because it is a very good algorithm, but because it illustrates many of the difficulties that arise in geometric computation. How to deal with singularities and the effects of numerical errors are issues that do not affect the asymptotic complexity of the algorithms, and therefore are usually ignored in the theoretical literature. However, if such issues are not addressed carefully, the resulting systems suffer from severe robustness problems and are unusable for real-world applications.

We now turn to another PMC algorithm that is also based on ray casting but does not count intersections. First we check if the given point p is a vertex or lies in an edge or a face. If not, we cast a random ray from p, intersect it with all the faces of the solid, and select the intersection point closest to p. If we cannot decide with certainty that this first intersection point is not singular, we select another ray and repeat the procedure. Finally, we examine the first intersection and infer the point classification from the type of transition at the intersection. The algorithm in pseudo-code is as follows. We assume that the ray is parameterized, starting with u = 0 for the given point p.

Algorithm 6.2.3.1

PtSolClassResult ClassPtSolB(Pt p; Sol S) {

  for each face F of solid S do {
    if PtSurf(p,Surf(F)) then
      // Surf(F) is the host surface of F. For p to be on S
      // it must belong to some F and therefore to some Surf(F)
      // PtSurf(p,G) is a predicate that returns true if p ∈ G
      if ClassPtFace(p,F) == (in or on) then return on;
      // Exit if p is in the 2-D interior or boundary of a face
      // Otherwise we know p is not on S; it is in or out
  };

  do forever {
    R = CastRay(p);    // Pick R randomly
    IntList = Ø;       // Initialize a list of intersection points
      // Each point is represented by its u parameter value
      // plus a Boolean flag, Singular, which is true when the
      // point is in the 1-D interior of an edge or is a vertex

    for each face F of solid S do    // Intersection loop
      if not RaySurf(R,Surf(F)) then {
        // Predicate RaySurf(R,G) is true if R ⊂ G
        PList = IntRaySurf(R,Surf(F));    // Compute the intersections
          // If Surf(F) is curved there may be several intersections
          // Put them in a list PList
        for each point q in PList do {
          Cvalue = ClassPtFace(q,F);
          if Cvalue == on then q.Singular = true    // q on edge or vertex
          else q.Singular = false;
          if Cvalue == (in or on) then Append(q,IntList);
        }
      }    // End of intersection loop

    if IntList == Ø then return out    // No intersections
    else {
      r = FirstPoint(IntList);    // Minimum u value
      if not r.Singular then {
        Tvalue = TransitionSol(r);
          // This function determines if the ray at r is going
          // from inside the solid to outside or vice-versa
        if Tvalue == (in,out) then return in
        else return out;
      }
      // Otherwise the first point is singular and we go
      // back and cast another random ray
    }
  }    // End of forever loop
}

The first loop of this algorithm determines if p is on the boundary of S. To do this we check if p belongs to any of the faces F of S in two steps. First we check if p lies in the host surface of F. If it does, we classify p with respect to F by means of a 2-D analog of Algorithm 6.2.3.1. (ClassPtFace is described later in this section.)

If p is not on S, we enter the do forever loop. We need to find all the intersections of the ray R with the boundary of the solid. Therefore we search for intersections between the ray and the faces F of S. First we ensure that the number of intersections is finite, i.e., R does not lie in the host surface of a face F. Then we invoke IntRaySurf to obtain the intersection points between the ray and the host surface. (Line/surface intersection routines are described in a later section in this chapter.) A ray may intersect a host surface without intersecting its associated face(s), as shown by the 2-D example of Figure 6.2.3.3. Therefore we need to classify the intersection points in 2-D with respect to the faces themselves. The output of ClassPtFace(q,F) tells us if there is an intersection, if it is a non-singular intersection, i.e., if the intersection point is in the 2-D interior of the face, or if the intersection is singular, i.e., lies on an edge or is a vertex. We remember if the intersection is singular or not by associating a Singular flag with each intersection point. After all intersections are calculated and stored in IntList, we extract the intersection point closest to p. If this first point is singular we simply cast another ray and repeat the procedure. It does not matter if intersection points other than the first are singular. Therefore only one iteration is usually needed, even for complex objects.


Finally, we invoke the TransitionSol function and infer from its value the classification of p. TransitionSol is described in detail later in this chapter, in the section on neighborhood representation and manipulation. Here we simply note that the value returned by this function specifies the type of transition encountered as an imaginary observer travels along the ray in the positive direction (i.e., the direction of increasing parameter u). For example, the transition type for the first point r in Figure 6.2.3.3 is out to in, or (out,in). This value implies that p is out of S. TransitionSol is especially easy to implement robustly when intersections are non-singular. Thus it is preferable to cast additional rays until a first non-singular intersection is found than to attempt to compute transition types at singular intersections.

Figure 6.2.3.3 – A ray R intersects the host surface of a face F of a 2-D solid S at a point q but does not intersect the face F itself.
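To make the first-intersection strategy concrete, here is a minimal 2-D sketch for a simple polygon with straight edges. It is only an illustration under stated assumptions: the polygon is listed counterclockwise, the tolerance EPS and all function names are hypothetical, and the transition at the first hit is inferred from the edge's outward normal rather than from a stored neighborhood.

```python
import math
import random

EPS = 1e-9  # assumed tolerance; a real system would scale it to the model size


def cross(ax, ay, bx, by):
    return ax * by - ay * bx


def point_on_segment(p, a, b):
    """True if p lies on the closed segment ab (within EPS)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    if abs(cross(bx - ax, by - ay, px - ax, py - ay)) > EPS:
        return False
    d = (px - ax) * (bx - ax) + (py - ay) * (by - ay)
    return -EPS <= d <= (bx - ax) ** 2 + (by - ay) ** 2 + EPS


def classify_point_polygon(p, polygon, rng=None):
    """Classify p as 'in', 'on' or 'out' of a counterclockwise simple polygon."""
    rng = rng or random.Random()
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    # First loop: is p on the boundary?
    if any(point_on_segment(p, a, b) for a, b in edges):
        return "on"
    while True:  # the "do forever" loop
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dx, dy = math.cos(angle), math.sin(angle)
        first = None  # (u, singular, edge direction) of the closest hit
        for a, b in edges:
            ex, ey = b[0] - a[0], b[1] - a[1]
            denom = cross(dx, dy, ex, ey)
            if abs(denom) < EPS:
                continue  # ray parallel to (or lying in) the host line
            # Solve p + u*d = a + s*e for the ray parameter u and edge parameter s
            wx, wy = a[0] - p[0], a[1] - p[1]
            u = cross(wx, wy, ex, ey) / denom
            s = cross(wx, wy, dx, dy) / denom
            if u <= 0.0 or s < -EPS or s > 1.0 + EPS:
                continue  # host line is hit behind p or outside the edge
            singular = s < EPS or s > 1.0 - EPS  # at or near a vertex
            if first is None or u < first[0]:
                first = (u, singular, (ex, ey))
        if first is None:
            return "out"  # no intersections
        _, singular, (ex, ey) = first
        if singular:
            continue  # cast another random ray
        # For a counterclockwise polygon the outward normal of an edge
        # with direction (ex, ey) is (ey, -ex).
        leaving = dx * ey - dy * ex > 0.0  # transition (in,out)
        return "in" if leaving else "out"


l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(classify_point_polygon((1, 1), l_shape))
print(classify_point_polygon((3, 3), l_shape))
```

Only the first hit is tested for singularity, so a ray that grazes a vertex farther away does not force a retry.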

There are still several issues to be addressed in Algorithm 6.2.3.1.

1. How do we find if a point belongs to a surface? This is also a classification problem and we could define a function ClassPtSurf to solve it. Instead we define a predicate PtSurf, which is trivially computed when the implicit equation of the surface is known. We simply plug the coordinates of the point in the equation of the surface and check if it is satisfied. However, if only a parametric equation of the surface is known, the problem is much more complicated. We can solve it by computing the distance between the point and the surface, as explained later in this chapter. If the distance is zero, the point is on the surface.

2. How do we know if a ray lies in a surface, i.e., how do we compute the predicate RaySurf? This must be done by routines specific to each primitive surface used in the modeling system. For planar surfaces, it suffices to check if two points of the ray belong to the surface. As another example, consider a cylinder. Here a ray lies in the cylindrical surface if two of its points belong to the surface and the ray is aligned with the cylinder’s axis.

3. What happens when a ray lies in a face’s surface? This issue is very easy to dispatch: nothing happens. We can ignore in the intersection loop any F for which R ⊂ Surf(F). To see why, consider the 2-D example of Figure 6.2.3.4. The ray lies in the host surface of the two top faces of the notched object. Transitions between in, on, or out states along the ray can occur only at the points a, b and c. But these points are computed in the loop when we intersect the ray with the host surfaces of the vertical faces A, B and C. Typical intersection routines fail (or crash) when the number of intersections is not finite, and therefore we need to know if the ray lies in the surface before attempting to compute intersection points.


Figure 6.2.3.4 – Ray lying on a host surface. (The ray from p meets Surf(A), Surf(B) and Surf(C), the host surfaces of the vertical faces A, B and C, at points a, b and c.)

4. How do we classify points with respect to faces? This is done by the 2-D analog of the 3-D classification algorithm we have been discussing. We work on the 2-D host surface of the face, and cast a ray lying in this surface. For curved surfaces a (curved) ray can be computed, for example, by intersecting the surface with a plane containing the point to be classified. We present the point/face algorithm below in pseudo-code.

Algorithm 6.2.3.2

PtFaceClassResult ClassPtFace(Pt p; Face F) {
  // p is assumed to lie in Surf(F)

  for each edge E of face F do {
    if PtCurve(p,Curve(E)) then
      // Predicate PtCurve(p,C) returns true if p ∈ C
      // Curve(E) is the host curve of E. For p to be on F
      // it must belong to some E and therefore to some Curve(E)
      if ClassPtEdge(p,E) == (in or on) then return on;
      // Exit if p is in the 1-D interior or boundary of E.
      // (The 1-D boundary of E consists of its vertices.)
      // Otherwise we know p is not on F; it is in or out
  }

  do forever {
    R = CastRay(p);    // Pick R randomly on Surf(F)
    IntList = Ø;       // Initialize a list of intersection points
      // Each point is represented by its u parameter value
      // plus a Boolean flag, Singular, which is true when the
      // point is a vertex

    for each edge E of face F do    // Intersection loop
      if not RayCurve(R,Curve(E)) then {
        // Predicate RayCurve(R,C) is true if R ⊂ C
        PList = IntRayCurve(R,Curve(E));
          // If Curve(E) is not a straight line there may be
          // several intersections. Put them in a list PList
        for each point q in PList do {
          Cvalue = ClassPtEdge(q,E);
          if Cvalue == on then q.Singular = true    // q is a vertex
          else q.Singular = false;
          if Cvalue == (in or on) then Append(q,IntList);
        }
      }    // End of intersection loop

    if IntList == Ø then return out    // No intersections
    else {
      r = FirstPoint(IntList);    // Minimum u value
      if not r.Singular then {
        Tvalue = TransitionFace(r);
          // This function determines if the ray at r is going
          // from inside the face to outside or vice-versa
        if Tvalue == (in,out) then return in
        else return out;
      }
      // Otherwise the first point is singular and we go
      // back and cast another random ray
    }
  }    // End of forever loop
}

We need to be able to determine if a point lies in a host curve. Host curves are intersections of primitive surfaces. Therefore we test the point for membership in each of the surfaces to see if it belongs to both, and hence to the curve. (Point/surface algorithms were described above.) Deciding if a ray is a subset of a host curve is treated similarly: we check the ray for membership in each of the surfaces by the techniques described earlier. The TransitionFace function is very similar to TransitionSol; both are discussed later in this chapter. Finally, ClassPtEdge is trivial for points and edges represented parametrically. It amounts to comparing the parameter value of the point with the values that correspond to the edge’s endpoints. (Note that we only invoke the point/edge classifier for points that lie in the edge’s host curve.)
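The parameter comparison performed by ClassPtEdge can be sketched in a few lines. The signature and the tolerance below are assumptions made for illustration; the point is taken to lie on the edge's host curve and is given by its parameter value u.

```python
def class_pt_edge(u, u_start, u_end, tol=1e-9):
    """Classify parameter value u against an edge spanning [u_start, u_end].

    Returns 'on' at a vertex (the 1-D boundary of the edge), 'in' in the
    1-D interior, and 'out' elsewhere on the host curve.
    """
    lo, hi = sorted((u_start, u_end))
    if abs(u - lo) <= tol or abs(u - hi) <= tol:
        return "on"
    return "in" if lo < u < hi else "out"


print(class_pt_edge(0.5, 0.0, 1.0))
print(class_pt_edge(1.0, 0.0, 1.0))
print(class_pt_edge(1.5, 0.0, 1.0))
```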

Let us collect together here the main low-level routines needed by ClassPtSolB.

• A PtSurf predicate, which often is implemented as in-line code since it is so simple.

• ClassPtFace for points lying in Surf(F).

• A RaySurf membership predicate to determine if a ray lies in a surface.

• IntRaySurf to compute the parameter values of the intersection points. This can be done by invoking IntCurveSurf (discussed later in this chapter) with the host curve of the ray as one of the arguments, and retaining those points that belong to the ray, i.e., for which u ≥ 0.

• TransitionSol for determining the type of transition at an intersection point.


In addition, ClassPtFace needs the following routines, which are 2-D analogs of those just listed above.

• A PtCurve predicate, often implemented as in-line code since it is so simple.

• ClassPtEdge for points lying on Curve(E). This is a very simple routine that compares parameter values along the host curve.

• A RayCurve membership predicate to determine if the ray lies in the curve.

• IntRayCurve to compute the parameter values of the intersection points. This can be done by invoking IntCurveCurve (discussed later in this chapter) with the host curve of the ray as one of the arguments, and retaining those points that belong to the ray, i.e., for which u ≥ 0.

• TransitionFace for determining the type of transition at an intersection point.
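The PtSurf and RaySurf predicates in the lists above can be sketched for two primitive surfaces, a plane and a cylinder, described by implicit functions. This is a sketch under stated assumptions: the tolerance TOL, the helper names and the representation of surfaces as Python functions are all hypothetical, and the cylinder's axis direction is assumed to be a unit vector.

```python
TOL = 1e-9  # assumed tolerance for "the equation is satisfied"


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane(point, normal):
    """Implicit function of the plane through `point` with normal `normal`."""
    return lambda p: dot(sub(p, point), normal)


def cylinder(axis_point, axis_dir, radius):
    """Implicit function of an infinite cylinder; axis_dir is a unit vector."""
    def f(p):
        w = sub(p, axis_point)
        along = dot(w, axis_dir)
        # squared distance from the axis minus squared radius
        return dot(w, w) - along * along - radius * radius
    return f


def pt_surf(p, f):
    """PtSurf: plug the point into the implicit equation f(p) = 0."""
    return abs(f(p)) <= TOL


def ray_point(origin, direction, u):
    return tuple(o + u * d for o, d in zip(origin, direction))


def ray_in_plane(origin, direction, f):
    """RaySurf for a plane: two points of the ray belong to the surface."""
    return pt_surf(origin, f) and pt_surf(ray_point(origin, direction, 1.0), f)


def ray_in_cylinder(origin, direction, f, axis_dir):
    """RaySurf for a cylinder: two points on it and the ray aligned with the axis."""
    c = cross(direction, axis_dir)
    aligned = dot(c, c) <= TOL * TOL
    return (aligned and pt_surf(origin, f)
            and pt_surf(ray_point(origin, direction, 1.0), f))


floor = plane((0, 0, 0), (0, 0, 1))
tube = cylinder((0, 0, 0), (0, 0, 1), 1.0)
print(ray_in_plane((0, 0, 0), (1, 1, 0), floor))
print(ray_in_cylinder((1, 0, 0), (-2, 0, 0), tube, (0, 0, 1)))
```

The second call is a chord of the cylinder: both sampled points satisfy the implicit equation, and only the alignment test rejects it.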

6.3 Curve-Solid Classification

6.3.1 General Membership Classification

A point p is either inside, outside, or on the boundary of a reference set S. But a candidate set X that contains a continuum of points generally is not entirely contained in either S or its complement. Therefore, we need a more general notion of membership classification. First let L be a straight-line segment and S a solid. We define the membership classification of L with respect to S as the function

M(L,S) = (LinS, LonS, LoutS)

LinS = r1(L ∩ iS)

LonS = r1(L ∩ ∂S)

LoutS = r1(L ∩ cS)

Therefore M divides the line segment into three subsets LinS, LonS, and LoutS, which are, respectively, inside, on the boundary, and outside of S. In the formulas above, r1 denotes regularization in 1-D. It implies that the results should be composed of closed line segments with no dangling, or isolated, points.

More generally, we consider a candidate set X that is regular in the topology associated with a space W′, and a reference set S that is regular in the topology of a space W ⊃ W′. Then we define membership classification as

M(X,S) = (XinS, XonS, XoutS)

XinS = r′(X ∩ iS)

XonS = r′(X ∩ ∂S)

XoutS = r′(X ∩ cS)

where the regularizations are in the topology of W′. We focus on line classification in the next subsections, but we will later also use classification of 2-D surface segments with respect to solids.


6.3.2 Line Classification for CSG Solids

Not surprisingly, M must be augmented with neighborhood information to be computed by divide and conquer, because of on/on ambiguities that arise for points lying on overlapping solid boundaries. The augmented classification function contains also the neighborhood N(LonS,S) of a generic point of LonS with respect to S. The augmented segmentation of L is denoted LwrtS (pronounced “L with respect to S”), and equals

LwrtS = [LinS, (LonS, N(LonS,S)), LoutS].

LwrtS may be computed by the following divide and conquer procedure for a solid S represented by CSG.

Algorithm 6.3.2.1

LineSolClassResultC ClassLineSolC(LineSeg L; Sol S) {
  if Prim(S) then return ClassLinePrimSolC(L,S);
  else return CombLineSolClass(ClassLineSolC(L,S.Left),
                               ClassLineSolC(L,S.Right), ⊗);
}

This algorithm is very similar to Algorithm 6.2.2.1 for point classification.

To classify a line with respect to a primitive solid, consider the primitive as a CSG combination of half spaces, classify the line with respect to each half space, and then combine the results. For example, a solid block is the intersection of 6 planar half spaces. Therefore, we classify the line with respect to the block by using Algorithm 6.3.2.1, with procedure ClassLinePrimSolC replaced by ClassLineHalfSpace. Classification with respect to a half space amounts mainly to intersecting the line with the bounding surface of the half space, which is discussed later, in the section on curve and surface intersections.
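As an illustration of this idea, the sketch below classifies a line segment with respect to an axis-aligned block by intersecting its in intervals with respect to the 6 bounding half spaces. It is a simplified sketch, not the full procedure: on components and neighborhoods are ignored, results are reported only as the parameter interval of the in subset, and the function names and the half-space convention n · x ≤ offset are assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def class_line_half_space(p0, p1, normal, offset):
    """In-interval of the segment p(u) = p0 + u (p1 - p0), 0 <= u <= 1,
    with respect to the half space normal . x <= offset.
    Returns (lo, hi), or None when no part of the segment is in."""
    f0 = dot(normal, p0) - offset
    f1 = dot(normal, p1) - offset
    if f0 == f1:
        # Segment parallel to the bounding plane (the on case is ignored here)
        return (0.0, 1.0) if f0 < 0.0 else None
    u_star = f0 / (f0 - f1)  # parameter where the segment meets the plane
    if f1 > f0:              # moving outward: in before the crossing
        lo, hi = 0.0, min(1.0, u_star)
    else:                    # moving inward: in after the crossing
        lo, hi = max(0.0, u_star), 1.0
    return (lo, hi) if lo < hi else None


def class_line_block(p0, p1, lo_corner, hi_corner):
    """Combine the six half-space results by 1-D interval intersection."""
    interval = (0.0, 1.0)
    for axis in range(3):
        for sign, bound in ((1.0, hi_corner[axis]), (-1.0, -lo_corner[axis])):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            part = class_line_half_space(p0, p1, normal, bound)
            if part is None:
                return None
            interval = (max(interval[0], part[0]), min(interval[1], part[1]))
            if interval[0] >= interval[1]:
                return None
    return interval


hit = class_line_block((-1.0, 0.5, 0.5), (2.0, 0.5, 0.5), (0, 0, 0), (1, 1, 1))
miss = class_line_block((-1.0, 2.0, 0.5), (2.0, 2.0, 0.5), (0, 0, 0), (1, 1, 1))
print(hit, miss)
```

Because every half space is convex, the in subset is a single interval and the combine step reduces to taking maxima and minima; a general CSG tree needs the list-based combine procedure described below.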

How do we combine line classifications? Figure 6.3.2.1 illustrates the combine procedure with a simple example. The solid S is the regularized union of two subtree solids A and B. (Each of these two may itself be composed of several primitives, but this is irrelevant for the combining procedure.) In this example there are no on components, and the resulting classification can be written as follows.

XinS = XinA ∪1* XinB

XonS = ∅

XoutS = X −1* XinS

Here the operations have a subscript to denote that they are regularized in the 1-D topology of the line. We can see that the combine procedure consists essentially of Boolean operations in 1-D. In a more complicated example, we may encounter on/on ambiguities when object boundaries overlap, and these require also neighborhood manipulations.


Figure 6.3.2.1 – Combining line classifications. (The line X is segmented into XinA and XoutA, XinB and XoutB, and XinS and XoutS, for S = A ∪∗ B.)

The implementation of CombLineSolClass depends on how we choose to represent classification results. A convenient representation for LineSolClassResult is depicted in Figure 6.3.2.2. It consists of a sorted list of pairs (u, Nu), where u is a parameter value that corresponds to a transition point (i.e., to a classification change), and Nu is the neighborhood with respect to a solid of a point p(u + ε), where ε > 0 is an infinitesimal distance. Intuitively, this is a point immediately to the right of the transition point p(u). The list is sorted in ascending order of parameter values. In the figure, E stands for an Empty and F for a Full (i.e., an entire ball) neighborhood. The transition points segment the line into subsets such that all the points in a subset have the same neighborhood. For example, all the points in the segment a1a2 have a full neighborhood (associated with the left endpoint of the segment). The classification values in, on or out for each segment need not be stored because they can be inferred trivially from the associated neighborhoods. The last neighborhood value is meaningless and can be ignored.

[Figure: line X, parameterized by u, crossing solid A; transition list (a0 = 0, Na0 = E), (a1, Na1 = F), (a2, Na2 = E), (a3 = 1, Na3 = E).]

Figure 6.3.2.2 – Representation for classification results.


Observe that the partial results of line-solid classification at each stage of the recursion in Algorithm 6.3.2.1 are segmentations of the same line segment that is being classified. Therefore, the representation for LineSolClassResult illustrated in Figure 6.3.2.2 is appropriate for all these intermediate results, and, together with a representation for the original LineSeg being classified, defines the results unambiguously.

The combine procedure can be divided conceptually into the three following phases:

• Merge the two input sorted lists of parameter values. The classification of a line segment with respect to the resulting solid S = A ⊗ B can only change at the transition points of the input classifications with respect to the solids A and B.

• Compute the neighborhoods with respect to S by combining the input neighborhood representations, according to the operator in S = A ⊗ B.

• Clean up the results, by merging segments with the same classification.

The procedure is illustrated in Figure 6.3.2.3 for the union of two simple objects. The classification of line segment X with respect to A is the sorted list of pairs (a, Na) shown in the figure. The other input is the classification of X with respect to B, also represented by a sorted list of pairs. The output is the classification with respect to S shown at the bottom of the figure. The two sorted lists of parameter values a and b are merged to produce a list of s parameter values. A neighborhood Ns is computed as the regularized union of the corresponding Na and Nb. For example, Ns1 = Na1 ∪∗ Nb0 = F ∪∗ E = F. Note that there are three consecutive segments with full neighborhoods. These are merged in the clean-up phase of the algorithm.

[Figure: classification of X with respect to A: (a0 = 0, E), (a1, F), (a2, E), (a3 = 1, E); with respect to B: (b0 = 0, E), (b1, F), (b2, E), (b3 = 1, E); with respect to S = A ∪∗ B: (s0 = 0, E), (s1, F), (s2, F), (s3, F), (s4, E), (s5 = 1, E).]

Figure 6.3.2.3 – Combining two classification result representations.


In practice, the three phases of the combine procedure can be merged into a single scan of the two sorted lists. When a new point is placed in the output list, we compute its associated neighborhood, and check whether the point corresponds to a transition, or is redundant and need not be stored. In the example of Figure 6.3.2.3, points s2 and s3 are redundant.
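The single-scan version of the combine procedure can be sketched as follows. The sketch assumes that each classification is a sorted list of (u, N) pairs in which N is either "E" (Empty) or "F" (Full), so it covers only cases without on segments; the function name, the operator names and the parameter values in the example are hypothetical.

```python
OPS = {
    "union": lambda a, b: a or b,
    "intersection": lambda a, b: a and b,
    "difference": lambda a, b: a and not b,
}


def combine_line_class(list_a, list_b, op):
    """Merge two sorted transition lists [(u, 'E' or 'F'), ...] in one scan."""
    i = j = 0
    na = nb = "E"  # neighborhoods just to the right of the current parameter
    result = []
    while i < len(list_a) or j < len(list_b):
        ua = list_a[i][0] if i < len(list_a) else float("inf")
        ub = list_b[j][0] if j < len(list_b) else float("inf")
        u = min(ua, ub)
        # Advance whichever list(s) have a transition at u
        if ua == u:
            na = list_a[i][1]
            i += 1
        if ub == u:
            nb = list_b[j][1]
            j += 1
        ns = "F" if OPS[op](na == "F", nb == "F") else "E"
        is_last = i == len(list_a) and j == len(list_b)
        # Clean-up folded into the scan: store u only if the neighborhood
        # changes, or if u is the final endpoint of the segment.
        if not result or result[-1][1] != ns or is_last:
            result.append((u, ns))
    return result


x_wrt_a = [(0.0, "E"), (0.2, "F"), (0.5, "E"), (1.0, "E")]
x_wrt_b = [(0.0, "E"), (0.4, "F"), (0.8, "E"), (1.0, "E")]
print(combine_line_class(x_wrt_a, x_wrt_b, "union"))
```

With these inputs the transitions at 0.4 and 0.5 play the role of the redundant points s2 and s3 and are dropped from the union.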

The combine procedure just described works with parameter values, not with the coordinates of the transition points. This implies that the procedure can be applied to any parameterized curve that is homeomorphic to a line segment. It can also be applied to a circle parameterized by the central angle if we modify the calculations slightly to take into account that 0 = 2π = ... = 2nπ. This modified procedure is then applicable to any parameterized closed curve homeomorphic to a circle. Most geometric modeling systems parameterize all their curves, and segment them into subsets that are connected, compact 1-manifolds (i.e., they are homeomorphic to either circles or line segments). Therefore our combine procedure has wide applicability.

6.3.3 Line Classification for BRep Solids

Algorithms for line/solid classification for BRep objects are very similar to the point classification algorithms discussed earlier in Section 6.2.3. This is not surprising, since point classification was done essentially by casting a ray and classifying it. However, in the PMC case we were free to choose a convenient ray that avoided singularities, whereas now we must classify the given input line segment. In addition, it is not sufficient to examine only the first point of intersection because we need to segment the input into its in, on and out subsets.

The BRep is assumed to contain neighborhood representations for all of its faces and edges. We represent classification results by a simplified form of the representation in the previous section. We store in ascending order the parameter value u for each point where the classification changes along the line, plus the classification value at a point u + ε. Recall that in Section 6.3.2 we also stored a neighborhood representation at u + ε.

We need a slightly modified point/face classifier ClassPtFaceN that returns not only an in/on/out value, but also the neighborhood of the point of intersection with respect to the solid (hence the suffix N). The neighborhoods for intersection points that are in the 2-D interior of a face or in the 1-D interior of an edge can be copied directly from the solid’s BRep. If the intersection occurs at a vertex, the neighborhood is assigned the special value Unknown. The algorithm in pseudo-code follows.

Algorithm 6.3.3.1

LineSolClassResultB ClassLineSolB(LineSeg L; Sol S) {

  IntListNC = Ø;    // Initialize a list of intersection points
    // Each point is represented by its u parameter value and
    // the Nbhd and ClassVal associated with it

  for each face F of solid S do {
    if not LineSurf(L,Surf(F)) then {
      // Predicate LineSurf(L,G) is true if L ⊂ G
      PList = IntLineSurf(L,Surf(F));    // Compute the intersections
        // If Surf(F) is curved there may be several intersections
        // Put them in a list PList
      for each point p in PList do {
        C = ClassPtFaceN(p,F);
          // Returns a classification value plus a neighborhood
          // If p is a vertex the Nbhd of C = Unknown
        if C.Value == (in or on) then Append(p,IntListNC);
      }
    }
  }    // End of face loop

  if IntListNC == Ø then {    // No intersections
    // L is entirely in, on or out
    q = SelectPt(L);    // Pick an arbitrary point of L and classify it
    return ClassPtSolB(q,S);
  }
  else {    // Intersection list processing
    Append(EndPoint1,IntListNC);
    Append(EndPoint2,IntListNC);
      // Add the endpoints and set their neighborhoods to Unknown
    IntListNC = Sort(IntListNC);
      // Sort by u value and merge coincident points
    until end of IntListNC do {
      if p.Nbhd == Unknown then p.ClassVal = Unknown
      else {
        Tvalue = TransitionSol(p);
          // Tvalue is a pair of elements such as (in,out)
        Previous(p).ClassVal = FirstElem(Tvalue);
        p.ClassVal = SecondElem(Tvalue);
      }    // End else
      Next(p);
    }    // End of list traversal

    // At this point we may miss classification values for some
    // segments because of intersections at vertices. Classify
    // the mid-points of such segments
    until last but one element of IntListNC do {
      if p.ClassVal == Unknown then {
        q = MidPoint(p,Next(p));
        p.ClassVal = ClassPtSolB(q,S);
      }    // End then
      Next(p);
    }    // End do
  }    // End of intersection list processing

  return IntListNC without the Nbhd field
}

Intersection list processing is illustrated in Figure 6.3.3.1. Points b, c and d are computed by intersecting the line segment L with the faces of the solid. (Actually b will be found twice, as the intersection of L with the left and back faces of the solid; duplicates are merged.) The endpoints of the line segment are assigned an Unknown neighborhood. Since b is a vertex, its neighborhood is also set to Unknown. After TransitionSol returns, we know that the transitions at c and d are (on,in) and (in,out), respectively, but TransitionSol is unable to determine the transition types of points a, b and e, which have Unknown neighborhoods. This situation is shown schematically in the first labelled horizontal line of the figure. The transition information at c implies that segment bc is on and segment cd is in. Therefore we can attach a ClassVal of on to b and a ClassVal of in to c. (Recall that in our representation the classification value of a segment is attached to the initial point of the segment.) The transition type of d implies a ClassVal of out for d and in for c. Note that we have found c’s value twice. This redundancy occurs most of the time in this algorithm, and can be avoided through more sophisticated programming, but contributes little to the overall cost. The value of b was found by propagation of the transition type of c, although the neighborhood of b was Unknown because it is a vertex. The second horizontal line in the figure shows the situation at the end of the first list traversal. The ClassVal for a is still Unknown, and so is e’s, but this latter is irrelevant and can be ignored. We classify the midpoint q of the segment ab and find that q is out, and hence so is the whole segment ab, as shown in the third horizontal line.

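[Figure: points a, b, c, d, e along the line segment. Transition types: ?, ?, (on,in), (in,out), ?. ClassVal after the first list traversal: ?, on, in, out, ?. ClassVal after classifying the midpoint q of ab: out, on, in, out, ?.]

Figure 6.3.3.1 – Intersection list processing in ClassLineSolB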
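The two list traversals can be sketched on their own, separately from the geometric computations. In the sketch below the sorted intersection list is assumed to be a list of dictionaries holding a parameter value u and a transition pair, with None standing for an Unknown neighborhood; the point classifier is passed in as a callback. These names and the data layout are assumptions made for illustration.

```python
def process_intersections(points, classify_midpoint):
    """Attach a classification value to the segment starting at each point.

    points: list sorted by 'u' of dicts {'u': float, 'transition': pair or None},
            where a pair such as ('in', 'out') comes from a transition function
            and None marks an Unknown neighborhood (endpoints, vertices).
    classify_midpoint: callback mapping a parameter value to 'in', 'on' or 'out'.
    """
    class_val = [None] * len(points)
    # First traversal: propagate transition types to adjacent segments
    for k, pt in enumerate(points):
        if pt["transition"] is not None:
            before, after = pt["transition"]
            if k > 0:
                class_val[k - 1] = before
            class_val[k] = after
    # Second traversal: classify midpoints of segments still unknown.
    # The value attached to the last point is irrelevant and is left as None.
    for k in range(len(points) - 1):
        if class_val[k] is None:
            mid = 0.5 * (points[k]["u"] + points[k + 1]["u"])
            class_val[k] = classify_midpoint(mid)
    return [(pt["u"], c) for pt, c in zip(points, class_val)]


calls = []


def stub_classifier(u):
    # Stand-in for a point classifier; records how often it is needed
    calls.append(u)
    return "out"


segment = [
    {"u": 0.0, "transition": None},            # a, endpoint
    {"u": 0.2, "transition": None},            # b, vertex
    {"u": 0.5, "transition": ("on", "in")},    # c
    {"u": 0.8, "transition": ("in", "out")},   # d
    {"u": 1.0, "transition": None},            # e, endpoint
]
result = process_intersections(segment, stub_classifier)
print(result)
```

In this example, which mirrors the figure, only the segment ab requires a midpoint classification.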


We avoided the difficult problems of representing vertex neighborhoods and reasoning about the transitions at vertices by using the strategy of testing an interior point of a segment by point classification. We could similarly avoid dealing with neighborhoods of points lying on edges (e.g., c in Figure 6.3.3.1), but these are considerably easier than vertex neighborhoods, and it is generally more effective to handle them directly. Note that midpoint classification amounts essentially to another line classification (without singularities), which costs almost as much as the original one, and therefore should be used sparingly.

6.4 Neighborhoods

This section discusses how to represent and combine neighborhoods, and how to infer transition types from neighborhood information.

6.4.1 Representation and Combination

The neighborhood of a point with respect to a geometric entity (e.g., a solid) can be represented explicitly, as we will see below, or implicitly, by pointers to entities that are adjacent to the point. For example, if a point is in the interior of an edge that is shared by two faces of a solid, pointers to these two faces suffice to define completely the local geometry of the solid in the neighborhood of the point. Implicit representations are used in many geometric modeling systems, but often lead to complicated algorithms for combining neighborhood information. In this text we focus on explicit neighborhood representations, which are relatively simple to combine.

The representations discussed below assume that geometric entities have associated coordinate systems that can be used as references, for example, for measuring angles. These coordinate systems can be defined as follows. We assume that all curves and surfaces are oriented at construction time. Thus, a surface constructor associates a reference normal with the surface being instantiated. It does not matter which of the two possible orientations for a surface normal is chosen, but the system must be able to know which is which. Reference normal assignment may be done explicitly, by attaching to the surface representation a normal vector to the surface at a given point, i.e., an applied vector. This vector must then be carried along with the surface representation and updated if the surface is moved in space. Alternatively, the normal may be defined implicitly. For example, a system may define a reference normal to spherical and cylindrical surfaces as the inward pointing normal, in the direction of decreasing radial coordinate.

Similarly, a curve constructor associates a reference tangent vector with the curve instance. Again, this vector can be represented explicitly and associated with the curve representation, or implicitly. Often, the reference tangent for a parametric curve is represented implicitly by agreeing that it points in the direction of increasing parameter values. For example, a line segment is usually represented by two endpoints p and q such that p corresponds to u = 0 and q to u = 1. The reference tangent then points from p to q. When a curve lies on a surface we use the curve’s tangent t and the surface’s normal n to define a binormal b = t × n.

The most important types of neighborhoods needed in geometric modeling algorithms are the following.


3-D face neighborhood: neighborhood of a point p that lies in the interior of a face with respect to a solid. The solid S is known, and so is the face F on which p lies. In addition, the face’s reference normal n is also known. Since the point is not on an edge and is not a vertex, there are only two possible situations: either the solid’s material locally is on the side towards which n points, or is on the opposite side. One bit of data, called a side bit, suffices to distinguish between these two situations. For example, we can represent the first case by a 1, and the second by a 0. A complete 3-D face neighborhood representation for a point p on a face F with respect to a solid S is then a pair

Nbhd(p,S) = (SideBit(p,S), RefNormal(p,S)).

Alternatively we can combine the two entities into just one normal vector that points towards the material side:

InwardNormal(p,S) = if SideBit(p,S) then RefNormal(p,S) else –RefNormal(p,S)

Figure 6.4.1.1 shows a point p on a cube’s face, and the associated reference normal n. The SideBit in the representation for the neighborhood of p with respect to S is a 0, signifying that n points away from the material. The corresponding InwardNormal representation is the vector –n. Note that both of these representation schemes also apply to curved surfaces.

Figure 6.4.1.1 – The side bit for p is 0.

3-D face neighborhood representations are easy to combine by comparing normal directions. We define a predicate

SameSide = (InwardNormal(p,A) == InwardNormal(p,B))

which is true when the two material sides coincide and false when they are opposite. Neighborhoods represented by their InwardNormal values can be combined as follows. We are given the neighborhoods of a point with respect to solids A and B and we want to combine these so as to obtain the neighborhood of the same point with respect to solid C = A ⊗ B, where ⊗ denotes a regularized Boolean operation.

Algorithm 6.4.1.1

NbhPtSol CombNbhdPtSol (NbhPtSol nA, nB) {

    // nA is the neighborhood of a point p with respect to solid
    // A and nB the neighborhood of the same point with respect
    // to B; p is in the interior of faces fA and fB of A and B.
    // Neighborhoods are represented by their inward normals
    // or the special values Full and Empty. Input neighborhoods
    // are assumed neither Full nor Empty.

    if SameSide then
        case ⊗ of {
            Union:        nC = nA;
            Difference:   nC = Empty;
            Intersection: nC = nA; }
    else
        case ⊗ of {
            Union:        nC = Full;
            Difference:   nC = nA;
            Intersection: nC = Empty; }

    return nC;
}

Figure 6.4.1.2 illustrates neighborhood combination in 2-D when the operator is union. The same algorithm applies to objects with curved surfaces, provided that the normals are computed at the same point p.

Figure 6.4.1.2 – 3-D face neighborhood combination for the union operator. On the left the rectangles A and B overlap, and on the right they just touch.
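The combination rule of Algorithm 6.4.1.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: normals are unit 3-tuples evaluated at the same point, the operator is passed as a string, and the names `inward_normal`, `same_side` and `comb_nbhd_pt_sol` are hypothetical stand-ins for InwardNormal, SameSide and CombNbhdPtSol.

```python
from typing import Tuple, Union

Vec = Tuple[float, float, float]
FULL, EMPTY = "Full", "Empty"   # special neighborhood values


def inward_normal(side_bit: int, ref_normal: Vec) -> Vec:
    """Fold (SideBit, RefNormal) into one vector that points towards the material."""
    if side_bit:
        return ref_normal
    return tuple(-c for c in ref_normal)


def same_side(nA: Vec, nB: Vec, tol: float = 1e-9) -> bool:
    """True when the two inward normals coincide.

    Assumes both are unit normals at the same point p, so they are
    either equal or opposite.
    """
    return all(abs(a - b) <= tol for a, b in zip(nA, nB))


def comb_nbhd_pt_sol(nA: Vec, nB: Vec, op: str) -> Union[Vec, str]:
    """Combine two face neighborhoods; op is 'union', 'difference' or 'intersection'."""
    if same_side(nA, nB):
        table = {"union": nA, "difference": EMPTY, "intersection": nA}
    else:
        table = {"union": FULL, "difference": nA, "intersection": EMPTY}
    return table[op]
```

For instance, two solids that touch along a face from opposite sides have opposite inward normals there, so their union yields a Full neighborhood and their regularized intersection an Empty one.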

3-D edge neighborhood: neighborhood of a point that lies in the interior of an edge with respect to a solid. The situation now is more complicated. We work in the plane normal to the reference tangent vector at p. This normal plane intersects the point neighborhood in a sector subtended by the intersections of the normal plane and the faces adjacent to the given edge (see Figure 6.4.1.3). All that we need is to represent this sector, as a pair of angles. To obtain an origin for measuring the angles we construct a local coordinate system with basis vectors (e1, e2, t) as follows:

e1 = y × t
e2 = t × e1

Here t is the tangent vector to the edge and y is the basis vector that corresponds to the y axis of the lab coordinate frame. If these two vectors are parallel we choose

e1 = z
e2 = x
t = y

We use the e1 axis as origin, and measure the angles in the (e1, e2) plane, with the positive sense being given by the motion of a right-handed screw as it advances in the positive t direction. For example, the sector shown on the right in Figure 6.4.1.3 is represented by the angular interval in degrees (–90,0).

Figure 6.4.1.3 – Representation for a 3-D edge neighborhood.
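The construction of the local frame used for measuring sector angles can be sketched as follows. The basis names `e1` and `e2`, the operand order of the cross products (chosen here so that the frame is right-handed with the edge tangent as third axis), and the normalization step are assumptions made for illustration, not details confirmed by the text.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(sum(c * c for c in a))


def normalize(a):
    n = norm(a)
    return tuple(c / n for c in a)


def edge_frame(t, tol=1e-9):
    """Return (e1, e2, t): e1 is the angular origin, e2 completes the normal plane."""
    t = normalize(t)
    y = (0.0, 1.0, 0.0)            # y axis of the lab frame
    e1 = cross(y, t)
    if norm(e1) < tol:
        # Tangent parallel to the lab y axis: use the fixed fallback
        # frame (z, x, y) described in the text.
        return (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    e1 = normalize(e1)
    e2 = cross(t, e1)              # unit length, since t and e1 are orthonormal
    return e1, e2, t
```

With this choice, e1 × e2 points along t, so counterclockwise angles in the (e1, e2) plane agree with the right-handed screw rule about t.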

Sector representations are easy to combine through regularized Boolean operations, as shown in Figure 6.4.1.4. If we replace the sectors by the corresponding arcs of, say, the unit circle, the combine procedure required is precisely the same as for combining circular arc classifications, discussed in Section 6.3.2.

Figure 6.4.1.4 – The regularized intersection of sectors (–45,45) and (22.5,67.5) is the sector (22.5,45), which can be computed by intersecting the circular arcs that correspond to the argument sectors.

Edge neighborhoods for non-manifold solids can be represented by lists of angular sectors, instead of single sectors. The circle classification procedures used for combining neighborhoods can handle such lists.
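As an illustration, the regularized intersection of two sector lists can be sketched in Python. This is a simplified stand-in for the circle classification procedures of Section 6.3.2, not a reproduction of them; it assumes each sector is a pair (start, end) in degrees with start < end and a span of at most 360, and the function names are hypothetical.

```python
def _pieces(sectors):
    """Rewrite sectors as non-wrapping pieces inside [0, 360]."""
    out = []
    for start, end in sectors:
        span = end - start
        s = start % 360.0
        e = s + span
        if e <= 360.0:
            out.append((s, e))
        else:                         # the sector crosses the 0/360 seam
            out.append((s, 360.0))
            out.append((0.0, e - 360.0))
    return out


def intersect_sectors(a, b):
    """Regularized intersection of two lists of angular sectors."""
    result = []
    for s1, e1 in _pieces(a):
        for s2, e2 in _pieces(b):
            lo, hi = max(s1, s2), min(e1, e2)
            if hi > lo:               # regularization: drop zero-width overlaps
                result.append((lo, hi))
    return sorted(result)


print(intersect_sectors([(-45, 45)], [(22.5, 67.5)]))   # [(22.5, 45.0)]
```

The printed result matches the example of Figure 6.4.1.4. Sectors that only share a bounding ray produce an empty list, and a result that straddles the 0/360 seam is returned as two pieces rather than merged.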

For objects with curved surfaces, a sector representation for edge neighborhoods can still be constructed, by using the tangents to the curves of intersection between the curved surfaces and the plane normal to the edge. But this representation is incomplete. It must be supplemented, for example, with pointers to the actual curved surfaces. The tangent approximation can be used to combine neighborhoods in most cases. We use the standard circle classification combine procedure and produce a new sector list. We also carry along with each angle the corresponding surface pointer and output a list of angular intervals, each with two associated surface pointers. This procedure fails when two distinct surfaces have the same tangent approximation in the normal plane to the curve. Figure 6.4.1.5 provides an example. The two circles in the figure have the same tangent. When this occurs, we resort to higher-order approximations for the curves of intersection of the surfaces with the plane. (This approach was used successfully in the PADL-2 system.) In essence we must sort the surfaces (or the curves of intersection of the surfaces with the normal plane) around the edge, so as to properly identify and process the (curved) sectors that lie between pairs of surfaces. We omit the details, because they are not very instructive and the first-order approximations fail very seldom.

Figure 6.4.1.5 – The linear approximation does not suffice for combining neighborhoods of points on the intersection of two tangent cylinders.

2-D edge neighborhood: neighborhood of a point that lies in the interior of an edge of a solid with respect to a face of the solid. We use the binormal to define a direction normal to the edge and tangent to the face’s surface. Then, a 1-bit representation suffices to distinguish between the two possible cases, much like in 3-D face neighborhoods (see Figure 6.4.1.6). In some geometric modeling systems 2-D neighborhood information is encoded in the orientation of the edges.

Figure 6.4.1.6 – The representation for the 2-D edge neighborhood of p is 0, since F is on the side of the binormal’s tail.

3-D vertex neighborhood: neighborhood of a vertex with respect to a solid. 3-D vertex neighborhoods are considerably more difficult to represent and combine than their edge or face counterparts. They are not needed in the algorithms discussed in this text. However, some geometric modeling systems are based primarily on vertex-neighborhood manipulations [Mantyla TOG], [ref on Noodles].

2-D vertex neighborhood: neighborhood of a vertex with respect to a face of a solid. These are very much like 3-D edge neighborhoods, and also are not needed in our algorithms.

6.4.2 Transition Evaluation

The algorithms we presented earlier for set membership classification with respect to BRep solids relied on our ability to infer the type of transition encountered when a line (or, more generally, a curve) intersects the boundary of a solid. We assume throughout this section that an observer travels along the curve in the direction of its reference tangent t, which usually is the direction of increasing parameter values.

Transition evaluation is very easy when a curve/solid intersection is non-singular, i.e., when the point of intersection lies in the interior of a face of the solid, and the intersection is “clean” (also called transversal). A transversal intersection occurs when a curve is not tangent to the solid’s face. Figure 6.4.2.1 illustrates the procedure in 2-D. It is clear from the figure that

t ⋅ n > 0 ⇒ out → in

t ⋅ n < 0 ⇒ in → out

Here t is the curve’s reference tangent and n denotes the inward normal to the face. If the inner product above is zero, the curve is tangent to the face. This cannot happen for line/plane intersections in our algorithms. If it did, the line would lie entirely in the plane and there would be a continuum of intersection points. Recall that in our algorithms we do not compute intersections of lines with faces in which they lie. However, for general curves and surfaces tangential intersections may occur. They are treated as singular intersections, by techniques similar to those described below.

Figure 6.4.2.1 – Transition evaluation for non-singular line/plane intersections
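The sign test for transversal intersections is short enough to write out. The following is a minimal sketch, assuming vectors are plain tuples and that tangential cases are reported as `None` for separate (singular) handling; the function name `transition` and the tolerance are illustrative choices.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transition(tangent, inward_normal, tol=1e-12):
    """Transition type for a curve crossing a face transversally.

    tangent: reference tangent of the curve at the intersection point.
    inward_normal: normal of the face pointing towards the material.
    """
    d = dot(tangent, inward_normal)
    if d > tol:
        return ("out", "in")     # moving with the inward normal: entering
    if d < -tol:
        return ("in", "out")     # moving against it: leaving
    return None                  # tangential: a singular intersection
```

For a line travelling in the –z direction that hits the top face of a cube, whose inward normal is also –z, the call `transition((0, 0, -1), (0, 0, -1))` reports an out-to-in transition.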

When the intersections are singular, we can take two approaches. Either we avoid the singularities, or we resolve them. Avoiding singularities is easy but not very efficient. Figure 6.4.2.2 presents a 2-D example. The line/rectangle intersection occurs at a vertex. Instead of reasoning about the vertex neighborhood so as to infer the transition type, we simply select and classify a point in the interior of a line segment between intersection points. The point classification is also the classification of the entire segment. Point classification is done by casting a ray, but the ray is arbitrary and can be selected to avoid singularities. The drawback of this method is that it requires additional ray classifications and therefore can be expensive.

Figure 6.4.2.2 – Avoiding singularities by classifying a segment’s midpoint.

Resolving vertex singularities is complicated, and will not be addressed in this text. Avoiding such singularities as explained above often leads to more robust and efficient algorithms. Also, vertex singularities occur rarely. Edge singularities are much more common, and can be resolved as follows.

We work in the plane normal to the edge at the point of intersection p, as shown in Figure 6.4.2.3. We assume that the line segment L to be classified does not lie in the host line of the edge E. (If it did, the computed intersection points in our algorithm would be vertices of the solid, and we would have vertex singularities, instead of edge singularities.) We project the line segment on the normal plane of the edge. The projection is another line segment. We represent the 3-D edge neighborhood by an arc (or several arcs, for non-manifold neighborhoods) on the unit circle in the normal plane, and intersect the projected line segment (extending it, if necessary) with the unit circle. Finally, we classify the two points of intersection q and r with respect to the arc(s) that represent the neighborhood. The transition type follows directly from these classification values, read in the order of increasing parameter (or reference tangent) for the line. We consider the unit circle parameterized by the angle about the center, and classify the points q and r simply by comparing their polar coordinates with the angles that correspond to the neighborhood.

Figure 6.4.2.3 – The line segment L intersects the solid at the point p in the interior of edge E. The neighborhood of p with respect to the solid cube is the thick arc in the plane normal to E at p, shown on the right. The projection of L on the normal plane intersects the unit circle at q and r. Point q is out of the neighborhood arc, and r is in. Therefore the transition type is (out,in).
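The edge-singularity procedure can be sketched as follows, under several assumptions: the normal plane is spanned by two orthonormal vectors called `e1` and `e2` here (hypothetical names), the neighborhood is a list of sectors (start, end) in degrees with start < end, and the line passes through p, so its projection meets the unit circle at two diametrically opposite points.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def classify_angle(angle, sectors, tol=1e-9):
    """Classify a polar angle (degrees) as 'in', 'on' or 'out' of the sectors."""
    for start, end in sectors:
        a = (angle - start) % 360.0
        span = end - start
        if tol < a < span - tol:
            return "in"
        if a <= tol or abs(a - span) <= tol or a >= 360.0 - tol:
            return "on"
    return "out"


def edge_transition(d, e1, e2, sectors):
    """Transition type for a line with direction d through a point on an edge."""
    x, y = dot(d, e1), dot(d, e2)          # projection of d on the normal plane
    if math.hypot(x, y) == 0.0:
        raise ValueError("line is parallel to the edge")
    ang_r = math.degrees(math.atan2(y, x))  # point r, ahead of p along d
    ang_q = ang_r + 180.0                   # point q, behind p
    return (classify_angle(ang_q, sectors), classify_angle(ang_r, sectors))
```

With the sector (–90,0) of Figure 6.4.1.3 and a line direction whose projection points at –45 degrees, the result is ('out', 'in'). An 'on' value signals that the projected line runs along a face, which this sketch does not resolve further.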

GMFC 4/12/99 1



6.5 Boolean Operations

This section discusses several problems related to computing BReps of solids that are combined by regularized set operations, usually called simply Boolean operations.

6.5.1 Boundary Evaluation and Merging

Regularized set operations have precise mathematical definitions, and therefore the evaluation of S = A ⊗ B is a well-defined mathematical problem. Here A, B, and hence S are r-sets, and ⊗ is a regularized set operator (union, difference or intersection). We saw in Section 6.1 that a mathematical problem may have many computational versions, depending on the representations selected for the input and output. If the input and output representations are CSG trees, the problem is trivial: the output CSG tree consists simply of the input subtrees joined by a new operator node. As a more interesting example, there are efficient algorithms for computing Boolean operations when the input and output are represented by octrees. In this text, we focus on Boolean operation calculations that involve BReps, because these are the most common and important versions of the general problem.

The algorithms to be discussed are all based on a fundamental fact: boundaries may be destroyed, but they cannot be created by Boolean operations. In precise terms, this can be expressed as follows.

Boundary Inclusion Relationship:

∂(A ⊗ B) ⊆ ∂A ∪ ∂B.

This is easy to prove rigorously. It implies that the BRep of a solid S that results from one or more Boolean operations can be computed by first generating a superset of its boundary as the union of the boundaries of the sets being combined, and then discarding those portions of the superset that are not on S. This is an instance of a commonly used algorithmic paradigm called generate-and-test. In pseudo-code the algorithm is as follows.

Algorithm 6.5.1.1

Generate a sufficient set of tentative faces F
for each F do {
    FwrtS = ClassFaceSol(F,S);
    AddToBRep(FonS); }

Generation of tentative faces is guided by the boundary inclusion relationship discussed above. Testing is done by set membership classification. Not shown in the pseudo-code above (and in the algorithms to be discussed in the next subsections) is a clean-up phase, in which adjacent faces lying on the same surface are merged, and links required by the specific BRep scheme being used are established. (It is possible to clean up as you go, by using a sophisticated AddToBRep routine. This makes the code harder to understand, and will be ignored in this text.)
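The generate-and-test idea is easiest to see in one dimension, where "solids" are unions of intervals and "faces" are their endpoints. The sketch below is an analogy rather than the face classifier of the text: it assumes each input is a list of disjoint, non-degenerate closed intervals, and the names `left_right` and `boundary_of` are hypothetical.

```python
def left_right(c, intervals):
    """1-D neighborhood of point c: (material just left of c, just right of c)."""
    left = any(a < c <= b for a, b in intervals)
    right = any(a <= c < b for a, b in intervals)
    return left, right


OPS = {
    "union": lambda a, b: a or b,
    "intersection": lambda a, b: a and b,
    "difference": lambda a, b: a and not b,
}


def boundary_of(A, B, op):
    """Boundary points of the regularized combination of A and B."""
    combine = OPS[op]
    # Generate: by boundary inclusion, every boundary point of the result
    # is an endpoint of A or of B.
    tentative = sorted({p for interval in A + B for p in interval})
    result = []
    for c in tentative:
        la, ra = left_right(c, A)
        lb, rb = left_right(c, B)
        # Test: c is on the boundary when exactly one side holds material.
        if combine(la, lb) != combine(ra, rb):
            result.append(c)
    return result


print(boundary_of([(0, 2)], [(1, 3)], "difference"))   # [0, 1]
```

Combining the one-sided neighborhoods, instead of testing the point itself, is what makes the result regularized: two intervals that merely touch have an empty intersection boundary.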

A sufficient set of faces is any set that includes ∂S. The boundary inclusion relationship immediately suggests that we use the faces of A and the faces of B to compute the BRep of S = A ⊗ B. If we do that, we have a boundary merging problem, which can be defined as follows.

Boundary Merging:

Given: BRep of A and BRep of B
Find: BRep of S = A ⊗ B

Suppose now that A and B have CSG representations. Applying the inclusion relationship recursively down each subtree, we conclude that the set of all the faces of all the primitives in the CSG representation of S also is a sufficient set. If we use this set we have a non-incremental boundary evaluation problem.

Non-Incremental Boundary Evaluation:

Given: CSG representation of S
Find: BRep of S

This is a pure representation-conversion problem, in which we convert a CSG tree into a BRep. A variation on this problem consists of converting a hybrid CSG/BRep into a BRep. Suppose that we represent S by a CSG tree in which the leaves are not primitive. Instead, they are solids represented by BReps. Now we use the faces of the leaf BReps as the set of tentative faces. (This approach was used in the boundary evaluator of the PADL-2 system.) Boundary merging is a special case of this problem in which the right and left subtrees are leaves, represented by BReps.

Non-incremental boundary evaluation can be useful, for example, to de-archive a stored object defined in CSG. However, it is inadequate to support interactive design. Suppose that a user is constructing an object by applying successive Boolean operations. A good user interface must show the user the result of each operation. This is normally done by computing a BRep and generating a graphic display from the BRep. If boundary evaluation were non-incremental, each small change in the object would cause a complete re-evaluation of the boundary, which is slow and inefficient. What is needed is a method for updating the boundary incrementally as the user constructs the object.

Incremental Boundary Evaluation:

Given: CSG representation of S, and BReps of its subtrees A and B
Find: BRep of S

Note that we could solve the incremental boundary evaluation problem by recursive application of a boundary merging algorithm, or we could ignore intermediate BReps and compute the final BRep for S non-incrementally (and slowly). But there are incremental evaluation methods that exploit both boundary information and CSG classification algorithms, and are worth studying.

Algorithm 6.5.1.1 could be implemented by writing a suitable face classifier. However, it is more profitable to look directly for the edges of the resulting solid. These edges are the primary components of the face representations that appear in the BRep of S. The next subsections discuss edge-based generate-and-classify algorithms for incremental boundary evaluation and for boundary merging.

6.5.2 The Generate-and-Test Paradigm for Edges

The basic algorithm is similar to the face-based generate-and-test Algorithm 6.5.1.1. We generate a sufficient set of tentative edges and then classify them. There is a subtlety. All we need to do in the face-based algorithm is to extract the on subset of each face, but now there may be portions of curves that are on S and yet are not edges of the BRep of S. The example in Figure 6.5.2.1 illustrates the issue. We assume that our BRep’s faces are c-faces, i.e., connected, maximal subsets lying on primitive surfaces. In the figure we subtract the block B from the L-shaped object A. Edge E is on S but it is not an edge of the face F of S. It lies in the interior of the face F. Such edges can be immediately discarded by checking their neighborhoods. If the neighborhood indicates that the sector filled with material is bounded on both extremes by the same surface, then the edge is on the boundary of S but is not on the boundary of a face of S. The generate-and-test algorithm for edges follows.

Figure 6.5.2.1 – The tentative edge E is on S = A −* B but is not an edge in its BRep. (3-D view at the top, and section view at the bottom.)

Algorithm 6.5.2.1

Generate a sufficient set of tentative edges E
for each tentative edge E do {
    EwrtS = ClassCurveSol(E,S);
    if not SingleSurf(NbhdSol(EonS,S))
        then AddToBRep(EonS); }
// SingleSurf is true if the neighborhood is bounded
// by the same surface on both extremes.

Here we are again ignoring the clean-up phase, in which the resulting edges are properly linked to the BRep structure, and so on. Note that it is very easy to group the edges into potential faces simply by remembering on which surfaces they lie. This information is available from the edge generation phase, as we will see shortly.

Two issues remain to be addressed: how to generate tentative edges, and how to classify them. The second was discussed earlier, in the section on curve/solid classification. We can use edge classification procedures based either on CSG or BReps, depending on which of these representations is available.

Unlike faces, new edges can be generated by Boolean operations. Because of the boundary inclusion relationship, however, any edge of the resulting solid must be a subset of one or more of the faces in a sufficient set of tentative faces. Figure 6.5.2.2 shows that the edges of a Boolean composition are either (i) subsets of the edges of the objects being combined or (ii) the intersection of two faces from different objects. We call the first self-edges, and the second cross-edges. Therefore we generate tentative edges by collecting all the self-edges of the objects being combined, and intersecting pairs of their faces to produce cross-edges. In practice, intersecting faces is difficult, because the faces can be of arbitrary complexity. Instead, we can generate an over-sized cross-edge, for example, by intersecting two supersets of the faces. For planar faces the simplest supersets are rectangles. These are very easy to intersect and, provided that each encloses a face, produce a tentative edge that is larger than necessary. This does not affect the correctness of the algorithms because tentative edges are classified and only their on subsets are output.

Figure 6.5.2.2 – Tentative edge generation
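For planar faces, every tentative cross-edge lies on the line where the two host planes meet. The following sketch computes that host line only; clipping it to the enclosing rectangles is omitted. It assumes each plane is given as a pair (n, d) meaning n · x = d, and the name `host_line` is hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def host_line(plane_f, plane_g, tol=1e-12):
    """Intersection line of two planes, as (point, direction), or None.

    None is returned for parallel or coincident planes, which yield no
    cross-edge (compare the Surf(F) == Surf(G) test in the algorithms).
    """
    (n1, d1), (n2, d2) = plane_f, plane_g
    u = cross(n1, n2)                           # direction of the line
    uu = sum(c * c for c in u)
    if uu <= tol:
        return None
    # Point on both planes: ((d1*n2 - d2*n1) x u) / |u|^2
    w = tuple(d1 * b - d2 * a for a, b in zip(n1, n2))
    point = tuple(c / uu for c in cross(w, u))
    return point, u
```

For the planes x = 1 and y = 2 this returns the point (1, 2, 0) and the direction (0, 0, 1). The resulting line is larger than any actual edge, which is harmless because tentative edges are classified afterwards.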

We are now ready to present simple but complete algorithms for Boolean operations. In the sequel, we assume that BReps contain face and edge neighborhoods.

6.5.3 Incremental Boundary Evaluation

We are given both a BRep and a CSG tree for two solids A and B and we want to compute the BRep of S = A ⊗ B. We consider as tentative faces the faces of A and the faces of B. Thus, the edges of A and B are the self-edges in the tentative edge generation step. Cross-edges are produced by intersecting supersets of the faces of A with supersets of the faces of B. These supersets can be, for example, faces of primitives, which normally are simple shapes.

After the tentative edges are generated, they must be classified with respect to S. We do this by using the CSG representation of S and the classification algorithms for CSG. Note that the classification of the self-edges of A with respect to A is known: the edges are on A and their neighborhoods are available in the BRep of A. Therefore we simply classify them with respect to B and combine the results to get the classification with respect to S. Similarly, the self-edges of B need be classified only with respect to A and the results combined with the known classifications with respect to B.

A complete algorithm in pseudo-code follows. (This algorithm is a simplified version of PADL-2’s boundary evaluator.)

Algorithm 6.5.3.1

for each face F of A do {                          // Self-edges of A
    for each self-edge E of F do {
        EwrtA = (E, NbhdSol(E,A));                 // Read from the BRep of A
        EwrtB = ClassEdgeSolC(E,B);                // Use CSG classification
        EwrtS = CombClassEdgeSolC(EwrtA, EwrtB, ⊗);
        if not SingleSurf(NbhdSol(EonS,S))
            then AddToBRep(EonS); } }

for each face F of B do {                          // Self-edges of B
    for each self-edge E of F do {
        EwrtB = (E, NbhdSol(E,B));                 // Read from the BRep of B
        EwrtA = ClassEdgeSolC(E,A);                // Use CSG classification
        EwrtS = CombClassEdgeSolC(EwrtA, EwrtB, ⊗);
        if not SingleSurf(NbhdSol(EonS,S))
            then AddToBRep(EonS); } }

for each face F of A do {                          // Cross-edges
    for each face G of B do {
        if not Surf(F) == Surf(G) then {
            E = F’ ∩ G’;                           // Use supersets: F’ ⊇ F, G’ ⊇ G
            EwrtS = ClassEdgeSolC(E,S);
            if not SingleSurf(NbhdSol(EonS,S))
                then AddToBRep(EonS); }
    }                                              // B face loop
}                                                  // A face loop

This algorithm has obvious inefficiencies, which are easy to fix. For example, cross-edges are computed twice.

6.5.4 Boundary Merging

Now we are given the BReps of solids A and B and we want to compute the BRep of S = A ⊗ B. We can use Algorithm 6.5.3.1 provided that we replace the edge/solid classification routines with their BRep counterparts. These were discussed earlier, in Section 6.3.3.

Instead of classifying the cross-edges with respect to the two solids A and B by using BRep algorithms, we can use face/edge operations, as shown in Figure 6.5.4.1. Observe that the intersection of the two faces F and G in the figure consists of two segments. One is a self-edge of G and the other is the intersection of the 2-D interiors of F and G, regularized in 1-D. The self-edge need not be considered here because it is processed within the self-edge loops of the algorithm. Therefore, what we need to compute is

E = r1(i2F ∩ i2G) ,

where the interiors are in the 2-D topologies of the host surfaces of the faces, and the regularization is in the 1-D topology of the host line of the cross-edge C. Now

i2F ∩ i2G ⊆ C ⇒ (i2F ∩ C) ∩ (i2G ∩ C) = i2F ∩ i2G.

Regularizing and using the standard notation for classification results yields

E = r1((i2F ∩ C) ∩ (i2G ∩ C)) = CinF ∩1* CinG,

where the regularized intersection is in 1-D.

Figure 6.5.4.1 – Edge/face operations

Therefore the procedure is as follows. First we generate an oversized cross-edge C as before, i.e., by intersecting supersets of the two faces F and G under consideration. Then we classify the edge with respect to each of the faces, and combine the results by intersecting the two in subsets. We classify the cross-edge C with respect to faces F and G by using a 2-D edge/face classifier for faces represented by their boundaries. Note that the edge E thus computed always has a partially full neighborhood with respect to S, regardless of the Boolean operator used to combine A and B. In other words, E is an edge of S. Therefore E = ConS, and we don’t need to further classify it.

A complete algorithm for boundary merging that uses the face/edge procedures just described follows.

Algorithm 6.5.4.1

for each face F of A do {                          // Self-edges of A
    for each self-edge E of F do {
        EwrtA = (E, NbhdSol(E,A));                 // Read from the BRep of A
        EwrtB = ClassEdgeSolB(E,B);                // Use BRep classification
        EwrtS = CombClassEdgeSol(EwrtA, EwrtB, ⊗);
        if not SingleSurf(NbhdSol(EonS,S))
            then AddToBRep(EonS); } }

for each face F of B do {                          // Self-edges of B
    for each self-edge E of F do {
        EwrtB = (E, NbhdSol(E,B));                 // Read from the BRep of B
        EwrtA = ClassEdgeSolB(E,A);                // Use BRep classification
        EwrtS = CombClassEdgeSol(EwrtA, EwrtB, ⊗);
        if not SingleSurf(NbhdSol(EonS,S))
            then AddToBRep(EonS); } }

for each face F of A do {                          // Cross-edges
    for each face G of B do {
        if not Surf(F) == Surf(G) then {
            C = F’ ∩ G’;                           // Use supersets: F’ ⊇ F, G’ ⊇ G
            CwrtF = ClassEdgeFaceB(C,F);
            CwrtG = ClassEdgeFaceB(C,G);
            E = RegInt1(CinF,CinG);                // Regularized intersection in 1-D. E = ConS
            NbhdSol(E,S) = CombNbhdSol(NbhdSol(E,A),NbhdSol(E,B),⊗);
            // The Nbhds of E with respect to A and B are copied from
            // the BReps of A and B
            if not SingleSurf(NbhdSol(E,S))
                then AddToBRep(E); }
    }                                              // B face loop
}                                                  // A face loop

Cross-edge processing requires a ClassEdgeFaceB procedure, which is the 2-D analog of the ClassEdgeSolB procedure discussed in Section 6.3.3. Recall that ClassEdgeSolB invoked a transition evaluation routine, which used the neighborhoods with respect to the solid for the points in the intersection list. These neighborhoods were assumed to be stored in the BRep with the face representations. Similarly, ClassEdgeFaceB will need access to neighborhoods with respect to faces for the points of intersection between C and the edges of each face. These neighborhoods are not normally stored in the BReps, but they can be inferred from their 3-D counterparts as follows (see Figure 6.5.4.2).

Figure 6.5.4.2 – Inferring a 2-D neighborhood from a 3-D neighborhood

We represent the neighborhood of the edge E with respect to the solid as explained in Section 6.4.1 (see Figure 6.4.1.3) by an angular sector (or, equivalently, an arc of the unit circle) on the plane normal to the edge. The neighborhood of E with respect to the shaded face F is represented by a side bit relative to the binormal b, also as explained in Section 6.4.1, Figure 6.4.1.6. To compute this side bit we intersect the binormal with the unit circle. If the intersection is on the boundary of the angular sector that corresponds to the 3-D neighborhood of E, as shown in 2-D on the right of the figure, then b points towards the face F. Therefore the 2-D neighborhood of E with respect to F is 1. If b intersected the unit circle at an interior point of the angular sector, then the side bit would be 0.

Cross-edges can be inferred from self-edges at considerable computational savings, for polyhedral solids. For example, in Figure 6.5.4.1 the vertices that bound the cross-edge E = ConS are intersections of either self-edges of F with face G or of self-edges of G with face F. It is easy to see that any vertex of a Boolean combination of two polyhedra (and hence any cross-edge vertex) must lie in a self-edge of one of them. The proof is as follows. A vertex of a polyhedron S = A ⊗ B must lie in at least three faces of S. Because of the boundary inclusion relationship, these faces are subsets of the faces of A and B. Since there are three faces and only two solids, a vertex must lie in at least two faces of one of the solids A or B, and therefore it must lie in a self-edge of that solid.

6.5.5 Low-Level Routines

Let us summarize here the most important computational utilities required by the Boolean operation algorithms discussed in the previous sections for general, curved solids.

• Surface/surface intersections for computing cross-edges.

• Curve/surface intersections. These are needed by edge/solid classifiers, regardless of whether they are based on boundaries or CSG. Edge/solid classification for BRep solids starts by intersecting the tentative edges with the host surfaces of the faces. For CSG, the edge/primitive classifiers must likewise intersect the edge with the surfaces of the primitives.

To be continued…

GMCh7 5/10/99 7-1



7. Application Algorithms

Unambiguous geometric models are potentially capable of supporting fully automatic algorithms for any applications that involve object geometry. However, only a few application algorithms have reached maturity and are routinely used in industry. This chapter discusses currently understood applications in graphics and simulation; mass-property calculation (i.e., volume, moments of inertia); and interference (i.e., collision) analysis. It also touches upon other applications such as planning for inspection and robotics, which are emerging from the research labs. Generally, analysis algorithms are well-developed, whereas synthesis algorithms, which require geometric reasoning, are not.

7.1 Graphics and Kinematic Simulation

7.1.1 Overview

Most of the rendering software in use today operates on BReps. Simple images are produced swiftly, but photo-realistic renderings, with texture, shadows, and so on, may take several minutes or even hours per image. If objects are modeled by using Boolean operations, the BRep must be evaluated before rendering, and this is a time-consuming procedure. Thus, although rendering itself is fast, the entire process suffers from what is sometimes called “the Boolean bottleneck”. Rendering methods that do not require boundary evaluation, and therefore avoid the Boolean bottleneck, are attractive for modeling systems that define objects primarily through Booleans. Graphic algorithms for BReps are covered in standard graphics texts, and are not discussed extensively here. We focus on algorithms that do not require explicit boundary information.

Graphic displays of solids and surfaces are typically produced in three styles:

• Wireframes.
• Line drawings with hidden lines removed.
• Shaded displays.

Wireframes – All the edges of the object are displayed, regardless of whether they are truly visible or not. For curved objects, which have few edges, displays often contain additional curves, usually called generators. These may be computed by intersecting the object with a set of parallel planes, or, more commonly, by tracing curves of constant parameter value in parametric surfaces. Figure 7.1.1.1 provides an example. Given parametric representations for the curves to be drawn, the display process consists of (i) stepping along the curve through suitable parameter increments, (ii) generating a piecewise linear approximation on the fly, and (iii) projecting the line segments on the screen, as discussed in Chapter 2. Parameter increments may be constant, or they may be smaller in regions of high curvature, for better approximation and visual effect. Wireframe displays are easy to generate from BReps (especially if generators are not used), but hard to interpret for complicated objects. Rotating a wireframe helps human users perceive the represented object (or objects, if the internal representation happens to be an ambiguous wireframe).

Figure 7.1.1.1 – Wireframe display of a curved solid.
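The three display steps for a parametric curve can be sketched as follows. This is an illustration with constant parameter increments; the orthographic projection stands in for the viewing transformation of Chapter 2, and the circle used as a generator curve is a hypothetical example.

```python
import math


def polyline(curve, u0, u1, steps):
    """Sample curve(u) at steps + 1 equally spaced parameter values."""
    return [curve(u0 + (u1 - u0) * i / steps) for i in range(steps + 1)]


def project_orthographic(p):
    """Assumed stand-in for the screen projection: drop the z coordinate."""
    return (p[0], p[1])


def circle(u):
    """Hypothetical generator curve: unit circle in the plane z = 1."""
    return (math.cos(u), math.sin(u), 1.0)


# (i) step along the curve, (ii) build the piecewise linear approximation,
# (iii) project each vertex; consecutive vertices form the drawn segments.
vertices_2d = [project_orthographic(p)
               for p in polyline(circle, 0.0, 2.0 * math.pi, 16)]
segments = list(zip(vertices_2d[:-1], vertices_2d[1:]))
```

An adaptive variant would replace the constant increment with one that shrinks where the curvature is high, at the cost of evaluating curvature along the way.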

Line drawings with hidden lines removed – Here the invisible edges or edge segments are not displayed. The resulting images are much more intelligible than their wireframe counterparts. However, hidden-line removal requires substantial computation. For objects with curved surfaces, profile or silhouette curves are also computed and displayed, for a more realistic effect. A silhouette corresponds to the points of the object where the surface normal is perpendicular to the line of sight. If p(u,v) is a generic point on the surface, n(u,v) is the normal vector to the surface at p, and the vector v is the viewpoint (not to be confused with the parameter v), the equation of a silhouette is (v – p) · n = 0. This is an equation in the two parametric variables u and v, and therefore implicitly defines a curve in parameter space. Note that silhouettes are not edges in a BRep, and are viewpoint dependent. Figure 7.1.1.2 presents a simple example that shows that some BRep edges are not displayed, because they do not correspond to discontinuities in the surface normal, and that some edges (silhouettes) that are not in the BRep are displayed. Hidden-line algorithms normally operate on boundary representations and are discussed at length in texts on Computer Graphics.

Figure 7.1.1.2 – A display with hidden lines removed.

Shaded displays – These are the most realistic, but also the most expensive to compute. They can be very sophisticated, with texture, reflectance effects, shadows, and so on. Figure 7.1.1.3 is an example of a high-quality shaded display of a metallic part. Often shaded displays are generated from boundary representations by visible surface algorithms, which are discussed in graphics textbooks. In this course we present only a very simple boundary representation algorithm, and then focus on shaded graphics algorithms for CSG, which are usually not described in the graphics texts. There are also algorithms for generating shaded displays directly from octree and voxel representations. These algorithms are simple, and especially attractive when implemented in hardware. Efficient software for rendering octrees is available commercially.

Figure 7.1.1.3 – A shaded display of an automotive part.

Kinematic simulation amounts essentially to displaying a sequence of images, or frames, in which an articulated object is shown in several poses along a trajectory of motion. (Here “frame” is not a coordinate system; it is used with its standard meaning in the animation field, i.e., as an individual image in an animation sequence.) Pose computation was discussed in Chapter 2. Major challenges in kinematic simulation are computational speed and bandwidth. For realistic motion effects the frame rate ideally should be at least 30 fps (frames per second). A 1,000 by 1,000 pixel display, with 24 bits of color per pixel, corresponds to 3 MB (megabytes) of data per frame. At 30 fps this is a data rate of 90 MB/sec, which is very large, and requires special hardware. In addition, motion tends to exacerbate certain imperfections in computed images. For example, if the silhouette of an object is not accurately approximated, the object/background boundary flickers annoyingly in the animation.
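The bandwidth figures quoted above can be checked with a few lines of Python. The function names are illustrative, and 1 MB is taken to mean 10^6 bytes.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed frame, in bytes."""
    return width * height * bits_per_pixel // 8

def data_rate_bytes_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed data rate for a given frame rate."""
    return frame_bytes(width, height, bits_per_pixel) * fps

per_frame = frame_bytes(1000, 1000, 24)
rate = data_rate_bytes_per_second(1000, 1000, 24, 30)
print(per_frame / 1e6, "MB per frame;", rate / 1e6, "MB/sec")
```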


7.1.2 Depth Buffering

Depth- or z-buffering is a simple rendering technique, well-suited for hardware implementation, and widely used in graphic accelerators. It is especially useful for objects represented by BReps in which the faces are very simple polygons such as triangles or quadrangles. Such BReps are also called tessellations; if all the faces are triangular, the tessellation is called a triangulation.

The basic z-buffer algorithm uses two arrays I[x,y] and Z[x,y]. The indices x,y range over all the pixels of the display window. I[x,y] stores the intensity, or color value, that corresponds to pixel x,y, and Z[x,y] is the corresponding depth. The depth, or Z value, is the distance between the viewpoint and the point on the object’s boundary that is being projected on the pixel; see Figure 7.1.2.1.

[Figure 7.1.2.1 labels: pixel x,y on the screen; boundary points p and q; viewpoint v]

Figure 7.1.2.1 – Depth buffering geometry, illustrated in section view.

The algorithm may be summarized as follows: initialize the buffers; scan each face; check the distance between each point on the face and the viewpoint; if the distance is less than that stored in the z-buffer, overwrite the z-buffer and compute the new intensity by using information about the surface normal and the light sources. At the end of the scan the intensity buffer contains the correct image values. Enough points must be generated in each face to ensure that all the pixels covered by the face’s projection have at least one corresponding face point.

In pseudo-code the algorithm may be expressed as follows.


Depth Buffering Algorithm for BReps

    for each [x,y] do
        Z[x,y] = BigNumber;               // “Infinite” distance
        I[x,y] = BackgroundColor;
    end;                                  // Initialization loop

    for each face F of solid S do         // Scan convert the face
        for each point P in a dense grid on face F do
            [x,y] = ProjectOnScreen(P);
            d = Norm(V-P);                // V is the viewpoint
                                          // Function Norm computes the length of a vector
            if d < Z[x,y] then
                Z[x,y] = d;               // Update buffers
                N = NormalToSurface(P);
                I[x,y] = Intensity(P,N,LightSources);
            end;                          // if
        end;                              // Scan loop
    end;                                  // Face loop

    Display(I[x,y]);
end;                                      // Algorithm
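The pseudo-code above can be turned into a small runnable Python sketch. Several simplifications here are assumptions of the example rather than part of the algorithm: faces are axis-aligned rectangles parallel to the screen, given as hypothetical tuples (xmin, xmax, ymin, ymax, z, color); the screen is the square [-1, 1] x [-1, 1] in the plane z = 0; and a flat face color stands in for Intensity(P,N,LightSources).

```python
import math

BIG_NUMBER = float("inf")
BACKGROUND = 0

def project_on_screen(p, viewpoint, size):
    """Perspective-project p onto the plane z = 0, then map [-1, 1]^2 to pixel indices."""
    t = viewpoint[2] / (viewpoint[2] - p[2])
    sx = viewpoint[0] + t * (p[0] - viewpoint[0])
    sy = viewpoint[1] + t * (p[1] - viewpoint[1])
    return (math.floor((sx + 1.0) * 0.5 * size),
            math.floor((sy + 1.0) * 0.5 * size))

def face_points(face, n):
    """Dense n-by-n grid of points on a rectangle parallel to the screen."""
    xmin, xmax, ymin, ymax, z, _color = face
    for i in range(n):
        for j in range(n):
            yield (xmin + (xmax - xmin) * (i + 0.5) / n,
                   ymin + (ymax - ymin) * (j + 0.5) / n,
                   z)

def depth_buffer(faces, viewpoint, size=32, n=200):
    """Return the intensity and depth buffers, indexed [y][x]."""
    Z = [[BIG_NUMBER] * size for _ in range(size)]
    I = [[BACKGROUND] * size for _ in range(size)]
    for face in faces:
        for p in face_points(face, n):
            x, y = project_on_screen(p, viewpoint, size)
            if not (0 <= x < size and 0 <= y < size):
                continue                    # projects outside the window
            d = math.dist(viewpoint, p)     # Norm(V - P)
            if d < Z[y][x]:
                Z[y][x] = d                 # update buffers
                I[y][x] = face[5]           # flat color in place of Intensity(...)
    return I, Z

near = (-0.2, 0.2, -0.2, 0.2, -1.0, 1)     # small face close to the viewer
far = (-0.8, 0.8, -0.8, 0.8, -2.0, 2)      # large face behind it
image, _ = depth_buffer([far, near], (0.0, 0.0, 5.0))
print(image[16][16], image[16][24], image[0][0])
```

The result does not depend on the order in which the faces are scanned, which is the point of the depth test.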

Figure 7.1.2.1 illustrates the need for depth testing. Points p and q project on the same pixel. If the z-buffer contains the depth that corresponds to q when p is reached in the face scan, then the buffers must be updated, to reflect the fact that only p is visible, because its depth is lower than q’s.

The z-buffer algorithm does not require any complicated geometric computations. For example, no intersections of surfaces or curves are computed. And it can support realistic rendering if the Intensity function is sufficiently elaborate. (Images produced with the algorithm as described have a jagged appearance, and should be anti-aliased. Anti-aliasing is a filtering operation necessary to combat sampling effects, and is described in standard graphics texts. It is analogous to low-pass filtering, familiar to electrical engineers.)

This z-buffer algorithm is not directly applicable to CSG representations, because the faces of an object are not explicitly available. But it can be extended to CSG by using the generate-and-test paradigm we encountered earlier, in the study of Boolean operations. We generate points on the faces of the primitive solids, which are guaranteed to contain the faces of the object, and discard some of the points by classifying them with respect to the solid. Only those points that classify on the boundary are processed.

The CSG algorithm differs from its BRep counterpart only by the addition of the classification test implemented by function ClPtSol(). Note that it is always possible to avoid points lying on edges or vertices of the solid in the face scans. If necessary, a point can be moved randomly by a small amount, so that the perturbed point projects on the same pixel and lies in the interior of a face. This significantly simplifies the representation and combination of point neighborhoods in the point classifier. Many other techniques are available for speeding up the computations; see [Rossignac & Requicha 1986].

In pseudo-code the algorithm is as follows.


Depth Buffering Algorithm for CSG

    for each [x,y] do
        Z[x,y] = BigNumber;               // “Infinite” distance
        I[x,y] = BackgroundColor;
    end;                                  // Initialization loop

    for each face F of each primitive of solid S do
                                          // Scan convert the primitive face
        for each point P in a dense grid on face F do
            [x,y] = ProjectOnScreen(P);
            d = Norm(V-P);                // V is the viewpoint
                                          // Function Norm computes the norm of a vector
            if d < Z[x,y] then
                if ClPtSol(P,S) == onS then
                    Z[x,y] = d;           // Update buffers
                    N = NormalToSurface(P);
                    I[x,y] = Intensity(P,N,LightSources);
                end;                      // Inner if
            end;                          // Outer if
        end;                              // Scan loop
    end;                                  // Face loop

    Display(I[x,y]);
end;                                      // Algorithm
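The extra test in the inner loop relies on a point classifier. A minimal Python sketch of such a classifier follows, assuming a hypothetical CSG encoding as nested tuples with spheres as the only primitives. The combination rules do not resolve the case in which a point lies on the boundary of both operands; as noted in the text, the face scans can avoid such points, so the sketch simply assumes they do not occur.

```python
import math

IN, ON, OUT = "inS", "onS", "outS"
EPS = 1e-9   # tolerance for deciding that a point lies on a surface (assumed value)

def classify_sphere(p, center, radius):
    """Classify point p against one spherical primitive."""
    d = math.dist(p, center) - radius
    if abs(d) <= EPS:
        return ON
    return IN if d < 0.0 else OUT

def complement(c):
    """Classification with respect to the complement of a solid."""
    return {IN: OUT, OUT: IN, ON: ON}[c]

def cl_pt_sol(p, node):
    """Classify p against a CSG tree.

    node is ("sphere", center, radius) or (op, left, right),
    with op one of "union", "inter", "diff".
    """
    kind = node[0]
    if kind == "sphere":
        return classify_sphere(p, node[1], node[2])
    a = cl_pt_sol(p, node[1])
    b = cl_pt_sol(p, node[2])
    if kind == "diff":                 # A - B = A intersected with complement of B
        b, kind = complement(b), "inter"
    if kind == "union":
        if IN in (a, b):
            return IN
        return ON if ON in (a, b) else OUT   # ON/ON is ambiguous, assumed absent
    if OUT in (a, b):                  # intersection
        return OUT
    return ON if ON in (a, b) else IN  # ON/ON is ambiguous, assumed absent

A = ("sphere", (0.0, 0.0, 0.0), 1.0)
B = ("sphere", (1.5, 0.0, 0.0), 1.0)
solid = ("diff", A, B)
print(cl_pt_sol((-1.0, 0.0, 0.0), solid))   # on A's face, outside B
print(cl_pt_sol((1.0, 0.0, 0.0), solid))    # on A's face, but inside B
```

Only points for which the classifier returns onS would update the buffers; the second point above lies on a primitive face but not on the boundary of the solid, so it is discarded.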

7.1.3 Ray Casting

Ray casting can be used with any representation scheme, but is most attractive for CSG. The basic idea is very simple: cast rays between the viewpoint and the screen pixels, and compute the first point of intersection between each ray and the object to be rendered. From information about this entry point determine the color to be displayed and write it to the appropriate pixel. In pseudo-code the algorithm is as follows.

Ray Casting Algorithm for CSG

    for each Pixel P[x,y] do
        R = V - P;                        // Create ray from viewpoint V to pixel P
        RinS = ClLineSol(R, S);           // Classify ray against solid S;
                                          // RinS is the portion of R inside S
        if RinS == 0 then
            I = BackgroundColor           // No intersection
        else
            Q = FirstPoint(RinS);         // Entry point into solid S
            N = NormalToSurface(Q);
            I = Intensity(Q, N, LightSources);
        end;                              // else
        Display(P, I);                    // Write color I on pixel P[x,y]
    end;                                  // Ray casting algorithm
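The line classification step can be sketched in Python as follows, assuming a hypothetical CSG encoding as nested tuples with spheres as the only primitives. The "in" portion of a ray is represented as a sorted list of parameter intervals, and Boolean combination is done by testing the midpoint of each elementary segment; this is one simple way to merge intervals, not necessarily the one used in any particular system. Normal and intensity computations are omitted.

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Parameter intervals [(t_in, t_out)] where the ray is inside the sphere."""
    oc = [origin[i] - center[i] for i in range(3)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(oc[i] * direction[i] for i in range(3))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc <= 0.0:
        return []                          # ray misses (or grazes) the sphere
    root = math.sqrt(disc)
    t0, t1 = (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)
    if t1 <= 0.0:
        return []                          # sphere is behind the ray origin
    return [(max(t0, 0.0), t1)]

def inside_at(t, intervals):
    return any(lo < t < hi for lo, hi in intervals)

def combine(op, A, B):
    """Boolean combination of two interval lists."""
    cuts = sorted({t for iv in A + B for t in iv})
    result = []
    for lo, hi in zip(cuts, cuts[1:]):
        mid = 0.5 * (lo + hi)
        a, b = inside_at(mid, A), inside_at(mid, B)
        keep = (a or b) if op == "union" else (a and b) if op == "inter" else (a and not b)
        if keep:
            if result and result[-1][1] == lo:
                result[-1] = (result[-1][0], hi)   # merge adjacent segments
            else:
                result.append((lo, hi))
    return result

def cl_line_sol(origin, direction, node):
    """Return RinS, the part of the ray inside the solid, as parameter intervals."""
    if node[0] == "sphere":
        return ray_sphere(origin, direction, node[1], node[2])
    return combine(node[0],
                   cl_line_sol(origin, direction, node[1]),
                   cl_line_sol(origin, direction, node[2]))

def first_point(origin, direction, r_in_s):
    """Entry point into the solid, or None if the ray misses it."""
    if not r_in_s:
        return None
    t = r_in_s[0][0]
    return tuple(origin[i] + t * direction[i] for i in range(3))

A = ("sphere", (0.0, 0.0, 0.0), 1.0)
B = ("sphere", (0.0, 0.0, 1.0), 0.5)
origin, direction = (0.0, 0.0, 5.0), (0.0, 0.0, -1.0)
r_in_s = cl_line_sol(origin, direction, ("diff", A, B))
print(r_in_s, first_point(origin, direction, r_in_s))
```

For the difference A - B the ray enters the solid at the bottom of the dimple carved by B, not at the top of A, which a per-primitive first-hit test would get wrong.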

Whereas the z-buffer algorithm scans object faces and projects their points on the screen, ray casting scans the screen and computes the intersections of rays with the object’s faces. Unlike z-buffering, ray casting requires line/face intersections, which can be difficult to compute when faces lie in complicated curved surfaces. Experimental results have shown that ray casting and depth buffering for CSG have comparable complexities, and that z-buffering has advantages when the object’s surfaces are complex.

The efficiency of ray casting algorithms may be increased by a variety of techniques; see [Roth 1982]. For example, we can sample the screen, and cast a ray for every eighth pixel, say, in a scan line. If two consecutive rays hit the same face of the object we interpolate the intensities for the in-between pixels without casting additional rays. If two faces are hit, we subdivide the distance and cast a new ray halfway between the two pixels, i.e., at a distance of four pixels from the first. Figure 7.1.3.1 illustrates the procedure. For simplicity, we assume the viewpoint is at infinity and the rays are parallel. We first cast rays 1 and 2, which hit Faces 1 and 2. Next we halve the distance between rays 1 and 2, and cast ray 3, which hits Face 1. We interpolate colors between pixels 1 and 3. Next we cast ray 4, which hits Face 2. We interpolate colors between pixels 2 and 4. Finally we cast ray 5, which hits Face 1, and we assign the correct color to pixel 5. Note that this procedure is not entirely safe, because we will miss a small protrusion or depression if it is situated between two rays that hit a single face.

[Figure 7.1.3.1 labels: screen; rays 1 to 5; Face 1 and Face 2, meeting at an edge]

Figure 7.1.3.1 – Generating line drawings by ray casting.

Interestingly, this technique can be used to generate line drawings with hidden lines removed. In essence, we are searching for edges, because we cast rays until we find two successive rays, in adjacent pixels, that intersect different faces. If we turn on the pixels where the transitions occur, we produce a display with hidden lines suppressed. In Figure 7.1.3.1, rays 4 and 5 bracket the intersection edge of Faces 1 and 2. We turn on the pixel that corresponds to ray 5. (Alternatively, we could have turned on the pixel of ray 4, or done some anti-aliasing average.) Figure 7.1.1.2 was produced precisely by this technique, using the ray casting algorithms of the PADL-2 system.
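The subdivision scheme for one scan line, and its use for finding edge pixels, might be sketched in Python as below. The callback face_at(x) is a hypothetical stand-in for casting a ray through pixel x and returning an identifier of the first face hit; copying the face identifier to the in-between pixels stands in for intensity interpolation. The sketch assumes a scan line of at least two pixels.

```python
def scan_line_faces(face_at, width, stride=8):
    """Return (faces, number_of_rays_cast) for one scan line."""
    faces = [None] * width
    cast = {}

    def hit(x):
        if x not in cast:
            cast[x] = face_at(x)       # at most one ray per pixel
        return cast[x]

    def fill(lo, hi):
        f_lo, f_hi = hit(lo), hit(hi)
        if f_lo == f_hi or hi - lo <= 1:
            for x in range(lo, hi):
                faces[x] = f_lo        # same face: no extra rays in between
            faces[hi] = f_hi
            return
        mid = (lo + hi) // 2           # two faces hit: cast a ray halfway
        fill(lo, mid)
        fill(mid, hi)

    samples = list(range(0, width, stride))
    if samples[-1] != width - 1:
        samples.append(width - 1)
    for lo, hi in zip(samples, samples[1:]):
        fill(lo, hi)
    return faces, len(cast)

def edge_pixels(faces):
    """Pixels where the face differs from that of the previous pixel."""
    return [x for x in range(1, len(faces)) if faces[x] != faces[x - 1]]

# Two faces meeting at an edge between pixels 20 and 21.
faces, rays = scan_line_faces(lambda x: 1 if x < 21 else 2, 33)
print(rays, edge_pixels(faces))
```

As the text warns, a face that covers only pixels strictly between two samples that agree would be missed by this procedure.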

Photo-realistic images may be obtained by ray casting by using a sufficiently elaborate Intensity function. Figure 7.1.1.3 was generated by ray casting on a CSG representation, through a rendering module developed by the Ford Motor Company for the PADL-2 system.

Ray casting is a relatively simple and inherently parallel computation, which can be done by using special purpose hardware. Existing experimental systems are capable of rendering complex objects in about 1 second [Voelcker et al ???]. A drawback of the ray casting approach is its viewpoint dependence. A change of viewpoint requires a complete recomputation. This implies that real-time rotation of shaded images produced by ray casting on CSG is not feasible in the current state of the technology, even with special hardware. In contrast, z-buffering hardware for BReps routinely supports real-time object rotation. As noted earlier, computation of BReps is relatively slow, and therefore BRep-based rendering also is slow when an object is displayed for the first time, because its boundary must be evaluated. Once the BRep is computed, however, BRep display is fast and meets real-time constraints.

7.1.4 Graphic Interaction

Defining objects in 3-D by using a 2-D screen is inherently difficult. The vendors of CAD/CAM systems, as well as many researchers, have spent much time and money in the design and implementation of graphic user interfaces (GUIs) for geometric modeling systems. User interfaces are critical for the industrial acceptance of CAD/CAM software. Today’s commercial systems have modern interfaces with the look-and-feel users have come to expect, with liberal use of menus, point-and-click, and drag-and-drop operations.

Relative positioning is used to establish the 3-D poses of primitives and sub-objects. Typically, one picks on the screen an entity or set of entities that serve as reference for positioning others. For example, pick a planar face and position another plane parallel to the first at a given offset distance. Defining poses through geometric constraints is intuitive and very convenient.

A good GUI requires two key capabilities: picking reference geometric entities from a screen, and creating other entities in constrained poses with respect to the references. The implementation of graphic picking depends on the primary representation used in a system, and on the type of display normally presented to the user. Most of the existing modeling systems are based on BReps, and display objects as line drawings, with or without suppressed hidden lines. The entities picked in such systems normally are edges. These can be selected by using standard capabilities of graphic packages.

In CSG systems, edges cannot be picked directly because BReps are not available. Instead, picking operations return faces, which can then be used as positional references, or as means of defining edges or vertices, by intersection. Face picking on CSG representations is implemented by casting a ray from the viewpoint to the mouse position, and classifying the ray. Experimental GUIs based on CSG representations and face picking have been demonstrated, and are comparable to their BRep counterparts [Encarnação & Requicha 199?].

Geometric constraint satisfaction is a complicated problem, beyond the scope of this course. For a survey and introduction to the area, see [Hoffmann ???].

7.2 Mass Property Calculation

7.2.1 Applications and Definitions

Mass properties such as weight and moments of inertia are important in many applications, as the following examples show. The most fundamental equation of dynamics, Newton’s equation F = ma, requires knowledge of the mass of the moving object. The weight of an object is critical in aerospace products such as those intended to fly inside satellites. The weight is less critical but still very important for the automobile industry, since it has a direct bearing on fuel consumption.

The center of gravity of an object is essential for stability studies. An object resting on a horizontal plane is stable if a vertical line through the center of gravity intersects the interior of the 2-D convex hull of the points of contact between the object and the plane; see Figure 7.2.1 for a 2-D example, in which the convex hull of the contact points is 1-D.

[Figure 7.2.1 labels: CG; contact points; convex hull]

Figure 7.2.1 – Stability check in 2-D
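For the 2-D situation of Figure 7.2.1 the stability test reduces to an interval check, as in the following Python sketch. The function name and the representation of contacts by their horizontal coordinates are assumptions of this example.

```python
def is_stable_2d(cg_x, contact_xs):
    """2-D stability check for an object resting on a horizontal line.

    The convex hull of the contact points is the interval [min, max];
    the object is stable if the vertical line through the center of
    gravity meets the interior of that interval.
    """
    return min(contact_xs) < cg_x < max(contact_xs)

print(is_stable_2d(1.0, [0.0, 3.0]))   # CG above the support interval
print(is_stable_2d(3.5, [0.0, 3.0]))   # CG overhangs the last contact
print(is_stable_2d(1.0, [1.0]))        # single contact: hull has empty interior
```

In 3-D the same idea would require a point-in-convex-polygon test on the horizontal projection of the center of gravity.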

The dynamics of a rotating body is studied through equations of motion in which the moments of inertia play a role analogous to that of mass in Newton’s equation. In addition, the moments and principal axes of inertia of an object provide a coarse characterization of its shape and can be used for discrimination in pattern recognition and computer vision. Several object recognition algorithms, especially in 2-D, use moments as one of the features that help distinguish the objects from one another.

Volume, mass, center of gravity, and moments of inertia of homogeneous objects are all defined by integrals of the form

$$I = \rho \int_V f(\mathbf{p})\, dv,$$

where p is a generic point of the solid V, dv is the volume differential, ρ is the density, and f is a polynomial function. The following table shows some examples.

Function f      Entity
1               Mass M
x/M             X coordinate of CG
x^2 + y^2       Moment of inertia about Z
xy              Product of inertia

There is an area of numerical analysis that is concerned precisely with the calculation of integrals. It is called numerical integration or numerical quadrature. However, numerical integration typically addresses the problem of computing integrals in which the domain of integration (the volumetric solid V) is geometrically simple and the function f is complicated. In geometric modeling we normally face just the opposite situation of a simple, polynomial f, but a complex domain V.

Integral evaluation is additive, in the sense that the integral over the union of two disjoint domains A and B is simply the sum of the integrals over A and over B. This implies that mass properties of objects represented by spatial decompositions can be computed by evaluating the contributions of individual cells in the decomposition and adding the results. If the cells have simple geometry, the integrals can be evaluated easily by closed form expressions. As a simple example, let f = z, and let V be a unit cube aligned with the principal axes and with left lower back vertex at p = (a, b, c); then

$$\int_a^{a+1} dx \int_b^{b+1} dy \int_c^{c+1} z\, dz = \frac{(c+1)^2 - c^2}{2} = c + \frac{1}{2}.$$

Standard calculus texts show how to find the closed-form expressions for the other integrals of interest for mass-property calculation over cubical domains. Therefore, spatial occupancy enumerations, octrees and other decompositions into cubes or cuboids are especially convenient for mass property calculation.
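A short Python sketch shows how the closed-form cell integrals add up over a decomposition. It covers only f = z; the extension to a cube of side s, which gives s^3 (c + s/2) by the same calculation, is included because octree cells come in several sizes. The function names and the (a, b, c, s) cell encoding are assumptions of this example.

```python
def integral_z_cube(c, s=1.0):
    """Integral of f = z over an axis-aligned cube of side s whose lowest z is c."""
    return s ** 3 * (c + s / 2.0)

def integral_z_over_cells(cells):
    """Sum the contributions of disjoint cubical cells given as (a, b, c, s)."""
    return sum(integral_z_cube(c, s) for (_a, _b, c, s) in cells)

# A column of three unit cubes stacked along Z.
column = [(0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 2.0, 1.0)]
total = integral_z_over_cells(column)
volume = sum(s ** 3 for (_a, _b, _c, s) in column)
print(total, total / volume)   # integral of z, and z coordinate of the centroid
```

One cube of side 2 and the eight unit cubes that tile it give the same total, as additivity requires.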

7.2.2 CSG Algorithms

The divide and conquer paradigm we have been exploiting for designing classification algorithms for CSG does not work for mass property calculation. The mass properties of two subtree objects cannot be combined to produce the mass properties of their Boolean combination. For example, the volume of the intersection of two objects cannot be expressed in terms of the volumes of the arguments.
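A tiny Python example makes the point concrete: two pairs of boxes with identical volumes have different intersection volumes, so no formula in the operand volumes alone can work. The box encoding as a pair of corner points is an assumption of this example.

```python
def box_volume(box):
    """Volume of an axis-aligned box ((x0, y0, z0), (x1, y1, z1)); 0 if empty."""
    (x0, y0, z0), (x1, y1, z1) = box
    return max(x1 - x0, 0.0) * max(y1 - y0, 0.0) * max(z1 - z0, 0.0)

def box_intersection(a, b):
    """Intersection of two axis-aligned boxes (possibly empty)."""
    lo = tuple(max(a[0][i], b[0][i]) for i in range(3))
    hi = tuple(min(a[1][i], b[1][i]) for i in range(3))
    return lo, hi

A = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
B_near = ((0.5, 0.0, 0.0), (1.5, 1.0, 1.0))   # overlaps half of A
B_far = ((2.0, 0.0, 0.0), (3.0, 1.0, 1.0))    # disjoint from A

print(box_volume(B_near), box_volume(B_far))            # same operand volumes
print(box_volume(box_intersection(A, B_near)),
      box_volume(box_intersection(A, B_far)))           # different results
```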

Therefore we must convert CSG into other representations that are more suitable for mass properties. We can evaluate the boundary and then use the BRep algorithms discussed in the next section. But it is easier to convert CSG into an approximate spatial decomposition and then add the contributions of each cell. There are several attractive options [Lee & Requicha 1982]:

• Spatial Enumeration
• Ray Representation
• Octree

To construct an approximate spatial enumeration we classify every point in a dense 3-D grid. If the point is in the solid we associate with it a filled cubical cell. Figure 7.2.2.1 illustrates this conversion procedure in a 2-D example. Instead of generating the points in a regularly-spaced grid it is better to generate them randomly within each corresponding cell, as shown in the figure. Randomness avoids systematic errors. Imagine, for example, that the left boundary of the object in the figure were moved right by, say, one third of a cell width. If the points were on a regular grid, all the cells would classify inside the solid and we would make a relatively large error. With random points, some cells would classify in and others out, giving a better approximation for the volume. This approach is known in the numerical analysis literature as a Monte Carlo procedure. The error associated with a Monte Carlo computation can be estimated by standard statistical techniques. Error estimates are very useful to guide a user in the selection of a suitable level of subdivision for the problem being solved. Smaller cells increase the accuracy of the result, at the expense of increased computation. Unfortunately the complexity of the algorithm increases linearly with the number of cells N, whereas the accuracy improves more slowly, in proportion to the square root of N [Lee & Requicha 1982].


Figure 7.2.2.1 – Spatial enumeration conversion by point classification
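The conversion by point classification can be sketched in Python as follows, with one random point per cell. A unit sphere stands in for the CSG solid so that the exact volume is known; in a real system the classify argument would be a CSG point classifier. The function names and the fixed random seed are assumptions of this example.

```python
import math
import random

def inside_unit_sphere(p):
    """Stand-in point classifier: True if p is in the unit sphere."""
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= 1.0

def enumerate_volume(classify, lo, hi, n, rng):
    """Approximate the volume of a solid contained in the cube [lo, hi]^3.

    One point is drawn at random inside each of the n^3 cells; a cell is
    counted as filled when its point classifies inside the solid.
    """
    h = (hi - lo) / n
    filled = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                p = (lo + (i + rng.random()) * h,
                     lo + (j + rng.random()) * h,
                     lo + (k + rng.random()) * h)
                if classify(p):
                    filled += 1
    return filled * h ** 3

rng = random.Random(0)                  # fixed seed so the run is repeatable
estimate = enumerate_volume(inside_unit_sphere, -1.0, 1.0, 20, rng)
exact = 4.0 * math.pi / 3.0
print(estimate, exact)
```

Only the cells crossed by the boundary contribute to the error, since every other cell classifies the same way wherever its point falls.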

For a ray representation we classify parallel rays (i.e., lines) in a 2-D grid to produce a set of columns of square cross-section. Again, it is advantageous to cast rays randomly to avoid systematic errors and to be able to use Monte Carlo error estimates. Figure 7.2.2.2 illustrates the procedure in a 2-D example. Ray representations are much more concise than spatial enumerations, and are considerably faster to compute.

Figure 7.2.2.2 – Ray representation conversion by line classification. For clarity, the classified lines within the columns are not shown.
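A corresponding Python sketch for the ray representation casts one randomly placed line parallel to Z per cell of a 2-D grid and adds up the column volumes. Again a unit sphere stands in for the CSG solid, so the "in" length of each line has a closed form; with CSG it would come from line classification. The function names and the fixed seed are assumptions of this example.

```python
import math
import random

def length_in_unit_sphere(x, y):
    """Length of the part of the line parallel to Z through (x, y) inside the unit sphere."""
    r2 = x * x + y * y
    return 2.0 * math.sqrt(1.0 - r2) if r2 < 1.0 else 0.0

def ray_rep_volume(length_inside, lo, hi, n, rng):
    """Approximate a volume by summing columns of square cross-section h by h."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = lo + (i + rng.random()) * h    # random line position within the cell
            y = lo + (j + rng.random()) * h
            total += length_inside(x, y) * h * h
    return total

rng = random.Random(0)                  # fixed seed so the run is repeatable
estimate = ray_rep_volume(length_in_unit_sphere, -1.0, 1.0, 40, rng)
print(estimate, 4.0 * math.pi / 3.0)
```

The 1,600 lines used here replace the many thousands of point classifications a spatial enumeration of similar resolution would need, which illustrates why ray representations are more concise.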

Let us turn now to CSG to octree conversion. We need to classify entire cells, which may be large. For a large cell, we cannot simply classify the cell center (or a random point in the cell) because there is no guarantee that the whole cell will have the same classification as the selected point. Misclassification of large cells will produce intolerable errors. Classifying the vertices of a cell also is not safe. Unfortunately, to be sure that a cell is entirely inside or outside a solid we have to do something equivalent to intersecting the cell with the solid and evaluating the boundary of the result. This is too expensive for practical use, since many cells have to be tested.


Figure 7.2.2.3 – Octree conversion by cell classification

(To be continued, but not this year...)
