COMPUTER DRAFTING OF HAND-DRAWN LINE SKETCHES

A THESIS

submitted for the award of the degree

of MASTER OF SCIENCE

in

COMPUTER SCIENCE AND ENGINEERING (by Research)

SHRIRAM REVANKAR

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

INDIAN INSTITUTE OF TECHNOLOGY, MADRAS 600 036

AUGUST 1987

TO Anna and Aayi

ACKNOWLEDGEMENT

With a deep sense of gratitude I acknowledge the

encouragement and guidance of Dr.B.Yegnanarayana, without which

this work would not have taken the present form. It is my

pleasure to thank Pramod Saini and Swaminathan who spent their

precious time in helping me during many stages of the thesis

work. I also thank Maria Dassou, Jai Kumar and my fellow research

scholars for their enthusiastic support.

I thankfully acknowledge the facilities provided by the

Information Sciences Laboratory. I thank Dr.Manohar and other

faculty for their help and guidance.

Shriram Revankar

CERTIFICATE

This is to certify that the dissertation entitled

"COMPUTER DRAFTING OF HAND-DRAWN LINE SKETCHES" is a

bonafide work of Revankar Shriram Venkatesh Shet,

carried out at the Department of Computer Science and

Engineering, Indian Institute of Technology, Madras,

for the award of degree of MASTER OF SCIENCE in

Computer Science and Engineering.

Research Guide, Professor and Head, Department of Computer Science and Engineering, Indian Institute of Technology, Madras, 600 036, INDIA.

ABSTRACT

This is an attempt to explore the issues involved in computer drafting of hand-drawn line sketches. In the envisaged system, a line sketch is input through a video camera interface. The resultant graytone image is binarized and from the binary image, sketch features are extracted. These features are then recognized and corrected. The output of the system is in the form of coordinates of the feature points and their connectivity information. This form is compact and can be easily stored, updated, duplicated or communicated over a wide area network.

The local thresholds estimated in the proposed binarization scheme exhibit good noise rejection properties. The scheme also extracts local line vectors at all edge points. Run tracing is used for extraction of lines from a binary image. The run length and line width information is used for efficient extraction of the lines. To recognize and correct the input sketch, the extracted features are represented in the form of a graph. Some techniques for graph reduction and classification are developed. The recognition process views this graph as a relational database, which answers queries pertaining to the stored geometric models. The geometric models are defined in the form of rules with adaptive thresholds. These models allow approximate matching. The correction process updates some nodes of the graph in accordance with the contextually consistent interpretations.

CONTENTS

1. INTRODUCTION

1.1 Problem of computer drafting of hand-drawn line sketches

1.1.1 Machine representation of an input line sketch

1.1.2 Recognition and correction of line features

1.2 Description of proposed drafting system

1.3 Approach to obtain the machine representation of line sketches

1.3.1 Binarization

1.3.2 Extraction of line features from a binary pattern

1.4 Approach to recognize and correct a machine represented line sketch

1.4.1 Representation of features of line sketches

1.4.2 Recognition and correction

1.5 Work environment

1.6 Input and output specifications

1.7 Organization of the thesis

2. COMPUTER PROCESSING OF LINE SKETCHES: A REVIEW

2.1 Conventional computer aided drafting

2.2 Binarization of graytone images

2.3 Feature extraction from binary line patterns

2.4 Recognition of line patterns

2.5 Contributions of the present work

3. MACHINE REPRESENTATION OF LINE SKETCHES

3.1 Input and output specifications of the preprocessing stage

3.2 Binarization of graytone images

3.2.1 Proposed binarization scheme

3.2.2 Analysis of the proposed binarization scheme

3.3 Extraction of lines from a binary image

3.3.1 Assumed image model for line feature extraction from a binary image

3.3.2 Proposed feature extraction scheme

3.3.3 Results of the line extraction process

4. RECOGNITION AND CORRECTION OF LINE SKETCHES

4.1 Graph representation of features of line sketch

4.1.1 Need for representation of features of a line sketch.

4.1.2 Suitability of graph representation

4.1.3 Reduction of graph representing a line sketch

4.1.3.1 Deletion of noisy segments

4.1.3.2 Restoration of intersection points

4.1.3.3 Deviation smoothing

4.1.4 Graph classification

4.2 Approximate geometric models

4.2.1 Rules describing standard geometric models

4.2.2 Thresholds for approximation

4.2.3 Priority ordering of the rules

4.3 Context module for line sketches

4.3.1 Structure of the context module

4.4 Correction of hand-drawn line sketches

4.4.1 Selection of reference line segment

4.4.2 Correction of individual geometric figures

4.5 Control strategy for recognition and correction stage

4.6 Implementation of recognition and correction stage

4.7 Results

5. CONCLUSION

6. REFERENCES

CHAPTER 1

INTRODUCTION

This research aims at making a machine intelligent enough to

accept and interpret a natural input in the form of hand-drawn

line sketches. The output of the envisaged system is a drafted

version of the input. In the process the sketch is encoded in a

compressed form which can be easily stored, updated or

communicated over a computer network. The thesis explores various

issues involved in this problem of computer drafting of hand-

drawn line sketches. In this chapter we give a brief account of

the problem, its background and our approach to solve it. The

organization of the thesis is given at the end of the chapter.

1.1 PROBLEM OF COMPUTER DRAFTING OF HAND-DRAWN LINE SKETCHES

A human being effortlessly recognizes a hand-drawn line

sketch, because that mode of representation is compatible with

his understanding of line sketches. But the same representation

may not make any sense to a machine which has absolutely no

knowledge about line sketches. Secondly, the machine has no

information about various symbols and figures which make sense to

a human being. Hence, if a machine has to draft an input line

sketch, it must be equipped with the necessary knowledge to

perceive and interpret the input. Thus the problem of computer

drafting comprises the following two major issues:

1. to transform the input into a machine representation.

2. to provide necessary knowledge and control strategy to

interpret the encoded input.

1.1.1 Machine Representation of an Input Line Sketch

A line sketch drawn on a plane sheet of paper can either be

input automatically through devices like a video camera or can be

encoded manually. The process of distinguishing various features

like line intersections and deviations, which a human being does

while encoding manually, has to be done by the machine if a

sketch is input automatically. When a hand-drawn line sketch is

input through a video camera, a graytone image is obtained. This

modification can be attributed to spatial quantization, gray

level quantization, noise and nonlinearity of the transducer. The

absence of a perfectly white background or a perfectly dark

object further reduces the contrast between the object and the

background. Hence an attempt to obtain a machine representation

of the input line sketch must involve a scheme to distinguish

object pixels from background pixels unambiguously. Such a scheme

is called a binarization scheme. To obtain a machine

representation from a binarized image, a set of features is to

be extracted such that the set uniquely characterizes the input.

These features include lines, deviation points, intersection

points, etc.

1.1.2 Recognition and Correction of Line Features

Recognition of line features by a machine is achieved by

parametric matching. If the input sketches were perfect, there

would have been an exact matching of the features of the input

with the stored object models. But hand-drawn sketches exhibit

deviations from the true objects represented by them. Such

sketches can only be recognized through approximate matching. To

obtain their drafted version, they must be corrected in

accordance with the recognition made. Hence a machine to draft

such inputs must be equipped with geometric models for

approximate matching and the corresponding correction procedures.

The input sketches are normally a combination of various

geometric figures. So recognition of any geometric figure

invariably involves search. This necessitates proper

representation of the image data. The representation chosen must

preserve the invariant properties of the image, which may

otherwise be destroyed during correction.

In some sketches, context also plays an important role. For

example the sketch in Fig.1.1 can be viewed as either a pattern

of one quadrilateral within the other, or four triangles

connected at their vertices. Under such conditions, the user can give

the context in which he wants the image to be corrected. So there

should be provision to communicate contextual information to the

machine. This involves a considerable extent of user interaction.

Fig.1.1 Illustration of the need for contextual information (The figure can be viewed as one quadrilateral within another or four triangles connected at their vertices)

Fig.1.2 An automatic drafting system

1.2 DESCRIPTION OF THE PROPOSED DRAFTING SYSTEM

A schematic representation of the proposed drafting system

is shown in Fig.1.2. Input to the system is a hand-drawn line

sketch whose dimensions are not specified (undimensioned), as

shown in Fig.1.3a. The video camera and an analog to digital

converter module convert the visual data into the corresponding

digital representation, which is a graytone image. The

preprocessing stage includes binarization and feature extraction

processes. The set of extracted features describes the input

image uniquely. Before these features are recognized and

corrected, they are represented in a suitable form, so that all

relations among the features are explicit. The recognition and

correction processes view this feature representation as a

relational data base, which answers queries pertaining to various

geometric models during recognition and gets updated during

correction. Knowledge about various geometric models and the

corresponding correction procedures are provided at the

recognition and correction stage. A context module is also

provided so that the user may provide the context in which the

drafted version has to be consistent.

Sample outputs at various stages of drafting are shown in

Fig.1.3b - 1.3d. Fig.1.3a is the input hand-drawn sketch.

a. INPUT HAND-DRAWN LINE SKETCH WITH AN INDICATOR LINE AT THE TOP LEFT CORNER.

b. BINARIZED IMAGE

c. PLOT OF THE FEATURES EXTRACTED FROM THE BINARY IMAGE, IN THE PREPROCESSING STAGE.

Fig.1.3 A LINE SKETCH AT VARIOUS STAGES OF DRAFTING.

1.3 APPROACH TO OBTAIN MACHINE REPRESENTATION OF LINE SKETCHES

1.3.1 Binarization

Binarization is basically a scheme to classify image pixels

into object or background pixels depending on whether or not the

gray values of the pixels are greater than a selected threshold. Some of the earlier

methods select global thresholds based on overall statistical

properties of an image[1]. These methods produce poor results if

the input image is noisy or not uniformly illuminated, or if the

object area is small. Local thresholding schemes overcome some of

these drawbacks[2]. Some binarization schemes use both local and

global thresholds[3],[4].

A new local thresholding scheme is proposed which is

adaptive in nature and has noise suppression property. Here we

assume that the darker the pixel, the higher its gray value. The

darkest point has a gray value of 255 and the brightest a value

of 0. In this scheme a 3x3 window is considered around a pixel to

be thresholded. It is assumed that any line in the sketch is at

least three pixels wide. Therefore any line edge has one of the

eight 5-pixel neighborhood patterns in a 3x3 window as shown in

Fig.1.4. The minimum of the average gray values of the 5-pixel

neighbors is used for the calculation of the threshold for the

central pixel. The edge pattern corresponding to the

minimum-average indicates the most likely background pattern, if the

pixel were to be on the edge of the line. The central pixel

belongs to the object if it has a gray value higher than the

minimum-average by a certain value.

If 'm' is the minimum of the average gray values of the set

of 5-pixel neighbors, the threshold is given by,

where K1 and K2 are positive nonzero constants, estimated either

experimentally or by image histogram analysis.

a. HORIZONTAL EDGES, [H], WEIGHTAGE = 1

b. POSITIVELY INCLINED EDGES, WEIGHTAGE = 2

c. VERTICAL EDGES, [V], WEIGHTAGE = 3

d. NEGATIVELY INCLINED EDGES, WEIGHTAGE = 4

Fig.1.4 Eight 5-pixel-neighbor patterns of the pixel P, with their vector weightages.

As can be seen from the threshold equation (1), the

threshold 'T' is high when the window is on the background (low

'm') and is low when the window is on the object (high 'm').

This property of the threshold suppresses noise specks on the

background and holes on the object. In addition, the scheme

extracts the local line vectors at all edge points.
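The thresholding step described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the exact eight patterns of Fig.1.4 and the form of the threshold equation are not reproduced in this text, so each pattern is assumed here to be a run of five consecutive positions on the ring of eight neighbors, and the threshold is passed in as a function of 'm'. The function `example_threshold` and its constants are hypothetical; they merely respect the stated property that the threshold is high for low 'm' and low for high 'm'.

```python
# Ring of the 8 neighbours of the centre pixel, in clockwise order (dy, dx).
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

# ASSUMPTION: each of the eight 5-pixel patterns is a run of five consecutive
# ring positions. The actual patterns of Fig.1.4 may differ.
PATTERNS = [[RING[(k + j) % 8] for j in range(5)] for k in range(8)]


def min_pattern_average(img, y, x):
    """Return m, the minimum of the eight 5-pixel-neighbour averages at (y, x)."""
    return min(
        sum(img[y + dy][x + dx] for dy, dx in pattern) / 5.0
        for pattern in PATTERNS
    )


def binarize(img, threshold_of):
    """Binarize a gray image given as a list of rows (dark = high gray value).

    threshold_of(m) maps the minimum-average m to the threshold T.
    Border pixels are left as background to keep the sketch short.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            m = min_pattern_average(img, y, x)
            out[y][x] = 1 if img[y][x] > threshold_of(m) else 0
    return out


def example_threshold(m):
    """HYPOTHETICAL threshold: high on the background (low m), low on the
    object (high m). The constants are illustrative, not K1 and K2."""
    return 160.0 - 0.5 * m
```

With this hypothetical threshold, an isolated gray speck on a bright background is rejected (low 'm' gives a high threshold), while a lighter pixel inside a dark line is retained (high 'm' gives a low threshold).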

1.3.2 Extraction of Line Features from a Binary Pattern

Most of the conventional methods use thinning to obtain line

and intersection information from the binary pattern of a line

image[5]. These methods are time consuming because thinning is a

computationally expensive iterative process. To reduce the time

complexity, recent methods directly vectorize the binary

patterns[6]. But this approach complicates the detection of

intersection points. A method to detect intersection points in

unthinned binary patterns is proposed by Sebock et al.[7].

A new method for line extraction, which exploits the line

width constraint, is proposed. Initially, the average line width

is found out by processing the indicator line segment drawn on

the top left corner of an input sketch. Using this width

information, lines are grouped into two classes. The first is a

vertical or nearly vertical class of lines and the second is a

horizontal or nearly horizontal class of lines. All the binary

patterns representing vertical lines are traced by raster scan

and the rest are traced by non-raster scan.

Lines are viewed as overlapping run patterns[7]. To extract

lines from these patterns, we use run tracing. After every trace

of specific number of overlapping runs, a vector is placed on the

core line represented by the runs. If the vector deviates from

the line extracted so far by more than a prescribed threshold,

the exact point of deviation is found out using a binary search

technique. Line extracted up to the point of deviation is stored

as a line segment and the remaining part of the vector is taken

as a new line extracted so far. On the other hand, if the

deviation of the new vector is less than or equal to the

prescribed threshold, the vector is assumed to be a continuation

of the line extracted so far. This process continues until all

lines are traced. The points where raster and non-raster traces

meet are points of intersection. These extracted features are

used for machine representation of the input line sketch.
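The idea of viewing a line as a pattern of overlapping runs can be illustrated with a small sketch. It is a simplification, not the scheme of this thesis: it only follows a nearly vertical line downwards through successive rows, records the mid-point of each run as a core-line point, and stops where the overlap is not unique. The function names are hypothetical, and vector placement, deviation search and intersection handling are omitted.

```python
def runs_in_row(row):
    """Return the runs of object pixels in one binary row as (start, end)
    column pairs, with end inclusive."""
    runs, start = [], None
    for x, value in enumerate(row):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))
    return runs


def overlap(a, b):
    """True if two runs from adjacent rows share at least one column."""
    return a[0] <= b[1] and b[0] <= a[1]


def trace_runs(image, row, run):
    """Follow a run downwards through overlapping runs and return the
    core-line points as (row, mid-column) pairs."""
    core = [(row, (run[0] + run[1]) / 2.0)]
    for r in range(row + 1, len(image)):
        candidates = [c for c in runs_in_row(image[r]) if overlap(run, c)]
        if len(candidates) != 1:
            break  # end of the line, or a branch that needs separate handling
        run = candidates[0]
        core.append((r, (run[0] + run[1]) / 2.0))
    return core
```

The returned core-line points are the kind of data on which a vector could be placed after every fixed number of overlapping runs.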

1.4 APPROACH TO RECOGNIZE AND CORRECT A MACHINE REPRESENTED LINE SKETCH

1.4.1 Representation of Features of Line Sketches

The extracted features of an input line sketch are to be

recognized and corrected. As the recognition process involves

search and pattern matching, the features must be suitably

represented. Wojcik[8] proposed one such representation in the

form of a graph. Nodes in the graph represent features of the

image and the arcs give the relation between the nodes they

connect. An example of such a representation of a triangle is

shown in Fig.1.5.

Fig.1.5 Graph representation of a triangle.

This graph can be easily reduced so that various noise

segments are eliminated and the image is smoothed wherever

possible. A new technique of node coloring is used to avoid

unnecessary search during recognition and correction. In general

the graph can be viewed as a three colored structure. The 'open'

color indicates that the nodes and arcs are parts of the open

loops and the search for a geometric pattern in such regions is

avoided; the 'soft' color indicates the regions of closed loops

where the search for geometric models is carried out; and the

'hard' color indicates the corrected regions and these nodes are

not modified. A graph representation depicting this

classification is shown in Fig.1.6. In this figure nodes S1,

S2, ..., Sm represent line segments of lengths L1, L2, ..., Lm

respectively. The line segments connect their end points

P1, P2, ..., Pn through the arc 'connects'. The nodes a1,

a2, ..., aL indicate the relative orientation between various

connected line segments. During correction, only the node values

of the graph are modified, keeping the structure intact. This

ensures preservation of various relations existing among the

features of the input sketch.
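A minimal sketch of the 'open' / 'soft' part of the coloring is given below. It assumes a hypothetical representation in which each segment node is stored with the labels of its two end points, and it approximates "part of a closed loop" by repeatedly discarding segments that have a dangling end point. A segment that merely bridges two loops would remain 'soft' under this approximation. The 'hard' color is not modeled, since it is assigned only after correction.

```python
def color_segments(segments):
    """Color each segment 'open' or 'soft'.

    segments: dict mapping a segment name to its (p, q) end-point labels.
    A segment is 'open' if it is removed while repeatedly stripping
    segments that have an end point shared with no other segment.
    """
    alive = dict(segments)
    while True:
        # Count how many surviving segments meet at each end point.
        degree = {}
        for p, q in alive.values():
            degree[p] = degree.get(p, 0) + 1
            degree[q] = degree.get(q, 0) + 1
        dangling = [
            name for name, (p, q) in alive.items()
            if degree[p] == 1 or degree[q] == 1
        ]
        if not dangling:
            break
        for name in dangling:
            del alive[name]
    return {name: ("soft" if name in alive else "open") for name in segments}
```

For a triangle with a two-segment tail attached to one vertex, the three sides come out 'soft' and the tail segments 'open', so the search for geometric models is confined to the triangle.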

1.4.2 Recognition and Correction

A machine is said to recognize a particular pattern, if it

can generate a unique symbol whenever it receives that pattern.

Some of the earlier methods of recognition use template

matching[9]. This kind of matching is not suitable for hand-drawn

line sketches, because an infinite number of variations exist for

the representation of a single object. To cope with the

recognition of such inputs, the concept of approximate matching

is proposed. Huntsberger et al.[10] have proposed fuzzy

geometric models to recognize approximate figures.

P1, P2, ... are segment end points; S1, S2, ... are line segments; L1, L2, ... are lengths of line segments; a1, a2, ... are relative orientations of connected segments.

Fig.1.6 The graph representation of features of a line sketch with coloring.

We have defined geometric models in terms of rules. An input

pattern is said to be recognized, if it satisfies these rules.

The rules contain various adaptive thresholds for approximate

matching. These thresholds maintain consistency in matching,

irrespective of the sizes of sides and/or angles of the input

sketch. The rules are given priorities to resolve the conflicts

that may arise when an input pattern satisfies more than one

geometric model.
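The flavor of rule-based approximate matching with priorities can be sketched as follows. This is an illustration only: the rules, their ordering and the tolerance value are hypothetical and are not the rules of this thesis, which are written in PROLOG. The tolerance is made adaptive in the simplest possible way, by scaling it with the magnitude of the quantities compared.

```python
def approx_equal(a, b, rel_tol):
    """Equality within a tolerance that scales with the compared values."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))


def all_sides_equal(sides, tol):
    return all(approx_equal(s, sides[0], tol) for s in sides)


def opposite_sides_equal(sides, tol):
    return (approx_equal(sides[0], sides[2], tol)
            and approx_equal(sides[1], sides[3], tol))


def all_right_angles(angles, tol):
    return all(approx_equal(a, 90.0, tol) for a in angles)


# HYPOTHETICAL rule base for four-sided closed loops, highest priority first.
# A pattern satisfying several rules is given the first matching name.
RULES = [
    ("square", lambda s, a, t: all_sides_equal(s, t) and all_right_angles(a, t)),
    ("rectangle", lambda s, a, t: opposite_sides_equal(s, t) and all_right_angles(a, t)),
    ("rhombus", lambda s, a, t: all_sides_equal(s, t)),
    ("quadrilateral", lambda s, a, t: True),
]


def recognize(sides, angles, rel_tol=0.1):
    """Return the name of the highest-priority rule satisfied by the pattern.

    sides: four side lengths in order; angles: four interior angles in degrees.
    """
    for name, rule in RULES:
        if rule(sides, angles, rel_tol):
            return name
    return None
```

A roughly drawn square also satisfies the rectangle and rhombus rules, so the ordering of the rule list is what resolves the conflict in favor of the more specific model.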

Once a pattern is recognized, a check for contextual

consistency of the recognition is carried out. Contextually

inconsistent recognition is revoked and an alternative

interpretation is sought. At this stage, provision for user

interaction is made to resolve the cases where all

interpretations of the pattern fail. Context is represented by a

structure similar to frames[11]. The default context is

automatically selected whenever the present context does not

provide any information about the pattern queried.

Patterns are corrected in accordance with their contextually

consistent interpretations. We have proposed some simple

geometric transformations for correction. Care is taken to ensure

that the area covered by the corrected sketch is

approximately equal to that of the input sketch. All irregular

patterns are corrected with reference to the orientation

information of their corrected neighbors. All open line segments

are corrected after correcting closed loops, because any

corrective shift given to an open line segment does not propagate

over to other parts of the sketch.

1.5 WORK ENVIRONMENT

The input equipment consists of a video camera and a

digitizer. The input line sketch is illuminated by incandescent

lamps. The data is processed on a sequential processing machine.

A bit-mapped frame storage CRT terminal is used for assessing

inputs and outputs at various stages of drafting.

The preprocessing stage is implemented in C language, as it

involves large array handling and mathematical calculations. But

once the features are extracted, the recognition involves

extensive symbol manipulation, relation matching, and search.

This implementation is straightforward and explicit in a

declarative language like PROLOG. PROLOG has the inherent

property of search and backtracking, which can be used for rule

matching and search for alternate solutions.

1.6 INPUT AND OUTPUT SPECIFICATIONS

Input consists of undimensioned line sketches, depicting

various geometric models and their combinations. The sketches may

be hand-drawn flow charts, simple house plans, block diagrams or

engineering design drawings. The hand-drawn sketches have

discrepancies like ragged straight lines, rounded corners, and

extended line segments. Various line lengths and angles

represented are only approximate measures. A pair of angles which

are supposed to be equal are only approximately equal. Similarly

a pair of sides which are supposed to be of equal length are only

approximately equal. The sketches are fairly well drawn and are

free from overwriting and scratching. The line width variations

do not change the information content. Fig.1.7a gives an example

of the input line sketch.

The output is a perfect geometric representation, which

closely approximates the input line sketch. Various points and

segments of the input sketch are modified, reoriented or shifted,

within a range specified by the thresholds. This necessarily

means that all the rounded corners are sharpened, and that the

extended and spurious line segments are eliminated. Fig.1.7b

shows a drafted version of the input sketch of Fig.1.7a.

1.7 ORGANIZATION OF THE THESIS

In the presentation of this thesis, we assume the reader to

be familiar with two dimensional signal processing and logic

programming. We feel that the development of a full fledged

system for automatic drafting cannot be just a cascade of various

modules explained in this thesis. As the objective of this work

is to explore the issues in computer drafting of line sketches

and to suggest viable solutions, no implementation effort is made

to process large sketches which occupy memory area greater than

the allowable array size in the machine. Fig.1.7 gives an idea

of the size and type of sketches on which various proposed

algorithms were evaluated. We would also like to clarify that

many aspects of the algorithms proposed cannot be explained

mathematically, but an effort is made to justify such aspects

intuitively. Supporting results and illustrations are given

wherever appropriate.

a. INPUT GRAYTONE IMAGE

b. DRAFTED VERSION OF THE INPUT

Fig.1.7 SAMPLE INPUT AND OUTPUT OF THE DRAFTING SYSTEM.

A review of the state of the art in the field of computer

processing of line sketches is given in chapter II. The chapter

concludes with a note on specific contributions of this thesis.

Chapter III deals with extraction of line features from the

graytone image of an input line sketch. New algorithms for

binarization and line extraction are proposed. In the

binarization scheme, edge patterns of the lines in a window are

taken as the reference for threshold selection. It has good

noise suppression properties and provides local line vectors at

all edge points. The algorithm for line extraction and

segmentation does not involve the computationally expensive process

of thinning. Here, a binary pattern is viewed as a structure of

overlap of runs of object pixels. A trace of overlapping runs is

directly vectorized to get lines and their deviation points. Line

intersection points are extracted through analysis of run length

information.

Recognition and correction processes are discussed in

chapter IV. The recognition process is basically a search for a

pattern in the input which approximately matches with standard

geometric models. The models are defined in the form of rules.

Emphasis is also given to the structural representation of input

sketch features. Correction involves reorientation and

modification of input sketch parameters, in accordance with the

recognized model and the relations existing between the figure to

be corrected and its neighbors.

Chapter V concludes the presentation with a brief summary of

the thesis and a few samples of outputs at various stages of

processing.

CHAPTER II

COMPUTER PROCESSING OF LINE SKETCHES: A REVIEW

Technology for improving man-machine communication

has made tremendous advances in the last four decades, but

the goal envisaged still looks as distant as it was in the

beginning. In most cases the source and the size of the

information necessary for processing a natural input are quite

obscure. In the absence of a clear view of the problem, many of

the methods proposed towards providing natural inputs to a

machine become useless as soon as the input crosses the

protective barrier (or constraints) set by their creators. To

overcome this problem, many researchers have tried to study and

emulate the details of biological processes involving the

functions like vision, problem solving, hearing, etc. One of the

vision problems is computer processing of line sketches.

Computers are conventionally used as an aid to draft

various kinds of sketches. Two different approaches are observed

in this kind of machine usage. Section 2.1 gives the details of

these approaches and their drawbacks.

Recent developments are directed towards automating the process of

drafting. Some of these systems receive on-line input and some

others off-line. Off-line input devices invariably generate a

graytone image, irrespective of the type of input. Sections 2.2 and 2.3 describe various existing schemes for binarization of

graytone images and line extraction from binary images,

respectively.

Recognition of extracted features of a line sketch has been

studied extensively. Some of the early methods use template

matching for recognition of input data. Machine recognition of

natural inputs involves approximate matching. The recent trend is to

develop fuzzy models which match with a set of approximate

patterns rather than a single pattern. A review of these schemes

is given in section 2.4.

Section 2.5 carries a note on specific contributions of the

present thesis in the light of various published methods.

2.1 CONVENTIONAL COMPUTER AIDED DRAFTING

Commercially available computer aided drafting systems still

act as just passive receivers with little or no knowledge of the

task domain. They rely heavily on encoding capacity of a skilled

draftsman[12]. Though these systems surpass the conventional

drafting both in time efficiency and in output quality, the

draftsman's job is still tedious and his communication with the

machine is highly unnatural.

There are mainly two approaches in human encoding of line

drawing into a computer[12],[13]. The first approach uses an

interactive method in which a graphic work station with a

'computer aided drafting' (CAD) software package is used

extensively. Here a draftsman's job is complicated, because to

make use of the environment effectively, he has to detect

various regular structures present in a drawing to be drafted and

encode that information into the machine with proper size,

location and orientation information. This mode of thinking is

different from thinking involved in human drafting and it compels

a draftsman to have a panoramic view of the encoded drawing. But

if the resolution of the screen is small or the screen size

itself is small, large drawings may not fit into a single

display. A partial display interferes with the draftsman's thought

process. Secondly, editing of the encoded image must also be done

on a chunk of information which was encoded at a time, rather

than individual lines and angles. This may lead to a chain of

corrections and is time consuming.

The second approach is closer to manual drafting. Here a

draftsman encodes coordinates of various feature points like

deviation and intersection points of the drawing through a

coordinate sensitive yoke and connects these points as in the

drawing. This necessitates the drawing to be within the limits of

the coordinates of the pointer. Although a draftsman may find

this easily maneuverable, he must endure the lengthy and monotonous

task of inputting each intersection and deviation point of every

line of the drawing. The job becomes more tedious if curves are

present in the drawing. Therefore the automation of encoding line

sketch into a computer has been an important issue. Many efforts

have been made to make a machine encode a line sketch by

itself[13],[15]-[19].

2.2 BINARIZATION OF GRAYTONE IMAGES

All off-line input devices like video camera, facsimile,

etc., generate a graytone image, when they digitize an input. To

extract the input information from the graytone image, the object

pixels must be distinguished from the background pixels. This

classification is viewed as a binarization (or binary

thresholding) problem, in case of images of binary pictures.

Binarization is carried out by classifying every pixel in the

image as black (or object) or white (or background), depending on

whether or not the gray value of the pixel under classification

is greater than a suitably selected threshold.

Thresholding schemes can be broadly classified into

conventional schemes and information-integrating schemes[26].

Conventional techniques look for either maximal regions

satisfying some homogeneity criterion or edge information between

the regions, whereas the information integrating techniques try

to emulate the biological system approach for processing visual

data. In the information integrating techniques, information from

various related sources are used to put constraints on the data

to be processed, to obtain an unambiguous result.

In conventional schemes a variety of threshold selection

techniques have been proposed. Each technique assumes a certain

image model. Most of them perform satisfactorily on the images

which satisfy the assumed model. Thresholding schemes fall into

three main categories.

1. Global thresholding

2. Local thresholding

3. Dynamic thresholding.

In general, a threshold at any point can be expressed as a

function of the gray level at that point, its neighbors, overall

gray level distribution of the image, and the spatial location of

the point under consideration. If the image is of 'L' levels,

then the threshold at the 'ith' level (0 < i < L) is defined as

$$T_i(x,y) = f_i\big(g(x,y),\; N_i(x,y),\; I_n,\; x,\; y\big)$$

where fi is the thresholding function for the 'ith' level,

g(x,y) is the gray level of the pixel at (x,y),

Ni(x,y) is a set of neighbors of pixel (x,y),

In is the gray level distribution of the complete image.

If Ti(x,y) is a function of In only, then it is called a global

threshold. The global threshold for any level is computed only

once for the entire image. If the threshold Ti(x,y) is a function

of g(x,y) and Ni(x,y), it is called a local threshold. If Ti(x,y)

is a function of the coordinates (x,y) of the pixel under

consideration, it is called a dynamic threshold. Most of the

recent methods use a combination of these techniques.

A global technique based on prior information of object area

of the image is proposed by Doyle[21]. It was developed for

binarization of similarity invariant patterns. Here a threshold

is selected such that only a certain number of pixels equivalent

to the known object area, have gray level greater than the

threshold. This method is naturally not applicable, if the object

area is unknown or varies from picture to picture.
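
A threshold selection of this kind, often referred to as the p-tile method, can be sketched as follows. This is only an illustration and not code from the cited work; the function name `p_tile_threshold` and the use of NumPy are our assumptions. Object pixels are taken to have the higher gray values, as in the text.

```python
import numpy as np

def p_tile_threshold(gray, object_area):
    """Return a threshold t such that at most `object_area` pixels satisfy gray > t.

    Hypothetical helper for illustration; `object_area` is the known
    number of object pixels.
    """
    values = np.sort(np.asarray(gray).ravel())[::-1]  # gray values, largest first
    if object_area <= 0:
        return int(values[0])          # no pixel exceeds the maximum
    if object_area >= values.size:
        return int(values[-1]) - 1     # every pixel exceeds this value
    # The (object_area + 1)-th largest value: only pixels strictly above it remain.
    return int(values[object_area])
```

When several pixels share the gray value at the cut, fewer than `object_area` pixels exceed the threshold, which is one reason the method needs a reliable area estimate.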

For the purpose of segmenting white blood corpuscles, Prewitt

and Mendelson[22] chose the threshold at the valleys of the image

histogram. This technique is called the mode method. This

technique involved smoothing of the histogram to remove spurious

modes and valleys of the histogram. The smoothened histogram was

then searched to find out the local maxima (or modes) of the

histogram. Then the threshold was selected at a value between the

two modes. One of the recent methods proposed on these lines is

an automatic binarizing scheme which can be extended for

multilevel thresholding[1]. Here an optimal threshold is selected

by using the discriminant analysis of Fukunaga[23]. The

discriminant measures maximize the separability of the resultant

classes of pixels. The procedure uses the 0th and the 1st order

cumulative moments of the image histogram. Hence the computation

time is a linear function of the size of the image.
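
The use of the 0th and 1st order cumulative moments can be illustrated with the following sketch of a between-class variance maximization. It is our reading of the discriminant criterion and not a transcription of the cited algorithm; the function name `otsu_threshold` and the default of 256 gray levels are assumptions.

```python
import numpy as np

def otsu_threshold(gray, levels=256):
    """Pick the level k that maximizes the between-class variance.

    Pixels with gray value > k form one class, the rest the other.
    """
    g = np.asarray(gray, dtype=np.int64).ravel()
    hist = np.bincount(g, minlength=levels).astype(float)
    p = hist / hist.sum()                      # normalized histogram
    omega = np.cumsum(p)                       # 0th order cumulative moment
    mu = np.cumsum(p * np.arange(p.size))      # 1st order cumulative moment
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(p.size)
    valid = denom > 0                          # skip levels where one class is empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))
```

The histogram is built in one pass over the image, and the search over levels is independent of the image size, which is consistent with the linear computation time noted above.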

In cases of out of focus images, or images with object areas

very small compared to the background area, bimodality of the

image is not distinct. It is difficult to locate histogram

valleys in such images. To overcome this difficulty, some of the

global methods study local properties of the image[24]-[26].

These methods basically transform the image histogram so that the

valley gets enhanced. In determining how each point of the image

should contribute to the transformed histograms, the rate of

change of gray levels around that point as well as the gray level

at that point, are considered. Normally changes in gray levels

occur at the edges common to both the object and the background.

So the rate of change of gray level is also termed as 'edge

value'. Edge values are found out through edge operators like

Laplacian, Roberts cross, DIFl (maximum difference of average gray

level in pairs of horizontally and vertically adjacent 2-by-2

neighborhoods), etc.[4]. The points at the interior of the object

or background generally have low edge values because of

uniformity of the surroundings, while those on the

object/background boundaries have high edge values. Thus if a

histogram is obtained for only low edge value pixels, histogram

peaks remain the same while the valley becomes prominent[25]. On

the other hand, if a histogram of pixels having only high edge

values is obtained, the histogram should have a single peak at

the valley point of the image histogram. Alternatively, a

weighted histogram is obtained by counting the higher edge values

more heavily[27],[28].

Weszka et al.[26] used a pure Laplacian operator to define

the edge value. Since these are second derivative operators, they

have a zero value on a linear ramp formed at the

object/background boundaries, but high edge value on the

shoulders on either side of the ramp. Thus the points having high

Laplacian values will be adjacent to, but not on, the boundaries.

The histogram of high edge value pixels should now have two peaks

representing the shoulders of the boundary and a deep valley

representing the boundary.
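
The behaviour of a second derivative operator on a ramp can be checked on a one-dimensional gray level profile. The sketch below uses a discrete second difference as a stand-in for the Laplacian; the profile values and the function name `second_difference` are illustrative assumptions.

```python
import numpy as np

def second_difference(profile):
    """Discrete second derivative p[i-1] - 2*p[i] + p[i+1] at interior points."""
    p = np.asarray(profile, dtype=float)
    return p[:-2] - 2.0 * p[1:-1] + p[2:]

# Background (0), a linear ramp (1, 2, 3) and object (4): a smeared boundary.
ramp_profile = [0, 0, 0, 1, 2, 3, 4, 4, 4]
response = second_difference(ramp_profile)
```

For this profile the response is nonzero only at the two points where the ramp meets the flat regions, and zero along the ramp itself.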

All the above described global methods assume that the

object and the background gray levels are uniform in their

respective regions. This is not true in practice, especially in

the images of hand-written characters or hand-drawn line

sketches, because of uneven stress put by hand. Global

thresholding of such images may lead to generation of

disconnected binary patterns. Moreover, all the global methods

are sensitive to noise, because irrespective of its surroundings,

a pixel is classified purely on the basis of its gray value. To

handle this situation, local thresholding schemes were proposed.

Here a pixel is classified by comparing its gray value with the

gray value obtained from a set of neighbors. The spatial

distribution of this set is called a window. The size and shape

of the window vary from method to method. In some methods the

window is derived from an assumed minimum width of the object

pattern; in others it is selected by experimental studies. A review of such schemes is given by

Weszka[24] and Ullmann[2]. Wo15[29] assumed that in textual

images, width of a limb does not exceed a known limit. He

proposed that a pixel is an object point, if it has a gray value

higher than the gray values of a certain pair of neighboring

pixels, by more than a specified constant. Ogawa et al.[30], in

their OCR system proposed an algorithm, where a set of neighbors

are chosen depending on the a priori line width information. A

pixel is classified as an object point, if it is darker than the

average gray value of the neighbors, by more than an

experimentally selected constant. On the other hand, Ullmann[64]

selected a neighborhood pattern without assuming any width

information.
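
A local rule of the kind described above, in which a pixel is an object point if it is darker than the average of its neighbors by more than a constant, can be sketched as follows. The square window, its half width and the margin are our assumptions for illustration and do not reproduce the neighbor sets of any of the cited methods. Darker pixels have higher gray values here.

```python
import numpy as np

def local_mean_binarize(gray, half_width=2, margin=20.0):
    """Mark a pixel as object (1) if it exceeds the mean of its window
    neighbors by more than `margin`. Border pixels are left as background."""
    g = np.asarray(gray, dtype=float)
    h, w = g.shape
    out = np.zeros((h, w), dtype=np.uint8)
    r = half_width
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = g[y - r:y + r + 1, x - r:x + r + 1]
            # Average of the neighbors only: the centre pixel is excluded.
            neighbour_mean = (window.sum() - g[y, x]) / (window.size - 1)
            if g[y, x] - neighbour_mean > margin:
                out[y, x] = 1
    return out
```

The window must be wider than the line for this rule to work, which is why such methods rely on a priori line width information.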

The loss of information due to digitization of a picture is

aggravated by binarization. This leads to the formation of tailed

and jagged patterns during thinning of the binarized pattern. To

reduce this degradation, a method of double adaptive thresholding

is proposed in [3]. One more local method, which has more of a global

flavor, is proposed by Ridler and Calvard[6], where an iterative

technique is used for the selection of a threshold. Wojcik[8]

proposed a new thresholding scheme, where the existing gray value

of a pixel is replaced by the maximum homogeneous gray value,

which is found by a rotating window technique. The method tends

to blunt corners and sharp curvatures. The performance of the

method can be improved by giving weightage to the pixels of the

window, depending on a proximity measure.

In dynamic thresholding schemes, the threshold at a point

depends on coordinates of the point, in addition to its own gray

value and the gray values of a set of neighbors. Chow and

Kaneko[32] use a scheme of dynamic thresholding where the value

of the threshold at a point depends on its proximity to boundary

points. The threshold for these boundary points is determined by

local histogram analysis.

A recent trend in image classification is to use information

integration, which is how biological vision is assumed to

work. One of the approaches in that direction is proposed by

Ahuja et al.[3]. Here features of neighborhood gray level

patterns are used for pixel classification. The feature vectors

are obtained from a known set of neighbors. In addition to the

image information, external information like context and texture

information is also used for image segmentation[34]-[36]. These

methods have proved to be useful in case of interpretation of

complex images like satellite imagery, 3-D imagery, etc.

2.3 FEATURE EXTRACTION FROM BINARY LINE PATTERNS

It has been observed that in almost all practical cases of

natural communication with machines, the input is either too

large to match directly with a stored pattern or too many pattern

variations exist in the representation of a single object. Hence

various methods have been proposed to compress the input data or

to extract some invariant features, which uniquely characterize

the input. This in general is called feature extraction. In the case

of line images, line segments, their relative orientations,

connectivity and intersections form a set of features which

uniquely characterize a line sketch. Several methods have been

proposed to extract these features from the binary image of a

line sketch.

We find two distinct trends in the algorithms for line

extraction from binary patterns. The first relies on obtaining a

singly connected pixel pattern of the object by thinning and then

extracting various line segments. The second deals with

extraction of lines without thinning. Methods which extract lines

from binary patterns without thinning are further classified into

two categories. One category extracts lines by direct

vectorization and the other by core line tracing. These methods

invariably assume a group of object pixels, rather than a single

pixel as the basic unit constituting the object.

Most of the published work in line extraction proposes

thinning as an inseparable process. The algorithm proposed by

Hilditch[37] uses an iterative edge erosion technique. Here a

3 X 3 window is traversed over the image and a set of rules is

applied to the contents of the window. The rules specify the

pixels to be marked for deletion at the end of each iteration.

The iterative scan completes when no more points can be deleted

or marked. The rules can be summarized as: delete an object

point if 1) it is an edge point; 2) it is not an end point; 3)

its neighborhood pattern does not match with any of the seven

predefined window patterns. These window patterns basically

include various tests for break points and successive erosions.

Naccache and Shinghal[18] gave a formal description of this

algorithm. In the same paper the authors have proposed a new

thinning algorithm called the 'safe point thinning algorithm'

(SPTA). Here the method is similar to the one proposed in [37],

but the rules are optimized into a set of Boolean expressions,

which can be evaluated using the neighbors of each point. The

algorithm is twice as fast as similar methods but it generates

ragged lines at 'T' junctions. Many other thinning algorithms are

proposed on similar lines[39],[40]. Udupa and Murthy[30]

suggested a method to obtain a piece-wise-linear approximation of

the skeleton of an unthinned binary image. It is an iterative

algorithm, where a set of window operators is passed over the

image to detect the 'turning points' and 'end points' of the

lines. But the algorithm is not very robust as it assumes that

each line has a constant width over at least six pixels along the

line.

Most of the above algorithms assume an environment of low

resolution data. A linear increase in resolution introduces

a quadratic increase in data to be processed. Since every iteration

is followed by stripping of the marked pixels, the computation

time increases in cubic proportion to the resolution. This

drawback of the iterative techniques makes them unsuitable for

practical situations such as real time processing and high

resolution image processing. Added to this, line information is

obtained only after additional processing of the thinned

patterns.
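
The cubic growth can be made explicit under two simplifying assumptions of ours: the image is n x n pixels, and a line of fixed physical width occupies w = cn pixels at resolution n, so that about w/2 erosion passes are needed, each scanning the complete image.

```latex
\begin{align*}
\text{data scanned per pass} &\propto n^2, \\
\text{number of passes} &\approx \frac{w}{2} = \frac{c\,n}{2}, \\
\text{total computation} &\propto n^2 \cdot \frac{c\,n}{2} = \frac{c}{2}\, n^3 .
\end{align*}
```

Doubling the resolution therefore multiplies the work of an iterative thinning scheme by roughly eight under these assumptions.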

This drawback is overcome by direct vectorization of binary

patterns. Here, instead of every pixel being treated

independently, a group of pixels is taken as the basic unit of a

binary pattern.

Sebock et al.[7] assumed a run of object pixels as the unit

of line, and the manner in which the runs overlap determines the

points of intersection of lines. To make the algorithm

independent of width, run length variation is ignored. But

because of this, the algorithm fails to recognize the horizontal

'T' intersection. Ramachandran[6] proposed a method to encode

engineering drawings. The method gives importance to exact

reproduction of the input image. Here all edges of an input image

are marked, and a set of constant length vectors is placed in

between the edges of a line. The average width between the edges

along the horizontal scan gives the line width information. The

direction of the vector and the line width information is then

encoded in a compressed form. As the method uses only a vertical

trace, it is inefficient in encoding nearly horizontal lines, and

the code generated for such lines is large.

Some methods use core-line tracing for line extraction. In

the method proposed by Wakayama[42], a maximal square window of

object pixels is taken as the basic unit of line and the central

pixel of the window is taken as the pixel on the core line. In

addition to obtaining the thinned version of the binary pattern,

this method also provides a capability of exact reproduction of

the binary pattern, when needed. In the method proposed by Arvind

et al.[3], the binary image is initially blurred using a Gaussian

filter to generate peaks of gray values at the core of a line

pattern. Then an adaptive thresholding scheme is used to detect

these peaks from rest of the data. But this algorithm suffers

from the width dependency. If the width of the Gaussian filter is

too large, the lines with small widths get blurred beyond

recognition and if the filter is adjusted for thin lines, wider

lines may get discarded as regions.

To extract line features, thinning and core line tracing are

followed by vectorization. Thinned patterns are viewed as chain

connected curves. Deviation points in these curves are found out

by calculating the angles subtended at each point by a pair of

fixed length line vectors, and selecting the points where local

maxima of angles occur[43]-[46]. The point at which a pixel has

more than two neighbors is an intersection point.
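
The neighbor count rule for intersection points can be sketched as below for a thinned binary pattern. The 8-neighborhood, the function name `classify_skeleton_points` and the additional reporting of end points (pixels with exactly one neighbor) are our assumptions for illustration.

```python
import numpy as np

def classify_skeleton_points(skeleton):
    """Return (intersections, end_points) as lists of (row, col) tuples.

    A skeleton pixel with more than two 8-neighbors is reported as an
    intersection point; one with exactly one neighbor as an end point.
    """
    s = (np.asarray(skeleton) > 0).astype(int)
    h, w = s.shape
    padded = np.pad(s, 1)                      # zero border avoids edge checks
    neighbours = np.zeros((h, w), dtype=int)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbours += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]

    def as_points(mask):
        return [(int(r), int(c)) for r, c in zip(*np.nonzero(mask))]

    intersections = as_points((s == 1) & (neighbours > 2))
    end_points = as_points((s == 1) & (neighbours == 1))
    return intersections, end_points
```

With 8-connectivity, pixels adjacent to a junction of horizontal and vertical strokes may also have more than two neighbors, so in practice nearby intersection points usually have to be merged.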

In direct vectorization schemes, though finding the

deviation points reduces to the same procedure as in the chain

coded curves, finding intersection points becomes difficult.

Sebock et al.[7] and Pavlidis[47] proposed two different methods

for intersection detection. In the former method, a linked list

of runs of pixels is formed with a known number of in-pointers

and out-pointers at each run. Wherever a pointer conflict occurs,

an intersection point is marked. In [47], run length coding is

used for grouping the object pixels. These groups are called

nodes. The nodes are traced in a predefined manner. Whenever a

node traces back to the already traced node, an intersection

point is marked. Filtering is suggested to avoid detection of

innumerable intersection points, due to stray holes in

the object area or in case of processing of a checkered board

pattern.

2.4 RECOGNITION OF LINE PATTERNS

Various schemes have been proposed to represent object

features. To recognize an object, the representation chosen must

make the relations existing among the features explicit. One way

of encoding these relations is proposed by Wojcik[8], which is a

graph representation of the line sketch. Here the arcs represent

the relations and the nodes represent the extracted features.

Pattern matching is extensively used for recognition. If the test

pattern matches with a pattern which the machine has already

associated with a name tag, the test pattern is assumed to be

recognized as an object with the associated name. Several

matching techniques have been proposed. Some methods involve

exact matching and some involve approximate or fuzzy matching.

In exact matching techniques, various sets of object models

are stored. Every stored element is called a template and it is

attached to the name of the object it represents. A test pattern is

said to be recognized only if it matches one of the templates, in

which case it is identified as the object corresponding to that

template. The field of character recognition makes extensive

use of matching techniques[48],[49]. A simple matching strategy

is to check for one-to-one correspondence between the line

segments of the figure to be recognized and those of the known

template. Some on-line recognition systems make use of the

temporal order of strokes also[18]. If the image is chain coded,

pattern of chain is directly matched with a template, as in

string matching. These methods become impractical if the number

of object patterns is very large, as in the case of patterns of

hand drawn sketches. For such inputs fuzzy or approximate

matching is inevitable.

In approximate matching, object models are devised such that

a set of approximate object patterns match with each model,

unlike a unique pattern matching with a model as in the case of

template matching. In the methods proposed by Huntsberger et

al.[23], geometric models are defined in the form of fuzzy sets.

If an input pattern is a member of any one of these sets, the

pattern is recognized as the geometric model associated with the

host fuzzy set. If the pattern is a member of more than one fuzzy

set, some definite priority is given to the models to resolve the

conflict.

2.5 CONTRIBUTIONS OF THE PRESENT WORK

A new local thresholding scheme is proposed[54]. A 3 X 3

window is used for pixel classification. The method highlights a

new idea of noise suppression. Constants of the threshold

function can be found either experimentally or automatically

using the discriminant analysis described in Otsu[1]. The method

also extracts local line vectors at all edge points. These line

vectors can be used for curve detection.

The line extraction algorithm proposed is similar to the one

proposed by Pavlidis[47], but it uses the prior line width

information to infer the existence of an intersection point.

Lines are viewed as overlapping run structures as proposed in

Sebock et al.[7]. To provide more uniform treatment to runs of a

line, raster and non-raster scans are used for the extraction of

vertical class of lines and horizontal class of lines

respectively. This is an improvement over the engineering drawing

encoding scheme proposed by Ramachandran[6], where the lines of

horizontal class are extracted as an array of vertical line

vectors.

The proposed representation scheme is an improvement over

the graph representation of Wojcik[8]. Various reduction and

classification techniques are proposed. This representation

serves as a relational database, which answers queries pertaining

to various geometric models and gets updated during correction.

For the recognition of approximate geometric figures from an

input sketch, fuzzy geometric models are defined in the form of

rules. These rules have dynamic thresholds which allow

approximate matching[55]. These geometric models are more

flexible than the ones proposed in Huntsberger et al.[10], where

the threshold of approximation is not explicit.

Techniques for correction of undimensioned line sketches are

developed. This correction is in accordance with the relations

governing the matched geometric model. A corrected sketch is an

aesthetically improved version of the input. Correction can also

be controlled by user specified context through a context module.

CHAPTER III

MACHINE REPRESENTATION OF LINE SKETCHES

Even though a line sketch is binary, its digitized image

obtained through a video camera interface is graytone. In such an

image, the sketch information is obscure because the object

pixels are not distinct from the background pixels. Hence the

image is binarized to separate the object pattern from the

background. This object pattern is then processed to extract

characteristic features of the sketch. The output of this process

is in the form of coordinates of feature points and their

connectivity information. This output structure forms a

convenient machine representation for recognition and correction

of the input sketch.

A brief note on the input and output specifications of the

preprocessing stage is given in section 3.1. In section 3.2, a

new binarization scheme to separate object pixels from the

background is proposed. A scheme to extract line features from a

binary image is discussed in section 3.3.

3.1 INPUT AND OUTPUT SPECIFICATIONS OF THE PREPROCESSING STAGE

The sketch to be processed is drawn on a plane sheet of

paper (Fig. 1.3a). The sketch contains an isolated vertical

straight line segment, which is drawn at the top left corner of

the sketch, with the same instrument with which the sketch is

drawn. This line segment acts as an indicator line, which

provides the line width information. The input to the system is a

digitized version of such a sketch, which is obtained through a

video camera and an analog to digital converter. The digitized

version is a graytone image with gray levels of the pixel varying

from 0 (indicating the brightest region) to 255 (indicating the

darkest region). Object pixels in the image are darker than the

background pixels. In a graytone image, sharp variations in gray

levels of the input sketch are eliminated, i.e. object/background

boundaries are smeared. In addition to this gray level

modification, noise may also introduce random variations in the

pixel gray values.

The output of the preprocessing stage is a set of features

which uniquely characterizes the input sketch. We observe that a

set containing all line segments, deviation points and

intersection points, uniquely characterizes a line sketch. Hence

the output specifies coordinates of various feature points and

their connectivity information.

3.2 BINARIZATION OF GRAYTONE IMAGES

Graytone images contain smeared boundaries and gray level

variations due to noise. So a graytone image, when binarized, may

generate an object pattern which is either enlarged or eroded.

Further, the pixels which are modified by noise may form dark

specks (spurious object points formed on the background) or holes

(spurious background points formed on the object). Ideally, a

binarization scheme should satisfy the following

requirements:

i. no noise specks should be formed on the background,

ii. no holes should be formed on the object, and

iii. smearing of the boundaries must be eliminated.

The proposed binarization scheme estimates a local threshold

at each pixel of the image and classifies the pixel as an object

pixel if it has a gray value higher than the estimated threshold.

The scheme also provides local line vectors at all edge points of

the object.

3.2.1 Proposed Binarization Scheme

A local threshold is estimated based on the image features

around the pixel to be thresholded. The equation governing a

local threshold is represented as follows:

$$T(x,y) = f\big(N(x,y)\big) \qquad (3.1)$$

where T(x,y) is the threshold for the pixel at the point (x,y),

N(x,y) is a set of neighbors of the pixel at (x,y), and

f( ) is a threshold function.

If g(x,y) is the gray value of the pixel p(x,y), then

binarization is defined by the following equation.

If g(x,y) > T(x,y)

Then p(x,y) is an object point.

Else p(x,y) is a background point. (3.2)

For the purpose of estimation of the threshold, a 3 X 3

window centered around the pixel to be thresholded is considered.

In other words, the set N(x,y)

represents the pixels in the window around the pixel p(x,y).

In a window, 8 sets of 5-pixel neighbors are defined as

shown by the shaded area in Fig.1.4. These sets of

5-pixel-neighbors represent line edge patterns in a 3 X 3 window.

If 'm' is the minimum of the average gray values of the 5-

pixel-neighbor set corresponding to the pixel p(x,y), the

proposed threshold function is given by

where K1 and K2 are positive nonzero constants.
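
The quantity 'm' can be computed as in the sketch below. The exact 5-pixel-neighbor sets are those of Fig.1.4; as a stand-in, the sketch assumes that they are the eight runs of five contiguous pixels on the ring of 8 neighbors of the window, which is our assumption and may differ from the actual patterns. The function name `edge_vector_gray_value` is also hypothetical, and the sketch stops at 'm' without applying the threshold function.

```python
import numpy as np

# Ring of the 8 neighbors of the centre pixel in clockwise order,
# given as (row offset, column offset).
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

# ASSUMPTION: each 5-pixel-neighbor set is a run of five contiguous ring pixels.
FIVE_PIXEL_SETS = [[RING[(start + k) % 8] for k in range(5)] for start in range(8)]

def edge_vector_gray_value(g, x, y):
    """Return (m, index) for the interior pixel at column x, row y.

    m is the minimum of the average gray values of the eight sets and
    index identifies the set that attains it.
    """
    averages = [
        sum(float(g[y + dy, x + dx]) for dy, dx in pixel_set) / 5.0
        for pixel_set in FIVE_PIXEL_SETS
    ]
    index = int(np.argmin(averages))
    return averages[index], index
```

On a uniform window all eight averages coincide, so 'm' equals the common gray value, in agreement with the uniformity argument used for selecting the constants.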

The values of the constants K1 and K2 depend on the contrast

of the input graytone image and the overall picture brightness.

These can be found experimentally. Automatic determination of K1

and K2 can also be done using discriminant analysis of the gray

level distribution of the image[23]. If K is the point of optimal

classification[1], then assuming that the image is uniform in a

region of 3 X 3 pixel window, we have both 'm' and T(x,y) of

equation (3.3) equal to K. On substituting this condition in the

equation we have

Typically the value of K2 is 1.

Once the threshold is calculated at a pixel, the pixel is

classified either as a background pixel or as an object pixel, in

accordance with the equation (3.2). For convenience, the gray

values of all the pixels which are classified as object are set

to '1' and those of background pixels are set to '0'. The image

thus obtained is called a binary image.
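
The classification rule of equation (3.2) amounts to a pixel-wise comparison of the gray values with the thresholds. A minimal sketch, assuming the thresholds T(x,y) have already been computed and stored in an array of the same shape as the image (the function name `binarize` is ours):

```python
import numpy as np

def binarize(gray, threshold):
    """Apply g(x,y) > T(x,y): object pixels become 1, background pixels 0.

    `threshold` may be a single value or an array of per-pixel thresholds.
    """
    return (np.asarray(gray) > threshold).astype(np.uint8)
```

Because the comparison is independent at every pixel, the same routine serves a global threshold (a single value) and a local one (one value per pixel).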

3.2.2 Analysis of the Proposed Binarization Scheme

The proposed method is based on the assumption that except

at the edges, the picture is uniform in an area of 3 X 3 window

(uniformity assumption). This necessarily means that a hand-drawn

line is at least 3 pixels wide[12,16]. The edge patterns defined

by the 5-pixel-neighbors exclude a completely dark window and the

patterns formed by corners of a sketch. This exclusion is

justified because of the following reasons:

1. Because the image distribution is not known a priori, all

pixels in the image are to be treated uniformly. So, even

though corners form a small percentage of the object points,

corner pattern checking should be done at all points. This

makes the process computationally expensive.

2. Because of the involvement of a smaller set of neighboring

pixels in determining the threshold, the noise sensitivity of

the threshold increases.

3. Since the minimum line width is of 3 pixels, the loss in

information that may occur due to blunting of corners

is insignificant.

4. Because of the uniformity assumption, the

information that can be obtained from a completely dark

window can be obtained from any one of the 5-pixel-neighbors.

As the background has a gray value lower than that of object,

the 5-pixel-neighbor pattern corresponding to the minimum-average

'm' in equation (3.3) gives the most likely background pattern

along an edge of a line. The pattern also gives the direction of

the edge of the line at the point under consideration. Hence the

5-pixel-neighbor pattern corresponding to the minimum-average is

called an Edge-Vector and the minimum-average is called the

edge-vector-gray-value. A pixel can be treated as an object pixel

only if its gray value is higher than the edge-vector-gray-value

by a certain extent.

Dark specks on the background are formed when the noise

level is high enough to make a few random pixels in the

background area as dark as the object. Similarly holes are formed

in the object when the noise level is high enough to make a few

random pixels in the object area as bright as the background.

Formation of specks or holes is not easily controlled by either

global or conventional local thresholding schemes, unless they

are preceded by some filtering or local averaging processes. But

these preprocesses, with the exception of median filtering, tend to

increase smearing and may sometimes introduce disconnections in

object patterns.

A thresholding scheme to suppress specks and holes must

generate large thresholds on the background, so that the

formation of specks is suppressed. It must also generate low

thresholds on the object, so that the holes are filled up. It can

be observed that, if the window is on the object, the

edge-vector-gray-value is high, because it gives the average of the

object pixels. Similarly on the background region, the

edge-vector-gray-value is low. Using the edge-vector-gray-value 'm', the threshold

could be selected as

where C1 and C2 are chosen constants.

In both equations (3.5) and (3.6), it can be observed that

the threshold is low on the background because 'm' is low, which

assists speck formation. The threshold is high on

the object because 'm' is high, which assists hole

formation. Most of the conventional schemes suffer from this

drawback. But in equation (3.3), the threshold is high on the

background and is low on the object area. This property

suppresses the formation of specks and holes. In Fig.3.1,

binarized images obtained from a global threshold (Otsu[1]) and

from the proposed scheme are compared. Fig.3.1a is the input

graytone image which is sprinkled by random additive noise of

amplitude 30% of the average gray value of the image. Fig.3.1b is

the binarized version obtained by the Otsu[1] algorithm. Fig.3.1c and

Fig.3.1d show the outputs of the proposed algorithm with and

without automatically selected constants. From the figure it can

be observed that the proposed algorithm has markedly better

performance in the presence of noise.

Fig.3.1 ILLUSTRATION OF THE NOISE SUPPRESSION PROPERTY OF THE PROPOSED ALGORITHM.

a. NOISY GRAYTONE IMAGE.

b. BINARY IMAGE OBTAINED BY OTSU[1] ALGORITHM.

c. BINARY IMAGE OBTAINED BY THE PROPOSED ALGORITHM WITH AUTOMATICALLY ESTIMATED CONSTANTS.

d. BINARY IMAGE OBTAINED BY THE PROPOSED ALGORITHM WITH EXPERIMENTALLY SELECTED CONSTANTS.

It is observed that the threshold is high if the

edge-vector-gray-value is low. Hence, the pixels in the smeared region

experience a higher threshold than the pixels in the object

region. Thus the misclassification of the pixels of smeared

boundary as the object pixels is suppressed.

The scheme is sensitive to noise present in the smeared

region. This is because the assumed uniformity condition is not

applicable on the smeared region, where the gray values transit

from the object level to the background level. But fortunately

the specks or holes formed at the boundary of a line do not

distort the line information significantly.

To extract local line vectors, the 5-pixel-neighborhood

patterns are given weightages depending on the direction of the

edge they represent, as shown in Fig.3.2. If the lines in an

input image are of uniform thickness, the direction of the edge

of a line at any point is the same as the direction of the line

at that point. Fig.3.3 shows a binarized image with its edge

vectors specified. The numbers in the image are the 'edge-vector'

weightages. It can be seen that vertical lines in the image

predominantly have the number '3' on their edges, indicating

that the lines are inclined at 90° to the horizontal axis.

Similarly, the horizontal lines have predominantly the number '1'

on their edges, indicating that they are horizontal. It can also

be seen that inclined lines have a combination of numbers on

their edges. For example, the positively inclined lines have a

combination of 3s and 2s on their edges, indicating that they are

inclined to the X-axis by an angle between 90°(represented by

the weightage 3) and 45°(represented by the weightage 2).

Fig.3.2 Weightages of the Edge vectors. (H, V, I+ and I- sets refer to the edge patterns of Fig.1.4.)

Fig.3.3 A BINARIZED IMAGE WITH LOCAL LINE VECTORS

3.3 EXTRACTION OF LINES FROM A BINARY IMAGE

Lines are extracted from a binary image of a line sketch

without thinning. Towards the development of the algorithm, an

image model is assumed. The algorithm proposed is basically a

direct vectorization scheme. The lines extracted are output in

the form of coordinates of their end points. The output also

specifies their connectivity.

3.3.1 Assumed Image Model for the Line Feature Extraction from a

Binary Image

A binary image is viewed as a matrix where the object pixels

are represented by 1's and the background pixels by

0's. Usually all images have an indicator line as specified in

section 3.1. All the lines in the image are of uniform

thickness. These assumptions constitute the model of the image to

be processed.

A contiguous sequence of l's(object pixels) in any row of

the image is called a run. The number of pixels in a run gives

the run length. The manner in which runs overlap determines the

line features like deviations and of lines.

A run in the (i+1)th row of the image matrix is said to

overlap a run in the ith row only if

B(i) <= E(i+1) + 1 and B(i+1) <= E(i) + 1

where B(i) is the beginning column of a run in the ith row and

E(i) is the ending column of the run in the ith row. Similarly, B(i+1)

and E(i+1) are the beginning and ending columns of a run in the (i+1)th

row. In other words, a run in the ith row has an overlapping run

in the (i+1)th row if at least one pixel in the run of the (i+1)th

row is in the 8-neighborhood of a pixel in the run of the ith row.
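The run and overlap definitions can be illustrated with a minimal Python sketch. It assumes a row is a list of 0/1 values and that a run is stored as an inclusive (begin, end) column pair; the function names are hypothetical and are not taken from the thesis implementation.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (begin column, end column), both inclusive


def find_runs(row: List[int]) -> List[Run]:
    """Return every maximal contiguous sequence of 1's in a row."""
    runs = []
    start = None
    for col, pixel in enumerate(row):
        if pixel == 1 and start is None:
            start = col                      # a run begins here
        elif pixel != 1 and start is not None:
            runs.append((start, col - 1))    # the run ended on the previous column
            start = None
    if start is not None:                    # run touching the right border
        runs.append((start, len(row) - 1))
    return runs


def run_length(run: Run) -> int:
    """Number of pixels in a run."""
    return run[1] - run[0] + 1


def overlaps(run_i: Run, run_next: Run) -> bool:
    """8-neighborhood overlap between a run in row i and a run in row i+1."""
    b_i, e_i = run_i
    b_n, e_n = run_next
    return b_i <= e_n + 1 and b_n <= e_i + 1
```

With this convention, runs that touch only diagonally, such as (2, 4) in one row and (5, 7) in the next, still count as overlapping.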

As long as an overlap exists, the line continues. When a

line is vertical or nearly vertical, we observe that a line

obtained by the trace of the central pixels of the runs is the

line represented by the corresponding binary pattern as shown in

Fig.3.4a. But the line represented by the central pixels of the

run ceases to be the line represented by the binary pattern, if

the line is horizontal or nearly horizontal, as shown in

Fig.3.4b. Under such conditions, if the runs were viewed along

the columns we could have extracted the correct line by joining

the central pixels of the runs. This necessitates classification

of the binary patterns into vertical and horizontal classes

before the lines are extracted. The following discussion shows

that the 'run length' and the 'line width' information can be

used to carry out this classification.

When a line is along the column of the image matrix, the

runs traced by raster scan (row-by-row) are perpendicular to the

line they represent. Hence the run length can be taken as the

measure of the width of the line at that location. If a line is

not exactly vertical, the run length observed in the raster scan

is greater than the actual width of the line as can be seen in

Fig.3.5. We select a threshold of inclination of 60° with respect

to the vertical axis, above which the lines are considered as

horizontal class of lines. This classification can be carried out

by the run length and the line-width information. From Fig.3.5,

we observe that

(run length) * cos(θ) = width.

Hence for θ = 60°, the run length = 2 * width.

Fig.3.4 a. Vertical trace of a vertical line pattern. b. Vertical trace of a horizontal line pattern.

Fig.3.5 Run-length variations with line inclination and at line intersection. (w = width of the line; w / cos θ = run length of the inclined line; w + w / cos θ = the run length at the intersection; θ = angle of inclination.)

Therefore, if the run length observed from the vertical

direction is greater than twice the line width, the corresponding

binary pattern is classified into the horizontal class. The run

length and the line width information can be further used for

intersection detection, because the run lengths observed at all

intersection points are greater than twice the line width,

irrespective of the angle of inclination of the intersecting

lines as shown in Fig.3.5.
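The relation between run length, line width and inclination can be checked numerically. The sketch below is only illustrative: the function names are hypothetical, θ is measured from the vertical axis, and the treatment of a run exactly twice the line width (traced here) is an assumption, since the text only covers the "less than" and "greater than" cases.

```python
import math


def expected_run_length(width: float, theta_deg: float) -> float:
    """Run length seen in a raster scan for a line inclined at theta to the vertical."""
    return width / math.cos(math.radians(theta_deg))


def classify_run(run_len: float, width: float) -> str:
    """Decide whether a run seen in the current scan is traced or skipped.

    Runs longer than twice the line width belong to the other class of
    lines or to an intersection, so they are skipped in this scan.
    """
    return "skip" if run_len > 2 * width else "trace"
```

For a line of width 3, the run length grows from 3 at θ = 0° to about 6 at θ = 60°, which is where the factor of two in the threshold comes from.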

3.3.2 Proposed Feature Extraction Scheme

We assume that the line traversing the core of the binary

pattern is a good approximation of the input line. We have

already discussed in the previous section that the central pixel

of the run is on the object line, only if the runs are nearly

perpendicular to the direction of lines. But in raster scan,

horizontal lines have the runs parallel to the direction of the

lines and a trace of central pixels of the runs represents a line

other than the object line. Hence non-raster scan (column-by-column)

is used to trace horizontal class of lines. In either

scan, only one class of lines is processed. If lines of the

other class are observed, corresponding runs are skipped.

Initially, the runs of the indicator line are traced through

raster scan. Because the indicator line is a vertical line

segment, the average line width information of the sketch is

obtained by calculating the average run length of this indicator

line. Once the line width information is extracted, the indicator

line is not processed.

In raster scan, all runs with run length less than twice the

average line width are traced. At every stage of tracing, only a

constant (smoothing factor) number of overlapping runs are traced

and a vector is placed such that it joins the central pixel of

the starting and ending runs of that stage. If this vector

deviates from the vector representing the line extracted so far

by more than a prescribed threshold, a deviation is indicated

within the span of this new vector. Then the exact point of

deviation is found out by a simple binary search. Once the exact

point of deviation is found, the line extracted so far up to the

point of deviation is stored as a line segment and the rest of

the new vector is taken as a part of a new line. Then the next

stage of line tracing begins. On the other hand, if the deviation

of the new vector is less than or equal to the prescribed

threshold, the vector is considered as a continuation of the

line.
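The deviation test between the line extracted so far and a new vector can be sketched as follows. This is a hedged illustration only: the thesis does not state how the deviation is measured or what threshold is used, so the angle-based measure, the function names and the 15° default are all assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def deviation_deg(line_start: Point, line_end: Point, vec_end: Point) -> float:
    """Angle in degrees between the line so far and the new vector starting at line_end."""
    ax, ay = line_end[0] - line_start[0], line_end[1] - line_start[1]
    bx, by = vec_end[0] - line_end[0], vec_end[1] - line_end[1]
    # atan2(cross, dot) gives the signed angle between the two vectors.
    angle = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return abs(math.degrees(angle))


def continues_line(line_start: Point, line_end: Point, vec_end: Point,
                   threshold_deg: float = 15.0) -> bool:
    """True if the new vector is taken as a continuation of the line (assumed threshold)."""
    return deviation_deg(line_start, line_end, vec_end) <= threshold_deg
```

When `continues_line` returns False, the scheme described above would go on to locate the exact deviation point within the span of the new vector.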

This process of line tracing is interrupted either by the

absence of an overlapping run or by the presence of an

overlapping run of length greater than twice the line width.

Whenever an interruption occurs, the line traced until that point

is stored. If the interruption is due to the absence of an

overlapping run, a search for extraction of lines from untraced

patterns begins. But if the interruption is due to the run length

constraint, the tracing continues but the line trace is skipped

as long as the run length is greater than twice the line width.

The value of the 'smoothing factor' is heuristically

selected as thrice the line width. This constant should not be

too large because it may suppress the detection of prominent

deviation points also, nor can it be too small for it may then

lead to the detection of a large number of spurious deviation

points.

Line extraction through non-raster scan is carried out on

similar lines. But here all the vertical lines are skipped using

the same run length and line width constraint. In either of the

scans, the patterns from which lines are already extracted are

skipped. This avoids duplication of lines in the range of

inclination of 30° to 60° with respect to the coordinate axes,

where horizontal and vertical classes of lines overlap.

The Line Extraction Algorithm

1. IF there is an unprocessed run, start the line trace.
   ELSE
      stop.

2. WHILE there is an overlap and the run is shorter than twice
   the line width and the number of runs stored is less than the
   smoothing factor,
      Trace the runs as the constituents of the line extracted
      so far.

3. WHILE there is an overlap and the run is shorter than twice the
   line width and the number of runs stored is less than the
   smoothing factor,
      Trace the runs as the constituents of the new line
      vector.

4. IF the deviation between the new line vector and the line
   extracted so far is greater than the threshold,
      find the exact deviation point, output the line segment
      up to the deviation point, store the remaining new line
      as the line extracted so far.
      IF there is an overlap, and the run length is less than
      twice the line width,
         go to step 3,
      ELSE
         output the line extracted so far and
         IF the run length is greater than twice the line width,
            skip the longer runs and go to step 2,
         ELSE
            go to step 1.

5. IF the deviation between the new line vector and the line
   extracted so far is less than or equal to the threshold,
      add the new line vector to the line extracted so far.
      IF there is an overlap and the run length is less than
      twice the line width,
         go to step 3,
      ELSE
         output the line extracted so far and
         IF the run length is greater than twice the line
         width,
            skip the longer runs and go to step 2,
         ELSE
            go to step 1.

After the extraction of lines, the end points of the line

segments separated by a distance less than twice the line width

are given common coordinate points. This gap in between two

segment end points occurs because rows and columns are used for

coordinate specification. This gap filling is based on the

assumption that for any two lines to remain separated on the

paper, they should have a minimum of 'one line width' gap between

them.
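The gap filling step can be sketched as a small clustering of end points. The thesis only says that nearby end points are given common coordinates; taking the mean of each group as the common coordinate, and the function name itself, are assumptions made for this illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def merge_end_points(points: List[Point], line_width: float) -> List[Point]:
    """Give end points closer than 2 * line_width a common coordinate (their mean)."""
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Follow parent links to the representative of the group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < 2 * line_width:
                parent[find(i)] = find(j)    # put both points in one group

    groups: Dict[int, List[Point]] = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)

    common = {
        root: (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
        for root, g in groups.items()
    }
    return [common[find(i)] for i in range(len(points))]
```

The returned list is in the same order as the input, so each segment end point can simply be replaced by the entry at its index.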

3.3.3 Results of the Line Extraction Process

An efficient method for vectorization of binary line

patterns is proposed. The vectorization is strictly based on the

number of curvatures and corners present in the line. Hence

vectorization of horizontal or nearly horizontal lines is

possible with equal ease as for the vertical class of lines. An

example of the input and output of the preprocessing stage is

shown in Fig.3.6. Fig.3.6b is a plot of the features extracted in

the preprocessing stage. The features are represented in the form

of coordinates of the segment end points and their connectivity,

as shown in Fig.3.7.

The segment information along with the line width

information makes the various features of the image explicit to

recognition and correction processes discussed in chapter IV. The

machine representation of Fig.3.7 can also be viewed as a

compressed code of the input line sketch.

Fig.3.6 SAMPLE INPUT AND OUTPUT OF THE PREPROCESSING STAGE. (b. Plot of the features extracted in the preprocessing stage.)

Fig.3.7 Machine representation for the sketch of Fig.3.6. The listing holds one coordinate entry for each of point1 to point8, followed by the connectivity entries:

connects(point1, point2). connects(point2, point3). connects(point4, point5).
connects(point5, point6). connects(point7, point5). connects(point4, point2).
connects(point2, point8). connects(point5, point3). connects(point6, point3).
connects(point3, point8). connects(point7, point4). connects(point4, point1).

CHAPTER IV

RECOGNITION AND CORRECTION OF LINE SKETCHES

The sketch features extracted in the preprocessing stage are

recognized and corrected to obtain a drafted version of the input

line sketch. To facilitate the search that is involved during

recognition, the input is represented in the form of a graph. As

the features input are from hand-drawn line sketches, only

approximate matching is possible in the recognition stage. The

recognized figures are then corrected in accordance with the

geometric model with which they match. If the input sketch is a

combination of geometric figures, while correcting an individual

figure, its relation with the connected neighbors is preserved.

The context in which a sketch is to be viewed plays an important

role in the recognition process. Hence provision to specify the

context is also made available.

Section 4.1 describes a graph representation of features of

a line sketch. Section 4.2 describes organization of standard

geometric models in the form of rules. Section 4.3 explains the

organization of the context module. A control strategy to

coordinate the processes like rule selection, pattern search,

check for contextual consistency, correction, etc., is described

in section 4.5. Implementation details of the recognition and

correction stages are given in section 4.6. In section 4.7

performance of the recognition and correction stages is

illustrated with typical examples.

4.1 GRAPH REPRESENTATION OF FEATURES OF A LINE SKETCH

The features extracted from an input line sketch are

represented in the form of a graph structure. The recognition and

correction processes view this representation as a relational

database, which provides the input sketch information. The

recognition process carries out search in the database to

recognize the geometric figures that are present, whereas the

correction process updates the database whenever necessary. The

final updated version of the graph represents the output of the

recognition and correction stage.

4.1.1 Need for Representation of Extracted Features

Suitable representation of the features of an input sketch

is needed to meet the following basic requirements:

1. Various features like line width, line segments, deviation

points, etc., make no more sense than a set of numbers to a

machine. If this set is to be interpreted the relations

existing among the elements of the set must be made explicit.

2. An input line sketch may be a combination of many geometric

figures. Therefore recognition of a pattern approximating a

geometric model invariably involves a search. This search

must be efficient.

3. The correction process, while correcting individual geometric

figures of a sketch, may modify the overall sketch

information. This modification is not acceptable, because the

aim of drafting is to present the same sketch information in

an aesthetically improved manner. Hence it is necessary to

preserve the input information.

4.1.2 Suitability of Graph Representation

A graph is a connected structure, where every node is

connected to at least one arc and every arc connects two nodes. A

node in the graph represents an extracted feature, whereas an arc

represents the relation that exists between the connected nodes.

A relation that exists between a pair of features can be made

explicit by connecting the pair of feature nodes with an arc

representing the relation. Fig.1.5 shows one such graph

structure, which represents a triangle. The graph structure can

be viewed as a relational database which answers the queries on

the relations existing among the represented features.

Recognition involves a search in the database. This search is

facilitated by a graph because once the search for a particular

pattern begins, the next node to be searched must be an adjoining

node. This adjoining node can be readily obtained from the graph.

Further, the graph representation facilitates reduction in search

time through techniques like node coloring, smoothing of line

deviations, etc. In a geometric line sketch, the sketch

information depends on connectivity of line segments, their

relative orientations and lengths. The graph structure aids in

preserving these invariant properties during correction (see

section 4.4.2).
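The idea of querying the graph like a relational database can be illustrated with a small Python sketch. The class and method names below are hypothetical and the triangle coordinates are made up for the example; the thesis does not specify its data structures at this level.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


class SketchGraph:
    """Point nodes and segment nodes of a sketch, linked by 'connects' arcs."""

    def __init__(self) -> None:
        self.points: Dict[str, Point] = {}
        self.segments: Dict[str, Tuple[str, str]] = {}

    def add_point(self, name: str, xy: Point) -> None:
        self.points[name] = xy

    def add_segment(self, name: str, p: str, q: str) -> None:
        self.segments[name] = (p, q)

    def length(self, segment: str) -> float:
        """Length of a segment, computed from its end points."""
        p, q = self.segments[segment]
        return math.dist(self.points[p], self.points[q])

    def adjoining(self, segment: str) -> List[str]:
        """Segments that share an end point with the given segment."""
        ends = set(self.segments[segment])
        return sorted(name for name, other in self.segments.items()
                      if name != segment and ends & set(other))


def example_triangle() -> SketchGraph:
    """A triangle with made-up coordinates, used only for illustration."""
    g = SketchGraph()
    for name, xy in {"p1": (0, 0), "p2": (4, 0), "p3": (4, 3)}.items():
        g.add_point(name, xy)
    g.add_segment("s1", "p1", "p2")
    g.add_segment("s2", "p2", "p3")
    g.add_segment("s3", "p3", "p1")
    return g
```

Once a search starts at one segment, `adjoining` returns the only candidates for the next step, which is the property the text relies on.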

4.1.3 Reduction of a Graph Representing a Line Sketch

Graph reduction reduces search time and also helps to avoid

erroneous interpretations. It involves deletion of noisy

segments, restoration of intersections and smoothing of spurious

deviations.

4.1.3.1 Deletion of noisy segments: By noisy segments we refer

to all the line segments which do not belong to a closed loop and

have lengths less than the line width. Here line width is a

feature extracted in the preprocessing stage, which is equal to

the average line width of the input line sketch. This width is

taken as a default threshold. The reduction process is

illustrated in Fig.4.1. Note that point p2 in the figure is open

(a point node connecting only one line segment). If both ends of

a line segment whose length is less than the threshold are open,

that line segment is deleted from the database. This reduction

can be carried out using any length as the threshold. Depending

on the coarseness of the input sketch, the user can override the

default threshold.
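A minimal sketch of this deletion step, assuming segments are stored as a mapping from a segment name to its two end point names. It implements the stated condition (length below the threshold and both ends open); the function name and data layout are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Tuple

Point = Tuple[float, float]


def delete_noisy_segments(points: Dict[str, Point],
                          segments: Dict[str, Tuple[str, str]],
                          threshold: float) -> Dict[str, Tuple[str, str]]:
    """Drop segments shorter than the threshold whose two ends are both open."""
    # A point is open if it connects only one line segment.
    degree = Counter(p for ends in segments.values() for p in ends)
    kept = {}
    for name, (p, q) in segments.items():
        is_short = math.dist(points[p], points[q]) < threshold
        both_open = degree[p] == 1 and degree[q] == 1
        if not (is_short and both_open):
            kept[name] = (p, q)
    return kept
```

Passing the average line width as `threshold` corresponds to the default described above; a coarser sketch would use a larger value.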

4.1.3.2 Restoration of intersection points: Some intersection

points get distorted due to the combined effect of quantization

error and faulty preprocessing. Two examples of such

intersections are given in Fig.4.2a. These points of intersection

can be restored by shrinking unwanted line segments. The

reduction is carried out as shown in Fig.4.2b. The point p12 of

the reduced graph of Fig.4.2b is the mid point of the line

joining points p1 and p2 of the original graph.
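The restoration can be sketched as replacing the two distorted points by their mid point and dropping the segment that joined them. The function name, the data layout and the default name 'p12' for the merged point are assumptions made for illustration.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Segments = Dict[str, Tuple[str, str]]


def restore_intersection(points: Dict[str, Point], segments: Segments,
                         p1: str, p2: str,
                         merged: str = "p12") -> Tuple[Dict[str, Point], Segments]:
    """Replace p1 and p2 by their mid point and shrink the segment between them."""
    (x1, y1), (x2, y2) = points[p1], points[p2]
    new_points = {n: xy for n, xy in points.items() if n not in (p1, p2)}
    new_points[merged] = ((x1 + x2) / 2, (y1 + y2) / 2)

    new_segments = {}
    for name, (a, b) in segments.items():
        a = merged if a in (p1, p2) else a
        b = merged if b in (p1, p2) else b
        if a != b:               # the unwanted p1-p2 segment collapses and is dropped
            new_segments[name] = (a, b)
    return new_points, new_segments
```

All segments that previously ended at either distorted point now meet at the single restored intersection.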

Fig.4.1 Deletion of the line segment 'segn', which is an open segment and has a length less than the threshold.

Fig.4.2a Two examples of distorted intersection.

Fig.4.2b Intersection restoration.

4.1.3.3 Deviation smoothing: This process joins two line

segments connected at a point with a relative orientation

approximately equal to 180 degrees, provided exactly two line

segments are connected at that point. This smoothing not only

reduces the database but also avoids erroneous recognitions. This

erroneous recognition occurs if the recognition process involves

graph matching. An example of such a situation is shown in

Fig.4.3a, where an approximately triangular figure may be

recognized as an irregular quadrilateral because it consists of

four line segments. Fig.4.3b shows its smoothed version.
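Deviation smoothing can be sketched as follows. The tolerance around 180 degrees is not given in the text, so the 10 degree default is an assumption, as are the function names and the naming of the joined segment.

```python
import math
from collections import defaultdict
from typing import Dict, Tuple

Point = Tuple[float, float]
Segments = Dict[str, Tuple[str, str]]


def angle_at(points: Dict[str, Point], p: str, q: str, r: str) -> float:
    """Angle in degrees at point q between the segments q-p and q-r."""
    ax, ay = points[p][0] - points[q][0], points[p][1] - points[q][1]
    bx, by = points[r][0] - points[q][0], points[r][1] - points[q][1]
    return abs(math.degrees(math.atan2(ax * by - ay * bx, ax * bx + ay * by)))


def smooth_deviations(points: Dict[str, Point], segments: Segments,
                      tolerance_deg: float = 10.0) -> Segments:
    """Join pairs of segments that meet at nearly 180 degrees at a two-segment point."""
    segments = dict(segments)
    changed = True
    while changed:
        changed = False
        at = defaultdict(list)               # point name -> segments ending there
        for name, ends in segments.items():
            for p in ends:
                at[p].append(name)
        for q, names in at.items():
            if len(names) != 2 or names[0] == names[1]:
                continue                     # smoothing needs exactly two distinct segments
            s1, s2 = names
            p = segments[s1][0] if segments[s1][1] == q else segments[s1][1]
            r = segments[s2][0] if segments[s2][1] == q else segments[s2][1]
            if p != r and 180.0 - angle_at(points, p, q, r) <= tolerance_deg:
                del segments[s1], segments[s2]
                segments[s1 + "+" + s2] = (p, r)
                changed = True
                break                        # rebuild the point index after each join
    return segments
```

Applied to a nearly triangular figure stored as four segments, the two almost collinear segments are joined and a closed loop of path length three remains.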

4.1.4 Graph Classification

The graph classification effectively reduces the search

space. One can observe that each geometric model forms a closed

loop. Hence a search for a matching pattern for a particular

geometric model involves search for a closed loop of path length

equal to the number of sides constituting the model. By 'path

length' we mean the number of segment nodes traversed. For

example, a triangle is recognized in the database only if there

is a closed path constituting exactly three segment nodes. This

search can be reduced if the open paths are not traversed. To

enable this, all the nodes and arcs in the open loops are given a

color which is transparent to the search process. Hence such

paths are not traversed during the search for a geometric model.

The classification of open paths from rest of the graph can be

achieved by repeated application of the following rules till they

fail. An illustration of the classification is shown in Fig.4.4.

Fig.4.3 Segments 'seg1' and 'seg2' subtend an angle nearly equal to 180 degrees. 'seg12' is the smoothed segment of 'seg1' and 'seg2'.

Fig.4.4 Classification of open points. (.... indicates open colored connection)

Rule1: A point node is 'open', if it connects only one line

segment.

Rule2: If there is an open point, color the connecting arc so

that it becomes transparent for further search.
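Repeated application of the two rules can be sketched as an iterative pruning of dangling segments. The color names 'open' and 'soft' follow the text; the function name and the data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

Segments = Dict[str, Tuple[str, str]]


def color_open_paths(segments: Segments) -> Dict[str, str]:
    """Color each segment 'open' or 'soft' by applying Rule1 and Rule2 until they fail."""
    color = {name: "soft" for name in segments}
    while True:
        # Count, for every point, the segments that are still visible to the search.
        degree = Counter(p for name, ends in segments.items()
                         if color[name] == "soft" for p in ends)
        newly_open = [name for name, ends in segments.items()
                      if color[name] == "soft" and any(degree[p] == 1 for p in ends)]
        if not newly_open:       # neither rule applies any more
            return color
        for name in newly_open:
            color[name] = "open"
```

Each pass makes the outermost dangling segments transparent, which may expose new open points for the next pass.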

All open loops are colored in 'open' color and closed loops

in 'soft' color. The values of the nodes with only these two

colors can be modified during correction. Once a node is

corrected, its color is changed to 'hard' color and values of

hard colored nodes are not modified. A graph with this

classification is shown in Fig.1.6. In this figure nodes S1,

S2, ..., Sm represent line segments of lengths L1, L2, ..., Lm, respectively. The line segments connect their end points

P1, P2, ..., Pn through the arc 'connects'. The nodes a1,

a2, ..., al indicate the relative orientation between various

connected line segments. Dotted lines with tag 'disconnects'

indicate open paths.

4.2 APPROXIMATE GEOMETRIC MODELS

Geometric models are defined in the form of rules. These

rules allow approximate matching. An input line sketch matches a

model, if it satisfies all the conditions of a rule describing

the model. For size and orientation independent matching,

conditions in a rule must be governed by the relations that exist

among the line segments constituting the geometric model. To

every specified relation, a threshold for approximate matching is

prescribed. Rules are given priorities to resolve conflicts in

decision making.

4.2.1 Rules Describing Standard Geometric Models

A rule describing a geometric model assumes that the extracted

line features are represented in the form of a graph. A 'path' in

a graph is described as a continuous traversal of connected nodes

and arcs. A path is said to be a 'closed loop' if upon traversal,

the starting node is reached without traversing any of the nodes

on the path more than once. A 'path length' is the number of

segment nodes traversed in a path. Using these basic definitions,

rules for geometric models are described as follows.

Rule 1: TRIANGLE: If there is a closed loop of path length

three, then it is a triangle.

Rule 2 : QUADRILATERAL: If there is a closed loop of path length

four, then it is a quadrilateral.

Rule 3: POLYGON: If there is a closed loop of path length 'n',

then it is an 'n' sided polygon.

Rule 4: ISOSCELES TRIANGLE: If there is a closed loop of path

length three and two angles between pairs of

connected line segments are approximately

equal, then it is an isosceles triangle.

Rule 5: EQUILATERAL TRIANGLE: If there is a closed loop of path

length three and the angles between three pairs

of connected line segments are approximately

equal, then it is an equilateral triangle.

Rule 6: RIGHTANGLED TRIANGLE: If there is a closed loop of path

length three and the angle between a pair of

line segments is approximately equal to 90

degrees, then it is a rightangled triangle.

Rule 7: RIGHT-ISOSCELES TRIANGLE: If there is a closed loop of

path length three and two of its angles between

pairs of its connected line segments are

approximately equal and the angle between the

third pair of connected line segments is

approximately equal to 90 degrees, then it is a

right-isosceles triangle.

Rule 8: TRAPEZIUM: If there is a closed loop of path length four

and a pair of opposite sides are approximately

parallel, then it is a trapezium.

Rule 9: PARALLELOGRAM: If there is a closed loop of path length

four and both pairs of alternate line segments

are approximately parallel, then it is a

parallelogram.

Rule 10: RECTANGLE: If there is a closed loop of path length four

and both pairs of alternate line segments are

approximately parallel and an angle between a

pair of its connected line segments is

approximately equal to 90 degrees, then it is a

rectangle.

Rule 11: SQUARE : If there is a closed loop of path length four

and both pairs of alternate sides are

approximately parallel and consecutive sides

are approximately equal and subtend an angle

approximately equal to 90 degrees, then it is a

square.

Rule 12: RHOMBUS: If there is a closed loop of path length four

and both pairs of alternate sides are

approximately parallel and consecutive sides

are approximately equal, then it is a rhombus.
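The triangle rules can be illustrated with a small sketch that returns the numbers of the rules satisfied by a closed loop of path length three, given its three interior angles in degrees. It assumes the 10% equality tolerance that section 4.2.2 gives as the default for Eo; the function names are hypothetical.

```python
from typing import Set, Tuple


def approx_equal(a: float, b: float, eo: float = 0.10) -> bool:
    """Approximate equality: difference less than eo times the smaller quantity."""
    return abs(a - b) < eo * min(a, b)


def triangle_rules(angles: Tuple[float, float, float]) -> Set[int]:
    """Rule numbers satisfied by a closed loop of path length three."""
    a, b, c = angles
    satisfied = {1, 3}                        # TRIANGLE and POLYGON always hold
    # Each entry is (first angle, second angle, remaining angle).
    pairs = [(a, b, c), (b, c, a), (a, c, b)]
    equal_pairs = [t for t in pairs if approx_equal(t[0], t[1])]
    if equal_pairs:
        satisfied.add(4)                      # ISOSCELES
    if len(equal_pairs) == 3:
        satisfied.add(5)                      # EQUILATERAL
    if any(approx_equal(x, 90.0) for x in angles):
        satisfied.add(6)                      # RIGHTANGLED
    if any(approx_equal(x, y) and approx_equal(z, 90.0) for x, y, z in pairs):
        satisfied.add(7)                      # RIGHT-ISOSCELES
    return satisfied
```

A hand-drawn triangle with angles 45, 46 and 89 degrees satisfies rules 1, 3, 4, 6 and 7 at the same time, so the rule set alone does not single out one model.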

4.2.2 Thresholds for Approximation

The patterns extracted from the hand-drawn line sketches

match only approximately with the description of the stored

geometric models. The allowable ranges of discrepancies in

angles and sides are decided by the prescribed thresholds. The

thresholds are defined as follows.

1.Threshold for absolute orientation (Ao): A line is

approximately vertical or approximately horizontal if it

deviates from the horizontal or vertical axis by less than

a prescribed constant Ao.

2. Threshold for equality (Eo): Two quantities are

approximately equal if the difference between them is less

than a value Eo times the smaller of the two quantities.

The default value of Eo is 10%.

3. Threshold of parallelism (Po): Two unconnected sides are

approximately parallel if they subtend an angle smaller

than a threshold Po given by

where Ao is the threshold for absolute orientation, H is the

average distance between the line segments under consideration

and L is the length of the shorter of the line segments.
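The first two thresholds can be sketched directly from their definitions. The default of 10% for Eo is taken from the text; the 5 degree default for Ao is only an assumed value, since no default is stated, and the function names are hypothetical.

```python
def approx_equal(a: float, b: float, eo: float = 0.10) -> bool:
    """Two quantities are approximately equal if they differ by less than
    eo times the smaller of the two."""
    return abs(a - b) < eo * min(a, b)


def orientation_class(angle_deg: float, ao: float = 5.0) -> str:
    """Classify a line by its angle to the horizontal axis.

    ao is the threshold for absolute orientation; its value here is assumed.
    """
    a = angle_deg % 180.0                     # a line has no direction sign
    if min(a, 180.0 - a) < ao:
        return "horizontal"
    if abs(a - 90.0) < ao:
        return "vertical"
    return "inclined"
```

Because the equality test is relative, a difference of 5 units is accepted between lengths of 100 and 105 but rejected between lengths of 10 and 15.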

These three thresholds are selected because of the following

observations:

(i) The hand-drawn lines representing vertical or horizontal

lines invariably deviate from the vertical or horizontal

axis. Such deviations must be tolerated. The range of

allowable deviation is specified by the threshold for

absolute orientation Ao.

(ii) A difference in line lengths looks obvious if the lines are

short. But the same difference becomes insignificant if the

lines are longer. Fig.4.5 shows an example of such a

situation. The same argument is true in case of angles.

(iii) The accuracy of representation of an angle subtended by two

unconnected lines tends to be low, if they are separated by

a large distance. This situation is clarified in Fig.4.6.

Secondly, if parallel lines to be drawn are long, one can

draw them with better accuracy than if they were short.

This is because a small angular deviation becomes visually

more and more obvious as the line lengths increase as shown

in Fig.4.7. Hence the threshold selected should be

proportional to the distance of separation and inversely

proportional to length of the lines.

4.2.3 Priority Ordering of the Rules

Some geometric models are special cases of some other

geometric models. For example, an equilateral triangle is a

special case of an isosceles triangle, where the third side is

equal to the two equal sides. A square is a special case of a

quadrilateral, a parallelogram, a rectangle and a rhombus. This

Fig.4.5 Both sets of lines have the same difference in lengths. (Panels: two lines of unequal lengths; two lines of nearly equal lengths.)

Fig.4.6 The same pair of lines at different distances of separation. (Panels: two non parallel lines; two nearly parallel lines.)

Fig.4.7 Two pairs of lines of the same relative inclination. (Panels: two nearly parallel lines; two non parallel lines.)

situation leads to conflicts in decision making. For example, an

input pattern approximating a square (or an equilateral triangle)

satisfies more than one rule and a conflict arises as to which

model does it approximate. To avoid such conflicts, rules are

given priorities. If more than one rule is satisfied, the model

corresponding to the rule with the highest priority is selected

as the recognized model. These priorities cannot be given

arbitrarily. For example, if the model for a rectangle is given

priority over that for a square, none of the square patterns will

be recognized as squares.

Every time a rule succeeds, only the corresponding matched

pattern is corrected. To extract all similar patterns existing in

the database, the rule that matches a pattern has to undergo

repeated execution until it fails. Thus the priority relation

must be reflexive, which means a rule has priority over itself.

Secondly, if a particular model gets a priority over another

model, the latter model can never get a priority over the former.

Hence the priority relation is antisymmetric. Finally, if rule 'a'

gets priority over rule 'b', and rule 'b' gets priority over rule

'c', then rule 'a' has priority over rule 'c'. This suggests that

the priority relation is transitive. These reflexive, transitive

and antisymmetric properties make the priority relation a Partially

Ordered Relation.

For simplicity of representation, let each rule be

represented by its number. Then {1,2,3,4,5,6,7,8,9,10,11,12} is

the set of rules defining all the geometric models. A square

pattern satisfies the set {2,3,8,9,10,11,12}. Thus a square can

be viewed as partitioning the set of rules into

{ {2,3,8,9,10,11,12}, 1,4,5,6,7}. Similarly we have partitions

of other patterns as follows:

RECTANGLE: { {2,3,8,9,10}, 1,4,5,6,7,11,12}

RHOMBUS: { {2,3,8,9,12}, 1,4,5,6,7,10,11}

PARALLELOGRAM: { {2,3,8,9}, 1,4,5,6,7,10,11,12}

TRAPEZIUM: { {2,3,8}, 1,4,5,6,7,9,10,11,12}

QUADRILATERAL: { {2,3}, 1,4,5,6,7,8,9,10,11,12}

EQUILATERAL TRIANGLE: { {1,3,4,5}, 2,6,7,8,9,10,11,12}

RIGHT-ISOSCELES TRIANGLE: { {1,3,4,6,7}, 2,5,8,9,10,11,12}

ISOSCELES TRIANGLE: { {1,3,4}, 2,5,6,7,8,9,10,11,12}

RIGHTANGLED TRIANGLE: { {1,3,6}, 2,4,5,7,8,9,10,11,12}

TRIANGLE: { {1,3}, 2,4,5,6,7,8,9,10,11,12}

POLYGON: {1,2,3,4,5,6,7,8,9,10,11,12}

These partitions of the partially ordered rules can be

represented in the form of a Hasse diagram[53]. This is a diagram

where an arc represents a relation and a node represents a

partition. The larger the partition, the lower the node level. The

Hasse diagram for the set of rules and partitions made by

geometric models is shown in Fig.4.8. One can observe that rules

with more stringent conditions make larger partitions. The rules

describing square, equilateral triangle, etc. have large

partitions, whereas the rule describing in general a polygon

makes no partition at all. Using the specificity ordering

resolution strategy[ll], the rule which makes the biggest

partition is given the highest priority because under the given

circumstances it is the most specialized rule. A rule with zero

or the smallest partition is given the lowest priority. In

Fig.4.8 Hasse diagram showing rule priorities. (The numbers at the nodes correspond to the rule numbers.)

between these two bounds, priority is distributed in accordance

with the size of the partition made by a rule. From the Hasse

diagram, we observe that there is no unique lower bound, i.e.

there is no largest partition. The partitions of equilateral

triangle, right-isosceles triangle and square are the three lower

bounds of the diagram. Since these models are unrelated, no

matter what priority is forced over them, the decision process is

not hindered as long as the existing relations are maintained.

One such forced priority relation, which preserves the priorities

specified in the Hasse diagram, is

Square --> Rectangle --> Rhombus --> Parallelogram --> Trapezium

--> Quadrilateral --> Equilateral Triangle --> Right Isosceles

Triangle --> Isosceles Triangle --> Triangle --> Polygon.

Here the symbol '-->' is read as 'has the priority over'. This is

a totally ordered relation and can be easily implemented.
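A totally ordered priority list of this kind can be implemented as a first-match scan. The sketch below pairs each model with the set of rules it satisfies, as tabulated in the partitions above. Two points are assumptions and not taken from the text: the rightangled triangle, which does not appear in the chain, is placed after the isosceles triangle, and the general polygon is taken to satisfy only rule 3.

```python
from typing import List, Optional, Set, Tuple

# (model, rules it satisfies), listed from highest to lowest priority.
PRIORITY: List[Tuple[str, Set[int]]] = [
    ("SQUARE", {2, 3, 8, 9, 10, 11, 12}),
    ("RECTANGLE", {2, 3, 8, 9, 10}),
    ("RHOMBUS", {2, 3, 8, 9, 12}),
    ("PARALLELOGRAM", {2, 3, 8, 9}),
    ("TRAPEZIUM", {2, 3, 8}),
    ("QUADRILATERAL", {2, 3}),
    ("EQUILATERAL TRIANGLE", {1, 3, 4, 5}),
    ("RIGHT-ISOSCELES TRIANGLE", {1, 3, 4, 6, 7}),
    ("ISOSCELES TRIANGLE", {1, 3, 4}),
    ("RIGHTANGLED TRIANGLE", {1, 3, 6}),   # position in the chain is assumed
    ("TRIANGLE", {1, 3}),
    ("POLYGON", {3}),                      # assumed: only the polygon rule holds
]


def recognize(satisfied: Set[int]) -> Optional[str]:
    """Return the highest priority model whose rules are all satisfied."""
    for model, rules in PRIORITY:
        if rules <= satisfied:
            return model
    return None
```

Because every specialised model precedes the models it is a special case of, the first match is always the most specific interpretation available.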

4.3 CONTEXT MODULE FOR LINE SKETCHES

An interpretation of an input figure is checked for its

contextual consistency. If the interpretation is found to be

inconsistent, a search for an alternative interpretation is

made. If no context is specified, all interpretations are

assumed to be contextually consistent.

The context module can also check for the user specified

conditions. This provision can be effectively used for

recognition and correction of highly distorted sketches. This may

provide higher freedom to the user for drawing input line

sketches. Consider the case of the Fig.1.1. The machine can view

the pattern as four triangles connected at their vertices or two

quadrilaterals one within the other. If the machine corrects the

sketch as a combination of triangles, the user may find the

output distorted, if he were to view the figure as two

quadrilaterals one within the other. To overcome such problems, a

user may specify his views in the context module.

4.3.1 Structure of the Context Module

The context module is represented in a form similar to

frames, with context as the frame name and various slots as

geometric models. Values of these slots decide whether a queried

model is consistent or not. Every new context, which is a frame

by itself, fills a 'default' slot of the general context frame.

When the contextual consistency of an interpretation is to be

checked, the query starts at the present context frame. If the

frame contains no information about the pattern queried, the

default value is obtained from the frame next in the hierarchy.

Fig.4.9 depicts the structure of the context module.
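The frame lookup with default inheritance can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the class `ContextFrame`, its `parent` link (standing for "the frame next in the hierarchy") and the boolean slot values are hypothetical choices, not details taken from the thesis implementation.

```python
class ContextFrame:
    """A context frame: slots map geometric model names to True/False."""

    def __init__(self, name, slots=None, parent=None):
        self.name = name
        self.slots = dict(slots or {})
        self.parent = parent          # assumed link to the next frame in the hierarchy

    def is_consistent(self, model):
        frame = self
        while frame is not None:
            if model in frame.slots:  # this frame has information on the model
                return frame.slots[model]
            frame = frame.parent      # otherwise fall back to the default frame
        return True                   # no context specified: assumed consistent

general = ContextFrame("general")
only_quads = ContextFrame(
    "only quadrilaterals",
    {"triangle": False, "quadrilateral": True},
    parent=general,
)
```

With this structure, `only_quads.is_consistent("triangle")` is false, while a model on which no frame has information is treated as contextually consistent.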

4.4 CORRECTION OF HAND DRAWN LINE SKETCHES

Every figure, with contextually consistent interpretation,

is corrected in accordance with the relations governing the model

with which it matched. To preserve the overall sketch appearance,

the relations that exist between the figure to be corrected and

its connected neighbors are also considered in the correction

process. The correction process basically consists of two stages.

The first stage involves selection of a reference line segment and
the second stage involves correction of the recognized geometric
figure with respect to the selected reference.

Fig.4.9 Structure of the context module.

4.4.1 Selection of a Reference Line Segment

Various relations among the line segments constituting a
geometric model can be expressed in terms of relative
orientations of line segments and their lengths. Unless one
of the line segments is used as the reference, relative
orientation remains undefined. Hence the first step in the
correction process is to select a reference line segment. During
the selection of a reference line segment, we try to satisfy and
preserve the neighborhood relations of the pattern to be
corrected. The selection of the reference line segment is governed by
the following set of rules.

[[
i. Select a line segment which is already corrected as the
   reference.
ii. Select a line segment which has a definite relation
   with its neighbors, as the reference.
iii. Select a line segment which has a definite relation
   with the co-ordinate axis, as the reference.
iv. Select a line segment as the reference.
]]

The symbol [[ ... ]] indicates that the rules within the
symbol are totally ordered and the decision made by the earliest
rule satisfied in the rule list is considered. If a rule with the
highest existing priority is satisfied by more than one line
segment of the recognized figure, any one of them is selected as
the reference. In rules ii and iii, the check for a definite relation
involves a check for the existence of approximately parallel or
approximately perpendicular relations.
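The ordered rule list above can be sketched in Python as follows. The segment records (dictionaries with boolean flags), the function names and the 10 degree tolerance are assumptions made for illustration only; the thesis takes the tolerance from user specified or default thresholds.

```python
def has_definite_relation(angle_deg, tol_deg=10.0):
    """True if the angle is approximately parallel (0/180) or perpendicular (90)."""
    a = angle_deg % 180.0
    return min(a, 180.0 - a) <= tol_deg or abs(a - 90.0) <= tol_deg

# Rules i to iv, in priority order. Each rule inspects one segment record.
RULES = [
    lambda s: s["corrected"],            # i.   already corrected
    lambda s: s["related_to_neighbor"],  # ii.  definite relation with a neighbor
    lambda s: s["related_to_axis"],      # iii. definite relation with an axis
    lambda s: True,                      # iv.  any segment
]

def select_reference(segments, rules=RULES):
    """Return a segment satisfying the earliest satisfied rule, or None."""
    for rule in rules:
        for seg in segments:
            if rule(seg):
                return seg               # any satisfying segment may be chosen
    return None
```

Because rule iv accepts every segment, a reference is always found for a non-empty figure.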

4.4.2 Correction of Individual Geometric Figures

Correction is made with respect to the selected reference

segment and in accordance with the relations governing a matched

model. All the relations of the model are defined in terms of

connectivity, length and angles. The connectivity is never

destroyed because arcs defining connectivity are not updated.

Even if the coordinate value of a 'point' node is updated during
correction of any one of the line segments, all the other line
segments connecting that point also refer to the updated point.
This is because in a graph structure a relational arc refers to a
node and not to the value of the node. Relations like relative
orientation or line lengths are also preserved, because the
corrective shift given to any point is such that the modification
in line length and relative orientation is always within the
specified range of approximation.
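The point that arcs refer to nodes rather than to node values can be illustrated with shared objects. The following Python sketch uses hypothetical `Point` and `Segment` classes; it only demonstrates the referencing idea and is not the PROLOG representation used in the thesis.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Segment:
    def __init__(self, start, end):
        # The segment stores references to Point nodes, not copies of coordinates.
        self.start, self.end = start, end

p1, p2, p3 = Point(0, 0), Point(10, 1), Point(10, 11)
s1, s2 = Segment(p1, p2), Segment(p2, p3)

# A corrective shift applied to the shared node p2 ...
p2.y = 0
# ... is seen by every segment connected to it, so connectivity is preserved.
```

After the shift, both `s1.end` and `s2.start` still denote the same node and report the updated coordinate.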

The following geometric formulae are used for correction.

1. If a line segment (X1,Y1),(X2,Y2) is rotated by an angle 'θ'
as shown in Fig.4.10, then the rotated end point (X3,Y3) is given
by

Fig.4.10 Rotation of a straight line.

Fig.4.11 Location of a connected segment end point (X3,Y3).

2. If a line segment (X2,Y2),(X3,Y3) of length L2, is connected
to a reference line segment (X1,Y1),(X2,Y2) of length L1, and if
they subtend an angle 'α' as shown in Fig.4.11, the point (X3,Y3)
is given by
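The first construction can be illustrated with a short Python sketch. It is an assumption, not the formula of the thesis: `rotate_endpoint` (a hypothetical name) applies the standard counter-clockwise rotation of (X2,Y2) about (X1,Y1), and the sign convention of Fig.4.10 may differ.

```python
import math

def rotate_endpoint(x1, y1, x2, y2, theta_deg):
    """Rotate the end point (x2, y2) about (x1, y1) by theta_deg, counter-clockwise."""
    t = math.radians(theta_deg)
    dx, dy = x2 - x1, y2 - y1
    x3 = x1 + dx * math.cos(t) - dy * math.sin(t)
    y3 = y1 + dx * math.sin(t) + dy * math.cos(t)
    return x3, y3
```

The rotation keeps the segment length unchanged, so only the relative orientation is corrected.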

Up to this stage, care is taken to ensure that every

corrected individual figure of the input sketch is an

approximation of its original representation. But the cumulative
effect of this approximation may lead to an increase or decrease of
the overall sketch size, and sometimes it may introduce
distortions in the overall appearance. To suppress this
cumulative effect, approximately equal sides and angles are

averaged out before correction. Considering the above described

formulations and constraints, the correction procedures for

various geometric models are described as follows:

Square:

1. Select a reference line segment from the line segments
   constituting the square to be corrected.
2. [[
   i. If the reference segment is already corrected, and no
      other point is corrected, generate a square of the
      size of the reference segment.
   ii. If in addition to the reference segment one more
      corner point is already corrected, shift the
      uncorrected point, symmetric to the already corrected
      point.
   iii. Generate a square of the size equal to the average
      side length of the recognized square figure.
   ]]

Given a reference line segment (X1,Y1),(X2,Y2) of a square,
the corner point (X3,Y3) opposite to (X2,Y2) is given by

Rectangle:

1. Select a reference line segment from the line segments
   constituting the rectangle to be corrected.
2. [[
   i. If the reference segment was already corrected and no
      other point is corrected, average the uncorrected
      opposite sides and generate a rectangle.
   ii. If in addition to the reference segment one other
      point is corrected, correct the remaining point
      symmetric to the already corrected point.
   iii. Generate a rectangle with sides equal to the average
      of the opposite sides.
   ]]

Given a reference line segment (X1,Y1),(X2,Y2) of length L1
of a rectangle, the corner point (X3,Y3) opposite to (X2,Y2) is
given by
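A hedged Python sketch of the rectangle corner computation follows. It mirrors the arithmetic of the PROLOG procedure of Fig.4.16: two candidate points are placed perpendicular to the reference segment at (X2,Y2), at a distance L2, and the candidate nearer to the drawn point is kept. The function names are hypothetical, and the second candidate is taken here as the mirror image of the first about the reference segment.

```python
import math

def rectangle_points(x1, y1, x2, y2, l2):
    """Two candidate corners at distance l2 from (x2, y2), perpendicular to the reference."""
    l1 = math.hypot(x2 - x1, y2 - y1)
    k = l2 / l1
    p3 = (x2 + (y2 - y1) * k, y2 - (x2 - x1) * k)
    p4 = (x2 - (y2 - y1) * k, y2 + (x2 - x1) * k)
    return p3, p4

def nearest_rectangle_point(ref, l2, drawn):
    """Pick the candidate corner closest to the hand-drawn point."""
    (x1, y1), (x2, y2) = ref
    p3, p4 = rectangle_points(x1, y1, x2, y2, l2)
    d3 = math.hypot(drawn[0] - p3[0], drawn[1] - p3[1])
    d4 = math.hypot(drawn[0] - p4[0], drawn[1] - p4[1])
    return p3 if d3 <= d4 else p4
```

For a square, the same computation applies with L2 equal to L1.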

Parallelogram:

1. Select a reference line segment from the line segments
   constituting the parallelogram to be corrected.
2. [[
   i. If the reference line segment was already corrected
      and no other point is corrected, generate a
      parallelogram with sides equal to the average of the
      uncorrected opposite sides and angles equal to the
      average of the opposite angles.
   ii. If in addition to the reference line segment one
      other corner point is corrected, shift the
      uncorrected point symmetric to the already corrected
      point.
   iii. Generate a parallelogram of sides equal to the
      average of the opposite sides and angles equal to the
      average of opposite angles.
   ]]

Rhombus:

1. Select a reference line segment from the line segments
   constituting the rhombus to be corrected.
2. [[
   i. If the reference line segment was already corrected
      and no other point is corrected, generate a rhombus
      of the size of the reference line segment and angles
      equal to the average of the opposite angles.
   ii. If in addition to the reference line segment one
      other corner point is corrected, shift the
      uncorrected point symmetric to the already corrected
      point.
   iii. Generate a rhombus with sides equal to the average of
      the four sides and angles equal to the average of the
      opposite angles.

   ]]

Given a reference line segment (X1,Y1),(X2,Y2) of length L1
and the segment to be corrected (X2,Y2),(X3,Y3) of length L2, the
corner point (X3,Y3) opposite to (X2,Y2) that constitutes a
parallelogram or a rhombus is given by

    X3 = X2 - (L2/L1)((X2-X1)cos(θ) + (Y2-Y1)sin(θ))
    Y3 = Y2 - (L2/L1)((Y2-Y1)cos(θ) - (X2-X1)sin(θ))

where 'θ' is the angle between the reference side and the side
to be corrected. In case of a rhombus the ratio L2/L1 is 1.
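The corner formula above can be checked numerically with a small Python sketch. The function name `parallelogram_corner` is hypothetical, the angle is taken in degrees for convenience, and the second coordinate is computed relative to Y2.

```python
import math

def parallelogram_corner(x1, y1, x2, y2, l2, theta_deg):
    """Corner (x3, y3) at distance l2 from (x2, y2), at angle theta to the reference side."""
    l1 = math.hypot(x2 - x1, y2 - y1)
    t = math.radians(theta_deg)
    k = l2 / l1
    x3 = x2 - k * ((x2 - x1) * math.cos(t) + (y2 - y1) * math.sin(t))
    y3 = y2 - k * ((y2 - y1) * math.cos(t) - (x2 - x1) * math.sin(t))
    return x3, y3
```

With a reference segment from (0,0) to (2,0), L2 = 2 and an angle of 60 degrees (a rhombus, since L2/L1 = 1), the corner lies at distance 2 from (2,0). With an angle of 90 degrees the formula reduces to a perpendicular corner.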

Trapezium:

1. Select one of the parallel sides as the reference
   segment.
2. [[
   i. If the non-parallel sides have approximately equal
      inclination with respect to the reference segment,
      make the trapezium symmetric so that the side
      opposite to the reference segment is placed at the
      average original height and parallel to the reference
      segment.
   ii. Correct the sides to the original angle, so that the
      side opposite to the reference segment is strictly
      parallel and placed at a distance equal to the average
      original distance from the reference segment.
   ]]

Quadrilateral:

1. Select a reference line segment from the line segments
   constituting the quadrilateral to be corrected.
2. [[
   i. If the remaining sides have a definite relation with
      the reference line segment, correct them accordingly.
   ii. Correct the sides to their original angles and
      lengths.
   ]]

Equilateral triangle:

1. Select a reference line segment from the line segments
   constituting the equilateral triangle to be corrected.
2. [[
   i. If the segment is already corrected, make an
      equilateral triangle of side equal to the reference
      segment.
   ii. Generate an equilateral triangle of side equal to the
      average of the three sides.
   ]]
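Rule ii for the equilateral triangle can be sketched in Python as below. The sketch rests on assumptions that are not stated in the thesis: the first vertex and the direction of the reference side are kept fixed, and the apex is placed on the same side of the reference as the drawn third point. The name `equilateral_from` is hypothetical.

```python
import math

def equilateral_from(p1, p2, p3):
    """Equilateral triangle anchored at p1 along p1->p2, side = mean of drawn sides."""
    d12 = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    d23 = math.hypot(p3[0] - p2[0], p3[1] - p2[1])
    d31 = math.hypot(p1[0] - p3[0], p1[1] - p3[1])
    side = (d12 + d23 + d31) / 3.0
    # Unit vector along the reference side.
    ux, uy = (p2[0] - p1[0]) / d12, (p2[1] - p1[1]) / d12
    # The sign of the cross product tells on which side the drawn apex lies.
    cross = ux * (p3[1] - p1[1]) - uy * (p3[0] - p1[0])
    sign = 1.0 if cross >= 0 else -1.0
    height = side * math.sqrt(3.0) / 2.0
    q2 = (p1[0] + side * ux, p1[1] + side * uy)
    q3 = (p1[0] + side / 2.0 * ux - sign * height * uy,
          p1[1] + side / 2.0 * uy + sign * height * ux)
    return p1, q2, q3
```

Averaging the three sides keeps the corrected triangle close to the drawn size, which is the stated purpose of averaging before correction.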

Isosceles triangle:

1. Select a reference line segment from the line segments
   constituting the isosceles triangle to be corrected.
2. [[
   i. If the reference line segment is one of the equal
      sides and is already corrected, make an isosceles
      triangle with the original angle.
   ii. If the reference segment is not corrected, make the
      isosceles triangle with two equal sides of a length
      equal to the average length of the approximately
      equal sides.
   iii. If the reference side is the side connecting the
      equal sides, make the isosceles triangle with two
      equal sides of a length equal to the average length
      of the approximately equal sides.
   ]]

Right-isosceles triangle:

1. Select a reference line segment from the line segments
   constituting the right isosceles triangle to be
   corrected.
2. [[
   i. If the reference side is one of the equal sides and
      is already corrected, generate a perpendicular line
      segment with length equal to the reference line
      segment.
   ii. If the reference line segment is not corrected,
      generate perpendicular sides of length equal to the
      average length of the approximately equal sides.
   iii. If the corrected side is the side opposite to the
      right angle, make a right-isosceles triangle of two
      equal sides of length equal to the average length of
      the approximately equal sides.
   ]]

Right-angle triangle:

1. Select a reference line segment, which subtends the right

angle.

2. Find the third point as a rectangle point, with connected

segment length equal to the original length of the

segment to be corrected.

Triangle:

1. Select a reference line segment from the line segments
   constituting the triangle to be corrected.
2. [[
   i. Correct the remaining sides with respect to their
      corrected neighbors, with which they have a definite
      relation.
   ii. Correct the sides with respect to the reference
      segment.
   ]]

Polygon:

1. Select a reference line segment from the line segments
   constituting the polygon to be corrected.
2. [[
   i. If the polygon is a regular polygon, correct the
      remaining line segments with reference to the
      selected reference, so that all angles and sides are
      equal.
   ii. Correct the rest of the line segments to their
      original angles.
   ]]

4.5 CONTROL STRATEGY FOR RECOGNITION AND CORRECTION STAGE

The overall control structure for the recognition and

correction stage is presented in Fig.4.12. The features extracted

in the preprocessing stage are initially represented in the form

of a graph. Then the graph is reduced to obtain an optimal set of

features. This reduction process is a sequence of operations like

deletion of the noise segments, removal of the duplicate

segments, smoothing of the bends within the user defined or

default threshold and classification of the feature nodes. The

reduced graph is then processed to obtain the relative

orientation information.

The graph representation forms the relational data base

which answers queries from various geometric models. At this

stage the rule base starts querying the data base for various

conditions. Starting from the highest priority rule, the rules

which are satisfied are assumed to be matched. A figure which

satisfies a rule is corrected, only if the model is contextually

consistent. After every correction, the rule base is reset and
queries resume. When all the rules fail to be satisfied, a search is
made for the uncorrected points. The uncorrected points in the
closed loop are corrected first and then the open points are
corrected. The open points are corrected only at the end because
any corrective shift given to the open points does not affect
other parts of the sketch.

Fig.4.12 Control structure of the recognition and correction stage.
(Blocks: primary graph representation; graph reduction and smoothing;
preservation of angles and classification of graph; representation
(relational data base); rule base; control strategy; context;
correction procedures; output.)
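The match, check, correct and reset cycle described in this section can be sketched in Python as follows. This is an illustrative outline only: the function name, the representation of rules as (model, predicate) pairs and the callbacks `is_consistent` and `correct` are assumptions, and the final handling of leftover closed-loop and open points is omitted.

```python
def recognize_and_correct(figures, rules, is_consistent, correct):
    """rules: list of (model_name, predicate), highest priority first."""
    corrected = []
    progress = True
    while progress:
        progress = False
        for model, matches in rules:       # always restart from the top rule
            for fig in figures:
                if fig not in corrected and matches(fig) and is_consistent(model):
                    correct(fig, model)
                    corrected.append(fig)
                    progress = True
                    break
            if progress:
                break                      # rule base is reset after a correction
    return corrected
```

A figure whose highest priority match is contextually inconsistent is simply passed on to the lower priority rules, which corresponds to the search for an alternative interpretation.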

4.6 IMPLEMENTATION OF RECOGNITION AND CORRECTION STAGE

The implementation of the recognition and correction stage

is done in PROLOG. PROLOG is a declarative language, where the
various relations existing between the line segments can be
declared as facts. As PROLOG uses resolution, all decisions can
be obtained through simple queries over the stored facts. As the
contextual consistency check involves a search for an alternate
solution, backtracking of the decision process is essential.
This backtracking is inherent in PROLOG.

In the implementation of the graph structure, all the feature
nodes are treated as arguments of the relations by which they are
bound. The relations in turn represent the arcs in the graph
structure. An example of the graph structure and its PROLOG
translation is shown in Fig.4.13. This PROLOG declaration also
acts as a relational database, which can answer the queries made
by the rules.
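The idea of facts acting as a relational database can be mimicked outside PROLOG with plain tuples. The Python sketch below transcribes the legible `connects` and `angle` facts of Fig.4.13; the query function `segments_at` is a hypothetical helper added for illustration.

```python
# Facts transcribed from the legible part of Fig.4.13.
connects = [
    ("seg1", "p3", "p4"), ("seg2", "p4", "p2"),
    ("seg3", "p1", "p3"), ("seg0", "p1", "p2"),
]
angle = [
    ("seg0", "seg2", 49), ("seg0", "seg3", 130),
    ("seg3", "seg1", 49), ("seg1", "seg2", 130),
]

def segments_at(point):
    """Query: which segments are connected to the given point node?"""
    return sorted(s for s, a, b in connects if point in (a, b))
```

Each point of the parallelogram is shared by exactly two segments, for example `segments_at("p1")` returns `["seg0", "seg3"]`.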

The reduction procedures explained in section 4.1 are
implemented in terms of a set of conditions, which query the graph
structure. An example of the PROLOG declaration of 'deviation
smoothing' is given in Fig.4.14. Node coloring is achieved by a
simple process of assertion of a fact. For example, an 'open'
color is given to a point 'p10' by asserting the fact

    open(p10).

    length(seg0, 72).
    length(seg1, 68).
    length(seg2, 182).
    length(seg3, ...).
    connects(seg1, p3, p4).
    connects(seg2, p4, p2).
    connects(seg3, p1, p3).
    connects(seg0, p1, p2).
    angle(seg0, seg2, 49).
    angle(seg0, seg3, 130).
    angle(seg3, seg1, 49).
    angle(seg1, seg2, 130).

Fig.4.13 Graph representation of a parallelogram and its PROLOG translation.

    /* THIS IS AN ITERATIVE GOAL WHICH SUCCEEDS WHEN
       ALL THE LINE SEGMENTS ARE CHECKED */
    deviation_smooth :-
        connected(S1, A, B), connected(S2, A, C),
        C \= B, not(connected_to(A, B, C)),
        find_angle(B, A, C, Tan, Cos, Sin),
        ang_threshold(Threshold),
        L =< Threshold,
        process(S1, S2, A, B, C),
        fail.

    /* SMOOTHING COMPLETED */
    deviation_smooth.

    /* OLD FACTS ARE RETRACTED AND MODIFIED FACTS ARE ASSERTED
       INTO THE GRAPH. SEGMENTS S1 OF LENGTH L1 AND S2 OF
       LENGTH L2 ARE SMOOTHENED TO S1 OF LENGTH L1+L2 */
    process(S1, S2, A, B, C) :-
        retract(length(S1, L1)),
        retract(length(S2, L2)),
        L3 is L1 + L2,
        asserta(length(S1, L3)),
        asserta(connects(S1, B, C)),
        retract(connects(S1, A, B)),
        retract(connects(S2, A, C)),
        retract(s(S2)),
        retract(p(A)),
        retract(co_ord(A, _, _)), !.

Fig.4.14 PROLOG code for deviation smoothing.

The classification procedure involves repeated application
of the rules given in section 4.1. Its PROLOG translation is given
in Fig.4.15. A rule defining a geometric model is translated in
the form of a group of queries, where the rule head acts as the
main goal and all the queries are declared as its subgoals. For
example, the rule for triangle is declared in the form:

    connected(Seg2, B, C),
    Seg2 \= Seg1.

When these four queries are responded affirmatively by the
database, there exists a triangular pattern (A,B,C) in the
database.
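The search for a triangular pattern can be illustrated with a Python sketch over `connects` facts. It is one possible formulation, not a transcription of the PROLOG rule: the function names are hypothetical, and the three points are enumerated as unordered combinations so that each triangle is reported once.

```python
from itertools import combinations

def connected(connects, a, b):
    """Segments joining points a and b, in either direction."""
    return [s for s, p, q in connects if {p, q} == {a, b}]

def triangles(connects):
    """All point triples (A, B, C) pairwise joined by a segment."""
    points = sorted({p for _, a, b in connects for p in (a, b)})
    found = []
    for a, b, c in combinations(points, 3):
        if (connected(connects, a, b) and connected(connects, b, c)
                and connected(connects, c, a)):
            found.append((a, b, c))
    return found
```

In PROLOG the same enumeration is obtained for free through backtracking over the `connected` subgoals.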

Because PROLOG uses depth first sequential search, it is
advantageous to reduce the number of queries to arrive at the
results. We observe that in most cases, rules defining various
members of the same family of models repeatedly make the queries
which are common to the whole family. This is a computationally

expensive exercise, especially in sequential processing systems.

This can be reduced to a single query by grouping the common

queries under a separate goal which has the priority over the

corresponding class of rules. For example, all the rules of a
triangle search for the existence of a three segment long closed
loop. To avoid this repeated search, these queries are grouped
under a separate rule and the rule is given a priority over other
rules for triangle and the result of this search is then passed
over to all other rules of triangles. As PROLOG gives higher
priority to rules which are earlier in the list of rules of one
kind, if necessary, priority can be given to a particular rule
over the other just by asserting the rule before the other rule.

    /* THIS RECURSIVE CALL CLASSIFIES THE NODES
       WHICH ARE ONE-CONNECTED AS 'OPEN' AND REST OF
       THE NODES AS 'SOFT' */
    classify :-
        p(X), singly_connected(X),
        retract(p(X)),
        asserta(open(X)),
        retract(connects(S, X, Y)),
        asserta(disconnect(S, X, Y)),
        classify(Y).
    classify(Y) :-
        p(Y), singly_connected(Y),
        classify(Y).
    classify(_) :-
        retract(p(X)),
        asserta(soft(X)),
        classify(_).
    singly_connected(X) :-
        not(double_connected(X)).
    double_connected(X) :-
        connected(S1, X, Y),
        connected(S2, X, Z),

Fig.4.15 PROLOG code for graph classification.
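The benefit of grouping the common queries can be sketched in Python: the closed-loop search is done once, and its result is handed to the family-specific rules. The function names and the 10% equality tolerance `tol` are assumptions for illustration, and the right-isosceles rule is left out because it needs angle information.

```python
def approx_equal(x, y, tol):
    """Approximate equality within a relative tolerance."""
    return abs(x - y) <= tol * max(x, y)

def classify_triangle(sides, tol=0.1):
    """Apply the triangle-family rules to one already-found three-segment loop."""
    a, b, c = sides
    if approx_equal(a, b, tol) and approx_equal(b, c, tol) and approx_equal(a, c, tol):
        return "equilateral triangle"
    if approx_equal(a, b, tol) or approx_equal(b, c, tol) or approx_equal(a, c, tol):
        return "isosceles triangle"
    return "triangle"

def classify_all(loops, tol=0.1):
    # `loops` is the shared result of the single closed-loop search,
    # so no family rule has to repeat that search.
    return [classify_triangle(sides, tol) for sides in loops]
```

The order of the tests inside `classify_triangle` plays the role of the rule order in PROLOG: the earlier, more specific rule wins.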

The correction procedures are expressed in the form of

mathematical formulations as explained in section 4.4.2. These

formulations can be directly translated into PROLOG. One such

example is shown in Fig.4.16. PROLOG shows inefficiency when

correction procedures are to be executed. This is because,

instead of treating complete procedure as a single goal, PROLOG

treats every mathematical operation as a separate goal, and

carries out an inherent and expensive exercise of storing and

retraction of the goal environment. Hence, though PROLOG is

effective in the implementation of graph representation and rule

base, it is quite inefficient while carrying out procedural jobs

like correction.

In cases of finding alternate solutions, we felt the need
of the guarded command feature of Communicating Sequential
Processes (CSP). This feature, if included in PROLOG, may affect
the pure logic declaration in PROLOG, but it may be less harmful
than the cut feature of PROLOG.

    /* THE PROCEDURE FINDS THE RECTANGLE POINT NEAREST TO
       THE POINT TO BE CORRECTED */

    /* If the point to be corrected is 'hard', the goal is true */
    make_rect(C, B, A, L2) :-
        hard(A), !.

    /* (C, B) is the reference line segment and A is the
       point to be corrected at a distance L2 */
    make_rect(C, B, A, L2) :-
        retract(co_ord(A, X, Y)),
        co_ord(C, X1, Y1),
        co_ord(B, X2, Y2),
        fdist(X1, Y1, X2, Y2, L1),
        X3 is X2 + (Y2 - Y1) * L2 / L1,
        Y3 is Y2 - (X2 - X1) * L2 / L1,
        X4 is X2 - (Y2 - Y1) * L2 / L1,
        Y4 is Y2 + (X2 - X1) * L2 / L1, !,
        fdist(X, Y, X3, Y3, D1),
        fdist(X, Y, X4, Y4, D2),
        if_dist(D1, X3, Y3, D2, X4, Y4, A).

    /* D is distance between (X1,Y1) and (X2,Y2) */
    fdist(X1, Y1, X2, Y2, D) :-
        L is (X1 - X2) * (X1 - X2) + (Y1 - Y2) * (Y1 - Y2),
        sqrt(L, D).

    /* Correct the point to the nearest rectangle point */
    if_dist(D1, _, _, D2, X, Y, P) :-
        D1 >= D2, asserta(co_ord(P, X, Y)).
    if_dist(D1, X, Y, D2, _, _, P) :- ...

Fig.4.16 Correction of a corner point to the nearest rectangle point.

4.7 RESULTS

The recognition and correction stage explained in this

chapter takes line features of a sketch and produces a geometric

representation. Performance of the recognition and correction

stage can be examined through a set of typical inputs and the

corresponding drafted outputs. Fig.4.17 exhibits variations in

the drafted versions of a hand-drawn quadrilateral with respect

to values of thresholds for approximation. Fig.4.17b is the

drafted version with default thresholds. Here, the sketch is
recognized as a parallelogram and corrected accordingly. In
Fig.4.17c, the drafted sketch is a rectangle. This is because of
a large, user specified threshold for absolute orientation
(Ao = 50 degrees). With the same Ao, when the threshold for
equality Eo is raised to 25%, the input is recognized and
corrected as a square (Fig.4.17d). One more example depicting
variations in a drafted sketch with user specified thresholds is
shown in Fig.4.18. Here, an irregular triangle is corrected to an
equilateral triangle, when the threshold for equality is raised to
25%. In all the above cases the sketch size is maintained within

the thresholds specified by the user.

In Fig.4.19, the effect of user specified contexts on a

drafted sketch is illustrated. Fig.4.19b is the drafted version

with default context, where every recognition is contextually

consistent. Here, the sketch is recognized as a combination of a
trapezium and four triangles. Fig.4.19c is the drafted version
when the context specified is "presence of only a quadrilateral
with threshold for equality 15%". Here, the inner quadrilateral
is corrected as a square and the rest of the line segments are
corrected with reference to this square. In this context none of
the triangles are recognized. Similarly, when the context is
"presence of only triangles", the drafted version of Fig.4.19d is
obtained. Here, one can observe that the quadrilateral is without
any regular shape.

Fig.4.17 BEHAVIOUR OF DRAFTED OUTPUT OF A QUADRILATERAL WITH THE VARIATION IN SPECIFIED THRESHOLDS.
a. INPUT GRAYTONE IMAGE OF A HAND-DRAWN QUADRILATERAL.
b. DRAFTED VERSION WITH DEFAULT THRESHOLDS.
c. DRAFTED VERSION WITH USER SPECIFIED THRESHOLD OF ABSOLUTE ORIENTATION Ao = 50 DEGREES.
d. DRAFTED VERSION WITH USER SPECIFIED THRESHOLDS Eo = 25% AND Ao = 50 DEGREES.

Fig.4.18 BEHAVIOUR OF DRAFTED OUTPUTS OF A TRIANGLE WITH THE VARIATION IN SPECIFIED THRESHOLDS.
a. INPUT GRAYTONE IMAGE OF A TRIANGLE.
b. PLOT OF THE FEATURES EXTRACTED IN PREPROCESSING.
c. DRAFTED VERSION OF THE TRIANGLE WITH DEFAULT THRESHOLDS.
d. DRAFTED VERSION WITH USER SPECIFIED THRESHOLD OF EQUALITY Eo = 25%.

Fig.4.19 VARIATIONS IN DRAFTED OUTPUTS WITH CONTEXT.
a. PLOT OF THE FEATURES EXTRACTED IN THE PREPROCESSING STAGE.
b. DRAFTED VERSION WITH DEFAULT CONTEXT.
c. DRAFTED VERSION WITH A CONTEXT OF PRESENCE OF ONLY QUADRILATERAL.
d. DRAFTED VERSION WITH A CONTEXT OF PRESENCE OF ONLY TRIANGLES.

In the absence of models for recognition of alphanumeric

features, the system fails to draft the sketch with alphanumeric

data. One such example is illustrated in Fig.4.20, where the line

features of character ' R ' are left uncorrected. If the system

were to be used for drafting the sketches with alphanumeric data,

corresponding object models must be included in the present

system. These models must be given priority over geometric models

so that the sketches with dimensional specifications can be

corrected according to specified dimensions. Under such

conditions, the only change needed in the correction procedure is

to override the default thresholds by a 'zero' value.

The corrected version of a sketch is in the form of feature

points with their coordinates and connectivity specified. This is

usually referred to as 'soft copy'. It can be observed that the

input image, which would have taken as many bytes of memory as

the number of pixels in the image, for its storage, now needs

only a few bytes of memory. Secondly, this compressed code can be

easily updated because it is in a machine-recognizable form.

Finally, the small size of the data allows efficient

communication and sketch duplication.

Fig.4.20 A SKETCH WHICH IS INCOMPLETELY DRAFTED IN THE ABSENCE OF ALPHANUMERIC MODELS.
a. INPUT GRAYTONE IMAGE.
b. PLOT OF THE FEATURES EXTRACTED IN THE PREPROCESSING STAGE.
c. DRAFTED VERSION WITH DEFAULT THRESHOLDS.

CHAPTER V

CONCLUSION

Various issues involved in computer drafting of hand-drawn

geometric line sketches are explored. While solutions are
suggested for the basic issues, the issues that are involved in
the development of a full-fledged drafting station are not
examined. The described system takes a digitized line sketch as
the input. This input graytone image is binarized and line
features are extracted without 'thinning'. The extracted features
are then recognized and corrected. Recognition of geometric
figures is done through approximate matching of the figures with
various geometric models. Provision is also made to check the
contextual consistency of a recognized model. The output is a
drafted soft copy of the input line sketch which can be
efficiently stored, duplicated or communicated over a Wide Area
Network (WAN).

The proposed binarization scheme uses local thresholds and
it exhibits good noise rejection properties. The scheme also

provides local line vectors at all edge points. It generates low

thresholds in dark(object) regions, so that formation of holes is

suppressed. But the same effect may lead to misclassification of

pixels in low illumination regions.

The line extraction and segmentation scheme uses the line

width and the run length constraints. The algorithm assumes that

the input is strictly a line sketch. It skips all the dark

regions present in the input sketch. Irrespective of the type of
the input sketch, the preprocessing stage extracts lines in the

form of intersection and deviation points. Hence all curves are

piecewise linearized.

The line features extracted are represented in the form of a

graph. New techniques for reduction and classification of a graph

and representation of approximate geometric models in the form of

rules are proposed. As the process is independent of the

orientation or translation of the figures, no effort is made to

preserve the overall orientation of the sketch.

To make the system more versatile, it can be modified to

accept line sketches with dimensions specified on them. It is

essential that in such a system, the features characterizing
dimensional information be separated from those characterizing
the sketch information. The system must also have an additional

stage for recognition of alphanumeric data to understand the

dimensions specified.

REFERENCES

1. Nobuyuki Otsu, "A threshold selection method from gray level histograms", IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-9, No.1, pp.62-66, January (1979).

2. J.R.Ullmann, "Binarization using associative addressing", Pattern Recognition, Vo1.6, pp.127-135, (1974).

3. L.T.Watson, K-Arvind, H.W.Ehrich, R.M.Haralick, "Extraction of lines and regions from graytone line drawing images", Pattern Recognition, Vo1.17, No.5, pp.493-507, (1984).

4. J.S.Weszka, A.Rosenfeld, "Histogram modification for threshold selection", IEEE Transactions on Systems, Man and Cybernetics, Vol.SMC-9, No.1, pp.38-52, January (1979).

5. R.W.Smith, "Computer processing of line images: A survey", Pattern Recognition, Vo1.20, No.1, pp.7-15, (1987).

6. K.Ramachandran, "Coding method for vector representation of engineering drawings", Proceedings IEEE, Vo1.68, No.7, pp.813-817, July (1980).

7. J.J.Sebok, L.E.Roemer, G.S.Malindzak Jr., "An algorithm for line in.tersection identification", Pattern Recognition, Vo1.13, No.2, pp.159-162, (1981).

8. Z.M.Wojcik, "A natural approach in image processing and pattern recognition: Rotating neighborhood technique, self adapting threshold, segmentation and shape recognition, Pattern Recognition, Vo1.18, No.5, pp.299-326, (1985).

9. D.H.Ballard, C.M.Brown, "Computer Vision", Prentice Hall Inc., Englewood Cliffs, New Jersey, pp.65-70, (1982).

10. T. L . Huntsberger, C. Rangaraj an, S.N.Jayaramamurthy, "Representation of uncertainty in computer vision using fuzzy sets", IEEE Transactions on Computers, Vo1.C-35, No.2, pp.145-156, February (1986).

li. P.H.Winston, "Artificial Intelligence", Second Edition, Adison Wesley Publishing Co. (1984).

12. D.L.Goetsch, "Introduction to Computer Aided Drafting", Engle Wood Cliffs, New Jersey, (1983).

1 M E , S.Kakumoto, T.Miyatake, S.Shimada, H.Matsushima, "Automatic recognition of design drawing and maps", International Conference on Pattern Recognition, Montreal, pp.1296-1305, (1984).

14. J.R.Ward, B-Blesser, Pencept Inc., "Interactive recognition

of hand printed character characters for computer input", IEEE CG&A, pp,24-37, September (1985).

15. M.C.Fulford, "The FASTRAK automatic digitizing system", Pattern Recognition, Vo1.14, No.1-6, pp.65-74, (1981).

16. J. P. Bixler, J.P. Sanford, "A technique for encoding lines and regions in engineering drawings", Pattern Recognition, Vo1.18, No.5, pp.367-377, (1985).

17. T.P.Clernent, "Extraction of line structural data from engineering drawings", Pattern Recognition, Vol.14, No.1-6, pp.43-52, (1981).

18. H-Murase, T. wakahara, "Online hand sketched figure recognition, Pattern Recognition, Vo1.19, N0.2, pp.147-160, (1986).

19, W.C.Lin, J.M.Pun, "Machine recognition and plotting of hand sketched line figures, IEEE Transactions on Systems, Man and Cybernetics, Vol.SMC-8, No.1, pp.52-57, January (1978).

20. A.Mitchie, J.K.Aggarwa1, "Image segmentation by conventional and information integrating techniques: A synopsis", Image and Vision Computing, vo1.3, No.2, pp.50-62, May (1985).

22. W-Doyle, "Operations useful for similarity invariant pattern recognition", J.Assoc. Comput. Mac,h.,Vol.9, pp.259-267, (1962).

22. J.M.S.Frewitt, M.L.Mendelsohn, "The analysis of cell images", Ann. N.Y. Acad. Sci. 128, pp.1035-1053, (1966).

23. K. F'ukanaga, "Introduction to Statistical Pattern Recognition", New York Academic Press, pp.260-267, (1972).

212. J.S.Weszka, " A survey of threshold selection techniques", Coinputer Graphics and Image Processing, Vo1.7, pp.259-265, (1978).

25. D.P.Panda, A.Rosenfeld, "Image segmentation by pixel classification in (gray level gradient) space", IEEE Transactions on Computers, Vol.27, pp.875-879, (1978).

26. J.S.Weszka, R.S.Nagel, A.Rosenfeld, "A threshold selection technique", IEEE Transactions on Computers, Vol.23, pp.1322-1326, (1974).

27. S.Watanabe and CYBEST group, "An automated apparatus for cancer prescreening: CYBEST", Computer Graphics and Image Processing, Vol.3, pp.350-358, (1974).

28. J.S.Weszka, J.A.Verson, A.Rosenfeld, "Threshold selection techniques 2", University of Maryland Computer Science Center Tech. Report 260, (1973).

29. R.N.Wolfe, "A dynamic thresholding technique for quantization of scanned images in automatic pattern recognition", National Security Industrial Association, Washington D.C., pp.143-102, May (1979).

30. H.Ogawa, K.Taniguchi, "Thinning and stroke segmentation for hand written Chinese character recognition", Pattern Recognition, Vol.15, No.4, pp.298-308, (1982).

31. T.W.Ridler, S.Calvard, "Picture thresholding using iterative selection method", IEEE Transactions on Systems, Man and Cybernetics, Vol.SMC-8, No.8, pp.629-632, August (1978).

32. C.K.Chow, T.Kaneko, "Boundary detection of radiographic images by a threshold method", Proc. IFIP Congress 71, Booklet TA-7, pp.130-134, North-Holland, Amsterdam, (1972).

33. N.Ahuja, A.Rosenfeld, R.M.Haralick, "Neighborhood gray level as feature in pixel classification", Pattern Recognition, Vol.12, pp.251-260, (1980).

34. J.Kittler, J.Foglein, "Contextual classification of multispectral pixel data", Image and Vision Computing, Vol.2, No.1, pp.13-29, February (1984).

35. J.Hyde, J.A.Fulwood, B.R.Corrall, "An approach to knowledge driven segmentation", Image and Vision Computing, Vol.3, No.4, pp.198-205, November (1985).

36. P.Zamperoni, "Model based segmentation of graytone images", Image and Vision Computing, Vol.2, No.3, pp.123-133, August (1984).

37. C.J.Hilditch, "Linear skeletons from square cupboards", Machine Intelligence, Vol.4, pp.403-420, (1969).

38. N.J.Naccache, R.Shinghal, "SPTA: A proposed algorithm for thinning binary patterns", IEEE Transactions on Systems, Man and Cybernetics, Vol.SMC-14, No.3, pp.409-418, May/June (1984).

39. F.W.M.Stentiford, R.G.Mortimer, "Some heuristics for thinning hand printed binary characters for OCR", IEEE Transactions on Systems, Man and Cybernetics, Vol.SMC-13, No.1, pp.81-83, January/February (1983).

40. C.Arcelli, G.S.Di Baja, "A thinning algorithm based on prominence detection", Pattern Recognition, Vol.13, No.3, pp.225-235, (1981).

41. K.J.Udupa, I.S.N.Murthy, "Some new concepts for encoding line patterns", Pattern Recognition, Vol.7, pp.225-233, (1975).

42. T.Wakayama, "A core line tracing algorithm by maximal square moving", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.PAMI-4, No.1, pp.68-74, (1982).

43. H.Freeman, L.S.Davis, "A corner finding algorithm for chain coded curves", IEEE Transactions on Computers, pp.297-303, March (1977).

44. A.Rosenfeld, J.S.Weszka, "An improved method of angle detection in digital curves", IEEE Transactions on Computers, pp.940-941, September (1975).

45. L.S.Davis, "Understanding shapes and angles", IEEE Transactions on Computers, Vol.C-26, No.3, pp.236-242, March (1977).

46. A.Rosenfeld, E.Johnston, "Angle detection on digital curves", IEEE Transactions on Computers, pp.875-878, September (1973).

47. T.Pavlidis, "A hybrid vectorization algorithm", International Conference on Pattern Recognition, Montreal, pp.490-492, (1984).

48. C.Y.Suen, M.Berthod, S.Mori, "Automatic recognition of hand printed characters - The state of art", Proceedings of IEEE, Vol.68, No.4, pp.469-487, April (1980).

49. S.Mori, K.Yamamoto, M.Yasuda, "Research in machine recognition of hand printed characters", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.PAMI-6, No.4, pp.386-405, July (1984).

50. M.A.Fischler, R.C.Bolles, "Perceptual organization and curve partitioning", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.PAMI-8, No.1, pp.100-105, January (1986).

51. H.Bunke, G.Sagerer, "Use and representation of knowledge in image understanding, based on semantic networks", International Conference on Pattern Recognition, Montreal, pp.1135-1137, (1984).

52. M.Numao, M.Ishizuka, "A frame like knowledge representation system for computer vision", International Conference on Pattern Recognition, Montreal, pp.1128-1130, (1984).

53. Zvi Kohavi, "Switching and Automata Theory", Second Edition, McGraw Hill Inc., New York, pp.24-36, (1978).

54. Shriram Revankar, B.Yegnanarayana, M.Manohar, "Binarization of line images using edge vectors", IEEE Transactions on Systems, Man and Cybernetics, Communicated (1987).

55. Shriram Revankar, B.Yegnanarayana, "Geometric reconstruction of hand-drawn line sketches", IEEE Transactions on Pattern Analysis and Machine Intelligence, Communicated (1987).
