

A Novel Knowledge-Based System for Interpreting Complex Engineering Drawings: Theory, Representation, and Implementation

Tong Lu, Chiew-Lan Tai, Huafei Yang, and Shijie Cai

Abstract—We present a novel knowledge-based system to automatically convert real-life engineering drawings to content-oriented high-level descriptions. The proposed method essentially turns the complex interpretation process into two parts: knowledge representation and knowledge-based interpretation. We propose a new hierarchical descriptor-based knowledge representation method to organize the various types of engineering objects and their complex high-level relations. The descriptors are defined using an Extended Backus Naur Form (EBNF), facilitating modification and maintenance. When interpreting a set of related engineering drawings, the knowledge-based interpretation system first constructs an EBNF-tree from the knowledge representation file, then searches for potential engineering objects guided by a depth-first order of the nodes in the EBNF-tree. Experimental results and comparisons with other interpretation systems demonstrate that our knowledge-based system is accurate and robust for high-level interpretation of complex real-life engineering projects.

Index Terms—Knowledge representation, interpretation, engineering drawings, high-level analysis, graphics recognition.


1 INTRODUCTION

CAD and CAM systems have been widely used in the engineering industries, and many new design problems have been addressed using CAD/CAM tools. In addition, since most engineering tasks involve modifications of existing designs [1], [2], there are a huge number of 2D CAD drawings (DXF/IGES format) in active use, creating a strong commercial demand for automatic interpretation systems to reuse previous design contents.

The aim of such interpretation systems is to convert engineering drawings which are represented by graphical primitives into high-level descriptions. This conversion process extracts not only accurate shapes and attributes of engineering entities but also their complex relations like references and duplication. An engineering entity here refers to an interpretation target in the given engineering drawings, such as a domain-independent dimension or a domain-dependent engineering object.

High-level interpretation of engineering drawings is an open and challenging problem, especially for complex real-life engineering drawings [3], [4], [5]. Most existing interpretation methods yield good results in converting raster line engineering drawings to low-level vectors [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], but few methods proceed to use geometric reasoning and recognition processes to recover high-level descriptions. Several reasons may explain this situation. First, how to represent contextual knowledge that describes drawing conventions efficiently and robustly is admittedly a hard problem [3], [4], [41]. Some knowledge-based systems use the “rules + inference” representations to obtain low-level vectors [8], [16], [17], [18], [19], [20], [21]; however, such unstructured rules are not amenable to content-oriented analysis due to their limited ability to describe complex relations of various types of high-level entities in real-life engineering drawings. Second, a real-life drawing does not only contain geometric shapes but also indicates their constraints and interactions. Therefore, an accurate high-level interpretation system must rely strongly on the extraction of relations among engineering entities. Unfortunately, such complex relations are not easy to represent. Last, high-level interpretation systems are more difficult to adapt to frequent variations in real-life applications.

By observing the architects’ understanding processes, we conclude that automatic high-level interpretation should be largely driven by explicit graphical constraints and implicit reasoning. Explicit graphical constraints, which typically include connection, parallelism, and intersection of graphical primitives, can guide the shape searching processes of potential engineering entities. Implicit reasoning is used to extract the hidden contents (e.g., omissions, symmetry, and references) in engineering drawings. In the understanding processes, it is necessary to check for consistency or retrieve detailed attributes. Reasoning always requires back and forth cross-referencing between different parts of a drawing or between different drawings. Such nonlinear jumps are drawing-content-based and are unpredictable, making the interpretation processes complicated.

1444 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 8, AUGUST 2009

. T. Lu, H. Yang, and S. Cai are with the State Key Laboratory of Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Hankou Road, Nanjing, China 210093. E-mail: {lutong, sjcai}@nju.edu.cn, [email protected].

. C.-L. Tai is with the Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. E-mail: [email protected].

Manuscript received 24 Nov. 2007; revised 9 Apr. 2008; accepted 3 June 2008; published online 10 June 2008. Recommended for acceptance by E. Saund. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPAMI-2007-11-0790. Digital Object Identifier no. 10.1109/TPAMI.2008.161.

0162-8828/09/$25.00 © 2009 IEEE Published by the IEEE Computer Society


Based on our analysis, we believe that there are two core problems in high-level interpretation: how to represent the complex engineering knowledge in engineering drawings clearly and efficiently, and how to use the knowledge-based representation to linearly interpret complex high-level engineering drawings. An efficient knowledge representation should first describe the geometric compositions of engineering entities and their implicit relations clearly, respecting potential variations in real-life applications, and then support the retrieval of desired high-level design contents as accurately and robustly as possible. The representation must include implicit relations, which can help to speed up the searching processes and improve accuracy. Finally, under the guidance of a well-defined knowledge-based representation, potential engineering entities can be searched sequentially from the input complex real-life engineering drawings.

This paper has two main contributions: 1) We propose a new hierarchical descriptor-based knowledge representation method to organize the various types of engineering objects with complex high-level relations, and 2) we develop a new interpretation system based on the proposed knowledge representation method to convert real-life engineering drawings to content-oriented high-level descriptions. We first manually identify typical explicit and implicit domain knowledge for high-level analysis from a large number of real-life engineering drawings, and then clearly define this knowledge in an Extended Backus Naur Form (EBNF), facilitating modification and maintenance. During interpretation, our system loads a knowledge representation file and converts it to a tree structure, where nodes and edges represent potential engineering entities and their relations, respectively. Graphical recognition algorithms are embedded in the entity nodes. By depth-first traversing the tree, our system easily converts the complex interpretation processes into a linear sequence of recognition functions. Our automatic interpretation system is efficient and robust since all of the possible explicit graphical constraints and implicit semantic relations have been well organized in appropriate levels in the tree to guide the automatic analysis process.
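The conversion from a knowledge tree to a linear sequence of recognition calls can be illustrated with a minimal sketch. The node class, the entity names, and the recognizer names (`find_grid_lines`, `find_column`, and so on) are hypothetical and are not taken from the paper's knowledge files; the sketch only shows that a pre-order depth-first traversal fixes one linear call order for the whole tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntityNode:
    """Hypothetical knowledge-tree node: one potential engineering entity."""
    name: str
    recognizer: str  # name of the recognition function attached to the node (assumed)
    children: List["EntityNode"] = field(default_factory=list)


def linearize(root: EntityNode) -> List[str]:
    """Return recognizer names in depth-first (pre-order) visiting order."""
    order: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node.recognizer)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return order


# Illustrative tree only; these entities and function names are assumptions.
tree = EntityNode("drawing", "find_grid_lines", [
    EntityNode("column", "find_column", [
        EntityNode("column_name", "find_name_text"),
        EntityNode("column_shape", "find_contour"),
    ]),
    EntityNode("beam", "find_beam"),
])

sequence = linearize(tree)
```

In this toy tree, all recognizers below the column node are scheduled before the beam node is visited, which is the sense in which the traversal turns a hierarchical description into a linear one.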

Our interpretation system can be applied to several real-life applications. For example, it allows the retrieval of design contents or the automatic verification of consistency among different drawings since the engineering entities and their relations have all been recognized. Similarly, the system can be used for achieving more accurate cost estimation and reconstruction of 3D models.

Compared with previous interpretation systems for engineering drawings [20], [30], [35], [36], [38], [48], our system offers the following novelties: First, to the best of our knowledge, our system represents the first effort in high-level interpretation of real-life complex engineering drawings. Second, rather than using “rules + inference” representations, we devise a descriptor-based knowledge representation method by analyzing human interpretation processes of complex high-level engineering drawings. With the knowledge descriptors defined in EBNF, various types of engineering entities and their complex relations can be represented easily and clearly. Third, the system is easy to maintain by simply modifying the EBNF-based descriptors to adapt to frequent variations of real-life engineering projects. Unlike hard-coded interpretation systems, our system can be easily extended to other engineering domains by replacing the corresponding knowledge representation file.

The rest of this paper is organized as follows: Section 2 discusses the existing techniques for automatic engineering drawing interpretation. The characteristics of typical real-life engineering drawings are explained in Section 3. Section 4 classifies some typical engineering knowledge for high-level drawing interpretation and presents an overview of our knowledge-based interpretation system. Section 5 first describes our knowledge representation strategies, which are inspired by human understanding processes and play a crucial role in the success of our system, and then presents a new hierarchical knowledge representation method obeying the proposed strategies. Section 6 describes our knowledge-based interpretation system to identify high-level engineering entities. Experimental results and discussions are presented in Section 7. Finally, Section 8 provides a summary of this work.

2 RELATED WORK

Machine interpretation of engineering drawings has been an active area since the late 1970s [5] and a large number of computational methods have been proposed to identify geometry parts of engineering significance. Existing engineering interpretation methods may be roughly classified into five categories, depending on the basic technique they rely on: pixel-level knowledge-independent [6], [7], [9], [10], [11], [22], [23], pixel-level knowledge-dependent [13], [16], [17], [18], [20], [24], [25], [26], [27], [28], [29], vectorial-level knowledge-independent [30], [31], [32], [33], vectorial-level knowledge-dependent [34], [35], [36], [37], [38], [39], and hybrid systems [8], [40], [41].

Pixel-level knowledge-independent techniques are the earliest proposed for raster-to-vector conversion. The basic idea is to search for a set of low-level geometrical features in rasterized line engineering drawings using pixel-tracking-based [9], [10], thinning-based [42], contour-based [43], or run-graph-based [44] methods. Some recent methods improve both the accuracy and robustness [6], [7], [11], [22], [23]. These methods are general for interpreting paper-based drawings; however, since knowledge is not used in the interpretation processes, only low-level vectors can be extracted.

Pixel-level knowledge-dependent techniques focus more on using knowledge to interpret engineering drawings. The ANON system of Joseph and Pridmore [16] uses a series of schema classes to describe prototypical drawing constructs, with the schema instances containing a geometrical description and a number of C/C++ functions. Dori and Liu [45] design an improved MDUS system, whose core is a hierarchy of graphic object classes, each containing specialized object recognition algorithms. Other knowledge-aided systems have been proposed to search line networks [18], dimensions [26], or geometric structure [17] from paper drawings. However, the knowledge in these systems is still in low-level forms and only simple entities can be extracted. High-level knowledge, such as implicit semantic relations, is rarely discussed. Recently, Couasnon et al. [27], [28], [29] presented a generic DMOS system which is based on a grammatical language and an associated parser to recognize table structures. DMOS is effective in recognizing structured documents that can be hierarchically defined by graphical

LU ET AL.: A NOVEL KNOWLEDGE-BASED SYSTEM FOR INTERPRETING COMPLEX ENGINEERING DRAWINGS: THEORY,... 1445


primitives. However, its recognition process is still linear, driven only by explicit graphical definitions. Therefore, it is difficult to apply DMOS to the automatic interpretation of engineering projects composed of a large number of implicitly related engineering entities and drawings.

Vectorial-level knowledge-independent techniques are widely used in graph-based systems to recognize geometric shapes. Bimber et al. [30] use BNF grammars and Caetano et al. [31] use fuzzy relational grammars to describe shape information. Mahoney and Fromherz [32] use a graph-based language to model and recognize stick drawings. Unfortunately, these systems lack the ability to describe nonshape domain information, which is critical in the interpretation of complex drawings.

Vectorial-level knowledge-dependent techniques have been proposed to achieve better quality interpretation results in a wider range of applications, such as mechanical engineering drawings, electronic circuit diagrams, architectural or chemical engineering drawings, and network diagrams. Parbhu and Pande [37] implement a system for extracting entity features from CADD engineering drawings using string-based syntactic pattern recognition. Zhi et al. [35] develop a graph-based approach, AUG, to capture architectural information originally produced by designers in CAD plans and rebuild the topological relationships. However, since knowledge is hard-coded, these systems are not easily extensible to other domains [41], [46].

Hybrid techniques divide interpretation processes into two stages: Domain-independent rules are first used to segment vectors from a paper form drawing, then a drawing interpretation subsystem works in concert with a set of domain-specific matchers to classify high-level targets. Since such systems are in fact a combination of the above techniques, they share the shortcomings discussed above.

Other approaches can be used to supplement automatic interpretation systems, such as the LEDCONS system of Su and Lin [33]. It performs syntactic level drawing interpretations based on the strategy of embedding them within the processes of human editing in an interactive fashion.

Unlike the proposed knowledge-based system, most of the above methods aim at achieving low-level processing or simple entity recognition, and very few studies have analyzed the representations of complex engineering knowledge for high-level interpretation [40]. Earlier, our research group designed a series of low-level algorithms for engineering drawing analysis [18], [22], [23], [24], [25]. We later extended our research to high-level recognition and 3D reconstruction [26], [38], [39], [47], [48]. This paper presents our new knowledge-based system for high-level interpretation of complex real-life engineering drawings.

3 CHARACTERISTICS OF ENGINEERING DRAWINGS

Engineering drawings are complex documents intended as a means of communication among engineers [16]. Entities in engineering drawings have variable types, numbers, and groupings of text or lines. Although drawing standards can aid designers, they do not completely specify the allowable contents of any given drawing.

Through examining hundreds of real-life engineering drawings, we conclude that a typical complex engineering drawing is in fact a three-level hybrid representation: scale, schematic, and internal representations. Scale representations are accurate geometrical projections of engineering entities [20]. Schematics include various types of shortcuts, e.g., details may be led out by connected lines, while attribute annotations of symmetrical entities are omitted by adding dot-and-dash symmetrical axes. Internal representations capture implied semantics and relations in engineering drawings. Schematic and internal representations are used because they simplify drawings; however, they pose challenges to automatic analysis in high-level interpretation systems.

Fig. 1 illustrates the different levels of representations in two drawings of a small tower project. Figs. 1a and 1b are, respectively, a beam drawing and a slab drawing of the same floor, with the beam drawing containing the details of columns and beams, while the slab drawing contains the details of slabs. Scaled contours of engineering entities, such as columns, walls, beams, slabs, holes, staircases, and windows, are drawn. Since internal steel structures are difficult to describe in a plan drawing, a column section view with detailed attributes is laid out schematically by an ellipse and a line in Fig. 1a. The contour of the section view


Fig. 1. A small architectural engineering drawing of a tower. (a) A simplified beam drawing with a column section view led out schematically by an ellipse and a line. (b) The simplified slab drawing of the same floor.


is also helpful for searching other columns in the tower project. Certain internal representations, including relations and semantics of different entities or drawings, are not explicitly drawn but rather implied in Fig. 1. For instance, since the two drawings are very similar (the engineering entities have the same shapes and appear in the same locations in both), the interpretation of one drawing can be sped up once the entities in the other are recognized. Consistency can be checked simultaneously for these two drawings. Another example of implicit representation is the long vertical line in the middle of Fig. 1a, which represents the symmetry axis of the floor plan layout. Once an engineering entity is recognized on one side of the symmetry axis, the other side can be searched immediately.
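The symmetry-guided search can be made concrete with a small geometric sketch. It assumes the symmetry axis is given by two points and a recognized entity by its contour vertices; both representations are assumptions for illustration, not the paper's data model. Reflecting each vertex across the axis predicts where the mirrored entity should appear.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def reflect_point(p: Point, a: Point, b: Point) -> Point:
    """Reflect p across the line through a and b (the assumed symmetry axis)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm2 = dx * dx + dy * dy
    if norm2 == 0:
        raise ValueError("axis endpoints must be distinct")
    # Project p onto the axis to get the foot of the perpendicular.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / norm2
    foot_x, foot_y = a[0] + t * dx, a[1] + t * dy
    # The mirror image lies as far beyond the foot as p lies before it.
    return (2 * foot_x - p[0], 2 * foot_y - p[1])


def predicted_mirror(contour: List[Point], a: Point, b: Point) -> List[Point]:
    """Predict the contour of the reflected entity on the other side of the axis."""
    return [reflect_point(p, a, b) for p in contour]


# Hypothetical example: a vertical axis at x = 10 and a column contour on its left.
axis_a, axis_b = (10.0, 0.0), (10.0, 20.0)
column = [(2.0, 3.0), (4.0, 3.0), (4.0, 5.0), (2.0, 5.0)]
mirrored = predicted_mirror(column, axis_a, axis_b)
```

A recognizer could then restrict its search to a small window around the predicted contour instead of scanning the whole drawing; how the actual system bounds that search is not specified in this section.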

Our knowledge-based system considers all three levels of representations to interpret complex engineering drawings accurately and robustly.

4 KNOWLEDGE ANALYSIS AND SYSTEM OVERVIEW

4.1 Typical Knowledge in Real-Life Engineering Drawings

We classify the knowledge in engineering drawings into explicit and implicit knowledge. Explicit knowledge includes engineering entities with obvious geometric shape definitions, their related dimension sets, and annotations. Some shapes are common across many domains, but others are specific to a certain domain. Implicit knowledge is more or less at the crossroad of art and engineering, and includes the following most common classes:

. Multiview. The same engineering entity is often described in several related views in the same drawing. Different views are not necessarily of the same scale because their dimension sets can provide consistent coordinates.

. Abbreviation. To avoid repetitions, designers use implicit forms (e.g., details supplied textually) to simplify drawings.

. Reference. Leading lines and text are used to refer to the detailed parts in the same drawing. Sometimes implicit forms (e.g., different entities having the same name) may be used. We call the detailed one a reference source and the simplified one a referencer.

. Inheritance. The same entity may appear several times in different drawings. For instance, an entity may appear in drawing A with all of the details (name, contour shape, internal structure, annotations, location, and dimension sets), but appear in another drawing B with only its name and schematic contour. We refer to the object having details as an inheritance source, and the others as inheritors.

. Reflection. A symmetry axis is used to imply a reflection relation. Details are drawn on one side, while the other side may be greatly simplified. We call the detailed one a reflection source and the reflected one a reflecter.

. Size constraints, such as dimensions or size annotations.

. Schematics and other personal style preferences.
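One way to picture the reference, inheritance, and reflection classes is as typed, directed links from a detailed source entity to its simplified counterpart. The sketch below is an assumption made for illustration: the class names, the fields, and the entity labels (`KZ1`, `section-1`) are hypothetical and do not come from the paper. It shows how a simplified entity could be traced back to the entity that carries its details.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Relation(Enum):
    REFERENCE = "reference"      # reference source -> referencer
    INHERITANCE = "inheritance"  # inheritance source -> inheritor
    REFLECTION = "reflection"    # reflection source -> reflecter


@dataclass(frozen=True)
class EntityRef:
    drawing: str  # which drawing the entity appears in
    name: str     # entity label within that drawing


@dataclass(frozen=True)
class Link:
    kind: Relation
    source: EntityRef  # the detailed entity
    target: EntityRef  # the simplified entity


def detailed_source(entity: EntityRef, links: List[Link]) -> EntityRef:
    """Follow links backwards from a simplified entity to its detailed source.

    If several links share a target, the last one listed wins (a simplification).
    """
    by_target = {link.target: link.source for link in links}
    seen = set()
    current = entity
    while current in by_target and current not in seen:  # guard against cycles
        seen.add(current)
        current = by_target[current]
    return current


# Hypothetical links: a section view details column KZ1 in the beam drawing,
# and the slab drawing inherits KZ1 from the beam drawing.
links = [
    Link(Relation.REFERENCE, EntityRef("beam", "section-1"), EntityRef("beam", "KZ1")),
    Link(Relation.INHERITANCE, EntityRef("beam", "KZ1"), EntityRef("slab", "KZ1")),
]
origin = detailed_source(EntityRef("slab", "KZ1"), links)
```

Here the lookup crosses from the slab drawing to the beam drawing and then to the section view, which mirrors the kind of jump between drawings that implicit knowledge is meant to guide.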

Fig. 2 is an illustration that loosely characterizes the output of the proposed system, describing the explicit and implicit knowledge extracted from Fig. 1. Fig. 2a shows the hierarchical structure of the schematic column section view in Fig. 1a. The section view has three parts: dimension, name, and shape. Each part may be further composed of lower level components or basic graphical primitives. For instance, the shape comprises an external contour of the column entity and its internal structure.

Fig. 2b shows two types of implicit knowledge: reference and reflection. Details of the column entities can be found in the schematically referred section view. That is, the section view is a reference source, while the other column entities are referencers. References may guide jumps in the same drawing or between different drawings. Reflections are implied by symmetry axes, which always guide jumps within the same drawing.

Fig. 2c illustrates another type of implicit knowledge: inheritance. Once the entities and their details (inheritance sources) in Fig. 1a are found, we can immediately search for the inheritors in Fig. 1b at the corresponding locations. Inheritances always guide jumps among different drawings and help speed up interpretation processes.
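A minimal sketch of an inheritance-guided search is given below. It assumes that both drawings have already been aligned to a common coordinate system and that each entity is reduced to one anchor point; the function name, the tolerance, and the entity labels are hypothetical. For each inheritance source found in one drawing, only candidates near the corresponding location in the other drawing are considered.

```python
from math import hypot
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def find_inheritors(sources: Dict[str, Point],
                    candidates: List[Point],
                    tol: float = 0.5) -> Dict[str, Optional[Point]]:
    """Match each source entity to the nearest candidate within `tol`.

    `sources` maps entity names to locations in the detailed drawing;
    `candidates` are locations of unlabeled shapes in the other drawing.
    A value of None marks a source with no inheritor at the expected place.
    """
    matches: Dict[str, Optional[Point]] = {}
    for name, (sx, sy) in sources.items():
        best, best_dist = None, tol
        for cx, cy in candidates:
            dist = hypot(cx - sx, cy - sy)
            if dist <= best_dist:
                best, best_dist = (cx, cy), dist
        matches[name] = best
    return matches


# Hypothetical data: three columns in the beam drawing, three shapes in the slab drawing.
beam_columns = {"C1": (0.0, 0.0), "C2": (6.0, 0.0), "C3": (6.0, 8.0)}
slab_shapes = [(0.1, 0.0), (6.0, 0.2), (20.0, 20.0)]
inheritors = find_inheritors(beam_columns, slab_shapes)
```

An unmatched source such as `C3` in this toy data is the kind of case that could be flagged for a consistency check between the two drawings.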

Generally speaking, explicit knowledge is the “visible” contents in an engineering drawing which have to be searched by shape matching algorithms, while implicit knowledge is “hidden” information which needs to be predicted, reasoned, or understood based on a well-defined knowledge representation in an automatic interpretation system.

4.2 System Overview

The above-discussed types of knowledge in real-life engineering drawings inspire our design of an automatic interpretation system in three aspects. First, during the knowledge representation phase, we need to consider both the explicit geometric shape definitions of a target entity and its potential implicit constraints with other engineering entities. Second, during the automatic interpretation phase, the system should be able to freely switch among different types of knowledge to drive the interpretation processes back and forth among related drawings. Third, implicit knowledge is more reliable, considering the variations in shape definitions and the frequent occurrence of imprecisely drawn engineering entities. For instance, the existence of a simplified entity means there has to be another corresponding detailed source entity, or a potential drawing error. Therefore, in an interpretation system, explicit geometric definitions can be used as recognition entrances, while implicit constraints may be used to guide succeeding graphical reasoning or consistency checking.

Based on the above analysis, we design a novel knowledge-based interpretation system. Fig. 3 shows the architecture, which is composed of a knowledge representation and a knowledge-based automatic interpretation subsystem.

We first predefine the domain knowledge and store it in descriptor-based representation files (see Section 5 for details). Due to the complexity of architecture drawings [4], our group has cooperated with an architect and a structural engineer for a couple of years to analyze the characteristics of the domain-knowledge-independent entities (i.e., dimensions and grid lines) and various types of domain-knowledge-dependent architectural targets. We collect representations of these entities from different regions in China, extract their common features, and find their implicit constraints under the guidance of the architect and structural engineer. Each knowledge representation file includes the explicit and implicit knowledge of a specific type


of engineering drawings. Then, the CAD drawings of an engineering project are imported for interpretation (paper drawings can be scanned, vectorized, and converted to CAD descriptions using commercial vectorization software).

The interpretation subsystem is the core module of our system. It is composed of a knowledge interpreter, a knowledge parser, and an entity searcher. The knowledge interpreter first selects and loads a corresponding knowledge file according to the type of the input drawings to be interpreted, then reorganizes it into a tree structure. In this tree, nodes represent potential engineering entities, and edges represent the relations of the entities. Recognition algorithms are associated with the entity nodes. Then, the knowledge parser performs a depth-first search for the target engineering entities from the root node to traverse the entire knowledge tree. When visiting a node in the tree, the parser extracts the function names and transfers them to the entity searcher. The latter invokes the recognition functions associated with that node to search for the desired entity in the given set of drawings. Related parameters are sent to the entity searcher simultaneously. In this way, the node-visiting sequence is actually converted to a series of


Fig. 3. Framework of our knowledge-based drawing interpretation system.

Fig. 2. Explicit and implicit knowledge extracted from the drawings in Fig. 1. (a) Hierarchical compositions of the section view in Fig. 1a. (b) Two types

of implicit knowledge in Fig. 1a: reference and reflection. (c) Inheritance knowledge from Figs. 1a and 1b.

Page 6: 1444 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND …taicl/papers/PAMI-interpret.pdf · A Novel Knowledge-Based System for Interpreting Complex Engineering Drawings: Theory, Representation,

recognition or interpretation functions. Since the potentialexplicit and implicit engineering knowledge is predefinedin the representation file, engineering entities can beidentified more accurately.

5 KNOWLEDGE REPRESENTATION

In this section, we first describe our analysis of the human understanding processes and propose some strategies for representing knowledge. Then, we present a new hierarchical knowledge representation method that obeys the proposed representation strategies for automatic analysis of complex real-life engineering drawings.

5.1 Knowledge Representation Strategies

5.1.1 Human Understanding Processes

Through working with different engineers, we find that, although there are differences in their detailed human understanding processes, they consciously or unconsciously obey some common rules to interpret engineering targets. For instance, given a set of related architectural drawings, they always prefer to first read the annotated introduction to learn the default settings, then read the different floor drawings sequentially from bottom to top floor, respecting the mechanical constraints of a building. When interpreting a certain drawing, the architects first find grid lines to align with other drawings, and then quickly locate the entities of interest by searching their typical compositions, e.g., name text. If the details of an entity are found to be omitted in the given drawing, the architects turn to search for its corresponding source entity in another related drawing.

For the purpose of designing an automatic interpretation process for engineering drawings, an efficient human understanding approach more or less involves (but is not limited to) the following steps:

1. Identify the engineering entities to be interpreted. To interpret any engineering drawing, the human reader must first decide what entities are of interest and clearly define the entities that may appear.

2. Sequence the interpretation processes properly. Simplifications and shorthand that can be well understood by human readers are necessarily used in real-life drawings to reduce complexity. Cross-referencing is unavoidable since the same entity may appear more than once in drawings. Therefore, given an interpretation target entity, the reader needs to dynamically sequence the interpretation processes in a proper order to obtain correct interpretation results.

3. Interpret entities in an integrated environment. Entities often cannot be interpreted as isolated targets since they are usually semantically related to others in the drawings. A typical example is the three-view representation where all three views of an entity need to be considered simultaneously in order to correctly interpret it.

Inspired by these specific human understanding processes, we propose the oriented, ordered, and integrated strategies in designing our new knowledge representation method, rather than using the existing unstructured “rules + inference” engine methodologies.

5.1.2 Oriented, Ordered, and Integrated Strategies

Oriented strategy means that various types of engineering entities need to be predefined clearly and well organized in a knowledge representation. Ordered strategy implies that potential entities in the knowledge representation should have different recognition or analysis priorities. Integrated strategy emphasizes the importance of relations among the entities. These strategies have proven effective for high-level interpretation in our experiments.

Oriented strategy. A real-life engineering project has a large number of graphical primitives and various types of engineering entities. When interpreting a given project, we need to clearly predefine the knowledge pertinent to the possibly existing engineering entities: their potential compositions, constraints, and relations with other interpretation entities, as well as implicit descriptions that may speed up the interpretation process (e.g., reflections, references, and inheritances). With the interpretation entities clearly predefined, searching areas can be reduced to improve efficiency and accuracy.

Ordered strategy. To avoid unnecessary jumps between different drawings, the order of interpreting various types of engineering entities is important. A well-defined interpretation sequence improves not only the accuracy but also the robustness and efficiency. Considering the tower project as an example, some typical sequences for automatic interpretation are as follows:

• A higher floor of the tower is restricted by the floor below it due to mechanical constraints. Thus, a bottom-up process for automatic interpretation is necessary.

• Source entities (e.g., reference sources or inheritance sources) need to be searched prior to the others.

• Various components of an engineering entity have different contributions to automatic recognition. Obvious features (e.g., a name text) should be searched first since they often can speed up the analysis.

Integrated strategy. The interpretation process should rely on an integrated environment, rather than isolated graphical definitions. To correctly recognize an engineering entity, its internal graphical compositions need to be recognized, and its environment, which often provides key evidence for the judgment, must also be considered.

To illustrate the three strategies, Fig. 4 shows some interpretation processes for the tower project. The dimensions in Figs. 1a and 1b are first recognized to build the global coordinates. Then, the reference source column section view in Fig. 1a is analyzed, obtaining the name text, the internal steel structure, and the external contour. Then, the recognition of the referencer column entities in Fig. 1a can be done quickly, first by searching for the name texts and then, if not found, by shape matching. After that, the corresponding inheritor column entities in Fig. 1b can be searched immediately. Other types of engineering entities are searched similarly.

5.2 Hierarchical Knowledge Representation Method

As previously mentioned, how to represent the contextual knowledge that describes drawing conventions is admittedly a hard problem [3], [4], [38]. For high-level interpretation of complex engineering drawings, the challenge of designing an efficient and robust knowledge representation lies in how to organize a large number of potential engineering entities and their relations, especially their unpredictable nonlinear cross referencing. Such jumps are necessary and are often important for referring to detailed attributes, maintaining consistency, or checking conflicts.

The large number of rules used by most existing knowledge representation methods complicates the interpretation problem. Rules are useful for describing static graphical compositions of a given engineering entity; however, their unstructured static nature makes describing various interacting relations among engineering entities difficult. Moreover, a small variation in the domain knowledge often brings about a number of chained modifications to the rules, making such a system difficult to maintain or adapt. Therefore, rule-based systems are not a preferred choice to represent complex high-level engineering knowledge.

We need a simple and effective representation method to describe both the various types of engineering knowledge and the complex interpretation processes. Through analyzing a large number of real-life engineering drawings, we conclude that automatic interpretation is, in fact, composed of a series of condition-driven processes. To successfully interpret an engineering entity, a set of corresponding conditions needs to be checked first. Some of these conditions specify graphical compositions; others serve to speed up the searching process or check consistency. Since each condition may involve other engineering entities, jumps are automatically performed during the interpretation processes. Such condition-driven processes are useful to interpret complex real-life engineering projects that are composed of dozens of large and complex engineering drawings.

For illustration, we still consider the tower project shown in Fig. 1. To automatically extract the details of any slab entity in Fig. 1b, the following condition-driven considerations are necessary:

• In Fig. 1b, to search for the potential slab entities, the contours of the inheritor columns need to be searched and removed first because each slab is supported by the related columns.

• To search for the inheritor columns, the inheritance source columns in Fig. 1a, including the reference source column section view and the other referencer columns, need to be searched first.

• To search for the referencer columns in Fig. 1a, the source column section view needs to be searched first to provide details.

• The two coordinate systems of Figs. 1a and 1b need to be aligned for inheritance.

As a result, the final automatic condition-driven interpretation process is as follows: First, the dimensions in Figs. 1a and 1b are searched and integrated into the global coordinates. Next, the reference source column section view in Fig. 1a is searched to provide detailed attributes (i.e., name and shape) for the column entities. Then, the referencer columns are recognized through shape matching. Once the referencer columns in Fig. 1a are found, their shape coordinates are transformed to Fig. 1b to search for the inheritor columns rapidly. By removing the graphical primitives of the recognized dimensions and inheritor columns from Fig. 1b, the potential slab entities are finally searched accurately according to their graphical composition definitions due to the simplified search space. An automatic interpretation system that misses any of the above steps, or performs them in a different order, would produce inaccurate or incorrect interpretation results of the slab entities.
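The way these conditions force a single search order can be illustrated with a minimal sketch. It is not the mechanism of our system, which encodes the order in the knowledge representation itself; it only treats each condition as a prerequisite and derives a linear order with a standard topological sort. The step names are hypothetical labels for the tower example.

```python
from graphlib import TopologicalSorter

# Each key is a search step; its value lists the steps that must finish first.
# The step names are illustrative labels, not identifiers from the system.
PREREQUISITES = {
    "dimensions": set(),
    "reference_source_column": {"dimensions"},
    "referencer_columns": {"reference_source_column"},
    "inheritor_columns": {"referencer_columns", "dimensions"},
    "slabs": {"inheritor_columns", "dimensions"},
}


def interpretation_order(prerequisites):
    """Return one linear visiting order that respects every prerequisite."""
    return list(TopologicalSorter(prerequisites).static_order())


order = interpretation_order(PREREQUISITES)
```

Because every step here depends on the one before it, only one order is valid, which matches the sequence described above.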

We propose to represent these conditions as knowledge descriptors. The condition-driven interpretation processes are then essentially divided into two parts: knowledge representation and knowledge-based interpretation. Engineering domain knowledge is first represented in the form of knowledge descriptors, obeying the proposed oriented, ordered, and integrated strategies. Each engineering entity is identified by a specific set of descriptors, called its effective descriptors. Then, to interpret the engineering drawings of a project, descriptors are loaded and visited to recognize target entities in the drawings. With each target entity represented by a set of effective descriptors, the unpredictable condition-driven interpretation processes are essentially converted into a series of definite linear descriptor-visiting processes.

Generally speaking, there are four levels of interpretation targets: project, drawing, engineering entity, and graphical primitive. A project is composed of a set of related engineering drawings, and each drawing is composed of many related engineering entities. Simple engineering entities are composed of elementary graphical primitives, such as text and lines, while complex entities can be described hierarchically as a combination of composite entities. Each interpretation target in the four levels is represented only with its effective descriptors.

5.2.1 Knowledge Descriptors

As mentioned, knowledge descriptors are used to represent the interpretation conditions of an engineering target. Based on their usage, we classify the descriptors into two categories: internal descriptors and external descriptors. Internal descriptors depict the possible internal compositions for identifying a target entity, while external descriptors describe relations and interactions with other engineering targets. External descriptors always lead to jumps among different targets.

Fig. 4. Interpretation sequences of Fig. 1.

The following are three typical types of internal descriptors:

• Internal composition object (ICO). ICOs define the components of a specific interpretation target. Each component is organized hierarchically with the graphical primitives as leaves. For instance, name, dimension, and shape are ICOs of a column section view entity.

• Internal relational constraint (IRC). IRCs describe the relation among the ICOs of a given interpretation target. Graphical constraints, such as parallelism and perpendicularity, are a typical type of IRC. IRCs are useful for checking the relational validity of recognized constituents of an engineering target. For instance, lines and annotation text need to follow certain graphical constraints to form a dimension.

• Internal dimensional constraint (IDC). IDCs indicate the dimensional constraints among shapes, dimensions, and related annotations. Dimensions are necessary in an engineering target, providing constraints for checking geometrical validity.

External descriptors describe interentity relations, such as references, inheritances, and reflections. We classify the possible external descriptors into the following six categories according to their usages:

• External necessary object (ENO). An ENO is a necessary and reliable entity which needs to be identified before starting the analysis of a specific engineering target. For instance, without the corresponding beam drawing, the interpretation of a given slab drawing would not be accurate. Therefore, the corresponding beam drawing is an ENO of a slab drawing.

• External source object (ESO). The ESO of an engineering entity is its corresponding source entity with detailed attributes. For instance, the section view in Fig. 1a is an ESO of the referencer columns. Without the ESO information, the interpretation results of a simplified engineering entity would not be accurate.

• External flagging object (EFO). An EFO flags the existence of an engineering target. For instance, in Fig. 1a, the name text “C1” is used to flag the existence of the section view. Detection of an EFO can speed up searching processes.

• External leading method (ELM). An ELM identifies the relation between an ENO, ESO, or EFO and another entity. The most frequently used is the distance-ELM, i.e., the nearest graphical distance between two entities. In Fig. 1a, the EFO text “C1” is related to the section view through a distance-ELM. In complex drawings, graphical-ELMs are often used to lead an entity to a blank space to avoid overlapping or interference. For example, the ellipse and the line in Fig. 1a that lead out to the section view form a graphical-ELM. In addition, boundary-ELMs involving special shapes are also commonly used, such as a cloud shape emphasizing the text it surrounds.

• External source-tracking method (ESM). An ESM indicates how a target entity searches for its ESO. Based on the typical implicit engineering knowledge discussed earlier, we classify ESMs into reference-ESMs, inheritance-ESMs, and reflection-ESMs.

• External dimension-direction constraint (EDC). EDCs define the dimensional and directional relation between an interpretation target and its ENOs. Dimension sets in an engineering drawing often provide constraints to determine whether or not a recognition result is correct. Directions may also be helpful for checking the recognition results, or reducing searching spaces.
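To make the taxonomy concrete, the nine descriptor types can be modeled as a small data structure. The sketch below is only an illustration under our own naming assumptions: the classes `DescriptorKind`, `Descriptor`, and `InterpretationTarget` are hypothetical and do not reproduce the data model of the implemented system. The example instance encodes the section view of Fig. 1a as described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class DescriptorKind(Enum):
    """The three internal and six external descriptor types."""
    ICO = "internal composition object"
    IRC = "internal relational constraint"
    IDC = "internal dimensional constraint"
    ENO = "external necessary object"
    ESO = "external source object"
    EFO = "external flagging object"
    ELM = "external leading method"
    ESM = "external source-tracking method"
    EDC = "external dimension-direction constraint"

    @property
    def is_external(self):
        # External descriptors are the ones that cause jumps among targets.
        return self.value.startswith("external")


@dataclass
class Descriptor:
    kind: DescriptorKind
    target: str  # what the descriptor refers to (hypothetical free-text label)


@dataclass
class InterpretationTarget:
    name: str
    descriptors: list = field(default_factory=list)

    def external(self):
        """Return only the descriptors that relate this target to others."""
        return [d for d in self.descriptors if d.kind.is_external]


# The column section view of Fig. 1a, as described in the text.
section_view = InterpretationTarget("column section view", [
    Descriptor(DescriptorKind.ICO, "name"),
    Descriptor(DescriptorKind.ICO, "dimension"),
    Descriptor(DescriptorKind.ICO, "shape"),
    Descriptor(DescriptorKind.EFO, 'name text "C1"'),
    Descriptor(DescriptorKind.ELM, "distance"),
])
```

Separating internal from external descriptors in this way makes it easy to see which conditions can be checked inside one entity and which require visiting another target first.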

5.2.2 Descriptor-Based Hierarchical Knowledge Representation

We devise an EBNF description method to clearly define the engineering knowledge. The existing BNF has the following three meta-symbols:

• “::=”: means “is defined as,”

• “|”: means “or,” and

• “<>”: used to surround a category name.

We introduce six new meta-symbols, “_”, “*”, “[]”, “{}”, “&”, and “()”, to describe our internal and external descriptors. The usage and examples of the six extended meta-symbols are listed in Table 1.
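As a minimal sketch of how such rules can be read by a program, the function below parses a rule that uses only the three basic BNF meta-symbols. The six extended meta-symbols are deliberately not handled, and the function name `parse_bnf_rule` and the quoted-literal convention are assumptions made for illustration. The example rules paraphrase the column-name knowledge of the tower project rather than quoting Table 2.

```python
import re

# A symbol is either a <category name> or a double-quoted literal.
TOKEN = re.compile(r'<([^<>]+)>|"([^"]*)"')


def parse_bnf_rule(rule):
    """Split '<name> ::= alt | alt' into (name, list of alternatives).

    Each alternative is a list of ("category", text) or ("literal", text)
    pairs. Literals containing "|" are not supported by this simple split.
    """
    left, right = rule.split("::=", 1)
    name = left.strip().strip("<>")
    alternatives = []
    for alt in right.split("|"):
        symbols = []
        for match in TOKEN.finditer(alt):
            category, literal = match.groups()
            if category is not None:
                symbols.append(("category", category))
            else:
                symbols.append(("literal", literal))
        alternatives.append(symbols)
    return name, alternatives


name_rule = parse_bnf_rule(
    "<column name> ::= <column name feature text> <column serial number>"
)
feature_rule = parse_bnf_rule('<column name feature text> ::= "c" | "C"')
```

The first rule yields one alternative with two category references, and the second yields two alternatives with one literal each.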

To represent the domain knowledge for a given type of engineering projects, we analyze a large number of typical drawings. For illustrative purposes, we again take the tower project as an example. Table 2 shows part of the extracted knowledge, represented as 11 descriptions, each being composed of an interpretation target name, knowledge descriptors, and meta-symbols. There are five levels of interpretation targets, with the complex engineering entities further represented through an intermediate entity level.

Description 1 specifies that an engineering project may consist of beam drawings and slab drawings. Descriptions 2 and 3 define the effective descriptors of a beam drawing and a slab drawing, respectively. Description 2 reveals the following engineering knowledge. First, in any beam drawing, dimensions are necessary because they are needed for integrating local coordinates into the global coordinates of the whole project. Second, there are at least one referencer column and one reference source beam in each beam drawing to support the building. Finally, the analysis sequence is defined as follows: dimensions → referencer column → reference source beam → referencer beam → symmetry axis and reflector columns and beams. Description 3 differs from Description 2 in two aspects. First, Description 3 defines a necessary ENO descriptor, indicating that a slab drawing cannot be interpreted if the corresponding beam drawing is not found. Second, Description 3 gives the ICO descriptors of the inheritor beam entities and the column entities in the slab drawing, indicating that these entities are necessary and need to be searched first under the guidance of their corresponding inheritance source entities during the interpretation processes. Types of guidance include coordinate transform and shape matching.


TABLE 1. Six Extended Symbols of EBNF

TABLE 2. Part of the Effective Descriptors of Various Interpretation Targets of the Tower Project in Fig. 1

Descriptions 4 to 7 specify the effective descriptors of four engineering entities, namely, dimension, symmetry axis, referencer column, and reference source column. Description 8 defines an intermediate entity, i.e., a column name, which always consists of a name feature text and a serial number. Descriptions 9 to 11 define the graphical primitives of the entities or intermediate entities. Description 9 defines that a column name feature text is “c” or “C”. Description 10 specifies that a column serial number is an integer, while Description 11 defines four possible types of shapes: L, T, I, and X. Descriptions at the graphical primitive level always lead to the invocation of a group of identification functions. This process of defining knowledge descriptors is performed manually with the help of experts. The proposed knowledge representation is easy to modify and maintain. A semiautomatic assistant tool may be possible for modeling the knowledge representation or checking consistency. Once knowledge is represented, it can be used to guide automatic interpretation processes.
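As an illustration of what an identification function at the graphical primitive level might look like, the sketch below recognizes a column name as defined by Descriptions 8 to 10: a feature text “c” or “C” followed by an integer serial number. The function name `parse_column_name` is hypothetical, and we assume that the two parts are written without a separator (as in “C1”) and that the serial number is nonnegative.

```python
import re

# Feature text "c" or "C" followed directly by a nonnegative integer.
COLUMN_NAME = re.compile(r"([cC])(\d+)")


def parse_column_name(text):
    """Return (feature_text, serial_number), or None if text is not a column name."""
    match = COLUMN_NAME.fullmatch(text.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Changing the accepted feature texts or the serial-number format then only requires editing one pattern, which mirrors the claim that the representation is easy to modify.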

6 KNOWLEDGE-BASED INTERPRETATION

The core module of our interpretation system consists of a knowledge interpreter, a knowledge parser, and an entity searcher. When the engineering drawings of a project are input for interpretation, our system first selects and loads the relevant knowledge representation file according to the type of the engineering drawings, and then transfers the drawings and the knowledge file to the knowledge interpreter. The types of most input engineering drawings can be automatically identified by searching for the drawing name text. For instance, the name text “BEAM FLOOR OF TOWER AT LEVEL 53.05” of Fig. 1a implies that it is a beam floor drawing. If no such names exist, human interaction is sought to specify a drawing type.

The knowledge interpreter first converts the loaded hierarchical knowledge representation into an EBNF-tree structure. The EBNF-tree defines potential interpretation targets described by effective descriptors and their internal hierarchical relations. Consider the simple example with descriptors listed in Table 2. When Description 1 is loaded, the constructed EBNF-tree contains only the root node of the tree shown in Fig. 5. After Descriptions 2 and 3 are loaded, the entire tree shown in Fig. 5 is constructed. Fig. 6 shows part of the EBNF-tree of a beam drawing after loading Descriptions 2 and 4-11.

Fig. 5. Part of the EBNF_Tree constructed with Descriptions 1-3.

Fig. 6. The constructed EBNF_Tree by Descriptions 4-11.

In the EBNF-tree, a node is labeled with an engineering target name and its group of effective knowledge descriptors. An edge in the tree represents the relations of the two nodes it connects. For instance, the two edges in Fig. 5b imply that both beam drawings and slab drawings are effective ICOs of the project. That is, when interpreting such a project, the first step is to search for its beam drawings and slab drawings. A leaf of the tree is either an elementary graphical primitive or an identification function. For instance, the leaf named “line segment” means that an associated identification function is invoked to search for a suitable line segment. Next, the knowledge parser traverses from the root of the tree in a depth-first manner. The traversal order in fact represents the analysis sequence. As an example, the depth-first traversal sequence is also indicated in Fig. 6. Notice that the reference source columns are searched prior to their referencers. If no such reference columns are found, the searching of referencer columns can be pruned early to speed up the interpretation process. Therefore, the traversal sequence is also a dynamic process for different engineering projects, even though they use the same static knowledge EBNF-tree.

When visiting a leaf node, the knowledge parser extracts the analysis functions associated with that node and transports them to the entity searcher. The entity searcher invokes the functions in a function library and searches for the interpretation targets in the given engineering drawings. In this way, the recognition and interpretation processes are driven by the EBNF-tree.
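The interplay between the depth-first traversal, the function library, and early pruning can be sketched as follows. This is a simplified illustration, not the implemented system (which was written in Visual C++ 6.0): the `Node` class, the function names in `library`, and the dictionary standing in for a drawing are all hypothetical, and every child is treated as mandatory.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    function: str = ""  # name of a recognition function; used on leaves only


def depth_first(node, library, drawing, visited):
    """Visit node and its children depth-first, invoking leaf functions."""
    visited.append(node.name)
    if not node.children:
        recognizer = library.get(node.function)
        return recognizer(drawing) if recognizer else False
    # all() stops at the first failing child, so later siblings are pruned.
    return all(depth_first(c, library, drawing, visited) for c in node.children)


tree = Node("beam drawing", [
    Node("dimension", function="find_dimension"),
    Node("reference source column", [
        Node("column name", function="find_column_name"),
        Node("shape", function="find_shape"),
    ]),
    Node("referencer column", function="match_referencer"),
])

# Toy stand-ins for the function library of the entity searcher.
library = {
    "find_dimension": lambda d: bool(d.get("dimensions")),
    "find_column_name": lambda d: "C1" in d.get("texts", []),
    "find_shape": lambda d: bool(d.get("shapes")),
    "match_referencer": lambda d: bool(d.get("shapes")),
}

complete = {"dimensions": [3600], "texts": ["C1"], "shapes": ["L"]}
no_source = {"dimensions": [3600], "texts": [], "shapes": ["L"]}

visited_complete, visited_pruned = [], []
found_complete = depth_first(tree, library, complete, visited_complete)
found_pruned = depth_first(tree, library, no_source, visited_pruned)
```

In the second run the name of the reference source column is missing, so the traversal never reaches the referencer column node. The same static tree thus produces different visiting sequences for different drawings.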

Under the guidance of the EBNF-tree, the entity searcher may return two types of search results each time: success or failure. For entities that are precisely drawn in the given drawings, the entity searcher finds and returns the target entity as expected. For entities where the drawing rules are not followed precisely, or for unexpected entities not covered in the knowledge base, the entity searcher records the searching area, creates a warning message, and returns a failure result. After the automatic interpretation process terminates, the user can double click on a warning message, and the corresponding failure area is zoomed in on for further manual consistency checking or interaction. As a result, the succeeding error-checking and manual interaction process for bad-quality drawings is in fact semiautomatically guided by the EBNF-tree, making the system more robust and efficient.
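The success-or-failure protocol can be sketched in a few lines. The `SearchResult` record, the `search` function, and the bounding-box representation of a searching area are assumptions made for illustration, not the interfaces of the implemented system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    target: str
    success: bool
    area: Optional[tuple] = None  # (xmin, ymin, xmax, ymax) of the searched region
    warning: str = ""


def search(target, area, recognizer, warnings):
    """Run one recognizer on an area; on failure, record a warning for review."""
    if recognizer(area):
        return SearchResult(target, True, area)
    result = SearchResult(
        target, False, area, f"{target} not recognized in area {area}"
    )
    warnings.append(result)  # kept so the user can inspect the area later
    return result


warnings = []
ok = search("dimension", (0, 0, 10, 10), lambda area: True, warnings)
bad = search("slab", (10, 0, 20, 10), lambda area: False, warnings)
```

After the run, `warnings` holds one entry per failed search together with its area, which is the information a user interface would need to zoom in on the failure.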

The EBNF-tree obeys our proposed oriented and integrated interpretation strategies. All information of a potential interpretation target, including its internal graphical composition and external relations with other high-level engineering entities, is explicitly defined. Suppose there are two nodes A and B, representing the reference source entity of the section view and the corresponding referencer entity in Fig. 1a, respectively. If A is a child of B, then the reference source entity is recognized before the referencer during the depth-first traversal, guiding the successful recognition of the referencer entity. However, if B is a child of A, then the referencer would have to be directly searched in a complex environment without any guidance. Worse, if there is no path from A to B in the tree, the interpretation system would have no idea how to transfer the detailed attributes from A to B. Since different subtrees may generate different interpretation results, the subtrees play an important role in an accurate, efficient, and robust automatic high-level interpretation system.

Compared with rule-based methods, the proposed system has the following improvements. First, each interpretation target is represented only as a group of simple descriptors, making the domain knowledge readable and more convenient for manual maintenance. A trained user without programming background would be able to customize the descriptors of a target of interest step by step. Second, variations of prior knowledge will not bring about chained modifications since knowledge has been organized hierarchically and structurally. Finally, and most importantly, dynamic interpretation processes can be easily customized by only reordering, adding, removing, or modifying the “static” descriptors of an engineering target. Therefore, the proposed system is more convenient and flexible to represent complex high-level knowledge for the purpose of automatic interpretation.

7 EXPERIMENTS AND DISCUSSIONS

We have implemented the proposed knowledge-based interpretation system in Visual C++ 6.0. We collected 30 typical real-life architectural engineering projects, containing 271 architectural engineering drawings altogether, from Nanjing, Shenzhen, and Hong Kong, and categorized the projects into three groups A, B, and C according to their complexities. They comprise 19 types of architectural engineering drawings, such as drawings of column sections and column plans, beam sections and beam plans, wall sections, slab plans, staircases, roofs, basements, and several types of structured tables.

As previously mentioned, we first extract common features of various types of architectural entities and find their implicit constraints under the guidance of an architect and a structural engineer. After that, we represent these entities as knowledge descriptors in XML format files, which are suitable for organizing and checking hierarchical data. During interpretation, our system loads the drawings of the project to be interpreted and the corresponding XML files. Then, by depth-first traversing the EBNF_Tree, the corresponding functions are invoked to search for the targets in the input drawings.

Table 3 shows the recognition sequence of some typical architectural engineering entities and their recognition rates. The results reflect the following facts about our system. First, those entities which only have internal descriptors are searched before those with both internal and external descriptors. The identification of entities which only have internal descriptors does not need jumps among drawings to check complex relations. Once their graphical compositions and their internal constraints are verified, such entities are identified with high confidence. Therefore, as shown in Table 3, dimensions and grid lines are first recognized, and have relatively high recognition rates. Second, the source entities are always searched before their corresponding simplified entities in our system. For example, as shown in Table 3, reference source columns, beams, and slabs are found ahead of their corresponding referencer columns, beams, and slabs. The reason is that in the EBNF-tree, a source entity is always a necessary ESO of a potential simplified one. Only after the source entity is found can the simplified one be searched by shape matching or name matching in the drawing. Directly searching for the simplified entities in the given drawing often leads to much lower recognition rates because only limited information can flag their existence. Finally, the sequence of recognized entities and their recognition rates in Table 3 demonstrate that our method successfully converts the complex jumps into a clearly predefined linear sequence. Under the guidance of complex high-level relations, our automatic interpretation system successfully identifies various engineering entities with high average recognition rates.

Table 3 also shows that the average time costs of the three groups are 41 min, 3.6 min, and 1.7 min, respectively. Although our knowledge-based representation has clearly defined all of the possible existing engineering entities, we cannot predict with certainty whether or not a specific type of engineering entity exists in the given real-life engineering project. Therefore, our system has to spend extra time searching for those entities which in fact do not exist in the given project. Fortunately, the time cost is acceptable, especially compared with human interpretation processes, which usually cost at least 20 to 40 times as much.

The time cost for the searching processes may be reduced in several ways. For instance, when the knowledge parser extracts the name “reference source column” with the meta-symbol “*”, our system knows that there may exist one or more reference source column section views in the drawing. Then, there are two possible searching strategies: sequential or concurrent search. Figs. 7 and 8 show the searching processes for the reference source column section views using these two strategies. In our experiment, concurrent searching helps improve efficiency because the recognized graphical primitives can be quickly removed to reduce the remaining search space.

We believe that our approach also works for otherdrawing domains. We tested our approach on two other

types of drawings: complex structured tables composed oftext and graphs and flow charts. In fact, the tested tablesand charts were contained in our collection of 30 typical

LU ET AL.: A NOVEL KNOWLEDGE-BASED SYSTEM FOR INTERPRETING COMPLEX ENGINEERING DRAWINGS: THEORY,... 1455

Fig. 7. Sequentially searching for reference source column section

views.

TABLE 3Interpretation Results of Three Groups of Real-Life Architectural Engineering Projects

Fig. 8. Concurrently searching for reference source column section

views.

Page 13: 1444 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND …taicl/papers/PAMI-interpret.pdf · A Novel Knowledge-Based System for Interpreting Complex Engineering Drawings: Theory, Representation,

real-life architectural engineering projects. For instance, acolumn template drawing may list all candidate columnnames, attributes, and the corresponding graphical sectionviews in a structured complex table. After extracting theproper knowledge descriptors, the interpretation of such atemplate drawing is actually an automatic interpretationprocess of the structured tables.

In our experiments, we faced two types of failures: internal-failures and external-failures. When interpreting a target engineering entity, internal-failures have two causes. First, its internal geometrical shape definitions are not well covered in the representation file. Second, its internal descriptors are well defined, but the given drawing is not drawn precisely. External-failures also have two causes. The first is similar to that of internal-failures: external descriptors are not covered in the knowledge representation file. The second is that external descriptors are well defined, but there are drawing errors such as missing source entities or conflicting size annotations. As mentioned previously, when such a failure occurs, the system records the corresponding area and creates a warning message for further manual checking or interaction.
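The failure taxonomy above (two failure types, each with a descriptor-coverage cause and a drawing-quality cause) and the record-and-warn behavior can be sketched as a small data structure. This is only an illustrative sketch: the names `FailureKind`, `FailureCause`, `FailureRecord`, and `record_failure`, the bounding-box form of the recorded area, and the message format are all assumptions, not the system's actual interface.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    INTERNAL = "internal-failure"
    EXTERNAL = "external-failure"


class FailureCause(Enum):
    # Descriptors missing from the knowledge representation file.
    DESCRIPTOR_NOT_COVERED = "descriptors not covered in the representation file"
    # Descriptors are fine, but the drawing is imprecise or contains errors.
    DRAWING_ERROR = "drawing is imprecise or contains errors"


@dataclass(frozen=True)
class FailureRecord:
    kind: FailureKind
    cause: FailureCause
    entity: str    # name of the target engineering entity
    area: tuple    # assumed form: (x_min, y_min, x_max, y_max)

    def warning(self) -> str:
        return (f"WARNING [{self.kind.value}] {self.entity}: "
                f"{self.cause.value}; check area {self.area}")


def record_failure(log, kind, cause, entity, area):
    """Append a failure record to `log` and return its warning message."""
    rec = FailureRecord(kind, cause, entity, area)
    log.append(rec)
    return rec.warning()
```

Keeping the affected area with each record is what allows a later manual check to jump directly to the problematic region of the drawing.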

Our experiments might be improved in several ways. First, the interpretation parameters might be calculated more intelligently. For instance, we set the distance threshold as the average height of all the text in the input engineering drawings. However, for the enlarged section view led out by the ellipse and the line in Fig. 1, such a global distance threshold may not be suitable. How to calculate the various types of interpretation parameters is still a difficult problem. Second, our experiment only extracts typical high-level engineering knowledge based on real-life architectural drawings. To be generic, more analysis of other types of engineering drawings is necessary.
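The global distance threshold described above, and why it can misfit an enlarged view, can be made concrete with a short sketch. The function `global_threshold` follows the stated rule (the average height of all text). The function `local_threshold` is purely a hypothetical alternative for comparison and is not a method proposed in this paper. Text items are assumed to be `(x, y, height)` tuples and regions `(x_min, y_min, x_max, y_max)` boxes.

```python
from statistics import mean


def global_threshold(texts):
    """Average height of all text items, used as one global threshold."""
    return mean(h for (_, _, h) in texts)


def local_threshold(texts, region):
    """Hypothetical variant: average text height inside `region` only.

    Falls back to the global threshold when the region holds no text.
    """
    x0, y0, x1, y1 = region
    inside = [h for (x, y, h) in texts if x0 <= x <= x1 and y0 <= y <= y1]
    return mean(inside) if inside else global_threshold(texts)
```

For example, with three labels of height 3.0 in the main view and one label of height 9.0 in an enlarged view, the global threshold is 4.5, which is half the text height used inside the enlarged view; a region-restricted average would give 9.0 there.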

8 CONCLUSION

We present a knowledge-based system for automatic interpretation of engineering drawings. Our major contributions include a novel hierarchical descriptor-based knowledge representation method for high-level interpretation of complex real-life engineering drawings and a new knowledge-based interpretation system to convert real-life engineering drawings to content-oriented high-level descriptions. We apply our system to a number of real-life complex architectural engineering drawings, achieving our desired interpretation sequences, high recognition rates, and acceptable time cost. Our system can be easily applied to other engineering fields.

There are several possible future directions. We are investigating extensions including an improved representation for more types of domain knowledge helpful for high-level analysis, automated learning of human feedback, and dynamic threshold calculation for recognition of different engineering entities. Another interesting area to explore is automatic or semiautomatic algorithms for generating knowledge-descriptors.

ACKNOWLEDGMENTS

This project was supported by the National Natural Science Foundation of China (Grants 60603086, 60723003, and 60721002) and the National High-Tech Research and Development Program of China (Grant 2007AA01Z334).

REFERENCES

[1] A. Cardone, S.K. Gupta, and M.V. Karnik, “A Survey of Shape Similarity Assessment Algorithms for Product Design and Manufacturing Applications,” J. Computing and Information Science in Eng., vol. 3, no. 2, pp. 109-118, 2003.

[2] D.G. Ullman, The Mechanical Design Process, second ed. McGraw-Hill, 1997.

[3] K. Tombre, “Ten Years of Research in the Analysis of Graphics Documents: Achievements and Open Problems,” Proc. 10th Portuguese Conf. Pattern Recognition, pp. 11-17, 1998.

[4] K. Tombre, “Analysis of Engineering Drawings: State of the Art and Challenges,” Lecture Notes in Computer Science, vol. 1389, pp. 257-264, 1998.

[5] T. Kanungo, R.M. Haralick, and D. Dori, “Understanding Engineering Drawings: A Survey,” Proc. First IAPR Int’l Workshop Graphics Recognition, pp. 217-228, 1995.

[6] X. Hilaire and K. Tombre, “Robust and Accurate Vectorization of Line Drawings,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 6, pp. 890-904, June 2006.

[7] Y. Zheng, H. Li, and D. Doermann, “A Parallel-Line Detection Algorithm Based on HMM Decoding,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 777-792, May 2005.

[8] Y.H. Yu, A. Samal, and S.C. Seth, “A System for Recognizing a Large Class of Engineering Drawings,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 8, pp. 868-890, Aug. 1997.

[9] J.Y. Chiang et al., “A New Algorithm for Line Image Vectorization,” Pattern Recognition, vol. 31, no. 10, pp. 1541-1549, 1998.

[10] D. Dori and W. Liu, “Sparse Pixel Vectorization: An Algorithm and Its Performance Evaluation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 3, pp. 202-215, Mar. 1999.

[11] W.Y. Liu and D. Dori, “Incremental Arc Segmentation Algorithms and Its Evaluation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 4, pp. 424-431, Apr. 1998.

[12] E. Bodansky and M. Pilouk, “Using Local Deviations of Vectorization to Enhance the Performance of Raster-to-Vector Conversion Systems,” Int’l J. Document Analysis and Recognition, vol. 3, no. 2, pp. 67-72, 2000.

[13] Y. Chen, N.A. Langrana, and A.K. Das, “Perfecting Vectorized Mechanical Drawings,” Computer Vision and Image Understanding, vol. 63, no. 2, pp. 273-286, 1996.

[14] R. Kasturi, S.T. Bow, W. El-Masri, J. Shah, J.R. Gattiker, and U.B. Mokate, “A System for Interpretation of Line Drawings,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 978-992, Oct. 1990.

[15] R.W. Smith, “Computer Processing of Line Images: A Survey,” Pattern Recognition, vol. 20, no. 1, pp. 7-15, 1987.

[16] S.H. Joseph and T.P. Pridmore, “Knowledge-Directed Interpretation of Mechanical Engineering Drawings,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 14, no. 9, pp. 928-940, Sept. 1992.

[17] K.H. Lee, Y.C. Choy, and S.B. Cho, “Geometric Structure Analysis of Document Images: A Knowledge-Based Approach,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1224-1240, Nov. 2000.

[18] J.Q. Song, F. Su, J.B. Chen, and S.J. Cai, “A Knowledge-Aided Line Network Oriented Vectorization Method for Engineering Drawings,” Pattern Analysis and Application, vol. 3, no. 2, pp. 142-152, 2000.

[19] P. Vaxiviere and K. Tombre, “CELESSTIN IV: Knowledge-Based Analysis of Mechanical Engineering Drawings,” Proc. IEEE Int’l Conf. System Eng., pp. 242-245, 1992.

[20] J. Cherneff, R. Logcher, J. Connor, and N. Patrikalakis, “Knowledge-Based Interpretation of Architectural Drawings,” Research in Eng. Design, vol. 3, no. 4, pp. 195-210, 1992.

[21] Y.Q. Cheng and J.Y. Yang, “A Knowledge-Based Graphic Description Tool for Understanding Engineering Drawings,” Proc. First Int’l Conf. Systems Integration, pp. 302-309, 1990.

[22] J.Q. Song, M.R. Lyu, and S.J. Cai, “Effective Multiresolution Arc Segmentation: Algorithms and Performance Evaluation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 11, pp. 1491-1506, Nov. 2004.

1456 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 8, AUGUST 2009


[23] J.Q. Song, F. Su, C.L. Tai, and S.J. Cai, “An Object-Oriented Progressive-Simplification-Based Vectorization System for Engineering Drawing: Model, Algorithm, and Performance,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1048-1060, Aug. 2002.

[24] J.Q. Song, F. Su, J.B. Chen, and S.J. Cai, “A Highly Efficient Global Vectorization Method for Line Drawings,” Proc. Third IAPR Int’l Workshop Graphics Recognition, pp. 32-37, 1999.

[25] J.Q. Song, F. Su, C.L. Tai, J.B. Chen, and S.J. Cai, “Line Net Global Vectorization: An Algorithm and Its Performance Evaluation,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 383-388, 2000.

[26] F. Su, J.Q. Song, C.L. Tai, and S.J. Cai, “Dimension Recognition and Geometry Reconstruction in Vectorization of Engineering Drawings,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 710-716, 2001.

[27] B. Couasnon, “DMOS, a Generic Document Recognition Method: Application to Table Structure Analysis in a General and in a Specific Way,” Int’l J. Document Analysis and Recognition, vol. 8, no. 2, pp. 111-122, 2006.

[28] B. Couasnon, J. Camillerapp, and I. Leplumey, “Access by Content to Handwritten Archive Documents: Generic Document Recognition Method and Platform for Annotations,” Int’l J. Document Analysis and Recognition, vol. 9, nos. 2-4, pp. 223-242, 2007.

[29] B. Couasnon, “Dealing with Noise in DMOS, a Generic Method for Structured Document Recognition: An Example on a Complete Grammar,” Lecture Notes in Computer Science, vol. 3088, pp. 38-49, 2004.

[30] O. Bimber, L.M. Encarnao, and A. Stork, “A Multi-Layered Architecture for Sketch-Based Interaction within Virtual Environments,” Computers and Graphics, vol. 2, no. 6, pp. 851-867, 2000.

[31] A. Caetano, N. Goulart, M. Fonseca, and J. Jorge, “Javasketchit: Issues in Sketching the Look of User Interfaces,” Proc. AAAI Spring Symp. Sketch Understanding, pp. 9-14, 2002.

[32] J.V. Mahoney and M.P.J. Fromherz, “Three Main Concerns in Sketch Recognition and an Approach to Addressing Them,” Proc. AAAI Spring Symp. Sketch Understanding, pp. 105-112, 2002.

[33] C.J. Su and F. Lin, “Syntactic Interpretation Approaches for Innovative Engineering Drawings Conversion Support System,” Computers and Industrial Eng., vol. 35, nos. 3-4, pp. 635-638, 1998.

[34] Q. Ji and M.M. Marefat, “Machine Interpretation of CAD Data for Manufacturing Applications,” ACM Computing Surveys, vol. 24, no. 3, pp. 264-311, 1997.

[35] G.S. Zhi, S.M. Lo, and Z. Fang, “A Graph-Based Algorithm for Extracting Units and Loops from Architectural Floor Plans for a Building Evacuation Model,” Computer-Aided Design, vol. 35, no. 1, pp. 1-14, 2003.

[36] P. Dosch, K. Tombre, C. Ah-Soon, and G. Masini, “A Complete System for Analysis of Architectural Drawings,” Int’l J. Document Analysis and Recognition, vol. 3, no. 2, pp. 102-116, 2000.

[37] B.S. Prabhu and S.S. Pande, “Intelligent Interpretation of CADD Drawings,” Computers and Graphics, vol. 23, no. 1, pp. 25-44, 1999.

[38] T. Lu, C.L. Tai, F. Su, and S.J. Cai, “A New Recognition Model for Electronic Architectural Drawings,” Computer-Aided Design, vol. 37, no. 10, pp. 1053-1069, 2005.

[39] T. Lu, C.L. Tai, L. Bao, F. Su, and S.J. Cai, “3D Reconstruction of Detailed Buildings from Architectural Drawings,” Computer-Aided Design and Applications, vol. 2, nos. 1-4, pp. 527-536, 2005.

[40] J.Y. Ramel and N. Vincent, “Strategy for Line Drawing Understanding,” Lecture Notes in Computer Science, vol. 3088, pp. 1-12, 2004.

[41] T.P. Pridmore, A. Darwish, and D. Elliman, “Interpreting Line Drawing Images: A Knowledge Level Perspective,” Lecture Notes in Computer Science, vol. 2390, pp. 245-255, 2002.

[42] R.D.T. Janssen and A.M. Vossepoel, “Adaptive Vectorization of Line Drawing Images,” Computer Vision and Image Understanding, vol. 65, no. 1, pp. 38-56, 1997.

[43] O. Hori and S. Tanigawa, “Raster-to-Vector Conversion by Line Fitting Based on Contours and Skeletons,” Proc. Int’l Conf. Document Analysis and Recognition, pp. 272-281, 1993.

[44] J.B. Roseborough and H. Murase, “Partial Eigenvalue Decomposition for Large Image Sets Using Run-Length Encoding,” Pattern Recognition, vol. 28, no. 3, pp. 421-430, 1995.

[45] D. Dori and W.Y. Liu, “Automated CAD Conversion with the Machine Drawing Understanding System: Concepts, Algorithms, and Performance,” IEEE Trans. Systems, Man, and Cybernetics—Part A: Systems and Humans, vol. 29, no. 4, pp. 411-416, 1999.

[46] D. Crevier and R. Lepage, “Knowledge-Based Image Understanding Systems: A Survey,” Computer Vision and Image Understanding, vol. 67, no. 2, pp. 161-185, 1997.

[47] T. Lu, H.F. Yang, R.Y. Yang, and S.J. Cai, “Automatic Analysis and Integration of Architectural Drawings,” Int’l J. Document Analysis and Recognition, vol. 9, no. 1, pp. 31-47, 2007.

[48] H.F. Yang, R.Y. Yang, T. Lu, and S.J. Cai, “The Hierarchical Feature Extraction Approach for Symbol Recognition in Construction Engineering Drawings,” Proc. Sixth IAPR Int’l Workshop Graphics Recognition, pp. 46-55, 2005.

Tong Lu received the PhD degree in computer science from Nanjing University in 2005. He is currently an associate professor in the Department of Computer Science and Technology at Nanjing University. His research interests include graphics recognition, automatic interpretation of engineering drawings, computer graphics, and CAD.

Chiew-Lan Tai received the BSc degree in mathematics from the University of Malaya, the MSc degree in computer and information sciences from the National University of Singapore, and the DSc degree in information science from the University of Tokyo. She is an associate professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology. Her research interests include geometric modeling, computer graphics, and graphics recognition.

Huafei Yang is a PhD candidate in the Department of Computer Science and Technology at Nanjing University. His research interests include graphics recognition, automatic interpretation of engineering drawings, and computer graphics.

Shijie Cai received the degree from Nanjing University in 1967. He is a professor in the Department of Computer Science and Technology at Nanjing University. His research interests include computer graphics, graphics recognition, image processing, and document analysis.
