CHAPTER IV - Shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/49226/10... · 1950s a Bummer...

CHAPTER - IV RULE-BASED APPROACH FOR DISTRIBUTED QUERY OPTIMIZATION




Database systems are computer systems that store a large number of facts about some subject. The design and implementation of database systems is a developing and most challenging field of computer science. Many techniques have been developed to enable the efficient representation, storage and retrieval of a large number of facts. In a centralised or distributed database system, query optimization is the most complex task. Query processing becomes interesting when it requires deductive reasoning with the facts in the database while retrieving answers to given queries.

There are several problems in designing such intelligent information retrieval systems. First, it is very difficult to build a system that can understand queries stated in a natural language like English. This language understanding problem is solved by specifying formal, machine-understandable query languages, but the second problem, deducing answers from stored facts, remains as it is. Third, understanding a query and deducing an answer to it may require more knowledge than is represented in the database. This type of common knowledge is often required for solving such problems. Artificial intelligence methods are required to represent such common knowledge and also in designing systems that use common knowledge.


Many researchers have developed intelligent, extensible database systems such as EXODUS [12], PROBE [26], Starburst [45] and POSTGRES [101]. Freytag [33] and Graefe [40] proposed rule-based query optimization. In [6] an approach to automatic rule derivation for semantic query optimization was proposed. All these papers deal with query processing in centralised database systems. From the available literature, it is clear that no intelligent system has been developed for distributed query processing.

There are many papers [18,35,36,80,92,97,110] related to distributed query processing. Some authors [13,82,97] consider the fragmentation technique as part of the optimization problem, and some others [14,102] separate the fragmentation problem completely from the query optimization problem.

There is no single approach which contains algebraic transformations, query translation (a query over global relations into a query over fragments) and processing of the query as part of the optimization problem.

Most distributed query processing algorithms consider only some operations in query qualifications, such as only joins in join queries. Algorithms for general queries, which allow more of the available operations, are very rare.


Different algorithms are proposed depending on the basic assumptions made and also on the objective function. This chapter presents an architecture of a distributed query processing system with a rule-based approach. This model contains all three stages: algebraic transformation, translation and processing of the query in a distributed environment. The qualification of a query in this model may include union and some other relational operations.

4.1. OVERVIEW OF ARTIFICIAL INTELLIGENCE:

Many human mental activities such as writing computer programs, solving mathematical problems, engaging in common sense reasoning, understanding language, and even driving an automobile require 'intelligence'. Over the past few decades, several computer systems were built that can perform tasks such as those described above. Some computer systems are designed specifically for solving problems such as diagnosis of diseases, planning the synthesis of complex organic chemical compounds, solving differential equations in symbolic form, analysing electronic circuits, natural language processing etc. Such systems possess some degree of artificial intelligence. These systems are considered under the field of Artificial Intelligence (AI), since they use some AI methods and techniques.


4.1.1. EARLY WORKS IN AI:

During the 1950s several events marked this period as the real beginning of AI. During this period researchers like Claude Shannon at MIT and Allen Newell at RAND Corporation developed chess playing programs. Other types of game playing and simulation programs were also developed during this period. Language translation programs were developed during 1955. During the mid 1950s a summer workshop on AI was sponsored by IBM, which is considered the date of birth of AI. In a seminar on AI during 1972, much discussion was focused on automatic theorem proving and new programming languages.

Some significant AI events of the 1960s include the following:

1962 --- G. W. Ernst made the first AI computer controlled robot

1964 --- B. Raphael worked on question-answering systems with trivial databases

1961-65 --- A. L. Samuel developed a program which learned to play checkers at master's level

1965 --- J. A. Robinson introduced resolution as an inference method in logic

1965 --- Work on expert system DENDRAL was begun

1968 --- Work on MACSYMA was initiated at MIT


During the 1970s AI systems were developed which had large domain-specific knowledge bases and extensive procedural knowledge. By the end of the 1970s, the fields of expert systems and natural language processing had begun to emerge. By this time AI research had already affected many fields, including programming techniques in mathematics, chemistry, psychology, geology, oil exploration etc. In the 1980s, special AI hardware and software development tools began to emerge, and commercial applications increased. At the same time research began on fundamental issues of learning, memory structure, knowledge representation and knowledge acquisition.

4.1.2. APPLICATIONS OF ARTIFICIAL INTELLIGENCE:

A real-time application of AI closely related to human beings is natural language processing. Developing a computer system capable of generating and understanding fragments of a natural language such as English is considered natural language processing. Grosz [44] presents a good survey of current techniques and problems in natural language processing.

A business application of AI is intelligent retrieval from databases, that is, retrieving answers to given queries from a database using deductive reasoning with the facts in the database. Papers describing various applications of AI and logic to database organization and retrieval are contained in a book edited by Gallaire and Minker [34].

AI techniques have been applied in the development of automatic consulting systems. Many such systems employ the AI technique of rule-based deduction. Expert consulting systems have been developed for a variety of domains. Expert consulting systems have been built that can diagnose diseases [104], evaluate ore deposits [29], suggest structures for complex organic chemicals [11] etc.

In general, finding a proof for a theorem in mathematics is considered an intellectual task. In AI, formalization of the deductive process uses predicate logic. Early applications of AI in theorem proving are in plane geometry, propositional logic, analysis, topology, set theory etc. Theorem proving techniques have been applied in building excellent resolution-based systems.

Much of the theoretical research in robotics was conducted through robot projects in the late 1960s and early 1970s. More research on practical applications of robotics in industrial areas was conducted during the late 1970s. Research on robotics in theoretical as well as practical applications has helped to develop many AI ideas. There are plenty of applications of AI in different fields. In automatic programming, Manna and Waldinger [74] describe a logic-based method for program verification. Lauriere [64] presents a computer language and a system for solving combinatorial problems using AI methods. Many papers on the problems of visual perception by machine are contained in volumes edited by Hanson and Riseman [48]. Much research is going on in developing AI techniques in different fields such as automatic design, education, organic chemical synthesis etc.

In this chapter the technique of rule-based deduction for intelligent retrieval from databases is used.

4.2. EXPERT SYSTEMS:

In [84] Newell surveyed several organizational alternatives for problem solvers. He was concerned with how one should proceed in designing problem-solving systems. Many techniques have been developed in AI research and many expert systems have been built. Expert systems are problem-solving programs that solve substantial problems generally considered difficult and requiring expertise. They are knowledge based because their performance depends critically on the use of facts and heuristics used by experts. Recently some textbooks on AI [79,81] have presented examples of advanced programming techniques in AI. A textbook on expert systems, containing a number of papers related to this field, was edited by Amar Gupta and Prasad [84].

An expert system's knowledge is obtained from expert sources and coded in a form suitable for the system to use in its reasoning process. The expert knowledge is generally obtained from specialists or other sources of expertise, such as texts, journal articles and databases. This knowledge is encoded in the required form, loaded into a knowledge base, then tested and refined continuously throughout the life of the system.

4.2.1. DEFINITION AND CHARACTERISTICS OF EXPERT SYSTEMS:

Expert systems are non-conventional, knowledge-intensive programs that solve problems normally requiring human expertise. They perform many of the secondary functions that an expert does. By examining the functioning of expert systems, the common characteristics of expert systems can be summarised as follows:

---They solve very difficult problems as well as or better than experts

---They reason heuristically, using rules which experts consider effective

---They interact with humans in appropriate ways, including the use of natural language

---They manipulate and reason about symbolic descriptions

---They function with erroneous data and uncertain judgmental rules

---They process multiple hypotheses simultaneously

---They explain why they are asking a question

---They justify their conclusions

4.2.2. EVOLUTION AND BACKGROUND HISTORY:

Expert systems first emerged from the research laboratories of a few US universities during the 1960s and 1970s. They were developed as specialised problem solvers which emphasized the use of knowledge rather than algorithms and general search methods. Figure 4.1 positions expert systems around 1980-81, when efforts began to commercialise the technology. The first company formed exclusively to promote expert systems in the field of genetic engineering was 'INTELLIGENETICS'. The first completed expert system, 'DENDRAL', was developed at Stanford University in the late 1960s. The Stanford group developed the first successful learning system, 'META-DENDRAL', and also a variety of applications in medicine such as 'MYCIN'. Now the range and depth of applications of expert systems are expanding in many areas.

[Figure 4.1: diagram not legible in the source scan]

4.2.3. IMPORTANCE AND APPLICATIONS:

The value of expert systems was established by the early 1980s. These systems proved to be cost effective in most practical applications. In an application domain, difficult problems are solved by experts. These experts may not always be available, or they may retire from service. By developing an expert system that captures the experts' experience, their knowledge can be used indefinitely to solve such stereotyped problems.


There are many applications in almost all areas of business and government. They include areas such as:

---Different types of medical diagnoses

---Diagnosis of complex electronic and electromechanical systems

---Diagnosis of diesel electric locomotion systems

---Diagnosis of software development projects

---Planning experiments in biology, chemistry and molecular genetics

---Forecasting crop damage

---Identification of chemical compound structures and chemical compounds

---Evaluation of loan applicants for lending institutions

---Design of VLSI systems

---Military applications ranging from battlefield assessment to ocean surveillance

---Assessment of geologic structures from dip meter logs

---Teaching students specialized tasks

---Planning curricula for students

4.2.4. RULE-BASED SYSTEM ARCHITECTURE:

Expert system architectures are categorised into two models, production rule systems (also known as rule-based systems) and non-production rule systems, based on their rule representation schemes. The most common form of architecture is the rule-based system. This type of system uses knowledge encoded in the form of production rules, that is, if ... then ... rules. The main components of a rule-based system are shown in Figure 4.2.

[Figure 4.2: block diagram; legible labels include USER, EXPERT SYSTEM, INFERENCE and WORKING MEMORY]

Figure 4.2. Components of Expert system

KNOWLEDGE BASE:

The knowledge base contains facts and rules about some specialised knowledge domain. Developing an expert system requires much domain knowledge, which the system can then access to give intelligent advice related to that domain. This element is the most important one in constructing an expert system, and such systems are also known as knowledge-based systems. The knowledge base contains both declarative knowledge and procedural knowledge.


INFERENCE PROCESS:

Simply having access to a great deal of knowledge does not make an expert system intelligent. The component that directs the application of knowledge is known as the inference engine. The inference engine accepts user input queries and responses to questions through the I/O interface and uses this information together with the knowledge stored in the knowledge base. This process is carried out repeatedly in three stages: 1. Match, 2. Select and 3. Execute. During the match stage, the contents of working memory are compared to the facts and rules contained in the knowledge base. When consistent matches are found, the corresponding rules are placed in a conflict set. One of the rules from the conflict set is selected for execution. The selected rule is executed and the action part of the rule is carried out.
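The match, select and execute cycle described above can be sketched in a few lines of Python. This is a minimal illustration only: the `Rule` class, the `run_inference` function, the "first rule in rule-base order" selection strategy and the sample medical facts are all assumptions made for this example, not taken from any system cited in this chapter.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical production rule: IF all conditions hold THEN add conclusion."""
    name: str
    conditions: FrozenSet[str]
    conclusion: str


def run_inference(rules: List[Rule], facts: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Repeat match / select / execute until no rule can add a new fact."""
    working_memory = set(facts)
    fired: List[str] = []
    while True:
        # Match: rules whose conditions are all in working memory and whose
        # conclusion is not yet known form the conflict set.
        conflict_set = [
            r for r in rules
            if r.conditions <= working_memory and r.conclusion not in working_memory
        ]
        if not conflict_set:
            return working_memory, fired
        # Select: assumed strategy, take the first rule in rule-base order.
        chosen = conflict_set[0]
        # Execute: carry out the action part of the selected rule.
        working_memory.add(chosen.conclusion)
        fired.append(chosen.name)


# Hypothetical rule base and initial facts.
rules = [
    Rule("R2", frozenset({"infection_suspected"}), "recommend_culture_test"),
    Rule("R1", frozenset({"fever", "high_white_cell_count"}), "infection_suspected"),
]
memory, fired = run_inference(rules, {"fever", "high_white_cell_count"})
print(fired)
```

Real inference engines usually apply a more elaborate conflict resolution strategy (for example rule priority or recency of facts); the fixed ordering here is only the simplest choice that keeps the sketch deterministic.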

I/O INTERFACE:

The I/O interface permits the user to communicate with the system in a more natural way, using simple selection menus or a language close to natural language. The communication performed by the I/O interface is bidirectional. The AI technique of natural language processing can be used in this interface to communicate with the system in ordinary English and to enable the computer to respond in the same language. This type of user interface is called a natural language front end.

WORKING MEMORY:

This is an area of memory used for storing a description of the problem, constructed by the system from the facts supplied by the user or inferred from the knowledge base during a consultation.

EXPLANATION MODULE:

This module provides the user with an explanation of the reasoning process when requested. It explains the responses to 'How is the query processed' and 'Why does the query need certain information while processing'.

LEARNING MODULE AND HISTORY FILE:

These are not common components of expert systems. They are provided to assist in building and refining the knowledge base.

4.2.5. RULE-BASED QUERY OPTIMIZATION:

To minimize the changes needed to build an optimizer for a new database system, recent researchers developed extensible query optimizers such as EXODUS [12], PROBE [26] and POSTGRES [101]. Both Freytag [33] and Graefe [40] have proposed a rule-based view of query optimization. This approach allows a database implementer to specify algebraic transformations as a set of rewrite rules. This specification is used to generate an executable conventional query optimizer. Freytag [33] describes a rule-based approach to generate different query plans, given an initial query specification. He describes an approach that selects the optimal set of algebraic transformations which can be applied to a given query.

Graefe's system [40] is based on an optimizer generator, which uses a set of algebraic transformations to derive an executable conventional query optimizer. Graefe considers problems similar to those found in semantic query optimization, such as identification and selection of transformations based on approximate methods. These methods use cost formulae and past database performance to evaluate the worth of a given transformation.

In [98] the authors describe the architecture of a system having two interrelated components: a combined conventional/semantic query optimizer, and an automatic rule deriver. The semantic query optimizer is a simple generalisation of the conventional rule-based optimizer, in which semantic transformation heuristics are used in place of algebraic transformation heuristics.


But from the available literature, it is evident that there is very little research on a rule-based approach for distributed query optimization. In this chapter an architecture for a rule-based approach to distributed query processing is proposed.

4.3. HEURISTIC RULE-BASED DISTRIBUTED QUERY OPTIMIZATION:

In general, distributed query processing is executed in two major steps: (1) translation of a query over global relations into an equivalent query over fragments, and (2) transformation of the query into an optimally equivalent form by applying algebraic and semantic equivalence transformations. In a distributed environment, local and global optimizers exist at each node. The global optimizer translates the given query into an optimal fragmented query, and non-local segments are transmitted to the corresponding sites, where a local optimization procedure is applied to each segment so that it executes optimally. The local optimizer is therefore analogous to an optimizer in a centralised system, with slight modifications. In centralised systems the goal of optimization is to reduce processing and I/O costs, whereas in a distributed environment the goal of optimization is to reduce communication cost.

A rule-based approach for query processing [33], and also for semantic transformation [98], has already been proposed for centralised systems. The same procedure can be adopted for the second step of distributed query optimization, generating an optimal plan by adding more transformation heuristics. Therefore the main problem in developing a rule-based approach for distributed query optimization is the translation of a global query into a fragmented query. In this section the architecture of a heuristic-based query optimizer is presented, and the functioning of the translation processor is discussed in detail.

The distributed query optimizer is decomposed into two independent components: (1) the Translation Processor and (2) the Algebraic/Semantic optimizer, as shown in Figure 4.3. A query over global relations is the input to the Translation Processor, and the output of this component is an optimal fragmented query, which is the input to the second component, the Algebraic/Semantic optimizer. The output of the second component is an optimal query plan, which is sent to the query processor to be executed so as to minimize transmission cost. In most distributed query processing algorithms [18,80,97,110], the authors assume that the given query is algebraically and semantically in optimal form over fragments. Those algorithms can be applied directly to the query plan produced by this heuristic-based optimizer.

[Figure 4.3: block diagram; legible labels include QUERY OVER GLOBAL RELATIONS, FRAGMENTS and RESULT OF QUERY]

Figure 4.3. Distributed Query Processor

4.3.1. TRANSLATION PROCESSOR:

This processor accepts a query over global relations as input. A set of translation rules is applied to get an optimally equivalent query over fragments. Fragment transformation heuristics are used to identify the most promising transformations. Using these heuristics reduces the size of the transformation set.

4.3.1.1. RULES AND FRAGMENTED TRANSFORMATIONS:

The following database scheme is used to illustrate the query examples in this section.

DOCTOR (DNUM, NAME, DEPT)

PATIENT (PNUM, NAME, DEPT, TREAT, DNUM, AMOUNT-DUE)

DEPARTMENT (DEPT, LOCATION, DIRECTOR)

These global relations are fragmented and allocated to different sites. Each fragment is written in the form $F_{ij}$, representing the i-th fragment of relation F allocated at site j. All fragments are disjoint and satisfy the completeness and reconstruction conditions [14,102]. Let the fragments of the above relations be:

$\text{DOCTOR11} = \text{SL}_{\text{DEPT} = \text{'SURGERY'}}(\text{DOCTOR})$

$\text{DOCTOR22} = \text{SL}_{\text{DEPT} = \text{'PEDIATRICS'}}(\text{DOCTOR})$

$\text{DOCTOR33} = \text{SL}_{\text{DEPT} \neq \text{'SURGERY'} \text{ AND } \text{DEPT} \neq \text{'PEDIATRICS'}}(\text{DOCTOR})$

$\text{PATIENT14} = \text{PJ}_{\text{PNUM, NAME, AMOUNT-DUE}}(\text{PATIENT})$

$\text{PATIENT21} = \text{SL}_{\text{DEPT} = \text{'SURGERY'}}(\text{PJ}_{\text{PNUM, DEPT, TREAT, DNUM}}(\text{PATIENT}))$

$\text{PATIENT32} = \text{SL}_{\text{DEPT} = \text{'PEDIATRICS'}}(\text{PJ}_{\text{PNUM, DEPT, TREAT, DNUM}}(\text{PATIENT}))$

$\text{PATIENT43} = \text{SL}_{\text{DEPT} \neq \text{'SURGERY'} \text{ AND } \text{DEPT} \neq \text{'PEDIATRICS'}}(\text{PJ}_{\text{PNUM, DEPT, TREAT, DNUM}}(\text{PATIENT}))$

$\text{DEPARTMENT11} = \text{SL}_{\text{LOCATION} = \text{'SVIMS'}}(\text{DEPARTMENT})$

$\text{DEPARTMENT22} = \text{SL}_{\text{LOCATION} = \text{'SVRR'}}(\text{DEPARTMENT})$
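The disjointness, completeness and reconstruction conditions can be checked mechanically for the horizontal fragments of DOCTOR. The following is a minimal sketch under stated assumptions: the sample rows are invented for illustration, the helper names (`fragment`, `is_disjoint_and_complete`, `reconstruct`) are hypothetical, and the predicate for DOCTOR11 (DEPT = 'SURGERY') is assumed from qualification Q1 of the rule-base.

```python
# Invented sample rows of the global relation DOCTOR(DNUM, NAME, DEPT).
doctors = [
    {"DNUM": 1, "NAME": "A", "DEPT": "SURGERY"},
    {"DNUM": 2, "NAME": "B", "DEPT": "PEDIATRICS"},
    {"DNUM": 3, "NAME": "C", "DEPT": "ENT"},
    {"DNUM": 4, "NAME": "D", "DEPT": "SURGERY"},
]

# Selection predicate (qualification) of each horizontal fragment.
FRAGMENT_PREDICATES = {
    "DOCTOR11": lambda row: row["DEPT"] == "SURGERY",  # assumed from Q1
    "DOCTOR22": lambda row: row["DEPT"] == "PEDIATRICS",
    "DOCTOR33": lambda row: row["DEPT"] not in ("SURGERY", "PEDIATRICS"),
}


def fragment(rows):
    """Apply each selection predicate to the global relation."""
    return {
        name: [row for row in rows if predicate(row)]
        for name, predicate in FRAGMENT_PREDICATES.items()
    }


def is_disjoint_and_complete(rows, fragments):
    """Every global row must appear in exactly one fragment."""
    for row in rows:
        owners = [name for name, rows_in in fragments.items() if row in rows_in]
        if len(owners) != 1:
            return False
    return True


def reconstruct(fragments):
    """Reconstruction of a horizontally fragmented relation is the union (UN)."""
    rows = [row for rows_in in fragments.values() for row in rows_in]
    return sorted(rows, key=lambda row: row["DNUM"])


fragments = fragment(doctors)
print(is_disjoint_and_complete(doctors, fragments))
print(reconstruct(fragments) == doctors)
```

PATIENT mixes vertical and horizontal fragmentation, so its reconstruction would need a join on PNUM in addition to the union; that case is left out of this sketch.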

The reconstruction of fragments into a global relation is stored in the fragmentation rule-base in the form of equivalence rules. In addition to these reconstruction rules, some inference rules are also stored in the rule-base. These rules are provided by the system to characterise the database. For example, in this database only the SURGERY department exists at SVIMS. This can be framed as rule FR4. Rules are represented as shown in Figure 4.4. Each row represents a rule in four columns: the rule name, the left part of the rule, the right part of the rule and the operator.

RULE-NAME   OPERATOR   LHS          RHS

FR1         =          DOCTOR       $\text{DOCTOR11}_{Q1} \text{ UN } \text{DOCTOR22}_{Q2} \text{ UN } \text{DOCTOR33}_{Q3}$

FR2         =          PATIENT      $\text{PATIENT14}_{Q5} \text{ JN}_{\text{PNUM} = \text{PNUM}} (\text{PATIENT21}_{Q4 \text{ AND } Q1} \text{ UN } \text{PATIENT32}_{Q4 \text{ AND } Q2} \text{ UN } \text{PATIENT43}_{Q4 \text{ AND } Q3})$

FR3         =          DEPARTMENT   $\text{DEPARTMENT11}_{Q6} \text{ UN } \text{DEPARTMENT22}_{Q7}$

Figure 4.4. Sample Fragmented Rule-Base

Where

Q1 : DEPT = 'SURGERY'

Q2 : DEPT = 'PEDIATRICS'

Q3 : DEPT <> 'SURGERY' AND DEPT <> 'PEDIATRICS'

Q4 : PNUM, DEPT, TREAT, DNUM

Q5 : PNUM, NAME, AMOUNT-DUE

Q6 : LOCATION = 'SVIMS'

Q7 : LOCATION = 'SVRR'

From here onward we use the qualification identifiers (Q1, Q2, etc.) in place of the qualifications.


4.3.1.2. HEURISTIC BASED FRAGMENT TRANSFORMATIONS:

The transformation set for a query may be large, and it is impractical to consider all possible transformations. Moreover, only a small percentage of the possible transformations are useful. Therefore, to identify the most useful transformations, transformation heuristics are developed. These are the same as algebraic transformations according to the translation process. The proposed architecture for the translation processor is shown in Figure 4.5.

[Figure 4.5: block diagram; legible labels include RULE-BASE, TRANSDUCER, CANONICAL QUERY TREE, GLOBAL TO FRAGMENT TRANSFORMATION MODULE and OPTIMAL FRAGMENTED QUERY PLAN]

Figure 4.5. Translation Processor

The translation process is decomposed into a series of modules, each of which is straightforward. In this section the first three modules of this processor are described, and the fourth module is discussed in the next section.


The process begins with a query in the form of a relational algebraic expression. The TRANSDUCER module translates this query into a tree representation. Succeeding modules operate on this tree structure.

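The next module, SUBSTITUTOR, substitutes all global relations with the corresponding fragment expressions using the fragment rule-base. For example, let the given query be

$\text{PJ}_{\text{DNUM, NAME}}(\text{SL}_{Q6}(\text{DOCTOR} \text{ JN}_{\text{DEPT} = \text{DEPT}} \text{ DEPARTMENT}))$ ..... (Q1)

The canonical form Q1' obtained from rules FR1 and FR3 of the fragmented rule-base (Figure 4.4) is

$\text{PJ}_{\text{DNUM, NAME}}(\text{SL}_{Q6}((\text{DOCTOR11}_{Q1} \text{ UN } \text{DOCTOR22}_{Q2} \text{ UN } \text{DOCTOR33}_{Q3}) \text{ JN}_{\text{DEPT} = \text{DEPT}} (\text{DEPARTMENT11}_{Q6} \text{ UN } \text{DEPARTMENT22}_{Q7})))$ ..... (Q1')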

For each global relation in the query, the Substitutor module searches for a match in the LHS column of the rule-base. If a correct match is found with '=' in the Operator column, then the global relation in the query is replaced with the corresponding RHS part of the rule in the rule-base. In tree format these global relations are generally at the leaf nodes and are substituted by the corresponding equivalent expressions.
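The substitution step can be illustrated with a small tree rewrite. This is a sketch under stated assumptions, not the implementation of the Substitutor module: the tuple encoding of tree nodes, the `RULE_BASE` list and the `substitute` function are hypothetical, and the query is a simplified join of DOCTOR and DEPARTMENT.

```python
# Assumed node encodings for this sketch:
#   ("REL", name)                   global relation (leaf)
#   ("FRAG", name, qualification)   qualified fragment (leaf)
#   (op, parameter, *children)      operator node, e.g. ("JN", "DEPT = DEPT", l, r)

RULE_BASE = [
    # (rule name, LHS, operator, RHS)
    ("FR1", "DOCTOR", "=",
     ("UN", None,
      ("FRAG", "DOCTOR11", "Q1"),
      ("FRAG", "DOCTOR22", "Q2"),
      ("FRAG", "DOCTOR33", "Q3"))),
    ("FR3", "DEPARTMENT", "=",
     ("UN", None,
      ("FRAG", "DEPARTMENT11", "Q6"),
      ("FRAG", "DEPARTMENT22", "Q7"))),
]


def substitute(node):
    """Replace every global relation leaf by the RHS of its '=' rule."""
    kind = node[0]
    if kind == "FRAG":
        return node
    if kind == "REL":
        for _rule_name, lhs, operator, rhs in RULE_BASE:
            if operator == "=" and lhs == node[1]:
                return rhs
        return node  # no equivalence rule found: leave the leaf unchanged
    # Operator node: keep the operator and its parameter, rewrite the children.
    return (kind, node[1]) + tuple(substitute(child) for child in node[2:])


query = ("PJ", "DNUM, NAME",
         ("JN", "DEPT = DEPT", ("REL", "DOCTOR"), ("REL", "DEPARTMENT")))
canonical = substitute(query)
print(canonical)
```

Only rules whose operator is '=' are used for substitution, which mirrors the matching condition described above; inference rules with other operators would be skipped by this loop.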


This fragmentation transformation module takes the canonical form of the query tree as input, and tries to find a transformation heuristic that applies to some subtree. Some fragment transformation heuristics used in explaining the examples in this chapter are given below:

FTH1: Push-Selection Heuristic

Push selections down to the leaves of the canonical tree and then apply the relevant qualified algebraic rules to them; substitute the selection result with the empty relation if the qualification of the result is contradictory.

FTH2: Join-Distribution Heuristic

The qualifications of the operands of a join are evaluated using the required qualified algebraic rules. If the qualification of the result of the join is contradictory, then replace the corresponding subtree with the empty relation.

FTH3: Union-Distribution Heuristic

Unions must be pushed up beyond the joins to distribute joins over unions.

FTH4: Selection-Qualification Heuristic

A selection on a qualified relation results in a qualified relation with predicates containing the selection formula and the qualification of the original relation.

FTH5: Projection-Qualification Heuristic

The result of a projection on a qualified relation is the original relation having the same qualification with the selected attributes.

FTH6: Binary-Empty Heuristic

The result of any of the binary operations CP, SJ or JN is empty if one of the operands is an empty relation.

FTH7: Union-Empty Heuristic

The union operation UN with one operand as the empty relation results in the second operand.
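The two empty-relation heuristics, FTH6 and FTH7, lend themselves to a short bottom-up tree rewrite. The sketch below is illustrative only: the tuple encoding of nodes, the `EMPTY` marker and the `simplify` function are assumptions made for this example.

```python
# Assumed node encodings for this sketch:
#   ("EMPTY",)                      the empty relation
#   ("FRAG", name, qualification)   qualified fragment (leaf)
#   (op, parameter, *children)      operator node; op is one of CP, SJ, JN, UN, ...

EMPTY = ("EMPTY",)


def simplify(node):
    """Apply FTH6 and FTH7 bottom-up to a query tree."""
    kind = node[0]
    if kind in ("EMPTY", "FRAG"):
        return node
    children = [simplify(child) for child in node[2:]]
    # FTH6: CP, SJ or JN with an empty operand is empty.
    if kind in ("CP", "SJ", "JN") and EMPTY in children:
        return EMPTY
    # FTH7: a union drops its empty operands.
    if kind == "UN":
        children = [child for child in children if child != EMPTY]
        if not children:
            return EMPTY
        if len(children) == 1:
            return children[0]
    return (kind, node[1]) + tuple(children)


tree = ("UN", None,
        ("JN", "DNUM = DNUM", ("FRAG", "DOCTOR11", "Q1"), EMPTY),
        ("FRAG", "DOCTOR33", "Q3"))
print(simplify(tree))
```

Working bottom-up means an empty operand produced deep in the tree propagates through enclosing joins before the surrounding union is cleaned up.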

In addition to the above heuristics, more heuristics can be created according to the fragmentation translation formulas while implementing the system. The following examples describe how a global query is translated into an equivalent optimal fragmented query using the fragment transformation heuristics.

Let t h e given q u e r y be

SLDLPT = ' EHT' ( DOCTOR)

Canonioal torn of 92 from substitutor module uning Fragment

rule-base be

BY applying transformation heuristic FTHl and FTH4 , the

above query t r a r ~ rn:? into

Page 26: CHAPTER IV - Shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/49226/10... · 1950s a Bummer workshop on A1 was uponnored by IBM which ie coneidered as date of birth of AI, In

The first two operands of the union operation of Q2b result in empty relations because of the contradiction in their qualifications. Therefore, by applying FTH6, Q2b transforms into the fragmented form

$$\mathrm{DOCTOR33}_{\mathrm{DEPT}=\text{'ENT'}\ \mathrm{AND}\ Q_3}$$

which is equivalent to

$$\mathrm{SL}_{\mathrm{DEPT}=\text{'ENT'}}(\mathrm{DOCTOR33}) \qquad \text{(Q2c)}$$
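
The translation of Q2 can be imitated by a short Python sketch of the selection-qualification step followed by removal of empty operands. It assumes that every qualification is a conjunction of `attribute = value` predicates stored in a dictionary; the function names and the DEPT values given to DOCTOR11 and DOCTOR22 are placeholders chosen for illustration, not the actual qualifications Q1 and Q2.

```python
from typing import Dict, Optional

# Assumed form of a qualification: a conjunction of attribute = value predicates.
Qualification = Dict[str, str]


def qualify_selection(qual: Qualification, attr: str, value: str) -> Optional[Qualification]:
    """Add the selection predicate attr = value to a fragment qualification.

    Returns None when the extended qualification is contradictory,
    which means the selection on that fragment is an empty relation.
    """
    if attr in qual and qual[attr] != value:
        return None
    extended = dict(qual)
    extended[attr] = value
    return extended


def select_over_fragments(
    fragments: Dict[str, Qualification], attr: str, value: str
) -> Dict[str, Qualification]:
    """Push the selection to every fragment and keep only non-empty operands."""
    result = {}
    for name, qual in fragments.items():
        extended = qualify_selection(qual, attr, value)
        if extended is not None:
            result[name] = extended
    return result


# Placeholder qualifications: only DOCTOR33 is assumed to hold the ENT department.
DOCTOR_FRAGMENTS = {
    "DOCTOR11": {"DEPT": "DEPT-A"},
    "DOCTOR22": {"DEPT": "DEPT-B"},
    "DOCTOR33": {"DEPT": "ENT"},
}

surviving = select_over_fragments(DOCTOR_FRAGMENTS, "DEPT", "ENT")
```

Under these assumptions only DOCTOR33 survives, which mirrors the single-fragment form Q2c.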

Consider another query

$$\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}(\mathrm{DOCTOR}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT}) \qquad \text{(Q3)}$$

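Let the canonical form of Q3 be

$$
\begin{aligned}
\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}\big(\,&(\mathrm{DOCTOR11}_{Q_1}\ \mathrm{UN}\ \mathrm{DOCTOR22}_{Q_2}\ \mathrm{UN}\ \mathrm{DOCTOR33}_{Q_3})\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}} \\
&(\mathrm{PATIENT14}_{Q_5}\ \mathrm{JN}\ (\mathrm{PATIENT21}_{Q_4\ \mathrm{AND}\ Q_1}\ \mathrm{UN}\ \mathrm{PATIENT32}_{Q_4\ \mathrm{AND}\ Q_2}\ \mathrm{UN}\ \mathrm{PATIENT43}_{Q_4\ \mathrm{AND}\ Q_3}))\,\big) \qquad \text{(Q3a)}
\end{aligned}
$$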

From FTH2 and FTH5, the join operation over PATIENT14 results in an empty relation, since PATIENT14 does not contain the join attribute DNUM.

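Query Q3a transforms into

$$
\begin{aligned}
\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}\big(\,&(\mathrm{DOCTOR11}_{Q_1}\ \mathrm{UN}\ \mathrm{DOCTOR22}_{Q_2}\ \mathrm{UN}\ \mathrm{DOCTOR33}_{Q_3})\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}} \\
&(\mathrm{PATIENT21}_{Q_4\ \mathrm{AND}\ Q_1}\ \mathrm{UN}\ \mathrm{PATIENT32}_{Q_4\ \mathrm{AND}\ Q_2}\ \mathrm{UN}\ \mathrm{PATIENT43}_{Q_4\ \mathrm{AND}\ Q_3})\,\big) \qquad \text{(Q3b)}
\end{aligned}
$$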

Using heuristics FTH2, FTH3 and FTH7, query Q3b results in Q3c, which has only three joins. All remaining joins result in empty relations because of the contradiction in the qualifications of their operands.

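$$
\begin{aligned}
\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}\big(\,&(\mathrm{DOCTOR11}_{Q_1}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT21}_{Q_4\ \mathrm{AND}\ Q_1})\ \mathrm{UN} \\
&(\mathrm{DOCTOR22}_{Q_2}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT32}_{Q_4\ \mathrm{AND}\ Q_2})\ \mathrm{UN} \\
&(\mathrm{DOCTOR33}_{Q_3}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT43}_{Q_4\ \mathrm{AND}\ Q_3})\,\big) \qquad \text{(Q3c)}
\end{aligned}
$$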

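This can be translated into the final fragmented form Q3d by distributing the projection over the union:

$$
\begin{aligned}
&\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}(\mathrm{DOCTOR11}_{Q_1}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT21}_{Q_4\ \mathrm{AND}\ Q_1})\ \mathrm{UN} \\
&\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}(\mathrm{DOCTOR22}_{Q_2}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT32}_{Q_4\ \mathrm{AND}\ Q_2})\ \mathrm{UN} \\
&\mathrm{PJ}_{\mathrm{PNUM},\mathrm{DNUM}}(\mathrm{DOCTOR33}_{Q_3}\ \mathrm{JN}_{\mathrm{DNUM}=\mathrm{DNUM}}\ \mathrm{PATIENT43}_{Q_4\ \mathrm{AND}\ Q_3}) \qquad \text{(Q3d)}
\end{aligned}
$$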
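
The reduction from nine candidate joins to three can be sketched in Python as follows. The sketch distributes a join over two unions of qualified fragments and drops every pair whose combined qualification is contradictory. It assumes qualifications are conjunctions of `attribute = value` predicates; the attribute names `DGROUP` and `PKIND` and their values are placeholders standing in for the qualifications Q1 to Q4, whose actual definitions are not repeated here.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Qualification = Dict[str, str]                 # assumed: conjunction of equalities
QualifiedFragment = Tuple[str, Qualification]  # (fragment name, qualification)


def conjoin(left: Qualification, right: Qualification) -> Optional[Qualification]:
    """Return the conjunction of two qualifications, or None if contradictory."""
    merged = dict(left)
    for attr, value in right.items():
        if attr in merged and merged[attr] != value:
            return None
        merged[attr] = value
    return merged


def distribute_join(
    left_union: List[QualifiedFragment], right_union: List[QualifiedFragment]
) -> List[Tuple[str, str]]:
    """Distribute a join over two unions, keeping only non-contradictory pairs."""
    surviving = []
    for (lname, lqual), (rname, rqual) in product(left_union, right_union):
        if conjoin(lqual, rqual) is not None:
            surviving.append((lname, rname))
    return surviving


# Placeholder qualifications: DGROUP plays the role of Q1..Q3, PKIND that of Q4.
DOCTORS = [
    ("DOCTOR11", {"DGROUP": "g1"}),
    ("DOCTOR22", {"DGROUP": "g2"}),
    ("DOCTOR33", {"DGROUP": "g3"}),
]
PATIENTS = [
    ("PATIENT21", {"PKIND": "k4", "DGROUP": "g1"}),
    ("PATIENT32", {"PKIND": "k4", "DGROUP": "g2"}),
    ("PATIENT43", {"PKIND": "k4", "DGROUP": "g3"}),
]

joins = distribute_join(DOCTORS, PATIENTS)
```

With these placeholder qualifications only the three pairs with matching `DGROUP` values remain, which corresponds to the three joins of Q3c.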

Let us consider another query

$$\mathrm{PJ}_{\mathrm{PNUM},\mathrm{AMOUNT\text{-}DUE}}(\mathrm{PATIENT}) \qquad \text{(Q4)}$$

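The canonical form of Q4 is

$$
\begin{aligned}
\mathrm{PJ}_{\mathrm{PNUM},\mathrm{AMOUNT\text{-}DUE}}\big(\,&\mathrm{PATIENT14}_{Q_5}\ \mathrm{JN}_{\mathrm{PNUM}=\mathrm{PNUM}} \\
&(\mathrm{PATIENT21}_{Q_4\ \mathrm{AND}\ Q_1}\ \mathrm{UN}\ \mathrm{PATIENT32}_{Q_4\ \mathrm{AND}\ Q_2}\ \mathrm{UN}\ \mathrm{PATIENT43}_{Q_4\ \mathrm{AND}\ Q_3})\,\big) \qquad \text{(Q4a)}
\end{aligned}
$$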

In the second operand of the join operation, the required projected attribute AMOUNT-DUE is not available. Both attributes are available in the first operand. Therefore, from heuristics FTH2 and FTH5, Q4a is translated into the fragmented form Q4b, a single projection on the fragment PATIENT14.

The fragment transformation module finds a rule which is useful for translating the query. These transformations replace the relevant subtrees with the corresponding transformed nodes. All applicable transformations are stored in a set, FR-SET. The Transformation Selection module determines which transformations of this list FR-SET are to be used to produce the least cost query.

4.3.1.3. TRANSFORMATION SELECTION MODULE:

As in the case of a rule-based conventional query optimizer, here also the fragment transformations which are applicable to the query are included in the list FR-SET. A set of applicable transformations is created for each query. This module of the processor has to select the subset of transformations that produces an optimal fragmented query. A successive refinement approach to the selection process is adopted, similar to that described by Graefe [40] for the selection of algebraic transformations in conventional query optimization. Transformations are selected in order of greatest expected cost savings. The expected cost saving is an estimate of the saving that results from applying a transformation rule. For example, while applying the FTH1 and FTH4 heuristics, some relations become empty because of a contradiction in their qualification. Those empty relations are ignored, so that processing cost as well as communication cost (if it is a non-local operation) can be saved. The cost saving estimate for each transformation can be evaluated using the cost formulas given in section 2.1, by calculating the cost of the query plan before application of the transformation and the cost of the query after the transformation heuristic.

Application of a selected transformation to the query results in a new query tree. If this new query contains a single operation on a single fragment, as in the case of query Q4b, then the translation process ends. Otherwise, the new query tree is examined again for additional transformations that may be applicable, and these additional transformations are added to FR-SET. This process is continued until no more transformations are selected from FR-SET. As shown in Figure 4.5, the translation process is organized as a feedback loop, which terminates when there are no more transformations to be performed on the query. The transformation selection process simulates a hill climbing technique for finding the minimum.
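
The feedback loop described above can be sketched in Python as a greedy, hill-climbing selection. This is only an illustration under stated assumptions: the function `select_transformations`, the callbacks `find_applicable` and `cost`, and the toy query representation (a tuple of fragment names under a union) are all hypothetical, and the cost function stands in for the cost formulas referred to in the text.

```python
from typing import Callable, List, Tuple

Query = Tuple[str, ...]                       # toy query: operands of a union
Transformation = Tuple[str, Callable[[Query], Query]]


def select_transformations(
    query: Query,
    find_applicable: Callable[[Query], List[Transformation]],
    cost: Callable[[Query], float],
) -> Tuple[Query, List[str]]:
    """Repeatedly apply the transformation with the greatest cost saving.

    Stops when FR-SET contains no transformation with a positive saving.
    """
    applied = []
    while True:
        fr_set = find_applicable(query)       # rebuilt after every change
        best_name, best_query, best_saving = None, None, 0.0
        for name, transform in fr_set:
            candidate = transform(query)
            saving = cost(query) - cost(candidate)   # cost before minus cost after
            if saving > best_saving:
                best_name, best_query, best_saving = name, candidate, saving
        if best_name is None:
            return query, applied
        applied.append(best_name)
        query = best_query


# Toy setting: fragments assumed to have been shown empty already.
KNOWN_EMPTY = {"DOCTOR11", "DOCTOR22"}


def find_applicable(query: Query) -> List[Transformation]:
    """One 'drop' transformation per empty operand still present in the union."""
    return [
        (f"drop {frag}", lambda q, frag=frag: tuple(x for x in q if x != frag))
        for frag in query
        if frag in KNOWN_EMPTY
    ]


def cost(query: Query) -> float:
    """Placeholder cost: one unit per fragment accessed."""
    return float(len(query))


final_query, applied = select_transformations(
    ("DOCTOR11", "DOCTOR22", "DOCTOR33"), find_applicable, cost
)
```

Like any hill climbing procedure, this loop can stop at a local minimum, since it only ever accepts the single best improving step available at each iteration.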

4.3.2. ALGEBRAIC-SEMANTIC QUERY OPTIMIZER:

This is similar to the rule-based optimizer given by Graefe [40]. The architecture of this optimizer is given in Figure 4.6. It is divided into a series of modules. The input to the semantic/algebraic transformation module is the fragmented query tree resulting from the translation processor. This module is an integration of semantic and conventional optimizers.

[Diagram blocks: Query, Semantic/Algebraic Transformation, Query Plan Selection, Query Answer]

Figure 4.6. Algebraic/Semantic Query Optimizer

Algebraic transformations and semantic transformations are identical in form and are similar to fragment transformations. But while applying a semantic transformation, proposed rules [98] are generated and sent to the rule-match module. This module finds a match from the rule set. If a matching rule exists, the corresponding transformation is added to the set OPEN; otherwise the transformation is ignored.

When a query enters the optimizer, all algebraic transformations are tested. Applicable transformations are added directly to OPEN. Semantic transformations are tested and then sent to the matching module. Only if a match occurs is the transformation added to the set OPEN, as in [96]. Therefore the resulting transformation set will contain both algebraic and semantic transformations. Either type of transformation is selected from OPEN based on estimated cost saving, and applied to the query tree as in the previous section. As changes are made to the tree representation, additional semantic/algebraic transformations may become applicable. These are again added to OPEN and sent to the selection module. There is feedback from the selection module to the algebraic/semantic transformation module. This process is continued until there is no transformation that can be applied to the query, or the result of the query can be obtained without accessing the database [98].

Therefore the output of the second component is either the result of the query or an optimal fragmented query plan, which can be evaluated using any simple distributed query processing algorithm.
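
The way the set OPEN is filled can be sketched in Python as follows. The sketch assumes a toy representation in which a query is a string and a proposed rule is also a string; the class `Transformation`, the function `build_open` and the sample rule text are hypothetical and serve only to show that algebraic transformations enter OPEN directly, while semantic ones must first pass the rule-match step.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Transformation:
    name: str
    applies: Callable[[str], bool]
    # None marks an algebraic transformation; a semantic transformation
    # supplies a function producing the proposed rule to be matched.
    proposed_rule: Optional[Callable[[str], str]] = None


def build_open(query: str, transformations: List[Transformation], rule_set: Set[str]) -> List[str]:
    """Collect the names of the transformations that enter OPEN for this query."""
    open_set = []
    for t in transformations:
        if not t.applies(query):
            continue
        if t.proposed_rule is None:
            open_set.append(t.name)            # algebraic: added directly
        elif t.proposed_rule(query) in rule_set:
            open_set.append(t.name)            # semantic: added only on a rule match
    return open_set


# Toy rule set and transformations (placeholders, not rules from the rule-base).
RULES = {"rule-A"}
TRANSFORMATIONS = [
    Transformation("push selection down", lambda q: "SL" in q),
    Transformation("semantic with match", lambda q: True, lambda q: "rule-A"),
    Transformation("semantic without match", lambda q: True, lambda q: "rule-B"),
]

open_set = build_open("SL(DOCTOR)", TRANSFORMATIONS, RULES)
```

Selection from OPEN by estimated cost saving would then proceed as in the previous section.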

In both components of the rule-based optimizer, the problem is when and how the heuristic transformations are applied to the query. Each heuristic is assigned a phase, or a set of phases, defining when the heuristic is active. For example, the fragment transformation that pushes a selection downwards to execute on all fragments of a global relation can be applied in all phases. But the join distribution heuristic and the corresponding qualified relational algebraic rules can be applied only when the join-over-union phase occurs. Such ordering of transformations helps reduce the number of transformation rules that must be applied to the query at any given time. The use of phases can also help to order transformations so that early changes made to the query will produce new queries to which other transformations can be applied. Transformations which are active in a given phase are applied to the query. Such ordering of transformations occurs in the EXODUS [12] optimizer.
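
A phase assignment of this kind can be sketched in Python as a simple lookup table. The phase names and the two heuristics listed are assumptions made for illustration; only the idea that one heuristic is active in all phases while another is restricted to the join-over-union phase comes from the text.

```python
from typing import Dict, FrozenSet, List

# Hypothetical phase names; the text only names the join-over-union phase.
ALL_PHASES: FrozenSet[str] = frozenset({"selection", "join_over_union", "cleanup"})

# Each heuristic is mapped to the set of phases in which it is active.
HEURISTIC_PHASES: Dict[str, FrozenSet[str]] = {
    "push selection to fragments": ALL_PHASES,
    "join distribution": frozenset({"join_over_union"}),
}


def active_heuristics(phase: str) -> List[str]:
    """Return the heuristics that may be applied in the given phase."""
    return sorted(name for name, phases in HEURISTIC_PHASES.items() if phase in phases)
```

In the `selection` phase only the selection-pushing heuristic is considered, while in the `join_over_union` phase both heuristics are candidates, so fewer rules need to be tested at any given time.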

Another way to control the application of heuristics is to use system performance statistics. That is, the system can maintain information that estimates the probability of a heuristic transformation being worthwhile. Heuristics whose estimate is sufficiently high would be tried. A similar control was demonstrated in [47], where a list of heuristic/subquery pairs was ordered according to estimated cost savings. As pairs are selected, the heuristic is applied to the subquery, and any resulting transformation is used to change the query. This process is continued until further selection of these pairs is no longer useful for query optimization.
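
One possible way to keep such statistics is sketched below in Python. The class `HeuristicStatistics`, the use of a simple success ratio as the probability estimate, and the prior value for untried heuristics are all assumptions made for illustration; the text does not prescribe how the estimate is computed.

```python
from collections import defaultdict
from typing import Iterable, List


class HeuristicStatistics:
    """Tracks how often each heuristic turned out to be worthwhile."""

    def __init__(self, prior: float = 0.5) -> None:
        self.prior = prior                    # assumed estimate for untried heuristics
        self.tried = defaultdict(int)
        self.useful = defaultdict(int)

    def record(self, heuristic: str, was_useful: bool) -> None:
        """Record the outcome of one application of a heuristic."""
        self.tried[heuristic] += 1
        if was_useful:
            self.useful[heuristic] += 1

    def estimate(self, heuristic: str) -> float:
        """Estimated probability that applying the heuristic is worthwhile."""
        tried = self.tried.get(heuristic, 0)
        if tried == 0:
            return self.prior
        return self.useful.get(heuristic, 0) / tried

    def worth_trying(self, heuristics: Iterable[str], threshold: float) -> List[str]:
        """Keep the heuristics whose estimate reaches the threshold."""
        return [h for h in heuristics if self.estimate(h) >= threshold]


stats = HeuristicStatistics()
for outcome in (True, True, True, False):
    stats.record("FTH6", outcome)
stats.record("FTH2", False)
stats.record("FTH2", False)
candidates = stats.worth_trying(["FTH2", "FTH6", "FTH7"], threshold=0.5)
```

The outcomes recorded here are invented for the example; in a running system they would come from observed optimization results.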

In this chapter a brief survey of artificial intelligence and expert systems is presented. A rule-based approach for query processing in distributed databases is proposed. The architecture of the rule-based approach for distributed query processing is presented, and the functioning of the different modules is explained using simple queries as examples.