Model Extension Alpha


with $x_k/R_k$ being a range term that becomes true if the tuple variable $x_k$ is substituted by a tuple $t_k$ in $R_k$, and false otherwise. The function $g'$ is an arbitrary logical function defined on $R_1 \times R_2 \times \cdots \times R_m$. It may include some other range terms $x_k/R_l$ or their denials $\neg x_k/R_l$, conjunctively combined with other terms. (It may also include tuple variables other than $x_1, x_2, \ldots, x_m$, which are bound in $g'$ by some means.) The function $g$ is used for selecting the ordered sets of tuples qualified for $g'$ from $R_1 \times R_2 \times \cdots \times R_m$, and is called a tuple-selecting function (TSF).

On the other hand, the function $f$ is an ordered set of $n$ functions of the tuple variables $x_1, x_2, \ldots, x_m$, where $n$ is the number of attributes in the output relation. It is used for generating a tuple in the output relation from each ordered set of tuples selected by the TSF $g$. The function $f$ is called a tuple-generating function (TGF).

For example, let $R_{emp}$ and $R_{ot}$ be two input relations. The relation $R_{emp}$ has three attributes emp-no, salary and ot-rate, while $R_{ot}$ has two attributes emp-no and overtime. A relation $R_{pay}$ with four attributes emp-no, salary, ot-charge and net-pay can be defined by the TSF

$$g(x_1, x_2) \equiv x_1/R_{emp} \wedge x_2/R_{ot} \wedge \text{emp-no}(x_1) = \text{emp-no}(x_2)$$

and the TGF

$$f(x_1, x_2) \equiv (\text{emp-no}(x_1),\ \text{salary}(x_1),\ \text{ot-charge}(x_1, x_2),\ \text{net-pay}(x_1, x_2)),$$

where

$$\text{ot-charge}(x_1, x_2) \equiv \text{ot-rate}(x_1) \times \text{overtime}(x_2),$$
$$\text{net-pay}(x_1, x_2) \equiv \text{salary}(x_1) + \text{ot-charge}(x_1, x_2),$$

if it is assumed that every tuple in $R_{emp}$ has a corresponding tuple (a tuple with the same emp-no value) in $R_{ot}$.

Let us call an operation

$$\alpha[f:g](R_1, R_2, \ldots, R_m) = \{\, f(x_1, x_2, \ldots, x_m) \mid g(x_1, x_2, \ldots, x_m) \,\}$$

an alpha operation with alpha expression $f:g$, after Codd's Alpha sublanguage. An alpha expression in the Alpha sublanguage is a pair composed of a relational calculus and a target list. We are considering a wider class of alpha expressions. The relational calculus and the target list are respectively special TSFs and TGFs. Let us next consider how these TSFs and TGFs must be defined in a formal way.

3. FUNCTION OF TUPLES

There are various ways of defining TSFs and TGFs;

Proceedings of the Tenth International Conference on Very Large Data Bases.

however, we will start by defining functions of tuples, which become the basis for describing TSFs and TGFs. Functions of tuples (FOTs) are functions

$$\phi : R_1 \times R_2 \times \cdots \times R_m \rightarrow V_\phi$$

where $R_1, R_2, \ldots, R_m$ are $m$ relations and $V_\phi$ is an arbitrary value set. The above FOT $\phi$ is said to be of span $m$. As a special case the span $m$ can be 0. FOTs are defined by FOT1 through FOT6 described below.

(FOT1) For an arbitrary constant $C$, $\phi(\,) \equiv C$ is an FOT of span 0 (with no tuple variables).

(FOT2) For an attribute $A_k$, $\phi(x) \equiv A_k(x)$ is a function of tuples of span 1 (with a single tuple variable $x$). If the attribute $A_k$ is undefined for the relation $R$ over which $x$ is ranged, then the $A_k(x)$ value is assumed to be $\Omega$ (undefined).

(FOT3) A range term $\phi(x) \equiv x/R$ is an FOT of span 1. Its value becomes true when $x$ is substituted by a tuple belonging to $R$, and false otherwise.

Given a relation instance $R$, the values of functions defined by FOT2 and FOT3 can be directly evaluated. Conversely, only the FOTs defined by FOT1 through FOT3 can be directly evaluated provided that a relation instance $R$ is given. In this sense the FOTs defined by FOT1 through FOT3 are collectively called basic FOTs (BFOTs).

Let $z$ be an ordered set of tuple variables $(x_1, x_2, \ldots, x_m)$. Let us assume that it is possible to write $\phi(x_1, x_2, \ldots, x_m)$ simply as $\phi(z)$. Also let us denote $\{x_1, x_2, \ldots, x_m\}$ by $\bar{z}$.

(FOT4) Let $\phi_k(z_k)$ $(k = 1, 2, \ldots, m)$ be an FOT whose range is a value set $V_k$, and let $V_\psi$ be an arbitrary value set. If an operator

$$\psi : V_1 \times V_2 \times \cdots \times V_m \rightarrow V_\psi$$

is defined over these value sets, then the function $\hat{\psi}(\phi_1, \phi_2, \ldots, \phi_m)$ defined by

$$\hat{\psi}(\phi_1, \phi_2, \ldots, \phi_m)(z) \equiv \psi(\phi_1(z_1), \phi_2(z_2), \ldots, \phi_m(z_m))$$

is an FOT. Here $\bar{z} = \bigcup_k \bar{z}_k$. The span of this FOT is the cardinality of $\bar{z}$.

Provided that BFOTs are typed, that is, BFOTs are classified into several groups according to their ranges, operators defined over various value sets can be extended to operators that combine FOTs to form new FOTs. Let $\Phi_k$ be the set of FOTs with the range $V_k$. Then an operator

$$\psi : V_1 \times V_2 \times \cdots \times V_m \rightarrow V_\psi$$

is extended to the operator

$$\hat{\psi} : \Phi_1 \times \Phi_2 \times \cdots \times \Phi_m \rightarrow \Phi_\psi$$

where $\Phi_\psi$ is the set of FOTs with the range $V_\psi$.
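As an illustration of FOT1 through FOT4, the following Python sketch models basic FOTs as small closures and lifts an ordinary operator on values to an operator on FOTs. It is only a sketch under stated assumptions: representing a tuple as a dictionary, using `None` for the undefined value, and the helper names `const`, `attr`, `range_term` and `lift` are all choices made here for illustration, not part of the paper.

```python
import operator

UNDEFINED = None  # assumption: stands in for the "undefined" value of FOT2


def const(c):
    """FOT1: a constant is an FOT of span 0 (no tuple variables)."""
    return lambda: c


def attr(name):
    """FOT2: an attribute is an FOT of span 1; a missing attribute is undefined."""
    return lambda x: x.get(name, UNDEFINED)


def range_term(relation):
    """FOT3: the range term x/R is true exactly when x is a tuple of R."""
    return lambda x: x in relation


def lift(op, *parts):
    """FOT4: extend an operator on values to an operator on FOTs.

    Each part is a pair (fot, positions); positions lists which arguments
    of the combined FOT are handed to that component FOT.
    """
    def combined(*z):
        args = [fot(*(z[i] for i in positions)) for fot, positions in parts]
        return op(*args)
    return combined


# The ot-charge and net-pay functions of the running example, built by lifting
# multiplication and addition over attribute FOTs.
ot_charge = lift(operator.mul, (attr("ot-rate"), (0,)), (attr("overtime"), (1,)))
net_pay = lift(operator.add, (attr("salary"), (0,)), (ot_charge, (0, 1)))

x1 = {"emp-no": 1, "salary": 1000, "ot-rate": 20}
x2 = {"emp-no": 1, "overtime": 5}
print(ot_charge(x1, x2), net_pay(x1, x2))
```

The `positions` argument plays the role of the variable subsets $z_k$: each component FOT sees only its own tuple variables, and the combined FOT ranges over their union.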

Singapore, August, 1984


Usually the same notation is used for both $\hat{\psi}$ and $\psi$. If an infix notation is used, parentheses must be introduced to specify the operator precedence.

It is not necessary to restrict attributes to those with simple value sets. An attribute can be of a structured type such as fixed or variable length arrays, sets (power set elements), graphs, records (Cartesian product elements) and repeating groups (elements of a power set of a Cartesian product). Various operators are defined in these value sets. The operators $\psi$ can be not only common operators such as the arithmetic, logical and relational operators defined in simple sets, but also various complicated operators defined in compound value sets: addition and inner product (vectors); addition, multiplication, determinant and eigenvalue (matrices); pop, push and top (stacks); enqueue and dequeue (queues); union, intersection, difference, belongs to, included in and intersect (power set members); concatenation and partial match (graphs). All these operators can be extended to the operators $\hat{\psi}$ defined in the corresponding sets of FOTs. To cope with a wider class of applications that require such complicated operators, it is not advisable to impose the first normal form transformation on the relation schema.

(FOT5) Let $\phi_1(x_1, x_2, \ldots, x_m, x_{m+1}, \ldots, x_p)$ be an FOT of the form

$$\phi_1(x_1, x_2, \ldots, x_m, x_{m+1}, \ldots, x_p) \equiv x_1/R_1 \wedge x_2/R_2 \wedge \cdots \wedge x_m/R_m \wedge \phi'(x_1, x_2, \ldots, x_m, x_{m+1}, \ldots, x_p)$$

where $\phi'$ is an FOT whose range is the truth value set (a logical FOT) and which does not contain any range terms regarding the tuple variables $x_{m+1}, x_{m+2}, \ldots, x_p$. For the tuple variables $x_1, x_2, \ldots, x_m$, $\phi'$ may contain range terms $x_k/R_l$ or their denials $\neg x_k/R_l$ conjunctively combined with other terms. Let $\phi_2(x_1, x_2, \ldots, x_m, x_{p+1}, \ldots, x_q)$ be an FOT whose range is $V$. If an aggregate function $\nabla$ is defined in the value set $V$, that is, if

[...]

then $\hat{\nabla}[\phi_1]\phi_2$ defined by

$$\hat{\nabla}[\phi_1]\phi_2(z) \equiv \nabla\{\, \phi_2(x_1, x_2, \ldots, x_m, x_{p+1}, \ldots, x_q) \mid \phi_1(x_1, x_2, \ldots, x_m, x_{m+1}, \ldots, x_p) \,\}$$

is an FOT whose range is $V$. Here $\bar{z} = \{x_{m+1}, x_{m+2}, \ldots, x_p\} \cup \{x_{p+1}, x_{p+2}, \ldots, x_q\}$.

The right hand side of the above definition means that the $\phi_2(t_1, t_2, \ldots, t_m, x_{p+1}, \ldots, x_q)$ values are aggregated over the ordered sets $(t_1, t_2, \ldots, t_m)$ of tuples qualified for $\phi_1(t_1, t_2, \ldots, t_m, x_{m+1}, \ldots, x_p)$. Either or both of $\{x_{m+1}, x_{m+2}, \ldots, x_p\}$ and $\{x_{p+1}, x_{p+2}, \ldots, x_q\}$ can be an empty set. If both sets are empty, $\bar{z} = \emptyset$ and $\hat{\nabla}[\phi_1]\phi_2$ becomes a constant function.

Let $\Phi_L$ be the set of FOTs that satisfy the conditions imposed on the $\phi_1$ defined above, and $\Phi_V$ the set of FOTs whose range is $V$. Then $\hat{\nabla}$ is an operator

$$\hat{\nabla} : \Phi_L \times \Phi_V \rightarrow \Phi_V.$$

Usually the same notation is used for representing $\nabla$ and $\hat{\nabla}$.

The tuple variables $x_1, x_2, \ldots, x_m$ are said to be bound in the scope of the aggregate operator $\hat{\nabla}$, while the tuple variables in $\bar{z}$ are said to be free in the scope of $\hat{\nabla}$.

There are various aggregate operators defined on various value sets. The aggregate operators $\Sigma$, $\Pi$, max, min, average and standard deviation are defined on the set of numbers. The aggregate operator $\Sigma$ corresponds to the arithmetic operator $+$, while $\Pi$ corresponds to $\times$. In a similar way, it is possible to introduce an aggregate operator $\bigwedge$ corresponding to $\wedge$ and $\bigvee$ corresponding to $\vee$ in the truth value set. These two operators differ from the others in that the range of $\phi_2$ is the truth value set, as is that of $\phi_1$. It is easy to see that

$$\bigwedge[x/R \wedge \phi'_1]\,\phi_2 \equiv \bigwedge[x/R]\,(\phi'_1 \rightarrow \phi_2)$$

and

$$\bigvee[x/R \wedge \phi'_1]\,\phi_2 \equiv \bigvee[x/R]\,(\phi'_1 \wedge \phi_2).$$
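The aggregate construction of FOT5 can be sketched in Python for the simplest case of a single bound tuple variable. This is a hedged illustration, not the paper's formalism: the helper name `aggregate`, the dictionary representation of tuples, and the sample data (which lets one employee have several overtime records) are assumptions made here. Iterating over `relation` plays the role of the range term $x/R$, and `phi1` stands for the remaining logical condition.

```python
def aggregate(nabla, relation, phi1, phi2):
    """Build the FOT nabla[x/R and phi1] phi2 with one bound variable x.

    The returned function takes the free tuple variables as arguments and
    aggregates phi2 over the tuples of `relation` that qualify for phi1.
    """
    def fot(*free):
        return nabla(phi2(x, *free) for x in relation if phi1(x, *free))
    return fot


# Hypothetical sample data: several overtime records per employee.
R_ot = [
    {"emp-no": 1, "overtime": 5},
    {"emp-no": 1, "overtime": 3},
    {"emp-no": 2, "overtime": 4},
]


def same_emp(x, e):
    return x["emp-no"] == e["emp-no"]


def overtime(x, e):
    return x["overtime"]


# Sum aggregate: x is bound, the employee tuple e stays free.
total_overtime = aggregate(sum, R_ot, same_emp, overtime)


def is_emp_1(x):
    return x["emp-no"] == 1


def long_shift(x):
    return x["overtime"] > 2


# Quantifiers as aggregates over truth values: no free variables remain,
# so both FOTs are constant functions.
forall = aggregate(all, R_ot, is_emp_1, long_shift)
exists = aggregate(any, R_ot, is_emp_1, long_shift)

print(total_overtime({"emp-no": 1}), forall(), exists())
```

Python's built-in `all` and `any` act here as the conjunction and disjunction aggregates, so the two identities above can be checked directly against `all((not p) or q ...)` and `any(p and q ...)` over the whole relation.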

In particular, $\bigwedge[x/R]\phi$ is written as $(\forall x/R)\phi$, while $\bigvee[x/R]\phi$ is written as $(\exists x/R)\phi$. The tuple variable $x$ is said to be quantified by the universal quantifier $\forall$ or by the existential quantifier $\exists$. Obviously quantified variables are bound variables. This is an interpretation of quantifications as aggregate functions. In a database environment, where relations are composed of a finite number of tuples, such an interpretation often becomes useful [Kobayashi 1984].

(FOT6) Only functions defined by FOT1 through FOT5 are FOTs.

Now we are ready to define TSFs and TGFs.

(TSF) A TSF is an FOT of the form

$$g(x_1, x_2, \ldots, x_m) \equiv x_1/R_1 \wedge x_2/R_2 \wedge \cdots \wedge x_m/R_m \wedge \phi(x_1, x_2, \ldots, x_m),$$

where $\phi$ is an FOT whose range is the truth value set. The tuple variables $x_1, x_2, \ldots, x_m$ are free in $\phi$, that is, they are not bound in the scope of any aggregate operators used in defining $\phi$. The FOT $\phi$ may contain range terms $x_k/R_l$ or their denials $\neg x_k/R_l$ for free variables $x_k$ if these are combined conjunctively with other terms. It may also contain tuple variables other than $x_1, x_2, \ldots, x_m$, which are bound in the scope of some aggregate operators used in defining $\phi$.

As TSFs of a special type, we may consider TSFs with no free variables. Such a TSF becomes the constant true or false according to the state of the relations over which its tuple variables are bound. Such TSFs are regarded as rules imposed on database relations. These rules may be used as integrity constraints against which database updates must be validated, or as axioms to be used in deductive question answering.

(TGF) A TGF is an FOT of the form

$$f(z) \equiv (\phi_1(z_1), \phi_2(z_2), \ldots, \phi_n(z_n)),$$

where $\phi_1, \phi_2, \ldots, \phi_n$ are FOTs.

A TGF is an ordered set of $n$ FOTs. Given an ordered set $z$ of tuples, it is used to generate a tuple with $n$ attributes.

Let $S(F_s, \Phi_s, G_s)$ be the set of TSFs generated from a set $F_s$ of BFOTs using a set $\Phi_s$ of operators according to a generative grammar $G_s$, and $G(F_g, \Phi_g, G_g)$ be the set of TGFs generated from a set $F_g$ of BFOTs using a set $\Phi_g$ of operators according to a generative grammar $G_g$. The sets $S$ and $G$ can be regarded as representing the total data processing capability possessed by a given system. In this section a very wide class of TSFs and of TGFs has been presented; however, in most cases (data models and database management systems) only TSFs and TGFs of a restricted form are dealt with.

In the information algebra [CODASYL 1962], an alpha operation is called a bundling operation. A TSF and a TGF were respectively called a bundling function and a function of bundles. As $F_s$ and $F_g$, FOTs defined by FOT1 through FOT3 were used. However, only a very limited number of operators were used, and the grammars $G_s$ and $G_g$ were defined in a very conventional manner.

As mentioned previously, in the Relational Model a TSF and a TGF were respectively called a relational calculus and a target list. As $F_s$, FOTs defined by FOT1 through FOT3 were used. However, as operators in $\Phi_s$, only six relational operators and the universal and existential quantifiers were used. The grammar $G_s$ was almost the same as that mentioned in this section. On the other hand, only FOTs defined by FOT2 were used as $F_g$, and $\Phi_g$ was an empty set. The grammar $G_g$ was accordingly a very restricted one.
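With TSFs and TGFs defined, the alpha operation $\alpha[f:g]$ can be sketched as a filter followed by a map over the Cartesian product of the input relations. The Python below is a minimal sketch under assumptions made here, not the paper's implementation: tuples are dictionaries, a relation is a list, the function name `alpha` and the sample data are hypothetical, and the range terms $x_k/R_k$ are satisfied implicitly by iterating over the relations.

```python
from itertools import product


def alpha(f, g, *relations):
    """alpha[f:g](R1, ..., Rm): apply the TGF f to every ordered set of
    tuples that the TSF g selects. Duplicates are dropped, since the
    result is a relation."""
    result = []
    for z in product(*relations):
        if g(*z):
            t = f(*z)
            if t not in result:
                result.append(t)
    return result


# Hypothetical instances of the running example's input relations.
R_emp = [
    {"emp-no": 1, "salary": 1000, "ot-rate": 20},
    {"emp-no": 2, "salary": 1200, "ot-rate": 25},
]
R_ot = [
    {"emp-no": 1, "overtime": 5},
    {"emp-no": 2, "overtime": 2},
]


def g(x1, x2):
    # TSF: the logical part of g; the range terms come from the iteration.
    return x1["emp-no"] == x2["emp-no"]


def ot_charge(x1, x2):
    return x1["ot-rate"] * x2["overtime"]


def net_pay(x1, x2):
    return x1["salary"] + ot_charge(x1, x2)


def f(x1, x2):
    # TGF: an ordered set of four FOTs, one per output attribute.
    return (x1["emp-no"], x1["salary"], ot_charge(x1, x2), net_pay(x1, x2))


R_pay = alpha(f, g, R_emp, R_ot)
print(R_pay)
```

Enumerating the full Cartesian product is the most direct reading of the definition; it is chosen here for clarity and says nothing about how an actual system would evaluate the operation.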

To accord with a wider class of applications, including advanced engineering applications, it is very desirable to deal with alpha operations with the extended alpha expressions described here.

4. IMAGINARY TUPLES

The alpha operation so far defined is very powerful for describing data processing applications in general. However, it still has a fatal disadvantage when it is used to describe traditional business data processing applications.

Let us reconsider the example presented in section 2. There were two input relations $R_{emp}$ and $R_{ot}$, for which it is assumed that every tuple in $R_{emp}$ has its corresponding tuple in $R_{ot}$, that is, for every tuple $t_1$ in $R_{emp}$ there exists one (and only one) tuple $t_2$ in $R_{ot}$ for which $\text{emp-no}(t_1) = \text{emp-no}(t_2)$. However, in practical applications there can be a tuple in $R_{emp}$ for which no corresponding tuple exists in $R_{ot}$, and there can also be a tuple in $R_{ot}$ for which no corresponding tuple exists in $R_{emp}$. (The latter may be an erroneous case.)

Selecting a tuple in $R_{emp}$ with a corresponding tuple in $R_{ot}$ can be made by specifying the TSF

$$g_1(x_1, x_2) \equiv x_1/R_{emp} \wedge x_2/R_{ot} \wedge \text{emp-no}(x_1) = \text{emp-no}(x_2).$$

The TGF for this case is

$$f_1(x_1, x_2) \equiv (\text{emp-no}(x_1),\ \text{salary}(x_1),\ \text{ot-charge}(x_1, x_2),\ \text{net-pay}(x_1, x_2),\ \ ),$$

where

$$\text{ot-charge}(x_1, x_2) \equiv \text{ot-rate}(x_1) \times \text{overtime}(x_2)$$

and

$$\text{net-pay}(x_1, x_2) \equiv \text{salary}(x_1) + \text{ot-charge}(x_1, x_2).$$

Selecting a tuple in $R_{emp}$ with no corresponding tuple in $R_{ot}$ can be made by specifying the TSF

$$g_2(x_1) \equiv x_1/R_{emp} \wedge (\forall x_2/R_{ot})\ \text{emp-no}(x_1) \neq \text{emp-no}(x_2).$$

The TGF for this case is

$$f_2(x_1) \equiv (\text{emp-no}(x_1),\ \text{salary}(x_1),\ 0,\ \text{salary}(x_1),\ \ ).$$

Finally, selecting a tuple in $R_{ot}$ with no corresponding tuple in $R_{emp}$ can be made by specifying the TSF

$$g_3(x_2) \equiv x_2/R_{ot} \wedge (\forall x_1/R_{emp})\ \text{emp-no}(x_1) \neq \text{emp-no}(x_2).$$

The TGF for this case is

$$f_3(x_2) \equiv (\text{emp-no}(x_2),\ \ ,\ \ ,\ \ ,\ \text{error}).$$

To create $R_{pay}$ properly, it is necessary to execute a program like

    begin $R_1 := \alpha[f_1:g_1](R_{emp}, R_{ot})$;
          $R_2 := \alpha[f_2:g_2](R_{emp})$;
          $R_3 := \alpha[f_3:g_3](R_{ot})$;
          $R_{pay} := \cup(R_1, R_2, R_3)$
    end;


where $\cup$ is the operation making a union of the three relations $R_1$, $R_2$ and $R_3$. The union operation itself must be described as a procedural combination of several alpha operations [Kobayashi 1983]. In general, given $n$ input relations, it is necessary to execute $2^n - 1$ alpha operations and a union operation that combines the $2^n - 1$ intermediate results.

Such a program has two major problems. First, it is not easy to describe a proper TSF and TGF for each of the $2^n - 1$ (complete or partial) matches. In fact, it is necessary to introduce one or more quantified variables. Secondly, the above program, if executed as it stands, includes many duplicate operations. It is known that, if $R_{emp}$ and $R_{ot}$ are organized in the sequence of emp-no values, the desired processing can be achieved by a single sequential collation with no redundant operations.

To resolve this difficulty it is desirable to devise some means by which the whole operation can be described (nonprocedurally) by a single alpha operation, and which also enables an easy application of sequential collation. This is achieved by introducing the imaginary tuples defined below.

An imaginary tuple $i$ to be attached to a relation $R_k$ is defined by the following three conditions:

(IT1) $g(t_1, t_2, \ldots, t_{k-1}, i, t_{k+1}, \ldots, t_m)$ becomes true if and only if

$$(\exists x_k/R_k)\ g(t_1, t_2, \ldots, t_{k-1}, x_k, t_{k+1}, \ldots, t_m)$$

is false, but for an appropriate tuple $t_k$ which is not currently contained in $R_k$,

$$g(t_1, t_2, \ldots, t_{k-1}, t_k, t_{k+1}, \ldots, t_m)$$

becomes true.

(IT2) $g(i, i, \ldots, i)$ is always false. That is, at least one tuple must be an existing tuple.

(IT3) There is an FOT $f'$ of span $m-1$ such that

$$f'(x_1, x_2, \ldots, x_{k-1}, x_{k+1}, \ldots, x_m) \equiv f(x_1, x_2, \ldots, x_{k-1}, i, x_{k+1}, \ldots, x_m).$$
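The single sequential collation that the text wants to make expressible can be sketched as a merge over two inputs sorted by emp-no. This is an illustrative Python sketch, not an algorithm given in the paper. It assumes that emp-no is unique within each relation, that tuples are dictionaries, and that the output tuples carry five components, with `None` for the blank slots and the string "error" as the last component of an unmatched overtime tuple; the function name `collate` and the sample data are hypothetical.

```python
def collate(r_emp, r_ot):
    """One pass over two lists sorted by emp-no, covering all three cases:
    match, employee without overtime, overtime without employee."""
    out, i, j = [], 0, 0
    while i < len(r_emp) or j < len(r_ot):
        e = r_emp[i] if i < len(r_emp) else None
        o = r_ot[j] if j < len(r_ot) else None
        if o is None or (e is not None and e["emp-no"] < o["emp-no"]):
            # Employee with no overtime record: ot-charge is 0.
            out.append((e["emp-no"], e["salary"], 0, e["salary"], None))
            i += 1
        elif e is None or o["emp-no"] < e["emp-no"]:
            # Overtime record with no employee: flagged as an error.
            out.append((o["emp-no"], None, None, None, "error"))
            j += 1
        else:
            # Matching pair: compute ot-charge and net-pay.
            ot = e["ot-rate"] * o["overtime"]
            out.append((e["emp-no"], e["salary"], ot, e["salary"] + ot, None))
            i += 1
            j += 1
    return out


# Hypothetical sample data, already in emp-no sequence.
R_emp = [
    {"emp-no": 1, "salary": 1000, "ot-rate": 20},
    {"emp-no": 2, "salary": 1200, "ot-rate": 25},
]
R_ot = [
    {"emp-no": 1, "overtime": 5},
    {"emp-no": 3, "overtime": 4},
]

print(collate(R_emp, R_ot))
```

Each input tuple is visited exactly once, which is the property the three-operation program with a final union lacks.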

Let us denote $R \cup \{i\}$ by $\tilde{R}$. Then the above example can be achieved by the program

    begin $R_{pay} := \alpha[f:g](\tilde{R}_{emp}, \tilde{R}_{ot})$ end;

where

$$g(x_1, x_2) \equiv x_1/\tilde{R}_{emp} \wedge x_2/\tilde{R}_{ot} \wedge \text{emp-no}(x_1) = \text{emp-no}(x_2)$$

$$f(x_1, x_2) = \begin{cases} f_1(x_1, x_2) & (\text{if } x_1 \neq i \wedge x_2 \neq i) \\ f_2(x_1) & (\text{if } x_1 \neq i \wedge x_2 = i) \\ f_3(x_2) & (\text{if } x_1 = i \wedge x_2 \neq i) \end{cases}$$

with $f_1$, $f_2$ and $f_3$ being those defined previously.

It is not necessary to add imaginary tuples to all input relations. For example, if a tuple in $R_{ot}$ always has a corresponding tuple in $R_{emp}$, then the program can be

    begin $R_{pay} := \alpha[f:g](R_{emp}, \tilde{R}_{ot})$ end;

where

$$g(x_1, x_2) \equiv x_1/R_{emp} \wedge x_2/\tilde{R}_{ot} \wedge \text{emp-no}(x_1) = \text{emp-no}(x_2)$$

$$f(x_1, x_2) = \begin{cases} f_1(x_1, x_2) & (\text{if } x_2 \neq i) \\ f_2(x_1) & (\text{if } x_2 = i) \end{cases}$$

The outer join introduced by Codd [Codd 1979] is a special alpha operation in which both of the two input relations contain an imaginary tuple.
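The single alpha operation over relations extended with an imaginary tuple can be sketched in Python as follows. This is a sketch under assumptions made here rather than the paper's definition: the sentinel `IMAGINARY`, the helpers `extend` and `make_g`, the dictionary representation of tuples and the sample data are all hypothetical, and the output tuples are read as having five components with `None` for blank slots. For an equality condition on emp-no, a matching tuple outside the relation can always be imagined, so condition IT1 reduces here to "no existing tuple matches".

```python
from itertools import product

IMAGINARY = None  # assumption: sentinel standing for the imaginary tuple i


def extend(relation):
    """Return the relation with the imaginary tuple attached."""
    return list(relation) + [IMAGINARY]


def make_g(r_emp, r_ot):
    """Build the TSF g over the extended relations."""
    def same(x1, x2):
        return x1["emp-no"] == x2["emp-no"]

    def g(x1, x2):
        if x1 is IMAGINARY and x2 is IMAGINARY:
            return False  # IT2: at least one tuple must exist
        if x2 is IMAGINARY:
            return not any(same(x1, t) for t in r_ot)   # IT1 for R_ot
        if x1 is IMAGINARY:
            return not any(same(t, x2) for t in r_emp)  # IT1 for R_emp
        return same(x1, x2)
    return g


def f(x1, x2):
    """Piecewise TGF: f3, f2 or f1 depending on which argument is imaginary."""
    if x1 is IMAGINARY:
        return (x2["emp-no"], None, None, None, "error")
    if x2 is IMAGINARY:
        return (x1["emp-no"], x1["salary"], 0, x1["salary"], None)
    ot = x1["ot-rate"] * x2["overtime"]
    return (x1["emp-no"], x1["salary"], ot, x1["salary"] + ot, None)


# Hypothetical sample data with one unmatched tuple on each side.
R_emp = [
    {"emp-no": 1, "salary": 1000, "ot-rate": 20},
    {"emp-no": 2, "salary": 1200, "ot-rate": 25},
]
R_ot = [
    {"emp-no": 1, "overtime": 5},
    {"emp-no": 3, "overtime": 4},
]

g = make_g(R_emp, R_ot)
R_pay = [f(x1, x2) for x1, x2 in product(extend(R_emp), extend(R_ot)) if g(x1, x2)]
print(R_pay)
```

Passing `R_ot` extended but `R_emp` unextended to `product` gives the second, one-sided program, which corresponds to what SQL later called a left outer join.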

5. CONCLUSION

The alpha operation (specified by a pair of a relational calculus and a target list) in the Alpha sublanguage was an outstanding data manipulation model proposed in relation to the Relational Model. However, since the alpha expression in the Alpha sublanguage was of a restricted form, it can hardly describe advanced applications that require complicated data manipulating operations. Also, since the Alpha sublanguage was designed to describe rather simple applications for casual users, it is not suitable for dealing with traditional data processing applications.

In this paper, it is shown that the first difficulty can be resolved by extending the syntax for defining alpha expressions. In particular, FOT4 and FOT5, which enable us to use the various operations already defined in various value sets when defining FOTs, are a very powerful extension for advanced applications. This policy was partially embodied in an experimental database management system, FORIMS [Kohri 1975].

The second difficulty can be resolved by introducing imaginary tuples. This function can be easily integrated into the query optimization algorithms [Kobayashi 1981]. It is possible to extend the relational algebra to make it relationally complete with respect to the extended alpha operation [Kobayashi 1983]. Also, a summary operation can be formally defined by using the aggregate operations defined by FOT5. Regarding quantifiers as aggregate operators is useful for developing an optimal mechanism for validating database updates [Kobayashi 1984].

REFERENCES

[CODASYL 1962] CODASYL Development Committee, An Information Algebra: Phase I Report, Comm. ACM, Vol.5, No.4, pp.190-204, 1962.

[Codd 1971] E.F.Codd, A Data Base Sublanguage Founded on the Relational Calculus, Proc. ACM SIGFIDET '71, pp.35-68, 1971.

[Codd 1972] E.F.Codd, Relational Completeness of Data Base Sublanguage, in R.Rustin, ed., pp.65-98, Prentice-Hall, Englewood Cliffs, New Jersey, 1972.

[Codd 1979] E.F.Codd, Extending the Data Base Relational Model to Capture More Meaning, ACM Trans. Database Systems, Vol.4, No.4, pp.397-434, 1979.

[Kobayashi 1981] I.Kobayashi, Evaluation of Queries Based on the Extended Relational Calculi, Int. Jour. Computer and Information Sciences, Vol.10, No.2, pp.63-102, 1981.

[Kobayashi 1983] I.Kobayashi, Abstract Database Operations, in T.Kitagawa, ed., Japan Annual Reviews in Electronics, Computers and Telecommunications Vol.7: Computer Science and Technologies, pp.242-261, North-Holland, Amsterdam/Ohm, Tokyo, 1983.

[Kobayashi 1984] I.Kobayashi, Validating Database Updates, Int. Jour. Information Systems, Vol.9, No.1, pp.1-17, 1984.

[Kohri 1975] K.Kohri and Y.Chiba, FORIMS Phase 2 Design Specification: A FORTRAN Oriented Information Management System, The Soken Kiyo, Vol.5, No.1, pp.177-210, Nippon Univac Sogo Kenkyusho, Inc., 1975.
