Parallel K-set mutual range-join in hypercubes
Microprocessing and Microprogramming 41 (1995) 443-448
Parallel k-set mutual range-join in hypercubes *
Hong Shen *
School of Computing and Information Technology, Griffith University, Nathan, QLD 4111, Australia
Received 1 September 1994; revised 3 March 1995; accepted 6 June 1995
Abstract
The mutual range-join of $k$ sets, $S_1, S_2, \ldots, S_k$, is the set containing all tuples $(s_1, s_2, \ldots, s_k)$ that satisfy $e_1 \le |s_i - s_j| \le e_2$ for all $1 \le i \ne j \le k$, where $s_i \in S_i$ and $e_1 \le e_2$ are fixed constants. This paper presents an efficient parallel algorithm for computing the $k$-set mutual range-join in hypercube computers. The proposed algorithm uses a fast method to determine whether the differences of all pairs of numbers among $k$ given numbers are within a given range and applies the technique of permutation-based range-join [11]. To compute the mutual range-join of $k$ sets $S_1, S_2, \ldots, S_k$ in a hypercube of $p$ processors with $O(\sum_{i=1}^{k} n_i/p)$ local memory, $p \le |S_i| = n_i$ and $1 \le i \le k$, our algorithm requires at most $O((k \log k/p)\prod_{i=1}^{k} n_i)$ data comparisons in the worst case. The algorithm is implemented in PVM and its performance is extensively evaluated on various input data.
Keywords: Data comparison; Hypercube; Parallel algorithm; Permutation; Range-join
1. Introduction
The problem of $k$-set mutual range-join is to compute a subset of the Cartesian product of $k$ given data sets in which each element (tuple) satisfies that the absolute difference between any pair of components $s_i$, $s_j$ falls into the range $[e_1, e_2]$, where $0 \le e_1 \le e_2$ are fixed constants (bounds) and $1 \le i \ne j \le k$. Like $k$-set chain range-join [14], $k$-set mutual range-join is a generalisation of the standard range-join operation raised in [11] that covers both equijoin and non-equijoin and has wide applications in database management, information processing and statistics. When $k = 2$ this problem is the standard (2-set) range-join problem. When $e_2 = 0$ (and hence $e_1 = 0$), the problem becomes $k$-set equijoin, which is the ordinary join operation in most cases [16].
Various algorithms on sequential and parallel equijoin have been proposed [1-2,4-5,7-9,15] and they are mainly based on the most effective and promising technique for equijoin, the hash-based join. As to non-equijoin, for which the hash-based join technique is ineffective, few results were known until recently [3,6,11-14]. In [11-13] we proposed permutation-based join and selection-based join as two alternative techniques for solving the problem of 2-set range-join in hypercubes. In [3] we presented a permutation-based join algorithm for 2-set range-join in $N$-dimensional torus. In [14], we described an efficient algorithm for computing the $k$-set chain range-join.
In this paper, we present a simple and efficient parallel algorithm to compute the $k$-set mutual range-join in hypercubes.
* This work was partially supported by the Australian Research Council under its Small Grants Scheme.
* Email: [email protected]
0165-6074/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved. SSDI 0165-6074(95)00018-6
2. Fast method for mass determination
Given $k$ numbers $s_1, s_2, \ldots, s_k$, and two constants $e_1 \le e_2$, we are required to determine whether or not tuple $(s_1, s_2, \ldots, s_k)$ is a mass in range $[e_1, e_2]$, i.e. whether or not $e_1 \le |s_i - s_j| \le e_2$ for all $1 \le i \ne j \le k$. A straightforward method for solving this problem is to check all different pairs of numbers, which would need $k(k-1)/2$ comparisons. However, if the components in the tuple are sorted, we find that $k$ comparisons are sufficient to complete the above job, as described in the following lemma:
Lemma 1. Let $s_1 \le s_2 \le \cdots \le s_k$ be $k$ given numbers. If $e_1 \le |s_i - s_{i+1}| \le e_2$ for $1 \le i \le k-1$ and $e_1 \le |s_k - s_1| \le e_2$ hold, then $e_1 \le |s_i - s_j| \le e_2$ is true for all $1 \le i \ne j \le k$.
Proof. Assume that the lemma is false, that is, its condition holds whereas $\exists x, y \in [1, k]$ with $x < y$ such that $(|s_x - s_y| < e_1) \lor (|s_x - s_y| > e_2)$.
If $|s_x - s_y| < e_1$, then $|s_x - s_{x+1}| \le |s_x - s_y| < e_1$ since $s_x \le s_{x+1} \le \cdots \le s_y$, which contradicts the condition of the lemma.
If $|s_x - s_y| > e_2$, then $|s_k - s_1| \ge |s_x - s_y| > e_2$ since $s_1 \le \cdots \le s_x \le \cdots \le s_y \le \cdots \le s_k$, which contradicts the condition, too.
Thus the assumption cannot hold. □
We further claim that the above $k$ comparisons are necessary to check whether tuple $(s_1, s_2, \ldots, s_k)$ is a mass in range $[e_1, e_2]$ by the following lemma:
Lemma 2. Let $s_1 \le s_2 \le \cdots \le s_k$ be $k$ given numbers. Checking whether $e_1 \le |s_i - s_j| \le e_2$ holds for all $1 \le i \ne j \le k$ requires at least $k$ comparisons.
Proof. View $s_1 \le s_2 \le \cdots \le s_k$ as $k$ vertices and add an edge between $s_i$ and $s_j$ ($i \ne j$) if a comparison is carried out between them. To obtain the relation between $s_i$ and $s_j$ with respect to the satisfaction of the inequality $e_1 \le |s_i - s_j| \le e_2$ for all $1 \le i \ne j \le k$, we require that all the $k$ vertices are connected, since otherwise there is no way to know the relation between any pair of unconnected vertices. This requires at least $k - 1$ edges.
Clearly $k - 1$ edges can connect $k$ vertices only into a tree. In order to obtain the relation (with respect to the satisfaction of the specified inequality) between every pair of non-neighbouring vertices in the tree, the tree must be an ordered tree so that the sequence $\{s_1, s_2, \ldots, s_k\}$ can be produced by a traversal scheme, and the relations of all neighbouring pairs of vertices must be cyclically transitive so as to generate the relations of non-neighbouring vertices. Achieving this by adding the fewest extra edges to the tree requires that the tree be a linear array whose elements follow the same partial order as the input (i.e. the $i$-th element is $s_i$, $1 \le i \le k$) and that one extra edge be added to connect the head ($s_1$) and tail ($s_k$) of the array, turning the array into a ring. □
For arbitrary $(s_1, s_2, \ldots, s_k)$, making the relations of all neighbouring pairs of numbers transitive is equivalent to sorting them. Hence by Lemmas 1 and 2 we have the following theorem:
Theorem 1. Let $s_1, s_2, \ldots, s_k$ be $k$ given numbers. An upper bound on the number of comparisons for checking whether $e_1 \le |s_i - s_j| \le e_2$ holds for all $1 \le i \ne j \le k$ is $O(k \log k)$ and it is tight.
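The test behind Lemma 1 and Theorem 1 can be illustrated with a minimal Python sketch. The function names `is_mass_bruteforce` and `is_mass_sorted` are hypothetical and chosen only for this illustration; the first checks all $k(k-1)/2$ pairs, while the second sorts the tuple and then performs the $k$ range tests of Lemma 1.

```python
from itertools import combinations


def is_mass_bruteforce(values, e1, e2):
    """Reference check: test every one of the k(k-1)/2 pairs."""
    return all(e1 <= abs(a - b) <= e2 for a, b in combinations(values, 2))


def is_mass_sorted(values, e1, e2):
    """Sort first, then test the k-1 adjacent pairs and the (first, last) pair."""
    s = sorted(values)  # O(k log k) comparisons
    k = len(s)
    if k < 2:
        return True  # no pair to violate the range
    # Adjacent differences are non-negative because s is sorted.
    adjacent_ok = all(e1 <= s[i + 1] - s[i] <= e2 for i in range(k - 1))
    # The cyclic pair (s_k, s_1) bounds every other difference from above.
    return adjacent_ok and e1 <= s[-1] - s[0] <= e2
```

For example, with $e_1 = 3$ and $e_2 = 6$ the tuple (7, 1, 4) is accepted by both functions, whereas (1, 4, 8) is rejected because the difference between its largest and smallest components exceeds $e_2$.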
3. Parallel algorithm for mutual range-join
We follow the conventional definition of a hypercube: it is an SIMD machine without shared memory in which each processor has local memory of size $M$ and a local disk and can perform basic arithmetic and logic operations, data comparison and data transfer to a neighbouring processor in constant time. We use $i^{(w)}$ to denote the resulting integer after inverting the $w$-th bit of integer $i$. Let data sets $S_1, S_2, \ldots, S_k$ be each distributed evenly over $p$ processors in a hypercube, where $p$ is a power of 2 and $p \le |S_i| = n_i$, and processor $P_j$ holds $S_i^j$, where $1 \le i \le k$ and $0 \le j \le p - 1$. We are required to compute the mutual range-join of $S_1, S_2, \ldots, S_k$: $T = MJ(S_1, S_2, \ldots, S_k)$. Following Lemma 1, to check whether tuple $(s_1, s_2, \ldots, s_k)$ is an element of $T$ we should first sort its components, then compare each pair of adjacent components as well as the pair of the first and last components, and finally restore the sorted tuple back to the original order if it is accepted. The sketch of our algorithm is the following:
(1) $T := S_k$;
(2) For $i = k-1, k-2, \ldots, 1$, permute $S_i$ in the hypercube such that all its subsets meet $T^u$ at processor $P_u$ to form all possible combinations of $S_i^v$ and $T^u$ for $0 \le u, v \le p - 1$. Following each step of permutation, use a proper sequential algorithm to compute the mutual range-join of $S_i^v$ permuted to processor $P_u$ and $T^u$ at $P_u$: for each pair of elements $s$ and $t$, $s \in S_i^v$ and $t \in T^u$, first insert $s$ into $t$ and then compute the differences of $s$ and its cyclical left and right neighbours in $t$; accept $t$ if both differences are within range $[e_1, e_2]$ or reject it otherwise. Thus $T^u = MJ(T^u, S_i)$ is obtained at processor $P_u$, $0 \le u \le p - 1$, after $S_i$ is fully permuted. Hence $T = \bigcup_{u=0}^{p-1} T^u = MJ(S_k, S_{k-1}, \ldots, S_i)$ is produced.
Obviously, using the above method to expand $t$ (increase the arity from 1 to $k$), we can ensure that finally, if $t$ is accepted, all its components are sorted and each pair of cyclically neighbouring components is a matching pair (whose difference is within the range $[e_1, e_2]$), and therefore by Theorem 1 $t$ is a mass in range $[e_1, e_2]$ and thus an element of $T$.
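The local computation in Step 2 can be sketched sequentially as follows. This is only an illustration on a single processor, assuming the sets are plain Python lists of numbers; the names `extend` and `mutual_range_join` and the representation of a tuple component as a `(value, set_index)` pair are assumptions made here so that an accepted tuple can be restored to its original order. The hypercube permutation itself is not modelled.

```python
from bisect import bisect_left


def extend(t, s, e1, e2):
    """Insert component s into the sorted tuple t, or return None if rejected.

    t is a non-empty tuple of (value, set_index) pairs sorted by value;
    s is a single (value, set_index) pair.
    """
    r = len(t)
    pos = bisect_left(t, s)   # binary search: about log r comparisons
    left = t[pos - 1]         # pos == 0 wraps around to the last component
    right = t[pos % r]        # pos == r wraps around to the first component
    if e1 <= abs(s[0] - left[0]) <= e2 and e1 <= abs(s[0] - right[0]) <= e2:
        return t[:pos] + (s,) + t[pos:]
    return None


def mutual_range_join(sets, e1, e2):
    """Join sets[k-1], sets[k-2], ..., sets[0] one by one, as in Steps (1)-(2)."""
    k = len(sets)
    T = [((v, k - 1),) for v in sets[-1]]          # Step (1): T := S_k
    for i in range(k - 2, -1, -1):                 # Step (2)
        next_T = []
        for t in T:
            for v in sets[i]:
                new = extend(t, (v, i), e1, e2)
                if new is not None:
                    next_T.append(new)
        T = next_T
    # Restore each accepted tuple to the original order (s_1, ..., s_k).
    return [tuple(v for v, _ in sorted(t, key=lambda c: c[1])) for t in T]
```

For instance, `mutual_range_join([[1, 10], [4, 5], [7, 20]], 3, 6)` yields the two tuples (1, 4, 7) and (10, 4, 7); every tuple containing 5 or 20 is rejected as soon as one of its cyclic neighbour differences falls outside $[3, 6]$.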
We denote the time required for a single data comparison, for a single arithmetic operation and for a single I/O operation by $t_c$, $t_a$ and $t_{IO}$, respectively. Let every element in $T^u$ be an $r$-tuple. Elaborating the second part of Step 2 yields clearly the time required for computing $MJ(T^u, S_i^v)$ at processor $P_u$ in the worst case to be

$$\frac{|T^u|\, n_i}{p}\bigl(\log r \; t_c + (2r+6)\, t_a\bigr) + \frac{(r+1)\,|T^u|\, n_i}{pM}\; t_{IO},$$
where $|S_i^v| = n_i/p$ and $M$ is the local memory size.
The first part of Step 2 can be realized by our data permutation algorithm described in [11], which shows that fully permuting $S_i$ to all processors in the hypercube can be done in $p - 1$ (parallel) steps. Let $M = O(\sum_{i=1}^{k} n_i/p)$. When $|T^u| > O(\sum_{i=1}^{k} n_i/p)$, $T^u$ is updated at $P_u$ in blocks of size $O(\sum_{i=1}^{k} n_i/p)$ one by one. Clearly $|T^u| \le (k/p)\prod_{i=1}^{k} n_i$ (in the worst case $|T| = k\prod_{i=1}^{k} n_i$, when all data match each other). Elaborating the whole Step 2 we can easily derive the total running time of our algorithm:
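$$O\!\left(\frac{k \log k \prod_{i=1}^{k} n_i}{p}\; t_c + \frac{k^2 \prod_{i=1}^{k} n_i}{p}\; t_a + \frac{k^2 \prod_{i=1}^{k} n_i}{\sum_{i=1}^{k} n_i}\; t_{IO}\right).$$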
Hence, we have the following theorem:
Theorem 2. Let sets $S_1, S_2, \ldots, S_k$ be each distributed evenly over $p$ processors in a hypercube, where $p \le |S_i| = n_i$ and processor $P_j$ holds subsets $S_i^j$, where $1 \le i \le k$ and $0 \le j \le p - 1$. The mutual range-join of $S_1, S_2, \ldots, S_k$ can be computed in the hypercube with $O(\sum_{i=1}^{k} n_i/p)$ local memory in at most $O((k \log k/p)\prod_{i=1}^{k} n_i)$ comparisons in the worst case.
Our algorithm compares favourably with the approach of simply sorting all data of every set and then joining the sets one by one, which would require $O((k^2/p)\prod_{i=1}^{k} n_i)$ comparisons in the worst case.
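To make the $p - 1$ step bound for full permutation concrete, the following Python sketch simulates one well-known schedule that uses only neighbour-to-neighbour transfers: at step $t$ all processors exchange their current block along the dimension in which the Gray codes of $t - 1$ and $t$ differ. This is an assumed illustration, not necessarily the permutation algorithm of [11]; the names `flip_bit`, `gray` and `permutation_schedule` are hypothetical. The helper `flip_bit` corresponds to the notation $i^{(w)}$, with bits numbered from 0 at the least significant position (an assumption).

```python
def flip_bit(i, w):
    """Return i with bit w inverted (the paper's i^(w); bit 0 is the lowest)."""
    return i ^ (1 << w)


def gray(t):
    """Binary reflected Gray code of t."""
    return t ^ (t >> 1)


def permutation_schedule(p):
    """Simulate p - 1 synchronous neighbour exchanges on a p-node hypercube.

    Returns seen, where seen[u] lists the origin processors of the blocks
    that visit processor u, in the order in which they arrive.
    """
    if p < 1 or p & (p - 1) != 0:
        raise ValueError("p must be a power of 2")
    held = list(range(p))             # held[u]: origin of the block now at u
    seen = [[u] for u in range(p)]
    for t in range(1, p):
        # Consecutive Gray codes differ in exactly one bit: the dimension used.
        w = (gray(t - 1) ^ gray(t)).bit_length() - 1
        # Every processor receives the block held by its neighbour across w.
        held = [held[flip_bit(u, w)] for u in range(p)]
        for u in range(p):
            seen[u].append(held[u])
    return seen
```

After step $t$, processor $u$ holds the block that started at processor $u \oplus \mathrm{gray}(t)$; since the Gray code enumerates every value from 0 to $p - 1$ exactly once, each processor meets every block within $p - 1$ exchanges.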
4. Performance evaluation
Our algorithm has been implemented in the PVM environment on an 8-processor hypercube. We have tested our algorithm extensively with various input data generated by a random number generator. The experimental results have validated our algorithm and its theoretically analysed running time, and revealed the performance of the algorithm on different input data.
Fig. 1 gives some figures based on our experimental results that depict the performance of the algorithm from different angles and show the relations between different time components. Each timing result is obtained by taking the average over those measured on 10 random inputs. For each result, we also measure its variance (on the curve 'Total variance') with the result of the worst case among the 10 tested cases.
[Figure 1: (a) Varying number of input sets (k); (b) Varying size of input sets (n).]
Fig. 1. Experimental results on execution time of the algorithm.
5. Concluding remarks
We have proposed a novel method to determine whether the differences of all pairs of numbers among $k$ given numbers are within a given range. Our method reduces the number of comparisons from $O(k^2)$ required by the usual method to $O(k \log k)$, which is proved to be optimal. Using this method and the data permutation technique described in [11], we have presented a simple and efficient parallel algorithm for solving the $k$-set mutual range-join problem in hypercube computers. The proposed algorithm has been implemented.
In the case when $|S_i| < p$ and $p \le |S_j|$ for $i = \pi_1, \pi_2, \ldots, \pi_t$ and $j = \pi_{t+1}, \pi_{t+2}, \ldots, \pi_k$, where $(\pi_1, \pi_2, \ldots, \pi_k)$ is a permutation of $\{1, 2, \ldots, k\}$, our algorithm will also work correctly using the same treatment as described in [11].
The algorithm assumes that $S_i$ is evenly distributed in the hypercube. For other typical cases of data distribution, the same solutions as described in [14] will apply.
Like $k$-set chain range-join, our algorithm joins sets one by one in arbitrary order. We have observed that the join order affects the algorithm's performance in many cases, but have not yet been able to find the optimal join order which minimises the algorithm's running time. This remains an open problem for further study.
References
[1] D. Bitton, H. Boral, D.J. DeWitt and W.K. Wilkinson, Parallel algorithms for the execution of relational database operations, ACM Trans. Database Systems 8(3) (1983) 324-353.
[2] S.D. Chen, H. Shen and R. Topor, An improved hash-based join algorithm in the presence of double skew on a hypercube computer, Proc. 17th Annual Computer Science Conf. (1994) 179-188.
[3] S.D. Chen, H. Shen and R. Topor, An efficient permutation-based parallel range-join algorithm on N-dimensional torus computers, Information Processing Letters 52(1) (1994) 35-38.
[4] V. Deshpande, P. Larson and T.P. Martin, Parallel join algorithms for nested relations on shared-memory multiprocessors, Proc. 2nd IEEE Symp. on Parallel and Distributed Processing (IEEE Comput. Soc. Press, 1990) 344-347.
[5] D.J. DeWitt and R.H. Gerber, Multiprocessor hash-based join algorithms, Proc. 11th Int. Conf. Very Large Data Bases, Stockholm (1985) 151-164.
[6] D.J. DeWitt, J.F. Naughton and D.A. Schneider, An evaluation of non-equijoin algorithms, Technical report, 1990.
[7] M. Kitsuregawa, H. Tanaka and T. Moto-oka, Application of hash to data base machine and its architecture, New Generation Computing 1(1) (1983) 63-74.
[8] H.J. Lu, K.L. Tan and M.C. Shan, Hash-based join algorithms for multiprocessor computers with shared-memory, Proc. 16th VLDB Conf., Brisbane, Australia (1990) 198-209.
[9] E. Omiecinski and E. Tien, A hash-based join algorithm for a cube-connected parallel computer, Information Processing Letters 30(5) (1989) 269-275.
[10] E. Omiecinski and E. Tien, The adaptive-hash join algorithm for a hypercube multicomputer, IEEE Trans. Parallel Distributed Syst. 3(3) (1992) 334-349.
[11] H. Shen, An efficient permutation-based parallel algorithm for range-join in hypercubes, Parallel Computing 21 (1995) 303-313.
[12] H. Shen, Selection-based parallel range-join in hypercubes, Proc. High Performance Computing Conf. '94, Singapore (1994) 160-169.
[13] H. Shen, An improved selection-based parallel range-join algorithm in hypercubes, Proc. EUROMICRO 94, Liverpool, UK (1994) 65-72.
[14] H. Shen, Efficient parallel k-set chain range-join in hypercubes, to appear in Computer J. (in press).
[15] P. Valduriez and G. Gardarin, Join and semi-join algorithms for a multiprocessor database machine, ACM Trans. Database Systems 9(1) (1984) 133-161.
[16] M.T. Özsu and P. Valduriez, Principles of Distributed Database Systems (Prentice-Hall, Englewood Cliffs, 1991).
Hong Shen is currently a Senior Lecturer at the School of Computing and Information Technology, Griffith University, Australia. He received the B.S. degree from Beijing University of Iron & Steel Technology, China, in 1982, the M.S. degree from the University of Science and Technology of China in 1987, and the Ph.Lic. and Ph.D. degrees from Åbo Akademi University, Finland, in 1990 and 1991 respectively, all in Computer Science. He became an Assistant Professor at the Department of Computer Science, Åbo Akademi University, in 1991, and joined Griffith University in 1992. His main research interests include parallel algorithms, parallel and distributed computing, and parallel computer architectures. He has published over 50 technical papers in the above areas. Dr. Shen is a member of the Association for Computing Machinery and the IEEE Computer Society.