SIGMOD 2006 PAKDD 2009 Finding k-Dominant Skylines in High Dimensional Space K-Dominant Skyline...


  • Slide 1
  • Finding k-Dominant Skylines in High Dimensional Space (SIGMOD 2006); K-Dominant Skyline Computation by Using Sort-Filtering Method (PAKDD 2009)
  • Slide 2
  • Outline: Motivation, Definition, Analysis, One-Scan, Two-Scan, Sorted Retrieval, Sort-Filtering Method, Experimental Result, Conclusion
  • Slide 3
  • Motivation: The number of skyline points may be huge in high-dimensional space. A new concept, the k-dominant skyline, is introduced to alleviate the effect of the curse of dimensionality on skyline queries in high-dimensional spaces.
  • Slide 4
  • Definition: Let S = {s1, s2, ..., sd} be a d-dimensional space. A set of points D = {p1, p2, ..., pn} is a data set on S if every pi in D is a d-dimensional data point on S. Each dimension has a total order relationship on its values; we assume > here.
  • Slide 5
  • Definition: p2 dominates p5.
  • Slide 6
  • Definition: p5 is dominated by p2. SP(D,S) = {p1,p2,p3,p4}
  • Slide 7
  • Definition: Assume k=5. p1 is better than p4 on s1, s2, s3, s5, and s6, so p1 5-dominates p4.
  • Slide 8
  • Definition: p1 cannot be 5-dominated by any other point, so p1 is a 5-dominant skyline point.
  • Slide 9
  • Analysis: Users who want more choices should pick a larger k. Since p1 5-dominates p4, p1 also 4-dominates p4.
  • Slide 10
  • Analysis: For k=5, the 5-dominant skyline points are p1, p2, p3. For k=6, the 6-dominant skyline points are p1, p2, p3, p4.
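
The definition and analysis slides can be made concrete with a small sketch. It assumes the ">" order means larger values are better, and it treats ties as "not worse" (p k-dominates q when p is at least as good on k dimensions and strictly better on at least one of them). The function names and the sample points are hypothetical and do not come from the slides.

```python
from typing import List, Sequence

Point = Sequence[float]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """True if p is >= q on at least k dimensions and > q on at least one."""
    not_worse = sum(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return not_worse >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Naive O(n^2) check: keep the points that no point k-dominates."""
    return [p for p in points if not any(k_dominates(q, p, k) for q in points)]


# Hypothetical 3-dimensional data (assumption: larger is better).
a, b, c = (3, 2, 1), (1, 3, 2), (2, 1, 3)

# With k = d = 3, k-dominance is ordinary dominance: nothing is dominated.
print(k_dominant_skyline([a, b, c], 3))

# With k = 2 the relation is cyclic (a beats c, c beats b, b beats a),
# so every point is 2-dominated by some other point.
print(k_dominant_skyline([a, b, c], 2))
```

The cyclic case shows why a smaller k shrinks the result: k-dominance is not transitive, so the k-dominant skyline can even be empty, while a larger k leaves more points for the user to choose from.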
  • Slide 11
  • One-Scan
  • Slide 12
  • One-Scan: Skyline points: P1, P2, P3, and P4.
  • Slide 15
  • One-Scan: For each point p' in R: 1. If p k-dominates p', then p' is moved from R to T. 2. If p' k-dominates p, then p is not k-dominant. After p has been compared against all points in R: if p is not k-dominated, insert p into R; if p is k-dominated, insert p into T.
  • Slide 16
  • One-Scan: k=5. p1: initially, p1 is inserted into R. T: {}, R: {p1}
  • Slide 17
  • One-Scan: p2: T: {}, R: {p1}. T is empty, so check the points in R. p2 is not 5-dominated by p1, and p1 is not 5-dominated by p2, so p2 is inserted into R. T: {}, R: {p1,p2}
  • Slide 18
  • One-Scan: p3: T: {}, R: {p1,p2}. T is empty, so check the points in R. p3 is not 5-dominated by p1 or p2, and p1 and p2 are not 5-dominated by p3, so p3 is inserted into R. T: {}, R: {p1,p2,p3}
  • Slide 19
  • One-Scan: p4: T: {}, R: {p1,p2,p3}. T is empty, so check the points in R. p4 is 5-dominated by p1, p2, and p3, so p4 is inserted into T. T: {p4}, R: {p1,p2,p3}
  • Slide 20
  • One-Scan: p5: T: {p4}, R: {p1,p2,p3}. Check the points in T: p5 does not dominate p4, and p4 does not dominate p5. Then check the points in R: p5 is 5-dominated by p2 and p3, so p5 is inserted into T. T: {p4,p5}, R: {p1,p2,p3}
  • Slide 21
  • One-Scan: p5 is dominated by p2, so p5 is not a skyline point, yet it is in T.
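
The One-Scan walk-through above can be sketched as follows. This is a simplified illustration rather than the authors' exact algorithm: it assumes larger values are better, and T here keeps every k-dominated point without any pruning, which keeps the result exact at the cost of extra comparisons. The names `one_scan` and `k_dominates` and the sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """True if p is >= q on at least k dimensions and > q on at least one."""
    not_worse = sum(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return not_worse >= k and strictly_better


def one_scan(points: List[Point], k: int) -> Tuple[List[Point], List[Point]]:
    """Return (R, T): R is the k-dominant skyline, T the k-dominated points."""
    R: List[Point] = []  # candidates that nothing seen so far k-dominates
    T: List[Point] = []  # points already known to be k-dominated
    for p in points:
        # A point in T can still k-dominate p, so T is checked first.
        dominated = any(k_dominates(t, p, k) for t in T)
        survivors: List[Point] = []
        for r in R:
            if k_dominates(r, p, k):
                dominated = True      # p is not k-dominant
            if k_dominates(p, r, k):
                T.append(r)           # r moves from R to T
            else:
                survivors.append(r)
        R = survivors
        (T if dominated else R).append(p)
    return R, T


# Hypothetical 3-dimensional data (assumption: larger is better).
a, b, c = (3, 2, 1), (1, 3, 2), (2, 1, 3)
print(one_scan([a, b, c], 3))  # all three points stay in R
print(one_scan([a, b, c], 2))  # cyclic 2-dominance: everything ends up in T
```

Keeping T matters because k-dominance is not transitive: a point that has already been k-dominated can still k-dominate a later arrival, so it cannot simply be discarded.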
  • Slide 22
  • Two-Scan: In the One-Scan algorithm, the free skyline points (i.e., T) must be maintained to compute the k-dominant skyline points. Scanning D twice avoids the need to maintain T. The first scan of D computes a set R of candidate k-dominant skyline points. Based on Lemma 4.1 p2, false positives can exist in R. The second scan, over D-R, determines whether each candidate is indeed a k-dominant skyline point.
  • Slide 23
  • Two-Scan: k=3. First scan: initially, p1 is inserted into R. R={p1}. [Table: P1 to P4 over dimensions S1 to S4]
  • Slide 24
  • Two-Scan: k=3. First scan: p2 is compared against the points in R={p1}. p2 3-dominates p1, so p1 is removed from R and p2 is inserted into R. R={p2}. [Table: P1 to P4 over dimensions S1 to S4]
  • Slide 25
  • Two-Scan: k=3. First scan: p3 is compared against the points in R={p2}. p3 is 3-dominated by p2, so R is unchanged. R={p2}. [Table: P1 to P4 over dimensions S1 to S4]
  • Slide 26
  • Two-Scan: k=3. First scan: p4 is compared against the points in R={p2}. p4 is inserted into R. R={p2,p4}. [Table: P1 to P4 over dimensions S1 to S4]
  • Slide 27
  • Two-Scan: k=3. Second scan: R={p2,p4}, D-R={p1,p3}. p1 is compared against the points in R={p2,p4}. R is unchanged: R={p2,p4}. [Table: P1 to P4 over dimensions S1 to S4]
  • Slide 28
  • Two-Scan: k=3. Second scan: R={p2,p4}, D-R={p1,p3}. p3 is compared against the points in R={p2,p4}. p3 3-dominates p4 (a false positive), so p4 is removed from R. R={p2}. 3-dominant skyline point: p2. [Table: P1 to P4 over dimensions S1 to S4]
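
The two passes described above can be sketched like this. It is a minimal illustration under the assumption that larger values are better. The names `two_scan` and `k_dominates` and the sample points are hypothetical, and the slides' own table values are not reproduced here.

```python
from typing import List, Sequence

Point = Sequence[float]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """True if p is >= q on at least k dimensions and > q on at least one."""
    not_worse = sum(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return not_worse >= k and strictly_better


def two_scan(points: List[Point], k: int) -> List[Point]:
    # First scan: build the candidate set R (stored as indices into points).
    R: List[int] = []
    for i, p in enumerate(points):
        dominated = False
        kept: List[int] = []
        for j in R:
            if k_dominates(points[j], p, k):
                dominated = True          # p cannot be a candidate
            if not k_dominates(p, points[j], k):
                kept.append(j)            # candidate j survives p
        R = kept
        if not dominated:
            R.append(i)

    # Second scan: points outside R (that is, D-R) remove false positives.
    first_scan_candidates = set(R)
    for i, p in enumerate(points):
        if i in first_scan_candidates:
            continue
        R = [j for j in R if not k_dominates(p, points[j], k)]
    return [points[j] for j in R]


# Hypothetical 3-dimensional data (assumption: larger is better).
a, b, c = (3, 2, 1), (1, 3, 2), (2, 1, 3)
print(two_scan([a, b, c], 2))  # c survives the first scan, then a removes it
print(two_scan([a, b, c], 3))
```

In the k=2 run, c is a false positive after the first scan: the only point that 2-dominates it, a, had already been dropped from R, so only the second scan over D-R catches it.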
  • Slide 29
  • Sorted Retrieval
  • Slide 30
  • Sorted Retrieval: Initially T = D. p3 and p4 are 4-dominated, so remove p3 and p4 from T.
  • Slide 31
  • Sorted Retrieval
  • Slide 32
  • Sorted Retrieval: d - k + 1 = 6 - 4 + 1 = 3. p1 is a 4-dominant skyline point and is moved from T to R.
  • Slide 33
  • Sort-Filtering Method: k-dominant skyline algorithm (calculation starts from k=d): 1. Domination Power Calculation 2. k-Dominant Checking
  • Slide 34
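  • Sort-Filtering Method, Domination Power Calculation: Example in 3D space: p(9,1,2) and q(3,2,3). p has the smaller value on two dimensions (the second and third), q on one (the first), so the domination power is p=2, q=1. sum(p)=12, sum(q)=8. sum(p) > sum(q), but the domination power of p exceeds that of q, so p 2-dominates q.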
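
The arithmetic in this example can be checked with a short sketch. The slides do not give a formal definition of domination power, so the pairwise count below is an assumption: it counts the dimensions on which one point has the smaller value, which reproduces the numbers on the slide. The function name `domination_power` is hypothetical.

```python
from typing import Sequence


def domination_power(p: Sequence[float], q: Sequence[float]) -> int:
    """Assumed pairwise definition: number of dimensions where p < q."""
    return sum(a < b for a, b in zip(p, q))


p, q = (9, 1, 2), (3, 2, 3)

print(domination_power(p, q), domination_power(q, p))  # 2 1
print(sum(p), sum(q))                                  # 12 8
```

The sum alone would rank q ahead of p, which is why the example contrasts it with the domination power: p is better on two of the three dimensions and therefore 2-dominates q.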
  • Slide 35
  • Sort-Filtering Method, Domination Power Calculation: Calculate the domination power and the sum.
  • Slide 36
  • Sort-Filtering Method, Domination Power Calculation
  • Slide 37
  • Sort-Filtering Method, k-Dominant Checking: Consider the 5-dominant skyline. N5, N3, N8, N1, and N6 are 5-dominated by the first object, N2. Remove the 5-dominated objects and output N2.
  • Slide 38
  • Experimental Result
  • Slide 39
  • Conclusion: Use domination power to find the k-dominant skyline. Choose k to reduce the number of k-dominant skyline points.