Jianzhong Qi Rui Zhang Lars Kulik Dan Lin Yuan Xue The Min-dist Location Selection Query University of Melbourne 26/03/22

Jianzhong Qi Rui Zhang Lars Kulik Dan Lin Yuan Xue

The Min-dist Location Selection Query

University of Melbourne 18/04/23

Outline

Background
Algorithms
  Sequential Scan Algorithm
  Quasi-Voronoi Cell
  Nearest Facility Circle
  Maximum NFC Distance
Experiments
Conclusions

Motivation

The min-dist location selection problem
  Problem setting: a set of facilities serving a set of clients
  If we want to set up a new facility, choose a location from a set of potential locations to minimize the average distance between the clients and their nearest facilities
Motivating applications
  Urban planning simulations: deploy public facilities
  Multiplayer online games: place players

Motivation: urban planning simulation

Modeling urban dynamics [1]

Motivation: online computer games

An online game example [2]

Problem Definition

A set of clients, C
A set of existing facilities, F
A set of potential locations, P
Select a potential location for a new facility to minimize the average distance between a client and her nearest facility
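Written as a formula, the selection above can be read as follows. The symbols $p^*$ (the selected location) and $dist$ (the distance between two points) are notation introduced here for illustration, not taken from the slide.

```latex
$$
p^{*} \;=\; \arg\min_{p \in P} \; \frac{1}{|C|} \sum_{c \in C} \; \min_{f \in F \cup \{p\}} dist(c, f)
$$
```

The inner minimum picks each client's nearest facility once the new facility at $p$ is added to the existing set $F$.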

Related Work

The min-dist optimal location problem [3]
  A set of clients C
  A set of existing facilities F
  A candidate region Q
  Compute a location in Q for a new facility to minimize the average distance between a client and her nearest facility

Related Work

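Location Optimization Problems

| Problem  | Optimization function | Solution space | Distance function | Datasets |
|----------|-----------------------|----------------|-------------------|----------|
| [4]      | Max-inf               | Continuous     | L2                | C, F     |
| [5]      | Max-inf               | Discrete       | L2                | C, F     |
| [6]      | Max-inf               | Continuous     | L1                | C, F     |
| [7]      | Max-inf               | Discrete       | L2                | C, P     |
| [8]      | Max-inf               | Discrete       | L2                | C, F, P  |
| [3]      | Min-dist              | Continuous     | L1                | C, F     |
| [9]      | Min-dist              | Continuous     | Network           | C, F, E  |
| [10]     | Min-dist              | Discrete       | L2                | C, P     |
| Proposed | Min-dist              | Discrete       | L2                | C, F, P  |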

Algorithms: Problem Redefinition

Larger distance reduction ⇒ smaller average client-facility distance

The influence set of p, IS(p)

$IS(p) = \{\, c \in C \mid dist(c, p) < dist(c, \text{c's nearest existing facility}) \,\}$

The distance reduction of p, dr(p)

$dr(p) = \sum_{c \in IS(p)} \bigl( dist(c, \text{c's nearest existing facility}) - dist(c, p) \bigr)$

[Figure: influence sets IS(p1) and IS(p2)]
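One way to see why a larger distance reduction means a smaller average distance is the short derivation below. It writes $nf(c)$ for the nearest existing facility of client $c$; that symbol is shorthand introduced here, not notation from the slide.

```latex
\begin{align*}
\sum_{c \in C} \min\{\, dist(c, nf(c)),\; dist(c, p) \,\}
  &= \sum_{c \in C} dist(c, nf(c))
     \;-\; \sum_{c \in IS(p)} \bigl( dist(c, nf(c)) - dist(c, p) \bigr) \\
  &= \sum_{c \in C} dist(c, nf(c)) \;-\; dr(p)
\end{align*}
```

The first sum does not depend on $p$, so the potential location with the largest $dr(p)$ is also the one with the smallest total (and hence average) client-facility distance.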

Algorithms: Sequential Scan

Sequential Scan Algorithm
  Sequentially check all the potential locations
  For every potential location p, sequentially check all the clients and compute IS(p) and dr(p)
  Report the one with the largest dr value
  Drawback: repeated dataset accesses
Key algorithm design considerations
  Restrict the search space for IS(p)
  Share the computation for determining the influence sets of multiple potential locations
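The sequential scan described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, assuming 2-D points stored as tuples and Euclidean distance. The function names and the sample data are hypothetical, and the slides' version reads the datasets from disk instead of from lists.

```python
import math

def nearest_facility_dist(c, facilities):
    """Distance from client c to its nearest existing facility."""
    return min(math.dist(c, f) for f in facilities)

def sequential_scan(clients, facilities, potentials):
    """Return (best potential location, its dr) by checking every client for every p."""
    best_p, best_dr = None, -1.0
    for p in potentials:
        dr = 0.0
        for c in clients:                      # one full pass over C per potential location
            d_old = nearest_facility_dist(c, facilities)
            d_new = math.dist(c, p)
            if d_new < d_old:                  # c belongs to IS(p)
                dr += d_old - d_new
        if dr > best_dr:
            best_p, best_dr = p, dr
    return best_p, best_dr

# Hypothetical toy data
clients = [(0.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
facilities = [(5.0, 0.0)]
potentials = [(0.0, 1.0), (10.0, 0.0)]
best, dr = sequential_scan(clients, facilities, potentials)
```

The nested loops make the drawback visible: the client set is scanned once per potential location, and each client's nearest facility distance is recomputed every time.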

Algorithms: Quasi-Voronoi Cell

The existing facilities surrounding a potential location constrain its search space for IS

The Quasi-Voronoi Cell (QVC) [11]
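A rough sketch of the pruning idea follows. It assumes the surrounding facilities are the nearest existing facility in each of the four quadrants around p, which the slide does not state. A client that is not closer to p than to every surrounding facility cannot be in IS(p), so only the remaining clients need a full check. The function names and data are hypothetical, and the test is applied per client here rather than through a cell region queried against an index.

```python
import math

def quadrant(p, f):
    """Index 0..3 of the quadrant around p that contains facility f."""
    return (0 if f[0] >= p[0] else 1) + (0 if f[1] >= p[1] else 2)

def surrounding_facilities(p, facilities):
    """Nearest existing facility to p in each non-empty quadrant (assumed selection rule)."""
    nearest = {}
    for f in facilities:
        q = quadrant(p, f)
        if q not in nearest or math.dist(p, f) < math.dist(p, nearest[q]):
            nearest[q] = f
    return list(nearest.values())

def in_cell(c, p, surrounding):
    """True if c is strictly closer to p than to every surrounding facility."""
    return all(math.dist(c, p) < math.dist(c, f) for f in surrounding)

# Hypothetical toy data
p = (0.0, 0.0)
facilities = [(4.0, 4.0), (-4.0, 4.0), (4.0, -4.0), (-4.0, -4.0), (9.0, 9.0)]
surrounding = surrounding_facilities(p, facilities)
clients = [(1.0, 1.0), (3.0, 3.0), (-1.0, 0.5)]
candidates = [c for c in clients if in_cell(c, p, surrounding)]
```

Clients that pass the test are only candidates: each one still has to be compared against its true nearest facility to decide membership in IS(p).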

Algorithms: Nearest Facility Circle

Constrain the search space from the clients' perspective
  Nearest facility circle of a client c, NFC(c)
  An R-tree on the NFCs
  An R-tree on the potential locations
  Synchronous traversal

$c \in IS(p) \Leftrightarrow p \in NFC(c)$
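The client-side view can be illustrated with the sketch below. It replaces the synchronous traversal of two R-trees with a plain nested loop, so it shows only the NFC membership test and the shared per-client radius, not the index-based join. A bounding-box check stands in for the rectangle an R-tree entry would hold. All names and data are hypothetical.

```python
import math

def nfc_radius(c, facilities):
    """Radius of NFC(c): distance from c to its nearest existing facility."""
    return min(math.dist(c, f) for f in facilities)

def dr_by_nfc(clients, facilities, potentials):
    """Accumulate dr(p) for every potential location in one pass over the clients."""
    dr = {p: 0.0 for p in potentials}
    for c in clients:
        r = nfc_radius(c, facilities)          # computed once per client, shared by all p
        for p in potentials:
            # Cheap bounding-box rejection before the exact circle test
            if abs(p[0] - c[0]) > r or abs(p[1] - c[1]) > r:
                continue
            d = math.dist(c, p)
            if d < r:                          # p lies inside NFC(c), so c is in IS(p)
                dr[p] += r - d
    return dr

# Hypothetical toy data
clients = [(0.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
facilities = [(5.0, 0.0)]
potentials = [(0.0, 1.0), (10.0, 0.0)]
dr = dr_by_nfc(clients, facilities, potentials)
best = max(dr, key=dr.get)
```

Compared with scanning all clients per potential location, each client's nearest facility distance is computed once and reused for all potential locations.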

Algorithms: Maximum NFC Distance

An index-reduced version of NFC
  NFC requires two R-trees to index the clients
    One for the NFCs
    The other for the clients
  Inefficient to maintain with clients coming and leaving constantly
Key insight
  Combine the two R-trees
  A single value to describe a region that encloses the NFCs of the clients in an R-tree node N
  The Maximum NFC Distance

Algorithms: Maximum NFC Distance

Maximum NFC Distance (MND)
  The largest distance between the points on the NFCs of the clients in a node and the MBR of that node

Algorithms: Maximum NFC Distance

Efficient MND Computation
  Only requires checking four points per node
  The four candidate furthest points (CFP): Iv1, Iv2, Ih1, Ih2

$MND(N) = \max \{\, dist(I, N) \mid I \in CFP(N) \,\}$
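The sketch below illustrates how a single MND value per node can be used for pruning. It does not use the four-CFP shortcut, whose construction is not given on the slide. Instead it checks every client in the node, relying on the observation that a circle centred inside the MBR sticks out furthest at one of its four axis-aligned extreme points. The function names and data are hypothetical.

```python
import math

def mbr_of(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a list of points."""
    xs, ys = [x for x, _ in points], [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def mnd(clients, radii):
    """MND of a node, computed by brute force over the NFCs of its clients."""
    xmin, ymin, xmax, ymax = mbr_of(clients)
    best = 0.0
    for (x, y), r in zip(clients, radii):
        # How far this NFC extends beyond the MBR on each of the four sides
        best = max(best, x + r - xmax, xmin - (x - r), y + r - ymax, ymin - (y - r))
    return best

def min_dist_to_mbr(p, mbr):
    """Smallest distance from point p to the rectangle mbr (0 if p is inside)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return math.hypot(dx, dy)

def may_influence(p, mbr, node_mnd):
    """False means p cannot lie in the NFC of any client in the node."""
    return min_dist_to_mbr(p, mbr) <= node_mnd

# Hypothetical node with two clients and their NFC radii
node_clients = [(0.0, 0.0), (4.0, 0.0)]
node_radii = [1.0, 3.0]
node_mbr = mbr_of(node_clients)
node_mnd = mnd(node_clients, node_radii)
```

With this value stored in the node, a potential location further than MND(N) from the node's MBR can be skipped without visiting the node's clients, so no separate R-tree on the NFCs is needed.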

Experiments: settings

Hardware: 2.66GHz Intel(R) Core(TM)2 Quad CPU, 3GB RAM
Datasets
  Synthetic datasets: Uniform, Gaussian, Zipfian
  Real datasets: populated places and cultural landmarks in the US and North America [13]
    US: |C| = 15206, |F| = 3008, |P| = 3009
    NA: |C| = 24493, |F| = 4601, |P| = 4602

| Parameter                     | Value                        |
|-------------------------------|------------------------------|
| Disk page size                | 4KB                          |
| Client set size               | 10K, 50K, 100K, 500K, 1000K  |
| Existing facility set size    | 0.1K, 0.5K, 1K, 5K, 10K      |
| Potential location set size   | 1K, 5K, 10K, 50K, 100K       |
| μ; σ² (Gaussian distribution) | 0; 0.125, 0.25, 0.5, 1, 2    |
| N; α (Zipfian distribution)   | 1000; 0.1, 0.3, 0.6, 0.9, 1.2 |

Experiments: dataset cardinality

MND is as good as NFC in running time and I/O. Both outperform SS and QVC by one order of magnitude.

Experiments: dataset cardinality

MND reduces the index size by 40% compared to NFC

Experiments: data distribution

[Figure panels: Gaussian, Real]

MND shows the best overall performance

Conclusions

A new location optimization problem
  Urban simulation
  Massively multiplayer online games
Two approaches from commonly used techniques
  Quasi-Voronoi Cell
  Nearest Facility Circle
A new approach: MND
  High efficiency
  No additional index

Reference

[1] http://www.simcenter.org
[2] http://connect.in.com/free-online-games-com/photos-540361-9095265.html
[3] D. Zhang, Y. Du, T. Xia, and Y. Tao, “Progressive computation of the min-dist optimal-location query,” in VLDB, 2006.
[4] S. Cabello, J. M. Díaz-Báñez, S. Langerman, C. Seara, and I. Ventura, “Reverse facility location problems,” in CCCG, 2005.
[5] T. Xia, D. Zhang, E. Kanoulas, and Y. Du, “On computing top-t most influential spatial sites,” in VLDB, 2005.
[6] Y. Du, D. Zhang, and T. Xia, “The optimal-location query,” in SSTD, 2005.
[7] Y. Gao, B. Zheng, G. Chen, and Q. Li, “Optimal-location-selection query processing in spatial databases,” TKDE, vol. 21, pp. 1162–1177, 2009.
[8] J. Huang, Z. Wen, J. Qi, R. Zhang, J. Chen, and Z. He, “Top-k most influential locations selection,” in CIKM, 2011.
[9] X. Xiao, B. Yao, and F. Li, “Optimal location queries in road network databases,” in ICDE, 2011.
[10] http://www.esri.com/
[11] I. Stanoi, M. Riedewald, D. Agrawal, and A. E. Abbadi, “Discovery of influence sets in frequently updated databases,” in VLDB, 2001.
[12] http://www.rtreeportal.org

Thank you!

Jianzhong Qi