
1

Efficient Placement of Geographical Data Over Broadcast Channel for Spatial Range Query Under Quadratic Cost Model

Jianting Zhang

Le Gruenwald

School of Computer Science
The University of Oklahoma
Norman, Oklahoma, 73019, USA
{jianting, ggruenwald}@ou.edu

2

Outline

• Introduction
• Related Work
• Review of the Cost Model
• The Optimization Method
– General Ideas
– The Approximation Algorithm
– An Example
• Experiments and Results
• Conclusions and Future Work Directions

3

Introduction: What is Geographical Information?

• Mailing Address:

Engineering Laboratory, Room 139

200 Felgar Street, Norman, OK 73019-6151

• Relative Direction:

4 miles southeast of Norman

• Coordinates: Longitude/Latitude

(-97.443067, 35.194425)

4

Introduction: Why Broadcasting?

Helps solve several key problems in mobile computing:

– Bandwidth: independent of the number of users; excellent scalability

– Power consumption: listen/sleep mode consumes less power than send mode

– Mobility: no mobility management is required at either the server side or the client side

5

Introduction: Types of Broadcasting

– Pull based
• Needs an explicit client request
• Only requested data are broadcast
• Needs frequent scheduling

– Push based
• Schedules the broadcast sequence without explicit requests
• Needs prior knowledge for efficient sequencing
• Suitable for pushing data to a large number of users

6

Introduction: Why GI Broadcasting?

Public information:
– Service locations: ATMs, restaurants
– Traffic and road conditions
– Weather information

• Large number of potential GI users (a metropolitan area, for example)
• Relatively static, with low update frequency
• Mostly read-only
• Privacy is not a big concern
• Distributed in nature

7

Introduction: Disk Access vs. Air Access

8

Introduction: Parameters in Broadcast System

Access Time (Latency), AT:
– The duration from the moment the broadcast channel is first accessed to the moment all requested data have been retrieved
– The user may switch to sleep mode between periods of active downloading

9

Introduction: Parameters in Broadcast System

Tune-in Time, TT:
– The time spent downloading data from the broadcast sequence
– Mobile hosts are in active (listen) mode during this time

10

Introduction: Research Objectives, A Big Picture

11

Introduction: Broadcast Scheme Under Consideration

Index and data use separate channels

12

Related Work

• General Data Broadcasting:
– Tree-indexing, Hashing, Signature, Hybrid, etc.
– Suitable for one-dimensional and/or categorical data
– Allows only one data item per access
– Focuses on the trade-off between TT and AT using replication

• Object-Oriented/Relational Database Broadcasting:
– Allows multiple data items per access
– Assumes data access follows a predefined order

13

Related Work

• Geographical data is multi-dimensional and continuous.

• A spatial range query result set may contain multiple data items, and these items may not have a predefined order.

• Existing broadcast techniques cannot be applied to geographical data broadcast for efficient spatial range query processing.

14

Cost Model: Brief Review

Assumption 1: It takes unit time to broadcast a single data item. Thus the difference in position (D) between two data items in a broadcast sequence can be used to measure the data access time.

Assumption 2: Data and index are each broadcast exactly once per broadcast cycle. Replication is not considered in this study.

Assumption 3: The number of range queries (M) requested within a region is proportional to its area (A): M = c*A.

The cost is measured by A*D.

15

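Cost Model: Brief Review

[Figure: points P1 and P2 with regions A1 and A12; extents labeled qx/2 and qy/2]

$$
\mathrm{Cost} = \sum_{1 \le i < j \le n} w_{i,j}\,|\pi(i) - \pi(j)|
 + \sum_{1 \le i < j < k \le n} w_{i,j,k}\,[\max(\pi(i),\pi(j),\pi(k)) - \min(\pi(i),\pi(j),\pi(k))]
 + \dots
 + w_{1,2,\dots,n}\,[\max(\pi(1),\pi(2),\dots,\pi(n)) - \min(\pi(1),\pi(2),\dots,\pi(n))]
$$

$$
w_{i,j} = \sum_{(q_x,q_y) \in Q} \tilde{A}_{i,j}^{(q_x,q_y)}, \qquad
w_{i,j,k} = \sum_{(q_x,q_y) \in Q} \tilde{A}_{i,j,k}^{(q_x,q_y)}, \qquad \dots, \qquad
w_{1,2,\dots,n} = \sum_{(q_x,q_y) \in Q} \tilde{A}_{1,2,\dots,n}^{(q_x,q_y)}
$$

$$
\tilde{A}_{i} = A_{i} - \bigcup_{j=1,\, j \ne i}^{n} A_{i,j}, \qquad
\tilde{A}_{i,j} = A_{i,j} - \bigcup_{k=1,\, k \ne i,\, k \ne j}^{n} A_{i,j,k}, \qquad \dots, \qquad
\tilde{A}_{1,2,\dots,n} = A_{1,2,\dots,n}
$$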
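As a worked instance, the cost expression can be written out for n = 3 data items. This is a direct expansion using only the slide's own symbols (w for the weights, π for the position of an item in the broadcast sequence); the choice of n = 3 is just for illustration.

```latex
\mathrm{Cost}_{n=3} = w_{1,2}\,|\pi(1)-\pi(2)| + w_{1,3}\,|\pi(1)-\pi(3)| + w_{2,3}\,|\pi(2)-\pi(3)|
  + w_{1,2,3}\,[\max(\pi(1),\pi(2),\pi(3)) - \min(\pi(1),\pi(2),\pi(3))]

% For the ordering \pi(1)=1, \pi(2)=2, \pi(3)=3 this evaluates to
\mathrm{Cost}_{n=3} = w_{1,2} + 2\,w_{1,3} + w_{2,3} + 2\,w_{1,2,3}
```

Each weight is multiplied by the span of positions that a query touching exactly those items has to cover.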

16

Optimization: General Ideas

Requirements:

• Low cost: near-real-time scheduling (sequencing 100-10,000 nodes in 0-5 minutes)

• Trade-off between greedy and non-greedy methods (n! possible orderings)

MinLA algorithm of (Bar-Yehuda, 2001):
• Divide-and-conquer strategy
• Space complexity: $O(2^{depth(T)})$, which is $O(n)$ for a balanced tree T
• Time complexity: $O(\sum_{t \in T} 2^{depth(t)})$, which is $O(n^2)$ for a balanced tree T
• Examines $2^{n-1}$ orderings in $O(n^2)$ time

17


Graph Minimum Linear Arrangement (MinLA) problem

$$ la(G) = \sum_{(u,v) \in E} w(u,v)\,|\pi(u) - \pi(v)| $$

$$ \max\{\pi(n_1), \pi(n_2), \dots, \pi(n_k)\} - \min\{\pi(n_1), \pi(n_2), \dots, \pi(n_k)\} $$

Quadratic:

$$ g(L_2) = \frac{1}{L}\left[L^2 - \frac{(L - L_2)(L - L_2 - 1)}{2}\right] $$
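The two linear quantities on this slide, la(G) for pairwise edges and the max-minus-min span for a group of k nodes, can be evaluated directly for a given ordering. The following is a minimal sketch; the function names and the three-node graph `EDGES` are illustrative assumptions, not part of the deck.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple

def positions(ordering: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Map each node to its slot pi(node) in the broadcast sequence."""
    return {node: slot for slot, node in enumerate(ordering)}

def la(edges: Dict[Tuple[Hashable, Hashable], float],
       ordering: Sequence[Hashable]) -> float:
    """Linear arrangement cost: sum of w(u, v) * |pi(u) - pi(v)| over all edges."""
    pos = positions(ordering)
    return sum(w * abs(pos[u] - pos[v]) for (u, v), w in edges.items())

def span(nodes: Iterable[Hashable], ordering: Sequence[Hashable]) -> int:
    """max(pi) - min(pi) over a group of nodes retrieved by one query."""
    pos = positions(ordering)
    slots = [pos[n] for n in nodes]
    return max(slots) - min(slots)

# Hypothetical weighted graph, used only to exercise the functions.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2}

cost_abc = la(EDGES, ["a", "b", "c"])    # 3*1 + 1*1 + 2*2 = 8
cost_bac = la(EDGES, ["b", "a", "c"])    # 3*1 + 1*2 + 2*1 = 7
span_ac = span(["a", "c"], ["a", "b", "c"])  # 2
```

For a two-node group the span equals |π(u) - π(v)|, so the pairwise term is the special case k = 2 of the span.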

18


Observation: there is a monotonic relationship between L2 and g(L2).

Motivation: use L2 to approximate g(L2).

Assumption: an ordering optimized under la(G), which is linear in L2, is also a good ordering under the quadratic cost model in L2.

$$ g(L_2) = \frac{1}{L}\left[L^2 - \frac{(L - L_2)(L - L_2 - 1)}{2}\right] $$
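One way to see why access time grows monotonically with the span L2 is to enumerate every tune-in slot of a broadcast cycle of length L. The sketch below is not taken from the deck; it rests on three stated assumptions: the client tunes in at a uniformly random slot, a client that tunes in part-way through the query's span is charged one full cycle, and downloading the last item takes one slot. The names `span` (for L2) and `cycle` (for L) are illustrative.

```python
def access_time_total(span: int, cycle: int) -> int:
    """Sum of access times over all `cycle` tune-in slots (assumed model)."""
    if not 0 <= span < cycle:
        raise ValueError("span must satisfy 0 <= span < cycle")
    total = 0
    for t in range(cycle):  # t = slots elapsed since the query's first item began
        if t == 0:
            total += span + 1                  # caught the first item; read through the last
        elif t <= span:
            total += cycle                     # tuned in mid-span; charged a full cycle
        else:
            total += (cycle - t) + span + 1    # wait for the next cycle, then read the span
    return total

def closed_form_total(span: int, cycle: int) -> int:
    """Closed form that the enumeration above reduces to under the same assumptions."""
    gap = cycle - span
    return cycle * cycle - gap * (gap - 1) // 2

def expected_access_time(span: int, cycle: int) -> float:
    """Average access time over a uniformly random tune-in slot."""
    return access_time_total(span, cycle) / cycle

curve = [expected_access_time(s, 100) for s in range(100)]
```

Under these assumptions the average is quadratic in the span and strictly increasing, which is the property the approximation relies on: minimizing the span also lowers the expected access time.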

19

Optimization: The Approximation Algorithm

[Figure: a BDT whose nodes are labeled 0 through 11]

20


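A BDT:

• A binary tree that has all the nodes of a graph as its leaf nodes

• Each internal node has two possible orientations: 0-orientation and 1-orientation

• The number of possible orderings is $2^{n-1}$ if the BDT is full and balanced

[Figure: the two orientations of a BDT node, labeled 1-orientation (+) and 0-orientation (-), with child orders LR and RL]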

21


The algorithm:

• Starts at the root of the BDT and recursively computes the costs of the two possible orientations of its two sub-trees.

• Keeps the orientation that has the lower cost and discards the one that has the higher cost.

• The orientations computed at each intermediate node of the BDT form an orientation tree that has the same structure as the BDT.

• The orientation tree determines an ordering of all the nodes in the graph.
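The search space described above can be made concrete with a brute-force sketch that enumerates every orientation of a small BDT and keeps the cheapest ordering. This is deliberately not the O(n^2) algorithm from the deck: it visits all 2^(n-1) orientations explicitly and is only practical for tiny trees. The nested-tuple tree encoding, the function names and the example weights are all assumptions made for illustration.

```python
from itertools import product

def count_internal(tree) -> int:
    """Number of internal nodes; leaves are any non-tuple labels."""
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    return 1 + count_internal(left) + count_internal(right)

def ordering(tree, flips):
    """Leaf order for one orientation; `flips` yields one 0/1 per internal node (preorder)."""
    if not isinstance(tree, tuple):
        return [tree]
    flip = next(flips)
    left, right = tree
    left_part = ordering(left, flips)
    right_part = ordering(right, flips)
    return right_part + left_part if flip else left_part + right_part

def all_orderings(tree):
    """Yield the leaf ordering induced by every combination of orientations."""
    for bits in product((0, 1), repeat=count_internal(tree)):
        yield ordering(tree, iter(bits))

def best_ordering(tree, cost):
    """Cheapest ordering reachable by re-orienting the tree."""
    return min(all_orderings(tree), key=cost)

# Hypothetical 4-leaf BDT and pairwise weights.
TREE = ((0, 1), (2, 3))
EDGES = {(0, 3): 5, (1, 2): 1, (0, 1): 1, (2, 3): 1}

def la(order):
    pos = {node: slot for slot, node in enumerate(order)}
    return sum(w * abs(pos[u] - pos[v]) for (u, v), w in EDGES.items())

candidates = list(all_orderings(TREE))
best = best_ordering(TREE, la)
```

With four leaves there are three internal nodes, hence eight candidate orderings rather than the 24 permutations of an unconstrained search.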

22


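$$
\mathrm{Cost}_{V(t),L,R}[\pi] = \sum_{(u,v) \in E}
\begin{cases}
w(u,v)\,|\pi(u) - \pi(v)| & u \in V(t) \land v \in V(t) \\
w(u,v)\,\pi(u) & u \in V(t) \land v \in L \\
w(u,v)\,(|V(t)| - \pi(u)) & u \in V(t) \land v \in R \\
0 & \text{otherwise}
\end{cases}
$$

• In-cut: edges with both endpoints in V(t)

• Left_cut: edges between V(t) and L

• Right_cut: edges between V(t) and R

[Figure: the node set V(t) shown with the node sets L and R]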
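The local cost of a sub-tree can be sketched as a function of the ordering inside V(t) and of which outside nodes lie to its left or right. Two points here are assumptions rather than statements from the deck: positions inside V(t) are taken as 1-based, and a weight attached to a group of more than two nodes is charged over the span of that group, clipped at the boundary of V(t). With these choices the sketch reproduces the values 99 and 142 that the deck's worked example reports for T11; the weights are copied from that example.

```python
def local_cost(order, left, right, weights):
    """Cost contributed inside V(t), given node sets placed to its left and right.

    order   -- nodes of V(t), left to right
    left    -- set of nodes placed before V(t)
    right   -- set of nodes placed after V(t)
    weights -- {tuple_of_nodes: weight}; pairs are ordinary graph edges
    """
    pos = {node: i + 1 for i, node in enumerate(order)}  # assumed 1-based positions
    size = len(order)
    total = 0
    for group, w in weights.items():
        inside = [pos[n] for n in group if n in pos]
        if not inside:
            continue                     # "otherwise" case: contributes 0
        lo, hi = min(inside), max(inside)
        if any(n in left for n in group):
            lo = 0                       # group reaches past the left boundary
        if any(n in right for n in group):
            hi = size                    # group reaches past the right boundary
        total += w * (hi - lo)
    return total

# Weights from the deck's worked example (T11 = {1, 2}, T12 = {3, 4}).
WEIGHTS = {(1, 2): 22, (3, 4): 8, (1, 3): 2, (2, 3): 38,
           (2, 4): 3, (1, 2, 3): 14, (2, 3, 4): 4}

cost_21 = local_cost([2, 1], left={3, 4}, right=set(), weights=WEIGHTS)  # 99
cost_12 = local_cost([1, 2], left={3, 4}, right=set(), weights=WEIGHTS)  # 142
```

For plain pairwise edges the three branches reduce to the three non-zero cases of the formula: the distance between both endpoints, π(u) for an edge leaving to the left, and |V(t)| - π(u) for an edge leaving to the right.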

23


Positions are used implicitly when computing the cost (the access time in our application), which is very efficient.

24

Optimization: An Example

[Figure: BDT with root T0 and sub-trees T11 (leaves 1, 2) and T12 (leaves 3, 4); regions labeled $\tilde{A}_i$, $\tilde{A}_{i,j}$, $\tilde{A}_{i,j,k}$]

In_cuts:

T11: {1,2}=22
T12: {3,4}=8

T0: {1,3}=2, {2,3}=38, {2,4}=3, {1,2,3}=14, {2,3,4}=4, total=61

25

Optimization: An Example (DBW)

[Figure: T11 in 1-orientation, ordering 3 4 2 1, with cut weights annotated]

Left_cut(1) = 2+14+22 = 38
Right_cut(1) = 0
Cost(1) = Left_cut(1) = 38

Left_cut(2) = 38+3+4 = 45
Right_cut(2) = 22
Cost(2) = Left_cut(2) = 45

Left_cut(T11) = 45+38-22 = 61
Right_cut(T11) = 22+0-22 = 0
Cost(T11) = 45+38+(22-22)*1+(38-22)*1 = 99

26

Optimization: An Example (DBW)

[Figure: T11 in 0-orientation, ordering 3 4 1 2, with cut weights annotated]

Left_cut(1) = 2
Right_cut(1) = 22
Cost(1) = Left_cut(1) = 2

Left_cut(2) = 38+4+3+22+14 = 81
Right_cut(2) = 0
Cost(2) = Left_cut(2) = 81

Left_cut(T11) = 2+81-22 = 61
Right_cut(T11) = 22+0-22 = 0
Cost(T11) = 2+81+(22-22)*1+(81-22)*1 = 142

27


T11: 1-orientation 99, 0-orientation 142

T0: 1-orientation 114, ordering [4,3,2,1]

T0: 0-orientation 114, ordering [1,2,3,4]

Best among all 4! = 24 possible orderings
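The claim that 114 is the best of all 24 orderings can be checked exhaustively. The sketch below uses the weights listed on the example slide and charges each weight over the span (max position minus min position) of its group; the function and variable names are illustrative.

```python
from itertools import permutations

# Weights from the example slide, keyed by the group of items they apply to.
WEIGHTS = {(1, 2): 22, (3, 4): 8, (1, 3): 2, (2, 3): 38,
           (2, 4): 3, (1, 2, 3): 14, (2, 3, 4): 4}

def span_cost(order, weights=WEIGHTS):
    """Sum over groups of weight * (max position - min position)."""
    pos = {node: slot for slot, node in enumerate(order)}
    return sum(w * (max(pos[n] for n in group) - min(pos[n] for n in group))
               for group, w in weights.items())

costs = {perm: span_cost(perm) for perm in permutations((1, 2, 3, 4))}
best = min(costs.values())
winners = sorted(perm for perm, c in costs.items() if c == best)
# best == 114, winners == [(1, 2, 3, 4), (4, 3, 2, 1)]
```

The two winning orderings are mirror images of each other, which matches the two T0 orientations reported above.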

28


How good is the approximation?

Answer from the example:

29

Experiments and Results: Generating Data Sets

• Five synthetic point data sets: 100/200/300/400/500 data points

• Data space: [0,1) × [0,1)

• Query window size: (0.1, 0.1)
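A setup of this shape can be sketched in a few lines. The deck does not state how the points are distributed or how query windows are positioned, so the uniform distribution, the fixed seeds and the window centered on a given location are all assumptions of this sketch, as are the function names.

```python
import random

def make_points(n, seed=0):
    """n points drawn uniformly from [0,1) x [0,1) (uniformity is an assumption)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]

def range_query(points, cx, cy, qx=0.1, qy=0.1):
    """Indices of points inside a qx-by-qy window centered at (cx, cy)."""
    return [i for i, (x, y) in enumerate(points)
            if abs(x - cx) <= qx / 2 and abs(y - cy) <= qy / 2]

# Five data sets with the sizes listed on the slide.
datasets = {n: make_points(n, seed=n) for n in (100, 200, 300, 400, 500)}

sample = [(0.5, 0.5), (0.56, 0.5), (0.54, 0.46)]
hits = range_query(sample, 0.5, 0.5)   # [0, 2]
```

The result set of each range query is the group of items whose positions in the broadcast sequence determine that query's span.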

30

Experiments and Results: Ordering Heuristics

• Hilbert Space Filling Curve Ordering

• R-Tree Traversal Ordering

[Figure: example ordering of six items: 1 4 3 6 2 5]
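The Hilbert heuristic can be sketched by mapping each point to a grid cell, computing the cell's index along the Hilbert curve, and sorting by that index. The index computation below is the standard iterative bit-manipulation method; the grid resolution `bits` and the function names are assumptions of this sketch, since the deck does not say how its Hilbert ordering was implemented.

```python
def hilbert_index(bits, x, y):
    """Index of grid cell (x, y) along a Hilbert curve on a 2**bits by 2**bits grid."""
    n = 1 << bits
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/reflect the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def hilbert_order(points, bits=8):
    """Indices of points in [0,1) x [0,1), sorted along the Hilbert curve."""
    n = 1 << bits

    def key(i):
        px, py = points[i]
        return hilbert_index(bits, min(int(px * n), n - 1), min(int(py * n), n - 1))

    return sorted(range(len(points)), key=key)

order = hilbert_order([(0.1, 0.1), (0.9, 0.1), (0.1, 0.9)], bits=4)  # [0, 2, 1]
```

Consecutive cells along the curve are always grid neighbors, which is why points that are close in space tend to end up close in the broadcast sequence.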

31

Experiments and Results: Comparison of Random Orderings

32

Experiments and Results: Comparison of Two Heuristics

33

Experiments and Results: Optimization of R-Tree Ordering

34

Conclusions

• We observe the structural similarity between the quadratic cost model we previously developed and the MinLA problem, and the monotonic relationship, for a single query, between the cost in terms of DPW+DBW and the DBW.

• We propose using the access time of DBW to approximate the access time of DPW+DBW, which converts the optimization problem under the quadratic cost model into a MinLA optimization problem.

• Experiments on the five synthetic data sets showed that, under the quadratic cost model, the optimized ordering is 21%-32% better than the average of 1000 random orderings.

35

Future Work

• Extend our cost models to handle access time to both the data channel and the index channel

• Explore more ordering heuristics as well as exact and/or approximate optimization methods

• Perform more experiments using both synthetic and real data sets of different sizes, distributions and densities to examine the effectiveness and scalability of the optimization methods

36

Thanks!

Questions?
