Data Structures for Range Minimum Queries in ... · Title: Data Structures for Range Minimum...

Post on 27-Sep-2020

6 views 0 download

Transcript of Data Structures for Range Minimum Queries in ... · Title: Data Structures for Range Minimum...

Data Structures for Range Minimum Queries

in Multidimensional Arrays

Hao Yuan Mikhail J. Atallah

Purdue University

January 17, 2010

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 1 / 28

Outline

Introduction

Results

OverviewDetails

Step 1: Comparison-Efficient Data StructuresStep 2: Random Access Machine Implementation

Future Work

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 2 / 28

Definitions

Given a d-dimensional array A with N entries, a Range Minimum Query(RMQ) asks the minimum element in the query rangeq = [a1, b1]× [a2, b2]× · · · × [ad , bd ], i.e.,

RMQ(A, q) = minA[q] = min(k1,...,kd )∈q

A[k1, . . . , kd ].

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 3 / 28

Applications

String Pattern Matching: 1D RMQ and its related Least CommonAncestor (LCA) problems are fundamental building blocks in suffixtrees/arrays

Computational Biology: Finding min/max number in an alignmenttableau (genome sequence analysis)

Image Processing: Finding the lightest/darkest point in a range(Dilate/Erode Filter)

Databases: Range Min/Max Query in OLAP Data Cube

Example

Select the highest paid employee whose age is between 30 and 40 andjoined the company during the period between 1995 and 2005

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 4 / 28

Previous Work - 1D RMQ

1D Range Minimum Query

Linear Reduction to Least Common Ancestor (LCA) Problem[Gabow, Bentley and Tarjan 1984]

LCA: O(N) Preprocessing, O(1) Querying[Harel and Tarjan 1984]

RMQ & LCA: Much Studied(Parallelization, Simplification, Distributed Algorithms, etc)[Schieber and Vishkin 1988][Bender and Farach-Colton 2000][Alstrup et al. 2002]

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 5 / 28

Previous Work - Semigroup Model

Related to the semi-group sum problem (MIN is a semi-group operator)Data Structures: O(M) preprocessing time and space (M ≥ N),O(α(M,N)) querying time

One Dimensional: [Yao 1982], [Alon and Schieber 1987]

Multidimensional (fixed d): [Chazelle and Rosenberg 1989]

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 6 / 28

Multidimensional RMQ

Unit-Cost RAM Model:O(1) cost for: Read/Write Memory, +,−, ∗, /, <<, >>

Comparison-Based: Array entries can only be compared

Table: Results for d-dimensional RMQ (d is fixed). The O(·) is omitted.

Preprocess Time Space Querying Time

Gabow et al. 1984 N logd−1 N N logd−1 N logd−1 N

Chazelle and Rosenberg 1989 M M αd(M,N)

Poon 2003 N(log∗ N)d N 1

Amir et al. 2007 (d = 2) N log[k+1] N kN 1

Our result N N 1

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 7 / 28

Overview

General Approach

Design Comparison-Efficient Algorithm:Only comparisons between input array entries are counted

Implement the Algorithm in RAM:All the computations are counted

Example: Minimum Spanning Tree Verification[Komlos 1984] [Dixon, Rauch and Tarjan 1992]

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 8 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Overview

Following the general approach:

Comparison-Efficient Data StructuresNew 1D RMQ

Preliminary: O(N log N)-comparison preprocessingand 1-comparison queryingSpeedup the preprocessing to O(N) comparisons

New data structure generalizes to two or higher dimensional cases

Preprocessing: O(N) comparisonsQuerying: O(1) comparisons

RAM Implementations

Micro blocks of size ε log NSolve big size query by well-known algorithmsSolve small size query by table lookup

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 9 / 28

Multidimensional RMQ

If only count comparisons:

2D RMQ Lower Bound: [Demaine, Landau and Weimann, 2009]If NO COMPARISON is allowed at the query stage, then Ω(N log N)comparisons preprocessing is required

Our Result: O(2.89d(d + 1)!N) comparisons preprocessing,2d − 1 comparisons querying

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 10 / 28

Canonical Ranges

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

x

CR(x) = [5, 8]

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 11 / 28

Pre-Computations

For each p ∈ CR(x), define

LeftMin(x , p) = mink∈CR(x) and k≤p

A[k]

RightMin(x , p) = mink∈CR(x) and k≥p

A[k]

5 6 7 8

x

LeftMin(x, 7) = min A[5], A[6], A[7]

RightMin(x, 7) = min A[7], A[8]

p

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 12 / 28

Query

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

x y

RightMin(x, 6) LeftMin(y, 14)

min A[6..14]LCA(6, 14)

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 13 / 28

Pre-Computations

Naıve Algorithm: O(N log N) Comparisons

Faster Approach

Sort the canonical ranges by their lengthsCompute the LeftMin and RightMin entries for canonical ranges inthe sorted order

For length-one canonical range CR(w),LeftMin(w , p) = RightMin(w , p) = A[p] (p ∈ CR(w))For a canonical range CR(w) covering more than one position,compute the LeftMin and RightMin arrays in O(log |CR(w)|) time(instead of O(|CR(w)|))

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 14 / 28

Pre-Computations

Case 1: p ∈ CR(x)

LeftMin(w, p) = LeftMin(x, p)

x y

w

p

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 15 / 28

Pre-Computations

Case 2: p ∈ CR(y)

x y

LeftMin(w, p) = min min CR(x), LeftMin(y,p)

w

p

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 16 / 28

Pre-Computations

Case 2: p ∈ CR(y)

x y

LeftMin(w, p) = min min CR(x), LeftMin(y,p)

w

p

Monotonicity (Non-Increasing): LeftMin(y , p) ≥ LeftMin(y , p + 1)

Binary Search!

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 17 / 28

Binary Search

Example

x y

min CR(x) = 40

w

p

LeftMin(y, ...) = 90, 70, 50, 20, 10

LeftMin(w, ...) = 40, 40, 40, 20, 10

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 18 / 28

Comparison Complexity

T (n): the number of comparisons to compute the LeftMin and RightMinentries for canonical ranges whose size is at most n

T (1) = 0

T (n) = 2T(n

2

)+ O(log n) for n ≥ 2

We have the preprocessing comparison complexity

T (n) = O(n),

and need to do 1 comparison at the query stage.

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 19 / 28

2D Case

2D Canonical Range: Cartesian Product of Two 1D Canonical Ranges

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

12

34

56

78

910

1112

1314

1516

x

y 2D Canonical Range

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 20 / 28

2D Pre-Computations

For each 2D canonical range r and a point p ∈ r , compute the 4“Dominance Min” array entries

BotLeftMin(r, p)

pTopLeftMin(r, p) TopRightMin(r, p)

BotRightMin(r, p)

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 21 / 28

2D Query

For any query range q, we can always divide it into 4 parts, which are allpre-computed

BotLeftMin(r2, p2)

TopLeftMin(r4, p4)TopRightMin(r3, p3)

BotRightMin(r1, p1)

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 22 / 28

Efficient Pre-Computations

For any canonical range r , cut the middle of its longer side to obtaintwo smaller canonical ranges r1 and r2

Do binary search row by row (or column by column)

O(N) comparisons for 2D preprocessing

Generalize to any fixed dimension d :

Preprocess: O(2.89d(d + 1)!N) comparisonsQuery: 2d − 1 comparisons

BotLeftMin(r, p)

r1 r2

pBinary Search

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 23 / 28

Efficient Pre-Computations

For any canonical range r , cut the middle of its longer side to obtaintwo smaller canonical ranges r1 and r2

Do binary search row by row (or column by column)

O(N) comparisons for 2D preprocessing

Generalize to any fixed dimension d :

Preprocess: O(2.89d(d + 1)!N) comparisonsQuery: 2d − 1 comparisons

BotLeftMin(r, p)

r1 r2

pBinary Search

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 23 / 28

Efficient Pre-Computations

For any canonical range r , cut the middle of its longer side to obtaintwo smaller canonical ranges r1 and r2

Do binary search row by row (or column by column)

O(N) comparisons for 2D preprocessing

Generalize to any fixed dimension d :

Preprocess: O(2.89d(d + 1)!N) comparisonsQuery: 2d − 1 comparisons

BotLeftMin(r, p)

r1 r2

pBinary Search

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 23 / 28

Efficient Pre-Computations

For any canonical range r , cut the middle of its longer side to obtaintwo smaller canonical ranges r1 and r2

Do binary search row by row (or column by column)

O(N) comparisons for 2D preprocessing

Generalize to any fixed dimension d :

Preprocess: O(2.89d(d + 1)!N) comparisonsQuery: 2d − 1 comparisons

BotLeftMin(r, p)

r1 r2

pBinary Search

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 23 / 28

Overview of RAM Implementations

Divide the array into micro blocks of size ε log N

Each block is a d-dimensional cube, with side length (ε log N)1d

For example in 2D, make each block√

ε log N by√

ε log N

q1

q2 q3

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 24 / 28

Overview of RAM Implementations

For query that crosses the border of any micro block: there existsO(N)-time preprocessing and constant-time querying data structures tosolve it, using dimension reductions and the help of the data structures in[Yao 1982] [Chazelle and Rosenberg 1989]

q1

q2 q3

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 24 / 28

Overview of RAM Implementations

For query that is complete within a micro block, use table lookuptechnique (Four Russian’s Trick) to get the locations of at most 2d

candidates to compare at the querying stageBased on our linear-comparison preprocessing data structure

q1

q2 q3

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 24 / 28

Micro Block

Key Idea: If two micro blocks have the same type, then they should sharethe same data structures

Type of a micro block: Comparison results (true/false sequence) ofthe linear-comparison preprocessing algorithm

Assume cε log N comparisons to preprocess a block: at most2cε log N = Ncε possible types

Choose ε < 1c , then there are only a sublinear number of types:

Ncεpolylog(ε log N) = o(N)

Recognizing the types for all micro blocks in linear time:Build a linear-depth decision tree according to the linear-comparisonpreprocessing algorithm

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 25 / 28

Summary of RAM

Our unit-cost RAM data structure

Preprocess in O(2.89d(d + 1)!N) time and (2dd!N) space

Query in O(3d) time

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 26 / 28

Future Work

Future Work:

Extend the lower bound of [Demaine, Landau and Weimann, 2009]

If at most t comparisons are allowed at the querying stage, find thelower bound for the number of comparisons required to preprocess theinput

Dynamic Updates [Poon, 2003]

Extend our results to the external memory model

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 27 / 28

Future Work

Future Work:

Extend the lower bound of [Demaine, Landau and Weimann, 2009]

If at most t comparisons are allowed at the querying stage, find thelower bound for the number of comparisons required to preprocess theinput

Dynamic Updates [Poon, 2003]

Extend our results to the external memory model

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 27 / 28

Future Work

Future Work:

Extend the lower bound of [Demaine, Landau and Weimann, 2009]

If at most t comparisons are allowed at the querying stage, find thelower bound for the number of comparisons required to preprocess theinput

Dynamic Updates [Poon, 2003]

Extend our results to the external memory model

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 27 / 28

Future Work

Future Work:

Extend the lower bound of [Demaine, Landau and Weimann, 2009]

If at most t comparisons are allowed at the querying stage, find thelower bound for the number of comparisons required to preprocess theinput

Dynamic Updates [Poon, 2003]

Extend our results to the external memory model

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 27 / 28

END

Thank You!

Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 28 / 28