Sum Selection in Arrays
Allan Grønlund Jørgensen
Qualification Exam (Kvalifikationseksamen)
Progress Report
Fault Tolerance:
Priority Queues Resilient to Memory Faults, with Moruz, Mølhave (WADS 07)
Optimal Resilient Dictionaries, with Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Moruz, Mølhave (ESA 07)
Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency, with Brodal and Mølhave (Manuscript-ICALP08)
Sum Selection:
A Linear Time Algorithm for the k Maximal Sums Problem, with Brodal (MFCS 07)
Sum Selection, with Brodal (Manuscript-ICALP08)
[Figure: example array of five numbers with its segment sums]
Outline
Introduction
The k maximal sums problem
Length constrained k maximal sums problem
Sum selection problem
Summary and plans for the future
The Maximum Sum Problem
Given an array of numbers, find the segment of consecutive elements with the largest sum
-3 7 -12 1 6 -3 5 -2
(4,7,9): the segment from index 4 to index 7 has sum 1 + 6 - 3 + 5 = 9
Kadane's Algorithm ('77)
Scan the array from the left and in step i update:
Largest suffix sum (largest sum ending at A[i])
Largest sum so far (largest sum in A[1,…,i])
[Figure: example scan over an array showing the two maintained values]
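The scan described on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the speaker's code: the function name `kadane` and the 1-based `(i, j, sum)` return convention are assumptions chosen to match the triples used elsewhere in the talk.

```python
def kadane(a):
    """Return (i, j, s): a maximum-sum segment a[i..j] (1-based, inclusive) and its sum s."""
    if not a:
        raise ValueError("array must be non-empty")
    suffix_sum, suffix_start = a[0], 1   # largest sum ending at the current index
    best = (1, 1, a[0])                  # largest sum seen so far
    for j in range(2, len(a) + 1):
        x = a[j - 1]
        if suffix_sum < 0:
            # A negative suffix can only lower the sum: start a new segment at j.
            suffix_sum, suffix_start = x, j
        else:
            suffix_sum += x
        if suffix_sum > best[2]:
            best = (suffix_start, j, suffix_sum)
    return best


print(kadane([-3, 7, -12, 1, 6, -3, 5, -2]))  # (4, 7, 9)
```

The loop keeps only two values per step, which is why the algorithm needs O(1) additional space.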
The k Maximal Sums Problem
Given an array of numbers, find the k largest segment sums (the segments may overlap)
Example with k=2
-3 7 -12 1 6 -3 5 -2
9 (indices 4 to 7)
8 (indices 5 to 7)
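As a reference point for the problem statement, a brute-force baseline can enumerate every segment sum with prefix sums and keep the k largest. This is not the algorithm of the talk: it takes quadratic time in n, and the function name `k_maximal_sums` is an assumption made for this sketch.

```python
import heapq
from itertools import accumulate


def k_maximal_sums(a, k):
    """Brute force: the k largest segment sums as (i, j, sum) triples, 1-based."""
    prefix = [0, *accumulate(a)]          # prefix[j] = a[1] + ... + a[j]
    n = len(a)
    candidates = (
        (prefix[j] - prefix[i - 1], i, j)  # sum of a[i..j]
        for i in range(1, n + 1)
        for j in range(i, n + 1)
    )
    return [(i, j, s) for s, i, j in heapq.nlargest(k, candidates)]


print(k_maximal_sums([-3, 7, -12, 1, 6, -3, 5, -2], 2))  # [(4, 7, 9), (5, 7, 8)]
```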
Goal
Optimal O(n+k) time algorithm
outputting the k maximal sums
Main Idea (Intuition)
Build all sums and insert them into a heap ordered binary tree
Find the k largest sums using Frederickson's heap selection algorithm ('93) in O(k) time
Example (k=4)
[Figure: the 15 sums of the array -12 1 6 -3 5 stored in a heap ordered binary tree; the four largest (9, 8, 7, 6) are marked red]
Frederickson's algorithm finds the red nodes in O(k) time (in no particular order)
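Frederickson's O(k) selection algorithm is intricate, so the sketch below swaps in a much simpler best-first search that finds the k largest keys of a heap ordered tree in O(k log k) time. It only illustrates why heap order helps (a node can be among the k largest only after its parent is). The implicit array layout and the helper names `segment_sums`, `as_max_heap` and `k_largest` are assumptions of this sketch.

```python
import heapq


def segment_sums(a):
    """All sums of non-empty contiguous segments of a (brute force)."""
    return [sum(a[i:j]) for i in range(len(a)) for j in range(i + 1, len(a) + 1)]


def as_max_heap(values):
    """Arrange values as an implicit max-heap (children of i at 2i+1 and 2i+2)."""
    negated = [-v for v in values]
    heapq.heapify(negated)                # min-heap on negated keys
    return [-v for v in negated]          # same layout read as a max-heap


def k_largest(tree, k):
    """Best-first search from the root; returns the k largest keys, largest first."""
    if not tree or k <= 0:
        return []
    frontier = [(-tree[0], 0)]            # candidates: children of reported nodes
    result = []
    while frontier and len(result) < k:
        neg_key, i = heapq.heappop(frontier)
        result.append(-neg_key)
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(tree):
                heapq.heappush(frontier, (-tree[child], child))
    return result


tree = as_max_heap(segment_sums([-12, 1, 6, -3, 5]))
print(k_largest(tree, 4))  # [9, 8, 7, 6]
```

The frontier never holds more than k+1 nodes, so only O(k) nodes of the tree are touched regardless of its size.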
The Iheap
It is a heap ordered binary tree
Supports insertions in amortized constant time
Inserting 7 in an Iheap
[Figure: an Iheap with keys 9, 5, 4, 3 and subtrees T1 to T4, before and after inserting 7]
Main Issue
There are n(n+1)/2 = Θ(n^2) sums
Constructing and inserting Θ(n^2) sums into a heap ordered binary tree takes Θ(n^2) time
Grouping Sums
The sums are grouped by their endpoint in the array
$Q_j = \{(i, j, \mathit{sum}) \mid 1 \le i \le j \wedge \mathit{sum} = \sum_{s=i}^{j} A[s]\}$
-3 7 -12 1 6 -3 5 -2
Q4: (1,4,-7) (2,4,-4) (3,4,-11) (4,4,1)
Constructing Q5 from Q4
$Q_j = \{(j, j, A[j])\} \cup \{(i, j, s + A[j]) \mid (i, j-1, s) \in Q_{j-1}\}$
-3 7 -12 1 6 -3 5 -2
Q4: (1,4,-7) (2,4,-4) (3,4,-11) (4,4,1)
Q5: (1,5,-1) (2,5,2) (3,5,-5) (4,5,7) (5,5,6)
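The construction of each Q set from its predecessor can be written out directly. The sketch below stores every set as a plain Python list, so it spends time proportional to the total number of sums; it only illustrates the recurrence, and the name `build_q_sets` is an assumption.

```python
def build_q_sets(a):
    """q_sets[j] holds Q_j as (i, j, sum) triples, 1-based; q_sets[0] is empty."""
    q_sets = [[]]
    for j, x in enumerate(a, start=1):
        previous = q_sets[-1]
        # Extend every sum ending at j-1 by A[j], then add the new singleton sum.
        q_j = [(i, j, s + x) for (i, _, s) in previous] + [(j, j, x)]
        q_sets.append(q_j)
    return q_sets


q = build_q_sets([-3, 7, -12, 1, 6, -3, 5, -2])
print(q[4])  # [(1, 4, -7), (2, 4, -4), (3, 4, -11), (4, 4, 1)]
print(q[5])  # [(1, 5, -1), (2, 5, 2), (3, 5, -5), (4, 5, 7), (5, 5, 6)]
```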
Main Idea Continued
Represent each Q set as a heap ordered binary tree H
Combine all heaps by assembling them into one big heap using dummy infinity keys
The Assembled Heap
[Figure: the heaps H1, H2, H3, H4, H5 assembled into one heap]
Representing Q Sets:
Each set Qj is represented by a pair < δj , Hj >
Hj is an Iheap containing all j sums from Qj
δj is a number that must be added to all elements of Hj
We get the following construction equations
$\langle \delta_0, H_0 \rangle = \langle 0, \emptyset \rangle$
$\langle \delta_{j+1}, H_{j+1} \rangle = \langle \delta_j + A[j+1],\; H_j \cup \{-\delta_j\} \rangle$
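Example
A = -3 7 -12
Step 1: set δ1 = δ0 + A[1] = -3 and insert -δ0 = 0. H1 = {0} represents Q1 = {-3}
Step 2: set δ2 = δ1 + A[2] = 4 and insert -δ1 = 3. H2 = {0, 3} represents Q2 = {4, 7}
Step 3: set δ3 = δ2 + A[3] = -8 and insert -δ2 = -4. H3 = {0, 3, -4} represents Q3 = {-8, -5, -12}
$\langle \delta_0, H_0 \rangle = \langle 0, \emptyset \rangle$
$\langle \delta_{j+1}, H_{j+1} \rangle = \langle \delta_j + A[j+1],\; H_j \cup \{-\delta_j\} \rangle$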
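The pair construction can be traced with a short sketch. Here each heap is stood in for by a plain Python list that is copied at every step, which costs linear time per step; the talk instead uses one insertion into an Iheap. The names `build_pairs` and `represented_sums` are assumptions of this sketch.

```python
def build_pairs(a):
    """pairs[j] = (delta_j, keys_j); the sums in Q_j are key + delta_j for each key."""
    pairs = [(0, [])]                      # <delta_0, H_0> = <0, {}>
    for x in a:
        delta, keys = pairs[-1]
        # New offset absorbs A[j+1]; the key -delta_j becomes the singleton sum A[j+1].
        pairs.append((delta + x, keys + [-delta]))
    return pairs


def represented_sums(pair):
    """Recover the actual sums by adding the shared offset to every stored key."""
    delta, keys = pair
    return [key + delta for key in keys]


pairs = build_pairs([-3, 7, -12])
print(pairs[3])                    # (-8, [0, 3, -4])
print(represented_sums(pairs[3]))  # [-8, -5, -12]
```

Because the offset is shared, extending all sums by the next array element changes one number instead of every key.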
Analysis of Pair Construction
Building each pair takes amortized constant time (one insertion into an Iheap)
But the old version disappears
Solution: partial persistence (Driscoll et al. '89)
[Figure: version i and version i+1 of an Iheap, before and after inserting 7]
Recap
Build all pairs in O(n) time
Join them into a single heap in O(n) time
Use Frederickson's algorithm to get the k+n-1 largest and discard the n-1 dummies in O(n+k) time
Result: an O(n+k) time algorithm
Space Reduction
The current algorithm uses O(n+k) time and O(n+k) additional space
The input array is considered read only
Kadane's algorithm uses O(1) additional space
Reduce the additional space usage to O(k)
Higher Dimensions
Can be reduced to the 1D case.
For an m x n matrix, we get $O(m^2 n + k)$ time, $O(n + k)$ space
In general we get $O(n_1 \prod_{i=2}^{d} n_i^2 + k)$ time, $O(\prod_{i=1}^{d-1} n_i + k)$ space
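One standard way to see where the m^2 · n term comes from is the classic reduction for the single maximum sum (k=1): fix a pair of rows, collapse the rows between them into one array of column totals, and solve the 1D problem on that array. The sketch below shows only this k=1 special case, not the k maximal sums version from the talk; the function names are assumptions.

```python
def max_sum_1d(a):
    """Largest sum of a non-empty contiguous segment (Kadane's scan)."""
    best = current = a[0]
    for x in a[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best


def max_sum_2d(matrix):
    """Largest sum of a non-empty sub-rectangle of an m x n matrix, in O(m^2 * n) time."""
    m, n = len(matrix), len(matrix[0])
    best = matrix[0][0]
    for top in range(m):
        column_totals = [0] * n            # column sums of rows top..bottom
        for bottom in range(top, m):
            for c in range(n):
                column_totals[c] += matrix[bottom][c]
            # Every rectangle spanning rows top..bottom is a segment of column_totals.
            best = max(best, max_sum_1d(column_totals))
    return best


print(max_sum_2d([[1, -2], [3, 4]]))  # 7
```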
Length Constrained k Maximal Sums Problem
Each sum must be an aggregate of at least l numbers and at most u numbers
Example with l=3 and u=5
[Figure: example array; extracted entries 7 -666 12 8 7 -6 4 -2, order uncertain]
Best: 19
Best Valid: 13
Goal
Optimal O(n+k) time algorithm outputting the k maximal sums
with length between l and u
First Approach
Use the same idea as before but redefine Q to match the length criteria
The construction equation is almost identical but requires a deletion
Constructing Q Sets Using Deletions (l=3, u=6)
-5 17 42 -10 0 12 -10 666
Q6: (1,6,56) (2,6,61) (3,6,44) (4,6,2)
Q7: (2,7,51) (3,7,34) (4,7,-8) (5,7,2)
(1,7,46) has length 7 > u and is deleted
Result
Same algorithm as before using the new way of constructing the next heap
Deleting an element in a heap of size n with constant time insertion takes O(log n) time
Result: an O(n log(u-l) + k) time algorithm
A Better Way of Constructing the Q sets (u=8, l=4)
-5 17 42 -10 0 12 -10 11 7 7 666
Slab 1: -5 17 42 -10 0, Slab 2: 12 -10 11 7 7
Divide into slabs of size u-l+1
For each slab build two sets of heaps: one from the left (L) and one from the right (R)
For each index j group all sums of length between l and u ending at j+l-1 using the sets from above and two constants
Example j=3 in slab 2 (heaps L3, R3 and R2 in the figure):
Sums starting in slab 2, with constant 7+7+666 = 680: 1+680=681, 11+680=691, 13+680=693
Sums starting in slab 1, with constant 13+680 = 693: 0+693=693, -10+693=683
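The grouping on this slide can be checked with a small sketch that splits the valid sums ending at one index into the part starting in the current slab and the part starting in the previous slab, each shifted by one constant. It uses plain lists and prefix sums instead of the heaps L and R, works with 0-based indices, and the name `sums_ending_at` is an assumption.

```python
from itertools import accumulate


def sums_ending_at(a, l, u, e):
    """Sums of all segments of length l..u ending at 0-based index e, grouped by slab."""
    p = [0, *accumulate(a)]                # p[i] = a[0] + ... + a[i-1]
    w = u - l + 1                          # slab size
    t = e - l + 1                          # largest valid start (length exactly l)
    if t < 0:
        return []
    slab_start = (t // w) * w              # first index of the slab containing t
    # Starts inside t's slab: sums ending at t, plus the constant sum of a[t+1..e].
    c1 = p[e + 1] - p[t + 1]
    from_left = [(p[t + 1] - p[s]) + c1 for s in range(slab_start, t + 1)]
    # Starts in the previous slab: suffix sums of that slab, plus the sum of a[slab_start..e].
    c2 = p[e + 1] - p[slab_start]
    first = max(0, t - w + 1)              # smallest valid start (length exactly u)
    from_right = [(p[slab_start] - p[s]) + c2 for s in range(first, slab_start)]
    return from_left + from_right


a = [-5, 17, 42, -10, 0, 12, -10, 11, 7, 7, 666]
print(sorted(sums_ending_at(a, 4, 8, 10)))  # [681, 683, 691, 693, 693]
```

With l=4, u=8 and the last array index, the two constants come out as 680 and 693, the values shown in the example.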
Result
Same algorithm using the new way to group sums.
Building the L and R sets takes O(u-l) time for each slab.
O(n+k) time algorithm
Sum Selection
Given an array of numbers, find the k'th largest sum
Example with k=5
The 15 sums in sorted order:
-56 -52 -50 -43 -14 -13 -6 -4 2 7 9 29 36 38 42
[Figure: example array of five numbers with its segment sums]
First Solution
Use the algorithm finding the k maximal sums to find the k largest and output the smallest of these
Algorithm uses O(n+k) time.
What if k is large? (k can be as large as n(n+1)/2)
Lower Bound
Reduction from the Cartesian Sum Problem (X+Y)
A lower bound of Ω(|Y| + |Y| log(k/|Y|)) (Frederickson and Johnson '82)
X = 7 -5 9 13
Y = 2 12 1 -3 8
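Reduction
X = 7 -5 9 13, Y = 2 12 1 -3 8
Array entries: $x_1 - x_2,\; x_2 - x_3,\; x_3 - x_4,\; x_4 + y_1 + 117,\; y_2 - y_1,\; y_3 - y_2,\; y_4 - y_3,\; y_5 - y_4$
= 12 -14 -4 117+15 10 -11 -4 11
$117 = \max\{|z| : z \in X \cup Y\} \cdot (|X| + |Y|) = 13 \cdot 9$
Example: a Cartesian sum of -4 corresponds to the segment sum 113 = 117 - 4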
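The reduction can be checked on the slide's numbers with a short sketch. It builds the array from X and Y and confirms that the segments crossing the middle entry have sums x_i + y_j plus the large constant, so the largest |X|·|Y| segment sums are exactly the Cartesian sums shifted by that constant. The names `build_reduction_array` and `c` are assumptions of this sketch.

```python
def build_reduction_array(xs, ys):
    """Return (array, c) such that the segment from x_i to y_j sums to x_i + y_j + c."""
    c = max(abs(z) for z in xs + ys) * (len(xs) + len(ys))   # large separating constant
    left = [xs[i] - xs[i + 1] for i in range(len(xs) - 1)]    # telescopes to x_i - x_last
    middle = xs[-1] + ys[0] + c
    right = [ys[j + 1] - ys[j] for j in range(len(ys) - 1)]   # telescopes to y_j - y_first
    return left + [middle] + right, c


xs = [7, -5, 9, 13]
ys = [2, 12, 1, -3, 8]
array, c = build_reduction_array(xs, ys)
print(array, c)        # [12, -14, -4, 132, 10, -11, -4, 11] 117
print(sum(array[1:6])) # 113, i.e. x_2 + y_3 + 117 = -5 + 1 + 117
```

Segments that avoid the middle entry telescope to a difference of two elements of X or of Y, which is far below the constant, so they can never outrank a crossing segment.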
Result
An Ω(n+nlog(k/n)) lower bound for the sum selection problem
Goal
Optimal O(n+nlog(k/n)) time algorithm for selecting the k'th largest sum
Algorithm
Reduction to selection in sorted arrays and weight balanced search trees
Frederickson and Johnson(’82) already solved selection in n arrays in optimal O(n + nlog(k/n)) time
Adapt this algorithm such that it also works on weight balanced trees
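To make the target problem concrete, the sketch below selects the k'th largest element from several sorted arrays by lazily merging them with the standard library. This is a simple stand-in that takes roughly O(n + k log n) time for n arrays, not the Frederickson and Johnson algorithm with its O(n + n log(k/n)) bound; the function name is an assumption.

```python
import heapq
from itertools import islice


def kth_largest_in_sorted_arrays(arrays, k):
    """arrays: lists sorted in decreasing order; k is 1-based and at most the total size."""
    merged = heapq.merge(*arrays, reverse=True)   # lazy merge, largest first
    return next(islice(merged, k - 1, None))      # skip k-1 elements, take the next


arrays = [[9, 4, 1], [8, 7], [5]]
print(kth_largest_in_sorted_arrays(arrays, 3))  # 7
```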
Block Heap
[Figure: a block heap with B=3 whose nodes hold the blocks 54,49,42; 39,31,25; 24,12,7; 23,22,21; 17,13,11; 10,5,1; 9,6,3]
Heap ordered binary tree
Each node stores B sorted elements
Inserting a block of B elements takes O(B) time.
Reducing Sum Selection to Selection in Arrays and Trees
Divide into slabs of size k/n
Each index j is associated with two data structures that together cover all sums ending at index j
The first data structure holds all sums starting in the current slab and is named WBj
The second holds the rest and is named BHj
Example
Extending within a slab
Extending to a new slab: a block of k/n elements is inserted into BH
[Figure: example array ending in 11 7 2 666 with a slab marked, showing the contents of WB (2, 9, 20) and BH (666, 668, 675, 686, 676, 720, 688) as the index is extended]
Reducing Problem
One insert into a tree per step and one insert into a block heap every k/n steps.
n trees of size at most k/n and n block heaps.
Join all block heaps together and use Frederickson's algorithm to find the 4n blocks with the largest minimum
n trees and O(n) sorted arrays left
Result
Selection in O(n) trees and sorted arrays storing O(k) elements can be done in O(n+nlog(k/n)) time
Result is an O(n+nlog(k/n)) time algorithm.
Summary of Results
Sum Selection:
Problem                              Time Complexity
k Maximal Sums                       O(n+k)
Length Constrained k Maximal Sums    O(n+k)
Sum Selection                        Θ(n+nlog(k/n))
Summary of Results
Fault Tolerant Data Structures:
Data Structure      Query                    Update
Priority Queue      Θ(log(n)+δ)              O(log(n)+δ) am.
Sorted Array        O(log(n)+δ)              -
Rand. Dictionary    Θ(log(n)+δ) exp.         O(log(n)+δ) exp.
Dictionary          Θ(log(n)+δ)              O(log(n)+δ) am.
I/O Dictionary      [bound lost in extraction] I/Os    [bound lost in extraction] I/Os
Progress and Future
Time: PhD Start, Qualification Exam
Priority Queue
Searching
I/O Eff. Search
k Max Sums
Sum Selection
Fault Tolerance
Sums in Arrays
I/O Eff. Sorting
Cache Oblivious
MIT
Selection in arb. Trees
(l,u) k Max Sums
Dictionary