I/O Efficient Algorithms and Data Structures 1+2 (Lecture by Lars Arge)
I/O-efficient Algorithms and Data Structures
Lars Arge
Professor and center director
Aarhus University
ALMADA summer school, August 1, 2013
![Page 2: I/O Efficient Algorithms and Data Structures 1+2 (Lecture by Lars Arge)](https://reader034.fdocuments.us/reader034/viewer/2022051400/5550947db4c9051e5b8b5361/html5/thumbnails/2.jpg)
Massive Data
• Pervasive use of computers and sensors
• Increased ability to acquire/store/process data
→ Massive data collected everywhere
• Society increasingly “data driven”
→ Access/process data anywhere any time
Nature/Science special issues
• 2/06, 9/08, 2/11
• Scientific data size growing exponentially, while quality and availability improving
• Paradigm shift: Science will be about mining data
Obviously not only in sciences:
• Economist 02/10:
– From 150 Billion Gigabytes five years ago to 1200 Billion today
– Managing data deluge difficult; doing so will transform business/public life
Random Access Machine Model
• Standard theoretical model of computation:
– Infinite memory
– Uniform access cost
• Simple model crucial for success of computer industry
Hierarchical Memory
• Modern machines have complicated memory hierarchy
– Levels get larger and slower further away from CPU
– Data moved between levels using large blocks
[Figure: memory hierarchy with levels L1, L2, RAM]
Slow I/O
• Disk access is $10^6$ times slower than main memory access
– Disk systems try to amortize large access time by transferring large contiguous blocks of data
• Important to store/access data to take advantage of blocks (locality)
[Figure: disk with labels: track, magnetic surface, read/write arm, read/write head]
“The difference in speed between modern CPU and
disk technologies is analogous to the difference
in speed in sharpening a pencil using a sharpener on
one’s desk or by taking an airplane to the other side of
the world and using a sharpener on someone else’s
desk.” (D. Comer)
Scalability Problems
• Most programs developed in RAM-model
– Run on large datasets because OS moves blocks as needed
• Modern OS utilizes sophisticated paging and prefetching strategies
– But if program makes scattered accesses even good OS cannot take advantage of block access
⇒ Scalability problems!
[Figure: running time as a function of data size]
External Memory Model
N = # of items in the problem instance
B = # of items per disk block
M = # of items that fit in main memory
T = # of items in output
I/O: Move block between memory and disk
We assume (for convenience) that $M > B^2$
[Figure: processor P, main memory M, and disk D, connected by block I/O]
Fundamental Bounds
              Internal        External
• Scanning:   $N$             $\frac{N}{B}$
• Sorting:    $N \log N$      $\frac{N}{B}\log_{M/B}\frac{N}{B}$
• Permuting:  $N$             $\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\}$
• Searching:  $\log_2 N$      $\log_B N$
• Note:
– Linear I/O: O(N/B)
– Permuting not linear
– Permuting and sorting bounds are equal in all practical cases
– B factor VERY important: $\frac{N}{B} < \frac{N}{B}\log_{M/B}\frac{N}{B} \ll N$
– Cannot sort optimally with search tree
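To get a feeling for the sizes involved, the external bounds above can be evaluated numerically. The following is a small Python sketch; the function name `io_bounds` and the value chosen for M are assumptions made for illustration (N and B are the values used in the block-access example of the lecture), and constant factors are ignored.

```python
import math

def io_bounds(N, M, B):
    """Evaluate the external-memory bounds (without constants) for given N, M, B."""
    scan = N / B
    sort = (N / B) * math.log(N / B, M / B)   # (N/B) * log_{M/B}(N/B)
    permute = min(N, sort)
    search = math.log(N, B)                   # log_B N
    return {"scan": scan, "sort": sort, "permute": permute, "search": search}

# N and B as in the lecture's example; M = 10**8 is an assumed value with M > B^2.
bounds = io_bounds(N=256 * 10**6, M=10**8, B=8000)
for name, value in bounds.items():
    print(f"{name:8s} {value:12.1f}")
```

With these parameters the sorting bound is only slightly larger than a single scan, and both are several orders of magnitude below N.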
Scalability Problems: Block Access Matters
• Example: Traversing linked list (List ranking)
– Array size N = 10 elements
– Disk block size B = 2 elements
– Main memory size M = 4 elements (2 blocks)
[Figure: two layouts of the list on disk; Algorithm 1: N = 10 I/Os, Algorithm 2: N/B = 5 I/Os]
• Difference between N and N/B is large since block size is large
– Example: $N = 256 \times 10^6$, B = 8000, 1 ms disk access time
⇒ N I/Os take $256 \times 10^3$ sec = 4266 min = 71 hr
⇒ N/B I/Os take 256/8 sec = 32 sec
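The effect of the layout can be reproduced with a small simulation. The sketch below is an illustration under assumptions: `count_ios` is a hypothetical helper that counts block reads for a sequence of disk positions using an LRU cache of M/B blocks, and the two layouts (one sequential, one strided) are made up for the example rather than taken from the slide figure.

```python
from collections import OrderedDict

def count_ios(positions, B, M):
    """Count block reads when disk positions are visited in the given order,
    with main memory holding M // B blocks under LRU replacement."""
    capacity = M // B
    cache = OrderedDict()              # block id -> True, ordered by recency
    ios = 0
    for p in positions:
        block = p // B
        if block in cache:
            cache.move_to_end(block)   # hit: mark as most recently used
        else:
            ios += 1                   # miss: one I/O to fetch the block
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used block
    return ios

N, B, M = 1000, 10, 20
sequential = list(range(N))                               # list stored in traversal order
scattered = [(i % 10) * 100 + i // 10 for i in range(N)]  # consecutive nodes in different blocks

print(count_ios(sequential, B, M))
print(count_ios(scattered, B, M))
```

In the sequential layout every block is read once (N/B I/Os); in the strided layout every step of the traversal lands in a block that has already been evicted, so each of the N steps costs an I/O.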
Outline
• Today:
– Sorting upper bound (merge- and distribution-sort)
– Sorting (permuting) lower bound
– Cache-oblivious sorting (if time permits)
– Batched geometric problems: Distribution sweeping
• Monday:
– Searching (B-trees)
– Batched searching (Buffer-trees)
– I/O-efficient priority queues
– I/O-efficient list ranking
Queues and Stacks
• Queue:
– Maintain push and pop blocks in main memory
⇒ O(1/B) I/Os per push/pop operation, amortized
• Stack:
– Maintain push/pop blocks in main memory
⇒ O(1/B) I/Os per push/pop operation, amortized
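A minimal Python sketch of the stack case, under assumptions: the class name `ExternalStack`, the in-memory buffer of up to 2B elements, and the use of a Python list as the "disk" are all illustrative choices, not taken from the lecture. The counter `ios` records block transfers.

```python
class ExternalStack:
    """Stack keeping at most 2B elements in memory; the rest lives in blocks on 'disk'."""

    def __init__(self, B):
        self.B = B
        self.buffer = []   # in-memory top of the stack, at most 2B elements
        self.disk = []     # simulated disk: list of blocks of exactly B elements
        self.ios = 0       # number of block transfers performed

    def push(self, x):
        if len(self.buffer) == 2 * self.B:
            # Buffer full: write the older half to disk as one block.
            self.disk.append(self.buffer[:self.B])
            self.buffer = self.buffer[self.B:]
            self.ios += 1
        self.buffer.append(x)

    def pop(self):
        if not self.buffer:
            if not self.disk:
                raise IndexError("pop from empty stack")
            # Buffer empty: read the most recently written block back.
            self.buffer = self.disk.pop()
            self.ios += 1
        return self.buffer.pop()


stack = ExternalStack(B=10)
for i in range(1000):
    stack.push(i)
popped = [stack.pop() for _ in range(1000)]
print(popped[:5], stack.ios)
```

After every block transfer the buffer holds exactly B elements, so at least B operations pass before the next transfer. This is what gives the O(1/B) amortized cost even when pushes and pops alternate around a block boundary. A queue can be handled the same way with one block at each end.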
Sorting
• <M/B sorted lists (queues) can be merged in O(N/B) I/Os
– One block from each list in main memory (M/B blocks in main memory)
• Unsorted list (queue) can be distributed using <M/B split elements in O(N/B) I/Os
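Sorting
• Merge sort:
– Create N/M memory sized sorted lists
– Repeatedly merge lists together Θ(M/B) at a time
⇒ $O(\log_{M/B}\frac{N}{M})$ phases using $O(\frac{N}{B})$ I/Os each ⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
[Figure: number of sorted lists per phase: $\frac{N}{M}$, $\frac{N}{M}/\frac{M}{B}$, $\frac{N}{M}/(\frac{M}{B})^2$, ..., 1]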
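The two steps of merge sort can be sketched in Python as follows. This is a simulation under assumptions: the name `external_merge_sort` is hypothetical, a Python list stands in for the disk, the fan-in is taken as M/B - 1 (one memory block reserved for output), and the I/O count simply charges one read and one write of every block per phase.

```python
import heapq
import math

def external_merge_sort(data, M, B):
    """Simulate M/B-way external merge sort; return (sorted list, modelled I/O count)."""
    n = len(data)
    blocks = math.ceil(n / B)
    # Run formation: load M items at a time, sort in memory, write back as a run.
    runs = [sorted(data[i:i + M]) for i in range(0, n, M)]
    ios = 2 * blocks
    fan_in = max(2, M // B - 1)        # one block per input run, one for the output
    # Merge phases: merge fan_in runs at a time until a single run remains.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        ios += 2 * blocks              # each phase reads and writes every block once
    return (runs[0] if runs else []), ios


data = [(7919 * i) % 10007 for i in range(10000)]
result, ios = external_merge_sort(data, M=100, B=10)
print(result == sorted(data), ios)
```

With N = 10000, M = 100, and B = 10 there are 100 initial runs and a fan-in of 9, so three merge phases follow the run formation.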
Sorting
• Distribution sort (multiway quicksort):
– Compute Θ(M/B) split elements
– Distribute unsorted list into Θ(M/B) unsorted lists of equal size
– Recursively split lists until fit in memory
⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os if split elements in O(N/B) I/Os
• Can compute $\sqrt{M/B}$ split elements in O(N/B) I/Os
⇒ $O(\log_{\sqrt{M/B}}\frac{N}{M}) = O(\log_{M/B}\frac{N}{M})$ phases ⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
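The recursive structure of distribution sort can be sketched as below. Note the assumption: this sketch picks split elements from a random sample, which is a common simplification and not the deterministic O(N/B) split-element computation the lecture refers to. The names `distribution_sort`, `memory`, and `fanout` are hypothetical.

```python
import bisect
import random

def distribution_sort(items, memory, fanout, rng=None):
    """Multiway quicksort sketch: split into up to `fanout` buckets, recurse until a bucket fits in memory."""
    rng = rng or random.Random(0)
    if len(items) <= memory:
        return sorted(items)                      # base case: sort in main memory
    # Assumption: split elements come from a sorted random sample.
    sample = sorted(rng.sample(items, min(len(items), 4 * fanout)))
    step = max(1, len(sample) // fanout)
    splitters = sample[step::step][:fanout - 1]
    # Distribution step: one pass over the data, each element goes to one bucket.
    buckets = [[] for _ in range(len(splitters) + 1)]
    for x in items:
        buckets[bisect.bisect_right(splitters, x)].append(x)
    if max(len(b) for b in buckets) == len(items):
        return sorted(items)                      # no progress (e.g. all keys equal)
    out = []
    for b in buckets:
        out.extend(distribution_sort(b, memory, fanout, rng))
    return out


data = [(37 * i) % 1009 for i in range(3000)]
print(distribution_sort(data, memory=32, fanout=8) == sorted(data))
```

Because the buckets are ordered by the split elements, concatenating the recursively sorted buckets yields the sorted output without any merge step.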
Permuting Lower Bound
Permuting N elements according to a given permutation takes $\Omega(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ I/Os in “indivisibility” model
• Indivisibility model: Move of elements only allowed operation
• Note:
– We can allow copies (and destruction of elements)
– Bound also a lower bound on sorting
• Proof:
– View memory and disk as array of N tracks of B elements
– Assume all I/Os track aligned (assumption can be removed)
Permuting Lower Bound
– Array contains permutation of N elements at all times
– We will count how many permutations can be reached (produced) with t I/Os
– Input:
* Choose track: N possibilities
* Rearrange ≤ B elements in track and place among ≤ M-B elements in memory:
– $\binom{M}{B} B!$ possibilities if “fresh” track
– $\binom{M}{B}$ otherwise
⇒ at most $\left(N\binom{M}{B}\right)^t (B!)^{N/B}$ permutations after t inputs
– Output: Choose track: N possibilities
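Permuting Lower Bound
– Permutation algorithm needs to be able to produce N! permutations:
$\left(N\binom{M}{B}\right)^t (B!)^{N/B} \ge N!$
⇒ $t\left(\log N + \log\binom{M}{B}\right) + \frac{N}{B}\log(B!) \ge \log(N!)$
⇒ $t\left(\log N + B\log\frac{M}{B}\right) + N\log B \ge N\log N$ (using Stirling's formula $\log x! \approx x\log x$ and $\log\binom{M}{B} \approx B\log\frac{M}{B}$)
⇒ $t \ge \frac{N\log\frac{N}{B}}{\log N + B\log\frac{M}{B}}$
– If $\log N \le B\log\frac{M}{B}$ we have $t \ge \frac{N\log\frac{N}{B}}{2B\log\frac{M}{B}} = \Omega\left(\frac{N}{B}\log_{M/B}\frac{N}{B}\right)$
– If $\log N > B\log\frac{M}{B}$ we have $B \le \sqrt{N}$ and thus $t \ge \frac{N\log\frac{N}{B}}{2\log N} \ge \frac{N\log\sqrt{N}}{2\log N} = \frac{N\cdot\frac{1}{2}\log N}{2\log N} = \Omega(N)$
⇒ $t = \Omega\left(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\}\right)$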
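The counting inequality can also be evaluated numerically, without the Stirling approximation. The following Python sketch is an illustration: the function names are hypothetical, the parameter values are assumed, and factorials are handled in log space with `math.lgamma`. It finds the smallest t for which $(N\binom{M}{B})^t (B!)^{N/B} \ge N!$ can hold.

```python
import math

def log2_factorial(n):
    """log2(n!) computed via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)

def log2_binom(m, b):
    """log2 of the binomial coefficient C(m, b)."""
    return log2_factorial(m) - log2_factorial(b) - log2_factorial(m - b)

def min_ios(N, M, B):
    """Smallest t with (N * C(M,B))^t * (B!)^(N/B) >= N!, evaluated in log space."""
    needed = log2_factorial(N) - (N / B) * log2_factorial(B)
    gained_per_io = math.log2(N) + log2_binom(M, B)
    return max(0, math.ceil(needed / gained_per_io))

# Assumed example parameters.
N, M, B = 10**6, 10**4, 100
t = min_ios(N, M, B)
sort_bound = (N / B) * math.log(N / B, M / B)
print(t, sort_bound, N)
```

For these values the computed t is of the same order as $\frac{N}{B}\log_{M/B}\frac{N}{B}$ and far below N, matching the first case of the analysis.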
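Sorting lower bound
Sorting N elements takes $\Omega(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os in comparison model
• Similar argument but assuming comparison model in internal memory
– Initially N! possible orderings
– Count how many possible after t I/Os
$\left(N\binom{M}{B}\right)^t (B!)^{N/B} \ge N!$
⇒ $t\left(\log N + \log\binom{M}{B}\right) + \frac{N}{B}\log(B!) \ge \log(N!)$
⇒ $t\left(\log N + B\log\frac{M}{B}\right) + N\log B \ge N\log N$
⇒ $t \ge \frac{N\log\frac{N}{B}}{\log N + B\log\frac{M}{B}}$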
Summary/Conclusion: Sorting
• External merge or distribution sort takes $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Merge-sort based on M/B-way merging
– Distribution sort based on $\sqrt{M/B}$-way distribution and (complicated) split elements finding
– Key is linear I/O M/B-way merging/distribution
• Optimal in comparison model
• $\Omega(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ lower bound in stronger indivisibility model
– Holds even for permuting
Cache-Oblivious Model
• N, B, and M as in I/O-model
– Assumed that $M > B^2$ (tall cache assumption)
• M and B not used in algorithm description
• Block transfers by optimal paging strategy
⇒ Analyze in two-level model
⇒ Efficient on one level, efficient on all levels (of fully associative hierarchy using LRU)
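I/O-model Distribution Sort
• $O(\log_{\sqrt{M/B}}\frac{N}{M}) = O(\log_{M/B}\frac{N}{M})$ levels
• Distribution in $O(\frac{N}{B})$ I/Os
⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ algorithm
[Figure: $\sqrt{M/B}$-way distribution tree from N elements at the root down to 1]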
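Cache-Oblivious Distribution Sort
• General idea: $\sqrt{N}$-way distribution
[Figure: N elements split into $\sqrt{N}$ subarrays of size $\sqrt{N}$, distributed into $\sqrt{N}$ buckets]
• Distribution performed in $O(\frac{N}{B})$ accesses after
– breaking the N elements into $\sqrt{N}$ subarrays of size $\sqrt{N}$
– sorting each subarray recursively
• Recurrence for number of accesses used to sort:
$T(N) = 2\sqrt{N}\,T(\sqrt{N}) + O(\frac{N}{B})$ if $N > M$
$T(N) = O(\frac{N}{B})$ if $N \le M$
⇒ $T(N) = O(\frac{N}{B}\log_{M/B}\frac{N}{B})$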
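One way to see why the recurrence solves to the sorting bound is to sum the distribution cost level by level. The following is a sketch of that calculation, ignoring constants hidden in the O-notation and assuming the tall cache assumption $M \ge B^2$ and $N \ge M$:

```latex
% Level i of the recursion has 2^i N^{1 - 2^{-i}} subproblems of size N^{2^{-i}}.
\begin{align*}
\text{cost of level } i
  &= 2^i N^{1-2^{-i}} \cdot O\!\left(\frac{N^{2^{-i}}}{B}\right)
   = O\!\left(2^i \frac{N}{B}\right) \\
% The recursion stops at depth d, where N^{2^{-d}} \le M, i.e. 2^d = \Theta(\log N / \log M).
T(N) &= \sum_{i=0}^{d} O\!\left(2^i \frac{N}{B}\right)
      = O\!\left(2^d \frac{N}{B}\right)
      = O\!\left(\frac{N}{B}\log_M N\right) \\
% Tall cache: N \ge M \ge B^2 gives \log N \le 2\log(N/B), and \log M \ge \log(M/B).
\log_M N &= \frac{\log N}{\log M}
      \le \frac{2\log\frac{N}{B}}{\log\frac{M}{B}}
      = 2\log_{M/B}\frac{N}{B}
\end{align*}
```

The tall cache assumption is also what makes the base case cost $O(\frac{N}{B})$: subproblems at the bottom have size at least $\sqrt{M} \ge B$, so each occupies at least one full block.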
Cache-Oblivious Distribution Sort
• Distribution in $O(\frac{N}{B})$ accesses (assuming buckets known):
– Divide subarrays into two equal sized groups A1 and A2
– Divide buckets into two equal-sized groups B1 and B2
– Recursively:
1. Distribute (relevant) elements from A1 into B1
2. Distribute (relevant) elements from A2 into B1
3. Distribute elements from A1 into B2
4. Distribute elements from A2 into B2
[Figure: $\sqrt{N}$ subarrays against $\sqrt{N}$ buckets; the four quadrants are visited in the order 1, 2, 3, 4]
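The four recursive calls can be sketched in Python as follows. This is an illustration under assumptions: the buckets are given by known upper bounds (`upper`), the number of subarrays equals the number of buckets and is a power of two, and the names `distribute`, `pos`, and `upper` are hypothetical. Handling of buckets that grow too large is left out.

```python
import math

def distribute(subarrays, pos, buckets, upper, i, j, m):
    """Move the elements of sorted subarrays[i:i+m] that belong to buckets[j:j+m].

    pos[s] is the index of the first element of subarray s not yet distributed;
    bucket b receives all elements x with upper[b-1] < x <= upper[b].
    """
    if m == 1:
        s = subarrays[i]
        while pos[i] < len(s) and s[pos[i]] <= upper[j]:
            buckets[j].append(s[pos[i]])
            pos[i] += 1
        return
    h = m // 2
    distribute(subarrays, pos, buckets, upper, i, j, h)          # 1. A1 into B1
    distribute(subarrays, pos, buckets, upper, i + h, j, h)      # 2. A2 into B1
    distribute(subarrays, pos, buckets, upper, i, j + h, h)      # 3. A1 into B2
    distribute(subarrays, pos, buckets, upper, i + h, j + h, h)  # 4. A2 into B2


subarrays = [[1, 12, 25, 40], [3, 11, 31, 33], [2, 5, 21, 22], [15, 16, 35, 38]]
upper = [10, 20, 30, math.inf]          # assumed bucket boundaries
pos = [0] * len(subarrays)
buckets = [[] for _ in upper]
distribute(subarrays, pos, buckets, upper, 0, 0, len(subarrays))
print(buckets)
```

The order of the four calls guarantees that every subarray meets the buckets in increasing order, so a single forward-moving pointer per sorted subarray is enough.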
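Cache-Oblivious Distribution Sort
Analysis of distribution:
– Consider recursive subproblems on B subarrays
* $4^{\log(\sqrt{N}/B)} = \frac{N}{B^2}$ such problems
* Each subproblem solved in $O(B + \frac{\text{elements moved}}{B})$ accesses
⇒ $O(\frac{N}{B^2}\cdot B + \frac{N}{B}) = O(\frac{N}{B})$ accesses
[Figure: recursion on $\sqrt{N}$, $\sqrt{N}/2$, $\sqrt{N}/2^2$, ..., $B$ subarrays; depth $\log(\sqrt{N}/B)$]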
Batched Geometric Problems
• Example: Orthogonal line segment intersection
– Given set of axis-parallel line segments, report all intersections
• In internal memory many problems are solved using sweeping
Plane Sweeping
• Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates)
– Top endpoint of vertical segment: Insert in T
– Bottom endpoint of vertical segment: Delete from T
– Horizontal segment: Perform range query with x-interval on T
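The sweep can be sketched in a few lines of Python. Assumptions made for the illustration: vertical segments are given as `(x, y_top, y_bottom)`, horizontal segments as `(x_left, x_right, y)`, and a sorted Python list with `bisect` stands in for the search tree T. A sorted list has linear-time insertion, so a balanced tree would be needed for the O(N log N + T) bound; the function name is hypothetical.

```python
import bisect

def sweep_intersections(vertical, horizontal):
    """Report intersection points (x, y) of axis-parallel segments by a top-down sweep."""
    events = []
    for x, y_top, y_bot in vertical:
        events.append((-y_top, 0, x, x))   # top endpoint: insert x
        events.append((-y_bot, 2, x, x))   # bottom endpoint: delete x
    for x1, x2, y in horizontal:
        events.append((-y, 1, x1, x2))     # horizontal segment: range query
    # Sort by decreasing y; at equal y: inserts, then queries, then deletes,
    # so that segments touching at an endpoint count as intersecting.
    events.sort()
    active = []                            # x-coordinates crossing the sweep line, sorted
    out = []
    for neg_y, kind, a, b in events:
        if kind == 0:
            bisect.insort(active, a)
        elif kind == 2:
            del active[bisect.bisect_left(active, a)]
        else:
            lo = bisect.bisect_left(active, a)
            hi = bisect.bisect_right(active, b)
            out.extend((x, -neg_y) for x in active[lo:hi])
    return out


vertical = [(2, 5, 1), (4, 3, 0), (6, 6, 4)]
horizontal = [(1, 5, 2), (3, 7, 5), (0, 3, 6)]
print(sorted(sweep_intersections(vertical, horizontal)))
# → [(2, 2), (4, 2), (6, 5)]
```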
Plane Sweeping
• In internal memory algorithm runs in optimal O(N log N + T) time
• In external memory algorithm performs badly (>N I/Os) if |T|>M
• Solution: Distribution sweeping
– Combination of distribution and plane sweeping
Distribution Sweeping
• Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each
• Sweep plane top-down while reporting intersections between
– part of horizontal segment spanning slab(s) and vertical segments
• Distribute data to M/B-1 slabs
– vertical segments and non-spanning parts of horizontal segments
• Recurse in each slab
Distribution Sweeping
• Sweep performed in O(N/B+T’/B) I/Os ⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os
• Maintain active list of vertical segments for each slab (<B in memory)
– Top endpoint of vertical segment: Insert in active list
– Horizontal segment: Scan through all relevant active lists
* Removing “expired” vertical segments
* Reporting intersections with “non-expired” vertical segments
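The whole scheme (slabs, sweep with active lists, distribution, recursion) can be sketched in Python. This is an in-memory illustration under several assumptions: segments use the formats `(x, y_top, y_bottom)` and `(x_left, x_right, y)`; slabs are formed by splitting the vertical segments, sorted by x, into equal-sized groups; a slab counts as spanned when the horizontal segment covers the x-range of all vertical segments in it; the `fanout` and `memory` parameters play the roles of M/B and M. All names are hypothetical and no actual I/O is performed.

```python
def brute_force(vertical, horizontal):
    """Base case: report all intersection points directly."""
    return {(x, y) for (x, y_top, y_bot) in vertical
            for (x1, x2, y) in horizontal
            if x1 <= x <= x2 and y_bot <= y <= y_top}

def distribution_sweep(vertical, horizontal, fanout=4, memory=8):
    """Report intersection points (x, y) between vertical and horizontal segments."""
    if len(vertical) < 2 or len(vertical) + len(horizontal) <= memory:
        return brute_force(vertical, horizontal)
    # Divide into slabs holding (almost) equally many vertical segments.
    vertical = sorted(vertical)
    k = min(fanout, len(vertical))
    size = -(-len(vertical) // k)                 # ceil(len / k)
    slabs = [vertical[i:i + size] for i in range(0, len(vertical), size)]
    lo = [s[0][0] for s in slabs]                 # smallest x in each slab
    hi = [s[-1][0] for s in slabs]                # largest x in each slab
    # Sweep top-down: vertical top endpoints (kind 0) before horizontals (kind 1) at equal y.
    events = []
    for i, s in enumerate(slabs):
        events += [(-y_top, 0, i, (x, y_bot)) for (x, y_top, y_bot) in s]
    events += [(-y, 1, -1, (x1, x2)) for (x1, x2, y) in horizontal]
    events.sort()
    active = [[] for _ in slabs]                  # one active list per slab
    out = set()
    for neg_y, kind, i, payload in events:
        y = -neg_y
        if kind == 0:
            active[i].append(payload)
            continue
        x1, x2 = payload
        for j in range(len(slabs)):
            if x1 <= lo[j] and hi[j] <= x2:       # horizontal segment spans slab j
                # Scan the active list: drop expired segments, report the rest.
                active[j] = [(x, y_bot) for (x, y_bot) in active[j] if y_bot <= y]
                out.update((x, y) for (x, _) in active[j])
    # Distribute: each slab gets its vertical segments and the horizontal
    # segments that overlap it without spanning it; then recurse.
    for j, s in enumerate(slabs):
        part = [h for h in horizontal
                if h[0] <= hi[j] and lo[j] <= h[1]
                and not (h[0] <= lo[j] and hi[j] <= h[1])]
        out |= distribution_sweep(s, part, fanout, memory)
    return out


vertical = [(2, 5, 1), (4, 3, 0), (6, 6, 4), (8, 9, 2), (9, 4, 1), (11, 7, 3)]
horizontal = [(1, 5, 2), (3, 12, 5), (0, 10, 3), (7, 9, 8)]
print(sorted(distribution_sweep(vertical, horizontal, fanout=3, memory=4)))
```

An expired segment is removed the first time a scan meets it, so each scan pays only for segments it reports plus segments it deletes; this is the idea behind the O(N/B + T'/B) cost of one sweep.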
Distribution Sweeping
• Other example: Rectangle intersection
– Given set of axis-parallel rectangles, report all intersections.
Distribution Sweeping
• Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each
• Sweep plane top-down while reporting intersections between
– part of rectangles spanning slab(s) and other rectangles
• Distribute data to M/B-1 slabs
– Non-spanning parts of rectangles
• Recurse in each slab
Distribution Sweeping
• Seems hard to perform sweep in O(N/B+T’/B) I/Os
• Solution: Multislabs (contiguous set of slabs)
– Reduce fanout of distribution to $\Theta(\sqrt{M/B})$
– Recursion height still $O(\log_{M/B}\frac{N}{B})$
– Room for block from each multislab (active list) in memory
Distribution Sweeping
• Sweep while maintaining rectangle active list for each multislab
– Top side of spanning rectangle: Insert in active multislab list
– Each rectangle: Scan through all relevant multislab lists
* Removing “expired” rectangles
* Reporting intersections with “non-expired” rectangles
⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os
Homework
Design an $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/O algorithm that given N rectangles in the plane computes the measure (area) of their union. Make sure to argue for correctness and I/O-complexity of your algorithm.
Hint: First find for each input y-coordinate y the length of the intersection between the rectangles and a horizontal line through y. Use distribution sweeping with a combine step to do so.
Center for Massive Data Algorithmics
• High level objectives:
– Advance algorithmic knowledge in “massive data” processing area
– Train researchers in world-leading international environment
– Be catalyst for multidisciplinary/industry collaboration
• Building on:
– Strong international team
– Vibrant international environment
– Focus areas (among others):
* I/O-efficient algorithms
* Streaming algorithms
* Cache-oblivious algorithms
* Algorithm engineering
We are hiring!!