Improve Run Generation
Uploaded by manishbhardwaj8131
7/27/2019 Improve Run Generation
Overlap input, output, and internal CPU work.
Reduce the number of runs (equivalently, increase the average run length).
[Figure: DISK, MEMORY, DISK (records move from disk into memory and back out to disk)]
-
Internal Quick Sort
Example: 6 2 8 5 11 10 4 1 9 7 3
Use 6 as the pivot (median of 3: the first, middle, and last keys are 6, 10, and 3).
Input the first, middle, and last blocks first.
In-place partitioning.
Input blocks from the ends toward the middle.
Sort the left and right groups recursively.
Output can begin as soon as the leftmost block is ready.
After partitioning: 4 2 3 5 1 6 10 11 9 7 8
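The partitioning step on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration only: the function names are hypothetical, the block-by-block disk input is not modeled, and the two-pointer scan from both ends is one common in-place scheme that is assumed here rather than taken from the slides.

```python
def median_of_three(a, lo, hi):
    """Return the index of the median of a[lo], a[mid], a[hi]."""
    mid = (lo + hi) // 2
    trio = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return trio[1][1]


def partition(a, lo, hi):
    """Partition a[lo..hi] in place around a median-of-3 pivot.

    Returns the final index of the pivot. Keys left of it are <= pivot,
    keys right of it are >= pivot.
    """
    p = median_of_three(a, lo, hi)
    a[lo], a[p] = a[p], a[lo]          # park the pivot at the left end
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= j and a[i] < pivot:  # scan from the left end
            i += 1
        while i <= j and a[j] > pivot:  # scan from the right end
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    a[lo], a[j] = a[j], a[lo]          # move the pivot between the groups
    return j


keys = [6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3]
pivot_index = partition(keys, 0, len(keys) - 1)
print(keys, pivot_index)
```

Because the scan works from both ends toward the middle, the same access pattern fits the slide's idea of reading blocks from the ends of the input inward.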
-
Alternative Internal Sort Scheme
[Figure: DISK, memory areas B1, B2, B3, DISK]
Partition memory into 3 areas; each may be more than 1 block in size.
-
Steady State Operation
[Figure: three concurrent activities: read from disk, write to disk, run generation]
Synchronization is done when the current input area gets
full (the current output area will be empty at this time).
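One way to picture the steady state is as a rotation of roles among the three memory areas at each synchronization point. The sketch below is an assumption about how the roles hand over (the freshly filled input area is sorted next, the sorted area is written next, and the emptied output area receives input next); the slides state only when synchronization happens. The function name and the area labels passed in are illustrative.

```python
from collections import deque


def role_schedule(areas, phases):
    """List which area plays which role in each phase.

    `areas` is given in the order [input, sort, output] for phase 0.
    At each synchronization point the roles rotate by one position.
    """
    roles = deque(areas)
    schedule = []
    for _ in range(phases):
        schedule.append(
            {"input": roles[0], "sort": roles[1], "output": roles[2]}
        )
        # input area -> sort, sorted area -> output, emptied output -> input
        roles.rotate(1)
    return schedule


for phase, assignment in enumerate(role_schedule(["B1", "B2", "B3"], 4)):
    print(phase, assignment)
```

With three areas the assignment repeats every three phases, so each area cycles through input, sorting, and output in turn.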
-
New Strategy
Use 2 input and 2 output buffers.
The rest of memory is used for a min loser tree.
Actually, 3 buffers are adequate.
[Figure: DISK, MEMORY (Input 0, Input 1, Loser Tree, Output 0, Output 1), DISK]
-
Steady State Operation
[Figure: three concurrent activities: read from disk, write to disk, run generation]
Synchronization is done when the active input buffer gets
empty (the active output buffer will be full at this time).
-
Initialize
Example keys: 4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8
[Figure sequence: the loser tree over these keys is filled in step by step; input buffers I0 and I1 and output buffers O0 and O1 are shown with the tree.]
-
Generate Run 1
[Figure sequence: the loser tree with buffers I0, I1, O0, and O1 as the first records of run 1 are output.]
-
7/27/2019 Improve Run Generation
18/30
5
O0
2
34 3 6 8 5 7 3 6 9 4 5
2
5 8
4
6
8
3
5
3
7
4
5
5
8
3
6
4
9
O1
I0 I1
1
5
4
4
1
9
2
Continue With Run 1
-
7/27/2019 Improve Run Generation
19/30
O1
3
45
O0
2
4 3 6 8 5 7 3 6 9 4 5
2
5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
1
5
1
1
9
2
Continue With Run 1
1
5
-
7/27/2019 Improve Run Generation
20/30
1
O1
3
45
O0
2
4 3 6 8 5 7
3
6 9 4 5
2
5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
1
5
1
1
9
2
Continue With Run 1
5
9
9
5
7
-
7/27/2019 Improve Run Generation
21/30
91
O1
3
45
O0
2
4
3
6 8 5 7
3
6 9 4 5
2
5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
1
5
1
1
9
2
Continue With Run 1
5
9
5
7
2
-
7/27/2019 Improve Run Generation
22/30
91
O1
3
45
O0
4
3
6 8 5 7
3
6 9 4 5 5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
5
1
6
1
3
5
9
5
7
2
-
7/27/2019 Improve Run Generation
23/30
91
O1
3
45
O0
4
3
6 8 5 7
3
6 9 4 5 5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
5
1
6
1
3
5
9
5
7
2
Continue With Run 1
2
-
7/27/2019 Improve Run Generation
24/30
2 91
O1
3
45
O0
4
3
6 8 5 7
3
6 9 4 5 5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
5
1
6
1
3
5
9
5
7
Continue With Run 1
2
6
6
5
-
7/27/2019 Improve Run Generation
25/30
2 91
O1
3
45
O0
4
3
6 8 5 7
3
6 9
4
5 5 8
4
6
8
3
5
3
7
4
5
5
8
4
6
4
9
I0 I1
5
1
6
1
3
5
9
5
7
Continue With Run 1
2
6
6
5
1
1
9
5
-
Let k be the number of external nodes in the loser tree.
Run size >= k.
Sorted input => 1 run.
Reverse of sorted input => n/k runs.
Average run size is ~2k.
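The run-length properties listed above can be checked with a small simulation. The sketch below is not the loser tree from the slides: it swaps in Python's `heapq` binary heap holding k records, each tagged with a run number, which is assumed to produce the same run boundaries. The function name `generate_runs` and the test inputs are hypothetical, and disk buffering is not modeled.

```python
import heapq
from itertools import islice

_END = object()  # sentinel marking the end of the input


def generate_runs(records, k):
    """Split `records` into sorted runs while holding k records in memory."""
    it = iter(records)
    # Fill memory with the first k records, all tagged as run 1.
    heap = [(1, key) for key in islice(it, k)]
    heapq.heapify(heap)
    runs = []
    while heap:
        run_no, key = heapq.heappop(heap)   # smallest key of the earliest run
        if run_no > len(runs):
            runs.append([])                 # first record of a new run
        runs[-1].append(key)
        nxt = next(it, _END)
        if nxt is not _END:
            # A key smaller than the one just output cannot join this run.
            next_run = run_no if nxt >= key else run_no + 1
            heapq.heappush(heap, (next_run, nxt))
    return runs


print(len(generate_runs(range(20), 4)))             # sorted input
print(len(generate_runs(range(20, 0, -1), 4)))      # reverse-sorted input
```

Tagging each record with its run number means records held back for the next run sort after every record of the current run, so a single heap serves both.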
-
Memory capacity = m records.
Run size using the fill memory, sort, and output run scheme = m.
Use the loser tree scheme.
Assume the block size is b records.
Need memory for 4 buffers (4b records).
Loser tree k = m - 4b.
Average run size = 2k = 2(m - 4b).
2k >= m when m >= 8b.
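The threshold on the last line follows from the definitions on this slide. Written out, using only k = m - 4b and the average run size 2k:

```latex
\begin{aligned}
2k \ge m
  &\iff 2(m - 4b) \ge m \\
  &\iff 2m - 8b \ge m \\
  &\iff m \ge 8b .
\end{aligned}
```

So the loser tree scheme produces longer runs on average than the fill memory, sort, and output scheme once memory holds at least 8 blocks.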
-
Assume b = 100.
m     600   1000   5000   10000
k     200    600   4600    9600
2k    400   1200   9200   19200
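The table entries can be recomputed from k = m - 4b. The helper below is a hypothetical illustration (its name and the `buffers` parameter are not from the slides); it also marks whether the average run size 2k reaches m for each memory size.

```python
def run_size_table(b, memory_sizes, buffers=4):
    """Return (m, k, 2k) rows, where k = m - buffers * b."""
    rows = []
    for m in memory_sizes:
        k = m - buffers * b        # records left for the loser tree
        rows.append((m, k, 2 * k)) # 2k is the average run size
    return rows


for m, k, avg in run_size_table(100, [600, 1000, 5000, 10000]):
    marker = ">=" if avg >= m else "<"
    print(f"m={m:6d}  k={k:5d}  2k={avg:6d}  (2k {marker} m)")
```

With b = 100 the break-even point is m = 800, which is why m = 600 is the only column where 2k falls short of m.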
-
Total internal processing time using the fill memory, sort, and output run scheme = O((n/m) m log m) = O(n log m).
Total internal processing time using the loser tree = O(n log k).
The loser tree scheme generates runs that differ in their lengths.
-
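Merging Runs Of Different Length
Run lengths: 4 3 6 9
[Figure: two merge trees over these runs. In one, 4 and 3 merge into 7, 6 and 9 merge into 15, and 7 and 15 merge into 22. In the other, 4 and 3 merge into 7, 7 and 6 merge into 13, and 13 and 9 merge into 22.]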
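The two merge trees on this slide can be compared by the number of records moved, taken here as the sum of the lengths produced by each 2-way merge. That cost measure is an assumption, as is the greedy rule of always merging the two shortest runs (the Huffman-style strategy), which the slide does not name. Function names and the nested-tuple tree encoding are illustrative.

```python
import heapq


def tree_cost(tree):
    """Return (length, cost) for a merge tree of nested 2-tuples of run lengths."""
    if isinstance(tree, int):
        return tree, 0                     # a leaf is an existing run
    left_len, left_cost = tree_cost(tree[0])
    right_len, right_cost = tree_cost(tree[1])
    merged = left_len + right_len          # records moved by this merge
    return merged, left_cost + right_cost + merged


def greedy_merge_cost(lengths):
    """Total cost when the two shortest runs are always merged first."""
    heap = list(lengths)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        cost += merged
        heapq.heappush(heap, merged)
    return cost


print(tree_cost(((4, 3), (6, 9))))     # balanced tree
print(tree_cost((((4, 3), 6), 9)))     # skewed tree
print(greedy_merge_cost([4, 3, 6, 9]))
```

Under this measure the skewed tree moves fewer records than the balanced one, because the longest run takes part in only one merge.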