Chapter 7 Space and Time Tradeoffs James Gain & Sonia Berman (jgain,sonia@cs.uct.ac.za)

Page 1:

Chapter 7

Space and Time Tradeoffs

James Gain & Sonia Berman (jgain,sonia@cs.uct.ac.za)

Page 2:

Objectives

• To introduce the mindset of trading space for time
• To show a variety of tradeoff solutions:
    • Sorting by Counting
    • Input Enhancement for String Matching
• To discuss the strengths and weaknesses of space and time tradeoffs

Page 3:

Space-Time Tradeoffs

• For many problems some extra space really pays off (cf. log tables, stats tables)
• Similarly, electronic computers can preprocess inputs and store additional information to accelerate solving the problem
• Input Enhancement
    • Sorting by Counting
    • Horspool’s and Boyer-Moore’s Algorithms for String Matching
• Prestructuring
    • Hashing
    • B-trees
• Dynamic Programming

Page 4:

Sorting by Counting

• Elements are in the range [L..U]
• Constraint: we cannot overwrite the original list
• Distribution Counting: compute the frequency of each element
• Algorithm:

    for j ← 0 to U-L do D[j] ← 0                       // init frequencies
    for i ← 0 to n-1 do D[A[i]-L] ← D[A[i]-L] + 1      // compute frequencies
    for j ← 1 to U-L do D[j] ← D[j-1] + D[j]           // reuse for distribution
    for i ← n-1 downto 0 do
        j ← A[i] - L
        S[D[j]-1] ← A[i]
        D[j] ← D[j] - 1
    return S
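The pseudocode above can be sketched in Python roughly as follows. The function name and the parameter names `low` and `high` (standing in for L and U) are illustrative choices, not taken from the slides.

```python
def distribution_counting_sort(a, low, high):
    """Return a sorted copy of a, whose values all lie in [low..high].

    The input list is left untouched, matching the slide's constraint.
    """
    d = [0] * (high - low + 1)
    for x in a:                      # compute frequencies
        d[x - low] += 1
    for j in range(1, len(d)):       # reuse d for the distribution
        d[j] += d[j - 1]
    s = [None] * len(a)
    for x in reversed(a):            # right to left, as in the pseudocode
        j = x - low
        s[d[j] - 1] = x
        d[j] -= 1
    return s


print(distribution_counting_sort([13, 11, 12, 13, 12, 12], 11, 13))
# prints [11, 12, 12, 12, 13, 13]
```

The extra arrays `d` and `s` are the space being traded for time: no element comparisons are made at all.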

Page 5:

Sorting by Counting

Example: A = 13 11 12 13 12 12

    Array Values    11  12  13
    Frequencies      1   3   2
    Distribution     1   4   6

                 D[0]  D[1]  D[2]    S[0]  S[1]  S[2]  S[3]  S[4]  S[5]
    A[5] = 12     1     4     6                         12
    A[4] = 12     1     3     6                   12
    A[3] = 13     1     2     6                                     13
    A[2] = 12     1     2     5             12
    A[1] = 11     1     1     5       11
    A[0] = 13     0     1     5                               13

• Efficiency: Θ(n), or more precisely Θ(n + U - L), so linear when the value range is small relative to n
• Best so far but only for specific types of input

Page 6:

Reminder: String Matching

• Pattern: m characters to search for
• Text: n characters to search in
• Brute force algorithm:
    1. Align pattern at beginning of text
    2. Moving left to right, compare each character of pattern to the corresponding character in text until
        • All characters are found to match (successful search); or
        • A mismatch is detected
    3. While pattern is not found and the text is not yet exhausted, realign pattern one position to the right and repeat step 2.
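As a baseline for the tradeoff algorithms, the three brute force steps can be sketched in Python as below. The function name and the convention of returning -1 on failure are assumptions made for illustration.

```python
def brute_force_match(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):       # steps 1 and 3: each alignment in turn
        j = 0
        # step 2: compare left to right until a mismatch or a full match
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            return i
    return -1


print(brute_force_match("A ZIG, A ZAG, AGAIN A ZIGZAG", "ZIGZAG"))  # prints 22
```

No extra space is used here; the pattern only ever moves one position at a time.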

Page 7:

Horspool’s Algorithm

• As before, start the pattern at the left and keep moving it rightwards until it is found or the text is exhausted
• But, at each position, compare pattern characters to text characters from right to left
• Use a shift table to determine how far to shift the pattern when a mismatch occurs
• Shift table:
    • Has a shift value for each possible character
    • An example of input enhancement

Page 8:

How far to shift a pattern of length m?

• Look at the last character C in the mismatched text segment (the text character aligned with the last pattern character)
• C is not in the pattern → shift by m

    .....D......................   (D not in pattern)
    BAOBAB
          BAOBAB

• C is in the first m-1 positions of the pattern → align its rightmost occurrence in the pattern with the text

    .....O......................   (O occurs once in pattern)
    BAOBAB
       BAOBAB

  Another:

    .....A......................   (A occurs twice in pattern)
    BAOBAB
     BAOBAB   (right)

    .....A......................   (A occurs twice in pattern)
    BAOBAB
        BAOBAB   (wrong!)

Page 9:

How far to shift a pattern of length m? (continued)

Another:

    .....A......................
    BAOBAB
     BAOBAB

    .BAOBAB.....................
    BAOBAB
     BAOBAB

(Aligning the rightmost A shifts by 1 and finds the match that starts at position 1; a longer shift would skip it.)

Page 10:

How far to shift a pattern of length m? (continued)

Why don’t we include the last position?

    ......B......................
    CONFLAB
           CONFLAB

    .....B......................
    BAOBAB
      BAOBAB

(If the last position were included, a text character equal to the last pattern character would give a shift of 0.)

Page 11:

Shift Table

• Stores the number of characters to shift by, depending on the first character compared (i.e. the rightmost)
• Construct the shift table before searching starts
• Indexed by the alphabet of the text
• All entries are initialized to the pattern length. E.g., for BAOBAB:

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

• For each c occurring in the first m-1 positions of the pattern, update the table entry to the distance of the rightmost occurrence of c from the end of the pattern
    • Should we do this left to right (L→R) or right to left (R→L)?
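One way to build such a table is sketched below in Python. The function name and the default uppercase alphabet are assumptions for illustration. The sketch scans the first m-1 pattern characters from left to right, so a later (more rightward) occurrence simply overwrites an earlier one.

```python
import string


def build_shift_table(pattern, alphabet=string.ascii_uppercase):
    """Map each alphabet character to its Horspool shift for pattern."""
    m = len(pattern)
    table = {c: m for c in alphabet}     # every entry starts at pattern length
    for i in range(m - 1):               # last position deliberately excluded
        table[pattern[i]] = m - 1 - i    # distance from the end of the pattern
    return table


table = build_shift_table("BAOBAB")
print(table["A"], table["B"], table["O"], table["C"])  # prints 1 2 3 6
```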

Page 12:

Horspool’s Algorithm

1. Construct the shift table
2. Align the pattern against the beginning of the text
3. Repeat until a match is found or the pattern reaches beyond the text:
    • Starting from the pattern end, compare corresponding characters until all m are matched (success!) or a mismatch is found
    • On a mismatch, retrieve the shift table entry T(c), where c is the text character aligned with the end of the pattern. Shift the pattern right by T(c)
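The steps above might look like this in Python. This is a minimal sketch: the function name is hypothetical, and instead of a full alphabet-indexed table it stores only the characters that occur in the pattern, treating every other character as having shift m.

```python
def horspool(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Step 1: shift table over the first m-1 pattern characters.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    i = m - 1                        # Step 2: text index under the pattern end
    while i < n:                     # Step 3
        k = 0                        # number of characters matched so far
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        i += shift.get(text[i], m)   # shift by T(c), c aligned with pattern end
    return -1


print(horspool("A ZIG, A ZAG, AGAIN A ZIGZAG", "ZIGZAG"))  # prints 22
```

Storing only the pattern's characters keeps the table small for large alphabets; a fixed array indexed by character code would be the more literal reading of the slide.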

Page 13:

Example: Horspool’s Algorithm

Pattern: ZIGZAG
Text: A ZIG, A ZAG, AGAIN A ZIGZAG
Shift Table:

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    1 6 6 6 6 6 3 6 4 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 2

Page 14:

Boyer-Moore Algorithm

• Based on the same two ideas:
    • Compare pattern characters to text from right to left
    • Given a pattern, create a shift table that determines how much to shift the pattern
• Except:
    • Uses an additional good-suffix shift table, with the same idea applied to the number of matched characters
• Efficiency:
    • Horspool’s approach is simpler and at least as efficient in the average case

Page 15:

Boyer-Moore Algorithm

• Good-suffix shift:
    • If k symbols matched before failing, shift the pattern up G(k) positions
• Bad-symbol shift:
    • If k symbols matched before failing, shift the pattern up T(c) – k positions (where c is the text character that didn’t match)
    • Actually shift = max(T(c) – k, 1)
• Algorithm:
    1. Build tables T and G
    2. When searching, use either bad-symbol shift or good-suffix shift, whichever is larger
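The search step can be sketched in Python as follows, assuming the two tables have already been built and are passed in as dictionaries. The function and parameter names are hypothetical. When no characters matched (k = 0) there is no suffix to look up, so the sketch uses the bad-symbol shift alone.

```python
def boyer_moore(text, pattern, bad_symbol, good_suffix):
    """Return the index of the first occurrence of pattern in text, or -1.

    bad_symbol maps a character c to T(c); characters not listed default to m.
    good_suffix maps k = 1..m-1 to G(k).
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    i = m - 1                                   # text index under the pattern end
    while i < n:
        k = 0                                   # symbols matched, right to left
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        c = text[i - k]                         # text character that didn't match
        d1 = max(bad_symbol.get(c, m) - k, 1)   # bad-symbol shift
        d2 = good_suffix[k] if k > 0 else 0     # good-suffix shift
        i += max(d1, d2)                        # whichever is larger
    return -1


# Tables for BAOBAB as given elsewhere in these slides.
T = {"A": 1, "B": 2, "O": 3}
G = {1: 2, 2: 5, 3: 5, 4: 5, 5: 5}
print(boyer_moore("BESS KNEW ABOUT BAOBABS", "BAOBAB", T, G))  # prints 16
```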

Page 16:

Example good-suffix tables

E.g. 1 Pattern: BIGWIG

    k      1 2 3 4 5
    G(k)   3 3 6 6 6

E.g. 2 Pattern: BAOBAB

    k      1 2 3 4 5
    G(k)   2 5 5 5 5
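The slides do not spell out how G(k) is computed, so the following Python sketch rests on an assumed rule: shift to the nearest earlier occurrence of the matched suffix, and if there is none, align the longest shorter prefix of the pattern that is also a suffix. It omits the common refinement that the earlier occurrence be preceded by a different character; that simpler rule was chosen because it reproduces the tables shown on these slides. The function name is hypothetical.

```python
def good_suffix_table(pattern):
    """Return {k: G(k)} for k = 1..m-1 under the simplified rule described above."""
    m = len(pattern)
    table = {}
    for k in range(1, m):
        suffix = pattern[m - k:]
        shift = None
        # Look for the rightmost earlier occurrence of the k-character suffix.
        for s in range(m - k - 1, -1, -1):
            if pattern[s:s + k] == suffix:
                shift = (m - k) - s
                break
        if shift is None:
            # Fall back to the longest prefix (shorter than k) that is also a suffix.
            shift = m
            for length in range(k - 1, 0, -1):
                if pattern[:length] == pattern[m - length:]:
                    shift = m - length
                    break
        table[k] = shift
    return table


print(list(good_suffix_table("BIGWIG").values()))  # prints [3, 3, 6, 6, 6]
print(list(good_suffix_table("BAOBAB").values()))  # prints [2, 5, 5, 5, 5]
```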

Page 17:

Find BAOBAB in: BESS KNEW ABOUT BAOBABS

Bad-Symbol Shift Table:

    c      A B … O …
    T(c)   1 2 6 3 6

Good-Suffix Shift Table:

    k      1 2 3 4 5
    G(k)   2 5 5 5 5

Page 18:

Find ZIGZAG in: A ZIG, A ZAG, AGAIN A ZIGZAG

Bad-Symbol Shift Table:

    c      A … G … I … Z
    T(c)   1 6 3 6 4 6 2

Good-Suffix Shift Table:

    k      1 2 3 4 5
    G(k)   3 6 6 6 6

Page 19:

Strengths and Weaknesses of Space-Time Tradeoffs

• Strengths:
    • Suited to amortised situations
    • Particularly effective in accelerating access to data
• Weaknesses:
    • Can be space expensive, and this might have empirical effects on speed