22c: 21 Data Structures


Page 1: 22c: 21 Data Structures

22c: 21 Data Structures

Lecture on 03/13/2009

Page 2: 22c: 21 Data Structures

Outline

• Infix to Postfix Conversion with Stack

• B-Trees

Page 3: 22c: 21 Data Structures

Infix to Postfix

(A + B) / D -> A B + D /

(A + B) / (D + E) -> A B + D E + /

(A - B / C + E)/(A + B) -> A B C / - E + A B + /

A * ( B + C ) / D -> A B C + * D /

Page 4: 22c: 21 Data Structures

Why do we need the stack?

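a + b -> ab +

[Diagram: the operands a and b pass straight to the output; the operator + is held on the stack and emitted after them.]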

Page 5: 22c: 21 Data Structures

Example

Infix String : a+b*c-d

Postfix String : abc*+d-

b*c -> bc*

a+bc* -> abc*+

abc*+-d -> abc*+d-

Page 6: 22c: 21 Data Structures

Algorithm

• Initially the Stack is empty and our Postfix string has no characters.

• Now, the first character scanned is 'a'. 'a' is added to the Postfix string.

• The next character scanned is '+'. Since it is an operator and the stack is empty, it is pushed onto the stack.

Page 7: 22c: 21 Data Structures

Stack: +

Postfix string: a

Page 8: 22c: 21 Data Structures

Algorithm

• Next character scanned is 'b' which will be placed in the Postfix string.

• Next character is '*' which is an operator.

• Now, the top element of the stack is '+' which has lower precedence than '*', so '*' will be pushed to the stack.

Page 9: 22c: 21 Data Structures

Stack (bottom to top): + *

Postfix string: ab

Page 10: 22c: 21 Data Structures

Algorithm

• The next character is 'c' which is placed in the Postfix string.

• Next character scanned is '-'. The topmost character in the stack is '*' which has a higher precedence than '-'.

Page 11: 22c: 21 Data Structures

Algorithm

• Thus '*' will be popped out from the stack and added to the Postfix string.

• The stack is still not empty.

• Now the topmost element of the stack is '+', which has the same precedence as '-'. So pop the '+' from the stack and add it to the Postfix string.

• The '-' will be pushed to the stack.

Page 12: 22c: 21 Data Structures

Stack: -

Postfix string: abc*+

Page 13: 22c: 21 Data Structures

Algorithm

• Next character is 'd' which is added to Postfix string.

• Now all characters have been scanned, so we must pop the remaining elements from the stack and add them to the Postfix string.

Page 14: 22c: 21 Data Structures

Stack: (empty)

Postfix string: abc*+d-

Page 15: 22c: 21 Data Structures

Algorithm

• Initialize an empty stack.

• Scan the Infix string from left to right.

• If the scanned character is an operand, add it to the Postfix string.

• If the scanned character is an operator and the stack is empty, push the character onto the stack.

Page 16: 22c: 21 Data Structures

Algorithm

• If the scanned character is an operator and the stack is not empty, compare the precedence of the character with the element on top of the stack (topStack).

• If topStack has precedence higher than or equal to that of the scanned character, pop the stack and add the popped operator to the Postfix string; otherwise push the scanned character onto the stack.

Page 17: 22c: 21 Data Structures

Algorithm

• Repeat this step as long as the stack is not empty and topStack has precedence higher than or equal to that of the scanned character; then push the scanned character onto the stack.

• Repeat these steps until all the characters have been scanned.

Page 18: 22c: 21 Data Structures

Algorithm

• (After all characters are scanned, we have to add any character that the stack may have to the Postfix string.)

• If the stack is not empty, add topStack to the Postfix string and pop the stack.

• Repeat this step as long as stack is not empty.

• Return the Postfix string.
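The steps above can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: it assumes single-character operands, the four left-associative operators + - * /, and well-formed input. The parenthesis handling is an extension that the slide algorithm does not describe; it is included only so that the parenthesized examples from the earlier slide can be checked. The names `PRECEDENCE` and `infix_to_postfix` are made up for this example.

```python
# Assumed precedence table: only the four binary operators used in the slides.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def infix_to_postfix(infix):
    """Convert an infix string to postfix using an operator stack."""
    stack = []    # operators waiting to be emitted (and "(" in the extension)
    postfix = []  # output characters
    for ch in infix:
        if ch.isspace():
            continue
        if ch in PRECEDENCE:
            # Pop while the top operator has higher or equal precedence.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[ch]):
                postfix.append(stack.pop())
            stack.append(ch)
        elif ch == "(":
            # Extension (not in the slide algorithm): mark a sub-expression.
            stack.append(ch)
        elif ch == ")":
            # Extension: emit operators back to the matching "(".
            while stack[-1] != "(":
                postfix.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            postfix.append(ch)  # operand goes straight to the output
    # All characters scanned: flush the remaining operators.
    while stack:
        postfix.append(stack.pop())
    return "".join(postfix)

print(infix_to_postfix("a+b*c-d"))  # abc*+d-
```

Popping on equal precedence as well as higher precedence is what makes a+b-c come out as ab+c- rather than abc-+, matching the worked example where '+' is popped before '-' is pushed.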

Page 19: 22c: 21 Data Structures

B-Trees

• In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic time.

Page 20: 22c: 21 Data Structures

B-Trees

• Unlike self-balancing binary search trees, it is optimized for systems that read and write large blocks of data. It is most commonly used in databases and file systems.

• Example?

Page 21: 22c: 21 Data Structures

Secondary Storage Access

• Binary Search Trees

• AVL Trees

• M-ary Search Trees

• B-Trees

The smaller the height of the tree, the faster an element can be accessed.
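To see how strongly the branching factor affects height, the sketch below computes the smallest number of levels of branching needed to reach a given number of items, assuming every node is completely full (a best case). The item count and the branching factors are hypothetical values chosen for illustration, and `min_height` is a made-up helper name.

```python
def min_height(n_items, branching):
    """Smallest h such that branching**h >= n_items."""
    height, capacity = 0, 1
    while capacity < n_items:
        capacity *= branching
        height += 1
    return height

n = 10_000_000  # hypothetical number of stored items
for m in (2, 5, 228):
    print(m, min_height(n, m))
# 2 24
# 5 11
# 228 3
```

If each level costs one disk access, going from a binary tree to a tree with a few hundred children per node cuts the number of accesses from about two dozen to three in this example.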

Page 22: 22c: 21 Data Structures

B-tree of order 5

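Root: [41 66 87]

Internal node [8 18 26 35]: leaves [2 4 6] [8 10 12 14 16] [18 20 22 24] [26 28 30 31 32] [35 36 37 38 39]

Internal node [48 51 54]: leaves [41 42 44 46] [48 49 50] [51 52 53] [54 56 58 59]

Internal node [72 78 83]: leaves [66 68 69 70] [72 73 74 76] [78 79 81] [83 84 85]

Internal node [92 97]: leaves [87 89 90] [92 93 95] [97 98 99]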

Page 23: 22c: 21 Data Structures

Properties

• Data items are stored at the leaves

• The non leaf nodes store up to M-1 keys

• The root is either a leaf or has between two and M children

Page 24: 22c: 21 Data Structures

Properties

• All non-leaf nodes (except the root) have at least ⌈M/2⌉ and at most M children.

• All leaves are at the same depth and have at least ⌈L/2⌉ and at most L data items, for some L.
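The occupancy limits can be written down as a small helper. This sketch assumes that "M/2" and "L/2" are rounded up, and that the order-5 example tree uses L = 5 (its fullest leaves hold five items); `occupancy_bounds` is a made-up name.

```python
def occupancy_bounds(m, l):
    """Return (min, max) counts for each node kind, rounding M/2 and L/2 up."""
    return {
        "root_children": (2, m),                 # applies when the root is not a leaf
        "internal_children": ((m + 1) // 2, m),
        "leaf_items": ((l + 1) // 2, l),
    }

bounds = occupancy_bounds(5, 5)
print(bounds["internal_children"])  # (3, 5)

# Leaves under the internal node "92 97" in the order-5 example tree.
leaves = [[87, 89, 90], [92, 93, 95], [97, 98, 99]]
lo, hi = bounds["leaf_items"]
leaves_ok = all(lo <= len(leaf) <= hi for leaf in leaves)
print(leaves_ok)  # True
```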

Page 25: 22c: 21 Data Structures

Properties

• Each node represents a disk block

• So we choose M and L based on the disk block size and the size of the keys and items being stored
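As a rough illustration of how M and L might follow from the block size, the sketch below assumes a non-leaf node stores M-1 keys and M child pointers in one block, and a leaf stores up to L records in one block. All byte sizes are hypothetical, and `choose_m_and_l` is a made-up name.

```python
def choose_m_and_l(block_size, key_size, pointer_size, record_size):
    """Largest M and L that fit in one disk block (all sizes in bytes)."""
    # Non-leaf node: (M - 1) * key_size + M * pointer_size <= block_size
    m = (block_size + key_size) // (key_size + pointer_size)
    # Leaf: L * record_size <= block_size
    l = block_size // record_size
    return m, l

# Hypothetical sizes: 8192-byte block, 32-byte key, 4-byte pointer, 256-byte record.
print(choose_m_and_l(8192, 32, 4, 256))  # (228, 32)
```

With these numbers, 227 keys and 228 pointers take 227*32 + 228*4 = 8176 bytes, which fits in the block, while M = 229 would need 8212 bytes and would not.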

Page 26: 22c: 21 Data Structures

Inserting an Element

• First search for the element to check whether it already exists (no duplicates allowed)
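A search of this kind can be sketched as a descent from the root to a leaf. The sketch assumes the convention visible in the example tree, where key i in a non-leaf node is the smallest item in child i+1, and it uses a hypothetical tuple representation: ("internal", keys, children) or ("leaf", items).

```python
from bisect import bisect_right

def btree_contains(node, x):
    """Descend to the leaf that could hold x, then look for it there."""
    while node[0] == "internal":
        _, keys, children = node
        # Number of keys <= x is the index of the child to follow.
        node = children[bisect_right(keys, x)]
    return x in node[1]

# The subtree under the internal node "48 51 54" of the order-5 example tree.
subtree = ("internal", [48, 51, 54], [
    ("leaf", [41, 42, 44, 46]),
    ("leaf", [48, 49, 50]),
    ("leaf", [51, 52, 53]),
    ("leaf", [54, 56, 58, 59]),
])

print(btree_contains(subtree, 57))  # False, so 57 may be inserted
print(btree_contains(subtree, 58))  # True
```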

Page 27: 22c: 21 Data Structures

Insert 57

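Root: [41 66 87]

Internal node [8 18 26 35]: leaves [2 4 6] [8 10 12 14 16] [18 20 22 24] [26 28 30 31 32] [35 36 37 38 39]

Internal node [48 51 54]: leaves [41 42 44 46] [48 49 50] [51 52 53] [54 56 57 58 59]

Internal node [72 78 83]: leaves [66 68 69 70] [72 73 74 76] [78 79 81] [83 84 85]

Internal node [92 97]: leaves [87 89 90] [92 93 95] [97 98 99]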

Page 28: 22c: 21 Data Structures

Insert 57

• We had to rearrange the data in the leaf.

• Cost of doing this is negligible compared to a disk access.

Page 29: 22c: 21 Data Structures

Insert 55

• Leaf is already full

• Since we now have L+1 items we split them into two leaves

• Distribute data evenly between leaves

• Update parent
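The leaf-level part of insertion can be sketched as below. This is an illustrative helper, not the lecture's code: it assumes the duplicate check has already been done, represents a leaf as a sorted Python list, and leaves the parent update to the caller. `insert_into_leaf` is a made-up name.

```python
from bisect import insort

def insert_into_leaf(leaf, item, max_items):
    """Return a list of one leaf (no split) or two leaves (after a split)."""
    items = list(leaf)
    insort(items, item)              # rearrange the data inside the leaf
    if len(items) <= max_items:
        return [items]
    mid = (len(items) + 1) // 2      # left leaf takes the extra item if odd
    return [items[:mid], items[mid:]]

leaf = [54, 56, 58, 59]
after_57 = insert_into_leaf(leaf, 57, 5)
print(after_57)                      # [[54, 56, 57, 58, 59]]

after_55 = insert_into_leaf(after_57[0], 55, 5)
print(after_55)                      # [[54, 55, 56], [57, 58, 59]]
```

After a split, the first item of the new right leaf (57 here) is the key the parent needs in order to tell the two leaves apart.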

Page 30: 22c: 21 Data Structures

Insert 55

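Root: [41 66 87]

Internal node [8 18 26 35]: leaves [2 4 6] [8 10 12 14 16] [18 20 22 24] [26 28 30 31 32] [35 36 37 38 39]

Internal node [48 51 54 57]: leaves [41 42 44 46] [48 49 50] [51 52 53] [54 55 56] [57 58 59]

Internal node [72 78 83]: leaves [66 68 69 70] [72 73 74 76] [78 79 81] [83 84 85]

Internal node [92 97]: leaves [87 89 90] [92 93 95] [97 98 99]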

Page 31: 22c: 21 Data Structures

Splitting

• Splitting is time consuming, but it is a rare occurrence.

• For every split, there are roughly L/2 non splits.

Page 32: 22c: 21 Data Structures

Insert 40

• The leaf is full, so we need to split.

• But there is no place to add an extra key.

• So, we need to add an extra child under root.

• But, root cannot have more than M=5 children!

Page 33: 22c: 21 Data Structures

Insert 40

• Hence, the solution is to split parent.

• Then, update all the values
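Splitting a non-leaf node can be sketched as follows. The sketch assumes the node temporarily holds M+1 children and M keys after the leaf split, divides the children evenly, and hands the middle key up to the parent. `split_internal` and the list-based representation are made up for this example.

```python
def split_internal(keys, children):
    """Split an overfull non-leaf node into (left, key_for_parent, right)."""
    left_count = (len(children) + 1) // 2   # children kept in the left node
    left = (keys[:left_count - 1], children[:left_count])
    up_key = keys[left_count - 1]           # moves up into the parent
    right = (keys[left_count:], children[left_count:])
    return left, up_key, right

# The node "8 18 26 35" after the leaf 35..39 was split by inserting 40.
keys = [8, 18, 26, 35, 38]
children = [[2, 4, 6], [8, 10, 12, 14, 16], [18, 20, 22, 24],
            [26, 28, 30, 31, 32], [35, 36, 37], [38, 39, 40]]

left, up_key, right = split_internal(keys, children)
print(left[0], up_key, right[0])  # [8, 18] 26 [35, 38]
```

The key 26 joins the root, which now has five children, exactly the maximum allowed for M = 5.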

Page 34: 22c: 21 Data Structures

Insert 40

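Root: [26 41 66 87]

Internal node [8 18]: leaves [2 4 6] [8 10 12 14 16] [18 20 22 24]

Internal node [35 38]: leaves [26 28 30 31 32] [35 36 37] [38 39 40]

(The rest of the tree is not shown.)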

Page 35: 22c: 21 Data Structures

Increasing height

• When a non-leaf node is split, its parent gains a child.

• What if the parent is already full?

• We continue splitting nodes up the tree till no splitting is required.

• If we split the root, we have two roots, so add another single root at the top.

Page 36: 22c: 21 Data Structures

Deletion

• Find the item, and then remove it

• What if the leaf already had minimum number of elements?

• Adopt a neighbor item if the neighbor is not itself at its minimum

• Otherwise, combine neighbors.
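The leaf-level deletion rules can be sketched as below. This is an illustration under stated assumptions: the leaves passed in are siblings under one parent, the minimum leaf size is L/2 rounded up, a left neighbor is tried before a right one, and updating the parent's keys is left to the caller. `delete_from_leaf` is a made-up name.

```python
def delete_from_leaf(leaves, index, item, max_items):
    """Remove item from leaves[index]; adopt from or merge with a neighbor."""
    min_items = (max_items + 1) // 2
    leaves = [list(leaf) for leaf in leaves]    # work on a copy
    leaves[index].remove(item)
    if len(leaves[index]) >= min_items:
        return leaves                           # still legal, nothing else to do
    if index > 0 and len(leaves[index - 1]) > min_items:
        # Adopt the largest item of the left neighbor.
        leaves[index].insert(0, leaves[index - 1].pop())
    elif index + 1 < len(leaves) and len(leaves[index + 1]) > min_items:
        # Adopt the smallest item of the right neighbor.
        leaves[index].append(leaves[index + 1].pop(0))
    elif index > 0:
        # Combine with the left neighbor.
        leaves[index - 1].extend(leaves.pop(index))
    elif index + 1 < len(leaves):
        # Combine with the right neighbor.
        leaves[index].extend(leaves.pop(index + 1))
    return leaves

siblings = [[87, 89, 90], [92, 93, 95], [97, 98, 99]]
print(delete_from_leaf(siblings, 2, 99, 5))
# [[87, 89, 90], [92, 93, 95, 97, 98]]
```

Here the neighbor 92 93 95 is itself at the minimum of three items, so the two leaves are combined; the parent then has one child fewer and may need the same adopt-or-combine treatment.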

Page 37: 22c: 21 Data Structures

Delete 99

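B-tree before deleting 99 (the subtree between 41 and 66 is not shown):

Root: [26 41 66 87]

Internal node [8 18]: leaves [2 4 6] [8 10 12 14 16] [18 20 22 24]

Internal node [35 38]: leaves [26 28 30 31 32] [35 36 37] [38 39 40]

Internal node [72 78 83]: leaves [66 68 69 70] [72 73 74 76] [78 79 81] [83 84 85]

Internal node [92 97]: leaves [87 89 90] [92 93 95] [97 98 99]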

Page 38: 22c: 21 Data Structures

Delete 99

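B-tree after deleting 99 (the subtree between 41 and 66 is not shown):

Root: [26 41 66 83]

Internal node [8 18]: leaves [2 4 6] [8 10 12 14 16] [18 20 22 24]

Internal node [35 38]: leaves [26 28 30 31 32] [35 36 37] [38 39 40]

Internal node [72 78]: leaves [66 68 69 70] [72 73 74 76] [78 79 81]

Internal node [87 92]: leaves [83 84 85] [87 89 90] [92 93 95 97 98]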

Page 39: 22c: 21 Data Structures

Questions?