22c: 21 Data Structures

Lecture on 03/13/2009

Outline

• Infix to Postfix Conversion with Stack

• B-Trees

Infix to Postfix

(A + B) / D -> A B + D /

(A + B) / (D + E) -> A B + D E + /

(A - B / C + E) / (A + B) -> A B C / - E + A B + /

A * (B + C) / D -> A B C + * D /

Why do we need the stack?

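a + b -> a b +

[Figure: while a + b is scanned, the operands a and b go straight to the output and the operator + waits on the stack until both operands have been written.]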

Example

Infix String : a+b*c-d

Postfix String : abc*+d-

b*c -> bc*

a+bc* -> abc*+

abc*+-d -> abc*+d-

Algorithm

• Initially the Stack is empty and our Postfix string has no characters.

• Now, the first character scanned is 'a'. 'a' is added to the Postfix string.

• The next character scanned is '+'. Since it is an operator and the stack is empty, it is pushed onto the stack.

Algorithm

• Next character scanned is 'b' which will be placed in the Postfix string.

• Next character is '*' which is an operator.

• Now, the top element of the stack is '+' which has lower precedence than '*', so '*' will be pushed to the stack.

Stack (bottom to top): + *

Postfix string: a b

Algorithm

• The next character is 'c' which is placed in the Postfix string.

• Next character scanned is '-'. The topmost character in the stack is '*' which has a higher precedence than '-'.

Algorithm

• Thus '*' will be popped out from the stack and added to the Postfix string.

• Even now the stack is not empty.

• Now the topmost element of the stack is '+' which has equal priority to '-'. So pop the '+' from the stack and add it to the Postfix string.

• The '-' will be pushed to the stack.

Stack (bottom to top): -

Postfix string: a b c * +

Algorithm

• Next character is 'd' which is added to Postfix string.

• Now all characters have been scanned, so we must pop the remaining elements from the stack and add them to the Postfix string.

Stack (bottom to top): -

Postfix string: a b c * + d, which becomes a b c * + d - once the '-' is popped.

Algorithm

• Scan the Infix string from left to right.

• Initialize an empty stack.

• If the scanned character is an operand, add it to the Postfix string.

• If the scanned character is an operator and if the stack is empty push the character to stack.

Algorithm

• If the scanned character is an operator and the stack is not empty, compare the precedence of the character with the element on top of the stack (topStack).

• If topStack has higher or equal precedence compared with the scanned character, pop the stack and add the popped operator to the Postfix string; otherwise, push the scanned character onto the stack.

Algorithm

• Repeat this step as long as the stack is not empty and topStack has higher or equal precedence than the scanned character; then push the scanned character onto the stack.

• Repeat this step till all the characters are scanned.

Algorithm

• (After all characters are scanned, we have to add any character that the stack may have to the Postfix string.)

• If stack is not empty add topStack to Postfix string and Pop the stack.

• Repeat this step as long as stack is not empty.

• Return the Postfix string.
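
The steps above can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: the name infix_to_postfix and the PRECEDENCE table are illustrative assumptions, operands are assumed to be single characters, and the handling of parentheses is an addition that the steps above do not describe (it is needed for the parenthesized examples shown earlier).

```python
# Assumed precedence table: '*' and '/' bind tighter than '+' and '-'.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def infix_to_postfix(infix):
    """Convert an infix string with single-character operands to postfix."""
    stack = []     # holds operators (and '(' markers) that are still waiting
    postfix = []   # output characters, in order

    for ch in infix:
        if ch.isspace():
            continue
        if ch in PRECEDENCE:
            # Pop while topStack has higher or equal precedence.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[ch]):
                postfix.append(stack.pop())
            stack.append(ch)
        elif ch == "(":
            # Assumption beyond the slides: '(' acts as a barrier on the stack.
            stack.append(ch)
        elif ch == ")":
            while stack and stack[-1] != "(":
                postfix.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ')'")
            stack.pop()  # discard the matching '('
        else:
            postfix.append(ch)  # operand goes straight to the output

    # All characters scanned: flush whatever is left on the stack.
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced '('")
        postfix.append(op)
    return "".join(postfix)


print(infix_to_postfix("a+b*c-d"))
```

Calling it on the lecture's example string a+b*c-d reproduces the trace above: '*' is popped before '+', and '-' is flushed at the end.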

B-Trees

• In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic time.

B-Trees

• Unlike self-balancing binary search trees, it is optimized for systems that read and write large blocks of data. It is most commonly used in databases and file systems.

• Example?

Secondary Storage Access

• Binary Search Trees

• AVL Trees

• M-ary Search Trees

• B-Trees

The smaller the height of the tree, the quicker the access to an element.
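
To make the effect of the branching factor concrete, the sketch below computes the smallest height a perfectly balanced tree needs to reach a given number of items. The function name min_height and the item counts are illustrative assumptions, and real trees are rarely perfectly full, so these are best-case figures.

```python
def min_height(n_items, branching):
    """Smallest h such that branching**h >= n_items (best-case tree height)."""
    height, capacity = 0, 1
    while capacity < n_items:
        capacity *= branching
        height += 1
    return height


# One million items with 2-way, 5-way and 100-way branching.
for m in (2, 5, 100):
    print(m, min_height(10**6, m))
```

With one million items, 2-way branching needs 20 levels, 5-way branching needs 9, and 100-way branching needs only 3. If each level costs one disk access, that difference dominates the running time.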

B-tree of order 5

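Root: 41 66 87

Internal nodes (left to right): 8 18 26 35 | 48 51 54 | 72 78 83 | 92 97

Leaves under 8 18 26 35: 2 4 6 | 8 10 12 14 16 | 18 20 22 24 | 26 28 30 31 32 | 35 36 37 38 39

Leaves under 48 51 54: 41 42 44 46 | 48 49 50 | 51 52 53 | 54 56 58 59

Leaves under 72 78 83: 66 68 69 70 | 72 73 74 76 | 78 79 81 | 83 84 85

Leaves under 92 97: 87 89 90 | 92 93 95 | 97 98 99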
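
As a rough illustration of how a search moves through the order-5 tree above, the sketch below encodes the tree as nested Python data and walks from the root to a leaf. The representation (a tuple of keys and children for internal nodes, a plain list for leaves) and the name find_leaf are assumptions made for this example. It relies on each key in the figure being the smallest item in the subtree to its right.

```python
from bisect import bisect_right

# Internal node: (keys, children). Leaf: sorted list of data items.
TREE = ([41, 66, 87], [
    ([8, 18, 26, 35], [[2, 4, 6], [8, 10, 12, 14, 16], [18, 20, 22, 24],
                       [26, 28, 30, 31, 32], [35, 36, 37, 38, 39]]),
    ([48, 51, 54], [[41, 42, 44, 46], [48, 49, 50], [51, 52, 53],
                    [54, 56, 58, 59]]),
    ([72, 78, 83], [[66, 68, 69, 70], [72, 73, 74, 76], [78, 79, 81],
                    [83, 84, 85]]),
    ([92, 97], [[87, 89, 90], [92, 93, 95], [97, 98, 99]]),
])


def find_leaf(node, item):
    """Follow keys from the root down to the leaf that would hold `item`."""
    while isinstance(node, tuple):
        keys, children = node
        # Number of keys <= item is the index of the child to descend into.
        node = children[bisect_right(keys, item)]
    return node


leaf = find_leaf(TREE, 57)
print(leaf, 57 in leaf)
```

Searching for 57 visits only three nodes (the root, the node 48 51 54, and one leaf) and ends in the leaf 54 56 58 59, which is where 57 would have to be inserted.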

Properties

• Data items are stored at the leaves

• The non leaf nodes store up to M-1 keys

• The root is either a leaf or has between two and M children

Properties

• All non leaf nodes (except the root) have between ⌈M/2⌉ and M children.

• All leaves are at the same depth and have between ⌈L/2⌉ and L data items, for some L.

Properties

• Each node represents a disk block

• So we choose M and L on the basis of the size of the items that are being stored
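
One way to turn this into numbers is sketched below. All sizes are hypothetical (an 8192-byte block, 32-byte keys, 4-byte child pointers, 256-byte data records), and the function names are made up for the example. A non leaf node with M children stores M-1 keys and M pointers, so M is the largest value for which they fit in one block; L is simply how many records fit in one block.

```python
def choose_m(block_size, key_size, pointer_size):
    """Largest M with (M - 1) * key_size + M * pointer_size <= block_size."""
    return (block_size + key_size) // (key_size + pointer_size)


def choose_l(block_size, record_size):
    """Largest number of data records that fit in one leaf block."""
    return block_size // record_size


# Hypothetical sizes, in bytes.
BLOCK, KEY, POINTER, RECORD = 8192, 32, 4, 256

print("M =", choose_m(BLOCK, KEY, POINTER))
print("L =", choose_l(BLOCK, RECORD))
```

With these assumed sizes the sketch gives M = 228 and L = 32, far larger than the M = 5 used in the small example tree.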

Inserting an Element

• Search if it already exists (no duplicates allowed)

Insert 57

[Figure: the same tree with 57 added. The leaf 54 56 58 59 becomes 54 56 57 58 59; no other node changes.]

Insert 57

• We had to rearrange the data in the leaf.

• Cost of doing this is negligible compared to a disk access.

Insert 55

• Leaf is already full

• Since we now have L+1 items, we split them into two leaves

• Distribute data evenly between leaves

• Update parent
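
The leaf-level part of these steps can be sketched as follows. This is only an illustration under assumptions: leaves are plain sorted Python lists, L = 5 as in the example tree, and the name insert_into_leaf is made up here. Updating the parent is left to the caller, who would add the first item of the new right-hand leaf as a key.

```python
from bisect import bisect_left, insort

L = 5  # assumed leaf capacity, matching the order-5 example tree


def insert_into_leaf(leaf, item, capacity=L):
    """Return the leaf or leaves that replace `leaf` after inserting `item`."""
    i = bisect_left(leaf, item)
    if i < len(leaf) and leaf[i] == item:
        return [leaf]                # no duplicates allowed: nothing changes
    items = leaf[:]                  # work on a copy
    insort(items, item)              # rearrange data inside the leaf
    if len(items) <= capacity:
        return [items]               # still fits: no split
    mid = len(items) // 2            # L + 1 items: distribute them evenly
    return [items[:mid], items[mid:]]


print(insert_into_leaf([54, 56, 58, 59], 57))
print(insert_into_leaf([54, 56, 57, 58, 59], 55))
```

Inserting 57 just rearranges one leaf, while inserting 55 afterwards produces the two leaves 54 55 56 and 57 58 59, so the parent gains the key 57.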

Insert 55

[Figure: the tree after inserting 55. The full leaf 54 56 57 58 59 is split into 54 55 56 and 57 58 59, and the parent's keys change from 48 51 54 to 48 51 54 57.]

Splitting

• Splitting is time consuming, but it is a rare occurrence.

• For every split, there are roughly L/2 non splits.

Insert 40

• The leaf is full, so we need to split.

• But the leaf's parent has no place to add an extra key.

• So, we would need to add an extra child under the parent.

• But a node cannot have more than M=5 children!

Insert 40

• Hence, the solution is to split the parent.

• Then, update the keys in the nodes above.

Insert 40

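Root: 26 41 66 87

Internal nodes (left part of the tree): 8 18 | 35 38

Leaves under 8 18: 2 4 6 | 8 10 12 14 16 | 18 20 22 24

Leaves under 35 38: 26 28 30 31 32 | 35 36 37 | 38 39 40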

Increasing height

• When a non leaf node is split, its parent gains a child.

• What if the parent is already full?

• We continue splitting nodes up the tree till no splitting is required.

• If we split the root, we have two roots, so add another single root at the top.
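
A non leaf split can be sketched in the same style. The representation here is an assumption for illustration only: a non leaf node is just the list of its children, a leaf is a list of integers, and each key is taken to be the smallest item in the subtree to its right. The names smallest, keys_of and split_children are made up for the example.

```python
M = 5  # assumed maximum number of children, as in the example tree


def smallest(node):
    """Smallest data item stored anywhere below `node`."""
    while isinstance(node[0], list):
        node = node[0]
    return node[0]


def keys_of(children):
    """Keys of a non leaf node: smallest item of every child but the first."""
    return [smallest(child) for child in children[1:]]


def split_children(children, max_children=M):
    """Return one node if the children fit, otherwise two halves."""
    if len(children) <= max_children:
        return [children]
    mid = len(children) // 2
    return [children[:mid], children[mid:]]


# The leftmost non leaf node after inserting 40: six children, one too many.
overfull = [[2, 4, 6], [8, 10, 12, 14, 16], [18, 20, 22, 24],
            [26, 28, 30, 31, 32], [35, 36, 37], [38, 39, 40]]
left, right = split_children(overfull)
print(keys_of(left), keys_of(right), smallest(right))
```

For the Insert 40 example this yields two nodes with keys 8 18 and 35 38, and the value 26 that the root must add as a new key. If the root itself had to be split this way, a new root with the two halves as its children would be placed on top, which is the only way the tree grows taller.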

Deletion

• Find the item, and then remove it

• What if the leaf already had the minimum number of elements?

• Adopt a neighbor item if the neighbor is not itself at its minimum

• Otherwise, combine neighbors.
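
The two cases can be sketched at the leaf level as shown below. This is a simplified illustration under assumptions: only the left neighbor is considered, leaves are sorted Python lists, the minimum is 3 items (that is, ⌈L/2⌉ for an assumed L = 5), and the name delete_from_leaf is made up here. Fixing up the parent, which may itself drop below its minimum number of children, is not shown.

```python
MIN_ITEMS = 3  # assumed minimum leaf size: ceil(L / 2) with L = 5


def delete_from_leaf(left, leaf, item, min_items=MIN_ITEMS):
    """Remove `item` from `leaf`; return the leaves replacing (left, leaf)."""
    remaining = [x for x in leaf if x != item]
    if len(remaining) >= min_items:
        return [left, remaining]            # leaf is still large enough
    if len(left) > min_items:
        # Adopt the largest item of the left neighbor.
        return [left[:-1], [left[-1]] + remaining]
    # The neighbor is at its minimum too: combine the two leaves.
    return [left + remaining]


print(delete_from_leaf([92, 93, 95], [97, 98, 99], 99))
```

Deleting 99 from the leaf 97 98 99 leaves only two items, and the neighbor 92 93 95 is already at its minimum, so the sketch combines them into the single leaf 92 93 95 97 98.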

Delete 99

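Root: 26 41 66 87

Internal nodes (as shown): 8 18 | 35 38 | 72 78 83 | 92 97

Leaves under 8 18: 2 4 6 | 8 10 12 14 16 | 18 20 22 24

Leaves under 35 38: 26 28 30 31 32 | 35 36 37 | 38 39 40

Leaves under 72 78 83: 66 68 69 70 | 72 73 74 76 | 78 79 81 | 83 84 85

Leaves under 92 97: 87 89 90 | 92 93 95 | 97 98 99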

Delete 99

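Root: 26 41 66 83

Internal nodes (as shown): 8 18 | 35 38 | 72 78 | 87 92

Leaves under 8 18: 2 4 6 | 8 10 12 14 16 | 18 20 22 24

Leaves under 35 38: 26 28 30 31 32 | 35 36 37 | 38 39 40

Leaves under 72 78: 66 68 69 70 | 72 73 74 76 | 78 79 81

Leaves under 87 92: 83 84 85 | 87 89 90 | 92 93 95 97 98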

Questions?
