What is the benefit of using BST?


Transcript of What is the benefit of using BST?

Page 1: What is the benefit of using BST?

04/21/23 ITK 275 1

What is the benefit of using BST?

[Figure: a balanced BST holding the 30 keys 0, 1, 3, 4, 5, 7, 10, 11, 12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 30, 32, 34, 35, 36, 37, 38]

N=30

$\log_2 30 \approx 5$

Search 17, 29, 3
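The searches on this slide can be tried out with a minimal Python sketch. It assumes the 30 keys shown in the figure and builds a balanced BST by repeatedly taking the middle key as the root, so the shape may differ from the tree drawn on the slide. The names `Node`, `build_balanced` and `search` are illustrative only.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build_balanced(sorted_keys):
    """Build a BST by recursively taking the middle key as the root."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid],
                build_balanced(sorted_keys[:mid]),
                build_balanced(sorted_keys[mid + 1:]))

def search(root, target):
    """Return (found, number of nodes visited on the way down)."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if target == node.key:
            return True, visited
        node = node.left if target < node.key else node.right
    return False, visited

KEYS = [0, 1, 3, 4, 5, 7, 10, 11, 12, 13, 14, 15, 17, 18, 19,
        20, 21, 22, 23, 25, 26, 27, 28, 30, 32, 34, 35, 36, 37, 38]
root = build_balanced(KEYS)
for target in (17, 29, 3):
    print(target, search(root, target))
```

With 30 keys the tree built this way has 5 levels, so every search, successful or not, visits at most 5 nodes.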

Page 2: What is the benefit of using BST?


What is the real benefit of using BST?

[Figure: the BST on the same 30 keys as in the previous slide]

$O(\log_2 n)$

Search 18, 2, 26

Page 3: What is the benefit of using BST?


What is the benefit of using BST?

The average height of a binary search tree of n nodes is $O(\log_2 n)$ if the n keys arrive in an order drawn from a uniform distribution on the space of possible keys.

This is as good as binary search.

Proof: skipped.

f(n) : The average internal path length of an n-node BST

Page 4: What is the benefit of using BST?


The average height of a BST: $O(\log_2 n)$

f(n) : The average internal path length of an n-node BST

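f(3) = (3 + 2 + 3) / 3 ≈ 2.67

[Figure: three 3-node BSTs with each node labelled by its depth. Chain: 0 + 1 + 2 = 3. Balanced tree: 0 + 1 + 1 = 2. Chain: 0 + 1 + 2 = 3.]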
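The value of f(3) can be checked by brute force. The sketch below is an illustration, not the slide's method: it inserts every permutation of n distinct keys into a plain BST and averages the sum of node depths. The helper names are hypothetical.

```python
from itertools import permutations

def internal_path_length(keys):
    """Insert keys in the given order into a plain BST; return the sum of node depths."""
    root = None
    left, right = {}, {}          # child links, indexed by parent key
    total = 0
    for key in keys:
        if root is None:
            root = key            # the root has depth 0
            continue
        node, depth = root, 0
        while True:
            child = left if key < node else right
            depth += 1
            if node not in child:
                child[node] = key  # attach the new key here
                break
            node = child[node]
        total += depth
    return total

def average_ipl(n):
    """Average internal path length over all n! insertion orders."""
    orders = list(permutations(range(n)))
    return sum(internal_path_length(p) for p in orders) / len(orders)

print(round(average_ipl(3), 2))
```

For n = 3 the six insertion orders give path lengths 3, 3, 2, 2, 3, 3, whose average 16/6 agrees with the (3 + 2 + 3) / 3 on the slide.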

Page 5: What is the benefit of using BST?


The order in which nodes are inserted affects the shape of the BST.

[Figure: the BST built by inserting the keys in the order listed below]

1, 7, 12, 25, 27, 13, 23, 17, 15

This situation is not uncommon, e.g., when the data is roughly sorted.

Highly unbalanced BST
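A short Python sketch makes the effect of insertion order concrete. It uses the sequence from this slide and, for comparison, a second order of the same keys chosen here for illustration. `Node`, `insert`, `build` and `height` are illustrative names, and height is counted in edges.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Plain (unbalanced) BST insertion."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root

def height(root):
    """Height in edges; an empty tree has height -1."""
    return -1 if root is None else 1 + max(height(root.left), height(root.right))

slide_order = [1, 7, 12, 25, 27, 13, 23, 17, 15]
other_order = [15, 7, 23, 1, 12, 17, 25, 13, 27]   # same keys, median first

print(height(build(slide_order)))   # 7
print(height(build(other_order)))   # 3
```

The same nine keys give a tree of height 7 in one order and height 3 in the other.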

Page 6: What is the benefit of using BST?


[Figure: the highly unbalanced BST from the previous slide alongside a balanced BST on the same keys 1, 7, 12, 13, 15, 17, 23, 25, 27]

Page 7: What is the benefit of using BST?


If R is too big, then shift a node from R to L.

[Figure: a BST whose root has left subtree L and right subtree R]

1. Insert the root's key into L.

2. Find the minimum key in R.

3. Copy it to the root and delete it from R.
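The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions: step 1 uses plain BST insertion, and the names `shift_to_left`, `remove_min` and `count` are hypothetical rather than taken from the slides.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(t, key):
    """Plain BST insertion."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    return t

def remove_min(t):
    """Remove the smallest key of a non-empty subtree; return (key, new subtree)."""
    if t.left is None:
        return t.key, t.right
    smallest, t.left = remove_min(t.left)
    return smallest, t

def shift_to_left(root):
    """Move one node from the right subtree R to the left subtree L."""
    root.left = insert(root.left, root.key)          # step 1: root's key goes into L
    root.key, root.right = remove_min(root.right)    # steps 2 and 3: min of R becomes the root
    return root

def count(t):
    return 0 if t is None else 1 + count(t.left) + count(t.right)

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

root = None
for key in [7, 1, 12, 25, 27, 13]:
    root = insert(root, key)
print(count(root.left), count(root.right))   # 1 4
shift_to_left(root)
print(count(root.left), count(root.right))   # 2 3
```

The shift keeps the in-order sequence of keys unchanged, so the result is still a BST.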

Page 8: What is the benefit of using BST?


How big is “too big”?

How to measure the unbalance?

How unbalanced do we allow a BST to be?

Chung-Chih Li. An immediate approach to balancing nodes in binary search trees. Journal of Computing Sciences in Colleges, 21(3):238-245, April 2006.

Page 9: What is the benefit of using BST?


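Definition: NBk (node-balanced of degree k)

A BST is NBk if, at every node, the numbers of nodes $n_L$ and $n_R$ in the left and right subtrees satisfy

$|n_L - n_R| \le \frac{n}{k}$,

where n is the number of nodes in the subtree rooted at that node.

[Figure: a node with its left and right subtrees labelled with their sizes $n_L$ and $n_R$]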

Page 10: What is the benefit of using BST?


Insert a node into an NBk

[Figure: a tree T of n nodes, with T.Ln nodes in its left subtree T.L and T.Rn nodes in its right subtree T.R]

Insert(T, t): // Insert t into T
1. If (T = null) then
   (a) T = new node;
   (b) T.data = t;
   (c) return T;
2. If (T.data > t) then // Insert t into T.L
   (a) If (T.Ln - T.Rn + 1 > n / k) then
       (1) ShiftToRight(T);
       (2) If (T.data < t) then swap T.data and t;
   (b) T.L = Insert(T.L, t);
   else // Insert t into T.R
   (a) If (T.Rn - T.Ln + 1 > n / k) then
       (1) ShiftToLeft(T);
       (2) If (T.data > t) then swap T.data and t;
   (b) T.R = Insert(T.R, t);
3. Return T;
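The insertion procedure can be turned into a runnable Python sketch. Several details here are assumptions rather than part of the slides: keys are distinct, each node caches the size of its subtree, n is taken as the subtree size before the insertion, and a shift is skipped when the donor subtree is empty. The helpers `pop_min` and `pop_max` remove a key without rebalancing, which a full implementation might handle differently.

```python
class Node:
    def __init__(self, data):
        self.data, self.left, self.right = data, None, None
        self.size = 1                      # number of nodes in this subtree

def size(t):
    return t.size if t else 0

def pop_min(t):
    """Remove the smallest key of a non-empty subtree; return (key, new subtree)."""
    if t.left is None:
        return t.data, t.right
    key, t.left = pop_min(t.left)
    t.size -= 1
    return key, t

def pop_max(t):
    """Remove the largest key of a non-empty subtree; return (key, new subtree)."""
    if t.right is None:
        return t.data, t.left
    key, t.right = pop_max(t.right)
    t.size -= 1
    return key, t

def shift_to_left(t, k):
    """Move one node from t.right to t.left; t.size is unchanged."""
    t.left = insert(t.left, t.data, k)
    t.data, t.right = pop_min(t.right)

def shift_to_right(t, k):
    """Move one node from t.left to t.right; t.size is unchanged."""
    t.right = insert(t.right, t.data, k)
    t.data, t.left = pop_max(t.left)

def insert(t, x, k):
    if t is None:
        return Node(x)
    n = t.size                             # assumption: size before inserting x
    if x < t.data:
        # Assumed guard: shift only if the left subtree has a node to give.
        if t.left and size(t.left) - size(t.right) + 1 > n / k:
            shift_to_right(t, k)
            if t.data < x:                 # x now belongs at the root
                t.data, x = x, t.data
        t.left = insert(t.left, x, k)
    else:
        if t.right and size(t.right) - size(t.left) + 1 > n / k:
            shift_to_left(t, k)
            if t.data > x:
                t.data, x = x, t.data
        t.right = insert(t.right, x, k)
    t.size += 1
    return t

def build(keys, k):
    root = None
    for key in keys:
        root = insert(root, key, k)
    return root

def height(t):
    return -1 if t is None else 1 + max(height(t.left), height(t.right))

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.data] + inorder(t.right)

sorted_tree = build(range(1, 201), k=2)    # sorted input, the bad case for a plain BST
print(height(sorted_tree))
```

A plain BST built from the same sorted input would be a chain of height 199, whereas the shifts keep moving nodes into the left subtree as the keys arrive.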


Page 12: What is the benefit of using BST?


Delete a node from an NBk

[Figure: a tree T of n nodes, with T.Ln nodes in its left subtree T.L and T.Rn nodes in its right subtree T.R]

Delete(T, t): // Delete t from T
1. If (T = null) then return null;
2. If (T.data > t) then // Search t in T.L
   (a) T.L = Delete(T.L, t);
   (b) If (T.Rn - T.Ln + 1 > n / k) then ShiftToLeft(T);
   (c) Return T;
3. If (T.data < t) then // Search t in T.R
   (a) T.R = Delete(T.R, t);
   (b) If (T.Ln - T.Rn + 1 > n / k) then ShiftToRight(T);
   (c) Return T;
4. // t = T.data, i.e., T needs to be deleted
   If (n = 1) then delete T and return null;
5. If (T.Ln > T.Rn) then
   (a) b = the maximum node in T.L;
   (b) T.data = b.data;
   (c) T.L = Delete(T.L, b.data);
   else
   (a) b = the minimum node in T.R;
   (b) T.data = b.data;
   (c) T.R = Delete(T.R, b.data);
6. Return T;


Page 14: What is the benefit of using BST?


Analysis

BST Average Heights on n Random Keys

1. $O(\log n)$

2. $\alpha \log n - O(\log\log n)$, $\alpha = 4.31107\ldots$

Devroye and Reed, SIAM J. Comput. '95

Page 15: What is the benefit of using BST?


Analysis of NBk with n Random Keys

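[Figure: a tree of n nodes whose root has a smaller subtree of a nodes and a larger subtree of X nodes; the larger subtree splits again into $a_2$ and $X_2$, then $a_3$ and $X_3$]

$X - a \le \frac{n}{k}$, $\quad a + X + 1 = n$

The worst case: $X - a = \frac{n}{k}$, so $n = a + \left(a + \frac{n}{k}\right) + 1$ and

$a \approx \frac{k-1}{2k}\, n$, $\quad X = a + \frac{n}{k} \approx \frac{k+1}{2k}\, n$

$X_2 \approx \left(\frac{k+1}{2k}\right)^2 n$

$X_3 \approx \left(\frac{k+1}{2k}\right)^3 n$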

Page 16: What is the benefit of using BST?


Analysis of NBk with n Random Keys

the worst case

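At depth h:

$X_h \approx \left(\frac{k+1}{2k}\right)^h n$

A node can exist at depth h only if $X_h > 0$, which requires $\left(\frac{k+1}{2k}\right)^h n \ge 1$:

$h\,(\log(k+1) - \log(2k)) \ge -\log n$

$h \le \frac{\log n}{\log(2k) - \log(k+1)}$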

Page 17: What is the benefit of using BST?


Analysis of NBk with n Random Keys

the worst case

At depth h, $X_h > 0$ requires

$h \le \frac{\log n}{\log(2k) - \log(k+1)}$

k      h <
2      2.4094 log(n)
3      1.7095 log(n)
4      1.4748 log(n)
8      1.2047 log(n)
16     1.0958 log(n)
---    log(n)

BST: $\alpha \log n - O(\log\log n)$, $\alpha = 4.3\ldots$

AVL: $\alpha \log(n+2) - b$, $\alpha = 1.44$, $b = 0.33$
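The coefficients in the table follow directly from the bound on h and can be recomputed in a few lines of Python. The function name is illustrative. Base-2 logarithms are used, although any base gives the same ratio as long as log(n) uses that base too.

```python
import math

def nbk_height_coefficient(k):
    """Return c such that the worst-case bound reads h < c * log(n), for k >= 2."""
    return 1.0 / (math.log2(2 * k) - math.log2(k + 1))

for k in (2, 3, 4, 8, 16):
    print(k, round(nbk_height_coefficient(k), 4))
```

As k grows, 2k / (k + 1) approaches 2, so the coefficient approaches 1 and the bound approaches log(n).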

Page 18: What is the benefit of using BST?


Experimental results

Computational Cost

1. AVL: the algorithm is complicated

2. NBk: shifting operations

3. BST: traversing long paths

Page 19: What is the benefit of using BST?


Random and no duplication in data

[Figure: tree height versus log(n) for log(n) = 9 to 18, where n is the number of nodes and all keys are distinct (chart annotated "1.2 M"); series: BST, NB2, AVL, NB8, and log(n)]

Page 20: What is the benefit of using BST?


Random and no duplication in data

[Figure: running time in seconds versus number of nodes, from 0.1 to 1.2 million, all keys distinct; series: NB8, AVL, NB2, BST]

2.66 GHz P4, 1GB MM, MS XP, Visual C++

Page 21: What is the benefit of using BST?


Random, each key has n/100 duplicates

[Figure: tree height (logarithmic axis, 1 to 1000) versus log(n) for log(n) = 9 to 18, where n is the number of nodes and each key has n/100 duplicates (chart annotated "1.2 M"); series: BST, NB2, AVL, NB8, and log(n)]

Page 22: What is the benefit of using BST?


Random, each key has n/100 duplicates

2.66 GHz P4, 1GB MM, MS XP, Visual C++

[Figure: running time in seconds (logarithmic axis, 1 to 1000) versus number of nodes, from 0.1 to 1.2 million, each key has n/100 duplicates; series: BST, NB2, NB8, AVL]

Page 23: What is the benefit of using BST?


Random, arrive in batches of 32 sorted records

[Figure: tree height (logarithmic axis, 1 to 1000) versus log(n) for log(n) = 9 to 18, where n is the number of nodes and the data arrives in batches of size 32 (chart annotated "1.2 M"); series: BST, NB2, AVL, NB8, and log(n)]

Page 24: What is the benefit of using BST?


Random, arrive in batches of 32 sorted records

2.66 GHz P4, 1GB MM, MS XP, Visual C++

[Figure: running time in seconds (0 to 120) versus number of nodes, from 0.1 to 1.2 million, data arrives in batches of size 32; series: NB8, AVL, NB2, BST]

Page 25: What is the benefit of using BST?


Conclusion

NBk

1. can build a near-optimal BST even when k is small

2. is easy to analyze

3. is easy to implement

4. is practical in most conditions

While NBk's computational cost is much better than BST's and close to or better than AVL's, this is not guaranteed for every data set. In other words, NBk is not as robust as AVL.

Page 26: What is the benefit of using BST?


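AVL Tree: A BST in which the heights of the two subtrees of any node always differ by less than 2 (that is, by at most 1).

[Figure: example AVL trees with subtree heights h and h+1, and h' and h'+1, with balance factors +1 and -1 marked on the nodes]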
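The AVL condition can be checked with a small recursive Python sketch. It assumes a minimal `Node` class introduced here for illustration, and counts height in edges with an empty tree at height -1.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def avl_height(t):
    """Return the height of t if every node is balanced, otherwise None."""
    if t is None:
        return -1
    lh = avl_height(t.left)
    if lh is None:
        return None
    rh = avl_height(t.right)
    if rh is None:
        return None
    if abs(lh - rh) >= 2:          # height difference must be less than 2
        return None
    return 1 + max(lh, rh)

def is_avl(t):
    return avl_height(t) is not None

balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
print(is_avl(balanced), is_avl(chain))   # True False
```

This checks only the height condition. It does not verify that the keys are in BST order.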

Page 27: What is the benefit of using BST?


Rotations: RR

[Figure: RR rotation, with balance factors +1, +2 and +1 marked on the nodes]

Page 28: What is the benefit of using BST?


Rotations: RL

[Figure: RL rotation on subtrees of heights h and h+1, with the balance factors of the affected nodes before and after the rotation]

Page 29: What is the benefit of using BST?


Rotations: LL

[Figure: LL rotation on subtrees of heights h-1, h and h+1, with the balance factors of the affected nodes before and after the rotation]

Page 30: What is the benefit of using BST?


Possible complications

[Figure: an unbalanced node with balance factor +2 over subtrees of heights h and h+1]

Reassigning the links

Tracking the heights and balance factors
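Both complications show up even in the simplest rotation. The Python sketch below performs a single left rotation for a right-right heavy node. The sign convention (balance factor = height of right subtree minus height of left subtree) and all names are assumptions made for this illustration.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.height = 0                     # height in edges; a leaf has height 0

def height(t):
    return t.height if t else -1

def update(t):
    t.height = 1 + max(height(t.left), height(t.right))

def balance(t):
    """Assumed convention: positive means the right subtree is taller."""
    return height(t.right) - height(t.left)

def rotate_left(x):
    """Single left rotation around x; returns the new subtree root."""
    y = x.right
    x.right = y.left                        # re-assign the links
    y.left = x
    update(x)                               # x is now below y, so update it first
    update(y)
    return y

# Right-leaning chain 1 -> 2 -> 3: balance factor +2 at the root.
a, b, c = Node(1), Node(2), Node(3)
b.right = c
update(b)
a.right = b
update(a)
print(balance(a))                           # 2
root = rotate_left(a)
print(root.key, balance(root))              # 2 0
```

The order of the two `update` calls matters, because the height of the new root depends on the already corrected height of the node that moved down.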

Page 31: What is the benefit of using BST?


AVL, n random keys, h: average height

$\log(n+1) \le h < \alpha \log(n+2) - b$, $\quad \alpha = 1.4402$, $b = 0.32824$