A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

INTRODUCTION TO ALGORITHMS AND DATA STRUCTURES A.E. Csallner Department of Applied Informatics University of Szeged Hungary


Algorithms and Data Structures I

Algorithms

About algorithms

Algorithm: a finite sequence of finite steps that provides the solution to a given problem.

Properties: finiteness, definiteness, executability.

Communication: input and output.

Structured programming

Design strategies:

Bottom-up: synthesize smaller algorithmic parts into bigger ones.

Top-down: formulate the problem and repeatedly break it up into smaller and smaller parts.

Example: shoe a horse (top-down decomposition)

shoe a horse (a horse has four hooves)
    shoe a hoof (need a horseshoe; need to fasten the horseshoe to the hoof)
        hammer a horseshoe
        drive a cog into a hoof (need cogs)
            hammer a cog

Basic elements of structured programming

Sequence: series of actions

Selection: branching on a decision

Iteration: conditional repetition

All structured algorithms can be defined using only these three elements (E.W. Dijkstra, 1960s).

Algorithm description

Algorithm description methods

An algorithm description method defines an algorithm so that the description code is

unambiguous;

programming language independent;

still easy to implement;

state-of-the-art.

Some possible types of classification:

Age (when the description method was invented)

Purpose (e.g. structural or object-oriented)

Formulation (graphical or text code, etc.)

...

Most popular and useful description methods

Flow diagram:

old

not necessarily structured (!)

graphical

very intuitive and easy to use

A possible notation of flow diagrams

Circle: marks the START and STOP points of the algorithm.

Rectangle: any action to be executed can be given here.

Diamond: any yes/no question, with one outgoing branch labeled "yes" and one labeled "no".

A possible notation of flow diagrams

An example:

START → Need more horseshoes? (selection)
    yes → Hammer a horseshoe → Shoe a hoof (sequence) → back to the question (iteration)
    no → STOP

Most popular and useful description methods

Pseudocode:

old

definitely structured

text based

very easy to implement

Properties of a possible pseudocode

Assignment instruction: ←

Looping constructs as in Pascal:

for-do instruction (counting loop)
    for variable ← initial value to/downto final value
        do body of the loop

while-do instruction (pre-test loop)
    while stay-in test
        do body of the loop

repeat-until instruction (post-test loop)
    repeat body of the loop
    until exit test

Conditional constructs as in Pascal:

if-then-else instruction (else clause is optional)
    if test
        then test passed clause
        else test failed clause

Blocks are denoted by indentation.

Object identifiers are references.

The field separator of an object is a dot: object.field, object.method, object.method(formal parameter list)

The empty reference is NIL.

Arrays are objects.

Parameters are passed by value.

Properties of a possible pseudocode

An example:

ShoeAHorse(Hooves)
    hoof ← 1
    while hoof ≤ Hooves.Count
        do horseshoe ← HammerAHorseshoe
           Hooves[hoof] ← horseshoe
           hoof ← hoof + 1

The three instructions of the loop body form a sequence; the while-do loop is an iteration.

Type algorithms

Algorithm classification by the I/O structure:

Sequence → Value

Sequence → Sequence

More sequences → Sequence

Sequence → More sequences

Sequence → Value

sequence calculations (e.g. summation, product of a series, linking elements together, etc.),

decision (e.g. checking whether a sequence contains any element with a given property),

selection (e.g. determining the first element in a sequence with a given property, provided we know that there exists at least one),

search (e.g. finding a given element),

counting (e.g. counting the elements having a given property),

minimum or maximum search (e.g. finding the least or the largest element).
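The Sequence → Value type algorithms listed above can be sketched in a few lines each. The following Python sketch is only an illustration: the function names and the use of 0-based positions are assumptions of this example, not part of the slides.

```python
from typing import Callable, Optional, Sequence


def summation(seq: Sequence[int]) -> int:
    """Sequence calculation: add up all elements."""
    total = 0
    for x in seq:
        total += x
    return total


def decision(seq: Sequence[int], has_property: Callable[[int], bool]) -> bool:
    """Decision: is there any element with the given property?"""
    for x in seq:
        if has_property(x):
            return True
    return False


def selection(seq: Sequence[int], has_property: Callable[[int], bool]) -> int:
    """Selection: position of the first element with the property.

    Assumes that at least one such element exists, as the slide states.
    """
    i = 0
    while not has_property(seq[i]):
        i += 1
    return i


def search(seq: Sequence[int], wanted: int) -> Optional[int]:
    """Search: position of the wanted element, or None if it is absent."""
    for i, x in enumerate(seq):
        if x == wanted:
            return i
    return None


def counting(seq: Sequence[int], has_property: Callable[[int], bool]) -> int:
    """Counting: number of elements with the given property."""
    count = 0
    for x in seq:
        if has_property(x):
            count += 1
    return count


def minimum(seq: Sequence[int]) -> int:
    """Minimum search: the least element of a non-empty sequence."""
    least = seq[0]
    for x in seq[1:]:
        if x < least:
            least = x
    return least
```

Each function reads the sequence once from left to right, so all of them follow the same initialization plus iteration pattern.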

Sequence → Sequence

selection (e.g. collect the elements of a sequence with a given property),

copying (e.g. copy the elements of a sequence to create a second sequence),

sorting (e.g. arrange elements into an increasing order).

More sequences → Sequence

union (e.g. set union of sequences),

intersection (e.g. set intersection of sequences),

difference (e.g. set difference of sequences),

uniting sorted sequences (merging / combing two ordered sequences).
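Of the More sequences → Sequence algorithms, merging two ordered sequences is the least obvious one. A minimal Python sketch follows, assuming both inputs are already sorted in increasing order; the name merge_sorted is hypothetical.

```python
from typing import List, Sequence


def merge_sorted(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Merge two increasingly ordered sequences into one ordered list."""
    result: List[int] = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # One of the inputs is exhausted; copy the rest of the other.
    result.extend(a[i:])
    result.extend(b[j:])
    return result
```

Every element is looked at once, so the number of steps grows linearly with the total length of the two inputs.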

Sequence → More sequences

filtering (e.g. filtering out elements of a sequence having given properties).

Special algorithms

Iterative algorithm

Consists of two parts:

Initialization (usually initializing data)

Iteration (repeated part)

Recursive algorithms

Basic types:

direct (self-reference)

indirect (mutual references)

Two alternative parts depending on the base criterion:

Base case (if the problem is small enough)

Recurrences (direct or indirect self-reference)

An example of recursive algorithms: Towers of Hanoi

Aim: move n disks from one rod to another, using a third one.

Rules:

Only one disk is moved at a time.

No disk may be placed on top of a smaller one.

Recursive solution of the problem

1st step: move n − 1 disks

2nd step: move 1 disk

3rd step: move n − 1 disks

Pseudocode of the recursive solution

TowersOfHanoi(n,FirstRod,SecondRod,ThirdRod)
1 if n > 0
2    then TowersOfHanoi(n − 1,FirstRod,ThirdRod,SecondRod)
3         write “Move a disk from ” FirstRod “ to ” SecondRod
4         TowersOfHanoi(n − 1,ThirdRod,SecondRod,FirstRod)

Line 2 moves n − 1 disks out of the way, line 3 moves a single disk, and line 4 moves the n − 1 disks back on top of it.
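The recursive solution translates almost line by line into Python. In this sketch the moves are collected in a list instead of being written out; that choice, and the name towers_of_hanoi, are assumptions of the example.

```python
from typing import List, Optional, Tuple


def towers_of_hanoi(n: int, first: str, second: str, third: str,
                    moves: Optional[List[Tuple[str, str]]] = None
                    ) -> List[Tuple[str, str]]:
    """Move n disks from rod `first` to rod `second`, using rod `third`."""
    if moves is None:
        moves = []
    if n > 0:
        # 1st step: move n - 1 disks out of the way, onto the third rod.
        towers_of_hanoi(n - 1, first, third, second, moves)
        # 2nd step: move the single remaining disk to its destination.
        moves.append((first, second))
        # 3rd step: move the n - 1 disks from the third rod onto it.
        towers_of_hanoi(n - 1, third, second, first, moves)
    return moves


if __name__ == "__main__":
    for source, target in towers_of_hanoi(3, "A", "B", "C"):
        print(f"Move a disk from {source} to {target}")
```

The base case is n = 0, where nothing is done; every other call reduces the problem to two smaller ones.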

Backtracking algorithms

Backtracking algorithm:

Sequence of systematic trials

Builds a tree of decision branches

Steps back (backtracking) in the tree if no branch at a point is effective

An example of backtracking algorithms

Eight Queens Puzzle: eight chess queens are to be placed on a chessboard so that no two queens attack each other.

Pseudocode of the iterative solution

EightQueens
    column ← 1
    RowInColumn[column] ← 0
    repeat
        repeat inc(RowInColumn[column])
        until IsSafe(column, RowInColumn)
        if RowInColumn[column] > 8
            then column ← column − 1
            else if column < 8
                     then column ← column + 1
                          RowInColumn[column] ← 0
                     else draw chessboard
    until column = 0
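The slides do not spell out IsSafe. The Python sketch below assumes that it also returns true when the row index has run past 8, so that the inner loop ends and the outer loop can step back; this is an assumption of the example. Instead of drawing the chessboard, each solution is stored as a tuple of row numbers.

```python
from typing import List, Tuple


def is_safe(column: int, row_in_column: List[int]) -> bool:
    """Check the queen in `column` against the queens in earlier columns.

    Assumption: a row index greater than 8 counts as 'safe' so that the
    inner loop terminates and the caller can backtrack.
    """
    row = row_in_column[column]
    if row > 8:
        return True
    for other in range(1, column):
        other_row = row_in_column[other]
        # Same row or same diagonal means the two queens attack each other.
        if other_row == row or abs(other_row - row) == column - other:
            return False
    return True


def eight_queens() -> List[Tuple[int, ...]]:
    """Iterative backtracking; returns all placements found."""
    solutions: List[Tuple[int, ...]] = []
    row_in_column = [0] * 9  # index 0 is unused, columns are 1..8
    column = 1
    row_in_column[column] = 0
    while True:
        # Try the next rows in the current column until one is safe.
        while True:
            row_in_column[column] += 1
            if is_safe(column, row_in_column):
                break
        if row_in_column[column] > 8:
            column -= 1  # no row works here: step back
        elif column < 8:
            column += 1  # go on to the next column
            row_in_column[column] = 0
        else:
            solutions.append(tuple(row_in_column[1:]))
        if column == 0:
            break
    return solutions
```

After a solution is stored the loop simply continues with the next row in the last column, so the search enumerates every placement before column drops to 0.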

Analysis of algorithms

Complexity of algorithms

Questions regarding an algorithm:

Does it solve the problem?

How fast does it solve the problem?

How much storage space does it occupy to solve the problem?

The last two questions are the complexity issues of the algorithm.

Elementary storage or time: independent of the size of the input.

Example 1: If an algorithm needs 500 kilobytes to store some internal data, this can be considered elementary.

Example 2: If an algorithm contains a loop whose body is executed 1000 times, it counts as an elementary algorithmic step.

Hence a block of instructions counts as a single elementary step if none of the particular instructions depends on the size of the input.

A looping construct counts as a single elementary step if the number of iterations it executes does not depend on the size of the input and its body is an elementary step.

⇒ Shoeing a horse can be considered an elementary step ⇔ it takes constant time (one step) to shoe a horse.

The time complexity of an algorithm is a function depending on the size of the input.

Notation: T(n), where n is the size of the input.

Function T can depend on more than one variable, e.g. T(n,m) if the input of the algorithm is an n×m matrix.

Example: Find the minimum of an array.

Minimum(A)
1 min ← A[1]
2 i ← 1
3 repeat
4     i ← i + 1
5     if A[i] < min
6         then min ← A[i]
7 until i ≥ A.Length
8 return min

Step count: lines 1-2 together count as 1 elementary step; the loop body (lines 3-7) counts as 1 elementary step and is executed n − 1 times.
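To make the step count tangible, the following Python sketch mirrors the pseudocode and counts elementary steps as described: one for the initialization and one per loop iteration. The function name and the step counter are additions of this example, and, like the pseudocode, it assumes an array of at least two elements.

```python
from typing import List, Tuple


def minimum_with_steps(a: List[int]) -> Tuple[int, int]:
    """Return (minimum, number of elementary steps) for a list of length >= 2."""
    smallest = a[0]      # lines 1-2: initialization, one elementary step
    i = 0                # 0-based counterpart of i <- 1
    steps = 1
    while True:          # repeat ... until (post-test loop)
        i += 1
        if a[i] < smallest:
            smallest = a[i]
        steps += 1       # one elementary step per iteration
        if i >= len(a) - 1:
            break
    return smallest, steps
```

For an input of n elements the loop runs n − 1 times, so the counter ends at 1 + (n − 1) = n.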

Hence T(n) = n (where n = A.Length).

Does this change if line 8 (return min) is considered as an extra step? In other words: is n ≈ n + 1?

It does not change! Proof: n + 1 = (n − 1) + 2 ≈ (n − 1) + 1 = n, because the constant 2 counts as a single elementary step.

This so-called asymptotic behavior can be formulated rigorously in the following way:

We say that f(x) = O(g(x)) (big O notation) if

(∃C, x0 > 0) (∀x ≥ x0) 0 ≤ f(x) ≤ C∙g(x)

This means that g is an asymptotic upper bound of f.
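As a small worked instance of the definition, take f(n) = n + 1 and g(n) = n. The constants C = 2 and x0 = 1 below are one possible choice made for this illustration; any larger values would work as well.

```latex
\[
  \forall n \ge 1:\quad 0 \le n + 1 \le n + n = 2n = C \cdot g(n)
  \qquad (C = 2,\ x_0 = 1),
\]
\[
  \text{hence } n + 1 = O(n).
\]
```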

[Figure: graphs of f(x), g(x) and C∙g(x); from x0 onward f(x) stays below C∙g(x).]

The O notation denotes an upper bound.

If g is also a lower bound of f, then we say that

f(x) = θ(g(x)) if

(∃c, C, x0 > 0) (∀x ≥ x0) 0 ≤ c∙g(x) ≤ f(x) ≤ C∙g(x)

This means that f asymptotically equals g.

[Figure: graphs of f(x), g(x), c∙g(x) and C∙g(x); from x0 onward, where x0 is the larger of the thresholds belonging to c and C, f(x) runs between c∙g(x) and C∙g(x).]

What does the asymptotic notation show us?

We have seen: T(n) = θ(n) for the procedure Minimum(A), where n = A.Length.

However, due to the definition of the θ function, T(n) = θ(n), T(2n) = θ(n), T(3n) = θ(n), ...

Does Minimum not run slower on more data?

What does the asymptotic notation show us?

Asymptotic notation shows us the tendency:

T(n) = θ(n): linear tendency
    n data → a certain amount of time t
    2n data → time ≈ 2t
    3n data → time ≈ 3t

T(n) = θ(n^2): quadratic tendency
    n data → a certain amount of time t
    2n data → time ≈ 2^2∙t = 4t
    3n data → time ≈ 3^2∙t = 9t

Analyzing recursive algorithms

A recursive algorithm leads to a recursive function T. Example: Towers of Hanoi

TowersOfHanoi(n,FirstRod,SecondRod,ThirdRod)
1 if n > 0
2    then TowersOfHanoi(n − 1,FirstRod,ThirdRod,SecondRod)
3         write “Move a disk from ” FirstRod “ to ” SecondRod
4         TowersOfHanoi(n − 1,ThirdRod,SecondRod,FirstRod)

T(n) = T(n−1) + 1 + T(n−1) = 2T(n−1) + 1

T(n) = 2T(n−1) + 1 is a recursive function.

In general it is very difficult (sometimes impossible) to determine the explicit form of an implicit (recursive) formula.

If the algorithm is recursive, the solution can be achieved using recursion trees.

Recursion tree of TowersOfHanoi:

Each call on k disks splits into a call on k − 1 disks, a single move, and another call on k − 1 disks:

level 1: n → (n−1, 1, n−1)                         1 move
level 2: each n−1 → (n−2, 1, n−2)                  2 moves
level 3:                                           4 moves
...
level n: all remaining calls are single moves      2^(n−1) moves

Total: 1 + 2 + 4 + ... + 2^(n−1) = 2^n − 1

Time complexity: T(n) = 2^n − 1 = θ(2^n), exponential time (very slow)

Example: n = 64 (from the original legend), assuming one disk move per second:

T(n) = 2^n − 1 = 2^64 − 1 ≈ 1.8∙10^19 seconds
≈ 3∙10^17 minutes
≈ 5.1∙10^15 hours
≈ 2.1∙10^14 days
≈ 5.8∙10^11 years > half a trillion years
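The closed form can also be checked by unrolling the recurrence. The derivation below assumes T(0) = 0, i.e. that a call with no disks costs nothing; this base value is an assumption of the illustration.

```latex
\begin{align*}
  T(n) &= 2T(n-1) + 1 \\
       &= 2\bigl(2T(n-2) + 1\bigr) + 1 = 4T(n-2) + 2 + 1 \\
       &= 8T(n-3) + 4 + 2 + 1 \\
       &\;\;\vdots \\
       &= 2^{k}\,T(n-k) + \bigl(2^{k-1} + \dots + 2 + 1\bigr)
        = 2^{k}\,T(n-k) + 2^{k} - 1 .
\end{align*}
Setting $k = n$ and using $T(0) = 0$ gives $T(n) = 2^{n} - 1$.
```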

Different cases

Problem (example): search for a given element in a sequence (array).

LinearSearch(A,w)
    i ← 0
    repeat i ← i + 1
    until A[i] = w or i = A.Length
    if A[i] = w then return i
                else return NIL

Array: 8 1 3 9 5 6 2

Best case. Element wanted: 8. Time complexity: T(n) = 1 = θ(1)

Worst case. Element wanted: 2. Time complexity: T(n) = n = θ(n)

Average case?

Array: 8 1 3 9 5 6 2

Average case: the mean value of the time complexities over all possible inputs:

T(n) = (1 + 2 + 3 + 4 + ... + n) / n = n∙(n + 1) / (2n) = (n + 1) / 2 = θ(n)

(The same as in the worst case.)
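The three cases can be reproduced on the example array. The Python sketch below follows LinearSearch but additionally returns the number of loop iterations; the counter and the function name are additions of this example, and the returned position is 1-based as in the pseudocode.

```python
from typing import List, Optional, Tuple


def linear_search(a: List[int], w: int) -> Tuple[Optional[int], int]:
    """Return (1-based position of w or None, number of loop iterations).

    Like the pseudocode, this assumes a non-empty array.
    """
    i = 0
    while True:                      # repeat ... until
        i += 1
        if a[i - 1] == w or i == len(a):
            break
    position = i if a[i - 1] == w else None
    return position, i


array = [8, 1, 3, 9, 5, 6, 2]
# Mean cost over all successful searches, one per stored element.
average = sum(linear_search(array, key)[1] for key in array) / len(array)
```

On this array the best case (key 8) takes one iteration, the worst case (key 2) takes seven, and the mean over all seven keys is (7 + 1) / 2 = 4.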

Arrays and linked lists

Basic data structures

To store a set of data of the same type in a linear structure, two basic solutions exist:

Arrays: physical sequence in the memory

Linked lists: the particular elements are linked together using links (pointers or indices)

[Figure: the keys 18, 29, 22 stored as an array and as a linked list; each list element consists of a key and a link, and head refers to the first element.]

Arrays vs. linked lists

Time complexity of some operations on arrays and linked lists in the worst case:

              Search  Insert  Delete  Minimum  Maximum  Successor  Predecessor
Array         O(n)    O(n)    O(n)    O(n)     O(n)     O(n)       O(n)
Linked list   O(n)    O(1)    O(1)    O(n)     O(n)     O(n)       O(n)

Doubly linked lists:

[Figure: the list 18, 29, 22 with head, linked in both directions.]

Dummy head lists:

[Figure: dummy head refers to an extra first cell X that holds no key, followed by 18, 29, 22; a pointer refers to one of the elements.]

Indirection (indirect reference): pointer.key

Double indirection: pointer.link.key

to be continued...

Array representation of linked lists

[Figure: the dummy head list X, 18, 29, 22 stored in a key array and a link array.]

index:  1   2   3   4   5   6   7   8
key:    .   22  X   .   18  .   29  .
link:   .   0   5   .   7   .   2   .

dummyhead = 3

Problem: a lot of garbage (the cells marked with a dot are unused).

Garbage collection for array-represented lists

The empty cells are linked to a separate garbage list using the link array:

index:  1   2   3   4   5   6   7   8
key:    .   22  X   .   18  .   29  .
link:   8   0   5   0   7   1   2   4

dummyhead = 3, garbage = 6

To allocate place for a new key and use it: the first element of the garbage list is linked out from the garbage and linked into the proper list with a new key (33 here) if necessary.

index:  1   2   3   4   5   6   7   8
key:    .   22  X   .   18  33  29  .
link:   8   0   5   0   7   1   2   4   (values before the update)

dummyhead = 3; garbage changes from 6 to 1; new = 6

Pseudocode for garbage management

Allocate(link)
    if link.garbage = 0
        then return 0
        else new ← link.garbage
             link.garbage ← link[link.garbage]
             return new

Free(index,link)
    link[index] ← link.garbage
    link.garbage ← index
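Allocate and Free can be tried out on the link array of the example. In the Python sketch below the class LinkStore and its attribute names are hypothetical; index 0 of the Python list is left unused so that the cell numbers 1..8 match the slides, and 0 plays the role of the empty link.

```python
from typing import List


class LinkStore:
    """Link array with the head of the garbage list (cells are 1-based)."""

    def __init__(self, link: List[int], garbage: int) -> None:
        self.link = [0] + list(link)  # pad so that link[1] is the first cell
        self.garbage = garbage


def allocate(store: LinkStore) -> int:
    """Unlink the first garbage cell and return its index (0 if none is left)."""
    if store.garbage == 0:
        return 0
    new = store.garbage
    store.garbage = store.link[new]
    return new


def free(index: int, store: LinkStore) -> None:
    """Put cell `index` back to the front of the garbage list."""
    store.link[index] = store.garbage
    store.garbage = index


example = LinkStore([8, 0, 5, 0, 7, 1, 2, 4], garbage=6)
```

Both operations touch only the front of the garbage list, so they run in constant time.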

Dummy head linked lists (...continued)

FindAndDelete for simple linked lists

FindAndDelete(toFind,key,link)
    if key[link.head] = toFind
        then toDelete ← link.head
             link.head ← link[link.head]
             Free(toDelete,link)
        else toDelete ← link[link.head]
             pointer ← link.head
             while toDelete ≠ 0 and key[toDelete] ≠ toFind
                 do pointer ← toDelete
                    toDelete ← link[toDelete]
             if toDelete ≠ 0
                 then link[pointer] ← link[toDelete]
                      Free(toDelete,link)

The then branch handles an extra case: the first element is to be deleted.

In the else branch an additional pointer is needed to step forward.

Dummy head linked lists (...continued)

FindAndDelete for dummy head linked lists

FindAndDeleteDummy(toFind,key,link)
    pointer ← link.dummyhead
    while link[pointer] ≠ 0 and key[link[pointer]] ≠ toFind
        do pointer ← link[pointer]
    if link[pointer] ≠ 0
        then toDelete ← link[pointer]
             link[pointer] ← link[toDelete]
             Free(toDelete,link)

Stacks and queues

Common properties:

only two operations are defined: store a new key (called push and enqueue, resp.) and extract a key (called pop and dequeue, resp.)

all (both) operations work in constant time

Different properties:

stacks are LIFO structures

queues are FIFO (or pipeline) structures

Two erroneous cases:

extraction is attempted from an empty data structure: underflow

insertion is attempted but there is no more space: overflow

Stack management using arrays

Example with a stack of three cells: push(8), push(1) and push(3) fill the stack (from bottom to top: 8, 1, 3), and top refers to the cell of 3. A further push(9) causes a stack overflow. Three pops empty the stack again; a fourth pop causes a stack underflow.

Stack management using arrays

Push(key,Stack)
    if Stack.top = Stack.Length
        then return Overflow error        (stack overflow)
        else Stack.top ← Stack.top + 1
             Stack[Stack.top] ← key

Stack management using arrays

Pop(Stack)
    if Stack.top = 0
        then return Underflow error       (stack underflow)
        else Stack.top ← Stack.top − 1
             return Stack[Stack.top + 1]
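Push and Pop can be sketched in Python as follows. The class name ArrayStack is hypothetical, and the two error cases are signaled with Python's built-in OverflowError and IndexError instead of returned error values; both choices are assumptions of this example.

```python
from typing import List, Optional


class ArrayStack:
    """Fixed-capacity stack stored in an array; cells are used from 1 to length."""

    def __init__(self, length: int) -> None:
        self.cells: List[Optional[int]] = [None] * (length + 1)  # cell 0 unused
        self.length = length
        self.top = 0  # 0 means the stack is empty

    def push(self, key: int) -> None:
        if self.top == self.length:
            raise OverflowError("stack overflow")
        self.top += 1
        self.cells[self.top] = key

    def pop(self) -> Optional[int]:
        if self.top == 0:
            raise IndexError("stack underflow")
        self.top -= 1
        return self.cells[self.top + 1]
```

With a capacity of three, pushing 8, 1 and 3 succeeds, a fourth push raises the overflow error, and the keys come back in the order 3, 1, 8.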

Queue management using arrays

[Figure: a queue stored in an array that is used circularly; "beginning" marks the cell just before the first key and "end" marks the cell of the last key.]

Empty queue: beginning = n, end = 0

Queue management using arrays

Enqueue(key,Queue)
    if Queue.beginning = Queue.end
        then return Overflow error        (queue overflow)
        else if Queue.end = Queue.Length
                 then Queue.end ← 1
                 else Queue.end ← Queue.end + 1
             Queue[Queue.end] ← key

Queue management using arrays

Dequeue(Queue)
    if Queue.end = 0
        then return Underflow error       (queue underflow)
        else if Queue.beginning = Queue.Length
                 then Queue.beginning ← 1
                 else inc(Queue.beginning)
             key ← Queue[Queue.beginning]
             if Queue.beginning = Queue.end
                 then Queue.beginning ← Queue.Length
                      Queue.end ← 0
             return key
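The circular use of the array is easier to follow in running code. The Python sketch below keeps the conventions of the pseudocode (1-based cells, beginning = n and end = 0 for the empty queue); the class name ArrayQueue and the use of exceptions for the error cases are assumptions of this example.

```python
from typing import List, Optional


class ArrayQueue:
    """Fixed-capacity FIFO queue in a circularly used array (cells 1..length)."""

    def __init__(self, length: int) -> None:
        self.cells: List[Optional[int]] = [None] * (length + 1)  # cell 0 unused
        self.length = length
        self.beginning = length  # cell just before the first key
        self.end = 0             # cell of the last key, 0 if the queue is empty

    def enqueue(self, key: int) -> None:
        if self.beginning == self.end:
            raise OverflowError("queue overflow")
        # Step `end` forward, wrapping around at the end of the array.
        self.end = 1 if self.end == self.length else self.end + 1
        self.cells[self.end] = key

    def dequeue(self) -> Optional[int]:
        if self.end == 0:
            raise IndexError("queue underflow")
        self.beginning = 1 if self.beginning == self.length else self.beginning + 1
        key = self.cells[self.beginning]
        if self.beginning == self.end:
            # The last key has been taken out: restore the empty state.
            self.beginning = self.length
            self.end = 0
        return key
```

Resetting both indices when the queue runs empty is what lets the test beginning = end mean "full" in enqueue without being confused with "empty".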

Binary search trees

Linear data structures cannot provide better time complexity than n in some cases.

Idea: let us use another kind of structure.

Solution:

rooted trees (especially binary trees)

special order of keys (‘search trees’)

A binary tree: [Figure: a binary tree with the notions below marked on it.]

Notions: vertex (node), edge, root, twins (siblings), parent - child, leaf, levels, depth (height)

A binary search tree: for all vertices, all keys in the left subtree are smaller and all keys in the right subtree are greater.

Example tree:

            28
          /    \
        12      30
       /  \       \
      7    21      49
          /  \       \
        14    26      50

Implementation of binary search trees:

Each vertex of the tree stores

key and other data

link to the left child

link to the right child

link to the parent

[Figure: the example tree with root 28, each vertex drawn with these four fields.]

Binary search tree operations: tree walk

inorder: 1. left 2. root 3. right

On the example tree with root 28 the inorder walk lists the keys in increasing order: 7 12 14 21 26 28 30 49 50

InorderWalk(Tree)
1 if Tree ≠ NIL
2    then InorderWalk(Tree.Left)
3         visit Tree, e.g. check it or list it
4         InorderWalk(Tree.Right)

The so-called preorder and postorder tree walks only differ by the order of lines 2-4:

preorder: root → left → right

postorder: left → right → root
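The three walks can be compared side by side in Python. In this sketch the Node class, the function names and the hand-built example tree are assumptions of the illustration; each walk returns the visited keys as a list instead of printing them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(tree: Optional[Node]) -> List[int]:
    """left, root, right"""
    if tree is None:
        return []
    return inorder(tree.left) + [tree.key] + inorder(tree.right)


def preorder(tree: Optional[Node]) -> List[int]:
    """root, left, right"""
    if tree is None:
        return []
    return [tree.key] + preorder(tree.left) + preorder(tree.right)


def postorder(tree: Optional[Node]) -> List[int]:
    """left, right, root"""
    if tree is None:
        return []
    return postorder(tree.left) + postorder(tree.right) + [tree.key]


# The example tree with root 28, built by hand.
example = Node(28,
               Node(12, Node(7), Node(21, Node(14), Node(26))),
               Node(30, None, Node(49, None, Node(50))))
```

Only the inorder walk yields the keys of a search tree in increasing order; the other two depend on the shape of the tree.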

Binary search tree operations: tree search

Examples on the tree with root 28:

TreeSearch(14): 14 < 28, go left; 14 > 12, go right; 14 < 21, go left; 14 is found.

TreeSearch(45): 45 > 28, go right; 45 > 30, go right; 45 < 49, go left; the left link of 49 is NIL, so 45 is not in the tree.

TreeSearch(toFind,Tree)
    while Tree ≠ NIL and Tree.key ≠ toFind
        do if toFind < Tree.key
               then Tree ← Tree.Left
               else Tree ← Tree.Right
    return Tree

Binary search tree operations: insert

TreeInsert(14) on the tree with root 28: 14 < 28, go left; 14 > 12, go right; 14 < 21, and the left link of 21 is NIL, so 14 becomes the left child of 21.

New vertices are always inserted as leaves.
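The slides give no pseudocode for the insertion, so the Python sketch below is one possible formulation, not the author's: it walks down the tree exactly like the search and hangs the new vertex on the NIL link where the walk ends. The Node class and the function names are assumptions of the example, and duplicate keys are sent to the right.

```python
from typing import Optional


class Node:
    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None
        self.parent: Optional["Node"] = None


def tree_search(tree: Optional[Node], to_find: int) -> Optional[Node]:
    """Return the vertex holding to_find, or None if the key is absent."""
    while tree is not None and tree.key != to_find:
        tree = tree.left if to_find < tree.key else tree.right
    return tree


def tree_insert(root: Optional[Node], key: int) -> Node:
    """Insert key as a new leaf and return the (possibly new) root."""
    new = Node(key)
    parent, current = None, root
    while current is not None:        # walk down as in the search
        parent = current
        current = current.left if key < current.key else current.right
    new.parent = parent
    if parent is None:
        return new                    # the tree was empty
    if key < parent.key:
        parent.left = new
    else:
        parent.right = new
    return root


root: Optional[Node] = None
for k in (28, 12, 30, 7, 21, 49, 26, 50):
    root = tree_insert(root, k)
root = tree_insert(root, 14)          # ends up as the left child of 21
```

The order of insertion determines the shape of the tree; the order used here reproduces the example tree with root 28.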

Binary search tree operations: tree minimum and tree maximum

On the tree with root 28: the minimum is reached by following the left links from the root (28, 12, 7); the maximum is reached by following the right links (28, 30, 49, 50).

TreeMinimum(Tree)
    while Tree.Left ≠ NIL
        do Tree ← Tree.Left
    return Tree

TreeMaximum(Tree)
    while Tree.Right ≠ NIL
        do Tree ← Tree.Right
    return Tree

Binary search tree operations: successor of an element

Examples on the tree with root 28:

TreeSuccessor(12): the element has a right child, so the successor is the tree minimum of its right subtree, 14.

TreeSuccessor(26): if the element has no right child, we climb toward the root until a parent-left child relation is found; the successor of 26 is 28.

TreeSuccessor(Element)
    if Element.Right ≠ NIL
        then return TreeMinimum(Element.Right)
        else Above ← Element.Parent
             while Above ≠ NIL and Element = Above.Right
                 do Element ← Above
                    Above ← Above.Parent
             return Above

Finding the predecessor is similar.
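The two branches of TreeSuccessor can be checked on the example tree. The Python sketch below needs parent links, so it comes with its own small Node class and a helper that builds the tree by repeated insertion; the helper, the class and all names are assumptions of this illustration.

```python
from typing import Dict, Iterable, Optional


class Node:
    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None
        self.parent: Optional["Node"] = None


def build_tree(keys: Iterable[int]) -> Dict[int, Node]:
    """Insert the keys one by one as leaves; return the vertices by key."""
    nodes: Dict[int, Node] = {}
    root: Optional[Node] = None
    for key in keys:
        new = Node(key)
        nodes[key] = new
        parent, current = None, root
        while current is not None:
            parent = current
            current = current.left if key < current.key else current.right
        new.parent = parent
        if parent is None:
            root = new
        elif key < parent.key:
            parent.left = new
        else:
            parent.right = new
    return nodes


def tree_minimum(tree: Node) -> Node:
    while tree.left is not None:
        tree = tree.left
    return tree


def tree_successor(element: Node) -> Optional[Node]:
    if element.right is not None:
        return tree_minimum(element.right)
    above = element.parent
    # Climb while we arrive from a right child.
    while above is not None and element is above.right:
        element = above
        above = above.parent
    return above


vertices = build_tree([28, 12, 30, 7, 21, 49, 14, 26, 50])
```

For the largest key the climb runs off the root and the function returns None, the counterpart of NIL.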

Binary search tree operations: delete

Examples on the tree with root 28:

1. If the element has no children, e.g. TreeDelete(26): the element is simply linked out of the tree.

2. If the element has only one child, e.g. TreeDelete(30): the only child (49) is linked into the place of the element.

3. If the element has two children, e.g. TreeDelete(12): 12 is replaced by a close key, e.g. the successor, 14. The successor, found in the right subtree as its tree minimum, has at most one child.

The case if Element has no children:

TreeDelete(Element,Tree)
    if Element.Left = NIL and Element.Right = NIL
        then if Element.Parent = NIL
                 then Tree ← NIL
                 else if Element = (Element.Parent).Left
                          then (Element.Parent).Left ← NIL
                          else (Element.Parent).Right ← NIL
             Free(Element)
             return Tree
    (the procedure continues with the other cases)

The case if Element has only a right child:

TreeDelete(Element,Tree), continued:
    if Element.Left = NIL and Element.Right ≠ NIL
        then if Element.Parent = NIL
                 then Tree ← Element.Right
                      (Element.Right).Parent ← NIL
                 else (Element.Right).Parent ← Element.Parent
                      if Element = (Element.Parent).Left
                          then (Element.Parent).Left ← Element.Right
                          else (Element.Parent).Right ← Element.Right
             Free(Element)
             return Tree
    (the procedure continues with the other cases)

The case if Element has only a left child (very similar to the case of a single right child):

TreeDelete(Element,Tree), continued:
    if Element.Left ≠ NIL and Element.Right = NIL
        then if Element.Parent = NIL
                 then Tree ← Element.Left
                      (Element.Left).Parent ← NIL
                 else (Element.Left).Parent ← Element.Parent
                      if Element = (Element.Parent).Left
                          then (Element.Parent).Left ← Element.Left
                          else (Element.Parent).Right ← Element.Left
             Free(Element)
             return Tree
    (the procedure continues with the last case)

Page 89: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 89

The case if Element has two children:

-28 previous page

29 if Element.Left  NIL  and  Element.Right  NIL30 then Substitute  TreeSuccessor(Element)31 if Substitute.Right  NIL32 then (Substitute.Right).Parent  Substitute.Parent33 if Substitute = (Substitute.Parent).Left34 then (Substitute.Parent).Left  Substitute.Right35 else (Substitute.Parent).Right  Substitute.Right36 Substitute.Parent  Element.Parent37 if Element.Parent = NIL38 then Tree  Substitute39 else if Element = (Element.Parent).Left40 then (Element.Parent).Left  Substitute41 else (Element.Parent).Right  Substitute42 Substitute.Left  Element.Left43 (Substitute.Left).Parent  Substitute44 Substitute.Right  Element.Right45 (Substitute. Right).Parent  Substitute27 Free(Element)28 return  Tree

Binary search trees

Lines 31 to 35: Substitute is linked out from its place

Lines 36 onwards: Substitute is linked into Element's place
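The three deletion cases can be tried out with a minimal Python sketch. The names `Node`, `tree_insert`, `tree_search`, `_replace`, `tree_delete` and `in_order` are illustrative assumptions, not part of the slides; the helper `_replace` merges the "only one child" cases (and the leaf case) into a single relinking step.

```python
class Node:
    """Tree element with key and Left / Right / Parent links (None plays NIL)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def tree_insert(root, key):
    """Insert key as a new leaf and return the (possibly new) root."""
    node, parent, cur = Node(key), None, root
    while cur is not None:
        parent = cur
        cur = cur.left if key < cur.key else cur.right
    node.parent = parent
    if parent is None:
        return node
    if key < parent.key:
        parent.left = node
    else:
        parent.right = node
    return root

def tree_search(root, key):
    """Return the node holding key, or None."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root

def _replace(root, old, new):
    """Link subtree `new` (may be None) into the place of `old`; return the root."""
    if new is not None:
        new.parent = old.parent
    if old.parent is None:
        return new
    if old is old.parent.left:
        old.parent.left = new
    else:
        old.parent.right = new
    return root

def tree_delete(root, node):
    """Delete `node` from the tree and return the (possibly new) root."""
    if node.left is None or node.right is None:
        # Leaf or single child: the child (or None) takes the node's place.
        child = node.left if node.left is not None else node.right
        return _replace(root, node, child)
    # Two children: the successor is the leftmost node of the right subtree.
    sub = node.right
    while sub.left is not None:
        sub = sub.left
    root = _replace(root, sub, sub.right)   # link the substitute out
    root = _replace(root, node, sub)        # link it into the node's place
    sub.left, sub.right = node.left, node.right
    sub.left.parent = sub
    if sub.right is not None:               # may be None if sub was node.right
        sub.right.parent = sub
    return root

def in_order(node):
    """Keys of the subtree in increasing order."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)
```

Python has no `Free`; the deleted node is simply no longer referenced by the tree.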

Page 90: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 90

Time complexity of binary search tree operations

T(n) = O(d) for all operations (except for the walk), where d denotes the depth of the tree

The expected depth of a randomly built binary search tree is d = O(log n)

Hence the time complexity of the search tree operations in the average case is

T(n) = O(log n)

Binary search trees

Page 91: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 91

Binary search

If insert and delete are used rarely, then it is more convenient and faster to use an ordered array instead of a binary search tree.

Faster: the following operations have T(n) = O(1) constant time complexity: minimum, maximum, successor, predecessor.

Binary search

Search has the same T(n) = O(log n) time complexity as on binary search trees:

Page 92: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 92

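Let us search key 29 in the ordered array below:

Binary search

Search has the same T(n) = O(log n) time complexity as on binary search trees:

2 3 7 12 29 31 45

[Figure: the part of the array still to be searched ("search here") is marked, and its central element is compared with the key]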


Page 94: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 94

Let us search key 29 in the ordered array below:

Binary search

Search has the same T(n) = O(log n) time complexity as on binary search trees:

2 3 7 12 29 31 45

[Figure: the remaining part to search is marked; its central element equals the key: found!]

Page 95: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 95

This result can also be derived as follows: if we halve n elements k times, we get 1 ⇔

n / 2^k = 1 ⇔ k = log₂ n = O(log n)

Binary search

Search has the same T(n) = O(log n) time complexity as on binary search trees:

2 3 7 12 29 31 45

O(log n)
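The halving idea can be written down as a short Python sketch. The function name `binary_search` and the use of 0-based indices are assumptions made for this illustration; the slides give no pseudocode for this step.

```python
def binary_search(a, key):
    """Return the index of key in the ordered list a, or -1 if it is absent."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2      # central element of the current range
        if a[mid] == key:
            return mid               # found
        if a[mid] < key:
            low = mid + 1            # continue in the right half
        else:
            high = mid - 1           # continue in the left half
    return -1


keys = [2, 3, 7, 12, 29, 31, 45]
print(binary_search(keys, 29))       # 4 (0-based index of key 29)
```

Each pass of the loop halves the range, so at most about log₂ n + 1 central elements are inspected.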

Page 96: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 96

Sorting

Problem
There is a set of data from a base set with a given order over it (e.g. numbers, texts). Arrange them according to the order of the base set.

Example

Sorting

[Figure: the unsorted keys 12, 2, 7, 3 are passed to sorting]

Page 97: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 97

Sorting sequences
We sort sequences in lexicographical order: of two sequences, the 'smaller' one is the one that has the smaller value at the first position where they differ.

Example (texts)

Sorting

g o o d  ?  g o n e

n < o in the alphabet

g o n e  <  g o o d
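The comparison rule can be sketched in Python as follows. The function name `lex_less` is hypothetical, and the treatment of the case where one sequence is a prefix of the other (the shorter one counts as smaller) is an assumption, since the slide does not cover it.

```python
def lex_less(s, t):
    """True if sequence s precedes sequence t in lexicographical order."""
    for x, y in zip(s, t):
        if x != y:
            return x < y          # decided at the first differing position
    return len(s) < len(t)        # assumption: a proper prefix is smaller


print(lex_less("gone", "good"))   # True, because n < o at the third position
```

Python's built-in comparison of strings, lists and tuples follows the same rule, so `"gone" < "good"` gives the same answer.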

Page 98: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 98

Insertion sort

Principle

[Figure: the keys 8, 14, 22, 69 and 75, illustrating the principle of insertion sort]

Insertion sort

Page 99: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 99

Implementation of insertion sort with arrays

insertion step:

22 69 75 | 38 14
sorted part | unsorted part

Insertion sort

Page 100: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 100

InsertionSort(A)
  for i ← 2 to A.Length
    do ins ← A[i]
       j ← i – 1
       while j > 0 and ins < A[j]
         do A[j + 1] ← A[j]
            j ← j – 1
       A[j + 1] ← ins
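A direct Python transcription of the pseudocode might look like the sketch below. It uses 0-based indices, so the loop bounds are shifted by one compared with the slide; the function name is an assumption.

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for i in range(1, len(a)):
        ins = a[i]                    # next element of the unsorted part
        j = i - 1
        while j >= 0 and ins < a[j]:
            a[j + 1] = a[j]           # shift greater elements one place right
            j -= 1
        a[j + 1] = ins                # drop the element into the gap
    return a


print(insertion_sort([22, 69, 75, 38, 14]))   # [14, 22, 38, 69, 75]
```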

Insertion sort

Page 101: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 101

Time complexity of insertion sort

Best case
In each step the new element is inserted at the end of the sorted part:
T(n) = 1 + 1 + 1 + ... + 1 = n − 1 = Θ(n)

Worst case
In each step the new element is inserted at the beginning of the sorted part:
T(n) = 2 + 3 + 4 + ... + n = n(n + 1)/2 − 1 = Θ(n²)

Insertion sort

Page 102: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 102

Time complexity of insertion sort

Average case
In each step the new element is inserted somewhere in the middle of the sorted part:

T(n) = 2/2 + 3/2 + 4/2 + ... + n/2 = (n(n + 1)/2 − 1) / 2 = Θ(n²)

The same order of magnitude as in the worst case

Insertion sort

Page 103: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 103

Another implementation of insertion sort

The input is providing elements continually (e.g. file, net)

The sorted part is a linked list where the elements are inserted one by one

The time complexity is the same as that of the array implementation in every case (best, worst and average).

Insertion sort

Page 104: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 104

Another implementation of insertion sort

The linked list implementation delivers an on-line algorithm:
• after each step the subproblem is completely solved
• the algorithm does not need the whole input to partially solve the problem

Cf. off-line algorithm:
• the whole input has to be known prior to the substantive procedure

Insertion sort

Page 105: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 105

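Merge sort

Principle

Merge sort

[Figure: an unsorted array of eight keys is halved; the first half holds 8, 14, 69, 75 and the second half holds 2, 22, 25, 36, each in some order]

sort the parts recursively

8 14 69 75 | 2 22 25 36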

Page 106: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 106

Merge sort

merge (comb) the parts

8 14 69 75 | 2 22 25 36

2 8 14 22 25 36 69 75

ready

Page 107: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 107

Time complexity of merge sort

Merge sort is a recursive algorithm, and so is its time complexity function T(n)

What it does:
• First it halves the current (sub)array: O(1)
• Then it calls itself for the two halves: 2T(n/2)
• Last it merges the two ordered parts: O(n)

Hence T(n) = 2T(n/2) + O(n) = ?

Merge sort

Page 108: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 108

Recursion tree of merge sort:

subproblem sizes            cost of the level
n                           n
n/2  n/2                    2(n/2) = n
n/4  n/4  n/4  n/4          4(n/4) = n
...                         ...
1  1  ...  1  1             n

total (log n levels of cost n each): n∙log n
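The same result can be obtained by unrolling the recurrence. The following derivation is a sketch under two simplifying assumptions that are not stated on the slides: n is a power of two, and halving plus merging a subarray of size m costs at most c·m for some constant c.

```latex
\begin{align*}
T(n) &= 2\,T\!\left(\tfrac{n}{2}\right) + c\,n \\
     &= 4\,T\!\left(\tfrac{n}{4}\right) + 2\,c\,n \\
     &= 8\,T\!\left(\tfrac{n}{8}\right) + 3\,c\,n \\
     &\;\;\vdots \\
     &= 2^{k}\,T\!\left(\tfrac{n}{2^{k}}\right) + k\,c\,n .
\end{align*}
% The unrolling stops when n / 2^k = 1, i.e. k = \log_2 n:
\[
T(n) = n\,T(1) + c\,n\log_2 n = \Theta(n \log n).
\]
```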

Merge sort

Page 109: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 109

Time complexity of merge sort is

T(n) = Θ(n∙log n)

This worst case time complexity is optimal among comparison sorts (using only pair comparisons)

⇒ fast

but unfortunately merge sort does not sort in-place, i.e. it uses auxiliary storage of a size comparable with the input
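A compact Python sketch of merge sort makes the auxiliary storage visible: the list `merged` built in every merge step has the combined size of the two parts. The function names are assumptions, and this version returns a new list instead of sorting the array in place.

```python
def merge(left, right):
    """Merge two ordered lists into one ordered list in O(len(left) + len(right))."""
    merged = []                       # auxiliary storage of size comparable to the input
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(a):
    """Return a new ordered list with the elements of a."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2                 # halve the current (sub)array
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([14, 69, 8, 75, 2, 25, 22, 36]))
# [2, 8, 14, 22, 25, 36, 69, 75]
```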

Merge sort

Page 110: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 110

Heapsort

An array A is called a heap if for all its elements
A[i] ≥ A[2i] and A[i] ≥ A[2i + 1]
(whenever these indices exist in the array)

This property is called the heap property
It is easier to understand if a binary tree is built from the elements, filling the levels row by row

Heapsort

A = 45 27 34 20 23 31 18 19 3 14


Page 112: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

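Algorithms and Data Structures I 112

Heapsort

index:  1   2   3   4   5   6   7   8   9  10
key:   45  27  34  20  23  31  18  19   3  14

[Figure: the same keys drawn as a binary tree filled level by level: 45 (index 1) is the root; 27 (2) and 34 (3) are its children; 20 (4), 23 (5), 31 (6) and 18 (7) form the next level; 19 (8), 3 (9) and 14 (10) are on the last level]

The heap property turns into a simple parent-child relation in the tree representation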

Page 113: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 113

An important application of heaps is realizing priority queues:

A data structure supporting the operations
• insert
• maximum (or minimum)
• extract maximum (or extract minimum)

Heapsort

Page 114: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 114

First we have to build a heap from an array.

Let us suppose that only the kth element infringes the heap property.

In this case it is sunk level by level to a place where it fits. In the example k = 1 (the root):

Heapsort

Page 115: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 115

Heapsort

index:  1   2   3   4   5   6   7   8   9  10
key:   15  37  34  20  23  31  18  19   3  14

k = 1
• The key and its children are compared
• It is exchanged for the greater child

Page 116: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 116

Heapsort

index:  1   2   3   4   5   6   7   8   9  10
key:   37  15  34  20  23  31  18  19   3  14

k = 2
• The key and its children are compared
• It is exchanged for the greater child

Page 117: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 117

Heapsort

index:  1   2   3   4   5   6   7   8   9  10
key:   37  23  34  20  15  31  18  19   3  14

k = 5
• The key and its children are compared
• It is the greatest ⇒ ready

Page 118: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 118

Sink(k,A)
  if 2*k ≤ A.HeapSize and A[2*k] > A[k]
    then greatest ← 2*k
    else greatest ← k
  if 2*k + 1 ≤ A.HeapSize and A[2*k + 1] > A[greatest]
    then greatest ← 2*k + 1
  if greatest ≠ k
    then Exchange(A[greatest],A[k])
         Sink(greatest,A)

Heapsort

Page 119: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 119

To build a heap from an arbitrary array, all elements that have children are mended by sinking them:

BuildHeap(A)
  A.HeapSize ← A.Length
  for k ← ⌊A.Length / 2⌋ downto 1
    do Sink(k,A)

Heapsort

⌊A.Length / 2⌋: this is the array's last element that has any children

downto: we are stepping backwards; this way every visited element has only descendants which fulfill the heap property

Page 120: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 120

Time complexity of building a heap

To sink an element costs O(log n) in the worst case

Since n/2 elements have to be sunk, an upper bound for the BuildHeap procedure is

T(n) = O(n∙log n)

It can be proven that the sharp bound is
T(n) = Θ(n)

Heapsort

Page 121: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 121

Time complexity of the priority queue operations if the queue is realized using heaps

insert
• append the new element to the array O(1)
• exchange it for its parent as long as it is greater than its parent O(log n)

The time complexity is
T(n) = O(log n)

Heapsort

Page 122: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 122

Time complexity of the priority queue operations if the queue is realized using heaps

maximum
• read out the key of the root O(1)

The time complexity is
T(n) = O(1)

Heapsort

Page 123: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 123

Time complexity of the priority queue operations if the queue is realized using heaps

extract maximum
• exchange the root for the array's last element O(1)
• extract the last element O(1)
• sink the root O(log n)

The time complexity is
T(n) = O(log n)
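The three priority queue operations can be sketched in Python on top of a plain list. The class name `MaxPriorityQueue` and its method names are assumptions for this illustration, the indices are 0-based (children of k are 2k + 1 and 2k + 2), and `insert` restores the heap property by letting the new key float up toward the root, which is one common way to do it.

```python
class MaxPriorityQueue:
    """Max-priority queue stored as a heap in a Python list (0-based indices)."""

    def __init__(self):
        self._a = []

    def insert(self, key):
        """Append the key, then let it float up while it beats its parent: O(log n)."""
        a = self._a
        a.append(key)
        k = len(a) - 1
        while k > 0 and a[(k - 1) // 2] < a[k]:
            parent = (k - 1) // 2
            a[parent], a[k] = a[k], a[parent]
            k = parent

    def maximum(self):
        """Read out the key of the root: O(1). Raises IndexError if empty."""
        return self._a[0]

    def extract_maximum(self):
        """Remove and return the greatest key: O(log n). Raises IndexError if empty."""
        a = self._a
        a[0], a[-1] = a[-1], a[0]     # exchange the root for the last element
        top = a.pop()                 # extract the last element
        k, n = 0, len(a)
        while True:                   # sink the new root
            greatest = k
            for child in (2 * k + 1, 2 * k + 2):
                if child < n and a[child] > a[greatest]:
                    greatest = child
            if greatest == k:
                return top
            a[k], a[greatest] = a[greatest], a[k]
            k = greatest
```

Python's standard library offers `heapq` for the minimum variant; the class above only mirrors the steps listed on the slides.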

Heapsort

Page 124: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 124

The heapsort algorithm

• build a heap Θ(n)
• iterate the following: (n − 1)∙O(log n) = O(n∙log n)
  – exchange the root for the heap's last element O(1)
  – exclude the heap's last element from the heap O(1)
  – sink the root O(log n)

The time complexity is
T(n) = O(n∙log n)

Heapsort

Page 125: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 125

HeapSort(A)
  BuildHeap(A)
  for k ← A.Length downto 2
    do Exchange(A[1],A[A.HeapSize])
       A.HeapSize ← A.HeapSize – 1
       Sink(1,A)
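Sink, BuildHeap and HeapSort can be put together in a short Python sketch. It is an illustration under two assumptions: indices are 0-based (so the children of k are 2k + 1 and 2k + 2 instead of 2k and 2k + 1), and the heap size is passed as a parameter instead of being stored as `A.HeapSize`. The sinking is written as a loop rather than as a recursive call.

```python
def sink(a, k, heap_size):
    """Sink a[k] level by level until both children are not greater."""
    while True:
        left, right = 2 * k + 1, 2 * k + 2
        greatest = k
        if left < heap_size and a[left] > a[greatest]:
            greatest = left
        if right < heap_size and a[right] > a[greatest]:
            greatest = right
        if greatest == k:
            return
        a[k], a[greatest] = a[greatest], a[k]
        k = greatest


def build_heap(a):
    """Sink every element that has children, stepping backwards."""
    for k in range(len(a) // 2 - 1, -1, -1):
        sink(a, k, len(a))


def heap_sort(a):
    """Sort the list a in place and return it."""
    build_heap(a)
    for heap_size in range(len(a) - 1, 0, -1):
        a[0], a[heap_size] = a[heap_size], a[0]   # root goes behind the heap
        sink(a, 0, heap_size)                     # heap is now one element shorter
    return a


example = [15, 37, 34, 20, 23, 31, 18, 19, 3, 14]
sink(example, 0, len(example))
print(example)   # [37, 23, 34, 20, 15, 31, 18, 19, 3, 14]
```

The printed list matches the sinking example with k = 1 shown on the earlier slides.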

Heapsort

Page 126: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 126

Quicksort

Principle

Quicksort

[Figure: an unsorted array holding the keys 8, 12, 14, 22, 25, 36, 69, 75]

Rearrange and partition the elements so that no key in the first part is greater than any key in the second part.

Page 127: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 127

Quicksort

Principle

Quicksort

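[Figure: the array after partitioning; the first part holds 8, 12, 14 and the second part holds 22, 25, 36, 69, 75, each in some order]

Rearrange and partition the elements so that no key in the first part is greater than any key in the second part.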

Page 128: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 128

Quicksort

Principle

Quicksort

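[Figure: the partitioned array; the first part holds 8, 12, 14 and the second part holds 22, 25, 36, 69, 75, each in some order]

Sort each part recursively,

8 12 14 | 22 25 36 69 75

this will result in the whole array being sorted.

8 12 14 22 25 36 69 75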

Page 129: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 129

The partition algorithm

• choose any of the keys stored in the array; this will be the so-called pivot key

• exchange the large elements at the beginning of the array for the small ones at the end of it

[Figure: the example array with pivot key 22; after the exchanges the keys at the beginning are not greater than the pivot key and the keys at the end are not less than the pivot key]

Quicksort

Page 130: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 130

Partition(A,first,last)
  left ← first – 1
  right ← last + 1
  pivotKey ← A[RandomInteger(first,last – 1)]
  repeat
    repeat left ← left + 1
    until A[left] ≥ pivotKey
    repeat right ← right – 1
    until A[right] ≤ pivotKey
    if left < right
      then Exchange(A[left],A[right])
      else return right
  until false

Quicksort

Page 131: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 131

The time complexity of the partition algorithm is

T(n) = Θ(n)

because each element is visited exactly once.

The sorting is then:

QuickSort(A,first,last)
  if first < last
    then border ← Partition(A,first,last)
         QuickSort(A,first,border)
         QuickSort(A,border+1,last)
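The two procedures can be sketched in Python as follows. The function names and the default arguments of `quick_sort` are assumptions, and indices are 0-based. One deliberate choice in this sketch: the pivot index is drawn from `first .. last - 1`, which guarantees that the returned border is smaller than `last`, so both recursive calls work on strictly smaller subarrays.

```python
import random


def partition(a, first, last):
    """Rearrange a[first..last]; return border so that every key in
    a[first..border] is <= every key in a[border+1..last]."""
    pivot_key = a[random.randint(first, last - 1)]   # randint is inclusive on both ends
    left, right = first - 1, last + 1
    while True:
        left += 1
        while a[left] < pivot_key:       # skip keys that may stay at the beginning
            left += 1
        right -= 1
        while a[right] > pivot_key:      # skip keys that may stay at the end
            right -= 1
        if left < right:
            a[left], a[right] = a[right], a[left]
        else:
            return right


def quick_sort(a, first=0, last=None):
    """Sort a[first..last] in place and return a."""
    if last is None:
        last = len(a) - 1
    if first < last:
        border = partition(a, first, last)
        quick_sort(a, first, border)
        quick_sort(a, border + 1, last)
    return a


print(quick_sort([22, 69, 8, 75, 12, 14, 25, 36]))
# [8, 12, 14, 22, 25, 36, 69, 75]
```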

Quicksort

Page 132: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 132

Quicksort is a divide and conquer algorithm like merge sort; however, its partition can be unbalanced (merge sort always halves the subarray).

The time complexity of a divide and conquer algorithm highly depends on the balance of the partition.

In the best case the quicksort algorithm halves the subarrays at every step ⇒

T(n) = Θ(n∙log n)

Quicksort

Page 133: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 133

Recursion tree of the worst case

Quicksort

subarray sizes            partition cost of the level
n                         n
1   n − 1                 n − 1
    1   n − 2             n − 2
        ...               ...
        1   1             0

total: n∙(n + 1) / 2

Page 134: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 134

Thus, the worst case time complexity of quicksort is

T(n) = Θ(n²)

The average case time complexity is
T(n) = Θ(n∙log n)

the same as in the best case!

The proof is difficult but let's see a special case to understand quicksort better.

Quicksort

Page 135: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 135

Let λ be a positive number smaller than 1: 0 < λ < 1

Assumption: the partition algorithm never provides a worse partition ratio than

(1 − λ) : λ

Example 1: Let λ := 0.99
The assumption demands that the partition algorithm never leaves less than 1% of the elements as the smaller part.

Quicksort

Page 136: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 136

Example 2: Let λ := 0.999 999 999
If we have at most one billion(!) elements, then the assumption is fulfilled however the partition algorithm behaves

(even if it always cuts off only one element from the others).

In the following it is assumed for the sake of simplicity that λ ≥ 0.5, i.e. the λ part is always the bigger one.

Quicksort

Page 137: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

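Algorithms and Data Structures I 137

Quicksort

Recursion tree of the λ ratio case

subarray sizes                        cost of the level
n                                     n
(1 − λ)n    λn                        n
    ...  (1 − λ)λn    λ²n             ≤ n
    ...                               ...
                      λ^d n           ≤ n

total: at most n per level, and the depth d is proportional to log n ⇒ ≤ n∙log n (up to a constant factor)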

Page 138: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 138

In the special case when none of the parts arising at the partitions is bigger than a given λ ratio (0.5 ≤ λ < 1), the time complexity of quicksort is

T(n) = O(n∙log n)

The time complexity of quicksort is practically optimal because the number of elements to be sorted is always bounded by a number N (finite storage). Using the value λ = 1 − 1/N it can be proven that quicksort finishes in O(n∙log n) time in every possible case (the constant hidden in the O notation then depends on N).
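The depth of the recursion tree in the λ ratio case can be worked out explicitly. The following is a sketch of the argument, assuming that partitioning a subarray of size m costs at most c·m for some constant c (a symbol introduced here for illustration).

```latex
% The largest subarray on level d has at most \lambda^{d} n elements.
% The recursion stops as soon as this size drops to 1:
\[
\lambda^{d} n \le 1
\;\Longleftrightarrow\;
d \ge \frac{\log n}{\log (1/\lambda)} = \log_{1/\lambda} n .
\]
% Every level costs at most c n, and there are at most
% \lceil \log_{1/\lambda} n \rceil + 1 levels, hence
\[
T(n) \le c\,n \left( \left\lceil \log_{1/\lambda} n \right\rceil + 1 \right)
      = O(n \log n) \quad \text{for any fixed } \lambda .
\]
```

The factor 1 / log(1/λ) is constant for a fixed λ, but it grows quickly as λ approaches 1.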

Quicksort

Page 139: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 139

Greedy algorithms

Problem

Optimization problem: Let a function f(x) be given. Find an x where f is optimal (minimal or maximal) ‘under given circumstances’

‘Given circumstances’: An optimization problem is constrained if functional constraints have to be fulfilled such as g(x) ≤ 0

Greedy algorithms

Page 140: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 140

Feasible set: the set of those x values where the given constraints are fulfilled

Constrained optimization problem:

minimize f(x)
subject to g(x) ≤ 0

Greedy algorithms

Page 141: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 141

Example

Problem: There is a city A and other cities B1, B2, ..., Bn which can be reached from A by bus directly. Find the farthest of these cities to which you can travel so that your money suffices.

Greedy algorithms

[Figure: city A with direct bus connections to the cities B1, B2, ..., Bn]

Page 142: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 142

Model: Let
• x denote any of the cities: x ∊ {B1, B2, ..., Bn},
• f(x) the distance between A and x,
• t(x) the price of the bus ticket from A to x,
• m the money you have, and
• g(x) = t(x) − m the constraint function.

The constrained optimization problem to solve:

minimize (−f(x))
s.t. g(x) ≤ 0
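Because the feasible set is a small finite list here, the model can be solved by simply scanning it. The sketch below is an illustration only: the function name `farthest_affordable`, the tuple layout and all city data are hypothetical.

```python
def farthest_affordable(cities, money):
    """cities: list of (name, distance f(x), ticket price t(x)).
    Return the name of the farthest city with g(x) = t(x) - m <= 0, or None."""
    feasible = [c for c in cities if c[2] - money <= 0]   # the feasible set
    if not feasible:
        return None
    # Maximizing f(x) is the same as minimizing -f(x).
    return min(feasible, key=lambda c: -c[1])[0]


# Hypothetical data: (city, distance in km, ticket price)
cities = [("B1", 40, 700), ("B2", 95, 1500), ("B3", 170, 2600)]
print(farthest_affordable(cities, 2000))   # B2
```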

Greedy algorithms

Page 143: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 143

In general, optimization problems are much more difficult!

However, there is a class of optimization problems which can be solved using a simple, straightforward step-by-step principle:

greedy algorithms:
• at each step the same kind of decision is made, striving for a local optimum, and
• decisions of the past are never revisited.

Greedy algorithms

Page 144: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 144

Question:
Which problems can be solved using greedy algorithms?

Answer:
Problems which obey the following two rules:
• Greedy choice property: If a greedy choice is made first, it can always be completed to achieve an optimal solution to the problem.
• Optimal substructure property: Any substructure of an optimal solution provides an optimal solution to the corresponding subproblem.

Greedy algorithms

Page 145: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 145

Counterexample

Find the shortest route from Szeged to Budapest.

The greedy choice property is infringed:

You cannot simply choose the closest town first

Greedy algorithms

Page 146: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 146

Greedy algorithms

[Figure: map showing Budapest, Szeged and Deszk]

Deszk is the closest to Szeged but situated in the opposite direction

Page 147: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 147

Proper example

Activity-selection problem:
Let's spend a day watching TV.

Aim: Watch as many programs (in full) as you can.

Greedy strategy:
Watch the program ending first, then the next one ending first among those you can still watch in full, etc.

Activity-selection problem

Page 148: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 148

Activity-selection problem

• Let's sort the programs by their ending times
• Include the first one
• Exclude those which have already begun
• No more programs left: ready

The optimum is 4 (TV programs)
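The greedy strategy fits in a few lines of Python. The function name `select_activities` and the program times below are hypothetical (the schedule shown on the slide is not recoverable from the transcript); a program is taken to be watchable in full if it starts no earlier than the previous one ends.

```python
def select_activities(programs):
    """programs: list of (start, end) pairs. Return a largest set of
    non-overlapping programs, chosen greedily by earliest ending time."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(programs, key=lambda p: p[1]):
        if start >= last_end:          # has not begun before the last one ended
            chosen.append((start, end))
            last_end = end
    return chosen


# Hypothetical TV schedule (start hour, end hour)
programs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
            (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(select_activities(programs))
# [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Sorting dominates the running time, so the whole selection takes O(n∙log n).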

Page 149: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 149

Activity-selection problem

Check the greedy choice property:
The first choice of any optimal solution can be exchanged for the greedy one

Page 150: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 150

Activity-selection problem

Check the optimal substructure property:
The part of an optimal solution is also optimal for the subproblem
If it were not optimal for the subproblem, the whole solution could be improved by improving the subproblem's solution

Page 151: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 151

Huffman codes

Notions

C is an alphabet if it is a set of symbols

F is a file over C if it is a text built up of the characters of C

Huffman codes

Page 152: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 152

Assume we have the following alphabet
C = {a, b, c, d, e}

Code it with binary codewords of equal length
How many bits per codeword do we need at least?
2 are not enough (only four codewords: 00, 01, 10, 11)
Build codewords using 3 bit coding

Huffman codes

a = 000
b = 001
c = 010
d = 011
e = 100

Page 153: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 153

Build the T binary tree of the coding

Huffman codes

a = 000
b = 001
c = 010
d = 011
e = 100

[Figure: the binary tree T of the coding, built up step by step; each edge is labelled 0 or 1, and the leaves a, b, c, d, e are reached by following the bits of their codewords from the root]

Page 154: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 154

Further notation

For each character c ∈ C its frequency in the file is denoted by f(c)

For each character c ∈ C the length of its codeword is defined by its depth in the T tree of coding, d_T(c)

Hence the length of the file (in bits) equals
B(T) = Σ_{c ∈ C} f(c)∙d_T(c)

Huffman codes

Page 155: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 155

Problem

Let an alphabet C and a file over it be given. Find a coding T of the alphabet with minimal B(T)

Huffman codes

Page 156: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 156

Example

Consider a file F of 20,000 characters over the alphabet C = {a, b, c, d, e}

Assume the frequencies of the particular characters in the file are

Huffman codes

f(a) = 5,000
f(b) = 2,000
f(c) = 6,000
f(d) = 3,000
f(e) = 4,000

Page 157: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 157

Using the 3 bit coding defined previously, the bit-length of the file equals

B(T) = Σ_{c ∈ C} f(c)∙d_T(c) =

5,000∙3 + 2,000∙3 + 6,000∙3 + 3,000∙3 + 4,000∙3 =

(5,000 + 2,000 + 6,000 + 3,000 + 4,000)∙3 = 20,000∙3 = 60,000

This is a so-called fixed-length code since d_T(x) = d_T(y) holds for all x, y ∈ C

Huffman codes

Page 158: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 158

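The fixed-length code is not always optimal

Huffman codes

[Figure: the tree T' obtained from T by moving the leaf e one level up, so that its codeword becomes one bit shorter]

B(T') = B(T) − f(e)∙1 = 60,000 − 4,000∙1 = 56,000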

Page 159: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 159

Idea

Construct a variable-length code, i.e., where the code-lengths for different characters can differ from each other

We expect that if more frequent characters get shorter codewords then the resulting file will become shorter

Huffman codes

Page 160: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 160

Problem: How do we recognize where a codeword ends and a new one begins? Using delimiters is too "expensive"

Solution: Use prefix codes, i.e., codewords none of which is also a prefix of some other codeword

Result: The codewords can be decoded without using delimiters

Huffman codes

Page 161: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 161

For instance if

a = 10
b = 010
c = 00

then the meaning of the following code is
1000010000010010 = a c b c c a b

Huffman codes

However, what if a variable-length code was not prefix-free:
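Decoding with a prefix code needs no delimiters: bits are collected until they form a codeword, which is then emitted. A minimal Python sketch, assuming the code is given as a dictionary from characters to bit strings (the function name `decode` is hypothetical):

```python
def decode(bits, code):
    """Decode a bit string using a prefix code given as {character: codeword}."""
    inverse = {word: ch for ch, word in code.items()}
    decoded = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:        # a complete codeword has been read
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(decoded)


print(decode("1000010000010010", {"a": "10", "b": "010", "c": "00"}))
# acbccab
```

With a code that is not prefix-free this loop would always commit to the shortest matching codeword, so some codewords could never be decoded.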

Page 162: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 162

Then if

a = 10
b = 100
c = 0

then
100 = b or 100 = a c ?

An extra delimiter would be needed

Huffman codes

Page 163: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 163

Realize the original idea with prefix codes

Huffman codes

f(a) = 5,000   frequent
f(b) = 2,000   rare
f(c) = 6,000   frequent
f(d) = 3,000   rare
f(e) = 4,000   frequent

Codewords of frequent characters should be shorter, e.g.,
a = 00, c = 01, e = 10

Codewords of rare characters can be longer, e.g.,
b = 110, d = 111

Page 164: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 164

Question: How can such a coding be done algorithmically?

Answer: The Huffman codes provide exactly this solution

Huffman codes

Page 165: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 165

The bit-length of the file using this K prefix code is

B(K) = Σ_{c ∈ C} f(c)∙d_K(c) =

5,000∙2 + 2,000∙3 + 6,000∙2 + 3,000∙3 + 4,000∙2 =

(5,000 + 6,000 + 4,000)∙2 + (2,000 + 3,000)∙3 =

30,000 + 15,000 = 45,000

(cf. the fixed-length code gave 60,000, the improved one 56,000)
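The bit-length formula is easy to check numerically. In the Python sketch below the codes are stored as dictionaries of bit strings, so the depth of a character in the tree of the coding is simply the length of its codeword; the function and variable names are assumptions.

```python
def file_length_in_bits(freq, code):
    """B = sum over all characters c of f(c) * (length of the codeword of c)."""
    return sum(freq[c] * len(code[c]) for c in freq)


freq = {"a": 5000, "b": 2000, "c": 6000, "d": 3000, "e": 4000}
fixed = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100"}
prefix = {"a": "00", "c": "01", "e": "10", "b": "110", "d": "111"}

print(file_length_in_bits(freq, fixed))    # 60000
print(file_length_in_bits(freq, prefix))   # 45000
```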

Huffman codes

Page 166: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 166

The greedy method producing Huffman codes

1. Sort the characters of the C alphabet in increasing order according to their frequency in the file and link them into an initially empty list

2. Delete the two leading nodes, some x and y, from the list and connect them with a common parent node z. Let f(z) = f(x) + f(y), insert z into the list at its place according to the order, and repeat step 2 until only a single node (the root of the tree of the coding) is left in the list.
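The greedy method can be sketched in Python with the standard `heapq` module, which plays the role of the sorted list: the two nodes of smallest frequency are always popped first. The function name `huffman_code`, the tuple representation of inner nodes and the tie-breaking counter are assumptions of this sketch, and the alphabet is assumed to be non-empty. Ties may be broken differently than in a hand-drawn tree, so individual codewords can differ while the total bit-length stays the same.

```python
import heapq
from itertools import count


def huffman_code(freq):
    """Return {character: codeword} for a non-empty {character: frequency} map."""
    tie = count()                                   # unique tie-breaker for equal frequencies
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)              # the two leading (rarest) nodes
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))   # common parent z
    _, _, root = heap[0]

    code = {}

    def assign(node, prefix):
        if isinstance(node, tuple):                 # inner node: (left, right)
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                       # leaf: a character
            code[node] = prefix or "0"              # single-character alphabet
    assign(root, "")
    return code


freq = {"a": 5000, "b": 2000, "c": 6000, "d": 3000, "e": 4000}
code = huffman_code(freq)
print(sum(freq[c] * len(code[c]) for c in freq))    # 45000
```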

Huffman codes

Page 167: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 167

Example

Huffman codes

List (character : frequency in thousands):

a : 5   b : 2   c : 6   d : 3   e : 4

Page 168: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 168

Example
1. Sort

Huffman codes

List:

b : 2   d : 3   e : 4   a : 5   c : 6

Page 169: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 169

Example
2. Merge and rearrange

Huffman codes

List:

e : 4   (b : 2, d : 3) : 5   a : 5   c : 6

Page 170: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 170

Example
2. Merge and rearrange

Huffman codes

List:

a : 5   c : 6   (e : 4, (b : 2, d : 3) : 5) : 9

Page 171: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 171

Example
2. Merge and rearrange

Huffman codes

List:

(e : 4, (b : 2, d : 3) : 5) : 9   (a : 5, c : 6) : 11

Page 172: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 172

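Example
2. Merge and rearrange

Huffman codes

List:

((e : 4, (b : 2, d : 3) : 5) : 9, (a : 5, c : 6) : 11) : 20

[Figure: the finished tree with root 20; in each node the edge to the left child is labelled 0 and the edge to the right child is labelled 1]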

Page 173: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 173

Example
Ready

Huffman codes

[Figure: the finished tree with root 20 and the inner nodes 9, 11 and 5; edges to left children are labelled 0, edges to right children 1]

a = 10
b = 010
c = 11
d = 011
e = 00

Page 174: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 174

Example
Length of file in bits

Huffman codes

a = 10      f(a) = 5,000
b = 010     f(b) = 2,000
c = 11      f(c) = 6,000
d = 011     f(d) = 3,000
e = 00      f(e) = 4,000

B(H) = Σ_{c ∈ C} f(c)∙d_H(c) =

5,000∙2 + 2,000∙3 + 6,000∙2 + 3,000∙3 + 4,000∙2 =

(5,000 + 6,000 + 4,000)∙2 + (2,000 + 3,000)∙3 =

30,000 + 15,000 = 45,000

Page 175: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 175

Optimality of the Huffman codes

Assertion 1. There exists an optimal solution where the two rarest characters are deepest twins in the tree of the coding

Assertion 2. Merging two (twin) characters leads to a problem similar to the original one

Corollary. The Huffman codes provide an optimal character coding

Huffman codes

Page 176: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 176

Proof of Assertion 1 (There exists an optimal solution where the two rarest characters are deepest twins in the tree of the coding).

Huffman codes

[Figure: the two rarest characters are exchanged for the deepest twins of the tree]

Changing nodes this way the total length does not increase

Page 177: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 177

Proof of Assertion 2 (Merging two (twin) characters leads to a problem similar to the original one).

Huffman codes

[Figure: the twin characters are merged into a single node]

The new problem is smaller than the original one but similar to it

Page 178: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 178

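Graph representations

Graphs can represent different structures, connections and relations

Graphs

[Figure: a graph with the vertices 1, 2, 3, 4 and the edges 1–2, 1–4, 2–4, 3–4]

Weighted graphs can represent capacities or actual flow rates

[Figure: the same graph with edge weights: 1–2 has weight 2, 1–4 has weight 7, 2–4 has weight 4, 3–4 has weight 5]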

Page 179: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 179

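Adjacency-matrix

Graphs

1: there is an edge leading from 'row' to 'column'
0: there is no such edge

    1  2  3  4
1   0  1  0  1
2   1  0  0  1
3   0  0  0  1
4   1  1  1  0

For a weighted graph the weights are stored instead of the 1 entries:

    1  2  3  4
1   0  2  0  7
2   2  0  0  4
3   0  0  0  5
4   7  4  5  0

Drawback 1: redundant elements (the matrix of an undirected graph is symmetric, so the part above the diagonal suffices):

    1  2  3  4
1      2  0  7
2         0  4
3            5
4

Drawback 2: superfluous elements (the 0 entries of missing edges are stored as well)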

Page 180: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 180

Adjacency-list

Graphs

1: 2 → 4
2: 4 → 1
3: 4
4: 1 → 3 → 2

Optimal storage usage
Drawback: slow search operations
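The two representations can be converted into each other mechanically. The Python sketch below turns the weighted adjacency-matrix of the example graph into an adjacency-list; the function name is hypothetical, the vertices are numbered from 1 as on the slides, and the neighbours come out in increasing order, which may differ from the order in which a linked list was built.

```python
def matrix_to_adjacency_list(matrix):
    """Return {vertex: [neighbours]} for a square adjacency-matrix.
    An entry greater than 0 means that the edge row -> column exists."""
    return {
        v + 1: [j + 1 for j, w in enumerate(row) if w > 0]
        for v, row in enumerate(matrix)
    }


weights = [
    [0, 2, 0, 7],
    [2, 0, 0, 4],
    [0, 0, 0, 5],
    [7, 4, 5, 0],
]
print(matrix_to_adjacency_list(weights))
# {1: [2, 4], 2: [1, 4], 3: [4], 4: [1, 2, 3]}
```

The matrix always stores n² entries, while the lists store one entry per edge endpoint; in exchange, testing whether a given edge exists takes a list scan instead of a single lookup.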

Page 181: A.E. Csallner Department of Applied Informatics University of Szeged Hungary.

Algorithms and Data Structures I 181

Single-source shortest path methods

Problem: find the shortest path between two vertices in a graph

Source: the starting point (vertex)

Single-source shortest path method: an algorithm that finds the shortest paths running out from the source to all vertices in a graph

Graphs


Walk a graph:

choose an initial vertex as the source

visit all vertices starting from the source

Graph walk methods:

depth-first search

breadth-first search

Graph walk


Depth-first search

Backtrack algorithm

It goes as far as it can without revisiting any vertex, then backtracks

[Figure: depth-first walk of a graph starting from the source]
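
The depth-first walk can be sketched in Python as below. This is a minimal recursive sketch, assuming the graph is given as a dictionary of neighbor lists; the function name `depth_first_search` and the example graph are assumptions made for illustration.

```python
def depth_first_search(adj, source):
    """Return the vertices in the order a depth-first walk reaches them."""
    order = []
    seen = set()

    def visit(v):
        seen.add(v)
        order.append(v)
        for w in adj[v]:
            if w not in seen:
                visit(w)  # go as far as possible first
        # Returning from visit is the backtrack step.

    visit(source)
    return order


adj = {1: [2, 4], 2: [1, 4], 3: [4], 4: [1, 2, 3]}
order = depth_first_search(adj, 1)
```

For very deep graphs the recursion could exceed Python's recursion limit; an explicit stack would avoid that at the cost of slightly longer code.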

Graph walk


Breadth-first search

Like an explosion in a mine

The shockwave reaches the adjacent vertices first, and starts over from them

Graph walk


The breadth-first search is not only simpler to implement but it is also the basis for several important graph algorithms (e.g. Dijkstra)

Notation in the following pseudocode:

A is the adjacency-matrix of the graph

s is the source

D is an array containing the distances from the source

P is an array containing the predecessor along a path

Q is the queue containing the unprocessed vertices already reached

Graph walk


BreadthFirstSearch(A,s,D,P)
   for i ← 1 to A.CountRows
      do P[i] ← 0
         D[i] ← ∞
   D[s] ← 0
   Q.Enqueue(s)
   repeat
      v ← Q.Dequeue
      for j ← 1 to A.CountColumns
         do if A[v,j] > 0 and D[j] = ∞
               then D[j] ← D[v] + 1
                    P[j] ← v
                    Q.Enqueue(j)
   until Q.IsEmpty
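
A runnable Python counterpart of the pseudocode might look like the sketch below. It deviates from the slide in three labeled assumptions: vertices are numbered from 0, a missing predecessor is `None` instead of 0, and D and P are returned instead of being passed in.

```python
import math
from collections import deque


def breadth_first_search(A, s):
    """Breadth-first search on an adjacency matrix A from source s (0-based)."""
    n = len(A)
    D = [math.inf] * n   # distances from the source
    P = [None] * n       # predecessors along a path
    D[s] = 0
    Q = deque([s])       # unprocessed vertices already reached
    while Q:
        v = Q.popleft()
        for j in range(n):
            # An edge leads to j and j has not been reached yet.
            if A[v][j] > 0 and D[j] == math.inf:
                D[j] = D[v] + 1
                P[j] = v
                Q.append(j)
    return D, P


A = [[0, 1, 0, 1],
     [1, 0, 0, 1],
     [0, 0, 0, 1],
     [1, 1, 1, 0]]
D, P = breadth_first_search(A, 0)
```

The matrix is the unweighted example graph with its vertices renumbered to 0..3.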

Graph walk


The D,P pairs are displayed in the figure.

Graph walk

[Figure: graph on the vertices 1 to 10; every vertex is labeled with its D,P pair, from 0,0 at the source to 3,9 at the farthest vertices]

D is the shortest distance from the source

The shortest paths can be reconstructed using P
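
Reconstructing a path from P can be sketched as follows. The function name `reconstruct_path` and the sample array are assumptions made for illustration; vertices are numbered from 0 and `None` marks a vertex without a predecessor.

```python
def reconstruct_path(P, s, t):
    """Follow the predecessors from t back to s; return None if t is unreachable."""
    path = [t]
    while path[-1] != s:
        pred = P[path[-1]]
        if pred is None:
            return None  # the chain of predecessors never reaches the source
        path.append(pred)
    path.reverse()
    return path


# Hypothetical predecessor array of a 4-vertex graph with source 0.
P = [None, 0, 3, 0]
path = reconstruct_path(P, 0, 2)
```

The path is collected backwards, from the target to the source, and reversed at the end.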


Dijkstra’s algorithm

Problem: find the shortest path between two vertices in a weighted graph

Idea: extend the breadth-first search for graphs having integer weights:

Dijkstra’s algorithm

[Figure: an edge of weight 3 is replaced by three unweighted edges through two virtual vertices (total weight = 3∙1 = 3)]
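
The idea of virtual vertices can be tried out directly. The sketch below is an illustration of the idea only, not Dijkstra's algorithm itself: it assumes positive integer weights, numbers the vertices from 0, and uses the hypothetical helper names `expand_weighted_edges` and `bfs_distances`.

```python
from collections import deque


def expand_weighted_edges(n, edges):
    """Replace every edge (u, v, w) by a chain of w unweighted edges."""
    adj = {v: [] for v in range(n)}
    next_id = n  # identifiers of the virtual vertices start after the real ones
    for u, v, w in edges:
        prev = u
        for _ in range(w - 1):  # w edges need w - 1 virtual vertices
            adj[next_id] = [prev]
            adj[prev].append(next_id)
            prev = next_id
            next_id += 1
        adj[prev].append(v)
        adj[v].append(prev)
    return adj


def bfs_distances(adj, s):
    """Plain breadth-first search returning the distance of every reached vertex."""
    D = {s: 0}
    Q = deque([s])
    while Q:
        v = Q.popleft()
        for w in adj[v]:
            if w not in D:
                D[w] = D[v] + 1
                Q.append(w)
    return D


EDGES = [(0, 1, 2), (0, 3, 7), (1, 3, 4), (2, 3, 5)]
D = bfs_distances(expand_weighted_edges(4, EDGES), 0)
```

The number of virtual vertices grows with the edge weights, which is why this expansion is useful as an explanation rather than as a practical method.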


Dijkstra(A,s,D,P)
   for i ← 1 to A.CountRows
      do P[i] ← 0
         D[i] ← ∞
   D[s] ← 0
   for i ← 1 to A.CountRows
      do M.Enqueue(i)
   repeat
      v ← M.ExtractMinimum
      for j ← 1 to A.CountColumns
         do if A[v,j] > 0
               then if D[j] > D[v] + A[v,j]
                       then D[j] ← D[v] + A[v,j]
                            P[j] ← v
   until M.IsEmpty

Dijkstra’s algorithm

M is a minimum priority queue of the vertices, ordered by their distances D
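
A runnable Python sketch of the same method is given below, under stated assumptions. Python's `heapq` module has no operation for lowering the key of a stored element, so instead of enqueueing every vertex in advance this sketch pushes a new entry whenever a distance improves and skips outdated entries later. Vertices are numbered from 0 and `None` marks a missing predecessor.

```python
import heapq
import math


def dijkstra(A, s):
    """Shortest distances from s on a weighted adjacency matrix A (0 = no edge)."""
    n = len(A)
    D = [math.inf] * n
    P = [None] * n
    D[s] = 0
    done = [False] * n
    heap = [(0, s)]  # (distance, vertex) pairs, smallest distance first
    while heap:
        d, v = heapq.heappop(heap)  # extract the minimum
        if done[v]:
            continue  # outdated entry, v was already processed
        done[v] = True
        for j in range(n):
            if A[v][j] > 0 and D[j] > d + A[v][j]:
                D[j] = d + A[v][j]
                P[j] = v
                heapq.heappush(heap, (D[j], j))
    return D, P


A = [[0, 2, 0, 7],
     [2, 0, 0, 4],
     [0, 0, 0, 5],
     [7, 4, 5, 0]]
D, P = dijkstra(A, 0)
```

The matrix is the weighted example graph with its vertices renumbered to 0..3; the direct edge of weight 7 is beaten by the detour of total weight 2 + 4 = 6.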


Time complexity of Dijkstra’s algorithm

Initialization of D and P: O(n)

Building a heap for the priority queue: O(n)

Search: n∙O(log n + n) = O(n(log n + n)) = O(n²)

(n: number of loop executions; log n: extracting the minimum; n: checking all neighbors)

Grand total: T(n) = O(n²)

Dijkstra’s algorithm