Coms w3134 Midterm Review

Post on 06-Feb-2016


Review on Data Structures


Data Structures in Java: Midterm Review

3/10/2015

Daniel Bauer

Midterm

• Midterm on Thursday (in-class)

• Similar format to sample questions.

• Closed books/notes/electronic devices (except calculators).

• Bring a pen, water, and nothing else.

• 60 minutes. Be on time!

• If you are taking the midterm tonight (only if signed up!): 5pm in 620 CEPSR/Shapiro

Topics - Overview

• Series, Proofs.

• Running Time Analysis of Algorithms. Big-Oh Notation.

• Abstract Data Types.

• Data Structure Implementations.

• Applications.

• Implementations in Java, Java Concepts.

Types of Proofs

• Proof by Induction

• Proof by Contradiction

• Proof by Counterexample


Remember some examples?

Goals of Algorithm Analysis

• Does the algorithm terminate?

• Does the algorithm solve the problem? (correctness)

• What resources does the algorithm use?

• Time / Space


Comparing Function Growth: Big-Oh Notation

T(N) = O(f(N)) if there are positive constants c and n0 such that T(N) ≤ c · f(N) when N ≥ n0.

Example: T(N) = 10N + 100, f(N) = N^2 + 2

T(N) = O(f(N)), e.g. with c = 1, n0 = 16.1

“T(N) is in the order of f(N)”

“f(N) is an upper bound on T(N)”
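The example constants can be sanity-checked numerically. The following is a minimal sketch, not a proof: it only tests the inequality T(N) ≤ c · f(N) on a finite range of integers. The class and method names (BigOhCheck, boundHolds) are made up for this illustration.

```java
// Numeric sanity check of the Big-Oh example (illustration only, not a proof).
class BigOhCheck {
    static long t(long n) { return 10 * n + 100; } // T(N) = 10N + 100
    static long f(long n) { return n * n + 2; }    // f(N) = N^2 + 2

    // Returns true if T(N) <= c * f(N) for every integer N in [from, to].
    static boolean boundHolds(long c, long from, long to) {
        for (long n = from; n <= to; n++) {
            if (t(n) > c * f(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // With c = 1 the bound holds from N = 17 on (the first integer above 16.1) ...
        System.out.println(boundHolds(1, 17, 100_000)); // true
        // ... but not yet at N = 16, where T(16) = 260 and f(16) = 258.
        System.out.println(boundHolds(1, 16, 16));      // false
    }
}
```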

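Comparing Function Growth: Additional Notations

• Lower Bound: T(N) = Ω(f(N)) if there are positive constants c and n0 such that T(N) ≥ c · f(N) when N ≥ n0.

• Tight Bound: T(N) = Θ(f(N)) if T(N) = O(f(N)) and T(N) = Ω(f(N)). T(N) and f(N) grow at the same rate.

• Strict Upper Bound: T(N) = o(f(N)) if for all positive constants c there is some n0 such that T(N) < c · f(N) when N ≥ n0.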

Typical Growth Rates (slowest to fastest growth)

• constant: c

• logarithmic: log N

• log-squared: log^2 N

• linear: N

• quadratic: N^2

• cubic: N^3

• exponential: 2^N

Data Structures for Sequences

• List: Array, Simple Linked List, Doubly Linked List

• Stack (LIFO): Linked Lists as Stacks, Array Stack

• Queue (FIFO): Linked Lists as Queue, Circular Array Queue

Tree Data Structures

• Tree: Fixed number of children (Binary, N-Ary Tree), Sibling List Representation

• Search Tree (used for Ordered Sets/Maps): Binary Search Tree, N-Ary Search Tree

• Balanced Search Tree: AVL Tree, B-Tree

Sets and Maps

• Set, Map: Hash Table (Linked List entries, Probing Hash Tables) or Balanced Search Tree

• Ordered Set, Ordered Map: Balanced Search Tree

The List ADT

• A list L is a sequence of N objects A0, A1, A2, …, AN-1

• N is the length/size of the list. A list with length N=0 is called the empty list.

• Ai follows/succeeds Ai-1 for i > 0.

• Ai precedes Ai+1 for i < N-1.

A0 A1 A2 A3 A4 A5 A6

Array List

Index: 0 1 2 3 4 5 6 7 8 9

Value: 1 7 3 5 2 1 3 (N=7, array capacity 10)

• insert(5,7): O(1)

• remove(7): O(1)

• insert(5,0): O(N) (7 moves)

• remove(0): O(N)

Worst Case Running Times

• printList O(N)

• find(x) O(N)

• findKth(k) O(1)

• insert(x,k) O(N)

• remove(x) O(N)

Need to copy entire list to larger array if array becomes full.
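The cost of inserting into and removing from an array list comes from shifting elements. Below is a minimal sketch for int items; the class name SimpleArrayList, the initial capacity of 10, the doubling strategy, and the choice to return the number of moved elements are assumptions made for illustration, not part of the slides.

```java
import java.util.Arrays;

// Minimal array list sketch; insert and removeAt report how many elements were shifted.
class SimpleArrayList {
    private int[] items = new int[10]; // assumed initial capacity
    private int size = 0;

    int size() { return size; }

    // O(1): direct array access.
    int findKth(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException("index " + k);
        return items[k];
    }

    // Inserts x at index k and returns the number of elements moved (O(N) worst case).
    int insert(int x, int k) {
        if (k < 0 || k > size) throw new IndexOutOfBoundsException("index " + k);
        if (size == items.length) {
            // Array is full: copy the entire list into a larger array.
            items = Arrays.copyOf(items, 2 * items.length);
        }
        int moves = size - k;
        System.arraycopy(items, k, items, k + 1, moves); // shift right by one
        items[k] = x;
        size++;
        return moves;
    }

    // Removes the element at index k and returns the number of elements moved.
    int removeAt(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException("index " + k);
        int moves = size - k - 1;
        System.arraycopy(items, k + 1, items, k, moves); // shift left by one
        size--;
        return moves;
    }
}
```

With the seven-element list from the slide, inserting at index 7 moves nothing, while inserting at index 0 moves all 7 elements.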

Simple Linked Lists

head → 42 → 23 → 5 → 9 → null

Sequence of nodes linked by “next” pointers.

Worst case Running Times

• printList O(N)

• find(x) O(N)

• findKth(k) O(N)

• next() O(1)

• insert(x,k) O(N)

• remove(k) O(N)

In many applications we can use an iterator instead of findKth(k).
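A singly linked list has to walk the “next” pointers to reach position k, which is where the O(N) bounds come from. The sketch below assumes int items and index-based insert and remove; the class name SimpleLinkedList and the helper nodeAt are hypothetical names chosen for this example.

```java
// Minimal singly linked list sketch with index-based operations.
class SimpleLinkedList {
    private static class Node {
        int item;
        Node next;
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    private Node head; // null for the empty list

    // Follows k "next" pointers starting at head: O(N) in the worst case.
    private Node nodeAt(int k) {
        Node cur = head;
        for (int i = 0; i < k && cur != null; i++) {
            cur = cur.next;
        }
        if (k < 0 || cur == null) throw new IndexOutOfBoundsException("index " + k);
        return cur;
    }

    int findKth(int k) { return nodeAt(k).item; }

    // Inserts x so that it becomes the element at index k.
    void insert(int x, int k) {
        if (k == 0) {
            head = new Node(x, head);
            return;
        }
        Node prev = nodeAt(k - 1);          // O(N) search for the predecessor
        prev.next = new Node(x, prev.next); // O(1) pointer update
    }

    // Removes and returns the element at index k.
    int remove(int k) {
        if (k == 0) {
            Node old = nodeAt(0);
            head = old.next;
            return old.item;
        }
        Node prev = nodeAt(k - 1);
        Node old = prev.next;
        if (old == null) throw new IndexOutOfBoundsException("index " + k);
        prev.next = old.next;
        return old.item;
    }
}
```

Once the predecessor node is known, the actual insertion or removal is a constant-time pointer update; the linear cost is entirely in the search.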

Doubly Linked Lists

head ⇄ A0 ⇄ A1 ⇄ A2 ⇄ A3 ⇄ tail

Sequence of nodes linked by “next” and “prev” pointers.

Worst case Running Times

• printList O(N)

• find(x) O(N)

• findKth(k) O(N)

• next() O(1)

• insert(x,k) O(N)

• remove(k) O(N)

Index-based operations are actually a little faster in practice, because we can start from either end and only have to search at most half the list.

The Stack ADT

Last In First Out (LIFO).

[Figure: stack after push(5), push(42), push(23), push(3) and one pop(); Top is 23, above 42 and 5.]

Operations have the same running time in all implementations:

• push(x) O(1)

• pop() O(1)

• peek() O(1)

• empty() O(1)

Implementations discussed:

• Using an Array List, Using a LinkedList

• Hardware Stacks (memory abstraction, stack machine)

Stack Applications

• Method call stacks.

• Evaluating postfix expressions.

• Converting infix to postfix notation.

• Constructing an expression tree from a postfix expression.

• Perform a tree traversal without recursion (relation to recursion).

• Implementing Queue.

• Re-arranging subway cars.
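As an illustration of one of these applications, here is a minimal sketch of postfix evaluation with a stack. It assumes whitespace-separated tokens, integer operands, and only the operators +, -, * and /; the class name PostfixEvaluator is made up, and java.util.ArrayDeque stands in for the stack implementations discussed in class.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Evaluates a postfix expression such as "1 2 3 * +" using a stack of operands.
class PostfixEvaluator {
    static int evaluate(String expression) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expression.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    // The right operand was pushed last, so it is popped first.
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    stack.push(Integer.parseInt(token)); // operand
            }
        }
        int result = stack.pop();
        if (!stack.isEmpty()) {
            throw new IllegalArgumentException("malformed expression: " + expression);
        }
        return result;
    }

    private static int apply(String op, int left, int right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right; // "/" is the only remaining operator
        }
    }

    public static void main(String[] args) {
        // (1 + 2 * 3) + (4 * 5 + 6) * 7 written in postfix notation.
        System.out.println(evaluate("1 2 3 * + 4 5 * 6 + 7 * +")); // 189
    }
}
```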

The Queue ADT

First In First Out (FIFO) storage.

[Figure: queue holding 5 at the front; 2, 17 and 23 are enqueued at the back; after two dequeues the queue holds 17 (front) and 23 (back).]

Operations have the same running time in all implementations:

• enqueue(x) O(1)

• dequeue() O(1)

• empty() O(1)

Implementations discussed:

• Using a linked list

• Using a “circular array”

Circular Array Implementation of Queue

• Problem: In a naive array implementation, dequeues cause empty space at the beginning of the array.

• A circular array re-uses the empty space by allowing the back pointer to wrap around.

[Figure: array holding 5, 17, 23, 7 with front and back pointers; back wraps around to the start of the array.]

Need to copy entire queue to larger array if array becomes full.
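The wrap-around idea can be sketched with modular index arithmetic. The version below is one possible layout, assuming int items, a front index plus an element count instead of an explicit back pointer, an initial capacity of 4, and doubling when full; the class name CircularArrayQueue is hypothetical.

```java
import java.util.NoSuchElementException;

// Circular array queue sketch: the back position is (front + size) % capacity.
class CircularArrayQueue {
    private int[] items = new int[4]; // assumed initial capacity
    private int front = 0;            // index of the oldest element
    private int size = 0;             // number of stored elements

    boolean empty() { return size == 0; }

    void enqueue(int x) {
        if (size == items.length) {
            grow();
        }
        items[(front + size) % items.length] = x; // back wraps around
        size++;
    }

    int dequeue() {
        if (size == 0) throw new NoSuchElementException("queue is empty");
        int x = items[front];
        front = (front + 1) % items.length; // front wraps around as well
        size--;
        return x;
    }

    // Copies the entire queue, in FIFO order, into an array twice as large.
    private void grow() {
        int[] bigger = new int[2 * items.length];
        for (int i = 0; i < size; i++) {
            bigger[i] = items[(front + i) % items.length];
        }
        items = bigger;
        front = 0;
    }
}
```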

Tree ADT

• A tree T consists of

• a root node r,

• zero or more nonempty subtrees T1, T2, …, TN,

• each connected by a directed edge from r.

• Support typical collection operations: size, get, set, add, remove, find, …

[Figure: tree T with root r and subtrees T1, T2, …, TN.]

Representing Trees

• Option 1: Every node has a fixed number of references to children.

• Problem: Only reasonable for a small or constant number of children.

[Figure: node n0 with direct references to children n1, n2, n3.]

• Option 2: Organize siblings as a linked list. Every node stores a “1st child” and a “next sibling” reference.

• Problem: Takes longer to find a node from the root.

[Figure: node n0 with a 1st child reference to n1; n1, n2, n3 are linked by next sibling references.]

M-ary Trees

• Each node can have M subnodes.

• Height of a complete M-ary tree with N nodes is approximately log_M N.

Binary Trees

• For binary trees, the number of children is at most two.

• Binary trees are very common in data structures and algorithms.

• They are convenient to analyze.

Tree Traversals: In-order

[Figure: expression tree for (a + b * c) + (d * e + f) * g. The root is +; its left subtree is + with children a and *(b, c); its right subtree is * with children +(*(d, e), f) and g.]

1. Process left child 2. Process root 3. Process right child

Result (with parentheses added): (a + b * c) + (d * e + f) * g

Tree Traversals: Post-order

1. Process left child 2. Process right child 3. Process root

Result: a b c * + d e * f + g * +

Tree Traversals: Pre-order

1. Process root 2. Process left child 3. Process right child

Result: + + a * b c * + * d e f g
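The three traversals differ only in where the root is processed relative to the recursive calls. The sketch below rebuilds the expression tree for (a + b * c) + (d * e + f) * g by hand; the class names ExprNode and Traversals and the helper sampleTree are made up for this example, and the in-order version prints no parentheses.

```java
// Node of a binary expression tree; leaves have null children.
class ExprNode {
    final String item;
    final ExprNode left, right;

    ExprNode(String item) { this(item, null, null); }

    ExprNode(String item, ExprNode left, ExprNode right) {
        this.item = item;
        this.left = left;
        this.right = right;
    }
}

class Traversals {
    static String inOrder(ExprNode n) {
        if (n == null) return "";
        return (inOrder(n.left) + " " + n.item + " " + inOrder(n.right)).trim();
    }

    static String postOrder(ExprNode n) {
        if (n == null) return "";
        return (postOrder(n.left) + " " + postOrder(n.right)).trim() + " " + n.item;
    }

    static String preOrder(ExprNode n) {
        if (n == null) return "";
        return n.item + " " + (preOrder(n.left) + " " + preOrder(n.right)).trim();
    }

    // Builds the tree for (a + b * c) + (d * e + f) * g.
    static ExprNode sampleTree() {
        ExprNode left = new ExprNode("+", new ExprNode("a"),
                new ExprNode("*", new ExprNode("b"), new ExprNode("c")));
        ExprNode dTimesE = new ExprNode("*", new ExprNode("d"), new ExprNode("e"));
        ExprNode right = new ExprNode("*",
                new ExprNode("+", dTimesE, new ExprNode("f")), new ExprNode("g"));
        return new ExprNode("+", left, right);
    }

    public static void main(String[] args) {
        ExprNode root = sampleTree();
        System.out.println(inOrder(root).trim());   // a + b * c + d * e + f * g
        System.out.println(postOrder(root).trim()); // a b c * + d e * f + g * +
        System.out.println(preOrder(root).trim());  // + + a * b c * + * d e f g
    }
}
```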

Binary Search Trees

• BST property, for a root r with left subtree Tl and right subtree Tr:

• For all nodes s in Tl, s.item < r.item.

• For all nodes t in Tr, t.item > r.item.

Running times:

• contains(x) O(height(T))

• insert(x) O(height(T))

• findMin() O(height(T))

• findMax() O(height(T))

• remove(x) O(height(T))

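Worst and Best Case Height of a Binary Search Tree

• Assume we have a BST with N nodes.

• Worst case: T does not branch. The longest path has N nodes (N - 1 edges), so height(T) is O(N).

• Best case: T is balanced, so height(T) is about log N.

[Figure: a non-branching chain 1, 2, 3, 4 next to a balanced tree holding the keys 1 to 5.]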
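How much the insertion order matters can be seen with a small unbalanced BST. This is a minimal sketch for int keys; the class name IntBst is made up, duplicates are ignored, and height is counted in edges here (an empty tree has height -1), which is a convention assumed for this example.

```java
// Unbalanced binary search tree sketch for int keys.
class IntBst {
    private static class Node {
        final int item;
        Node left, right;
        Node(int item) { this.item = item; }
    }

    private Node root;

    void insert(int x) { root = insert(root, x); }

    private static Node insert(Node n, int x) {
        if (n == null) return new Node(x);
        if (x < n.item) {
            n.left = insert(n.left, x);
        } else if (x > n.item) {
            n.right = insert(n.right, x);
        }
        return n; // x == n.item: duplicate, nothing to do
    }

    // Follows one path from the root, so the cost is O(height(T)).
    boolean contains(int x) {
        Node cur = root;
        while (cur != null) {
            if (x == cur.item) return true;
            cur = (x < cur.item) ? cur.left : cur.right;
        }
        return false;
    }

    // Height in edges; the empty tree has height -1.
    int height() { return height(root); }

    private static int height(Node n) {
        return (n == null) ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Inserting 1, 2, 3, 4 in sorted order yields a chain of height 3, while inserting 3, 2, 4, 1, 5 yields a tree of height 2.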

AVL Tree Condition

• An AVL Tree is a Binary Search Tree in which the following balance condition holds after each operation:

• For every node, the height of the left and right subtree differs by at most 1.

[Figure: two binary search trees over keys between 1 and 8, with each node annotated with its subtree heights. One satisfies the balance condition; the other is not an AVL tree.]

Maintaining Balance in an AVL Tree

• Assume the tree is balanced.

• After each insertion, find the lowest node k that violates the balance condition (if any).

• Perform a rotation to re-balance the tree.

• The rotation restores the height that the subtree rooted at k had before the insertion, so no further rotations are needed.

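Single Rotation

[Figure: nodes k1 and k2 with subtrees x, y and z, shown before and after a single rotation.]

Double Rotation

[Figure: nodes k1, k2 and k3 with subtrees x, yl, yr and z, shown before and after a double rotation.]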
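The rotations can be written as a few pointer updates. The sketch below follows a common textbook formulation and is not necessarily the exact code used in class: rotateWithLeftChild handles the case where the left subtree is too deep on its outside, and doubleWithLeftChild handles the inside case via two single rotations. All class and method names, and the height convention (empty subtree = -1), are assumptions for this example.

```java
// AVL node that caches the height of its subtree (empty subtree = -1).
class AvlNode {
    int item;
    int height;
    AvlNode left, right;

    AvlNode(int item, AvlNode left, AvlNode right) {
        this.item = item;
        this.left = left;
        this.right = right;
        this.height = 1 + Math.max(AvlRotations.height(left), AvlRotations.height(right));
    }
}

class AvlRotations {
    static int height(AvlNode n) { return n == null ? -1 : n.height; }

    // Single rotation: the left child k1 becomes the new root of the subtree.
    static AvlNode rotateWithLeftChild(AvlNode k2) {
        AvlNode k1 = k2.left;
        k2.left = k1.right; // subtree y moves from k1 to k2
        k1.right = k2;
        k2.height = 1 + Math.max(height(k2.left), height(k2.right));
        k1.height = 1 + Math.max(height(k1.left), k2.height);
        return k1;
    }

    // Mirror image: the right child k2 becomes the new root of the subtree.
    static AvlNode rotateWithRightChild(AvlNode k1) {
        AvlNode k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        k1.height = 1 + Math.max(height(k1.left), height(k1.right));
        k2.height = 1 + Math.max(k1.height, height(k2.right));
        return k2;
    }

    // Double rotation: rotate k3's left child with its right child, then k3 with its new left child.
    static AvlNode doubleWithLeftChild(AvlNode k3) {
        k3.left = rotateWithRightChild(k3.left);
        return rotateWithLeftChild(k3);
    }
}
```

In both cases the returned node replaces the old subtree root in its parent.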

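B-Trees

• A B-Tree is an M-Ary search tree.

• Every internal node (except for the root) has between ⌈M/2⌉ and M children and contains one fewer value than it has children.

• All leaves contain between ⌈L/2⌉ and L values (usually L=M-1)

• All leaves have the same depth.

• Often used to store large tables on hard disk drives (databases, file systems).

[Figure: example B-Tree holding the keys 16, 25, 27, 33, 34, 36, 38, 41, 46, 48; the root contains 27 and 38.]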

OrderedSet ADT

• A set with a total order defined on the items (all pairs of items are in a ‘>’ or ‘<’ relation to each other).

• Supported operations: all Set operations and

• findMin()

• findMax()

Set ADT

• A Set is a collection of data that does not allow duplicates.

• Supported operations:

• insert(x)

• remove(x)

• contains(x)

• isEmpty()

• size()

• addAll(s) / union(s)

• removeAll(s)

• retainAll(s) / intersection(s)

[Figure: Venn diagram of two sets A and B over the items 1 to 9, showing A ∩ B and A ∪ B.]
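Java's collection classes already provide these operations. The sketch below uses java.util.TreeSet, an ordered set backed by a balanced search tree, where first() and last() play the role of findMin() and findMax(). The sets A and B are made-up sample data (not read off the slide figure), and the class name SetOperationsDemo is hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetOperationsDemo {
    // Union via addAll on a copy, so that the arguments stay unchanged.
    static TreeSet<Integer> union(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // Intersection via retainAll on a copy.
    static TreeSet<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<>(Arrays.asList(1, 2, 3, 4, 7)); // sample data
        Set<Integer> b = new TreeSet<>(Arrays.asList(4, 5, 6, 7, 8, 9));

        TreeSet<Integer> all = union(a, b);
        System.out.println(all);                            // [1, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(intersection(a, b));             // [4, 7]
        System.out.println(all.first() + " " + all.last()); // 1 9
    }
}
```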

Map ADT

• A map is a collection of (key, value) pairs.

• Keys are unique, values need not be (keys are a Set!).

• Two operations:

• get(key) returns the value associated with this key

• put(key, value) stores the pair (overwrites the value of an existing key)

[Figure: keys key1, key2, key3, key4 mapped to values value1, value2, value3; two keys share a value.]

Hash Tables

• Define a table (an array) of some length TableSize.

• Define a function hash(key) that maps key objects to an integer index in the range 0 … TableSize - 1.

• Assuming hash(key) takes constant time, get and put run in O(1).

[Figure: table with cells 0 to TableSize - 1; hash(key) maps the key Alice to one cell, which stores the entry (Alice, 555-341-1231).]

Separate Chaining

• Keep all items with the same hash value on a linked list.

• Slow if the load factor (number of items divided by TableSize) becomes > 1.

[Figure: table with cells 0 to TableSize - 1. One cell holds a linked list with the entries (Alice, 555-341-1231) and (Bob, 555-341-1231); hash(key) maps Anna to a cell where (Anna, 555-521-2973) is added to that cell's list.]
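A separate-chaining table can be sketched as an array of linked lists. This is a minimal version under stated assumptions: String keys and values, a fixed TableSize without rehashing, and String.hashCode() reduced with Math.floorMod as the hash function. The class name ChainedHashMap is made up for this example.

```java
import java.util.LinkedList;

// Hash map sketch using separate chaining with a fixed table size.
class ChainedHashMap {
    private static class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] table;

    @SuppressWarnings("unchecked")
    ChainedHashMap(int tableSize) {
        table = new LinkedList[tableSize];
        for (int i = 0; i < tableSize; i++) {
            table[i] = new LinkedList<>();
        }
    }

    // Maps a key to an index in the range 0 … TableSize - 1.
    private int hash(String key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    // Overwrites the value if the key already exists, otherwise appends to the chain.
    void put(String key, String value) {
        for (Entry e : table[hash(key)]) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        table[hash(key)].add(new Entry(key, value));
    }

    // Returns the value for key, or null if the key is not present.
    String get(String key) {
        for (Entry e : table[hash(key)]) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

Both operations scan only one chain, so their cost depends on the chain length, which grows with the load factor.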

Hash Tables without Linked Lists: Probing

• When a collision occurs, put the item in an empty cell of the hash table itself.

[Figure: table with cells 0 to 10 holding 40 and 89; hash(x) = x % 11 maps the key 40 to cell 7.]

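Linear Probing

• Can always find an alternative cell if there is still space.

• Search becomes slow because of primary clustering.

[Figure: table with cells 0 to 10 holding 40, 89, 51, 18 and 39; hash(x) = x % 11 maps the key 17 to cell 6, which is occupied, so 17 is placed in the next free cell.]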
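Linear probing can be sketched as a scan over consecutive cells. The version below assumes non-negative int keys, hash(x) = x % TableSize, and no deletion or rehashing; the class name LinearProbingTable is hypothetical, and insert returns the chosen cell only so that the effect is easy to observe.

```java
// Open-addressing hash table sketch with linear probing.
class LinearProbingTable {
    private final Integer[] cells; // null marks an empty cell

    LinearProbingTable(int tableSize) {
        cells = new Integer[tableSize];
    }

    // Stores x and returns the index of the cell that was used.
    int insert(int x) {
        int start = x % cells.length;
        for (int i = 0; i < cells.length; i++) {
            int index = (start + i) % cells.length; // try start, start+1, start+2, …
            if (cells[index] == null || cells[index] == x) {
                cells[index] = x;
                return index;
            }
        }
        throw new IllegalStateException("table is full");
    }

    boolean contains(int x) {
        int start = x % cells.length;
        for (int i = 0; i < cells.length; i++) {
            int index = (start + i) % cells.length;
            if (cells[index] == null) return false; // an empty cell ends the probe sequence
            if (cells[index] == x) return true;
        }
        return false;
    }
}
```

With a table of size 11 and the keys 40, 89, 51, 18, 39 inserted in that order, the key 17 hashes to cell 6 and has to probe past cells 6, 7, 8 and 9 before landing in cell 10, which shows how a cluster lengthens later searches.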

Quadratic Probing

• No primary clustering.

• If the table size is not prime or the table is more than half full, it is possible that no empty cell can be found for a key, even if there is still space in the table.

[Figure: table with cells 0 to 10 holding 25, 89 and 14; hash(x) = x % 11 maps the key 47 to cell 3; the i-th probe uses the offset f(i) = i^2, e.g. f(3) = 9.]

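Double Hashing

• Compute a second hash function to determine a linear offset for this key.

[Figure: table with cells 0 to 10 holding 40, 89 and 84; hash(x) = x % 11 maps the key 62 to cell 7, which is occupied; hash2(x) = 5 - x % 5 gives 3, so the first probe offset is f(1) = 1 · hash2(x) = 3.]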
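The probe sequence for double hashing can be sketched as follows, using the two hash functions from the slide: hash(x) = x % 11 and hash2(x) = 5 - x % 5. The sketch assumes non-negative int keys, a fixed table of size 11, and no deletion; the class name DoubleHashingTable is hypothetical.

```java
// Open-addressing hash table sketch with double hashing (table size 11).
class DoubleHashingTable {
    private static final int TABLE_SIZE = 11;
    private final Integer[] cells = new Integer[TABLE_SIZE]; // null marks an empty cell

    static int hash(int x) { return x % TABLE_SIZE; }

    // Second hash function; always in the range 1 … 5, so the offset is never 0.
    static int hash2(int x) { return 5 - x % 5; }

    // Stores x and returns the index of the cell that was used.
    int insert(int x) {
        for (int i = 0; i < TABLE_SIZE; i++) {
            // The i-th probe uses the offset f(i) = i * hash2(x).
            int index = (hash(x) + i * hash2(x)) % TABLE_SIZE;
            if (cells[index] == null || cells[index] == x) {
                cells[index] = x;
                return index;
            }
        }
        throw new IllegalStateException("no empty cell found");
    }
}
```

Because 11 is prime and the offset is between 1 and 5, the probe sequence visits every cell before it repeats. Inserting 40, 89, 84 and then 62 places 62 in cell 10: its home cell 7 is taken by 40, and the first probe adds hash2(62) = 3.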