Jim DSA Questions

8/22/2019 Jim DSA Questions


Questions and exercises for Review

1. What steps should one take when solving a problem using a computer?
First construct an exact model in terms of which we can express allowed solutions. Finding such a model is already half the solution. (Any branch of mathematics or science can be called into service to help model the problem domain.) Once we have a suitable mathematical model, we can specify a solution in terms of that model.

2. Explain some issues when dealing with the representation of real world objects in a computer program.
The issues are:
- how real world objects are modeled as mathematical entities;
- the set of operations that we define over these mathematical entities;
- how these entities are stored in a computer's memory (e.g. how they are aggregated as fields in records and how these records are arranged in memory, perhaps as arrays or as linked structures);
- the algorithms that are used to perform these operations.

3. Explain the notions: model of computation; computational problem; problem instance; algorithm; and program.
Model of Computation: An abstract sequential computer, called a Random Access Machine or RAM. Uniform cost model.
Computational Problem: A specification in general terms of inputs and outputs and the desired input/output relationship.
Problem Instance: A particular collection of inputs for a given problem.
Algorithm: A method of solving a problem which can be implemented on a computer. Usually there are many algorithms for a given problem.
Program: Particular implementation of some algorithm.

4. Show the algorithm design algorithm.
Algorithm design (informal problem)
formalize problem (mathematically) [Step 0]
repeat
    devise algorithm [Step 1]
    analyze correctness [Step 2]
    analyze efficiency [Step 3]
    refine
until algorithm good enough
return algorithm

5. What might be the resources considered in algorithm analysis?
Determine the amount of some resource required by an algorithm (usually depending on the size of the input). The resource might be:
- running time
- memory usage (space)
- number of accesses to secondary storage
- number of basic arithmetic operations
- network traffic
Formally, we define the running time of an algorithm on a particular input instance to be the number of computation steps performed by the algorithm on this instance.

6. Explain the big-oh class of growth.
Typically, problems become computationally intensive as the input size grows. Hence we are studying the asymptotic efficiency of algorithms.
Informally, the time T(n) to solve a problem of size n is O(log n) if T(n) = c log2 n.
Formally: O(g(n)) is the set of functions f s.t. f(n) <= c g(n) for some constant c > 0 and all n > N (sufficiently large N).
Alternatively we may write lim(n->inf) f(n)/g(n) <= c and say that g is an upper bound for f.
For example, lg^k n = O(n) for all k in N.

7. Explain the big-omega class of growth.
Typically, problems become computationally intensive as the input size grows. Hence we are studying the asymptotic efficiency of algorithms.
Formally: Omega(g(n)) is the set of functions f s.t. f(n) >= c g(n) for some constant c > 0 and all n > N (sufficiently large N).
Omega(g(n)): the class of functions f(n) that grow at least as fast as g(n).
g(n) describes the best case behaviour of an algorithm that is Omega(g).

8. Explain the big-theta class of growth.
Typically, problems become computationally intensive as the input size grows. Hence we are studying the asymptotic efficiency of algorithms.
Formally: Theta(g(n)) is the set of functions f s.t. c1 g(n) <= f(n) <= c2 g(n) for some constants c1 > 0, c2 > 0 and all n > N (sufficiently large N). Equivalently, f is both O(g(n)) and Omega(g(n)).

0, c2>0, n0 in N s.t.: c2n^2



14. Suppose we have two algorithms to solve the same problem. One runs in time T1(n) = 400n, whereas the other runs in time T2(n) = n^2. What are the complexities of these two algorithms? For what values of n might we consider using the algorithm with the higher complexity?
T1 --> O(n), T2 --> O(n^2). The quadratic algorithm is the faster one while n^2 < 400n, i.e. for n < 400.
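The crossover point in question 14 can be checked numerically. The sketch below is only an illustration; the function names `t1` and `t2` are made up here, and the running times are taken exactly as stated in the question, with no hidden constants.

```python
def t1(n: int) -> int:
    """Running time of the linear algorithm, as given in the question."""
    return 400 * n


def t2(n: int) -> int:
    """Running time of the quadratic algorithm, as given in the question."""
    return n * n


# Smallest n at which the quadratic algorithm stops being strictly faster.
crossover = next(n for n in range(1, 10_000) if t2(n) >= t1(n))
print(crossover)
```

Below the crossover the O(n^2) algorithm does less work, which is why an asymptotically worse algorithm can still be the practical choice for small inputs.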



(A size member lets us know at all times how many records the list holds. The first and last elements are needed in order to know where to start and where to stop when an operation traverses the whole list.)

22. When would you use a doubly-linked list instead of a singly-linked one? Why?
A doubly linked list makes sense when you need to traverse the list in both directions (forward and backward). A singly linked list can only be traversed in one direction.
A node on a doubly linked list may be deleted with little trouble, since we have pointers to the previous and next nodes. A node on a singly linked list cannot be removed unless we have the pointer to its predecessor.
On the flip side, a doubly linked list needs more pointer updates when inserting or deleting, and it needs more space (to store the extra pointer).
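A minimal sketch of the point made in question 22, assuming a circular doubly linked list with one sentinel node. The class and method names (`DoublyLinkedList`, `insert_front`, `remove`) are illustrative and do not come from the course material. Removal needs only the node itself, because its predecessor is reachable through `prev`.

```python
class Node:
    def __init__(self, key=None):
        self.key = key
        self.prev = self  # a fresh node points at itself in both directions
        self.next = self


class DoublyLinkedList:
    def __init__(self):
        self.sentinel = Node()  # dummy node: no special cases for an empty list
        self.size = 0

    def insert_front(self, key):
        """Insert right after the sentinel in O(1); returns the new node."""
        node = Node(key)
        node.next = self.sentinel.next
        node.prev = self.sentinel
        self.sentinel.next.prev = node
        self.sentinel.next = node
        self.size += 1
        return node

    def remove(self, node):
        """Unlink a node in O(1); no search for the predecessor is needed."""
        node.prev.next = node.next
        node.next.prev = node.prev
        self.size -= 1

    def forward(self):
        keys, node = [], self.sentinel.next
        while node is not self.sentinel:
            keys.append(node.key)
            node = node.next
        return keys

    def backward(self):
        keys, node = [], self.sentinel.prev
        while node is not self.sentinel:
            keys.append(node.key)
            node = node.prev
        return keys


numbers = DoublyLinkedList()
nodes = {key: numbers.insert_front(key) for key in [32, 11, 22, 15, 17, 2, -3]}
numbers.remove(nodes[15])
print(numbers.forward())
print(numbers.backward())
```

The price is visible in `insert_front`: four pointer assignments per insertion instead of the two a singly linked list would need.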

23. Show the result of inserting the numbers 32, 11, 22, 15, 17, 2, -3 in a doubly linked list with a sentinel.

-3 -> 2 -> 17 -> 15 -> 22 -> 11 -> 32

24. Show the result of inserting the numbers 32, 11, 22, 15, 17, 2, -3 in a circular queue of capacity 9.

-3, 2, 17, 15, 22, 11, 32, 0, 0

25. Show the result of inserting the numbers 32, 11, 22, 15, 17, 2, -3 in a stack of capacity 12.

-3, 2, 17, 15, 22, 11, 32, 0, 0, 0, 0, 0

26. Determine the value returned by the function (depending on n), and the worst case running time, in big-Oh notation, of the following program.

function mistery(n)
    r := 0
    for i := 1 to n-1 do
        for j := i+1 to n do
            for k := 1 to j do
                r := r+1
    return(r);

(n-1)n(n+1)/3, O(n^3)

27. Determine the value returned by the function (depending on n), and the worst case running time, in big-Oh notation, of the following program.

function pesky(n)
    r := 0
    for i := 1 to n do
        for j := 1 to i do
            for k := j to i+j do
                r := r+1;
    return(r);

n(n+1)(n+2)/3, O(n^3)
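The closed forms given for questions 26 and 27 can be checked by brute force. The following is a direct Python transcription of the two pseudocode functions (the loop bounds are inclusive in the pseudocode, hence the `+ 1` in each `range`). It is a sanity check, not a derivation.

```python
def mistery(n: int) -> int:
    r = 0
    for i in range(1, n):                  # i = 1 .. n-1
        for j in range(i + 1, n + 1):      # j = i+1 .. n
            for k in range(1, j + 1):      # k = 1 .. j  (j iterations)
                r += 1
    return r


def pesky(n: int) -> int:
    r = 0
    for i in range(1, n + 1):              # i = 1 .. n
        for j in range(1, i + 1):          # j = 1 .. i
            for k in range(j, i + j + 1):  # k = j .. i+j  (i+1 iterations)
                r += 1
    return r


for n in range(8):
    print(n, mistery(n), (n - 1) * n * (n + 1) // 3, pesky(n), n * (n + 1) * (n + 2) // 3)
```

Both functions execute the innermost statement a cubic number of times, which is where the O(n^3) bound comes from.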

28. Define the term "rooted tree" both formally and informally.
Rooted tree: a collection of elements called nodes, one of which is distinguished as the root, along with a relation ("parenthood") that imposes a hierarchical structure on the nodes.
Formal definition:
- A single node by itself is a tree. This node is also the root of the tree.
- Assume n is a node and T1, T2, ..., Tk are trees with roots n1, n2, ..., nk. We construct a new tree by making n the parent of nodes n1, n2, ..., nk.
A rooted tree is a common data structure for non-linear collections.

29. Define the terms ancestor, descendant, parent, child, sibling as used with rooted trees.
ancestor: y is an ancestor of x if y is on the unique path from the root to x.
descendant: y is a descendant of x if x is on the unique path from the root to y.
parent: A node that has a child is called the child's parent node (or superior). A node has at most one parent.
child: Each node in a tree has zero or more child nodes, which are below it in the tree (by convention, trees are drawn growing downwards). A node is a structure which may contain a value, a condition, or represent a separate data structure (which could be a tree of its own).
sibling: siblings are nodes that share the same parent node.

30. Define the terms path, height, depth, level as used with rooted trees.
For a rooted tree T = (V, E) with root r in V:
Path: a sequence n1, n2, ..., nk such that ni = parent(ni+1) for 1 <= i < k



26. Construct the tree whose postorder traversal is: 5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1, and inorder traversal is 5, 2, 1, 10, 6, 3, 11, 7, 12, 8, 4, 9.

27. Show the vector contents for an implementation of the tree in Fig. 1.

node:    1  2  3  4  5  6  7  8  9  10  11  12
parent:  0  1  1  1  2  3  3  4  4   6   7   7

28. Show the contents of the data structures (in a sketch) for an implementation of the tree in Fig. 1 using lists of children.

1 --> 2 --> 3 --> 4 *
2 --> 5 *
3 --> 6 --> 7 *
4 --> 8 --> 9 *
5 *
6 --> 10 *
7 --> 11 --> 12 *
8 *
9 *
10 *
11 *
12 *

29. Show the contents of the data structures (in a sketch) for an implementation of the tree in Fig. 1 using the leftmost child - right sibling method.

node  leftmost child  right sibling
1     2               *
2     5               3
3     6               4
4     8               *
5     *               *
6     10              7
7     11              *
8     *               9
9     *               *
10    *               *
11    *               12
12    *               *
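The three representations in questions 27 to 29 describe the same tree, so each can be derived from the parent vector. The sketch below assumes that 0 in the parent vector means "no parent" and uses `None` where the answers write `*`. The function names are illustrative.

```python
# Parent vector from question 27: parent[node] is the node's parent, 0 for the root.
parent = {1: 0, 2: 1, 3: 1, 4: 1, 5: 2, 6: 3, 7: 3, 8: 4, 9: 4, 10: 6, 11: 7, 12: 7}


def children_lists(parent):
    """Lists-of-children representation: node -> ordered list of its children."""
    children = {node: [] for node in parent}
    for node in sorted(parent):
        if parent[node] != 0:
            children[parent[node]].append(node)
    return children


def leftmost_child_right_sibling(children):
    """Two maps: node -> leftmost child, node -> right sibling (None if absent)."""
    leftmost = {node: (kids[0] if kids else None) for node, kids in children.items()}
    sibling = {node: None for node in children}
    for kids in children.values():
        for left, right in zip(kids, kids[1:]):
            sibling[left] = right
    return leftmost, sibling


children = children_lists(parent)
leftmost, sibling = leftmost_child_right_sibling(children)
for node in sorted(parent):
    print(node, children[node], leftmost[node], sibling[node])
```

The parent vector is the most compact form but only supports moving upwards cheaply; the other two trade extra space for direct access to children.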

30. Show the binary search tree which results after inserting the nodes with keys 5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1 in that order, in an empty tree.
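A hand-drawn answer to question 30 can be checked with a small program. This is a minimal sketch of plain (unbalanced) binary search tree insertion; the names `BSTNode`, `bst_insert` and `show` are illustrative, and sending equal keys to the right is an assumption that does not matter here because the keys are distinct.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key below root and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


def show(root, depth=0):
    """Print the tree rotated by 90 degrees: right subtree on top, root at the left."""
    if root is not None:
        show(root.right, depth + 1)
        print("    " * depth + str(root.key))
        show(root.left, depth + 1)


root = None
for key in [5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1]:
    root = bst_insert(root, key)
show(root)
```

The insertion order decides the shape: 5 stays at the root, and runs of increasing keys such as 6, 7, 8, 9 form a chain of right children.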



31. Show the binary search tree that results after deleting keys 10, 5 and 6, in that order, from the binary search tree of Fig. 2.

32. How do we find the smallest node in a binary search tree? What is the runtime complexity to do this in both an unbalanced and balanced binary search tree, in the worst case? How do we find the largest node in a binary search tree? What are the runtime complexities for this?
The smallest node in a binary search tree is the node that is furthest to the left. To locate this node we descend through the tree by following left pointers until reaching the end of the branch. In an unbalanced binary search tree this requires O(n) time in the worst case, where n is the number of nodes in the tree. This occurs, for example, when the tree consists of a single branch to the left. However, if we keep the tree balanced, no branch will be longer than about lg n nodes. Thus, the running complexity of searching for the smallest node in this case is O(lg n). Finding the largest node is a similar process, except that the largest node is the one that is furthest to the right of the tree. The runtime complexities for these are the same as for locating the smallest node. If we are interested in repeatedly determining only the smallest/largest element in a set of data, we use a priority queue.

33. Compare the performance of operations insert (add), delete and find for arrays, doubly-linked lists and BSTs.

34. What is the purpose of static BSTs and what criteria are used to build them?
Purpose: reduce the time of search. Criterion: more frequently accessed keys are kept closer to the root.

35. If we would have two functions, bitree_rem_left (for removing the left subtree) and bitree_rem_right (for removing the right subtree), why should a postorder traversal be used to remove the appropriate subtree? Could a preorder or inorder traversal have been used instead?
It is essential to use a postorder traversal here because a subtree must be removed in its entirety before removing its parent. A preorder traversal ends up removing the parent first, thus freeing the parent and making it impossible to access its children. An inorder traversal also doesn't work because we still end up removing the parent before its right subtree. (A node can be removed safely only after both of its subtrees have been visited, and only a postorder traversal guarantees this; preorder and inorder traversals visit the root before at least one of its subtrees.)

36. When might we choose to make use of a tree with a relatively large branching factor, instead of a binary tree, for example?
Larger branching factors keep a tree shorter for a given number of nodes, provided the tree remains relatively balanced. Therefore a large branching factor is desirable when an application is particularly sensitive to the height of the tree.
Search trees are a good example, although typically the difference in performance attributed to larger branching factors is not that significant when the tree resides in memory. This is one reason that binary trees are most common for searching in memory. However, when searching in the considerably slower world of secondary storage, a larger branching factor can make a substantial difference. In this situation, typically some type of B-tree is used.

37. In a binary search tree, the successor of some node x is the next largest node after x. For example, in a binary search tree containing the keys 24, 39, 41, 55, 87, 92, the successor of 41 is 55. How do we find the successor of a node in a binary search tree? What is the runtime complexity of this operation?
If x has a right child, the successor is the smallest node of the right subtree of x: we take the right child and then follow left pointers until reaching the end of the branch. Otherwise the successor is the nearest ancestor of x whose left subtree contains x (there is none if x is the largest node). In both cases we walk along a single branch, so the complexity is O(h) for a tree of height h: O(lg n) in a balanced tree and O(n) in the worst case in an unbalanced one.

38. A multiset is a type of set that allows members to occur more than once. How would the runtime complexities of inserting and removing members with a multiset compare with the operations for inserting and removing members from a set?
When inserting a member into a set in which members may not be duplicated we must search the entire set to ensure that we do not duplicate a member. This is an O(n) process. Removing a member from a set is O(n) as well because we may have to search the entire set again. In a multiset, inserting a member is considerably more efficient because we do not have to traverse the members looking for duplicates. Therefore, we can insert a new member in O(1) time. In a multiset removing a member remains an O(n) process because we still must search for the member we want to remove.

39. The symmetric difference of two sets consists of those members that are in either of the two sets, but not both. The notation for the symmetric difference of two sets, S1 and S2, is S1 Δ S2. How could we implement a symmetric difference operation using the set operations union, intersection and difference? Could this operation be implemented more efficiently some other way?
A Δ B = (A \ B) U (B \ A) = (A U B) \ (A ∩ B)
Therefore we could implement this operation using 2 calls to set_difference followed by a call to set_union. This produces a worst-case running time of T(m,n) = 3mn times some constant, for a complexity of O(mn), where m is the size of A and n is the size of B.

40. Sketch the algorithm for HashInsert in a hash table using open addressing.
Hashinsert (B,k)
1 i



operation, where m is the number of positions in the table. This case can occur with any hash function. To ensure reasonable performance in an open-addressed hash table, we should not let the table become more than 80% full.

44. Explain the generation of hash codes using memory addresses, integer cast and component sum.
Memory address:
- We reinterpret the memory address of the key object as an integer.
- Good in general, except for numeric and string keys.
Integer cast:
- We reinterpret the bits of the key as an integer.
- Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int, and float in C).
Component sum:
- We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows).
- Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in C).

45. Explain the generation of hash codes using polynomial accumulation.
Polynomial accumulation:
- We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits): a0, a1, ..., a(n-1).
- We evaluate the polynomial p(x) = a0 + a1 x + a2 x^2 + ... + a(n-1) x^(n-1) at a fixed value x, ignoring overflows.
- Especially suitable for strings (e.g., the choice x = 33 gives at most 6 collisions on a set of 50,000 English words).
Polynomial p(x) can be evaluated in O(n) time using Horner's rule. The following polynomials are successively computed, each from the previous one in O(1) time:
p0(x) = a(n-1)
pi(x) = a(n-i-1) + x p(i-1)(x)   (i = 1, 2, ..., n-1)
We have p(x) = p(n-1)(x).

46. How can one implement a compression function using the MAD technique?
Division: h2(y) = y mod m
- The size m of the hash table is usually chosen to be a prime.
- The reason has to do with number theory and is beyond the scope of this course.
Multiply, Add and Divide (MAD): h2(y) = (ay + b) mod m
- a and b are nonnegative integers such that a mod m != 0.
- Otherwise, every integer would map to the same value b.

47. Explain the quadratic hashing rehashing strategy.
Linear hashing
h(k,i) = (h(k) + i) mod m 0



Double hashing
h(k,i) = (h1(k) + i h2(k)) mod m
- h1, h2: auxiliary hash functions.
- Initially, position B[h1(k)] is checked; successive positions are h2(k) mod m away from the previous positions (the sequence depends in two ways on key k).
- h2(k) and m must be relatively prime (to allow for the whole table to be searched). To ensure this condition:
  - take m a power of 2 and make h2(k) generate an odd number, or
  - take m prime and make h2(k) return a positive integer smaller than m, e.g.
    h1(k) = k mod m
    h2(k) = 1 + (k mod m'), with m' slightly smaller than m

49. Show the hash table which results after inserting the values 5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1 in a chained hash table with N=5 and hash function h(x) = x mod N.

0: 5 --> 10 *
1: 6 --> 11 --> 1 *
2: 2 --> 12 --> 7 *
3: 3 --> 8 *
4: 9 --> 4 *
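The chained table of question 49 can be reproduced in a few lines. The sketch assumes that each new key is appended at the tail of its chain, which is the convention the answer above appears to follow. The function name `chained_insert_all` is illustrative.

```python
def chained_insert_all(keys, n_buckets):
    """Build a chained hash table with h(x) = x mod n_buckets."""
    table = [[] for _ in range(n_buckets)]
    for key in keys:
        table[key % n_buckets].append(key)  # append at the tail of the chain
    return table


table = chained_insert_all([5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1], 5)
for index, chain in enumerate(table):
    print(index, " --> ".join(str(key) for key in chain))
```

With 12 keys in 5 buckets the load factor is 2.4, so every successful search scans a chain of two or three entries at most in this instance.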

50. Show the hash table which results after inserting the values 5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1 in an open addressing hash table with N=16, N'=13 using double hashing. The hash functions are h1(x) = x mod N, and h2(x) = 1 + (x mod N').
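Question 50 can be worked through with a short simulation. The sketch below assumes the probe sequence h(k,i) = (h1(k) + i h2(k)) mod N with the two hash functions given in the question; the function name `double_hash_insert` and the use of `None` for an empty slot are illustrative choices.

```python
def double_hash_insert(table, key, n_prime):
    """Insert key by double hashing; returns the slot that was used."""
    m = len(table)
    h1 = key % m
    h2 = 1 + (key % n_prime)
    for i in range(m):                 # give up after m probes
        slot = (h1 + i * h2) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no free slot found on the probe sequence")


table = [None] * 16
for key in [5, 2, 10, 6, 11, 12, 7, 3, 8, 9, 4, 1]:
    double_hash_insert(table, key, 13)
print(table)
```

All twelve keys are smaller than N = 16, so h1 already sends each of them to a distinct slot and h2 is never consulted. A key such as 17 (not part of the question) would collide at slot 1 and be moved along its probe sequence in steps of h2(17) = 5.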

51. What are the operations for the priority queue ADT?
Priority queue: an ADT based on the set model with the operations insert and deletemin (as well as the usual createEmpty for initialization of the data structure).
Additional support operations:
- min() returns, but does not remove, an entry with smallest key
- size()
- isEmpty()

52. Compare the performance of priority queues using sorted and unsorted lists.
Unsorted list performance:
- insert takes O(1) time (we can insert the item at the beginning or end of the list)
- deleteMin and min take O(n) time (we have to scan the entire list to find the smallest key)
Sorted list performance:
- insert takes O(n) time (we have to find the place where to insert the item)
- deleteMin and min take O(1) time (the item is at the beginning of the list)

53. What is a partially ordered tree?
Partially ordered tree:
- Binary tree.
- At the lowest level, where some leaves may be missing, we require that all missing leaves are to the right of all leaves that are present on the lowest level.
- The tree is partially ordered: the priority of node v is no greater than the priority of the children of v.
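A partially ordered tree is usually stored in an array, level by level, so that the children of the node at index i sit at indices 2i+1 and 2i+2. The sketch below shows insert and deletemin under that assumption. The function names are illustrative, and the sample keys are arbitrary (they are not the tree of Fig. 2, which is not reproduced here).

```python
def pot_insert(heap, key):
    """Add key as the next leaf, then bubble it up while it beats its parent."""
    heap.append(key)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def pot_deletemin(heap):
    """Remove and return the root; the last leaf replaces it and sinks down."""
    smallest = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while True:
            left, right, child = 2 * i + 1, 2 * i + 2, i
            if left < n and heap[left] < heap[child]:
                child = left
            if right < n and heap[right] < heap[child]:
                child = right
            if child == i:
                break
            heap[i], heap[child] = heap[child], heap[i]
            i = child
    return smallest


heap = []
for key in [32, 11, 22, 15, 17, 2, -3]:
    pot_insert(heap, key)
drained = [pot_deletemin(heap) for _ in range(7)]
print(drained)
```

Both operations follow a single root-to-leaf path, so each costs O(log n), compared with the O(n) that one of the two operations costs in either list implementation.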

54. Show the result of inserting the value 14 in the POT of Fig. 2.



Fig. 2 A partially ordered tree.

55. Explain the notion "heap".

A binary tree is complete of height h iff:
- it is empty, or
- its left subtree is complete of height h-1 and its right subtree is completely full of height h-2, or
- its left subtree is completely full of height h-1 and its right subtree is complete of height h-1.
A complete tree is filled from the left:
- all the leaves are either on the same level or on two adjacent ones, and
- all nodes at the lowest level are as far to the left as possible.
Heaps are based on the notion of a complete tree.

A binary tree has the heap property if and only if:
- it is empty, or
- the key in the root is larger than that in either child and both subtrees have the heap property.
A heap can be used as a priority queue: the highest priority item is at the root.
Value of the heap structure: we can both extract the highest priority item and insert a new one in O(log n) time.

56. What is an AVL tree?
Problem with BSTs: worst case operation may take O(n) time. One solution: the AVL tree, a binary search tree with a balance condition:
For every node in an AVL tree T, the heights of the left (TL) and right (TR) subtrees can differ by at most 1: |hL - hR| <= 1



Fig. 3. An AVL tree.

59. Describe the left-right double rotation in an AVL tree.
Left-right: k1 < k2 < k3
- a left rotation around the left child of a node, followed by a right rotation around the node itself
- rotate to make k2 the topmost node
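The double rotation can be written as two single rotations. In this sketch k3 is assumed to be the unbalanced node, k1 its left child and k2 the right child of k1; the names `AVLNode`, `rotate_left`, `rotate_right` and `rotate_left_right` are illustrative, and the height bookkeeping a full AVL implementation needs is left out.

```python
class AVLNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(node):
    """The right child becomes the root of this subtree."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


def rotate_right(node):
    """The left child becomes the root of this subtree."""
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    return pivot


def rotate_left_right(k3):
    """Left rotation around k3's left child, then right rotation around k3."""
    k3.left = rotate_left(k3.left)
    return rotate_right(k3)


# k1 = 10, k2 = 20, k3 = 30; 15 and 25 stand for the subtrees hanging below k2.
k2 = AVLNode(20, AVLNode(15), AVLNode(25))
k1 = AVLNode(10, None, k2)
k3 = AVLNode(30, k1, None)
top = rotate_left_right(k3)
print(top.key, top.left.key, top.right.key)
```

After the rotation k2 is on top, k1 and k3 are its children, and the two subtrees of k2 have been handed to k1 (as right subtree) and k3 (as left subtree), which keeps the search order intact.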

60. Draw the AVL tree resulting after deleting node 35 from the tree of Fig. 4.

Fig. 4. Another AVL tree.

61. What can you say about the running time for AVL tree operations?

- a single restructure is O(1), using a linked-structure binary tree
- find is O(log n): the height of the tree is O(log n), no restructures needed
- insert is O(log n): the initial find is O(log n); restructuring up the tree, maintaining heights, is O(log n)
- remove is O(log n): the initial find is O(log n); restructuring up the tree, maintaining heights, is O(log n)

62. What is a 2-3 tree?
2-3 tree properties:
- Each interior node has two or three children.
- Each path from the root to a leaf has the same length.
- A tree with zero or one node(s) is a special case of a 2-3 tree.

63. Show the 2-3 tree which results after inserting the key 13 in the tree of Fig. 5.

Fig. 5. A 2-3 tree.

64. Show the 2-3 tree which results after deleting the key 8 in the tree of Fig. 5

65. What is a 2-3-4 tree?
The 2, 3 and 4 in the name refer to how many links to child nodes can potentially be contained in a given node. For non-leaf nodes, there are three arrangements:
- A node with one data item always has two children.
- A node with two data items always has three children.
- A node with three data items always has four children.
In short, a non-leaf node must always have one more child than it has data items. Empty nodes are not allowed.

66. What were the disjoint sets with union and find designed for?
Applicable to problems where we:
- start with a collection of objects, each in a set by itself;
- combine sets in some order, and from time to time ask which set a particular object is in.
Equivalence classes:
- If set S has an equivalence relation (reflexive, symmetrical, transitive) defined on it, then the set S can be partitioned into disjoint subsets S1, S2, ..., Sk whose union is S.
Equivalence problem:
- given a set S and a sequence of statements of the form a ≡ b,
- process the statements in order in such a way that at any time we are able to determine to which equivalence class a given element belongs.

67. Define the operations of the union-find set ADT.
Operations:
- union(A, B) takes the union of the components A and B and calls the result either A or B, arbitrarily.
- find(x), a function that returns the name of the component of which x is a member.
- initial(A, x) creates a component named A that contains only the element x.

68. Draw a sketch showing a lists implementation for the union-find set ADT with sets: 1: {1, 4, 7}; 2: {2, 3, 6, 9}; 8: {8, 11, 10, 12}.



69. Draw a sketch showing a tree forest implementation for the union-find set ADT with sets: 1: {1, 4, 7}; 2: {2, 3, 6, 9}; 8: {8, 11, 10, 12}.

70. How can one speed up union-find ADT operations?
Union by size (rank):
- When performing a union, make the root of the smaller tree point to the root of the larger.
- Implies O(n log n) time for performing n union-find operations: each time we follow a pointer, we are going to a subtree of size at least double the size of the previous subtree. Thus, we will follow at most O(log n) pointers for any find.
Path compression:
- After performing a find, compress all the pointers on the path just traversed so that they all point to the root.
- Implies O(n log* n) time for performing n union-find operations.
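Both speed-ups fit into a small tree-forest implementation. The sketch below is one common way to write it, assuming elements are hashable and that a set is named by whichever element ends up as its root; the class name `UnionFind` and its attributes are illustrative.

```python
class UnionFind:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}  # every element starts as its own root
        self.size = {x: 1 for x in elements}

    def find(self, x):
        """Return the root of x's tree, compressing the path just traversed."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:           # second pass: point everything at the root
            next_up = self.parent[x]
            self.parent[x] = root
            x = next_up
        return root

    def union(self, a, b):
        """Union by size: the root of the smaller tree points to the larger one."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return root_a
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        return root_a


sets = UnionFind(range(1, 13))
for a, b in [(1, 4), (4, 7), (2, 3), (3, 6), (6, 9), (8, 11), (11, 10), (10, 12)]:
    sets.union(a, b)
print(sets.find(7) == sets.find(1), sets.find(9) == sets.find(8))
```

The unions above build the three sets used in questions 68 and 69; which element serves as the name of each set depends on the order of the unions, so the roots need not be 1, 2 and 8.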

71. Give an adjacency matrix representation for the graph of Fig. 6.

Fig. 6. A graph.

72. Give an adjacency list representation for the graph of Fig. 6.

73. How do adjacency list/matrix representations for graphs compare and when should one or another be used?
It is faster to work with adjacency matrices (testing whether an edge exists takes O(1) time), but they use more space than adjacency lists (n^2 entries for n vertices, against space proportional to n + e for e edges), so you will choose one or the other depending on which resource is more important to you. Adj list useful when e



Fig. 7. Another graph.
1, 4, 7, 6, 5, 8, 10, 11, 13, 15, 12, 14, 3, 2

76. Apply Kruskal's algorithm to the graph of Fig. 8.

Fig. 8. Yet another graph.
(4,6) (4,7) (7,8) (9,12) (12,13) (2,3) (7,5) (4,7) (8,10) (8,11) (4,1)

77. Apply the Bellman-Ford algorithm to the graph of Fig. 9.

Fig. 9. A digraph.
a, b, c, e, d

78. Apply Floyd's algorithm to the graph of Fig. 10.



Fig. 10. Another graph.

79. Use breadth-first search to determine the articulation points of the graph of Fig. 10.

80. Use depth-first search to determine the articulation points of the graph of Fig. 10.

81. Determine the transitive closure of the graph of Fig. 10.

82. Apply topological sort to the dag of Fig. 11.

Fig. 11. A Directed Acyclic Graph.
0, 9, 6, 1, 2, 3, 4, 5, 7, 8

83. Develop a short algorithm for finding cycles in a graph.

84. Consider the following MAXMIN algorithm. How many comparisons does it use? Is it likely to be faster or slower than the divide-and-conquer algorithm in practice?

procedure maxmin2(S)
comment computes maximum and minimum of S[1..n] in max and min resp.
if n is odd then max := S[n]; min := S[n]
else max := -inf; min := +inf
for i := 1 to floor(n/2) do
    if S[2i-1] <= S[2i]
    then small := S[2i-1]; large := S[2i]
    else small := S[2i]; large := S[2i-1]
    if small < min then min := small
    if large > max then max := large

85. A sequence of numbers <a1, a2, a3, ..., an> is oscillating if ai < ai+1 for every odd index i and ai > ai+1 for every even index i. Describe and analyze an efficient algorithm to compute the longest oscillating subsequence in a sequence of n integers.

86. Find the following spanning trees for the weighted graph shown in Figure 12, below.



(a) A breadth-first spanning tree rooted at s.
(b) A depth-first spanning tree rooted at s.
(c) A shortest-path tree rooted at s.
(d) A minimum spanning tree.
You do not need to justify your answers; just clearly indicate the edges of each spanning tree. Yes, one of the edges has negative weight.

Figure 12. A weighted graph.

87. Describe and analyze an algorithm to compute the size of the largest connected component of black pixels in an n x n bitmap B[1..n; 1..n]. For example, given the bitmap in Figure 13 (below) as input, your algorithm should return the number 9, because the largest connected black component (marked with white dots on the right) contains nine pixels.

Figure 13. Bitmap example.

88. Describe and analyze an algorithm that determines whether a given graph is a tree, where the graph is represented by an adjacency list.

89. Solve the recurrence T(n) = 5T(n/17) + O(n^(4/3)).

90. Solve the recurrence T(n) = 1/n + T(n-1), where T(0) = 0.

91. Suppose you are given an array of n numbers, sorted in increasing order.
(a) Describe an O(n)-time algorithm for the following problem: Find two numbers from the list that add up to zero, or report that there is no such pair. In other words, find two numbers a and b such that a + b = 0.
(b) Describe an O(n^2)-time algorithm for the following problem: Find three numbers from the list that add up to zero, or report that there is no such triple. In other words, find three numbers a, b, and c, such that a + b + c = 0. [Hint: Use something similar to part (a) as a subroutine.]

92. Sketch the selection sort algorithm and show how it works on the array containing the keys 82, 31, -13, 45, 99, -1, -7, 22.

void selection(ITEM[] a, int l, int r)
{
    for (int i = l; i < r; i++)
    {
        int min = i;
        for (int j = i+1; j



For each i from l to r-1, exchange a[i] with the minimum element in a[i], ..., a[r]. As the index i travels from left to right, the elements to its left are in their final position in the array (and will not be touched again), so the array is fully sorted when i reaches the right end.
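The description above translates directly into a short program. The sketch below is a Python rendering written for this review, not the course's listing; it sorts a copy of the input and prints the array after each exchange so that the passes on the keys of question 92 can be followed.

```python
def selection_sort(keys):
    """Selection sort on a copy of keys, printing the array after every pass."""
    a = list(keys)
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):   # find the minimum of a[i..]
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
        print(a)
    return a


result = selection_sort([82, 31, -13, 45, 99, -1, -7, 22])
```

The inner loop always scans the whole unsorted part, so the algorithm makes about n^2/2 comparisons whatever the input looks like, but only n-1 exchanges.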

93. Sketch the insertion sort algorithm and show how it works on the array containing the keys 82, 31, -13, 45, 99, -1, -7, 22.

void insertion(ITEM[] a, int l, int r)
{
    int i;
    for (i = r; i > l; i--)
        compExch(a, i-1, i);
    for (i = l+2; i



Quicksort(A,p,q-1)
Quicksort(A,q+1,r)

PARTITION(A,p,r)
x



Can elements have the same keys? If so, do we require a stable sort?
- O(n^2) algorithms tend to be stable, O(n log n) in-place algorithms not.
- However, we can make any unstable algorithm stable by adding a key with the position of the elements in the original array. This costs extra space and extra time for the comparisons.
- If we decided anyway to sort the sequence of pointers rather than the elements, we can use the position of the elements in the unsorted sequence in comparisons. In this case, no additional space is required.
Do we require guarantees on the sorting time, e.g. in a hard real-time environment (e.g. in control systems, networks)?
- This rules out Quicksort because of its Theta(n^2) worst case behavior.
Do we have a limited amount of space available, like in an embedded processor?
- This rules out MergeSort since it requires in the order of n extra space, and it makes Quicksort questionable since it also requires in the order of n extra space in the worst case. However, we can improve Quicksort to require only in the order of lg n extra space.
Can the sequence be so large that it does not completely fit into main memory and virtual memory is going to be used?
- If so, sorting algorithms with good local behavior are to be preferred.
- If we are at element A[i] in the Heapify procedure of HeapSort, then the next element accessed will be A[2i] or A[2i+1], and so forth, so elements are accessed all over the array in quick succession.
- The Partition procedure of Quicksort accesses A[i], then A[i+1], etc., as well as A[j], then A[j-1], etc., so it has good local behavior.
- Most O(n^2) algorithms have good local behavior.
Is the input so big that it cannot fit into main memory and is too big for virtual memory?
- Then we have to use external sorting algorithms anyway.
Do we know more about the input which we can exploit for sorting in Theta(n)?
- If the keys are in a small range of integers (e.g. the age of a person, year of printing), we can use CountingSort.
- If each key is a sequence of keys which can be compared on their own, we can use RadixSort.
- If the keys are real numbers over an interval and are distributed evenly, we can use BucketSort.

101. What strategies can be involved for selecting a new E-vertex with branch and bound?

A: General strategy: generate all children of the current E-vertex before selecting a new E-vertex.
Strategies for selecting a new E-vertex:
- LIFO order: depth first, using a stack.
- FIFO order: breadth first, using a queue.
- Best-first order: use a priority queue.
In each case, a bounding function is also used to kill vertices.

102. Compare the backtracking and branch and bound search strategies.
Backtracking:
- easy to implement
- little memory required
- slow to run
Branch & Bound:
- difficult to implement
- large amounts of memory
- maybe faster than backtracking

103. Describe the local search strategy.
Sometimes an optimal solution may be obtained as follows:
- Start with a random solution.
- Apply to the current solution a transformation from some given set of transformations to improve the solution. The improvement becomes the new "current" solution.
- Repeat until no transformation in the set improves the current solution.
Note: the method makes sense if we can restrict our set of transformations to a small set, so we can consider all transformations in a short time (e.g. for a problem of size n, O(n^2) to O(n^3) transformations). Transformations are called local transformations, and the method is called local search.

104. Apply the local search strategy to the problem of finding the MST of the graph shown in Figure 14, below:



Fig. 14. Another weighted graph.

105. Describe the Divide and Conquer method for algorithm design.
Divide and conquer method for algorithm design:
- Divide: If the input size is too large to deal with in a straightforward manner, divide the problem into two or more disjoint subproblems.
- Conquer: Use divide and conquer recursively to solve the subproblems.
- Combine: Take the solutions to the subproblems and merge these solutions into a solution for the original problem.

106. Describe the steps taken when applying the dynamic programming strategy when developing an algorithm.
Development of a dynamic programming algorithm involves four steps:
- Characterize the structure of an optimal solution.
- Recursively define the value of an optimal solution.
- Compute the value of an optimal solution in a bottom-up fashion.
- Construct the optimal solution.

107. Suppose we have 2^20 128-bit elements that we would like to sort. What would be the efficiency of sorting these using quicksort? What would be the efficiency of sorting these as radix-2^16 numbers using radix sort? Which approach would be better? Suppose we have 2^10 128-bit elements rather than 2^20 elements. How do quicksort and radix sort compare in this case?
A: Sorting with quicksort requires O(n lg n) = (2^20)(20) = (2.10)(10^7) times some constant amount of time. Considering the elements as radix-2^16 numbers, the number of digit positions, p, is 8, and the number of possible digit values, k, is 2^16. Therefore, sorting with radix sort requires O(pn + pk) = (8)(2^20) + (8)(2^16) = (8.91)(10^6) times some constant amount of time. If the space requirements of radix sort are acceptable, radix sort is more than twice as efficient as quicksort. In the second case, sorting with quicksort requires O(n lg n) = (2^10)(10) = 10,240 times some constant amount of time. Radix sort requires O(pn + pk) = (8)(2^10) + (8)(2^16) = 532,480 times some constant amount of time, or about 50 times as much time as quicksort! Here is an example of why k is typically chosen to be close to and no more than n. Had we used a radix of 2^8, radix sort would have required O(pn + pk) = (16)(2^10) + (16)(2^8) = 20,480 times some constant amount of time, only about twice as much as quicksort. It is also worth noting that the space requirement of radix sort may negate small benefits in time in many cases.
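The operation counts in question 107 are easy to recompute. The sketch below uses the same cost model as the answer (n lg n for quicksort, pn + pk for radix sort, constants ignored); the function names `quicksort_cost` and `radix_cost` are made up for this illustration.

```python
from math import log2


def quicksort_cost(n: int) -> float:
    """n lg n, ignoring the constant factor."""
    return n * log2(n)


def radix_cost(n: int, key_bits: int, radix_bits: int) -> int:
    """pn + pk with p digit positions and k possible digit values."""
    p = key_bits // radix_bits
    k = 2 ** radix_bits
    return p * n + p * k


for n in (2 ** 20, 2 ** 10):
    print(n, quicksort_cost(n), radix_cost(n, 128, 16), radix_cost(n, 128, 8))
```

The pk term does not shrink with n, which is why a radix much larger than the number of elements makes radix sort uncompetitive on small inputs.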

108. In a sorted set, the successor of some node x is the next largest node after x. For example, in a sorted set containing the keys 24, 39, 41, 55, 87, 92, the successor of 41 is 55. How do we find the successor of an element x using binary search? What is the runtime complexity of this operation?
A: In a sorted set, to determine the successor of some element x using binary search, first we locate x. Next, we simply move one element to the right. The runtime complexity of locating either x or its successor is O(lg n).

109. Suppose we model an internet using a graph and we determine that the graph contains an articulation point. What are the implications of this?
A: Graphs have many important uses in network problems. If in a graph modeling an internet we determine that there is an articulation point, the articulation point represents a single point of failure. Thus, if a system residing at an articulation point goes down, other systems are forced into different connected components and as a result will no longer be able to communicate with each other. Therefore, in designing large networks in which connectivity is required at all times, it is important that there be no articulation points. We can curb this problem by placing redundancies in the network.
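
For question 108, the successor lookup can be sketched with Python's standard bisect module, assuming the sorted set is held in a sorted list. bisect_right performs the binary search and returns the position just to the right of x, which combines the "locate x" and "move one element to the right" steps; the function name successor is illustrative.

```python
from bisect import bisect_right

def successor(sorted_keys, x):
    """Return the smallest key greater than x, or None if there is none."""
    # Binary search for the position just past x: O(lg n).
    i = bisect_right(sorted_keys, x)
    return sorted_keys[i] if i < len(sorted_keys) else None
```

With the keys from the question, successor([24, 39, 41, 55, 87, 92], 41) returns 55.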



110. Consider a graph that models a structure of airways, highways in the sky on which airplanes are often required to fly. The structure consists of two types of elements: navigational facilities, called navaids for short, and airways that connect navaids, which are typically within a hundred miles of each other. Airways may be bidirectional or one-way. At certain times some airways are not available for use. Suppose during one of these times we would like to determine whether we can still reach a particular destination. How can we determine this? What is the runtime complexity of solving this problem?
A: If we perform breadth-first search from our starting point in the airway structure, we can reach any destination if we discover it during the search. Otherwise, the destination must reside in a component of the graph that became unreachable when an airway was made unavailable. The closed airway constitutes a bridge in the graph. This problem can be solved in O(V + E) time, where V is the number of navaids and E is the number of airways in the structure. This is the runtime complexity of breadth-first search.
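
The reachability check can be sketched as a breadth-first search over adjacency lists. The representation is an assumption made for illustration: airways is a dict mapping each navaid to the navaids reachable over a currently open airway, so a one-way airway appears in one direction only and a closed airway is simply left out.

```python
from collections import deque

def reachable(airways, start, destination):
    """Return True if destination can be reached from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        navaid = queue.popleft()
        if navaid == destination:
            return True
        # Visit every navaid one open airway away that is not yet discovered.
        for neighbor in airways.get(navaid, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False
```

Each navaid is enqueued at most once and each airway is examined at most once per direction, which gives the O(V + E) bound.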

111. Suppose we would like to use a computer to model states in a system. For example, imagine the various states of a traffic-light system at an intersection and the decisions the system has to make. How can we use a graph to model this?
A: Directed graphs are good for modeling state machines, such as the traffic-light system mentioned here. In a directed graph, we let vertices represent the various states, and edges represent the decisions made to get from one state to another. Edges in the graph are directed because a decision made to get from one state to the next does not imply that the decision can be reversed.
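
A small sketch of such a state machine follows. The three states and the single "timer" decision are hypothetical simplifications of a real traffic-light controller, chosen only to show vertices as states and labeled directed edges as decisions.

```python
# Vertices are states; each directed edge is labeled with the decision
# (here a hypothetical "timer" event) that moves the system along it.
TRANSITIONS = {
    "green":  {"timer": "yellow"},
    "yellow": {"timer": "red"},
    "red":    {"timer": "green"},
}

def step(state, decision):
    """Follow the edge labeled decision out of state."""
    return TRANSITIONS[state][decision]
```

There is an edge from green to yellow but none from yellow back to green, which is why the graph has to be directed.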

112. The transpose of a directed graph is a graph with the direction of its edges reversed. Formally, for a directed graph G = (V, E), its transpose is indicated as G^T. How could we form the transpose of a graph assuming an adjacency-list representation? What is the runtime complexity of this?
A: To form the transpose G^T of a graph G = (V, E), we traverse the adjacency list of each vertex u in V. As we traverse each list, we make sure that vertex v and u have both been inserted into G^T by calling graph_ins_vertex for each vertex. Next, we call graph_ins_edge to insert an edge from v to u into G^T. Each call to graph_ins_vertex runs in O(V) time. This operation is called 2E times, where E is the number of edges in G. Of course, some of these calls will not actually insert the vertex if it was inserted previously. Each call to graph_ins_edge runs in O(V) time. This operation is called once for each edge in G as well. Thus, using this approach, the overall time to transpose a graph is O(VE).
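
The same traversal can be sketched in Python, assuming the graph is a dict that maps each vertex to its adjacency list. This is not the graph_ins_vertex / graph_ins_edge interface described in the answer: dict lookups take expected constant time, so this version runs in O(V + E), whereas the O(VE) bound above follows from insert operations that each cost O(V).

```python
def transpose(graph):
    """Return the transpose of a directed graph given as adjacency lists."""
    # Make sure every vertex appears in the result, even with no incoming edges.
    transposed = {u: [] for u in graph}
    for u, neighbors in graph.items():
        for v in neighbors:
            # Edge u -> v in G becomes v -> u in the transpose.
            transposed.setdefault(v, []).append(u)
    return transposed
```

Transposing the result again yields the original edge set.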

113. The following recursive definition has an error. What is it, and how can we fix it? For a positive integer n, the definition, in its proper form, is common in formally computing the running time of divide-and-conquer algorithms, such as merge sort. Merge sort divides a set of data in half, then divides the halves in half, and continues this way until each division contains a single element. Then, during the unwinding phase, the divisions are merged to produce a final sorted set.

A: The problem with this definition is that it never reaches the terminating condition, n = 0, for any initial value of n greater than 0. To fix the problem, it needs an obtainable terminating condition. The condition n = 1 works well, which means we should also change the second condition in the function. A recursive definition with an acceptable terminating condition is presented here:

This happens to be the correct definition for the running time of merge sort. Such a function is called a recurrence. In more formal analysis, recurrences are used frequently to describe the running times of recursive algorithms.
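
The definitions themselves are not reproduced in the text of this question. As a hedged reconstruction only, assuming the usual merge sort recurrence with a unit base cost and a linear merging cost (both are assumptions, not taken from the source), the corrected definition would have this shape:

```latex
T(n) =
\begin{cases}
  1             & \text{if } n = 1 \\
  2\,T(n/2) + n & \text{if } n > 1
\end{cases}
```

With n a power of 2, repeated halving reaches n = 1 after lg n steps, and the recurrence solves to T(n) = n lg n + n, which is O(n lg n).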

114. Analyze the asymptotic time complexity T(n) for the following two divide-and-conquer algorithms. You may assume that n is a power of 2.

int foo(int[] A) {
    int n = A.length;
    if (n == 1) {
        return A[0];
    }
    int half = n / 2;
    int[] A1 = new int[half];
    int[] A2 = new int[n - half];
    for (int i=0; i

max: max = x[2*i]

If n is even, the loop starts at i=2; if n is odd, at i=1. This results in (3(n-2)/2)+1 comparisons if n is even or 3(n-1)/2 if n is odd.

117. Suppose you have n integers in the range from 0 to n^3 - 1. Explain how to sort them in O(n) time.

118. Develop an algorithm to solve the basic Tower of Hanoi problem, i.e., 3 towers, N disks, and the given rules.
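
The comparison counts quoted before question 117, (3(n-2)/2)+1 for even n and 3(n-1)/2 for odd n, match the standard technique of finding the minimum and maximum by processing elements in pairs. The sketch below assumes that technique and counts comparisons explicitly; it uses 0-based element indices, which may differ from the indexing in the original answer, and the name min_max is illustrative.

```python
def min_max(x):
    """Return (minimum, maximum, comparisons) for a non-empty list x."""
    n = len(x)
    comparisons = 0
    if n % 2 == 0:
        # Even n: one comparison orders the first pair.
        comparisons += 1
        lo, hi = (x[0], x[1]) if x[0] < x[1] else (x[1], x[0])
        start = 2
    else:
        # Odd n: the first element is both minimum and maximum so far.
        lo = hi = x[0]
        start = 1
    # Process the remaining elements in pairs, three comparisons per pair.
    for i in range(start, n, 2):
        a, b = x[i], x[i + 1]
        comparisons += 3
        if a < b:
            small, large = a, b
        else:
            small, large = b, a
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

Comparing the smaller element of each pair only against the minimum and the larger only against the maximum is what saves work over the 2(n-1) comparisons of two separate scans.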

119. How could you make an algorithm for finding the longest path in a graph with only negative weights?

120. Suppose there are three alternatives for dividing a problem of size n into subproblems of smaller size: if you solve 3 subproblems of size n/2, then the cost for combining the solutions of the subproblems to obtain a solution for the original problem is Theta(n^2 sqrt(n)); if you solve 4 subproblems of size n/2, then the cost for combining the solutions is Theta(n^2); if you solve 5 subproblems of size n/2, then the cost for combining the solutions is Theta(n log n). Which alternative do you prefer and why?

121. A palindrome is a word w_1 w_2 ... w_k whose reverse w_k w_(k-1) ... w_1 is the same string, e.g. abbabba. Consider a string A = a_1 a_2 ... a_n. A partitioning of the string is a palindrome partitioning if every substring of the partition is a palindrome. For example, aba|b|bbabb|a|b|aba is a palindrome partitioning of ababbbabbababa. Design a dynamic programming algorithm to determine the coarsest (i.e. fewest cuts) palindrome partition of A.

122. Consider a directed graph G = (V,E) where each edge is labeled with a character from an alphabet Sigma, and we designate a special vertex s as the start vertex, and another f as the final vertex. We say that G accepts a string A = a_1 a_2 ... a_n if there is a path from s to f of n edges whose labels spell the sequence A. Design an O((|V| + |E|)n) dynamic programming algorithm to determine whether or not A is accepted by G.

123. Give the pseudocode for a greedy algorithm that solves the following optimization problem. Justify briefly why your algorithm finds the optimum solution. What is the asymptotic running time of your algorithm in terms of n?

124. There are n gas stations S_1, ..., S_n on E60 highway from Cluj-Napoca to Oradea. On a full tank of gas, your car goes for D miles. Gas station S_1 is in Cluj-Napoca, and each gas station S_i for 2



135. What is a multi-way search tree of order m?
A: Multiway Search Trees (MWSTs) are a generalization of BSTs.
MWST of order n:
Each node has n or fewer sub-trees: S_1, S_2, ..., S_m, m ≤ n
Each node has n - 1 or fewer keys
k_1, k_2, ..., k_(m-1): m - 1 keys in ascending order
k(S_i) ≤ k_i ≤ k(S_(i+1)), k(S_(m-1)) < k(S_m)
Suitable for disks:
Nodes correspond to disk pages
Pros:
tree height is low for large n, so fewer disk accesses
Cons:
low space utilization if non-full
MWSTs are non-balanced in general!
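
The ordering rule between keys and sub-trees is what makes searching work: inside a node, the position of the search key among k_1, ..., k_(m-1) selects the single sub-tree that can contain it. The sketch below is illustrative only; the Node class and search function are hypothetical names, keys are assumed distinct, and an empty sub-tree is represented by None.

```python
from bisect import bisect_left

class Node:
    """A multiway search tree node: m-1 sorted keys and m sub-trees."""
    def __init__(self, keys, children=None):
        self.keys = keys  # ascending order
        # children[i] holds the keys that fall before keys[i]; the last
        # child holds the keys after the final key. None means empty.
        self.children = children or [None] * (len(keys) + 1)

def search(node, key):
    """Return True if key is stored in the tree rooted at node."""
    while node is not None:
        # Binary search within the node for the first stored key >= key.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        # Otherwise only sub-tree i can contain the key.
        node = node.children[i]
    return False
```

Each loop iteration visits one node, so the number of nodes read (disk pages, in the setting above) is bounded by the tree height.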

136. Use the alpha/beta procedure to identify which parts of the tree can be pruned. Clearly identify alpha and beta pruning.



137. An architect has been commissioned to design a new building, called the Data Center. Gary wants his top architectural protégé to design a scale model of the Data Center using precision-cut sticks, but he wants to preclude the model from inadvertently containing any right angles. Gary fabricates a set of n sticks, labeled 1, 2, ..., n, where stick i has length x_i. Before giving the sticks to the protégé, he shows them to you and asks you whether it is possible to create a right triangle using any three of the sticks. Give an efficient algorithm for determining whether there exist three sticks a, b, and c such that the triangle formed from them having sides of lengths x_a, x_b, and x_c is a right triangle (that is, x_a^2 + x_b^2 = x_c^2).

138. Argue that any comparison based sorting algorithm can be made to be stable, without affecting the running time by more than a constant factor.

139. Define a linked list with a loop as a linked list in which the tail element points to one of the list's elements and not to NULL. Assume that you are given a linked list L, and two pointers P1, P2 to the head. Write an algorithm that decides whether the list has a loop without modifying the original list. The algorithm should run in time O(n) and additional memory O(1), where n is the number of elements in the list.

140. Use a min-heap to give an O(n log k)-time algorithm which merges k sorted lists into one sorted list, where n is the total number of elements in all the input lists.

141. Let G = (V,E) be an undirected graph, and s be a node in V. Edge (u, v) ∈ E is called a circular edge if the distance from s to u is identical to the distance from s to v. Give a linear algorithm that given s finds all circular edges in the graph. Explain what data structures you use, and provide complexity analysis.

142. Give (in pseudocode) an O(|V| + |E|) algorithm that takes as input a directed acyclic graph and two vertices s, t, and returns the number of paths from s to t in G. Your algorithm needs only to count the paths, not list them. Hint: Use topological sort as a part of your solution.

143. Give an algorithm to either output a message "No counterfeit coin", or identify which of three coins: A, B and C is counterfeit.

144. Construct a divide and conquer algorithm that divides a set S of coins into three equal subsets, and uses part (a) to solve small (i.e. 3-coin) sets.

145. Suppose you have analysed a file and determined that the frequency of occurrence of certain characters is as follows:

Character    a   b   c   d   e   f
Occurrences  15  7   5   8   30  10

a) Construct the Huffman tree for the characters
b) List the codes for each character
c) Use the tree to compress the following strings:
   i) faded
   ii) abed
   iii) feed
