Searching. Motivation Find parts for a system Find an address for name Find a criminal...
kerry-underwood -
Searching
2
Motivation
• Find parts for a system
• Find an address for a name
• Find a criminal
  – fingerprint/DNA match
• Locate all employees in a dept.
  – based on a collection of criteria
  – across multiple tables
• Find shortest path (network, roads, etc.)
3
Linear Search
• Items to be searched are in a list x0, x1, …, xn-1
  – Need == and < operators defined for the type
• Start with item 0
  – until (end of list or target found)
  – compare another item
• Best case, found on 1st comparison
• Worst case, found on nth comparison
4
Linear Search – vector/array
#include <cstddef>
#include <vector>

// Returns 1 if item is found in v, 0 otherwise.
int LinearSearch (const std::vector<int> &v, const int &item)
{
  for (std::size_t i = 0; i < v.size(); i++)
    if (item == v[i]) return 1;   // found
  return 0;                       // not found
}
// # of compares summed over all n possible targets: 1+2+3+…+n = ½*n*(n+1)
// average = ½*n*(n+1)/n = (n+1)/2 ≈ n/2
// average search time is O(n/2) = O(n)
5
Linear Search – single-linked list
// NodePointer is assumed to point to a node with data and next members.
int LinearSearch (NodePointer first, const int &item)
{
  for (NodePointer loc = first; loc != NULL; loc = loc->next)
  {
    if (item == loc->data) return 1;   // found
  }
  return 0;                            // not found
}
Worst-case computing time is still O(n)
6
Binary search
• Significantly faster than linear
• Repeatedly "halving" the problem
• We can divide a set of n items in half at most log₂ n times
• For performance COMPARISONS, we ignore the base for the log (2 in this case)
• Complexity of binary search is O(log n)
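The halving idea can be sketched as follows. This is a minimal illustration, not code from the slides: the name BinarySearch and the choice to return an index (or -1) are assumptions, and the vector must already be sorted in ascending order.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of item in v, or -1 if it is absent.
// Assumes v is sorted in ascending order.
int BinarySearch(const std::vector<int> &v, int item)
{
    std::size_t low = 0;          // first index still in play
    std::size_t high = v.size();  // one past the last index still in play
    while (low < high) {
        std::size_t mid = low + (high - low) / 2;  // middle of the remaining range
        if (v[mid] == item)
            return static_cast<int>(mid);
        else if (v[mid] < item)
            low = mid + 1;   // discard the left half
        else
            high = mid;      // discard the right half
    }
    return -1;  // not found
}
```

Each pass through the loop discards half of the remaining range, so a vector of n items needs at most about log₂ n + 1 passes.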
7
Some Observations
• Binary search usually outperforms linear search
• Both require sequential storage
• Data must be ordered (sorted) for binary search
• Searching is done on a "key"
  – a piece of data unique to each item
  – often smaller than the actual data
  – e.g., your B-number vs. your whole name
• A non-linear linked structure is better
  – there are several kinds of "tree" structures
8
Big Oh - Formal Definition (again)
• f(n) = O(g(n)) iff ∃ c > 0 and n0 such that f(n) <= c·g(n) for all n >= n0
• Thus, g(n) is an upper bound on f(n) (up to a constant factor, for large enough n)
• Note: f(n) = O(g(n)) is read "f(n) has a complexity of g(n)"; this is NOT the same as O(g(n)) = f(n)
• The '=' is not the usual "=" operator (it is not symmetric)
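A quick worked check of the definition may help. The function 3n + 2 is chosen here purely for illustration; it does not appear in the slides.

```latex
f(n) = 3n + 2 \le 3n + n = 4n \quad \text{for all } n \ge 2
\;\Longrightarrow\; f(n) = O(n) \text{ with } c = 4,\ n_0 = 2
```

Other constants work too (for example c = 5, n0 = 1); the definition only asks that some pair exists.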
9
Trees
• A data structure which consists of
  – a finite set of elements called nodes or vertices
  – a finite set of directed arcs which connect the nodes
• If the tree is nonempty
  – one of the nodes (the root) has no incoming arc
  – every other node can be reached from the root by following a unique sequence of consecutive arcs
10
Trees
• Each node has n >= 0 children
• Topmost node is the "root"
• Binary tree: each node has 0, 1, or 2 children
• Engineering uses:
  – Huffman encoding (data compression)
  – expression evaluation
  – sorting & searching
  – electric power distribution grid
  – go/no-go decision-making
11
Binary Search Tree
• Consider an ordered list of integers

  5  22  45  52  63  75  90

1. Examine the middle element (52)
2. Examine the left and right sublists (maintain pointers)
3. (Recursively) examine the left and right sublists
12
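Binary Search Tree
• Redraw as a treelike shape – this is a binary tree

            52              <- root
          /    \
        22      75          <- 75 is the parent of 63, 90
       /  \    /  \
      5   45  63   90       <- leaves; 63 and 90 are the children of 75

• Any node together with its descendants (e.g. 75, 63, 90) forms a subtree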
13
Binary Search Tree (BST)
• A binary tree in which, for every node
  – values in the left subtree <= node value <= values in the right subtree
• Several tree operations available
  – construction
  – test for empty
  – search for item
  – insert
  – delete
  – traverse (visit each node exactly once)
14
Binary Tree terms
• Full tree (proper tree, 2-tree, strictly binary)
  – all nodes have exactly 0 or 2 children
• Complete tree
  – all levels (except maybe the last) are filled
  – all leaves are "pushed" left
• Balanced tree
  – heights of the L/R sub-trees of EVERY node differ by no more than 1 level
• Perfect tree
  – a full tree with all leaves at the same depth
15
Implementations
• An array can be used
  – insertion, deletion, re-arranging VERY messy
  – searching, sorting inefficient
  – not useful for "sparse" trees (missing data)
  – very hard to traverse recursively
• Linked tree
  – nodes like those in Stacks, Queues & Lists
    • pointer to left-child
    • pointer to right-child
    • data
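A linked tree node with those three fields might look like the sketch below. TreeNode, Insert, and Destroy are hypothetical names chosen for illustration, and the rule that equal values go left is one possible convention, not something the slides specify.

```cpp
// Hypothetical node type for a linked binary search tree of ints.
struct TreeNode {
    int data;
    TreeNode *left;   // pointer to left-child
    TreeNode *right;  // pointer to right-child
};

// Inserts item into the tree rooted at root and returns the (possibly new) root.
// Values <= a node's data go left; larger values go right.
TreeNode *Insert(TreeNode *root, int item)
{
    if (root == nullptr)
        return new TreeNode{item, nullptr, nullptr};
    if (item <= root->data)
        root->left = Insert(root->left, item);
    else
        root->right = Insert(root->right, item);
    return root;
}

// Frees every node (children before their parent).
void Destroy(TreeNode *root)
{
    if (root == nullptr) return;
    Destroy(root->left);
    Destroy(root->right);
    delete root;
}
```

Starting from an empty tree (a null root) and inserting 52, 22, 75, 5, 45, 63, 90 in that order gives a tree with 52 at the root, 22 and 75 as its children, and 5, 45, 63, 90 as leaves.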
16
Recursive Descent
• A binary tree is either empty, or
• Has a data-node (root) with 2 subtree ptrs
  – left-tree
  – right-tree
  – the subtrees are disjoint
• Each sub-tree follows the same definition
• Leads to simple recursive search programs
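As an illustration of how the recursive definition turns into a recursive search, here is a minimal sketch for a binary search tree. BstNode and Search are hypothetical names, and the tree is assumed to keep smaller keys on the left and larger keys on the right.

```cpp
// Hypothetical node type: data plus two subtree pointers.
struct BstNode {
    int data;
    BstNode *left;
    BstNode *right;
};

// Returns true if item is in the binary search tree rooted at root.
bool Search(const BstNode *root, int item)
{
    if (root == nullptr)       // empty tree: nothing to find
        return false;
    if (item == root->data)    // found at the root of this (sub)tree
        return true;
    if (item < root->data)     // smaller keys can only be in the left subtree
        return Search(root->left, item);
    return Search(root->right, item);  // larger keys can only be in the right subtree
}
```

The two base cases match the definition (empty tree, or a root node), and each recursive call works on one subtree only, so a balanced tree of n nodes is searched in O(log n) steps.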
17
Recursive Tree Traversal
void Traverse (node* ptr)
{
  if (ptr == NULL)        // the binary tree is empty
    return;
  else                    // recursion here
  {
    // Process root data (ptr->data)
    Traverse (ptr->left);
    Traverse (ptr->right);
  }
}
18
3 possible traversals
• Pre-order
  – data, left-sub-tree, right-sub-tree
  – first-touch
• In-order
  – left-sub-tree, data, right-sub-tree
  – 2nd touch
• Post-order
  – left-sub-tree, right-sub-tree, data
  – last-touch
19
Traversal Order
• Given expression A – B * C + D
• Operator precedence (highest to lowest) is: ^, then * /, then + -
• This is normal infix order
• Each operand is the child of a parent node
• The parent node holds the corresponding operator
20
The expression tree
        +
       / \
      -   D
     / \
    A   *
       / \
      B   C
21
Remaining traversals
• Prefix: + - A * B C D
• Postfix (Reverse Polish Notation – RPN): A B C * - D +
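The three orders can be checked with a short sketch that walks an expression tree. ExprNode, PreOrder, InOrder, and PostOrder are hypothetical names for illustration; each function appends the visited characters to a string.

```cpp
#include <string>

// Hypothetical node type for an expression tree: one character per node.
struct ExprNode {
    char data;
    ExprNode *left;
    ExprNode *right;
};

// Pre-order: data, left subtree, right subtree (prefix notation).
void PreOrder(const ExprNode *p, std::string &out)
{
    if (p == nullptr) return;
    out += p->data;
    PreOrder(p->left, out);
    PreOrder(p->right, out);
}

// In-order: left subtree, data, right subtree (infix notation, no parentheses).
void InOrder(const ExprNode *p, std::string &out)
{
    if (p == nullptr) return;
    InOrder(p->left, out);
    out += p->data;
    InOrder(p->right, out);
}

// Post-order: left subtree, right subtree, data (postfix notation, RPN).
void PostOrder(const ExprNode *p, std::string &out)
{
    if (p == nullptr) return;
    PostOrder(p->left, out);
    PostOrder(p->right, out);
    out += p->data;
}
```

For the tree of A - B * C + D (root +, with - and D as its children), the three functions produce +-A*BCD, A-B*C+D, and ABC*-D+ respectively.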
Stack Applications
• base-ten to base-two conversion
  – remainders need to be printed in the reverse of the order in which they are calculated
• run-time stack of function activation records
  – push when a function is called
  – pop when a function exits
• arithmetic expression evaluation
  – easier to evaluate when stored in postfix (RPN)
• infix to postfix conversion algorithm uses a stack
• evaluating postfix is easy using a stack for operands
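The base-ten to base-two case is small enough to sketch in full. ToBinary is a hypothetical name; the sketch assumes a non-negative input and returns the digits as a string.

```cpp
#include <stack>
#include <string>

// Converts a non-negative integer to its base-two digits.
std::string ToBinary(unsigned int n)
{
    if (n == 0) return "0";
    std::stack<unsigned int> remainders;
    while (n > 0) {
        remainders.push(n % 2);  // least significant bit is computed first
        n /= 2;
    }
    std::string bits;
    while (!remainders.empty()) {  // popping yields the bits in reverse order
        bits += static_cast<char>('0' + remainders.top());
        remainders.pop();
    }
    return bits;
}
```

Repeated division produces the low-order bit first, so pushing each remainder and then popping them all prints the most significant bit first, which is the order needed.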
Infix to Postfix
infix expression: (3 + 4) * 5 - 2
postfix expression: 3 4 + 5 * 2 -

A. scan input from L to R
B. if operand, output it
C. else // must be an operator or "(" or ")"
  1. if "(" then push & loop
  2. if operator
    a. pop & output until prec(top) < prec(input) (or the stack is empty or "(" is on top)
    b. push the input
  3. if ")"
    a. pop & output until "("
    b. remove & discard the "("
    c. discard the incoming ")"
  4. if end of input, pop & output until empty

step      in     out    stack (top is on right)
C1        (             (
B         3      3      (
C2b       +             ( +
B         4      4      ( +
C3a,b,c   )      +
C2b       *             *
B         5      5      *
C2a,b     -      *      -
B         2      2      -
C4        null   -
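The steps above can be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not the slides' own code: Prec and InfixToPostfix are hypothetical names, operands are single letters or digits, only the left-associative operators + - * / are handled, and the input is assumed to be well formed.

```cpp
#include <cctype>
#include <stack>
#include <string>

// Precedence of an operator; higher binds tighter. "(" gets 0 so that an
// incoming operator never pops it.
int Prec(char op)
{
    switch (op) {
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        default:            return 0;
    }
}

// Converts an infix expression to postfix, tokens separated by single spaces.
std::string InfixToPostfix(const std::string &infix)
{
    std::stack<char> ops;
    std::string out;
    for (char ch : infix) {
        if (std::isspace(static_cast<unsigned char>(ch)))
            continue;
        if (std::isalnum(static_cast<unsigned char>(ch))) {  // B: operand
            out += ch; out += ' ';
        } else if (ch == '(') {                               // C1
            ops.push(ch);
        } else if (ch == ')') {                               // C3
            while (!ops.empty() && ops.top() != '(') {
                out += ops.top(); out += ' ';
                ops.pop();
            }
            if (!ops.empty()) ops.pop();                      // discard the "("
        } else {                                              // C2: operator
            while (!ops.empty() && Prec(ops.top()) >= Prec(ch)) {
                out += ops.top(); out += ' ';
                ops.pop();
            }
            ops.push(ch);
        }
    }
    while (!ops.empty()) {                                    // C4: end of input
        out += ops.top(); out += ' ';
        ops.pop();
    }
    if (!out.empty()) out.pop_back();                         // drop trailing space
    return out;
}
```

Calling InfixToPostfix("(3 + 4) * 5 - 2") follows the same sequence of pushes and pops as the trace table and yields 3 4 + 5 * 2 -.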
24
Evaluating Postfix
• use a stack to evaluate a postfix expression
• read values into stack until operator reached
• pop 2 values and apply operator
  – be sure to maintain order of operands
  – a-b is not the same as b-a
• push result value onto stack
• repeat until no input remains; the single value left on the stack is the result

postfix expression: 3 4 + 5 * 2 -
  – push 3 and 4
  – see the +, pop 3, 4, add, push 7
  – push 5
  – see the *, pop 7, 5, multiply, push 35
  – push the 2, then see the -, so pop 35, 2 and subtract, giving 33
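A minimal sketch of this evaluation loop is shown below. EvaluatePostfix is a hypothetical name, and the sketch assumes single-digit operands, the operators + - * /, and integer division.

```cpp
#include <cctype>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluates a postfix expression with single-digit operands and + - * /.
int EvaluatePostfix(const std::string &postfix)
{
    std::stack<int> values;
    for (char ch : postfix) {
        if (std::isspace(static_cast<unsigned char>(ch)))
            continue;
        if (std::isdigit(static_cast<unsigned char>(ch))) {
            values.push(ch - '0');              // operand: push its value
            continue;
        }
        if (values.size() < 2)
            throw std::invalid_argument("operator needs two operands");
        int right = values.top(); values.pop();  // popped first: right operand
        int left = values.top();  values.pop();  // popped second: left operand
        switch (ch) {
            case '+': values.push(left + right); break;
            case '-': values.push(left - right); break;
            case '*': values.push(left * right); break;
            case '/':
                if (right == 0) throw std::invalid_argument("division by zero");
                values.push(left / right);
                break;
            default:
                throw std::invalid_argument("unknown operator");
        }
    }
    if (values.size() != 1)
        throw std::invalid_argument("malformed expression");
    return values.top();
}
```

The first value popped is the right-hand operand, which is what keeps a-b from being evaluated as b-a. For 3 4 + 5 * 2 - the function returns 33.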