Solving the Graph-partitioning Problem with Heuristic Search
1
Solving the Graph-partitioning Problem with Heuristic Search
Ariel Felner
Bar-Ilan University
Ramat-Gan
ISRAEL
2
The Graph Partitioning Problem
Given a graph G(V,E), the problem is to partition the vertices into two equal-sized subsets.
The number of edges that cross the partition should be minimized.
The partition in the graph on the right has cost 2.
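A minimal sketch of the objective: the cost of a complete partition is the number of edges with one endpoint in each subset. The function name `cut_cost` and the small example graph are illustrative assumptions, not taken from the slides.

```python
def cut_cost(edges, side_a):
    """Count the edges that have exactly one endpoint in side_a."""
    side_a = set(side_a)
    return sum((u in side_a) != (v in side_a) for u, v in edges)


# Hypothetical example: the square 1-2-3-4-1 plus the diagonal 1-3.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
print(cut_cost(edges, {1, 2}))  # edges 2-3, 4-1 and 1-3 cross the partition
```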
3
Related Work for the GPP
The GPP is NP-complete.
Most algorithms for the GPP are designed to find suboptimal solutions and use local search techniques.
A large portion of them start with a feasible solution and then swap pairs of vertices between the two subsets.
The best known are KL (1970) and XLS (1991).
4
A Search Problem
A search space consists of states and operators, an initial state, and a set of goal states.
A solution: a path from the initial state to one of the goal states.
Optimal solution: a path of minimal cost.
Best-first search algorithm: keeps all generated nodes in an OPEN list and chooses the node with the best heuristic value (cost) for expansion.
5
Search Algorithms criteria
Solution quality: optimal, near optimal, or suboptimal.
Time complexity: number of generated nodes.
Constant time per node: time spent in each node.
6
Heuristic functions
Heuristic function: a function that gives each state an estimate of the real distance (cost) from that state to the goal.
A heuristic function is admissible if it never overestimates the real distance.
An admissible heuristic is always a lower bound on the real solution cost.
Example: air distance in road navigation.
A heuristic function should be as accurate as possible and as fast as possible to compute.
7
The A* algorithm
g(x): the real distance from the initial state to the current node x.
h(x): the estimated remaining distance from x to the goal state.
f(x)=g(x): Uniform Cost Search.
f(x)=g(x)+h(x): the A* algorithm (1968).
f(x) in A* is an estimate of the length of the shortest path to the goal via x.
Theorem: “Given a heuristic function, no other algorithm outperforms A*” (Pearl 83).
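A minimal sketch of best-first search ordered by f(x)=g(x)+h(x). The function name `a_star`, the `neighbors` callback and the toy graph are assumptions made for illustration; passing h(x)=0 turns the same code into Uniform Cost Search.

```python
import heapq


def a_star(start, goal, neighbors, h):
    """Return the cost of a cheapest path from start to goal, or None.

    neighbors(x) yields (y, edge_cost) pairs; h(x) is an admissible estimate.
    """
    open_list = [(h(start), 0, start)]        # entries are (f, g, state)
    best_g = {start: 0}
    while open_list:
        f, g, x = heapq.heappop(open_list)
        if x == goal:
            return g
        if g > best_g.get(x, float("inf")):
            continue                          # stale entry: a cheaper path to x is known
        for y, cost in neighbors(x):
            g_y = g + cost
            if g_y < best_g.get(y, float("inf")):
                best_g[y] = g_y
                heapq.heappush(open_list, (g_y + h(y), g_y, y))
    return None


# Hypothetical toy graph and an admissible heuristic table for it.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 2)], "G": []}
estimate = {"S": 4, "A": 3, "B": 2, "G": 0}
print(a_star("S", "G", lambda x: graph[x], lambda x: estimate[x]))
```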
8
Recent developments in Search
Most of the work in the past few years was on finding more accurate heuristic functions (Korf 96), (Schaefer 97), (Korf & Felner 2000).
A tradeoff: a more complicated heuristic costs more time per node but reduces the number of generated nodes.
Observation: many search problems can be divided into solving several subproblems or achieving several subgoals.
Example: in the 15-puzzle we have 15 subgoals.
In the GPP we have n subproblems of placing n vertices in one of the subsets of the partition.
[Figure: the 15-puzzle, tiles 1 to 15]
9
Our main hypothesis
Our claim: “Looking more deeply into interactions between unsolved subgoals results in a much better heuristic function and speeds up the search.”
10
The GPP as a search problem
[Figure: search tree of partial partitions over vertices 1, 2 and 3]
A subproblem in the GPP is to assign a vertex to one of the subsets of the partition.
Each level of the search tree corresponds to a specific vertex of the graph.
Each branch assigns the vertex to a different subset of the partition.
Each node of the tree is a partial partition including some of the vertices.
Size of the tree: 2^n.
Leaves of the tree are the complete partitions. One of them is the optimal one.
11
Definitions
A node of the search tree is denoted by k, while a vertex of the graph is denoted by x.
A vertex that is already assigned to one of the subsets is called an assigned vertex.
Each of the other vertices is a free vertex. Free vertices are unsolved subgoals.
Given a node k of the search tree we define:
g(k): the number of edges that already cross the partial partition due to assigned vertices.
h(k): a lower bound on the number of edges that will cross the given partition due to free vertices.
12
A heuristic from the free vertices
The free vertices have many edges connected to them.
Can we estimate how many of these edges must cross the partition?
[Figure: a partial partition with assigned vertices 1, 2 on one side and 3, 4 on the other, and the free vertices]
13
More definitions
The subsets of the partial partition are A and B.
Each of the following heuristics completes the partition with A’ and B’.
We can guess about A’ and B’.
Types of edges:
I: Edges within A and A’
II: Edges from A to B
III: Edges from A to B’
IV: Edges from A’ to B’
[Figure: A={1,2}, B={3,4}, A’={5,6}, B’={7,8}, with edges of types I to IV marked]
14
f0: Uniform Cost Search
f0(k) = g(k).
Edges that already cross the partition. Edges of type II.
Used mainly for comparison.
[Figure: assigned subsets A, B and free subsets A’, B’, with the type II edges marked]
15
f1: Adding edges of type III
For each free vertex x we define d(x,A) as the number of edges from x to A and d(x,B) as the number of edges from x to B.
An admissible heuristic for a vertex x is h1(x)=min{d(x,A),d(x,B)}.
h1(k) = the sum of h1(x) over all free vertices x.
f1(k)=g(k)+h1(k).
[Figure: a free vertex x with edges to A={1,2} and B={3,4}]
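A minimal sketch of g(k) and h1(k) for a partial partition. The helper names (`build_adjacency`, `g_cost`, `h1`) and the example edges are assumptions for illustration; only the definitions of g, d and h1 come from the slides.

```python
def build_adjacency(edges):
    """Map each vertex to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def g_cost(edges, A, B):
    """Edges between the assigned subsets A and B (type II)."""
    return sum((u in A and v in B) or (u in B and v in A) for u, v in edges)


def h1(adj, A, B):
    """Sum over the free vertices x of min{d(x,A), d(x,B)}."""
    free = set(adj) - A - B
    return sum(min(len(adj[x] & A), len(adj[x] & B)) for x in free)


# Hypothetical partial partition: A={1,2}, B={3,4}; vertices 5 and 6 are free.
edges = [(1, 3), (5, 1), (5, 2), (5, 3), (6, 3), (6, 4), (5, 6)]
A, B = {1, 2}, {3, 4}
adj = build_adjacency(edges)
f1 = g_cost(edges, A, B) + h1(adj, A, B)
print(f1)
```

Here vertex 5 has d(5,A)=2 and d(5,B)=1, and vertex 6 has d(6,A)=0 and d(6,B)=2, so h1 counts one edge for vertex 5 and none for vertex 6.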
16
f2: Sorting the free vertices
Assume that the cardinalities of A and B are p and q respectively.
n/2-p of the free vertices must go to A’.
n/2-q of the free vertices must go to B’.
NA(x)=d(x,A)-d(x,B).
NB(x)=d(x,B)-d(x,A). (NA(x)=-NB(x))
We now sort all the free vertices in decreasing order of NA(x). The first n/2-p go to A’. The rest go to B’.
17
vertex  d(x,A)  d(x,B)  NA(x)  NB(x)  subset
a       3       1        2     -2     A’
b       2       1        1     -1     A’
c       1       1        0      0     A’
d       1       2       -1      1     A’
e       1       3       -2      2     B’
f       2       5       -3      3     B’
f2(k)=g(k)+h2(k), where h2 takes d(x,B) if x is in A’ and d(x,A) if x is in B’.
h1 places d in B’ and takes d(d,A)=1, while h2 places d in A’ and takes d(d,B)=2.
h1 looks at each free vertex alone, while h2 looks at interactions between the vertices.
18
f3: Adding type IV edges to f2
The free graph
Nodes: free vertices.
Edges: edges between free vertices that were assigned to different subsets by f2. Edges of type IV.
The graph is bipartite.
We want to add to h as many such edges as possible without losing admissibility.
[Figure: the completed partition A+A’ = {1,2,5,6} and B+B’ = {3,4,7,8}]
19
Suppose that we want to move vertex x from A’ to B’.
NA(x)=d(x,A)-d(x,B) more edges will be added to the partition.
Another vertex y from B’ must be swapped with x, and thus NB(y) other edges will also be added.
We want the y with the smallest NB(y). We call it the swappable vertex of B’ and denote it by SB’.
SA’ is defined in the same manner.
[Figure: A={1,2}, B={3,4}, A’={5,6}, B’={7,8}, with SB’ marked in B’]
20
N(x) = NA(x)+NB(SB’) if x is in A’
N(x) = NB(x)+NA(SA’) if x is in B’
N(x) is a lower bound on the number of edges that will be added to f2 if we move a free vertex x from A’ to B’ or from B’ to A’.
Let x be a vertex in A’ with 3 edges of type IV.
We can count such edges only as long as their number does not exceed N(x), because beyond that it is better to swap x with SB’.
[Figure: vertex x in A+A’ with type IV edges to B+B’, and the swappable vertex SB’]
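A minimal sketch of the swap bound N(x), using the d(x,A) and d(x,B) values from the f2 table. Two points are assumptions: SA’ is taken, by symmetry with SB’, to be the vertex of A’ with the smallest NA, and the bound for x in B’ is computed as NB(x)+NA(SA’), the cost of moving x to A’ plus the cost of moving SA’ to B’. The function name `swap_bounds` is hypothetical.

```python
def swap_bounds(free, a_prime, b_prime):
    """free maps vertex -> (d(x,A), d(x,B)). Returns (N, SA', SB')."""
    na = {x: da - db for x, (da, db) in free.items()}   # NA(x); NB(x) = -NA(x)
    s_b = min(b_prime, key=lambda y: -na[y])            # smallest NB in B'
    s_a = min(a_prime, key=lambda y: na[y])             # smallest NA in A'
    n = {x: na[x] + (-na[s_b]) for x in a_prime}        # NA(x) + NB(SB')
    n.update({x: -na[x] + na[s_a] for x in b_prime})    # NB(x) + NA(SA')
    return n, s_a, s_b


table = {"a": (3, 1), "b": (2, 1), "c": (1, 1), "d": (1, 2), "e": (1, 3), "f": (2, 5)}
bounds, s_a, s_b = swap_bounds(table, ["a", "b", "c", "d"], ["e", "f"])
print(s_a, s_b, bounds)
```

Because the free vertices were split in sorted order of NA(x), every bound comes out non-negative.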
21
N(x) stands for the number of edges of type IV that are allowed for x without losing admissibility.
We want to take as many edges from the free graph as possible, as long as for each x their number does not exceed N(x).
[Figure: a bipartite free graph with a bound N(x) written next to each vertex]
This is a Generalized Matching Problem (GMP), since regular matching is the special case where N(x)=1 for all x.
The GMP can be solved very easily.
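The slides do not say which method was used to solve the GMP. One standard option, shown here only as a sketch, is to treat it as a maximum-flow problem: source to each vertex of A’ with capacity N(x), each type IV edge with capacity 1, and each vertex of B’ to the sink with capacity N(y). The function name, the sentinel node names and the example edges are assumptions.

```python
from collections import defaultdict


def generalized_matching(edges, cap):
    """Largest set of edges (u in A', v in B') using each vertex x at most cap[x] times."""
    source, sink = "_source", "_sink"       # assumed not to clash with vertex names
    res = defaultdict(lambda: defaultdict(int))   # residual capacities
    for u, v in edges:
        res[source][u] = cap[u]
        res[u][v] = 1                       # each edge may be counted once
        res[v][sink] = cap[v]

    def augment(node, seen):
        """Depth-first search for one augmenting path; push one unit of flow along it."""
        if node == sink:
            return True
        seen.add(node)
        for nxt in list(res[node]):
            if res[node][nxt] > 0 and nxt not in seen and augment(nxt, seen):
                res[node][nxt] -= 1
                res[nxt][node] += 1
                return True
        return False

    flow = 0
    while augment(source, set()):
        flow += 1
    return flow


# Hypothetical free graph; the bounds are illustrative values of N(x).
caps = {"a": 4, "b": 3, "c": 2, "d": 1, "e": 1, "f": 2}
free_edges = [("a", "e"), ("a", "f"), ("b", "e"), ("c", "f"), ("d", "f")]
h3 = generalized_matching(free_edges, caps)
print(h3)
```

With all bounds set to 1 the same routine computes an ordinary maximum bipartite matching.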
22
Summary of f3
f3(k)=g(k)+h2(k)+h3(k).
1) Sort the free vertices in decreasing order of NA(x).
2) Calculate h2 for each of the free vertices.
3) Identify the swappable vertices and for each free vertex x calculate N(x).
4) Form the GMP with N(x).
5) Calculate h3 by solving the GMP.
23
The algorithms we used
Depth-first branch and bound (DFBnB).
DFBnB searches the tree from left to right.
It expands only subtrees with costs smaller than the best solution found so far.
We also used Iterative Deepening A*: IDA* (Korf 85).
[Figure: DFBnB example tree with node costs 8, 7 and 6]
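A minimal sketch of DFBnB on the GPP search tree, pruning with f1(k)=g(k)+h1(k). It is an illustration under assumptions, not the authors' implementation: vertices are numbered 0..n-1 and assigned in that order, n is even, and the function name `dfbnb_partition` is hypothetical.

```python
def dfbnb_partition(n, edges):
    """Return (minimum cut, subset A) over all splits of 0..n-1 into two halves."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    half = n // 2
    best = [float("inf"), None]              # best cost so far and its subset A

    def h1(A, B, first_free):
        # Each free vertex must pay at least min{d(x,A), d(x,B)} edges.
        return sum(min(len(adj[x] & A), len(adj[x] & B)) for x in range(first_free, n))

    def search(x, A, B, g):
        if x == n:                           # leaf: reached only if g < best[0]
            best[0], best[1] = g, set(A)
            return
        for mine, other in ((A, B), (B, A)):
            if len(mine) == half:
                continue                     # this subset is already full
            new_g = g + len(adj[x] & other)  # edges from x that now cross
            mine.add(x)
            if new_g + h1(A, B, x + 1) < best[0]:
                search(x + 1, A, B, new_g)
            mine.remove(x)

    search(0, set(), set(), 0)
    return best[0], best[1]


# Hypothetical example: two triangles joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(dfbnb_partition(6, edges))
```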
24
Empirical results
Given the size of the graph n and a branching factor b, we built a random graph with n nodes in which each edge was added with probability b/n.
The nodes of the graph were sorted in decreasing order of their branching factor (degree), so nodes with more edges are treated sooner.
Experiments were done on a 500 MHz PC.
Data was averaged over 30 similar data points.
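A minimal sketch of the instance generator described above, assuming each of the n(n-1)/2 possible edges is drawn independently. The function name `random_graph` and the `seed` parameter are illustrative additions.

```python
import random
from itertools import combinations


def random_graph(n, b, seed=None):
    """Return (edges, order): a random graph and its vertices sorted by decreasing degree."""
    rng = random.Random(seed)
    p = b / n                                  # probability of each possible edge
    edges = [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    order = sorted(range(n), key=lambda v: degree[v], reverse=True)
    return edges, order, degree


edges, order, degree = random_graph(50, 6, seed=0)
print(len(edges), order[:5])
```

The expected degree of a vertex is b(n-1)/n, which is close to b for n=50.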
25
Constant time per node for the different algorithms
The table shows the number of generated nodes per second.
f3 spends more than ten times as much time as f0 on each node of the tree.
Alg  nodes/second
f0   2,412,132
f1   1,280,940
f2     445,553
f3     194,191
26
The optimal cut for graphs of size 50
[Figure: the optimal cut (y-axis, 0 to 200) as a function of the average branching factor (x-axis, 2 to 20)]
As the density of the graph increases, so does the size of the optimal cut.
27
Time in seconds. Graphs of size 50. DFBnB used.
[Figure: time in seconds (y-axis, logarithmic, 0.01 to 100,000) as a function of the average degree (x-axis, 2 to 20) for f0, f1, f2 and f3]
Results for other graphs, as well as with IDA*, were very similar. A better heuristic solves the problem faster.
28
density  Alg  solution  nodes            seconds
6        f0   36.9       5,316,122,420    1,836.04
6        f1   36.9          18,923,257       11.76
6        f2   36.9             655,027        1.47
6        f3   36.9              44,404        0.22
20       f1   184.8     13,664,811,427   20,590.75
20       f2   184.8        342,200,788    1,139.51
20       f3   184.8         33,850,497      269.16
f3 is faster than f0 by a factor of more than 8,000 for graphs with a density of 6.
f3 is faster than f1 by a factor of about 76 for graphs with a density of 20.
29
Graphs of size 100. Solved by f3 only.
density  cut     nodes            seconds
2          5.57         170,103         0.82
4         30.97       5,333,677        51.85
6         66.63     122,199,646     1,542.94
8        106.3    2,004,165,640    29,214.19
10       144.75  14,464,048,386   227,607.09
Once again, as the density of the graph increases, the optimal cut increases linearly and the time to solve the problem increases exponentially.
30
Discussion
Our approach can be combined with any suboptimal algorithm A, by first running A and then giving its solution as a bound to DFBnB.
(Rolland and Pirkul 99) developed an algorithm, based on Lagrangian relaxation and subgradient search, that finds the size of the optimal solution very quickly.
The threshold for both IDA* and DFBnB can come from their method.
31
Conclusions
We have shown an algorithm that finds optimal solutions to the GPP.
We have demonstrated the claim that finding better heuristics by looking deeply into interactions between subgoals speeds up the search.
We have developed similar heuristics for the Vertex Cover problem and the sliding-tile puzzles, again with a nice speedup.