Solving the Graph-partitioning Problem with Heuristic Search

1


Ariel Felner

Bar-Ilan University

Ramat-Gan

ISRAEL

2

The Graph Partitioning Problem

Given a graph G(E,V), the problem is to partition the graph into two equal-sized subsets of vertices.

The number of edges that cross the partition should be minimized.

The partition in the graph on the right has cost 2.
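The cost of a partition can be made concrete with a short sketch. The graph below is hypothetical (the slide's figure is not reproduced here), and the names cut_cost and optimal_cut are illustrative only. The brute-force search is shown just to pin down the definition; it is not the method of this talk.

```python
from itertools import combinations

def cut_cost(edges, side_a):
    """Number of edges with exactly one endpoint in side_a."""
    return sum(1 for u, v in edges if (u in side_a) != (v in side_a))

def optimal_cut(vertices, edges):
    """Try every balanced partition; only feasible for very small graphs."""
    half = len(vertices) // 2
    return min(cut_cost(edges, set(side)) for side in combinations(vertices, half))

# Hypothetical graph: two triangles joined by two edges.
vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4), (2, 5)]

print(cut_cost(edges, {1, 2, 3}))    # edges 3-4 and 2-5 cross the partition
print(optimal_cut(vertices, edges))
```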

3

Related Work for the GPP

The GPP is NP-complete.

Most algorithms for the GPP are designed to find sub-optimal solutions and use local search techniques.

A large portion of them start with a feasible solution and then swap pairs of vertices between the two subsets of the partition.

The best known are KL (1970) and XLS (1991).

4

A Search Problem

A search space consists of states and operators, an initial state, and a set of goal states.

A solution: a path from the initial state to one of the goal states.

Optimal solution: a path of minimal cost.

Best-first search algorithm: keeps all generated nodes sorted in an OPEN-LIST and chooses the node with the best heuristic value (cost) for expansion.

5

Search Algorithms criteria

Solution quality: optimal, near-optimal, or sub-optimal.

Time Complexity: number of generated nodes.

Constant time per node: time spent in each node

6

Heuristic functions

Heuristic function: a function that gives each state an estimate of the real distance (cost) from that state to the goal.

A heuristic function is admissible if it never overestimates the real distance.

An admissible heuristic is always a lower bound on the real solution cost.

Example: air distance in road navigation.

A heuristic function should be as accurate as possible and as fast as possible to compute.

7

The A* algorithm

g(x): real distance from the initial state to the current node x.

h(x): the estimated remaining distance from x to the goal state.

f(x)=g(x): Uniform Cost Search.

f(x)=g(x)+h(x): The A* algorithm (1968).

f(x) in A* is an estimate of the shortest path to the goal via x.

Theorem: “Given a heuristic function, no other algorithm outperforms A*” (Pearl 83).
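A minimal sketch of best-first search with f(x)=g(x)+h(x) may help here. The road map and the heuristic values are hypothetical, and the function name a_star is illustrative. Passing a heuristic that always returns 0 turns the same code into Uniform Cost Search.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Return the cost of a cheapest path from start to goal, or None."""
    open_list = [(h(start), 0, start)]           # entries are (f, g, state)
    best_g = {start: 0}
    while open_list:
        f, g, x = heapq.heappop(open_list)       # node with the smallest f
        if x == goal:
            return g
        if g > best_g.get(x, float("inf")):
            continue                             # stale entry, a cheaper path is known
        for y, cost in neighbors(x):
            g_y = g + cost
            if g_y < best_g.get(y, float("inf")):
                best_g[y] = g_y
                heapq.heappush(open_list, (g_y + h(y), g_y, y))
    return None

# Hypothetical road map and an admissible estimate of the distance to "t".
roads = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 1)], "t": []}
air_distance = {"s": 3, "a": 3, "b": 1, "t": 0}

print(a_star("s", "t", roads.__getitem__, air_distance.__getitem__))  # A*
print(a_star("s", "t", roads.__getitem__, lambda x: 0))               # Uniform Cost Search
```

Both calls return the same optimal cost; the heuristic only changes how many nodes are generated on the way.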

8

Recent developments in Search

Most of the work in the past few years was on finding more accurate heuristic functions (Korf 96), (Schaefer 97), (Korf & Felner 2000).

A tradeoff: a more complicated heuristic takes more time per node but generates fewer nodes.

Observation: many search problems can be divided into solving several subproblems or achieving several subgoals.

Example: in the 15 tile-puzzle we have 15 subgoals.

In the GPP we have n subproblems of placing n vertices in one of the subsets of the partition.

(Figure: the 15 tile-puzzle, with tiles 1 to 15.)

9

Our main hypothesis

Our claim: “Looking more deeply into interactions between unsolved subgoals results in a much better heuristic function and speeds up the search.”

10

The GPP as a search problem

A subproblem in the GPP is to assign a vertex to one of the subsets of the partition.

Each level of the search tree corresponds to a specific vertex of the graph.

Each branch assigns the vertex to a different subset of the partition.

Each node of the tree is a partial partition that includes some of the vertices.

Size of the tree: 2^n.

Leaves of the tree are the complete partitions. One of them is optimal.

(Figure: a search tree whose nodes are partial partitions of the vertices 1, 2 and 3.)

11

Definitions

A node of the search tree is denoted by k, while a vertex of the graph is denoted by x.

A vertex that is already assigned to one of the subsets is called an assigned vertex.

Each of the other vertices is a free vertex. Free vertices are unsolved subgoals.

Given a node k of the search tree we define:

g(k): the number of edges that already cross the partial partition due to assigned vertices.

h(k): a lower bound on the number of edges that will cross the given partition due to free vertices.

12

A heuristic from the free vertices

The free vertices have many edges connected to them.

Can we estimate the number of such edges that must cross the partition?

(Figure: the assigned vertices 1, 2, 3 and 4, and the free vertices.)

13

More definitions

The subsets of the partial partition are A and B.

Each of the following heuristics completes the partition with A’ and B’.

We can guess about A’ and B’.

Types of edges:

I: Edges within A and A’

II: Edges from A to B

III: Edges from A to B’

IV: Edges from A’ to B’

(Figure: A={1,2}, B={3,4}, A’={5,6}, B’={7,8}, with edges of types I to IV marked.)

14

f0: Uniform Cost Search

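f0(k) = g(k): the edges that already cross the partition, that is, edges of type II.

Used mainly for comparison.

(Figure: the assigned vertices 1, 2 in A and 3, 4 in B, the free vertices 5, 6 in A’ and 7, 8 in B’, with the type II edges marked.)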

15

f1: Adding edges of type III

For each free vertex x we define d(x,A) as the number of edges from x to A and d(x,B) as the number of edges from x to B.

An admissible heuristic for a free vertex x is h1(x) = min{d(x,A), d(x,B)}.

h1(k) = the sum of h1(x) over all free vertices x.

f1(k) = g(k) + h1(k).

(Figure: a free vertex x with edges to the assigned vertices 1, 2 in A and 3, 4 in B.)
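As a rough illustration of f1, the sketch below computes g(k) and h1(k) for one partial partition. The edge list is hypothetical and the function names are illustrative; only the definitions of g, d(x,A), d(x,B) and h1 come from the slides.

```python
def g_value(edges, A, B):
    """g(k): edges with one endpoint in A and the other in B (type II)."""
    return sum(1 for u, v in edges if (u in A and v in B) or (u in B and v in A))

def degree_into(edges, x, subset):
    """d(x, subset): number of edges between the free vertex x and the subset."""
    return sum(1 for u, v in edges if (u == x and v in subset) or (v == x and u in subset))

def h1_value(edges, A, B, free):
    """h1(k): each free vertex contributes the smaller of d(x,A) and d(x,B)."""
    return sum(min(degree_into(edges, x, A), degree_into(edges, x, B)) for x in free)

# Hypothetical partial partition: 1, 2 are in A, 3, 4 are in B, 5 and 6 are free.
edges = [(1, 3), (1, 2), (5, 1), (5, 2), (5, 3), (6, 1), (6, 3), (6, 4), (5, 6)]
A, B, free = {1, 2}, {3, 4}, {5, 6}

g_k = g_value(edges, A, B)         # only edge 1-3 crosses so far
h_k = h1_value(edges, A, B, free)  # min(2, 1) for vertex 5 plus min(1, 2) for vertex 6
print(g_k, h_k, g_k + h_k)         # g(k), h1(k), f1(k)
```

The edge between the two free vertices 5 and 6 is ignored by h1, which is the gap that the type IV edges address later.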

16

f2: Sorting the free vertices

Assume that the cardinalities of A and B are p and q respectively.

n/2-p of the free vertices must go to A’.

n/2-q of the free vertices must go to B’.

NA(x) = d(x,A) - d(x,B).

NB(x) = d(x,B) - d(x,A). (NA(x) = -NB(x))

We now sort all the free vertices in decreasing order of NA(x). The first n/2-p go to A’. The rest go to B’.

17

vertex  d(x,A)  d(x,B)  NA(x)  NB(x)  subset
a       3       1        2     -2     A’
b       2       1        1     -1     A’
c       1       1        0      0     A’
d       1       2       -1      1     A’
e       1       3       -2      2     B’
f       2       5       -3      3     B’

f2(k) = g(k) + h2(k), where h2 takes d(x,B) if x is in A’ and d(x,A) if x is in B’.

h1 places d in B’ and takes d(d,A)=1, while h2 places d in A’ and takes d(d,B)=2.

h1 looks at each free vertex alone, while h2 looks at interactions between the vertices.
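The difference between h1 and h2 can be checked on the six free vertices of the table. The sketch assumes that four free slots remain in A’ (n/2-p = 4), which is consistent with the statement that h2 places d in A’; the function names are illustrative.

```python
# (d(x,A), d(x,B)) for the free vertices a..f, taken from the table.
d = {"a": (3, 1), "b": (2, 1), "c": (1, 1), "d": (1, 2), "e": (1, 3), "f": (2, 5)}

def h1_value(d):
    """Each free vertex independently takes the cheaper side."""
    return sum(min(d_a, d_b) for d_a, d_b in d.values())

def h2_value(d, slots_a):
    """Sort by decreasing NA(x); the first slots_a vertices go to A', the rest to B'."""
    order = sorted(d, key=lambda x: d[x][0] - d[x][1], reverse=True)
    a_prime, b_prime = order[:slots_a], order[slots_a:]
    # A vertex placed in A' pays d(x,B); a vertex placed in B' pays d(x,A).
    value = sum(d[x][1] for x in a_prime) + sum(d[x][0] for x in b_prime)
    return value, a_prime, b_prime

value, a_prime, b_prime = h2_value(d, slots_a=4)
print(h1_value(d), value, a_prime, b_prime)
```

With these numbers h1 sums to 7 and h2 to 8; the extra unit comes from vertex d, which h2 is forced to place in A’.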

18

f3: Adding type IV edges to f2

The free graph

Nodes: free vertices.

Edges: edges between free vertices that were assigned to different subsets by f2. These are edges of type IV.

The graph is bipartite.

We want to add as many such edges to h as possible without losing admissibility.

(Figure: the two sides of the completed partition, A+A’ = {1,2,5,6} and B+B’ = {3,4,7,8}.)

19

Suppose that we want to move vertex x from A’ to B’.

NA(x) = d(x,A) - d(x,B) more edges will be added to the partition.

Another vertex y from B’ must then be moved to A’ in exchange for x, and thus NB(y) further edges will also be added.

We want the y with the smallest NB(y). We call it the swappable vertex of B’ and denote it by SB’.

In the same manner we define SA’.

(Figure: the assigned vertices 1, 2 in A and 3, 4 in B, the free vertices 5, 6 in A’ and 7, 8 in B’, with SB’ marked in B’.)

20

N(x) = NA(x) + NB(SB’) if x is in A’

N(x) = NB(x) + NA(SA’) if x is in B’

N(x) is a lower bound on the number of edges that will be added to f2 if we move a free vertex from A’ to B’ or from B’ to A’.

Let x be a vertex in A’ with 3 edges of type IV.

We can take as many of these edges as we like, as long as their number does not exceed N(x), because beyond that point it is better to swap x with SB’.

(Figure: A+A’ contains 1, 2, x, 6 and B+B’ contains 3, 4, SB’, 8.)
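A small sketch of how the swappable vertices and N(x) could be computed, using the d(x,A) and d(x,B) values of the six free vertices a to f from the f2 table. It assumes the symmetric reading of the definition: a vertex in A’ pays NA(x) plus NB of SB’, and a vertex in B’ pays NB(x) plus NA of SA’, where SA’ is the vertex of A’ with the smallest NA. The function name swap_bounds is illustrative.

```python
# (d(x,A), d(x,B)) for the free vertices, with the split chosen by f2.
d = {"a": (3, 1), "b": (2, 1), "c": (1, 1), "d": (1, 2), "e": (1, 3), "f": (2, 5)}
a_prime, b_prime = ["a", "b", "c", "d"], ["e", "f"]

def swap_bounds(d, a_prime, b_prime):
    """Return (SA', SB', N) where N maps each free vertex to its bound N(x).

    Assumes both a_prime and b_prime are non-empty.
    """
    na = {x: d[x][0] - d[x][1] for x in d}    # NA(x) = d(x,A) - d(x,B)
    nb = {x: -na[x] for x in d}               # NB(x) = -NA(x)
    s_a = min(a_prime, key=lambda y: na[y])   # cheapest vertex to move out of A'
    s_b = min(b_prime, key=lambda y: nb[y])   # cheapest vertex to move out of B'
    n = {x: na[x] + nb[s_b] for x in a_prime}
    n.update({x: nb[x] + na[s_a] for x in b_prime})
    return s_a, s_b, n

s_a, s_b, n = swap_bounds(d, a_prime, b_prime)
print(s_a, s_b, n)
```

Because f2 sorted the vertices by NA(x), every N(x) computed this way is non-negative.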

21

N(x) stands for the number of edges of type IV that are allowed for x without losing admissibility.

We want to take as many edges from the free graph as possible, as long as for each x their number does not exceed N(x).

(Figure: a free graph with the vertex labels 1, 3, 2, 3, 4, 1.)

This is a Generalized Matching Problem (GMP) since regular matching is a special case where N(x)=1 for all x.

The GMP can be solved very easily.
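The slides do not say which method was used to solve the GMP. One standard option, sketched here as an assumption, is to treat it as a maximum-flow problem on the bipartite free graph: each vertex x gets capacity N(x), each edge capacity 1, and augmenting paths are found by depth-first search. The free graph and the N values below are hypothetical.

```python
from collections import defaultdict

def generalized_matching(edges, cap):
    """Largest number of edges such that vertex x is used at most cap[x] times.

    edges: (u, v) pairs with u in A' and v in B'.
    """
    source, sink = ("source",), ("sink",)
    residual = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        residual[source][("A'", u)] = max(cap[u], 0)
        residual[("A'", u)][("B'", v)] += 1
        residual[("B'", v)][sink] = max(cap[v], 0)

    def augment(node, seen):
        """Push one unit of flow from node to the sink if a path exists."""
        if node == sink:
            return True
        seen.add(node)
        for nxt, c in list(residual[node].items()):
            if c > 0 and nxt not in seen and augment(nxt, seen):
                residual[node][nxt] -= 1
                residual[nxt][node] += 1   # reverse edge lets later paths undo this choice
                return True
        return False

    flow = 0
    while augment(source, set()):
        flow += 1
    return flow

# Hypothetical free graph: x in A' has three type IV edges but N(x) = 2.
free_edges = [("x", "p"), ("x", "q"), ("x", "r"), ("y", "p")]
N = {"x": 2, "y": 1, "p": 1, "q": 1, "r": 1}
print(generalized_matching(free_edges, N))
```

With N(x)=1 for every vertex the same function computes an ordinary maximum bipartite matching.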

22

Summary of f3

f3(k)=g(k)+h2(k)+h3(k).

1) Sort the free vertices in decreasing order of NA(x).

2) Calculate h2 for each of the free vertices.

3) Identify the swappable vertices and for each free vertex x calculate N(x).

4) Form the GMP with N(x).

5) Calculate h3 by solving the GMP.

23

The algorithms we used

Depth-first branch and bound (DFBnB).

DFBnB searches the tree from left to right.

It expands only subtrees with costs smaller than the best solution found so far.

We also used Iterative Deepening A*: IDA* (Korf 85).

(Figure: a DFBnB search tree with node costs 8, 7, 6, 6, 6, 8 and 7.)
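A compact sketch of DFBnB on the GPP search tree follows. For brevity it prunes with f1 = g + h1 rather than f3, and it assumes vertices numbered 0 to n-1 with n even; the function name and the example graph are hypothetical.

```python
def dfbnb_partition(n, edges, upper_bound=float("inf")):
    """Depth-first branch and bound; returns (best cost, side of each vertex)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    half = n // 2
    side = [None] * n                        # None = free, 0 = A, 1 = B
    best = {"cost": upper_bound, "side": None}

    def h1():
        total = 0
        for x in range(n):
            if side[x] is None:
                d_a = sum(1 for y in adj[x] if side[y] == 0)
                d_b = sum(1 for y in adj[x] if side[y] == 1)
                total += min(d_a, d_b)
        return total

    def search(x, g, size_a, size_b):
        if x == n:                           # leaf: a complete partition
            best["cost"], best["side"] = g, side[:]
            return
        for s in (0, 1):                     # each branch assigns x to one subset
            if (size_a if s == 0 else size_b) == half:
                continue                     # that subset is already full
            added = sum(1 for y in adj[x] if side[y] == 1 - s)
            side[x] = s
            if g + added + h1() < best["cost"]:   # prune when f is not smaller than the best so far
                search(x + 1, g + added, size_a + (s == 0), size_b + (s == 1))
            side[x] = None

    search(0, 0, 0, 0)
    return best["cost"], best["side"]

# Hypothetical graph: two triangles joined by a single edge.
cost, side = dfbnb_partition(6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
print(cost, side)
```

If upper_bound is set to the cost of a known solution and nothing strictly better exists, the returned side list stays None, since only improving partitions are recorded.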

24

Empirical results

Given the size of the graph n and a branching factor b, we built a random graph with n nodes in which each edge was added with probability b/n.

The nodes of the graph were sorted in decreasing order of their branching factor (degree), so that nodes with more edges are treated sooner.

Experiments were run on a 500 MHz PC.

Data was averaged over 30 similar data points.
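The graph construction described above might look like the following sketch. The function name, the seed parameter and the return format are assumptions made for illustration; only the edge probability b/n and the ordering by decreasing degree come from the slide.

```python
import random

def random_graph(n, b, seed=0):
    """Random graph on vertices 0..n-1 where each edge appears with probability b/n."""
    rng = random.Random(seed)                # seeded so that runs are repeatable
    p = b / n
    edges = [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Vertices with more edges come first, so they are assigned near the root of the tree.
    order = sorted(range(n), key=lambda x: degree[x], reverse=True)
    return edges, degree, order

edges, degree, order = random_graph(50, 6, seed=1)
print(len(edges), sum(degree) / 50)          # number of edges and the average degree
```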

25

Constant time per node for the different algorithms

The table shows the number of generated nodes per second.

f3 spends more than ten times as much time as f0 on each node of the tree.

Alg   nodes/second
f0    2,412,132
f1    1,280,940
f2      445,553
f3      194,191

26

The optimal cut for graphs of size 50

(Figure: the optimal cut, on a scale from 0 to 200, plotted against the average branching factor, from 2 to 20.)

As the density of the graph increases, so does the size of the optimal cut.

27

Time in seconds for graphs of size 50, using DFBnB

(Figure: time in seconds, on a logarithmic scale from 0.01 to 100,000, plotted against the average degree, from 2 to 20, with one curve for each of f0, f1, f2 and f3.)

Results for other graphs, as well as with IDA*, were very similar. A better heuristic solves the problem faster.

28

density  Alg  solution  nodes            seconds
6        f0   36.9       5,316,122,420    1,836.04
6        f1   36.9          18,923,257       11.76
6        f2   36.9             655,027        1.47
6        f3   36.9              44,404        0.22
20       f1   184.8     13,664,811,427   20,590.75
20       f2   184.8        342,200,788    1,139.51
20       f3   184.8         33,850,497      269.16

f3 is faster than f0 by a factor of about 8,300 (1,836.04 s versus 0.22 s) for graphs with density 6.

f3 is faster than f1 by a factor of about 76 (20,590.75 s versus 269.16 s) for graphs with density 20.

29

Graphs of size 100. Solved by f3 only.

density  cut     nodes            seconds
2        5.57           170,103         0.82
4        30.97        5,333,677        51.85
6        66.63      122,199,646     1,542.94
8        106.38   2,004,165,640    29,214.19
10       144.75  14,464,048,386   227,607.09

Once again, as the density of the graph increases, the optimal cut increases linearly and the time to solve the problem increases exponentially.

30

Discussion

Our approach can be combined with any other sub-optimal algorithm A, by first running A and then giving its solution as a bound to DFBnB.

(Rolland and Pirkul 99) developed an algorithm that finds the size of the optimal solution very quickly based on Lagrangian relaxation and subgradient search.

The threshold for both IDA* and DFBnB can come from their method.

31

Conclusions

We have shown an algorithm that finds optimal solutions to the GPP.

We have demonstrated the claim that finding better heuristics by looking deeply into interactions between subgoals speeds up the search.

We have developed similar heuristics for the Vertex Cover problem and the sliding-tile puzzles, again with a nice speedup.
