Decision 1:: The Travelling Salesman Problem (A2 Content)

[email protected] Last modified: 2nd March 2021


Decision 1 Overview

1:: Algorithms

Sorting and bin packing.

2:: Graphs and networks

What a graph is and how graphs represent things.

3:: Algorithms on graphs

What algorithms do I need to be able to apply?

4:: Route inspection

Find the shortest route which travels along all roads.

5:: The Travelling Salesman

Find the shortest route which visits all places.

6:: Linear Programming

How to find an optimal solution graphically.

7:: The simplex algorithm

How to find an optimal solution algebraically.

8:: Critical path analysis

How to plan a project.

Motivation

Suppose you wanted to make a delivery to 22 different locations, with good road links between all of them. How many possible routes would you need to consider?

21! routes, which is about 5.1 × 10^19, or roughly 51 quintillion possible routes. The Washington Post estimated in 2018 that it would take a laptop computer 1,000 years to compute the length of all these routes.

Now, suppose instead of visiting 22 houses, you wanted to visit nearly 50,000 pubs in the U.K. We are going to need a quantum computer or some pretty heavy mathematics…

http://www.math.uwaterloo.ca/tsp/uk/index.html

How this was done is well beyond the scope of this course, but it wouldn’t be sensible to go on without some understanding of the method, so spend 4 minutes watching this: http://www.math.uwaterloo.ca/tsp/uk/jmm2018_cutting.mp4

Now, suppose instead of 50,000 pubs you wanted to visit 109,399 of the closest stars to Earth. You can see we need to study the Travelling Salesman Problem!

Image - Close-up view of the 109,399-star TSP
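The 21! route count for the 22-location delivery can be checked directly. This is a minimal sketch using only Python's standard library; the variable name `routes` is purely illustrative.

```python
import math

# With 22 locations and a fixed starting point, the remaining 21
# locations can be visited in any order: 21! possible routes.
routes = math.factorial(21)

print(f"{routes:,}")    # exact count with thousands separators
print(f"{routes:.1e}")  # the same count in scientific notation
```

The exact value is a little over 51 quintillion, which is why checking every route is not a practical method even for a modest number of locations.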

5.1 Definitions

A walk in a network is a finite sequence of edges such that the end vertex of one edge is the start vertex of the next. (Ch2)

A tour is a walk which visits every vertex, returning to its starting vertex.

A tour which visits every vertex exactly once is a Hamiltonian cycle.

What are the similarities and differences between a tour and a cycle?

Both start and finish at the same vertex, but a cycle need not visit every vertex (a tour must), and a cycle may not visit any vertex twice other than the start/finish (a tour may).

The Travelling Salesman Problem involves finding a tour of minimum total weight.

In the classical problem, each vertex must be visited exactly once before returning to the start.

In the practical problem, each vertex must be visited at least once before returning to the start.
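The difference between a tour (practical problem) and a Hamiltonian cycle (classical problem) can be sketched as two small checks. This is only an illustration: the function names and the example walks are hypothetical, and the code does not check that consecutive vertices are actually joined by an edge.

```python
def is_tour(walk, vertices):
    """A tour returns to its start and visits every vertex at least once."""
    return len(walk) > 1 and walk[0] == walk[-1] and set(walk) == set(vertices)


def is_hamiltonian_cycle(walk, vertices):
    """A Hamiltonian cycle is a tour that visits every vertex exactly once."""
    return is_tour(walk, vertices) and len(walk) - 1 == len(set(vertices))


vertices = "ABCDE"
practical_walk = list("ACEDBECA")  # revisits C and E on the way back
classical_walk = list("ACEDBA")    # each vertex exactly once

print(is_tour(practical_walk, vertices), is_hamiltonian_cycle(practical_walk, vertices))
print(is_tour(classical_walk, vertices), is_hamiltonian_cycle(classical_walk, vertices))
```

The first walk is acceptable for the practical problem only; the second is acceptable for both.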

Bounds

There is no known quick algorithm for solving the Travelling Salesman Problem. In most practical situations it is good enough to find a good solution, which may not be optimal.

We calculate ‘upper bounds’ and ‘lower bounds’ and use these to trap the optimal solution.

Upper bound ≥ Better upper bound ≥ Best upper bound ≥ Optimal solution ≥ Best lower bound ≥ Better lower bound ≥ Lower bound

Suppose you find upper bounds for a problem of 143, 132 and 136, and lower bounds of 112, 130 and 115. If you then find a route of length 130, do you know whether you have found an optimal solution?

Yes, because 130 is equal to the best lower bound, so the route is optimal.
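The reasoning in this question can be sketched in a few lines: the best upper bound is the smallest one, the best lower bound is the largest one, and a route whose length equals the best lower bound cannot be beaten. The helper name `is_provably_optimal` is an assumption made for this illustration.

```python
upper_bounds = [143, 132, 136]
lower_bounds = [112, 130, 115]

best_upper = min(upper_bounds)  # the smallest upper bound is the most useful
best_lower = max(lower_bounds)  # the largest lower bound is the most useful


def is_provably_optimal(route_length, lower_bounds):
    """A route as short as the best lower bound must be optimal."""
    return route_length == max(lower_bounds)


print(best_lower, best_upper)
print(is_provably_optimal(130, lower_bounds))
```

Before the route of length 130 is found, all we know is that the optimal length lies between 130 and 132.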

Finding a complete network of least distances.

If the network is complete and each edge weight is the least distance between its two end vertices, then the classical and the practical travelling salesman problems are equivalent. In the network below, vertex C must be visited more than once and the mountain pass is clearly redundant, so we transform the network into a complete network of least distances and can then apply the classical TSP.

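Image - network on vertices A, B, C, D, E with edges AC = 12, CE = 18, BE = 21, DE = 14, BD = 19 and BC = 53 (the mountain pass); times in minutes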

The triangle inequality, which states that the longest side of a triangle ≤ the sum of the two shorter sides, doesn’t hold here, and the graph is also not complete. You could travel from A to B, and the journey would take you 12 + 18 + 21 = 51 minutes.

How long would it take to get from B to C?

39 minutes – you wouldn’t go over the pass to find an optimum solution to the TSP.


Let’s redraw the graph to make the triangle inequality hold and make the graph complete.

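Image - complete network of least distances on vertices A, B, C, D, E with AB = 51, AC = 12, AD = 44, AE = 30, BC = 39, BD = 19, BE = 21, CD = 32, CE = 18, DE = 14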

Matrix for a complete network of least distances


    A   B   C   D   E
A   -   51  12  44  30
B   51  -   39  19  21
C   12  39  -   32  18
D   44  19  32  -   14
E   30  21  18  14  -

We have already completed the network of least distances on the previous slide. Here we present an informal method to achieve the same thing using a distance matrix.

• Starting at vertex A, by inspection complete the first row of the matrix with the least distances to each other vertex.

• As the network is non-directional you can copy the first row into the first column.

• Repeat for the 2nd row/column.

• Continue until the matrix is complete.

Note – we only consider non-directional networks in this chapter, so the matrix will always be symmetrical.
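The slides complete the matrix by inspection. For a larger network the same table can be produced systematically; the sketch below uses the Floyd-Warshall all-pairs shortest path method, which is not the method in the slides. The edge list is an assumption read off the original network and is consistent with the matrix above: AC = 12, CE = 18, BE = 21, DE = 14, BD = 19, BC = 53.

```python
INF = float("inf")
vertices = "ABCDE"

# Direct edges of the original (incomplete) network, in minutes.
# Assumed from the figure; BC = 53 is the mountain pass.
edges = {("A", "C"): 12, ("C", "E"): 18, ("B", "E"): 21,
         ("D", "E"): 14, ("B", "D"): 19, ("B", "C"): 53}

# Start with direct distances only (0 on the diagonal, infinity if no edge).
dist = {u: {v: (0 if u == v else INF) for v in vertices} for u in vertices}
for (u, v), w in edges.items():
    dist[u][v] = dist[v][u] = w  # non-directional network

# Floyd-Warshall: allow each vertex k in turn as an intermediate stop.
for k in vertices:
    for i in vertices:
        for j in vertices:
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]

for u in vertices:
    print(u, [dist[u][v] for v in vertices])
```

The result replaces the direct B to C time of 53 with the shorter 39 via E, and fills in every missing pair such as A to B.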

Test your understanding


Pearson Decision 1, Page 106

Answer templates are available at https://www.activeteachonline.com/default/player/document/id/763127/external/0/uid/1258

Exercise 5A

Using a min. spanning tree to find an upper bound

1) Find a MST for the network using Prim or Kruskal.

2) Double its weight (in effect you are retracing your steps along each edge).

3) Seek ‘shortcuts’: make use of some of the omitted arcs to bypass some of the repeats and improve your upper bound.

Use Kruskal’s algorithm to find the MST for the network below (from Chapter 3 slides).

Image - network on vertices A, B, C, D, E with arcs DE(4), AE(5), BC(5), AD(6), BD(6), AB(7), CD(8)

Order of the arcs is DE(4), AE(5), BC(5), AD(6), BD(6), AB(7), CD(8).

Start with DE, then add AE and BC. Reject AD because it would form a cycle, then add BD.

All vertices are connected so this is a minimum spanning tree. Its weight is 4 + 5 + 5 + 6 = 20.

Initial upper bound = 2 × 20 = 40

By inspection A − B (7) looks like a shortcut: it replaces the return along BD, DE and EA (6 + 4 + 5 = 15).

Improved upper bound = 40 − 15 + 7 = 32
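The three steps can be sketched in code for this network. The example below is only an illustration under stated assumptions: the function names `kruskal` and `walk_length` are hypothetical, and the shortcut tour is typed in by hand (it was found by inspection, as in the slides) rather than searched for.

```python
def kruskal(vertices, edges):
    """Return the MST as a list of (u, v, weight), taking arcs in weight order."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component.
        while parent[v] != v:
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # accept only arcs that do not form a cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


edges = [(4, "D", "E"), (5, "A", "E"), (5, "B", "C"), (6, "A", "D"),
         (6, "B", "D"), (7, "A", "B"), (8, "C", "D")]

mst = kruskal("ABCDE", edges)
mst_weight = sum(w for _, _, w in mst)
initial_upper_bound = 2 * mst_weight  # walk every tree arc twice

weights = {frozenset((u, v)): w for w, u, v in edges}


def walk_length(walk):
    """Total weight of a walk given as a list of vertices."""
    return sum(weights[frozenset(pair)] for pair in zip(walk, walk[1:]))


# Out along the tree A-E-D-B-C, back to B, then the shortcut B-A.
shortcut_tour = ["A", "E", "D", "B", "C", "B", "A"]
improved_upper_bound = walk_length(shortcut_tour)

print(mst, mst_weight, initial_upper_bound, improved_upper_bound)
```

Because vertex B is visited twice, the improved bound is an upper bound for the practical problem on this network.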

Using Prim’s algorithm (from ch. 3 slides)

c) State the length of the initial upper bound, and use shortcuts to reduce the upper bound to below 1000.

Image - minimum spanning tree on vertices A, B, C, D, E, F with arc weights 95, 70, 150, 155 and 125

Initial upper bound is 2 × (95 + 70 + 150 + 155 + 125) = 1190

Many options, but one is to use the shortcuts F − B and D − E.

Image - the same tree with shortcut arcs of weight 240 and 100 added

Improved UB = 935

Redraw the network in a straight line.

Test your understanding

Pearson Decision 1, Page 112

Exercise 5B

Using a minimum spanning tree to find a lower bound

You can use this method to find a LB for the classical problem.

1) Remove each vertex in turn (or whichever you are directed to remove).

2) Find the residual minimum spanning tree (RMST).

3) Add to the RMST the ‘cost’ of reconnecting the deleted vertex by the two shortest, distinct arcs.

4) The greatest of these totals is used for the lower bound.

5) Make the lower bound as high as possible to reduce the interval in which the optimal solution is known to be.

6) If the LB is a Hamiltonian cycle, or if the LB is the same as the UB, you have found an optimal solution.

I had a bit of space on this slide, so I have given you this joke to work out.

https://www.explainxkcd.com/wiki/index.php/399:_Travelling_Salesman_Problem

E.g. of using a MST to find a LB

The table below shows driving distances (in miles) between 5 U.S. cities: Los Angeles, Las Vegas, New York, Washington DC and Miami.

     LA    LV    NY    DC    M
LA   -     270   2790  2658  2733
LV   270   -     2522  2402  2535
NY   2790  2522  -     226   1280
DC   2658  2402  226   -     1054
M    2733  2535  1280  1054  -

By removing Los Angeles, find a lower bound for the TSP for this network.

     LV    NY    DC    M
LV   -     2522  2402  2535
NY   2522  -     226   1280
DC   2402  226   -     1054
M    2535  1280  1054  -

Using Prim’s algorithm starting at LV, the order of arc selection is LV-DC (2402), DC-NY (226), DC-M (1054), so the RMST has weight 3682.

The cheapest way to connect LA with two distinct arcs is LA-LV (270) & LA-DC (2658).

Lower bound is 3682 + 270 + 2658 = 6610 miles
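The deleted-vertex method can be sketched in code using the U.S. distance table. This is an illustrative sketch, not a prescribed implementation: the function names `prim_mst_weight` and `lower_bound_by_deleting` are assumptions, and Prim's algorithm here simply starts from the first remaining vertex.

```python
cities = ["LA", "LV", "NY", "DC", "M"]
rows = [
    [0, 270, 2790, 2658, 2733],
    [270, 0, 2522, 2402, 2535],
    [2790, 2522, 0, 226, 1280],
    [2658, 2402, 226, 0, 1054],
    [2733, 2535, 1280, 1054, 0],
]
dist = {a: dict(zip(cities, row)) for a, row in zip(cities, rows)}


def prim_mst_weight(vertices, dist):
    """Weight of a minimum spanning tree on the given vertices (Prim)."""
    vertices = list(vertices)
    in_tree = {vertices[0]}
    total = 0
    while len(in_tree) < len(vertices):
        # Cheapest arc from any vertex in the tree to any vertex outside it.
        weight, nearest = min(
            (dist[u][v], v) for u in in_tree for v in vertices if v not in in_tree
        )
        in_tree.add(nearest)
        total += weight
    return total


def lower_bound_by_deleting(vertex, dist):
    """RMST weight plus the two shortest arcs reconnecting the deleted vertex."""
    rest = [v for v in dist if v != vertex]
    two_shortest = sorted(dist[vertex][v] for v in rest)[:2]
    return prim_mst_weight(rest, dist) + sum(two_shortest)


bounds = {c: lower_bound_by_deleting(c, dist) for c in cities}
best_lower_bound = max(bounds.values())
print(bounds["LA"], best_lower_bound)
```

Deleting each vertex in turn and keeping the greatest total corresponds to steps 1 and 4 of the method; for this table, deleting LA already gives the best lower bound.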

Test your understanding

By deleting vertex B, find a lower bound for the TSP for this distance table.

Apply Prim’s algorithm, starting at vertex A; the order of selection is AD, AE, CD, CF.

Image - residual minimum spanning tree on vertices A, C, D, E, F with arc weights 95, 70, 150 and 155, and vertex B reconnected (an arc of weight 125 is shown)

The cheapest way to connect B is BA & BD.

A lower bound is 730km

b) A shorter route is opened up between town B and E of length 100km, and also between town B and F of length 95km. Calculate the new lower bound found by deleting vertex B and justify whether this bound is known to be optimal.

New LB = 95 + 70 + 150 + 155 + 100 + 95 = 665km. This corresponds to the Hamiltonian cycle EADCFBE, so it is optimal.

Pearson Decision 1, Page 117-118

Exercise 5C

Using the nearest neighbour algorithm to find UB

The method of finding a minimum spanning tree and selecting shortcuts is inefficient and difficult for large networks. An alternative approach is to use the nearest neighbour algorithm.

1) Select each vertex in turn as a starting point.

2) Go to the nearest vertex which has not yet been visited.

3) Repeat step 2 until all vertices have been visited and then return to the start vertex using the shortest route.

4) Once all vertices have been used as the starting vertex, select the tour with the smallest length as the upper bound.

Note the differences between the nearest neighbour algorithm and Prim’s algorithm.

• Nearest neighbour finds a Hamiltonian cycle, Prim finds a minimum spanning tree.

• Nearest neighbour finds the closest vertex to the vertex last chosen, Prim finds the closest vertex to any of the vertices already selected.

Note – you will usually be directed which vertex to start from. You will only have to check all vertices if you are told to do so.


Let’s go back to our US road trip and find an upper bound using the nearest neighbour algorithm starting at LA.

     LA    LV    NY    DC    M
LA   -     270   2790  2658  2733
LV   270   -     2522  2402  2535
NY   2790  2522  -     226   1280
DC   2658  2402  226   -     1054
M    2733  2535  1280  1054  -

• Look at LA column, select 270 (go to LV), delete row LA.

• Look at LV column, select 2402 (go to DC), delete row LV.

• Look at DC column, select 226 (go to NY), delete row DC.

• Look at NY column, you have now visited every vertex except M so you must choose 1280.

• Now return directly from M to LA on 2733.

LA → LV → DC → NY → M → LA: 270 + 2402 + 226 + 1280 + 2733 = 6911

An upper bound is 6911 miles

Compare this with the LB of 6610 miles found earlier (slide 15); the optimal solution is somewhere between the two.
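The nearest neighbour steps can be sketched in code for the same table. This is a minimal illustration: the function name `nearest_neighbour_tour` is an assumption, and ties between equally near vertices (none occur in this table) are not handled specially. Using the table's NY to DC distance of 226, the tour from LA comes to 6911 miles.

```python
cities = ["LA", "LV", "NY", "DC", "M"]
rows = [
    [0, 270, 2790, 2658, 2733],
    [270, 0, 2522, 2402, 2535],
    [2790, 2522, 0, 226, 1280],
    [2658, 2402, 226, 0, 1054],
    [2733, 2535, 1280, 1054, 0],
]
dist = {a: dict(zip(cities, row)) for a, row in zip(cities, rows)}


def nearest_neighbour_tour(start, dist):
    """Repeatedly move to the nearest unvisited vertex, then return to start."""
    tour = [start]
    unvisited = set(dist) - {start}
    while unvisited:
        here = tour[-1]
        nearest = min(unvisited, key=lambda v: dist[here][v])
        tour.append(nearest)
        unvisited.remove(nearest)
    tour.append(start)  # close the tour
    length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
    return tour, length


tour_from_la, length_from_la = nearest_neighbour_tour("LA", dist)
print(tour_from_la, length_from_la)

# Step 4: try every start vertex and keep the shortest tour found.
best_upper_bound = min(nearest_neighbour_tour(c, dist)[1] for c in cities)
print(best_upper_bound)
```

Trying every starting vertex can only improve the upper bound, which is why the full algorithm asks for it when you are not directed to a particular start.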


Test your understanding

Pearson Decision 1, Page 121-122

Exercise 5D

And finally…

For anyone wondering, the optimal solution to my US road trip is as follows…

LA – M – DC – NY – LV – LA, for a total of 6805 miles.
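With only five cities the optimal tour can be confirmed by checking every route, which is exactly the approach that becomes impossible for 22 locations. The sketch below fixes LA as the start and tries all 4! = 24 orderings of the other cities; the variable names are illustrative.

```python
from itertools import permutations

cities = ["LA", "LV", "NY", "DC", "M"]
rows = [
    [0, 270, 2790, 2658, 2733],
    [270, 0, 2522, 2402, 2535],
    [2790, 2522, 0, 226, 1280],
    [2658, 2402, 226, 0, 1054],
    [2733, 2535, 1280, 1054, 0],
]
dist = {a: dict(zip(cities, row)) for a, row in zip(cities, rows)}


def tour_length(tour):
    """Total distance of a tour given as a sequence of city codes."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))


# Every Hamiltonian cycle that starts and finishes at LA.
candidates = [("LA",) + middle + ("LA",) for middle in permutations(cities[1:])]
best_length, best_tour = min((tour_length(t), t) for t in candidates)

print(best_length, best_tour)
```

Each cycle appears twice in the list, once in each direction, so the tour printed may be the reverse of the one quoted above; the length is the same either way.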


TSP solver from table

Google Maps TSP solver