Flight Time and Cost Minimization in Complex Routes
Rafael Alexandre da Silva Marques
Thesis to obtain the Master of Science Degree in
Aerospace Engineering
Supervisors: Doutor Nuno Filipe Valentim Roma, Doutor Luís Manuel Silveira Russo
Examination Committee
Chairperson: Doutor José Fernando Alves da Silva
Supervisor: Doutor Nuno Filipe Valentim Roma
Member of the Committee: Doutor Vasco Miguel Gomes Nunes Manquinho
October 2018
Declaration
I declare that this document is an original work of my own authorship and that it fulfills all the
requirements of the Code of Conduct and Good Practices of the Universidade de Lisboa.
Acknowledgments
I would like to thank my parents and all my family, friends and both of my professors for the support,
patience and discussions that allowed for the development and conclusion of this work.
This work was partially supported by national funds through Fundação para a Ciência e a Tecnologia
(FCT) under projects UID/CEC/50021/2013 and PTDC/EEI-HAC/30485/2017.
Abstract
The present work formalizes and addresses the Flying Tourist Problem (FTP), an NP-hard problem
that arises as a generalization of the Traveling Salesman Problem (TSP), and whose goal is to find
the best schedule, route, and set of flights for any given unconstrained multi-city flight request.
Despite the existence of numerous flight search applications, most of them cannot properly address
unconstrained multi-city flight requests, since this problem is generally not tractable. Accordingly,
the main goal of this research is to develop a methodology that solves this demanding problem
efficiently. To accomplish this, different heuristic and meta-heuristic optimization algorithms were
considered, including Ant Colony Optimization and Simulated Annealing, allowing the identification
of solutions in real time, even for large instances. The developed methods were integrated into a
web application prototype, allowing a fast resolution of user-defined requests. The implemented
system was evaluated using different criteria, including the quality of its optimization system, the
utility of the devised problem, and its performance compared to other similar systems. Furthermore,
when comparing the developed system to the only known (but undisclosed) alternative, it was shown
that the developed application provides the cheapest and the best-recommended solutions 95% and
74% of the time, respectively. As a result, when planning a complex multi-city trip, the developed
system allows the user to save a significant amount of time and money.
Keywords
Flying Tourist Problem, Traveling Salesman Problem, Combinatorial Optimization, Ant Colony
Optimization, Simulated Annealing, Web Application.
Resumo
The present work formalizes and addresses a problem named the Flying Tourist Problem, which
arises as a generalization of the well-known Traveling Salesman Problem, and whose goal is to
determine the best schedule, route and set of flights to fulfill an unconstrained itinerary that
passes through several cities and relies solely on commercial flights. The main goal of this work
is the development of an efficient methodology to solve this problem. To achieve this goal,
optimization algorithms based on Ant Colony Optimization and Simulated Annealing were considered,
which allow solutions to be determined in real time, even for large problems. The developed methods
were integrated and prototyped in a web service, allowing the resolution of real, user-defined
problems. The implemented system was evaluated using different criteria, including the quality of
its optimization system, the utility of the proposed problem, and its performance when compared
with other similar systems. Furthermore, when comparing the developed system with the only
currently existing (closed) alternative, it was found that the solution found is the cheapest 95%
of the time and corresponds to the best recommended solution 74% of the time. Consequently, the
developed system offers significant advantages when planning trips involving several cities,
allowing significant amounts of time and money to be saved.
Keywords
Flying Tourist Problem, Traveling Salesman Problem, Combinatorial Optimization, Ant Colony
Optimization, Simulated Annealing, Web Service.
Contents
1 Introduction
1.1 Motivation
1.2 Existing flight search services
1.2.1 User search tools
1.2.2 Developer search tools
1.3 Objectives
1.4 Main contributions
1.5 Document structure
2 Literature review on the Traveling Salesman Problem
2.1 Common Combinatorial Optimization Problems
2.1.1 Traveling Salesman Problem
2.1.1.A Problem definition
2.1.1.B Time Dependent TSP
2.1.1.C TSP with time-windows
2.1.1.D Multi-objective TSP
2.1.2 Vehicle Routing Problem
2.1.2.A Problem definition
2.1.2.B Time-dependent VRP
2.1.2.C Multi-objective VRP
2.2 Common Optimization Methods overview
2.2.1 Exact algorithms
2.2.1.A Integer Linear Programming
2.2.1.B Branch and Bound
2.2.2 Heuristic algorithms
2.2.2.A Held-Karp Lower Bound
2.2.2.B Tour construction
2.2.2.C Tour improvement
2.2.3 Meta-Heuristic algorithms
2.2.3.A Ant Colony Optimization
2.2.3.B Simulated Annealing
3 Flying Tourist Problem
3.1 Problem formulation
3.2 Relation to the Traveling Salesman Problem
3.3 Graph representation
3.4 Dimensional overview
3.5 Optimization methodology
3.5.1 Heuristic algorithms
3.5.1.A Pseudo-random construction procedure
3.5.1.B Nearest neighbour procedure
3.5.2 Metaheuristic algorithms
3.5.2.A Ant Colony Optimization procedure
3.5.2.B Simulated Annealing procedure
3.6 Summary
4 System Design and Implementation
4.1 System Architecture
4.1.1 Client Side Application
4.1.2 Server Side Application
4.1.2.A Data Management System
4.1.2.B Optimization system
4.2 Underlying technologies
4.2.1 Heroku
4.2.2 Node.js
4.2.3 Django
4.2.4 React and Redux
4.3 Client Side Application implementation
4.3.1 User Input
4.3.2 Communication with the SSA API
4.3.3 User Interface
4.4 Server Side Application implementation
4.4.1 SSA dataflow
4.4.2 SSA interface protocol
4.4.3 Data Management System
4.5 Optimization system
4.6 Summary
5 Experimental Results
5.1 Optimization module evaluation
5.1.1 Asymmetric TSP evaluation
5.1.2 Time-dependent TSP evaluation
5.1.3 Discussion
5.2 Flying Tourist Problem evaluation
5.2.1 Quantitative evaluation and improvement
5.2.2 Balancing the total flight price and duration
5.2.3 Impact of the trip start interval
5.2.4 Response time
5.3 Comparison with Kiwi's Nomad
5.3.1 Absolute comparison
5.3.2 Quantitative evaluation
6 Conclusions
6.1 Conclusions
6.2 Achievements
6.3 Future work
List of Figures
2.1 Illustration of the classes P, NP, NP-complete and NP-hard
2.2 The 2-opt local search reconnects two edges, hoping to fold possible crossovers, decreasing
the overall tour cost. In the left image, a crossover is identified. In the middle image, the edges
belonging to this crossover are removed, and in the image to the right, they are reconnected,
forming a new valid tour.
2.3 The crossing bridges can only be solved by reordering 4 edges. The resolution of this
problem with local search is only possible with 4-opt or higher.
2.4 The double bridge experiment. On the left, two bridges with the same length. Experimental
results show that ants distribute themselves evenly amongst both bridges. On the right, one of the
bridges is longer than the other. Experimental results show that ants use the shorter bridge more
often.
3.1 Illustration of a Flying Tourist Problem using a multipartite graph. Each node (A, B, C) is
associated with a waiting period of (1, 2, 3) time units, respectively. The red arrows represent a
possible solution to the problem.
3.2 Illustration of the distribution of the initial, final and transition arcs.
4.1 Structure and data flow of the proposed application.
4.2 Proposed User Interface layout for small/medium and large devices. There are two essential
views and one optional view.
4.3 Simplified illustration of the optimization system, which utilizes different algorithms to
produce a solution to a user defined request.
4.4 Technology stack used by the developed application.
4.5 Block diagram of the state cycle of the Client Side Application.
4.6 Building blocks of an application built using React and Redux.
4.7 Structure of the developed user interface
4.8 Application rendered on a desktop computer.
4.9 Application rendered on a mobile device.
4.10 Server Side Application dataflow
4.11 The Data Management System uses concurrent HTTP requests to communicate with third-party
flight APIs.
4.12 Flowchart of the implemented Ant Colony System procedure.
4.13 Flowchart of the implemented Simulated Annealing procedure.
5.1 Variation of the total flight price and duration when minimizing the entropy objective function.
5.2 Price improvement as a function of the trip start interval.
5.3 Required time to perform 100 HTTP requests to a third-party API, as a function of the number
of concurrent requests.
5.4 Total response time to a request, as a function of the number of nodes and length of start
period.
5.5 Comparison of the results provided by the proposed tool and by Kiwi's Nomad application.
5.6 Comparison of the recommended flights price and duration, as a function of the number of
nodes and the length of start interval. The presented values refer to the proposed application
response, and were normalized with respect to Kiwi's Nomad response value.
List of Tables
1.1 Search results across several applications. Search tools provided according to application
1.2 Multi-city search results and tools.
4.1 Parallelism between User Input, Actions and Reducers. To each user defined input corresponds
an action, declaring the intent of changing the state with some specific data, and a reducer, which
actually modifies the state.
4.2 Keyword/value pairs of the SSA protocol to uniquely identify a resource.
4.3 Algorithm specific parameters.
5.1 Performance of the ACO and SA on the asymmetric TSP benchmarks, taken from the TSPLib
(stop-criteria = 60 seconds).
5.2 Effects of increased optimization time on the final result.
5.3 Performance of the ACO and SA on the time-dependent TSP benchmarks (stop-criteria = 60
seconds).
5.4 Comparison of different Flying Tourist Problem solutions obtained with distinct algorithms
and optimization criteria, considering the Metric Nearest Neighbor approach (shaded gray) as
reference.
List of Algorithms
1 ACO metaheuristic
2 SA metaheuristic
Acronyms
TSP Traveling Salesman Problem
ATSP Asymmetric Traveling Salesman Problem
TDTSP Time-dependent Traveling Salesman Problem
ACO Ant Colony Optimization
DTS Direct Travel Supplier
API Application Programming Interface
FTP Flying Tourist Problem
COP Combinatorial Optimization Problem
VRP Vehicle Routing Problem
ILP Integer Linear Programming
BB Branch and Bound
SA Simulated Annealing
LK Lin-Kernighan algorithm
CSA Client-side application
SSA Server-side application
HTTP HyperText Transfer Protocol
AJAX Asynchronous JavaScript and XML
URI Uniform Resource Identifier
UI User Interface
JSON JavaScript Object Notation
OS Optimization System
DMS Data Management System
HTML HyperText Markup Language
DOM Document Object Model
OTA Online Travel Agency
OS Operating System
List of Symbols
The following list describes several symbols that are used later in the body of the document.
α The relative pheromone influence
∆f The objective function difference
∆avg The average objective function difference
η The inverse of artificial pheromone value
λ The cooling coefficient
Ω The set of constraints amongst the decision variables
Π A Combinatorial Optimization Problem (COP)
ρ The pheromone evaporation ratio
σ A permutation of the set of nodes
τ The artificial pheromone value
A The set of arcs
aij The arc transition from node i to node j
atij The arc transition from node i to node j at time unit t
cij The cost of transition of arc aij
ctij The cost of transition of the arc atij at time unit t
cwtk The cost of the waiting time at time k
D The set of durations of stay
f The objective function
G The graph of a COP
i, j Index components of a node
M The Markov chain length
m The number of artificial ants
p The flight duration (processing time)
py The probability of accepting a candidate solution y
q Pseudo-random value calculated at run-time
Q0 The exploration rate
S Search space of a COP
s A solution to a COP
s∗ An optimal solution
t The time index
T0 The allowable start period
tk Temperature of the state at time k
TW The set of time windows associated with each node
V The set of nodes
wc The relative weight of the flight price
wp The relative weight of flight duration
x, y Solutions to a COP
1 Introduction
Contents
1.1 Motivation
1.2 Existing flight search services
1.3 Objectives
1.4 Main contributions
1.5 Document structure
1.1 Motivation
Online Travel Agencies (OTA) are online applications that sell travel products, such as commercial
flight tickets. Although many consumers retain the option to buy flights directly from airline
companies, the majority opts to use OTA. The main reason is that these agencies aggregate flight
data from multiple airlines, instead of being limited to a single one, which ultimately increases
the options available to the consumer. Furthermore, many OTA work as meta-search engines, searching
over a variety of websites to find the best flights that satisfy the consumer's requirements.
However, while OTA usually provide very complete search functionalities for simple flights, the
majority fails to offer the same search options for a trip composed of multiple cities.
As an example, consider a trip starting and ending at a given city, which must visit every other
city in a particular list of cities. If there are no constraints on the order in which these cities
must be visited, this problem is well known in the scientific community as the Traveling Salesman
Problem (TSP). This problem is considered very difficult to solve, since the total number of
possible solutions grows factorially with the number of cities.
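The growth in the number of candidate tours can be illustrated with a brute-force search. The following is a minimal sketch, not a method used in this work: the function name `brute_force_tsp` and the 4-city cost matrix are hypothetical, chosen only for illustration. For a trip over n cities that starts and ends at a fixed city, the loop visits (n - 1)! orderings.

```python
from itertools import permutations
from math import factorial


def brute_force_tsp(cost, start=0):
    """Enumerate every tour that starts and ends at `start`.

    `cost[a][b]` is the cost of travelling from city a to city b
    (the matrix may be asymmetric). Returns (best_cost, best_tour).
    """
    others = [city for city in range(len(cost)) if city != start]
    best_cost, best_tour = float("inf"), None
    for order in permutations(others):  # (n - 1)! orderings
        tour = (start, *order, start)
        total = sum(cost[a][b] for a, b in zip(tour, tour[1:]))
        if total < best_cost:
            best_cost, best_tour = total, tour
    return best_cost, best_tour


# Hypothetical asymmetric costs between 4 cities (illustrative values only).
COST = [
    [0, 10, 15, 20],
    [5, 0, 9, 10],
    [6, 13, 0, 12],
    [8, 8, 9, 0],
]

best_cost, best_tour = brute_force_tsp(COST)
tours_examined = factorial(len(COST) - 1)
print(best_cost, best_tour, tours_examined)
```

With 4 cities only 6 tours are examined, but the same loop would need 362 880 iterations for 10 cities, which is why exhaustive enumeration quickly becomes impractical.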
However, when commercial flights are the means of transportation between every pair of cities, this
problem can no longer be considered the classic Traveling Salesman Problem (TSP), but rather its
time-dependent counterpart. This is because some of the major flight characteristics, such as price
and duration, cannot be considered constant over time. Rather, they depend on the particular flight
that is selected, and the characteristics of that flight may follow no apparent logical rule, at
least from the consumer's perspective.
Consequently, finding the most efficient set of flights, from the consumer's point of view, tends
to be a repetitive and time-consuming task. Faced with this problem, only the most persistent
consumer will be able to find the best solution, and only for a very small list of cities. As the
number of cities increases, even the most persistent consumer can hardly verify all possible
solutions. This means that, ultimately, the final consumer will pay more than necessary for the
requested service.
This thesis also arises as a response to the public contest called Traveling Salesman Challenge [1],
issued by Kiwi, a well-established OTA, even though the beginning of this work predates the
challenge. In this challenge, Kiwi recognizes that, in most cases, users do not care about the order
in which they visit a given list of cities. The challenge was set because most OTA do not offer this
type of search tool, due to the computational complexity associated with the problem, although
there is a market niche interested in this type of service.
Accordingly, this work addresses the problem of solving unconstrained multi-city trips, by studying
it and by developing the tools needed to solve it effectively and in a time-efficient manner. It
also aims to develop a proof-of-concept online flight search application that implements these
technologies in order to, ultimately, provide high-quality search for complex flight requests.
1.2 Existing flight search services
The tourism industry dates to the 19th century, but it was shaped by some significant technological
developments, which led to an increased and sustained growth of its market size. First, during the
1920s, the development of commercial aviation had a significant positive impact on the industry,
shifting the transportation focus to the airplane. Much later, during the 1990s, the establishment
of the internet changed the market, because airlines could sell directly to passengers [2]. More
recently, the widespread use of mobile phones led to a further increase in the market's size. In
2016, the direct contribution of the tourism industry to GDP was over 2.3 trillion dollars, while
the total contribution was over 7.6 trillion dollars [3].
The market size growth of the tourism industry is sustained by travel agencies, whose main function
is to serve as agents, advertising and selling products and services on behalf of others [4]. These
services usually include, but are not limited to, transportation, accommodation, insurance, tours
and other tourism-associated products. In recent years, Online Travel Agencies (OTA) became
particularly important to the industry, because they allow a fast, direct and rich interaction with
the user. Using only a cell phone, most people can, in a matter of minutes, search for and book a
flight, hotel services and, if necessary, even a rental car.
Most OTA operate as meta-search engines, that is, they perform a search across multiple independent
travel service providers. This is a significant difference from the services provided by Direct
Travel Suppliers (DTS), which are limited to offering their own services. Examples of DTS are
airlines, hotels and car rental companies, which usually sell only their own product or service
directly to the client. OTA, on the other hand, usually do not own any travel services, but serve
solely as an intermediary between the traveler and the travel service provider. Recent reports show
that OTA are increasing their market share, but direct travel suppliers still account for 57% of
the total online travel consumption [5].
Hence, this section presents a brief overview of the search tools provided by different OTA upon
the request of information regarding a single-flight, as well as round and multi-city trips. Since
the search tools provided by OTA may vary according to the entity making the request, this overview
is given from both the user's and the developer's point of view, in sections 1.2.1 and 1.2.2,
respectively.
1.2.1 User search tools
This section provides an overview of the search tools available for users to find and book
commercial flights online. It presents a brief survey of the services provided by different
meta-search engines, in particular Google Flights, Booking.com, Tripadvisor, Cheapflights.com,
Momondo, Expedia, Edreams, Kayak, Kiwi and Skyscanner. This overview is grouped according to the
trip size: single-flights and round-trips are considered together, and multi-city trips are
considered separately.
Table 1.1: Search results across several applications. Search tools provided according to application
Columns: Application; Date-range; Price overview; Multiple results; Map; Price [€] single-flight; Price [€] round-trip
Google Flights X X X 65 127
Expedia 81 167
Booking.com X X 67 126
Momondo X X 62 102
TripAdvisor 56 124
cheapflights.com 75 150
Adioso X X 69 137
Kiwi X X X X 76 224
SkyScanner X X 65 127
Edreams X X 56 116
To summarize the characteristics of the solutions presented by each of the tested applications,
the same flight queries were performed on each. In every application, three different queries were
performed, one for each type of flight search. First, a single-flight from Lisbon to Amsterdam on
8/6/2018. Second, a round-trip involving the same two cities and start date, but considering a
duration of stay of 3 days. Finally, the third query considers three different single-flights:
i) from Lisbon to Amsterdam (8/6/2018); ii) from Amsterdam to Berlin (11/6/2018); and iii) from
Berlin to Lisbon (14/6/2018). Given that different applications have access to different flight
data, each application is expected to present different results. This overview intends to analyze
and quantify these differences. It is also worth noting that a round-trip is not necessarily
composed of two flights from the same airline but may, instead, be composed of two completely
independent flights. In general, a flight search application will consider both cases and suggest
the best option.
The results of the single-flight and round-trip queries are presented in table 1.1. The first
column of this table indicates the application in which the previously described queries were
performed. The second column denotes the ability of that application to search over multiple start
dates, instead of forcing a single one. The third column indicates whether the application presents
an overview of the prices for different start dates, usually in the form of a visual aid, such as a
bar chart. The fourth column indicates whether the application responds to the query with different
results for different objective functions, such as the cheapest, the fastest and the recommended set
of flights. The fifth column indicates whether the application integrates a map in its user
interface. Finally, the sixth column indicates the price, in euros, of the single-flight query,
while the seventh and last column indicates the price of the round-trip query.
Given the considered single-flight and round-trip queries, which were performed on all applications,
and the results presented in table 1.1, it is possible to conclude that:
• different applications provide significantly different results:
– Single-flight prices vary up to 44%;
– Round-trip prices vary up to 120%;
• only half of the applications provide date range support;
• only half of the applications provide multiple results for different objective functions;
• some applications present an overview of the different prices as a function of time;
• some applications present an interactive map.
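The price variations quoted in the list above can be reproduced from the prices in table 1.1. The sketch below assumes that the variation is measured as (max - min) / min; this definition is an assumption, since the text does not state the formula, and the function name `relative_spread` is hypothetical.

```python
def relative_spread(prices):
    """Relative difference between the highest and lowest price.

    Assumed definition: (max - min) / min.
    """
    lowest, highest = min(prices), max(prices)
    return (highest - lowest) / lowest


# Prices in euros, copied from table 1.1 in row order.
SINGLE_FLIGHT = [65, 81, 67, 62, 56, 75, 69, 76, 65, 56]
ROUND_TRIP = [127, 167, 126, 102, 124, 150, 137, 224, 127, 116]

single_spread = relative_spread(SINGLE_FLIGHT)  # (81 - 56) / 56
round_spread = relative_spread(ROUND_TRIP)      # (224 - 102) / 102

print(f"single-flight: {single_spread:.1%}")
print(f"round-trip: {round_spread:.1%}")
```

Under this assumed definition the two spreads come out at roughly 44.6% and 119.6%, in line with the "up to 44%" and "up to 120%" figures reported above.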
The results of the multi-city query are presented in table 1.2. Columns 3-6 present an overview of
the search tools available on the respective applications, as previously described. The seventh and
final column indicates the price, in euros, of the multi-city flight request. The second column
indicates the application's ability to perform unconstrained multi-city flight requests. Note that
when this set of tests was executed, no application provided this tool, but during the development
of this work, Kiwi launched a service (named Nomad) which addresses this problem. As far as we
know, this is the only application that provides such a service at the present time.
In fact, while the majority of the applications present a multi-city flight search option, this
option usually forces the user to constrain the query to a predefined route and schedule. However,
if the requested trip involves visiting n different cities, then there are n! different routes to
visit all cities. Moreover, if the schedule of the trip considers an extended start period of d
days, then there are n! · d combinations of different routes and schedules for the trip. Hence, by
forcing the user to a predefined route and schedule, the application is effectively searching for
only 1 of the n! · d different available solutions to the request.
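To give a sense of scale, the count of n! · d route and schedule combinations can be evaluated for a few request sizes. This is a minimal sketch; the function name `search_space_size` and the sample values of n and d are illustrative assumptions, not figures from this work.

```python
from math import factorial


def search_space_size(n_cities: int, start_days: int) -> int:
    """Number of (route, start day) combinations: n! * d."""
    if n_cities < 0 or start_days < 1:
        raise ValueError("n_cities must be >= 0 and start_days >= 1")
    return factorial(n_cities) * start_days


# Illustrative request sizes with a one-week start period (d = 7).
for n in (3, 5, 10):
    print(n, search_space_size(n, 7))
```

For a one-week start period, 3 cities give 42 combinations, 5 cities give 840, and 10 cities already give more than 25 million, of which a constrained multi-city search inspects exactly one.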
Given the considered multi-city queries, which were performed on all applications, and the results
presented in table 1.2, it is possible to conclude that:
• different applications provide significantly different results, with prices varying up to 135%;
• most applications only provide constrained multi-city trip search utilities;
• only one application provides unconstrained multi-city trip search utilities;
• most applications do not present different results for different search criteria;
• no application offers the ability to perform multi-city trips on a range of start dates;
• no application displays an interactive map to help the user.
1.2.2 Developer search tools
From the perspective of the developer who intends to create a flight search engine, there are several
OTAs that provide access to their Application Program Interface (API), allowing direct access to flight
data. These APIs usually offer the possibility of searching for cached and, sometimes, real-time flight
data. In some cases, these APIs also extend their range of services, and include endpoints for the
access of hotel information, car-rental and railroad services, and even cruise ship itineraries [6,7].

Table 1.2: Multi-city search results and tools.
Columns: Application; Flexible multi-city; Date-range; Price overview; Multiple results; Map; Price [€] multi-city
Google Flights 184
Expedia 420
Booking.com 191
Momondo X 178
TripAdvisor X 238
cheapflights.com -
Adioso -
Kiwi X X 226
SkyScanner X 238
Edreams 333
An API may be classified as public, limited or private according to the restrictions it imposes on its
access. Private APIs are those which are only accessible to enterprises. An example of a company
whose API is private is Skyscanner. In turn, Google offers a service which can be classified as
limited, since it only offers up to 50 free queries per day. Finally, there are other APIs whose access is
completely free and unlimited, and which are therefore classified as public. Kiwi is an example of a
company that provides a public API.
Given the academic purpose of this work, the selection of a flight API is restricted to those that
are public. Unfortunately, the number of public flight APIs is very small. Among the 10 companies
enumerated in the previous section, only three have public APIs: Expedia, Amadeus and Kiwi. The first
two companies operate mostly in North America and their flight data is mostly restricted to that continent.
In turn, Kiwi is a European company, but does not limit its services to this continent. Instead, its API
operates as a meta-search engine, aggregating data from different content providers.
As a consequence, during the development of this work, the necessary flight data will be provided by
the Kiwi public API. The major search utilities that this particular API offers are [8]:
• access to relevant information about cities, airports and airline companies;
• access to single-flight and round-trip information, with flexible queries that include:
– flexible start dates;
– flexible durations of stay;
– queries with undefined destination;
– queries with multiple origins and destinations;
• the possibility to aggregate up to 9 single-flight and round-trip queries in a single request.
It is worth noting that, although Kiwi (and other companies) enable the query of multiple flights at
once, each flight must specify a particular pair of cities and date. Thus, this corresponds to a constrained
multi-city search, as discussed in the previous section, and does not actually provide information regard-
ing the best route, schedule or set of flights for an unconstrained multi-city trip.
1.3 Objectives
This thesis has two main goals:
• The study of the unconstrained multi-city flight routing problem, and the development of an effective
optimization strategy to address this problem;
• The development of a web application prototype, that integrates a working solution to the afore-
mentioned problem.
The first goal of this work can be accomplished by studying similar routing problems, such as the
Traveling Salesman Problem. This should provide the theoretical background that is necessary to better
understand the problem, and to develop and implement the optimization algorithms required to
solve it. Finally, to address the second goal, it is necessary to develop the tools required to create a
web application, and to integrate it with the optimization solution devised for the resolution of the problem.
1.4 Main contributions
The main contributions of this work are included in the following article, which will be submitted for
publication in the journal Expert Systems with Applications, from Springer:
Rafael Marques, Luís Russo, and Nuno Roma, "Flying Tourist Problem: flight time and cost
minimization in complex routes", submitted to Expert Systems with Applications, Springer, October 2018.
1.5 Document structure
The presented work can be divided into two main parts: the study of the unconstrained multi-city routing
problem, and the implementation of a web application that integrates a solution to this problem. Thus,
this document is structured as follows.
Chapter 2 presents a literature review on problems that are closely related to the one under study, in
particular, the Traveling Salesman Problem. Chapter 3 presents a formal definition of the problem and
defines the optimization strategies that will be implemented to solve it. Chapter 4 presents the develop-
ment and implementation details of a web application that allows the resolution of unconstrained multi-
city flight requests. This is achieved by implementing the previously mentioned optimization strategies
to solve the user requests. Chapter 5 evaluates the implemented system, by analyzing the performance
of its different parts, and the utility of the developed solution as a whole. Finally, Chapter 6 presents the
main conclusions of this work, and addresses the future work that may improve the developed solution
with further utilities.
2 Literature review on the Traveling Salesman Problem
Contents
2.1 Common Combinatorial Optimization Problems
2.2 Common Optimization Methods overview
The problem that this work wishes to address, and which will be formally introduced as the Flying
Tourist Problem (FTP) in chapter 3, is closely related to the Traveling Salesman Problem (TSP). Both
of these problems belong to broader classes of optimization problems, particularly, Combinatorial Opti-
mization Problems (COPs) and Routing problems. This chapter will start by introducing the concept of
Combinatorial Optimization Problems in section 2.1 and, in particular, the Traveling Salesman Problem
and the Vehicle Routing Problem in subsections 2.1.1 and 2.1.2, respectively. This is followed by an
overview of the most common methods to solve the Traveling Salesman Problem, presented in section
2.2.
2.1 Common Combinatorial Optimization Problems
Lawler (1976) [9] defined combinatorial analysis as "the mathematical study of the arrangement,
grouping, ordering, or selection of discrete objects, usually finite in number". Schrijver (2002) [10],
building on Lawler's definition, adds the important concept of an optimal solution: "Combinatorial
optimization searches for an optimum object in a finite collection of objects". This definition is
followed by a remark, stating that "typically, the collection (...) grows exponentially in the size of
the representation", and concluding that "scanning all objects one by one and selecting the best is
not an option".
Following a more concise definition [11], a combinatorial optimization problem (COP) may be defined
as follows:
Definition 1) A combinatorial optimization model P = (S,Ω, f) consists of:
1. a search space S, defined by a finite set of decision variables, each with a domain;
2. a set Ω of constraints amongst the decision variables;
3. an objective function $f : S \to \mathbb{R}_0^+$, to be minimized.
The search space is defined by a set of decision variables $X_i$, $i = 1, \dots, n$, each associated with a
domain $D_i$, which specifies the possible values of each decision variable. An instantiation of a variable
is an assignment of a value $v_i^j \in D_i$ to a variable $X_i$. This leads to the definition of a feasible
solution $s \in S$, which corresponds to the assignment of a value to each decision variable, according to
its domain, in such a way that all constraints in $\Omega$ are satisfied. Finally, the objective of the
problem is to find a global minimum of $P$, that is, a solution $s^* \in S$ such that $f(s^*) \le f(s)$,
$\forall s \in S$. The set of all global minima is denoted by $S^* \subseteq S$.
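The definition above can be illustrated with a small exhaustive-search sketch. The function `solve_cop` and the toy instance are hypothetical and serve only to make the roles of the domains, the constraint set and the objective function concrete. Enumerating the whole search space is only viable for very small instances, in line with the remark quoted from Schrijver.

```python
from itertools import product


def solve_cop(domains, constraints, objective):
    """Enumerate the search space and return (best_solution, best_value).

    domains     -- list of iterables, one domain D_i per decision variable X_i
    constraints -- list of predicates over a full assignment (the set Omega)
    objective   -- function f mapping an assignment to a non-negative number
    """
    best, best_value = None, float("inf")
    for s in product(*domains):                      # every full assignment
        if all(check(s) for check in constraints):   # keep feasible solutions only
            value = objective(s)
            if value < best_value:
                best, best_value = s, value
    return best, best_value


# Toy instance (assumption): three variables with domain {0, 1, 2}, all values
# must differ, and a weighted sum of the assigned values is minimized.
domains = [range(3)] * 3
constraints = [lambda s: len(set(s)) == len(s)]
objective = lambda s: 3 * s[0] + 2 * s[1] + 1 * s[2]

best, value = solve_cop(domains, constraints, objective)
print(best, value)
```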
When working on combinatorial optimization problems, it is useful to have an idea of how difficult the
problem is. This characterization is provided by the field of computational complexity. A COP, Π, is said
to have worst-case time complexity O(g(n)) if the best algorithm for solving Π finds an optimal solution
to any instance of size n of Π in a computation time upper bounded by g(n).
A problem Π is said to be solvable in polynomial time if the maximum amount of computing time that
is necessary to solve any instance of size n is bounded by a polynomial in n. If k is the largest exponent
of such a polynomial, then the combinatorial optimization problem is said to be solvable in $O(n^k)$ time.
Hence, a polynomial time algorithm is characterized by a computation time bounded by O(p(n)),
for some polynomial function p, where n is the size of the problem instance. An algorithm whose
computation time cannot be bounded by a polynomial function is called an exponential time
algorithm. Any problem that can be solved in polynomial time is said to be tractable, while problems that
are not solvable in polynomial time are called intractable.
In the field of computational complexity, there is also another important concept called polynomial
time reduction, which transforms a problem into another problem, in polynomial time. If the latter is
solvable in polynomial time, then so is the former. The class of problems which are solvable in polynomial
time is called P. On the other hand, there is a class of problems called NP, which stands for
non-deterministic polynomial time, for which a given solution can be verified in polynomial
time. The class NP-complete refers to the most difficult problems in NP.
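As an illustration of what verifying a solution in polynomial time means, consider the decision version of a routing problem: "is there a tour that visits every node exactly once with total cost at most a given bound?". The sketch below checks a proposed tour against such a bound. The function name `verify_tour` and the 4-node cost matrix are assumptions introduced only for this example.

```python
def verify_tour(cost, tour, bound):
    """Polynomial-time check of a certificate for the tour decision question.

    cost  -- n x n matrix of arc costs
    tour  -- proposed visiting order (the certificate)
    bound -- cost threshold of the decision question
    """
    n = len(cost)
    # The certificate must list every node exactly once.
    if sorted(tour) != list(range(n)):
        return False
    # Sum the arc costs along the cycle, including the closing arc.
    total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= bound


# Hypothetical 4-node instance, used only for illustration.
cost = [[0, 1, 4, 2],
        [1, 0, 3, 5],
        [4, 3, 0, 1],
        [2, 5, 1, 0]]
print(verify_tour(cost, [0, 1, 2, 3], bound=7))
```

Checking a certificate is cheap, whereas no comparably cheap procedure is known for finding one.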
It is worth mentioning that there exists another class, called NP-hard, for which each problem is at
least as hard as the hardest NP-complete problem. More precisely, a problem H is NP-hard when every
problem in NP can be reduced to H using a polynomial time transformation. This definition of the NP-hard
class leads to the logical conclusion that finding a polynomial time algorithm for any problem in NP-hard
would imply the resolution of all NP problems in polynomial time. However, until now, no polynomial
time algorithm has been found for any NP-hard problem. Note that the class NP-hard does not necessarily
coincide with the class NP.
In this context, there is an ongoing discussion in the scientific community about the question
"P = NP?", which is one of the major unsolved problems in computer science. Figure 2.1 illustrates this
discussion by depicting the several classes, according to both possible answers to the aforementioned
question.
2.1.1 Traveling Salesman Problem
Given a list of cities and the distances between them, the Traveling Salesman Problem (TSP) is the
combinatorial optimization problem of finding a minimum length route which connects every city. With
this original definition, proposed by A. Punnen [12], the focus of the TSP is to perform optimization on
routing problems, such as the school bus problem, studied by Merrill Flood in 1942 [13], minimizing the
total distance of a tour. Some variations of the original formulation allow the adaptation of the problem
to suit different optimization goals [14]. For example, instead of distance, the focus may be the
minimization of the total cost, travel time, or some other attribute associated with the problem under
consideration. It is also possible to search for a route which minimizes two, or more, objective
functions at once [15].
Figure 2.1: Illustration of the classes P, NP, NP-complete and NP-hard
In some routing problems, the tour under consideration must satisfy some constraints [16]. Most often,
these constraints refer to scheduling conflicts which must be satisfied [17]. A practical application of this
is the resolution of routing problems with time windows [18].
When stated in the field of graph theory, the TSP is the problem of finding a minimum cost
Hamiltonian cycle over a complete, undirected, weighted graph [19]. The problem of finding a minimum cost
Hamiltonian cycle was shown to be NP-complete [20]. This implies the NP-hardness of the TSP [21].
Furthermore, in several graph problems, assuming a symmetric cost between two points is not
suitable. The variant without this assumption is known as the asymmetric TSP, and it considers a
directed graph instead [22].
In the flight industry, the Traveling Salesman Problem has many applications. As an example, it was
applied to aircraft scheduling in the terminal area, enabling an increase in airport capacity [23]. More
recently, the TSP, and its time dependent variation, have been a focus of attention in fields related to
Unmanned Aerial Vehicle routing [24, 25]. There is also a website which introduces the Air Traveling
Salesman, whose goal is to find the layover airports when no direct route is available [26].
In some cases, the classic formulation of the Traveling Salesman Problem does not adequately describe
the characteristics of the problem under consideration. To overcome this, different problem formulations
are considered. An example of this is the time dependent TSP [27]. In this formulation, the cost of each
arc is not constant, but varies as a function of time. In general, this problem is harder to solve than
the classic TSP [28]. There are several other combinatorial optimization problems which benefit from
considering a time dependent approach [29]. Vehicle routing is a field which particularly focuses on this
problem, due to the characteristics of street traffic [30]. In fact, the Traveling Salesman Problem can be
regarded as a special case of the Vehicle Routing Problem, in which the fleet is composed of only one
vehicle, the salesman [30]. Because of this, works related to the Vehicle Routing Problem may also be
relevant for the resolution of the Traveling Salesman Problem.
In their day-to-day activities, many people are faced with similar routing problems.
Consider the problem of walking or driving from point A to point B. This is a graph problem, in which
the arcs are the streets and the nodes are the street intersections [31]. In turn, the weights refer to
the distance or travel time, which may itself be affected by other parameters, such as traffic [27]. If
the person is familiar with the graph, they are capable of finding a good route mentally, very quickly
[32]. If the undertaken route is to visit a set of points exactly once, before returning to the original
starting point, this is known as the Traveling Salesman Problem.
This section is structured as follows. First, we formally define the classic Traveling Salesman Prob-
lem, in section 2.1.1.A, and present some of its most common variations. This is followed by a more
in-depth overview of the time-dependent TSP, in section 2.1.1.B, as this problem is particularly relevant
for the work under development. For the same reason, the following subsections 2.1.1.C and 2.1.1.D
present the TSP with time-windows and with multiple objectives, respectively.
2.1.1.A Problem definition
As previously mentioned, the Traveling Salesman Problem may be defined both as a combinatorial
optimization problem and as a graph problem. In either case, the TSP is defined by a graph $G = (N, A)$,
where $N$ is the set of nodes, and $A$ is the set of arcs connecting those nodes. The set of nodes is of
size $n = |N|$, while the size of the set of arcs is $n^2$. Each arc $a_{ij} \in A$ has an associated
weight, $c_{ij}$, which represents, for example, the distance between cities. The graph is fully
connected, that is, each node is capable of directly reaching any other node, without visiting a third
node. When two nodes cannot be connected by an arc, the cost of the corresponding arc is considered to be
very high. In the classical TSP formulation, the graph is undirected, which implies symmetry in the costs
of the arcs, that is, $c_{ij} = c_{ji}$, $\forall i, j \in N$. Because of the characteristics of this TSP
formulation, the graph is said to be connected, weighted and undirected.
The objective of the TSP is to find the minimum cost Hamiltonian cycle, that is, a path which visits
each node exactly once and returns to the initial node, closing the path. A generic solution to the TSP
is any permutation $\sigma = (\sigma_1, \dots, \sigma_n)$ over the set of nodes $N$, where $\sigma_i$,
$i \in \{1, \dots, n\}$, represents the node in the $i$-th position of the cycle. The cost of a cycle is
given by the sum of the weights of the arcs of which it is composed, that is,
$$C(\sigma) = \sum_{i=1}^{n} c_{\sigma_i \sigma_{i+1}}, \quad \text{where } \sigma_{n+1} = \sigma_1.$$
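The cost of a cycle can be computed directly from the weight matrix. The sketch below evaluates it for a given permutation and, for very small instances only, finds a minimum cost cycle by enumerating all permutations. The names `tour_cost` and `brute_force_tsp` and the 4-node matrix are assumptions made for illustration, and the exhaustive search is not one of the optimization methods discussed in this work.

```python
from itertools import permutations


def tour_cost(c, sigma):
    """Sum of arc weights along the cycle sigma, including the closing arc."""
    n = len(sigma)
    return sum(c[sigma[i]][sigma[(i + 1) % n]] for i in range(n))


def brute_force_tsp(c):
    """Exhaustive search over all cycles; only viable for a handful of nodes."""
    n = len(c)
    # Fixing node 0 as the first node avoids enumerating rotations of a cycle.
    tours = ((0,) + p for p in permutations(range(1, n)))
    best = min(tours, key=lambda t: tour_cost(c, t))
    return best, tour_cost(c, best)


# Hypothetical symmetric 4-node instance.
c = [[0, 1, 4, 2],
     [1, 0, 3, 5],
     [4, 3, 0, 1],
     [2, 5, 1, 0]]
best_tour, best_cost = brute_force_tsp(c)
print(best_tour, best_cost)
```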
The TSP may solve different types of problems by optimizing different parameters. In the classic
formulation, the weight of an arc connecting two nodes represents the distance between two cities.
However, the weight of an arc can represent different things, particularly travel time or travel cost.
Although changing the parameter may leave the TSP formulation intact, in some cases it changes the
problem. This occurs, for example, when considering that the costs are time-dependent, which will be
addressed in section 2.1.1.B.
Up until now, only the symmetric Traveling Salesman Problem was addressed. However, the Traveling
Salesman Problem usually refers to a broader class of problems, which includes, but is not limited to,
the symmetric case. These problems are often called variations of the symmetric TSP. Below, the
most common variations will be presented briefly, by providing a description for the asymmetric, metric,
Euclidean and bottleneck TSP, as well as the messenger problem. Other TSP variations, such as the
time-dependent one, will be discussed in their own subsections, as they are particularly relevant for the
work under development. The definitions provided here follow those formalized by [33].
Asymmetric TSP
In the Asymmetric Traveling Salesman Problem (ATSP), the weight matrix associated with the problem
is not symmetric. That is, there is no constraint imposing that $c_{ij} = c_{ji}$, $\forall i, j \in N$,
$i \neq j$, as happens with the classical TSP.
For some particular real world problems, the ATSP may describe the problem more adequately
than the TSP. For example, when considering a routing problem over a city, some roads may not be
traversable in both directions. In this case, the weight of an arc connecting two points is different,
depending on the direction of traversal of the arc.
Metric TSP
The metric TSP is a special case of the TSP, in which the arc costs, in addition to being symmetric,
also respect the triangle inequality. That is, $c_{ij} \le c_{ik} + c_{kj}$, $\forall i, j, k \in N$.
Euclidean TSP
In the Euclidean TSP, the set of nodes is placed in a $d$-dimensional space, and the weight of each
arc is given by the Euclidean distance. For two nodes $i$ and $j$ located at points
$x = (x_1, x_2, \dots, x_d)$ and $y = (y_1, y_2, \dots, y_d)$, respectively, this distance is calculated
based on equation 2.1.
$$d_{ij} = \left( \sum_{k=1}^{d} (x_k - y_k)^2 \right)^{1/2} \tag{2.1}$$
The euclidean TSP is a variation which is both symmetric and metric.
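The symmetric and metric properties can be checked numerically on a small instance. The sketch below builds the weight matrix of a Euclidean instance and tests both properties. The helper names and the three sample points are assumptions introduced for this example, and `math.dist` requires Python 3.8 or later.

```python
from math import dist


def euclidean_matrix(points):
    """Weight matrix whose entry (i, j) is the Euclidean distance between points i and j."""
    return [[dist(p, q) for q in points] for p in points]


def is_symmetric(c, tol=1e-9):
    """True if c[i][j] equals c[j][i] for every pair of nodes."""
    n = len(c)
    return all(abs(c[i][j] - c[j][i]) <= tol for i in range(n) for j in range(n))


def satisfies_triangle_inequality(c, tol=1e-9):
    """True if c[i][j] <= c[i][k] + c[k][j] for every triple of nodes."""
    n = len(c)
    return all(c[i][j] <= c[i][k] + c[k][j] + tol
               for i in range(n) for j in range(n) for k in range(n))


# Three hypothetical cities in the plane (d = 2).
points = [(0, 0), (3, 4), (6, 0)]
c = euclidean_matrix(points)
print(is_symmetric(c), satisfies_triangle_inequality(c))
```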
Bottleneck TSP
In the Bottleneck TSP, the objective is to find a valid route which minimizes the cost of the highest
cost arc of the tour. According to the characteristic of the graph, the Bottleneck TSP may either be
symmetric, asymmetric, metric or time-dependent.
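The difference with respect to the classical objective lies only in how a tour is scored: the sum of the arc weights is replaced by their maximum. A minimal sketch follows, using exhaustive enumeration on a hypothetical 4-node instance. The function names are assumptions made for illustration.

```python
from itertools import permutations


def bottleneck_cost(c, tour):
    """Weight of the most expensive arc of the cycle, including the closing arc."""
    n = len(tour)
    return max(c[tour[i]][tour[(i + 1) % n]] for i in range(n))


def brute_force_bottleneck_tsp(c):
    """Exhaustive search for a cycle whose largest arc weight is minimal."""
    n = len(c)
    tours = ((0,) + p for p in permutations(range(1, n)))
    best = min(tours, key=lambda t: bottleneck_cost(c, t))
    return best, bottleneck_cost(c, best)


# Hypothetical symmetric 4-node instance.
c = [[0, 1, 4, 2],
     [1, 0, 3, 5],
     [4, 3, 0, 1],
     [2, 5, 1, 0]]
tour, value = brute_force_bottleneck_tsp(c)
print(tour, value)
```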
The Messenger Problem
The Messenger problem, also known as the wandering salesman problem, is the problem of finding
a minimum cost Hamiltonian path connecting nodes $u$ and $v$ of the graph $G$. It can be seen as a
Traveling Salesman Problem in which the tour is not closed, but ends on a specific node, different from
the initial one. The Messenger problem can be transformed into the TSP by considering a cost of $-M$ for
the arc $(v, u)$, where $M$ is a large number. If the nodes $u$ and $v$ are not specified, and one wishes
to find a minimum cost Hamiltonian path in $G$, this can be achieved by a graph transformation, adding one
node and connecting it to all other nodes by arcs of cost $-M$. The optimal solution to the TSP on this
modified graph can be used to produce the optimal solution to the original problem.
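The first transformation can be sketched as follows, under stated assumptions: the arc costs are non-negative, the TSP on the modified graph is solved by exhaustive enumeration (which is only viable for very small instances), and M is taken as one plus the sum of all arc costs, which is sufficient to force the arc (v, u) into the optimal cycle. The function names and the sample matrix are hypothetical.

```python
from itertools import permutations


def tour_cost(c, tour):
    """Cost of the closed cycle described by `tour`."""
    n = len(tour)
    return sum(c[tour[i]][tour[(i + 1) % n]] for i in range(n))


def messenger_via_tsp(c, u, v):
    """Minimum cost Hamiltonian path from u to v, obtained through a TSP."""
    n = len(c)
    big_m = 1 + sum(abs(x) for row in c for x in row)   # assumed "large" M
    modified = [row[:] for row in c]
    modified[v][u] = -big_m                              # arc (v, u) gets cost -M
    tours = ((0,) + p for p in permutations(range(1, n)))
    best = min(tours, key=lambda t: tour_cost(modified, t))
    # The optimal cycle contains the arc v -> u, so rotating it to start at u
    # yields a path that ends at v.
    k = best.index(u)
    path = best[k:] + best[:k]
    path_cost = sum(c[path[i]][path[i + 1]] for i in range(n - 1))
    return path, path_cost


# Hypothetical 4-node instance with non-negative costs.
c = [[0, 1, 4, 2],
     [1, 0, 3, 5],
     [4, 3, 0, 1],
     [2, 5, 1, 0]]
print(messenger_via_tsp(c, u=0, v=2))
```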
2.1.1.B Time Dependent TSP
The Time-dependent Traveling Salesman Problem (TDTSP) is a generalization of the TSP, where arc
costs depend on their position in the tour [34, 35]. This section first introduces the TDTSP as a graph
problem, followed by its definition as a sequencing problem.
TDTSP as a graph problem
Let $N = \{1, 2, \dots, n\}$ and let $N_0 = N \cup \{0\}$. The TDTSP on a complete graph $K(N_0)$ can be
modeled as an optimization problem over a layered graph $(V, A)$. $V$ is the set composed of the source
node $0$, the termination node $T$, and intermediate nodes $n_{i,t}$ for $i, t \in N$. In this
representation of the intermediate nodes, the first index of $n_{i,t}$ identifies the node of the graph
$K(N)$, and the second index represents the position of the node $i$ in the path between nodes $0$ and
$T$. In turn, $A$ is the set of arcs connecting the nodes. This set is composed of initiation,
intermediate, and termination arcs. For $i \in N$, $a^0_{0,i}$ denotes an initiation arc from node $0$ to
node $n_{i,1}$, and $a^n_{i,T}$ denotes a termination arc from node $n_{i,n}$ to node $T$. Given
$i, j \in N$ such that $i \neq j$, and $1 \le t \le n - 1$, $a^t_{i,j}$ denotes an intermediate arc from
node $n_{i,t}$ to node $n_{j,t+1}$. The superscript of an arc represents its layer, that is, the position
in which it occurs in the path or, in other words, the time of the arc traversal, if we consider 1 time
unit for each of these traversals.
When working on the TDTSP, it is often convenient to define $G(n)$ as the subgraph of $(V, A)$ induced
by $V \setminus \{0, T\}$. This way, $G(n)$ has the $n^2$ nodes $\{n_{i,t} : i, t \in N\}$ and all the
$n(n-1)^2$ intermediate arcs of $A$. A path with $n$ nodes in $G(n)$ is of the form
$\{(v_t, t) : v_t \in N, 1 \le t \le n\}$. Since consecutive nodes are in consecutive layers, the path can
be described by an ordered array $(v_t : t \in N)$. This path can be extended to a $0$-$T$ path of
$(V, A)$ by appending nodes $0$ and $T$ to the beginning and end of the path, respectively.
The classical TSP and its time-dependent variation share the same objective, which is to find the
minimum cost Hamiltonian cycle over graph (V,A). Another property they share is the possibility of
working over a symmetric or asymmetric problem.
TDTSP as one-machine sequencing problem
In operation scheduling, the time dependent TSP can also be stated as a one-machine sequencing
problem [35]. Consider a set of $n$ jobs, $J_1, \dots, J_n$, to be executed on a single machine. Each job
has a setup cost $C^t_{i,j}$, occurring when job $J_i$, processed in the $t$-th time unit, is followed by
job $J_j$, processed in the $(t+1)$-th time unit. Consider that each job completion takes exactly one time
unit. The machine is in some given initial state, denoted by 0, before the job processing begins. As
happens with the classical TSP, the machine has to be returned to its original state after the job
processing ends. The problem is to find a sequence, $J_{w(1)}, \dots, J_{w(n)}$, that minimizes the total
setup cost $C(w)$, defined by:
$$C(w) = C^0_{0,w(1)} + \sum_{i=1}^{n-1} C^i_{w(i),w(i+1)} + C^n_{w(n),0} \tag{2.2}$$
It is important to note some characteristics of this formulation. First, problems with unspecified
initial/final state can be formulated in the same way using 0 as the initiation/termination cost, that
is, $C^0_{0,w(1)} = 0$ and $C^n_{w(n),0} = 0$. Secondly, particular choices of the setup costs reduce the
defined problem to the classical TSP and to the classical Assignment Problem [36]. The first case is
achieved by considering that setup costs are not time-dependent, that is, $C^t_{i,j} = C_{i,j}$. The
latter is accomplished by considering that setup costs do not depend on the second/first job, that is,
$C^t_{i,j} = C^t_i$.
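Equation 2.2 translates almost directly into code. In the sketch below, the setup costs are stored in a nested list `C[t][i][j]`, with index 0 standing for the initial machine state and jobs numbered from 1 to n. This storage layout, the function name `tdtsp_cost` and the sample data are assumptions made for illustration. The example also shows the reduction to the classical TSP cost when the same matrix is used in every time unit.

```python
def tdtsp_cost(C, w):
    """Total setup cost of the job sequence w, following equation 2.2.

    C -- C[t][i][j] is the setup cost when job i, processed in time unit t,
         is followed by job j (index 0 is the initial/final machine state)
    w -- sequence of jobs, a permutation of 1..n
    """
    n = len(w)
    total = C[0][0][w[0]]                    # initiation term
    for i in range(1, n):                    # intermediate terms, i = 1..n-1
        total += C[i][w[i - 1]][w[i]]
    total += C[n][w[n - 1]][0]               # termination term
    return total


# Hypothetical data: one state (index 0) plus n = 3 jobs.
base = [[0, 1, 4, 2],
        [1, 0, 3, 5],
        [4, 3, 0, 1],
        [2, 5, 1, 0]]
# Time-independent costs: the same matrix in every time unit (classical TSP).
static_costs = [base] * 4
# Time-dependent costs: every arc becomes more expensive as time advances.
scaled_costs = [[[(t + 1) * x for x in row] for row in base] for t in range(4)]

print(tdtsp_cost(static_costs, [1, 2, 3]), tdtsp_cost(scaled_costs, [1, 2, 3]))
```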
2.1.1.C TSP with time-windows
The Traveling Salesman Problem with Time Windows (TSPTW) is a generalization of the TSP, in which
the objective is to find a minimum cost Hamiltonian cycle which visits every city in its requested time
window. This problem has important applications in the field of routing and scheduling. It is also
particularly relevant for the Vehicle Routing Problem, as these constraints may be imposed by customers
whose operation hours are limited to a time window [37]. Being a generalization of the TSP, the TSPTW
is also NP-complete [38]. This section introduces two definitions, one for the asymmetric TSP with
time-windows [39], and a second one for the time-dependent variation [38] of this problem.
Consider a complete digraph $G = (N, A)$, where $N$ is the set of $n = |N|$ nodes, and $A$ the set of
arcs, each arc $a_{ij} \in A$ being associated with a non-negative cost $c_{ij}$ and a non-negative setup
time $t_{ij}$. The nodes correspond to jobs to be processed (as described in the single-machine
sequencing problem), and arcs correspond to job transitions, where the setup times, $t_{ij}$, define
the changeover time needed to process node $j$ immediately after node $i$. Each node $i \in N$ has an
associated processing time $p_i \ge 0$, a release date $r_i \ge 0$ and a deadline $d_i \ge 0$, where the
release date and deadline denote, respectively, the earliest and latest possible starting time for the
processing of node $i$. The minimal time delay for processing node $j$ immediately after node $i$ is given
by $v_{ij} = p_i + t_{ij}$. The interval $[r_i, d_i]$ is called the time window for node $i$. The time
window is said to be relaxed if $r_i = 0$ and $d_i \to +\infty$. On the contrary, a time window is called
active if $r_i > 0$ and $d_i < +\infty$. It is possible to reach a node $i \in N$ at a time
$t \in \mathbb{Z}^+ \cup \{0\}$, sooner than its release date $r_i$. In this case, it will undergo
a waiting time $r_i - t$, before leaving node $i$ at time $r_i$.
When dealing with routing problems with time-windows, it is often necessary to define whether the
time-windows are hard or soft constraints. Hard time windows consider that a node $i \in N$ cannot be
visited after its deadline $d_i$. On the contrary, when considering soft constraints, the node $i$ might
be visited after the deadline $d_i$, but, in this case, a penalty is incurred.
The objective function of the problem under consideration depends on the specific definition of the
problem, according to its time-window constraints. When dealing with hard constraints, the objective
function is defined by the sum of the costs of each arc belonging to that tour, while when using soft
constraints, the objective function depends on the specific problem definition and the values associated
to the aforementioned penalties.
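The interplay between release dates, deadlines and waiting times under hard time windows can be sketched as follows. The function `schedule_tour` and the sample data are assumptions made for illustration: the sketch computes the processing start time of each node along a given visiting order, waits whenever a node is reached before its release date, and reports infeasibility when a deadline is missed. The return to the first node and any soft-constraint penalty are deliberately left out.

```python
def schedule_tour(tour, p, t, r, d):
    """Start times along `tour` under hard time windows, or None if infeasible.

    p -- processing time of each node
    t -- t[i][j] is the setup time needed to process j right after i
    r -- release date (earliest start) of each node
    d -- deadline (latest start) of each node
    """
    starts = []
    clock = 0
    prev = None
    for node in tour:
        if prev is not None:
            clock += p[prev] + t[prev][node]   # minimal delay p_i + t_ij
        clock = max(clock, r[node])            # wait until the release date
        if clock > d[node]:                    # hard time window violated
            return None
        starts.append(clock)
        prev = node
    return starts


# Hypothetical 3-node instance.
p = [1, 2, 1]
t = [[0, 2, 5],
     [2, 0, 1],
     [5, 1, 0]]
r = [0, 5, 0]
print(schedule_tour([0, 1, 2], p, t, r, d=[10, 8, 9]))
```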
There are several versions of the TSPTW which introduce time-dependent variations. These varia-
tions usually focus on time-dependent arc costs, setup times, or processing time. This generalization
may occur, for example, as a result of considering the traffic effects associated to real world routing
problems. Below is the formal definition of the ATSPTW with time-dependent travel times and costs
(ATSPTW-TDC), as defined in [38].
Let $G = (V, A)$ be a simple directed graph, $V = \{v_i\}_{i=0}^{n}$ its set of vertices, where $v_0$ is
the depot vertex. Each vertex $v_i$ has an associated time window (or service time) $[a_i, b_i]$,
verifying that $a_i, b_i \in \mathbb{Z}^+ \cup \{0\}$ and $[a_i, b_i] \subseteq [a_0, b_0]$,
$\forall i \in \{1, \dots, n\}$. Every time window $[a_i, b_i]$ has associated $p_i = b_i - a_i$ instants
of time $\{a_i + k - 1\}_{k=1}^{p_i}$. For simplicity, we will denote $t_i^k = a_i + k - 1$, and therefore
$t_i^k \in \mathbb{Z}^+ \cup \{0\}$. The time and the cost of traversing an arc $(v_i, v_j) \in A$ depend
on the instant of time $t_i^k$ at which the traversing is started. Consider $c_{ij}^t \ge 0$ and
$t_{ij}^t \in \mathbb{Z}^+ \cup \{0\}$, respectively, the cost and the time of traversing an
arc $(v_i, v_j)$ starting at instant $t = t_i^k$. Furthermore, the waiting times, $t \in \mathbb{Z}^+$, at
every vertex $v_i$ have an associated waiting time cost $cwt_i(t) \ge 0$.
The proposed goal in this formulation of the ATSPTW-TDC is to find a Hamiltonian cycle in $G$, starting
and ending at $v_0$, and respecting the time window $[a_0, b_0]$, such that:
• Starting the circuit at time $t_0^k \ge a_0$ involves a waiting time cost $cwt_0(t_0^k - a_0) \ge 0$,
with $cwt_0(0) = 0$;
• The circuit leaves each vertex $v_i \in V$ during its associated time window;
• If the circuit arrives at vertex $v_i \in V$ at time $t \in \mathbb{Z}^+$, such that $t \le a_i$ (before
the beginning of the service time), a waiting time $a_i - t$ is allowed, with cost
$cwt_i(a_i - t) \ge 0$, with $cwt_i(0) = 0$. In this case, the circuit leaves vertex $v_i$ at time $a_i$;
• The sum of the costs of traversing arcs and of the waiting time costs is to be minimized.
The authors of the work introduced in [38] propose an exact algorithm for the previously defined
ATSPTW-TDC using several graph transformations, which successively reduce the problem into an
asymmetric TSP, for which several efficient exact algorithms already exist.
2.1.1.D Multi-objective TSP
The multi-objective Traveling Salesman Problem is a generalization of the classic TSP, and is part of
a much broader class of problems, comprising the multi-objective combinatorial optimization problems
[40] and, in particular, the multi-objective vehicle routing problems [41].
The multi objective TSP is defined as follows [42]:
Given a list of $n$ cities and a set $D = (D_1, D_2, \dots, D_k)$ of $n \times n$ weight matrices, the
objective is to minimize $f(\pi) = (f_1(\pi), f_2(\pi), \dots, f_k(\pi))$, with
$f_i(\pi) = \sum_{j=1}^{n-1} d^i_{\pi(j),\pi(j+1)} + d^i_{\pi(n),\pi(1)}$, where $\pi$ is a
permutation over the set $\{1, 2, \dots, n\}$.
Note that when D = (D1), this corresponds to the single-objective TSP. Note also that the above
formulation considers that all objective functions calculate the weight of the Hamiltonian cycle, according
to the respective weight matrix.
The quality of the results of the multi-objective TSP is usually measured according to the Pareto
criteria, defined as follows [43]:
Pareto dominance
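A vector $\vec{u} = (u_1, \dots, u_k)$ is said to dominate $\vec{v} = (v_1, \dots, v_k)$, denoted by
$\vec{u} \preceq \vec{v}$, if and only if $\vec{u}$ is partially less than $\vec{v}$, i.e.
$\forall i \in \{1, \dots, k\}: u_i \le v_i \;\wedge\; \exists i \in \{1, \dots, k\}: u_i < v_i$.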
Pareto Optimality
Pareto optimality is defined as a concept of allocation optimality. An allocation is not Pareto optimal
if there is at least one alternative allocation which improves some objective without worsening any other.
A solution $x \in \Omega$ is said to be Pareto optimal with respect to the solution space $\Omega$ if, and
only if, there is no $x' \in \Omega$ for which $\vec{v} = F(x') = (f_1(x'), \dots, f_k(x'))$ dominates
$\vec{u} = F(x) = (f_1(x), \dots, f_k(x))$.
Pareto Optimal Set
For a given multi-objective problem $F(x)$, the Pareto optimal set ($P^*$) is defined as:
$$P^* = \{x \in \Omega \mid \neg\exists\, x' \in \Omega : F(x') \preceq F(x)\}$$
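The dominance relation and the Pareto optimal set can be sketched in a few lines, assuming that every objective is to be minimized and that the solution space is small enough to be enumerated. The function names and the sample objective vectors (which could represent, for example, price and travel time) are assumptions introduced for illustration.

```python
def dominates(u, v):
    """True if u is no worse than v in every objective and strictly better in one."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def pareto_set(solutions, F):
    """Solutions of a finite space that are not dominated by any other solution."""
    evaluated = [(x, F(x)) for x in solutions]
    return [x for x, fx in evaluated
            if not any(dominates(fy, fx) for _, fy in evaluated)]


# Hypothetical objective vectors; F is the identity, so each solution is
# represented directly by its vector of objective values.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_set(candidates, F=lambda x: x))
```

In this example only the vector (3, 3) is discarded, since (2, 2) is better in both objectives. The remaining vectors are mutually non-dominated trade-offs.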
Although the above definitions refer to the multi-objective TSP, multi-objective optimization can,
without loss of generality, be performed on a time-dependent TSP. There is very little direct
research on the multi-objective time-dependent TSP, but one can cite [44], which proposes a
multi-objective tabu search for single machine scheduling problems with sequence-dependent setup times.
2.1.2 Vehicle Routing Problem
The Vehicle Routing Problem (VRP) is the problem of finding the optimal set of routes for a fleet of
vehicles, to serve a given set of customers. The VRP is believed to have been introduced by Dantzig, in
1959, in a work entitled The Truck Dispatching Problem [45]. Later, it was shown that the VRP, being a
generalization of the TSP, also belongs to the NP-hard class [46].
Since the VRP is an NP-hard problem, research usually focuses on heuristic algorithms,
although there are some procedures which are known to produce optimal solutions [47, 48]. As
referred by Donati in [49], citing the work of Blum [50], even when an exact procedure is available, it
usually requires large computational time, which is not viable in the time-scale of hours, as is required
by the industry.
In 1992, Malandraki [51] stated that the assumption of constant and deterministic costs is an
approximation of the actual conditions of routing problems, and thus, a time-dependent formulation of
the problem should be considered. In 1999, Gambardella and colleagues proposed a multi ant colony
system for solving vehicle routing problems using a meta-heuristic approach [49]. Years later,
Gambardella expanded this research to include time-dependent variations [52], as proposed by Malandraki
and Daskin [51]. Several other works propose meta-heuristic solutions to the
time-dependent VRP [53], including the use of simulated annealing [54] and genetic algorithms [55].
The rest of this section is structured as follows. Subsection 2.1.2.A presents a formal definition of the
Vehicle Routing Problem. Its time-dependent and multi-objective variations are covered, respectively, in
subsections 2.1.2.B and 2.1.2.C. Since the TSP occurs only as a special case of the non-capacitated
vehicle routing problem, the capacitated vehicle routing problem is out of the scope of this work.
2.1.2.A Problem definition
Following the definition proposed by Laporte [56], let $G = (V, A)$ be a graph, where $V = \{1, ..., n\}$ is
a set of vertices, representing nodes/customers/cities, with the depot located at vertex 1, and $A$ is the
set of arcs fully connecting the nodes. Each arc $(i, j)$, $i \neq j$, is associated with a non-negative weight,
$c_{ij}$. Depending on the context of the work, this weight might represent the distance between nodes, the
travel time, or even the travel cost. It is assumed that a fleet of $m$ vehicles is available. The Vehicle
Routing Problem consists in finding the set of optimal routes such that:
1. each city in $V \setminus \{1\}$ is visited exactly once, by exactly one vehicle;
2. all routes start and finish at the depot;
3. some constraints must be satisfied.
The most common constraints associated with item 3 include: capacity restrictions associated with each
vehicle; a limit on the number of nodes that each route might visit; total time restrictions; time-windows in
which each node must be visited; and precedence relations between nodes.
The goal of the Vehicle Routing Problem usually consists in finding a set of routes that
minimizes the total cost, where the cost depends on the total distance covered and on the fixed costs
associated with each vehicle. However, depending on the problem under study, the goal may be different,
such as minimizing the total travel time, minimizing the total number of vehicles, or even both at the same
time [52].
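The definition above can be illustrated with a short Python sketch that checks conditions 1 and 2 for a candidate set of routes and evaluates the usual cost (travelled distance plus a fixed cost per vehicle). The helper names, the toy weights $c_{ij} = |i - j|$ and the fixed cost of 10 are assumptions made for this example only; the side constraints of item 3 are not modelled.

```python
def is_feasible(routes, n, depot=1):
    """Conditions 1 and 2: every customer visited exactly once, routes closed at the depot."""
    closed = all(len(r) >= 3 and r[0] == depot and r[-1] == depot for r in routes)
    visited = sorted(v for r in routes for v in r[1:-1])
    customers = [v for v in range(1, n + 1) if v != depot]
    return closed and visited == customers


def total_cost(routes, c, fixed_cost=0.0):
    """Sum of arc weights over all routes plus a fixed cost per vehicle used."""
    travel = sum(c[i][j] for r in routes for i, j in zip(r, r[1:]))
    return travel + fixed_cost * len(routes)


# Toy instance with n = 5 vertices and hypothetical weights c_ij = |i - j|.
n = 5
c = {i: {j: abs(i - j) for j in range(1, n + 1)} for i in range(1, n + 1)}

two_vehicles = [[1, 2, 3, 1], [1, 4, 5, 1]]
one_vehicle = [[1, 2, 3, 4, 5, 1]]
print(is_feasible(two_vehicles, n), total_cost(two_vehicles, c, fixed_cost=10))
print(is_feasible(one_vehicle, n), total_cost(one_vehicle, c, fixed_cost=10))
```

With a single vehicle and no side constraints, the feasibility check reduces to the TSP requirement of one closed tour through all vertices.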
2.1.2.B Time-dependent VRP
The Vehicle Routing Problem is a very wide class of optimization problems, whose precise
definition usually depends on the characteristics of the problem under consideration. Thus, the way
time-dependencies are introduced also depends on the specificities of the situation. Several
authors consider time-dependent travel costs [54], with the objective of minimizing the total cost,
while others introduce time-dependent travel times [57], with the objective of minimizing the total travel
time. Others consider that the objective function depends on both travel time and
travel cost, and that at least one of these (travel time, travel cost) is time-dependent [58]. The definition
herein proposed follows this last time-dependent variation.
Following the work of Figliozzi [58], the time-dependent VRP is defined as follows. Let $G = (V, A)$
be a graph where $A = \{(v_i, v_j) : i \neq j \wedge i, j \in V\}$ is the set of arcs, and $V = \{v_0, ..., v_{n+1}\}$ is the set
of vertices. Vertices $v_0$ and $v_{n+1}$ denote the depot at which the vehicles are based. It is considered
that each vehicle has a uniform capacity of $q_{max}$. It is also expected that each vertex $i \in V$ has an
associated demand $q_i \geq 0$ and a service time $g_i \geq 0$, with the depot having $q_0 = 0$ and $g_0 = 0$. The set
of vertices $C = \{v_1, ..., v_n\}$ specifies the set of $n$ customers. The arrival time of a vehicle at customer $i$,
$i \in C$, is denoted by $a_i$, and its departure time by $b_i$. Each arc $(v_i, v_j)$ has an associated distance $d_{ij} \geq 0$,
and a travel time $t_{ij}(b_i) \geq 0$. Note that the travel time is a function of the departure time from customer
$i$. The set of available vehicles is denoted by $K$. Consider that the cost per unit of route duration is
denoted by $c_t$, and the cost per unit of route distance is denoted by $c_d$.
In this formulation, there are two goals for the time-dependent VRP. The first corresponds to the
minimization of the total number of vehicles used. The second corresponds to the minimization of the
total cost, which is a function of both distance and travel time.
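To make the role of the departure-dependent travel time $t_{ij}(b_i)$ concrete, the following Python sketch evaluates the cost of a single route as $c_d \cdot \text{distance} + c_t \cdot \text{duration}$. It is only an illustration under stated assumptions: the vehicle departs as soon as service ends ($b_i = a_i + g_i$, no time windows or waiting), and the rush-hour travel-time function, the distances and the service times are hypothetical.

```python
def evaluate_route(route, d, travel_time, g, c_d, c_t, start=0.0):
    """Return (cost, distance, duration) of one route under time-dependent travel times."""
    clock = start          # departure time b_i from the current vertex
    distance = 0.0
    for i, j in zip(route, route[1:]):
        distance += d[i][j]
        arrival = clock + travel_time(i, j, clock)   # a_j = b_i + t_ij(b_i)
        clock = arrival + g[j]                       # b_j = a_j + g_j (assumption)
    duration = clock - start
    return c_d * distance + c_t * duration, distance, duration


# Hypothetical data: three vertices, vertex 0 is the depot.
d = [[0, 2, 3], [2, 0, 2], [3, 2, 0]]
g = [0, 1, 1]


def rush_hour_time(i, j, departure):
    """Travel takes twice as long when leaving between 8 and 10 (assumed profile)."""
    return d[i][j] * (2.0 if 8 <= departure < 10 else 1.0)


early = evaluate_route([0, 1, 2, 0], d, rush_hour_time, g, c_d=1.0, c_t=0.5, start=7.0)
late = evaluate_route([0, 1, 2, 0], d, rush_hour_time, g, c_d=1.0, c_t=0.5, start=8.0)
print(early, late)
```

The same route covers the same distance in both cases, but the later departure falls into the slow period and therefore has a longer duration and a higher cost.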
The complete definition of the problem follows a mixed integer programming approach, with a total
of 11 constraints. These will not be covered in detail here, as the VRP is not the primary object of study
of this work. It is, however, important to define in which circumstances the TDVRP can be transformed into
the TDTSP. This is possible by considering only one vehicle, with infinite capacity, and by adapting the
objective function according to the problem under consideration.
We conclude this section with the final remark that the above presented definition of the
time-dependent VRP corresponds to a static version of the time-dependent case. There is a lot of research
around the dynamic case, in which the problem is updated during the execution of the program. This
has major applications in the routing industry, and it is often referred to as real-time Vehicle Routing. For
more information regarding this problem, we refer to [59], [60].
2.1.2.C Multi-objective VRP
Multi-objective optimization corresponds to the resolution of a combinatorial optimization problem in
which more than one goal is defined. In the case of the Vehicle Routing Problem [41], the most
common objectives include minimizing the fleet size, the total traveled distance, the total required time,
the total tour cost, and/or maximizing the quality of the service or the collected profit. Note that in most
problems, when multiple objectives are identified, the different objectives often conflict with each other.
Multi-objective optimization usually relies on the use of meta-heuristics [61]. There are several works
focusing on this problem, and the most promising meta-heuristics for multi-objective optimization include
Evolutionary Optimization [62] and Simulated Annealing [63]. There is also some work considering
Ant Colony Optimization. In particular, a modified ant colony was designed to solve a bi-objective
time-dependent vehicle routing problem, in which the main goal was the minimization of the fleet size, followed
by the minimization of the total cost [52].
2.2 Common Optimization Methods overview
The algorithms that address the Traveling Salesman and other Combinatorial Optimization problems can
be classified as exact, heuristic or meta-heuristic. Some of the algorithms belonging to each of these
three classes will be discussed in subsections 2.2.1, 2.2.2 and 2.2.3, respectively. Exact algorithms
are those that always provide an optimal solution to the problem. Although these might seem the first
choice, exact algorithms are usually inefficient for solving large problems. In turn, heuristic algorithms
intend to be efficient, giving up the objective of finding the best solution and focusing on finding
near-optimal solutions in a short time. Heuristic algorithms are usually problem specific, while
meta-heuristic algorithms are designed in such a way that they can be used for a variety of Combinatorial
Optimization problems, in a fast and efficient way.
2.2.1 Exact algorithms
There are multiple exact algorithms available for the Traveling Salesman Problem [64], including its time-
dependent [65] and time windows [66] variations. These algorithms usually require the problem to be
formulated as an Integer Linear Programming (ILP) instance. In this section, we present ILP definitions
for both the classic and the time-dependent TSP. We also present a brief introduction regarding the
Branch and Bound algorithm, often used to determine optimal or near optimal solutions.
2.2.1.A Integer Linear Programming
The Traveling Salesman Problem, defined over the graph $G = (V, A)$, may be formulated as an integer
linear programming problem [64], by associating one binary decision variable $x_{ij}$ to every arc $a_{ij}$. Let $c_{ij}$
represent the weight of the arc $a_{ij}$. When a decision variable has a value of 1, the corresponding arc
belongs to the solution. The ILP formulation for the TSP is as follows:
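\begin{align}
\min \quad & \sum_{i} \sum_{j,\, j \neq i} c_{ij} x_{ij} \tag{2.3} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = 1, \quad i = 1, \ldots, n, \tag{2.4} \\
& \sum_{i} x_{ij} = 1, \quad j = 1, \ldots, n, \tag{2.5} \\
& x_{ij} \in \{0, 1\}, \quad i, j = 1, \ldots, n, \; i \neq j. \tag{2.6}
\end{align}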
The objective function of the TSP is described by equation 2.3. Equations 2.4 and 2.5 represent
the constraints imposed over the variables. In particular, they state that a tour must leave and enter,
respectively, each node exactly once. The final constraint, defined in equation 2.6, forces the decision
variables to binary values. This set of constraints is called the assignment constraints.
It is worth noting that although the assignment constraints force each node to be entered and left
exactly once, the formation of subtours is still possible. For example, if we consider two disjoint subtours,
the assignment constraints all hold, but the arcs do not form a single closed cycle, and thus do not constitute a
valid solution to the Traveling Salesman Problem. Because of this, it is necessary to introduce subtour
elimination constraints. There are several possible formulations for this effect, in particular, the
Dantzig-Fulkerson-Johnson (DFJ) and the Miller-Tucker-Zemlin (MTZ) formulations.
To exclude subtours, Dantzig and colleagues propose a strategy that introduces an exponential number
of subtour elimination constraints [64] (approximately $2^n$), presented in equation 2.7.
The exponential number of constraints makes this strategy undesirable, even for medium and small size
instances.
$$\sum_{i \in S} \sum_{j \in S} x_{ij} \leq |S| - 1, \quad S \subset V, \; |S| \geq 2 \tag{2.7}$$
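The effect of the subtour elimination constraints can be checked on a small example. In the Python sketch below, an assignment is encoded by a successor list (`succ[i] = j` means $x_{ij} = 1$), which is an encoding chosen for this illustration; the helper names are likewise assumptions. Two disjoint cycles satisfy the assignment constraints, yet each cycle $S$ contains $|S|$ arcs and therefore violates equation 2.7.

```python
def satisfies_assignment(succ):
    """Each node is left exactly once (by construction) and entered exactly once."""
    n = len(succ)
    return sorted(succ) == list(range(n)) and all(succ[i] != i for i in range(n))


def find_subtours(succ):
    """Split an assignment (a permutation of the nodes) into its cycles."""
    seen = [False] * len(succ)
    cycles = []
    for start in range(len(succ)):
        if seen[start]:
            continue
        cycle, v = [], start
        while not seen[v]:
            seen[v] = True
            cycle.append(v)
            v = succ[v]
        cycles.append(cycle)
    return cycles


def violated_dfj_sets(succ):
    """Cycles S with |S| < n: they hold |S| arcs, more than the |S| - 1 allowed."""
    n = len(succ)
    return [s for s in find_subtours(succ) if 2 <= len(s) < n]


two_subtours = [1, 2, 0, 4, 5, 3]      # 0->1->2->0 and 3->4->5->3
single_tour = [1, 2, 3, 4, 5, 0]       # one Hamiltonian cycle
print(satisfies_assignment(two_subtours), violated_dfj_sets(two_subtours))
print(satisfies_assignment(single_tour), violated_dfj_sets(single_tour))
```

Exact solvers commonly exploit this kind of check to add only the violated constraints on demand instead of enumerating all subsets up front.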
In turn, the MTZ formulation includes fewer subtour elimination constraints, at the expense of the
introduction of a new set of variables $u_i$, $i = 1, \ldots, n$, where $u_i$ denotes the position of node $n_i$ in the tour.
There are approximately $n^2/2$ subtour elimination constraints [67], presented in equation 2.8. Although
the MTZ formulation introduces a new set of variables, compared to the DFJ strategy, the number of
subtour elimination constraints is significantly reduced.
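\begin{equation}
\begin{aligned}
& u_1 = 1, \\
& 2 \leq u_i \leq n \quad \forall\, i \in \{2, \ldots, n\}, \\
& u_i - u_j + n x_{ij} \leq n - 1 \quad \forall\, i, j \in \{2, \ldots, n\}, \; i \neq j
\end{aligned}
\tag{2.8}
\end{equation}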
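A small brute-force experiment illustrates why the constraints in equation 2.8 exclude subtours: for a Hamiltonian tour the visiting positions provide valid values for $u$, whereas for a subtour that avoids node 1 no assignment of $u$ exists. The Python sketch below assumes nodes numbered $1, \ldots, n$ and uses hypothetical helper names; the exhaustive search over $u$ is only viable for tiny instances.

```python
from itertools import product


def mtz_holds(n, arcs, u):
    """Check the constraints of equation 2.8 for selected arcs and positions u."""
    if u[1] != 1:
        return False
    for i in range(2, n + 1):
        if not 2 <= u[i] <= n:
            return False
        for j in range(2, n + 1):
            x_ij = 1 if (i, j) in arcs else 0
            if i != j and u[i] - u[j] + n * x_ij > n - 1:
                return False
    return True


def exists_valid_u(n, arcs):
    """Exhaustively look for integer positions u_2..u_n satisfying equation 2.8."""
    for values in product(range(2, n + 1), repeat=n - 1):
        u = {1: 1, **{i: values[i - 2] for i in range(2, n + 1)}}
        if mtz_holds(n, arcs, u):
            return True
    return False


n = 5
tour_arcs = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)}
subtour_arcs = {(1, 2), (2, 1), (3, 4), (4, 5), (5, 3)}
print(exists_valid_u(n, tour_arcs), exists_valid_u(n, subtour_arcs))
```

For the cycle 3, 4, 5 the constraints would require $u_3 < u_4 < u_5 < u_3$, which is impossible.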
In turn, the time-dependent Traveling Salesman Problem variant may be formulated as an integer
linear programming problem by associating one binary decision variable $x_{ij}^{t}$ to every arc $a_{ij}^{t}$. Let $c_{ij}^{t}$
represent the weight of the arc $a_{ij}^{t}$. A decision variable $x_{ij}^{t}$ takes a value of 1 when the arc $a_{ij}^{t}$, which
represents the transition from node $n_i$ to node $n_j$ at time $t$, belongs to the solution. Hence, the ILP
formulation for the TDTSP is as follows:
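\begin{align}
\min \quad & \sum_{i} \sum_{j} \sum_{t} c_{ij}^{t} x_{ij}^{t} \tag{2.9} \\
\text{s.t.} \quad & \sum_{j} \sum_{t} x_{ij}^{t} = 1, \quad i = 1, \ldots, n, \tag{2.10} \\
& \sum_{i} \sum_{t} x_{ij}^{t} = 1, \quad j = 1, \ldots, n, \tag{2.11} \\
& \sum_{i} \sum_{j} x_{ij}^{t} = 1, \quad t = 1, \ldots, n, \tag{2.12} \\
& \sum_{j=1}^{n} \sum_{t=2}^{n} t\, x_{ij}^{t} - \sum_{j=1}^{n} \sum_{t=1}^{n} t\, x_{ji}^{t} = 1, \quad i = 1, \ldots, n, \tag{2.13} \\
& x_{ij}^{t} \in \{0, 1\}, \quad i, j, t = 1, \ldots, n, \; i \neq j. \tag{2.14}
\end{align}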
The objective function is presented in equation 2.9. Equations 2.10, 2.11 and 2.12 represent the
constraints imposed over the decision variables. Particularly, they state that each city must be left
exactly once, entered exactly once, and that exactly one arc is traversed in each time period, respectively. As occurs with the
classical TSP, the ILP formulation needs a constraint to eliminate the formation of subtours.
This is presented in equation 2.13. Finally, equation 2.14 guarantees that the decision variables take
binary values.
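As an illustration of the time-indexed variables, the Python sketch below converts a visiting order into the set of triples $(i, j, t)$ with $x_{ij}^{t} = 1$, checks constraints 2.10 to 2.12, and evaluates the objective 2.9. It assumes that $t$ indexes the position of the arc in the tour, and the cost function (a base weight multiplied by $t$) is a hypothetical choice used only to show that the visiting order matters.

```python
def tour_to_arcs(tour):
    """Triples (i, j, t): the t-th arc of the tour goes from i to j, t = 1..n."""
    n = len(tour)
    return [(tour[t - 1], tour[t % n], t) for t in range(1, n + 1)]


def satisfies_constraints(arcs, n):
    """Check equations 2.10 (left once), 2.11 (entered once), 2.12 (one arc per period)."""
    left = all(sum(1 for i, _, _ in arcs if i == k) == 1 for k in range(n))
    entered = all(sum(1 for _, j, _ in arcs if j == k) == 1 for k in range(n))
    periods = all(sum(1 for _, _, t in arcs if t == p) == 1 for p in range(1, n + 1))
    return left and entered and periods


def objective(arcs, cost):
    """Equation 2.9 restricted to the arcs with x = 1."""
    return sum(cost(i, j, t) for i, j, t in arcs)


# Hypothetical time-dependent weights: a symmetric base weight scaled by the period t.
base = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]


def cost(i, j, t):
    return base[i][j] * t


forward = tour_to_arcs([0, 1, 2])
backward = tour_to_arcs([0, 2, 1])
print(objective(forward, cost), objective(backward, cost))
```

Although the base weights are symmetric, the two directions of the same cycle have different costs, which is the behaviour that makes symmetric-only techniques less suitable for the TDTSP.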
2.2.1.B Branch and Bound
Branch and Bound (B&B) is one of the most used tools to solve large NP-hard combinatorial optimization
problems. As an example, the software tool Concorde uses a Branch and Bound algorithm, and it was
used to solve all 110 instances of the TSPLIB, reporting exact solutions for every problem, including an
instance with 85,900 nodes, although it required more than 110 CPU years. To be precise, B&B should
be classified as an algorithmic paradigm, constituted by three main parts, which have to be chosen according
to the problem under consideration, and for which many options may exist [68].
The strength of B&B comes from it being a search algorithm which (indirectly) searches the
complete search space of the problem. Since this is not directly feasible, due to the common exponential
growth of the solution space, B&B takes advantage of bounds, combined with information regarding the
current best solution, to safely discard certain solutions from the search space.
At any point of the algorithm, there is always a current solution and a pool of unexplored subsets
of the solution space. At the beginning of the algorithm, this pool consists of (only) the root node, and
at the end of the algorithm, it will consist of an empty set, meaning that the entire search space was
successfully explored.
The initialization of the B&B requires the incumbent, which denotes the objective function value of
the current solution, to be initialized as ∞. In many cases, it is possible to generate an initial feasible
solution using some heuristic method (for example, the nearest neighbour heuristic, which will be discussed in
subsection 2.2.2.B). In that case, this solution is recorded and its objective value is set as the incumbent. The process of
generating an initial solution usually has a positive impact on the B&B algorithm.
After the initialization, this algorithm enters an iterative process, until the pool of unexplored subsets
is empty. This process consists of three main steps: i) selection of a node to process; ii) the bound
calculation; and iii) branching.
Branch and Bound algorithms vary according to the established strategies for each of the three main
steps of the iterative process, as well as the initial heuristic. In any case, the selected bounding function
is the key for any good Branch and Bound algorithm, because the selection of a bad function cannot
be compensated by good choices of the selection and branching strategies.
As an example, consider the trivial case where the bounding function is the constant value of zero.
With non-negative costs, this will always be a lower bound to the problem, but it provides no useful
information about which solutions to discard. Ideally, the value of the bounding function for a given
subproblem should be equal to the value of the best feasible solution to that subproblem. This is usually not
possible, since subproblems may also be NP-hard. Thus, bounding functions are chosen according to
their proximity to the best possible value and to their time complexity, which is usually restricted to polynomial time.
To complete this overview, it is necessary to reference the most relevant search strategies for the
Branch and Bound algorithm, which usually revolve around the Best First, Depth First and Breadth First
Search [68]. For a more detailed overview of the Branch and Bound algorithm, we refer to J. Clausen’s
Branch and Bound algorithms - principles and examples [68].
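The three steps of the iterative process can be sketched for a small TSP instance as follows. This is a minimal depth-first variant written for illustration, not the procedure used by any particular solver: the bounding function (cost so far plus the cheapest outgoing arc of every node that still has to be left) and the branching order are assumptions, and the incumbent starts at ∞ instead of a heuristic solution.

```python
import math


def branch_and_bound_tsp(c):
    """Depth-first Branch and Bound for a (possibly asymmetric) cost matrix c."""
    n = len(c)
    min_out = [min(c[i][j] for j in range(n) if j != i) for i in range(n)]
    best_cost, best_tour = math.inf, None

    def bound(cost, last, unvisited):
        # Every node that still has to be left costs at least its cheapest outgoing arc.
        return cost + min_out[last] + sum(min_out[v] for v in unvisited)

    def search(tour, cost, unvisited):
        nonlocal best_cost, best_tour
        last = tour[-1]
        if not unvisited:
            total = cost + c[last][tour[0]]      # close the tour
            if total < best_cost:
                best_cost, best_tour = total, tour[:]
            return
        if bound(cost, last, unvisited) >= best_cost:
            return                               # prune: cannot beat the incumbent
        # Branching: extend the partial tour, cheapest arc first.
        for v in sorted(unvisited, key=lambda v: c[last][v]):
            tour.append(v)
            search(tour, cost + c[last][v], unvisited - {v})
            tour.pop()

    search([0], 0, frozenset(range(1, n)))
    return best_cost, best_tour


# Hypothetical asymmetric instance with four nodes.
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
print(branch_and_bound_tsp(c))
```

On this instance the subtree starting with the arc from node 0 to node 3 is discarded without being explored, because its bound already equals the incumbent value.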
2.2.2 Heuristic algorithms
In some cases, exact algorithms cannot be used in the resolution of the TSP instance under
consideration. This usually occurs when dealing with very large instances, or when solutions must be
obtained very quickly. In these cases, the usage of approximation algorithms may
be a good choice. These algorithms are not guaranteed to produce an optimal solution; however, with a
good heuristic, approximation algorithms produce high quality solutions in a reasonable time. Generally,
heuristics may be classified into one of two classes: construction or improvement heuristics [69].
2.2.2.A Held-Karp Lower Bound
In some cases, the quality of a heuristic solution cannot be directly measured, as no exact solution for
the problem under consideration is known. In these cases, it is important to have a way of evaluating
its performance. The standard way of doing this is by comparing the heuristic solution with the
Held-Karp (HK) lower bound [70].
The HK lower bound is the solution of the linear programming relaxation of the ILP formulation of
the TSP. This solution can be found in polynomial time for moderate instance sizes. However, for a
very large problem, solving the relaxed problem directly is not feasible. In these cases, Held and Karp
propose an iterative algorithm to approximate the solution. This method involves computing
a large number of minimum spanning trees. This iterative version of the algorithm will often keep the
solution within 0.01% of the HK lower bound [69].
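The iterative scheme can be sketched with 1-trees, i.e. a minimum spanning tree over all nodes except one, plus the two cheapest edges incident to the remaining node. The Python code below is a simplified illustration for symmetric instances: the node penalties `pi`, the step size rule `step0 / k` and the iteration limit are assumptions, and practical implementations use more careful step size control.

```python
import math


def one_tree_bound(c, pi):
    """Lagrangian lower bound from a minimum 1-tree under node penalties pi."""
    n = len(c)

    def w(i, j):
        return c[i][j] + pi[i] + pi[j]

    degree = [0] * n
    in_tree = [False] * n
    key = [math.inf] * n
    parent = [-1] * n
    key[1] = 0.0
    total = 0.0
    # Prim's algorithm over nodes 1..n-1 (node 0 is left out of the tree).
    for _ in range(n - 1):
        u = min((v for v in range(1, n) if not in_tree[v]), key=lambda v: key[v])
        in_tree[u] = True
        total += key[u]
        if parent[u] != -1:
            degree[u] += 1
            degree[parent[u]] += 1
        for v in range(1, n):
            if not in_tree[v] and w(u, v) < key[v]:
                key[v], parent[v] = w(u, v), u
    # Attach node 0 through its two cheapest edges.
    a, b = sorted(range(1, n), key=lambda v: w(0, v))[:2]
    total += w(0, a) + w(0, b)
    degree[0] = 2
    degree[a] += 1
    degree[b] += 1
    return total - 2 * sum(pi), degree


def held_karp_bound(c, iterations=100, step0=1.0):
    """Approximate the HK bound by subgradient updates of the node penalties."""
    pi = [0.0] * len(c)
    best = -math.inf
    for k in range(1, iterations + 1):
        bound, degree = one_tree_bound(c, pi)
        best = max(best, bound)
        if all(d == 2 for d in degree):
            break                      # the 1-tree is a tour, so the bound is tight
        step = step0 / k               # assumed diminishing step size
        pi = [p + step * (d - 2) for p, d in zip(pi, degree)]
    return best


# Four cities on the corners of a unit square (optimal tour length 4).
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
c = [[math.dist(p, q) for q in pts] for p in pts]
print(held_karp_bound(c))
```

Every value produced by `one_tree_bound` is a valid lower bound on the optimal tour length, so keeping the maximum over the iterations is safe even if the penalties oscillate.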
2.2.2.B Tour construction
A construction algorithm is based on the generation of a valid solution, by using some heuristic function
to guide the construction process. These algorithms stop when a valid solution is found, and they do not
attempt to improve the solution any further. Two different tour construction heuristics for the TSP will be
presented: the nearest neighbour and the greedy heuristics.
The nearest neighbour [71] is a simple and intuitive heuristic for the TSP. It starts with the selection
of a random node. This is followed by the selection of the closest node, belonging to the set of nodes
not yet visited. This step is repeated until all nodes have been visited. Finally, the solution construction
process is completed by returning to the initial node. The computational complexity of the nearest
neighbour is $O(n^2)$, and the solutions generated with this heuristic are often within 25% of the
optimal solution [69]. To construct a solution using the nearest neighbour heuristic, the following steps
can be taken:
1. Select a random node
2. Select the nearest unvisited node
3. If there are unvisited nodes, repeat step (2)
4. Return to the first node
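The four steps above translate almost directly into code. The following Python sketch assumes that the instance is given as a cost matrix `c`; the optional `start` argument (to fix the initial node instead of drawing it at random) and the example matrix are additions made for this illustration.

```python
import random


def nearest_neighbour(c, start=None, rng=random):
    """Build a tour by always moving to the closest unvisited node."""
    n = len(c)
    current = rng.randrange(n) if start is None else start   # step 1
    tour = [current]
    unvisited = set(range(n)) - {current}
    cost = 0
    while unvisited:                                          # step 3
        nxt = min(unvisited, key=lambda v: c[current][v])     # step 2
        cost += c[current][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    cost += c[current][tour[0]]                               # step 4
    return tour, cost


# Hypothetical asymmetric instance with four nodes.
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
print(nearest_neighbour(c, start=0))
```

The example shows the typical weakness of the heuristic: the cheap early choices force expensive arcs at the end of the tour.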
In turn, the greedy heuristic [71] is a construction algorithm which creates a valid solution by
repeatedly selecting the arc with the lowest weight, always taking into account the problem constraints.
Although this heuristic is similar to the nearest neighbour, there are some differences in the initialization
step. The nearest neighbour randomly selects the initial node, while the greedy heuristic is greedy
at every step of the algorithm, including the initialization. The computational complexity of the greedy
heuristic is $O(n^2 \log_2 n)$, and the solutions generated by this heuristic are often within 20% of the
optimal solution [69]. The pseudocode for the greedy heuristic is presented below.
1. Sort all arcs according to their weight
2. Select the lowest weight arc not yet considered, and add it to the solution if it does not violate any constraint
3. If the constructed solution is not complete, repeat (2)
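A possible realization of these steps for a symmetric instance is sketched below. The constraints are taken to be the usual ones for a tour (no node with more than two selected edges and no cycle before all nodes are connected), and a union-find structure is used to detect premature cycles; both choices, as well as the example matrix, are assumptions of this sketch.

```python
def greedy_tour(c):
    """Greedy edge selection for a symmetric cost matrix c; returns (edges, cost)."""
    n = len(c)
    edges = sorted((c[i][j], i, j) for i in range(n) for j in range(i + 1, n))  # step 1
    degree = [0] * n
    parent = list(range(n))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    chosen, cost = [], 0
    for w, i, j in edges:                                   # step 2
        if degree[i] < 2 and degree[j] < 2:
            ri, rj = find(i), find(j)
            closes_tour = len(chosen) == n - 1              # last edge may close the cycle
            if ri != rj or closes_tour:
                chosen.append((i, j))
                cost += w
                degree[i] += 1
                degree[j] += 1
                parent[ri] = rj
        if len(chosen) == n:                                # step 3
            break
    return chosen, cost


# Hypothetical symmetric instance: sides of weight 1, diagonals of weight 3.
c = [[0, 1, 3, 1],
     [1, 0, 1, 3],
     [3, 1, 0, 1],
     [1, 3, 1, 0]]
print(greedy_tour(c))
```

The sort dominates the running time, which is consistent with the $O(n^2 \log n)$ complexity quoted above.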
2.2.2.C Tour improvement
An improvement heuristic is an algorithm that works over a valid and complete solution, in order to
improve it. The most common improvement heuristics are the 2-opt and 3-opt local search procedures
[72], and the Lin-Kernighan algorithm (LK). The latter generalizes the former two
local search methods: a k-opt local search is employed, but the value of k varies during the
algorithm execution. The LK algorithm is very efficient and capable of presenting high quality solutions
for the symmetric TSP.
The 2-opt is a simple iterative local search procedure, in which two arcs of the solution are removed,
and a new solution is constructed by reconnecting the nodes in a different way. Given that only 2 arcs
are exchanged, there is only one way of creating a different cycle. Note that by performing a 2-opt
exchange, the path between the two exchanged arcs gets reversed. If the graph is symmetric, the objective
function difference can be calculated by analyzing only the two removed and the two added arcs. However, if the graph is asymmetric, it
is necessary to calculate the cost of the reversed path, in order to evaluate the objective function difference. This
makes the 2-opt procedure particularly efficient for symmetric problems, but less adequate for
the asymmetric and, as a consequence, the time-dependent variations of the TSP.
Figure 2.2: The 2-opt local search reconnects two edges, hoping to fold possible crossovers, decreasing the overall tour cost. In the left image, a crossover is identified. In the middle image, the edges belonging to this crossover are removed, and in the figure to the right, they are reconnected, forming a new valid tour.
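The 2-opt exchange for the symmetric case can be sketched as follows. The code evaluates each move through the two removed and the two added edges only, and reverses the path between them when the move improves the tour. The first-improvement acceptance rule, the numerical tolerance and the example coordinates are assumptions of this sketch.

```python
import math


def tour_length(tour, c):
    return sum(c[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


def two_opt(tour, c):
    """Repeat improving 2-opt exchanges on a symmetric instance until none is left."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue                     # the two edges share a node
                a, b = tour[i], tour[i + 1]
                d, e = tour[j], tour[(j + 1) % n]
                # Replace edges (a, b) and (d, e) by (a, d) and (b, e).
                delta = c[a][d] + c[b][e] - c[a][b] - c[d][e]
                if delta < -1e-12:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]   # reverse the inner path
                    improved = True
    return tour


# Unit square visited in an order that makes the two diagonals cross.
pts = [(0, 0), (1, 1), (1, 0), (0, 1)]
c = [[math.dist(p, q) for q in pts] for p in pts]
crossing = [0, 1, 2, 3]
untangled = two_opt(crossing, c)
print(tour_length(crossing, c), untangled, tour_length(untangled, c))
```

For an asymmetric or time-dependent instance the constant-time `delta` would no longer be valid, since the cost of the reversed path would have to be recomputed.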
The 3-opt search is very similar to the 2-opt, but instead of selecting two edges and reconnecting
the path, the 3-opt selects 3 edges. In this case, there are multiple ways of forming a new valid tour. A
3-opt move can also be seen as two or three 2-opt moves combined in the formation of a new tour. The
iterative cycle of the 3-opt search works in the same way as the 2-opt.
More generally, the k-opt local search is a method for rearranging a tour, by taking k edges and
reconnecting the paths in order to form a new valid tour. Any tour that is known to be k-opt is also
(k − 1)-opt. Some particular problems, such as the crossing bridges, illustrated in figure 2.3, can only be
solved with a 4-opt or higher method.
Figure 2.3: The crossing bridges can only be solved by reordering 4 edges. The resolution of this problem with local search is only possible with 4-opt or higher.
In turn, the Lin-Kernighan heuristic [72] is an algorithm for the resolution of the symmetric TSP,
and it was the state of the art for over 15 years. LK is known for producing high quality solutions, some
of them optimal, and for having a time complexity of approximately $O(n^{2.2})$ [69]. This heuristic is restricted
to the symmetric TSP, and using it for asymmetric instances requires a graph transformation, which
transforms the asymmetric instance with $n$ nodes into an equivalent symmetric one with $2n - 1$ nodes
([73]). Thus, for the same number of nodes, solving an asymmetric TSP with the LK heuristic is usually
4 times harder than solving the symmetric case.
To understand the Lin-Kernighan heuristic, it is necessary to think about the TSP in a slightly different
manner. Consider the following way of defining a combinatorial optimization problem: "find, from a set
$S$, a subset $T$ that satisfies some criterion $C$ and minimizes an objective function $f$." In the TSP, the
objective is to find, from the set of all edges ($S$) of a complete graph, the subset ($T$) which forms a valid
tour ($C$) and minimizes the objective function ($f$).
Any non-optimal but feasible solution $T$ is non-optimal because $k$ elements $x_1, ..., x_k$ in $T$ are out of
place. To improve this solution, and make it optimal, one would have to substitute the set of $k$ elements
$x_1, ..., x_k$ with the elements $y_1, ..., y_k$ of $S \setminus T$. Because there is no knowledge about how many elements
are misplaced, Lin and Kernighan consider that setting the value of $k$ a priori would seem artificial. Thus,
they propose an iterative procedure in which the algorithm dynamically estimates the best value for $k$. In
order to do this, the LK first estimates the most out of place elements, $x_1$ and $y_1$. With these values set
aside, it tries to repeat this process for $x_2$ and $y_2$, and so on. This inner loop stops when no improvement
seems plausible and, at this point, it replaces the current solution $T$ with the new solution generated by
replacing the selected elements, and restarts the whole process.
2.2.3 Meta-Heuristic algorithms
Unlike classic heuristics, meta-heuristic algorithms are designed to be applied to any combinatorial
optimization problem, and not to a specific problem of this class. Meta-heuristics gained importance during the
1990s, and have become one of the most important classes of algorithms in computer science.
More formally, a meta-heuristic is an iterative generation process, which guides an underlying
heuristic by intelligently combining different concepts for exploring and exploiting the search space, using
learning strategies to structure information, so as to efficiently find optimal or near-optimal solutions [74].
This subsection will introduce a few of the most relevant meta-heuristics in the resolution of the
Traveling Salesman Problem. In particular, it will focus on the Ant Colony Optimization (ACO) and the
Simulated Annealing (SA), presented in subsections 2.2.3.A and 2.2.3.B, respectively. There is a
variety of meta-heuristics which are not discussed here, but which have also been successfully applied to
the TSP. Examples of these meta-heuristics are the Tabu Search, Evolutionary Algorithms (in
particular the Genetic Algorithm), and many other Swarm Intelligence algorithms, of which the Ant Colony
Optimization is the oldest and most widely used.
2.2.3.A Ant Colony Optimization
The Ant Colony Optimization [11, 75, 76] is based on the behaviour of real ants, and was developed by
M. Dorigo et al., based on the generalization of the double bridge experiment [77], [78], illustrated in
figure 2.4. This led to an adaptation of this experiment, substituting the double bridge with a graph, the
pheromone trail with artificial pheromone, and the real ants with artificial ants, which presented some
extra capabilities intended to facilitate the resolution of more complex problems [11].
Figure 2.4: The double bridge experiment. On the left, two bridges with the same length. Experimental results show that ants distribute themselves evenly amongst both bridges. On the right, one of the bridges is longer than the other. Experimental results show that ants use the shorter bridge more often.
By using the model of a static combinatorial optimization problem, as defined in section 2.1, it is
possible to derive a generic pheromone model that can be exploited by the Ant Colony Optimization. This
means that both the classical and the time-dependent TSP, which can be formulated by the
aforementioned model, may be solved by the ACO metaheuristic. The following steps represent the algorithmic
skeleton for the ACO model, and each of its parts will be explained in more detail below.
Algorithm 1 ACO metaheuristic
    procedure ANTCOLONYOPTIMIZATION
        Initialization
        while (termination condition not met) do
            ConstructAntSolutions
            ApplyLocalSearch        (optional)
            GlobalPheromonesUpdate
The general process of the Ant Colony Optimization algorithms is as follows. The algorithm starts
with a parameter initialization. This step is also responsible for setting the pheromone levels to some initial
value $\tau_0$. This value is usually chosen using a heuristic function. For the TSP case, the chosen heuristic
is often the nearest neighbour.
After initialization, and until some specific termination condition is met, the ACO algorithm runs
in a loop, which consists of 3 main steps: i) solution construction; ii) local search (optional); and iii)
pheromone update. Each of these steps will be detailed below.
The solution construction is a process that is carried out by each of a specified number of ants.
Each ant starts with an initially empty solution, $s^p$, and at each iteration step expands its solution with
a valid solution component, $c_i^j$. This construction function is what differentiates the ACO algorithm for
every combinatorial optimization problem. By restricting the construction method to the agents (the
ants), the rest of the algorithm does not have to be heavily adapted to the specific model. However,
the function that is responsible for selecting the feasible solution components has to be aware of the
decision variables and the set of constraints. It has to determine those variables whose addition to the
partial solution does not constitute a violation of the set of constraints of the model. This set is represented
by $N(s^p)$.
Having the set of all feasible solution components, $N(s^p)$, it is necessary to choose a single
component, $c_i^j$. This selection is done probabilistically, and the choice takes into account both pheromone
(exploitation) and heuristic information (exploration). The algorithmic parameter $q_0$ is responsible for
defining the relative importance of both methods. The outline of the selection function is presented in
equation 2.15, and is as follows. A random value, $q$, is drawn. If this value does not exceed the algorithm's parameter
$q_0$, the node is selected with the heuristic rule (see equation 2.16). Otherwise, the Ant
System rule (presented in equation 2.17) is used.
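$$c_i^j = \begin{cases} \text{heuristic rule}, & \text{if } q \leq q_0 \\ \text{ant system rule}, & \text{otherwise} \end{cases} \tag{2.15}$$

$$\arg\max_{l \in N_i^k} \; \tau_{il} \, [\eta_{il}]^{\beta} \tag{2.16}$$

$$p(c_i^j \mid s^p) = \frac{\tau_{ij}^{\alpha} \, [\eta(c_i^j)]^{\beta}}{\sum_{c_i^l \in N(s^p)} \tau_{il}^{\alpha} \, [\eta(c_i^l)]^{\beta}} \tag{2.17}$$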
After finalizing the construction of valid solutions, the ACO algorithm may implement a local search.
Although this step is optional, it has been demonstrated that ACO algorithms reach their best
performance when local search is applied. The ants' construction method is biased by the pheromone
information, while the pheromone values are biased by the quality of the solutions. Local search procedures, also
called Daemon Actions, are techniques which work on the existing solutions, exploring and
expanding the search space, and ultimately improving the quality of the solutions. The most widely used
local search methods are the 2-opt search, the 3-opt search, and the Lin-Kernighan heuristic. The final
step of each loop of the algorithm is the pheromone update, presented in equation 2.18.
$$\tau_{ij} = (1 - \rho)\,\tau_{ij} + \sum_{s \in S_{upd} \mid c_i^j \in s} g(s) \tag{2.18}$$
The pheromone update is responsible for making solution components that belong to good solutions
more desirable in the following iterations. To achieve this objective, two methods are implemented. The
first is pheromone deposition, which increases the pheromone intensity of the solution components
belonging to the most promising solutions. The number of solutions that are used to deposit pheromone is
a parameter of the algorithm. The second method to achieve this step’s goal is pheromone evaporation.
While it may seem counter-intuitive to deposit pheromones and, at the same time, also evaporate them,
this step is crucial to avoid a rapid convergence to sub-optimal solutions. Pheromone deposition alone
is responsible for making good solutions more desirable, while pheromone evaporation reduces both
the desirability of bad solutions and the sub-optimal convergence of good solutions, favoring a better
exploration of the search space.
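The loop of Algorithm 1, together with equations 2.15 to 2.18, can be condensed into a short Python sketch for the TSP. Several choices below are assumptions made for the sake of a compact example rather than part of the description above: the heuristic information is $\eta_{ij} = 1/c_{ij}$, the initial pheromone $\tau_0$ is a constant, only the best-so-far tour deposits pheromone with $g(s) = 1/f(s)$, the optional local search is omitted, and all parameter values are arbitrary.

```python
import random


def aco_tsp(c, n_ants=10, n_iter=50, alpha=1.0, beta=2.0, rho=0.1, q0=0.9, seed=0):
    """Minimal ACO loop for a cost matrix c with positive off-diagonal entries."""
    rng = random.Random(seed)
    n = len(c)
    eta = [[0.0 if i == j else 1.0 / c[i][j] for j in range(n)] for i in range(n)]
    tau = [[1.0] * n for _ in range(n)]          # tau_0 = 1 (assumption)

    def length(tour):
        return sum(c[tour[k]][tour[(k + 1) % n]] for k in range(n))

    def next_node(i, feasible):
        if rng.random() <= q0:                   # equation 2.15, first case
            return max(feasible, key=lambda l: tau[i][l] * eta[i][l] ** beta)  # eq. 2.16
        weights = [tau[i][l] ** alpha * eta[i][l] ** beta for l in feasible]   # eq. 2.17
        return rng.choices(feasible, weights=weights)[0]

    best, best_len = None, float("inf")
    for _ in range(n_iter):
        # ConstructAntSolutions: every ant extends a partial tour node by node.
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            while len(tour) < n:
                feasible = [v for v in range(n) if v not in tour]   # N(s^p)
                tour.append(next_node(tour[-1], feasible))
            if length(tour) < best_len:
                best, best_len = tour, length(tour)
        # GlobalPheromonesUpdate (eq. 2.18): evaporation on all arcs, then deposition.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1.0 - rho
        for k in range(n):
            tau[best[k]][best[(k + 1) % n]] += 1.0 / best_len
    return best, best_len


# Hypothetical asymmetric instance with four nodes.
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
print(aco_tsp(c))
```

Because the construction step only needs the set of feasible components and their costs, the same skeleton could in principle be reused for the time-dependent case by letting the cost depend on the position or time at which an arc is added.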
2.2.3.B Simulated Annealing
The Simulated Annealing metaheuristic was developed using an analogy between the physical
annealing of solids and the search for the minimum cost configuration in combinatorial optimization problems. In the
physical world, annealing is the process of heating a metal until the melting point, and reducing the
temperature slowly and in a controlled way. The decrease of the temperature results in a particle
rearrangement, in which lower energy states are reached. When the heating temperature is very high,
and the temperature is decreased very slowly, the solid reaches its ground state, i.e. its
minimum energy state. The analogy between the physical world and combinatorial optimization problems
is achieved by considering that the energy of the metal corresponds to the cost of the solution, and the
particle rearrangement consists in the selection of a neighbourhood solution [79].
Most Simulated Annealing algorithms consist of an iterative improvement algorithm, which
stochastically accepts up-hill moves. More precisely, the procedure starts with the selection of a feasible solution,
together with the initialization of some algorithm-specific parameters, such as the temperature, which
is used as a control variable. After this, the SA enters an iterative process (Markov chain). At the core of
this iterative process is a local search heuristic, which is executed a fixed number of times per iteration.
After this local search procedure is complete, the temperature is decreased according to the predefined
cooling schedule. After this, the local search restarts and the cycle continues, until either the execution
time limit is reached, or the temperature reaches the value of zero.
Simulated Annealing differs from other iterative improvement algorithms in its ability to escape local minima by accepting up-hill moves (as also happens, for example, with Tabu Search). More precisely, at each stage of the local search procedure, the difference in the energy level, ∆f = f(y) − f(x), between the current state (f(x)) and the newly generated state (f(y)) is calculated. If ∆f is negative, the new state is better than the current one, and it is always accepted. On the contrary, if ∆f is positive, the new state is accepted only if the Metropolis criterion, presented in equation 2.19, is satisfied, that is, with a probability that decreases with ∆f and increases with the temperature t. Otherwise, the new solution is rejected. Note that, by using this criterion, the metaheuristic becomes increasingly greedy as the temperature approaches zero.
\[
p_y =
\begin{cases}
1, & \text{if } f(y) \le f(x), \\
e^{-\Delta f / t}, & \text{otherwise}
\end{cases}
\tag{2.19}
\]
Time-dependent scheduling problems can also be solved using this meta-heuristic [80], as the algo-
rithm solely relies on the search of a neighbourhood set. The following algorithmic skeleton defines the
Simulated Annealing procedure.
Algorithm 2 SA metaheuristic
procedure SIMULATED ANNEALING
    Initialization
    while Termination condition not met do
        while Length of Markov chain not reached do
            Generate candidate solution
            Apply Metropolis acceptance criteria
        Update temperature
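As an illustration, the skeleton of Algorithm 2 can be sketched in Python as follows. This is a minimal sketch and not the implementation developed in this work: the function name `simulated_annealing` and its parameters (`cost`, `neighbour`, `cooling`, `chain_length`, `n_outer`) are hypothetical, and a fixed number of outer iterations is assumed as the termination condition.

```python
import math
import random


def simulated_annealing(x0, cost, neighbour, t0, cooling, chain_length, n_outer, rng=None):
    """Generic SA skeleton: returns the best solution found and its cost."""
    rng = rng or random.Random()
    x, fx = x0, cost(x0)              # initialization: current solution
    best, f_best = x, fx
    t = t0                            # initialization: temperature
    for _ in range(n_outer):          # assumed termination condition
        for _ in range(chain_length):     # one Markov chain
            y = neighbour(x, rng)         # generate candidate solution
            fy = cost(y)
            delta = fy - fx
            # Metropolis acceptance: always accept improvements,
            # accept worse candidates with probability exp(-delta / t).
            if delta <= 0 or (t > 0 and rng.random() < math.exp(-delta / t)):
                x, fx = y, fy
                if fx < f_best:
                    best, f_best = x, fx
        t = cooling(t)                # update temperature
    return best, f_best
```

The best solution is tracked separately from the current one, since the current solution may get worse whenever an up-hill move is accepted.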
In the first published works on Simulated Annealing [81], it was proven that, if the temperature is lowered slowly enough, the process converges to the optimal solution. More precisely, this holds if the temperature drops no more quickly than C/log(n), where C is a constant and n is the number of steps taken so far [71]. However, this result is not as relevant as it first seems, because this cooling schedule is very slow. Some authors note that it is faster to perform an exhaustive search than to follow this cooling schedule [71].
Hence, the Simulated Annealing metaheuristic requires the definition of a cooling schedule, a neigh-
bourhood function, and the Markov chain length. There are several reports that describe the influence
of these modules in the overall performance of the SA procedure [82].
3 Flying Tourist Problem
Contents
3.1 Problem formulation
3.2 Relation to the Traveling Salesman Problem
3.3 Graph representation
3.4 Dimensional overview
3.5 Optimization methodology
3.6 Summary
As stated in chapter 1, the main goal of this work is to develop an application to solve unconstrained multi-city flight requests, as well as simpler flight requests, such as one-way and round-trip flights. In order to develop a system capable of addressing both simple and complex requests, section 3.1 formally introduces a problem definition called the Flying Tourist Problem (FTP). Section 3.2 shows that the FTP occurs as a generalization of the Traveling Salesman Problem, and that it is thus NP-hard. Section 3.3 shows how to construct a graph to represent this problem, and section 3.4 analyses the dimensions of this representation and of the solution space. Finally, section 3.5 presents the strategies to solve the problem, which include both heuristic and metaheuristic optimization algorithms (subsections 3.5.1 and 3.5.2, respectively).
3.1 Problem formulation
Consider a tourist who wishes to take a trip that visits every node (city) $v_i$ in the set of nodes $V$, $|V| = N$, in no particular order. The start node is denoted $v_0$ and the return node $v_{n+1}$, so that the complete set of nodes is given by $V_c = V \cup \{v_0\} \cup \{v_{n+1}\}$. The trip must start at a time $t \in T_0 = [T_{0m}, T_{0M}]$. Upon visiting a node, the tourist stays there for a duration of $d$ time-units (days). For each node to be visited, there is a range of values that $d$ may take, that is, $d_i \in [d_{im}, d_{iM}]$ with $d_{iM} \ge d_{im} \ge 1$ (nights). The complete set of durations associated with the cities is given by $D$, with $|D| = N$. Furthermore, each city $v_i \in V$ has an associated time-window $TW(v_i)$, with $|TW| = |V| = N$, which defines the set of dates on which the city $v_i$ may be visited.
Following this definition, the FTP is completely defined by a structure $G = (V_c, A, T_0, D, TW)$, used to create a multipartite graph describing the request. This multipartite graph is divided into $k$ layers, where each layer corresponds to a particular moment in time. Every node in a layer is connected to all nodes in the subsequent layer. The set of arcs that connects these nodes is given by $A$. Each arc $a^t_{ij} \in A$, which connects node $v_i$ to node $v_j$ at time $t$, has an associated cost $c^t_{ij}$ (ticket cost) and an associated processing time $p^t_{ij}$ (flight duration). Both depend on the connected nodes, as well as on the time at which the arc transition is initiated, and both are non-negative: $\forall a^t_{ij} \in A$, $c^t_{ij} \ge 0$ and $p^t_{ij} \ge 0$.
A valid solution $s$ to the FTP is a set of arcs (commercial flights) which start from node $v_0$ during the defined start period, visit every node $v_i$ in $V$ during its defined time-window $TW(v_i)$, and respect the duration of stay in each node, defined by $D(v_i)$, before finally returning to node $v_{n+1}$. The set of all valid solutions is given by $S$. The goal of the FTP is to find the global minimum $s^* \in S$ with respect to the considered objective function.
Figure 3.1: Illustration of a Flying Tourist Problem using a multipartite graph. Each node (A,B,C) has an associated waiting period of, respectively, (1,2,3) time units. The red arrows represent a possible solution to the problem.

The objective function associated with this problem depends on the user's criteria. While some users might consider the total cost to be the most important factor, others may consider the total flight duration to be of crucial importance. Thus, three different objective functions are considered herein: (i) the expended cost (see eq. 3.1), (ii) the flight duration (see eq. 3.2), and (iii) the resulting entropy (see eq. 3.3), where the latter corresponds to a weighted sum of the former two.
\[
F_c(s) = \sum_{n=0}^{N+1} c(s[n]) \tag{3.1}
\]
\[
F_t(s) = \sum_{n=0}^{N+1} p(s[n]) \tag{3.2}
\]
\[
F_e(s) = \sum_{n=0}^{N+1} \bigl( w_c \, c(s[n]) + w_p \, p(s[n]) \bigr) \tag{3.3}
\]
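The three objective functions can be sketched in Python as follows. This is only an illustrative sketch: the `Arc` record and its field names are assumptions introduced here, and each function simply sums over every arc of the solution instead of using explicit index limits.

```python
from typing import NamedTuple, Sequence


class Arc(NamedTuple):
    """One flight of a solution (hypothetical record, field names are illustrative)."""
    origin: str
    destination: str
    t: int           # time layer at which the arc is traversed
    cost: float      # ticket cost, c
    duration: float  # flight duration, p


def total_cost(s: Sequence[Arc]) -> float:
    """Eq. 3.1: sum of the ticket costs of the arcs in the solution."""
    return sum(arc.cost for arc in s)


def total_duration(s: Sequence[Arc]) -> float:
    """Eq. 3.2: sum of the flight durations of the arcs in the solution."""
    return sum(arc.duration for arc in s)


def entropy(s: Sequence[Arc], w_c: float, w_p: float) -> float:
    """Eq. 3.3: weighted sum of cost and duration over the arcs in the solution."""
    return sum(w_c * arc.cost + w_p * arc.duration for arc in s)
```

Setting `w_p = 0` in `entropy` reduces it to a scaled version of the cost objective, and `w_c = 0` to a scaled version of the duration objective.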
Figure 3.1 illustrates the multipartite graph associated with a simple instance of the FTP with $v_{n+1} = v_0 = X$, one possible start date ($t = 0$), 3 nodes to visit (A,B,C), with fixed durations of, respectively, (1,2,3) time-units, and no constraints on the time-window of each city. A possible solution to this problem instance corresponds to the set of arcs $(a^0_{X,B}, a^2_{B,A}, a^3_{A,C}, a^6_{C,X})$.
Despite the apparent complexity of the proposed definition, it can be used to state very simple flight searches, including one-way and round-trip flights. For example, the problem of finding a single flight from A to B on date T can be instantiated as an FTP given by $v_0 = A$, $v_{n+1} = B$, $T_0 = T$, and $V = D = TW = \emptyset$. In turn, a round-trip flight involving the same two cities and the same start date, in which the staying period in B is $b$ days, is given by $v_0 = v_{n+1} = A$, $T_0 = T$, $V = \{B\}$, $D = \{b\}$ and $TW = \emptyset$. Thus, this definition is adequate for both simple and complex trips, which can be customized according to the user's search criteria by setting either an extended start period or flexible durations.
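To make these two instantiations concrete, the following Python sketch encodes a request as a small record. The `FTPRequest` class, its field names and the helper functions `one_way` and `round_trip` are hypothetical and only mirror the elements of the definition (start node, return node, start period, cities, durations and time-windows); dates are assumed to be integer day indices.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class FTPRequest:
    """Hypothetical container for the elements of an FTP request."""
    v0: str                                  # start node
    v_end: str                               # return node
    start_period: Tuple[int, int]            # (earliest, latest) start date
    cities: Tuple[str, ...] = ()             # set V of nodes to visit
    durations: Dict[str, Tuple[int, int]] = field(default_factory=dict)     # D
    time_windows: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # TW


def one_way(a: str, b: str, date: int) -> FTPRequest:
    """Single flight from a to b on a given date: V, D and TW are empty."""
    return FTPRequest(v0=a, v_end=b, start_period=(date, date))


def round_trip(a: str, b: str, date: int, nights: int) -> FTPRequest:
    """Round trip a -> b -> a, staying a fixed number of nights in b."""
    return FTPRequest(
        v0=a,
        v_end=a,
        start_period=(date, date),
        cities=(b,),
        durations={b: (nights, nights)},
    )
```

Widening `start_period` or giving a city a duration range such as `(2, 4)` would express the extended start period and the flexible durations mentioned above.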
3.2 Relation to the Traveling Salesman Problem
The Traveling Salesman Problem is the problem of, given a list of cities, finding the best route to visit them all, according to some objective function. In turn, the Flying Tourist Problem proposed in section 3.1 intends to find the best route, schedule, and set of flights to visit a given list of cities. This section explores the characteristics which distinguish the FTP from the TSP.
It is possible to reduce the Flying Tourist Problem to the Traveling Salesman Problem by considering the following set of restrictions:
1. $v_{n+1} = v_0$;
2. $T_0 = 0$;
3. $TW(v_i) = [0, +\infty[$, $\forall v_i \in V$;
4. $D(v_i) = 1$, $\forall v_i \in V$;
5. $c^t_{ij} = c_{ij}$, $\forall v_i, v_j \in V$, $\forall t$.
Constraint 1) operates over the depot, forcing the initial and final node to be the same. While this
constraint is not forced in the FTP, because a user might not necessarily want to finish the trip where it
was initiated, the TSP considers a single depot.
Constraint 2) is responsible for limiting the start-period to a single time-unit. In a real-world flight
search application, this constraint is extremely undesirable, as it reduces the overall quality of the search.
Constraint 3) removes the time-window constraints imposed on each city. This means that any city
may be visited during any time period.
Constraint 4) forces each city to be visited during exactly one time-unit. Once again, this constraint
is extremely undesirable in a flight search application, since in most cases, users do not want to spend
only one night in a destination.
Finally, constraint 5) removes the time-dependencies of the cost matrix. It should be noted that the
characteristics of commercial flights contradict this constraint.
Applying constraints (1-4) to the proposed FTP leads to the time-dependent Traveling Salesman
Problem [83], as defined in section 2.2.1.A.
Constraints (1-4) together with (5) reduce the proposed FTP into the asymmetric Traveling Salesman
Problem. In order to reduce it to the classical symmetric TSP, it would be necessary to apply an additional
constraint which would force the symmetry of the cost matrix.
Given that the FTP occurs as a generalization of the TSP, and given that the latter belongs to the
class of NP-hard problems [20], then so does the former one.
3.3 Graph representation
As described in section 3.1, an FTP instance is completely described by a structure $G = (V_c, A, T_0, D, TW)$.
Note that the previously referenced structure requires a set of arcs A connecting every pair of nodes
belonging to Vc. Thus, it is possible to represent this information in a weight matrix, where each of its
entries corresponds to a particular arc.
Since the presented formulation of the FTP aims to address a real-world situation involving
commercial flights, upon constructing the weight matrix, it is necessary to take the characteristics of
commercial flights into account. For any pair of cities, at any given moment in time, there are several
commercial flights that connect these two cities in that particular moment. These flights are denoted as a
family of arcs. Given that there is a direct mapping between commercial flights and FTP arcs, then there
are multiple arcs for each weight matrix entry. Moreover, a commercial flight connecting two cities, does
not have a constant price over time. Thus, the value of an arc connecting two cities, varies over time.
This means that a weight matrix of a FTP instance is three dimensional, with three variables i, j, t, where
i is the origin node, j the destination node, and t the moment in time at which the transition occurs.
In particular, it is possible to consider a weight matrix in which, for every entry, there are multiple
values, corresponding to the different possible arcs for that pair of nodes and time. Consequently,
accessing a particular arc requires not only the triplet (i, j, t), but also some information about which
particular arc to select.
Given that the dimensions of a FTP weight matrix are considerable, considering a family of arcs
for each weight matrix entry is not recommended, since it would increase the size of the weight matrix even more.
Instead, a pragmatic strategy is followed. Upon constructing a weight matrix for the FTP instance, the
objective function is taken into account. This means that, instead of considering that there are multiple
arcs for the triplet (i, j, t), it is considered that there is only one: the one that has the minimum value
according to the objective function. For example, if the objective function intends to minimize the total
trip cost, upon constructing the weight matrix, for each family of arcs (i, j, t), only the minimum cost arc
would be selected. By using this strategy there is only one available arc, for each cost matrix entry, and
it is the one which minimizes the objective function.
The price of a commercial flight usually depends on the cities, dates, and direction of traversal. This means that the weight matrix of the FTP is not necessarily symmetric. Moreover, there is also no guarantee that a commercial flight between two cities exists. In fact, there are many pairs of cities which do not have a direct flight connection. Fortunately, many commercial flight providers take this into consideration, and try to establish an indirect connection between any two cities by adding connecting flights. However, this is not always the case, and it may be necessary to initialize each entry of the weight matrix to a very high value, in order to discard these non-existent flights from a possible solution.
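The construction strategy described in this section (one arc per triplet, selected according to the objective function, with a very high default value for missing flights) can be sketched in Python as follows. The sketch makes several assumptions that are not part of the original formulation: flights are given as a dictionary keyed by `(i, j, t)`, each flight is a dictionary with hypothetical `"price"` and `"minutes"` fields, and `NO_FLIGHT` is an arbitrary sentinel.

```python
# Arbitrary "very high value" (an assumption); it still fits in a signed 32-bit integer.
NO_FLIGHT = 10**9


def build_weight_matrix(n_nodes, n_layers, flights, key):
    """Reduce each family of arcs (i, j, t) to its best arc under `key`.

    flights: dict mapping (i, j, t) to a list of flight records.
    key:     function returning the objective value of one flight.
    """
    # Every entry starts at the sentinel, so missing flights are never attractive.
    w = [[[NO_FLIGHT] * n_layers for _ in range(n_nodes)] for _ in range(n_nodes)]
    for (i, j, t), family in flights.items():
        if family:
            w[i][j][t] = min(key(flight) for flight in family)
    return w


# Usage: the same family of arcs yields different entries for different objectives.
flights = {(0, 1, 0): [{"price": 120, "minutes": 95}, {"price": 80, "minutes": 140}]}
by_cost = build_weight_matrix(2, 1, flights, key=lambda f: f["price"])
by_time = build_weight_matrix(2, 1, flights, key=lambda f: f["minutes"])
```

Note that the matrix must be rebuilt whenever the objective function changes, since the arc kept for each triplet depends on it.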
To conclude the analysis of the characteristics of the FTP weight matrix, it is worth mentioning that the matrix does not need to be complete, because not every arc is relevant for the construction of a solution. While it is necessary to have every arc connecting each pair of nodes belonging to the set of nodes to be visited, it is not necessary to have every arc connecting the initial and final nodes to the others. In reality, the arcs leaving the initial node and the arcs returning to the final node are only necessary at particular moments in time. To better understand this, consider figure 3.2, which illustrates the arcs necessary to construct a solution to an FTP instance. Every arc of an FTP instance can be classified into one of three groups, according to its characteristics: initial, transition and final arcs.
Figure 3.2: Illustration of the distribution of the initial, final and transition arcs.
The initial arcs are those which might initiate the trip. Consequently, they must start at node $v_0$, at $t \in T_0 = [T_{0m}, T_{0M}]$, connecting $v_0$ to every node in $V$. In turn, the final arcs are those which connect every node in $V$ to the return node, $v_{n+1}$, at $t \in [T_{fm}, T_{fM}]$, where $T_{fm} = T_{0m} + \sum D$ and $T_{fM} = T_{0M} + \sum D$, and $\sum D$ is the total trip duration, calculated by adding the number of nights in every destination. There are a total of $k_i = T_{0M} - T_{0m} + 1 = T_{fM} - T_{fm} + 1$ initial arc layers, and the same number of final arc layers. In the example depicted in Figure 3.1, there is a single initial layer and a single final layer, since there is only one possible start date.
The transition arcs are those which fully connect the $N$ nodes belonging to $V$. In the example presented in figure 3.2, the earliest transition arc occurs at a time no sooner than $t_1 = T_{0m} + \min(D)$, where $\min(D)$ corresponds to the lowest value of the set of durations. Hence, if the trip starts by traversing an initial arc at time $T_{0m}$, the first transition arc can only be traversed $\min(D)$ time-units later. By following a similar approach, the latest transition arc can occur no later than $t_2 = T_{0M} + \sum D - \min(D)$. Thus, there are a total of $k_2 = t_2 - t_1 + 1$ transition layers, and $k_2 \cdot n \cdot (n - 1)$ transition arcs.
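The layer counts above can be checked with a short Python sketch. The function name `arc_layers` and the returned dictionary keys are illustrative assumptions, and fixed durations are assumed, as in the example of Figure 3.1.

```python
def arc_layers(t0m, t0M, durations):
    """Compute the layer bounds and counts for initial, final and transition arcs."""
    n = len(durations)
    total = sum(durations)          # total trip duration, sum of D
    d_min = min(durations)          # min(D)
    k_i = t0M - t0m + 1             # number of initial (and final) layers
    t_fm, t_fM = t0m + total, t0M + total
    t1 = t0m + d_min                # earliest transition arc
    t2 = t0M + total - d_min        # latest transition arc
    k2 = t2 - t1 + 1                # number of transition layers
    return {
        "k_i": k_i,
        "final_period": (t_fm, t_fM),
        "t1": t1,
        "t2": t2,
        "k2": k2,
        "transition_arcs": k2 * n * (n - 1),
    }


# Instance of Figure 3.1: single start date t = 0, durations (1, 2, 3).
layers = arc_layers(0, 0, [1, 2, 3])
```

For this instance the final arcs are only needed at $t = 6$, which agrees with the arc $a^6_{C,X}$ of the example solution, and there are 5 transition layers with 30 transition arcs in total.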
The union of the initial, transition and final arcs gives the set A of all the arcs, which may be used to
construct a solution to the requested trip.
3.4 Dimensional overview
This section presents a brief analysis of the weight matrix dimensions of an FTP instance, in particu-
lar, the number of entries and its size in memory. It also includes an overview of the solution space
associated to a FTP request.
The FTP is described by a three-dimensional array (weight matrix) with shape $(|V_c|, |V_c|, |T_0| + \sum D)$, where $V_c$ is the complete set of nodes (note that $|V_c| = n + 2$), $|T_0|$ is the length of the start period, and $\sum D$ is the total trip duration. The total number of entries ($n_e$) of the weight matrix is given by $n_e = (n + 2)^2 \times (|T_0| + \sum D)$.
Consequently, the total number of entries of the weight matrix depends mostly on the number of cities to visit, since the relation is quadratic. Furthermore, it is always true that $1 \le |T_0| + \sum D \le 365$, since commercial flights are not sold more than one year in advance. Thus, it is possible to simplify the expression for the number of entries of a weight matrix by considering $n_e = k(n + 2)^2$, where $k = |T_0| + \sum D$.
Each entry of the weight matrix is represented with a 32-bit integer and thus, the size of the weight matrix in memory ($m_{graph}$) is given by $m_{graph} = 4 \, n_e$ (bytes).
When considering a TSP instance with $n$ cities to visit, there are $n!$ possible routes. The goal of the FTP is to find both the best route and the best schedule for a trip. Note that in an FTP instance, the length of the start period is given by $|T_0|$, and thus there are at most $|T_0|$ different possible schedules for each route. Thus, the size of the solution space of an FTP is given by $|S| = |T_0| \cdot n!$.
From the above, it is possible to conclude that the size of the search space of an FTP instance is
closely related to that of the TSP. In particular, if there is a single start date, the size is the same. As the
length of the start period increases, so does the search space. This increase is always linear, and
it is usually below 3 orders of magnitude.
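As a worked illustration of these expressions, the following Python sketch evaluates the number of weight matrix entries, the memory footprint and the size of the solution space. The function names and the sample request (10 cities, a start period of 7 days and a total stay of 30 nights) are assumptions chosen only for the example.

```python
import math


def weight_matrix_entries(n, t0_len, total_duration):
    """Number of entries: (n + 2)^2 * (length of start period + total duration)."""
    return (n + 2) ** 2 * (t0_len + total_duration)


def weight_matrix_bytes(n_entries):
    """Memory size, assuming one 32-bit integer (4 bytes) per entry."""
    return 4 * n_entries


def solution_space_size(n, t0_len):
    """Size of the solution space: (length of start period) * n!."""
    return t0_len * math.factorial(n)


# Hypothetical request: 10 cities, 7 possible start dates, 30 nights in total.
n_e = weight_matrix_entries(10, 7, 30)
m_graph = weight_matrix_bytes(n_e)
size_s = solution_space_size(10, 7)
```

For this request the weight matrix has 5328 entries (about 21 kB), whereas the solution space already contains 25 401 600 candidate solutions, which illustrates that the search, and not the storage, is the limiting factor.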
Note that the figures presented above only hold after reducing each family of arcs to a single arc, according to the objective function, as described in the previous section. Prior to this, instead of having a single 32-bit integer representing a weight matrix entry, there is a structured data object (JSON) with flight information. In general, each flight occupies roughly 3500 bytes, and there are multiple flights (≈1-100) for each pair of cities, and for each date.
3.5 Optimization methodology
This section introduces the optimization algorithms considered to produce a solution to a Flying Tourist Problem, as defined in section 3.1. Given the real-world application under development, and its goals, the devised optimization system should be capable of providing a stream of responses in finite time. Due to these objectives, the considered optimization strategies are based on heuristic algorithms (subsection 3.5.1) and metaheuristic algorithms (subsection 3.5.2), as the characteristics of these algorithms fit the goals of the system.
3.5.1 Heuristic algorithms
3.5.1.A Pseudo-random construction procedure
Formally, the method introduced in this subsection is not an optimization algorithm, but rather a solution
construction procedure for the Flying Tourist Problem. This procedure is relevant for two reasons. First,
it can be used to construct a preliminary solution to a given request in a very fast manner. Naturally,
the quality of this solution may be extremely poor, but it is very useful as an initial and fast response
to a user request. Furthermore, this construction procedure is also relevant, because some optimiza-
tion algorithms, like the Simulated Annealing (discussed in section 3.5.2.B), require an initial valid and
complete solution.
The method introduced below will be hereinafter called the pseudo-random construction procedure. It requires an instance of the Flying Tourist Problem $G = (V_c, A, T_0, D, TW)$ where there are no restrictions regarding the time-windows, that is, $TW(i) = [0, +\infty[$, $\forall i \in V$. The steps of this procedure can be summarized as follows:
1. set an initial empty solution: $s = ()$;
2. set the current time to one of the possible start dates: $t \in T_0 = [T_{0m}, T_{0M}]$;
3. set the current node to the start node: $v_c = v_0$;
4. if the set of nodes to visit, $V$, is empty, go to step 11); else, continue to step 5);
5. select the next node by choosing a random node from the set of nodes to visit: $v_i \in V$;
6. remove the selected node from the set of nodes to visit: $V = V \setminus \{v_i\}$;
7. extend the solution with the arc $a^t_{v_c,v_i}$;
8. increment the time according to the duration of visit of the selected node: $t = t + d(v_i)$;
9. update the current node: $v_c = v_i$;
10. go to step 4);
11. extend the solution with the final arc $a^t_{v_c,v_{n+1}}$.
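The eleven steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: arcs are represented as `(origin, destination, t)` tuples, durations are fixed and given as a dictionary, and the function and parameter names are hypothetical.

```python
import random


def pseudo_random_construction(v0, v_end, cities, durations, start_dates, rng=None):
    """Build a valid, complete solution by visiting the cities in random order."""
    rng = rng or random.Random()
    solution = []                            # step 1: empty solution
    t = rng.choice(list(start_dates))        # step 2: pick a start date
    current = v0                             # step 3: start node
    to_visit = list(cities)
    while to_visit:                          # steps 4 and 10: loop until V is empty
        nxt = rng.choice(to_visit)           # step 5: random next node
        to_visit.remove(nxt)                 # step 6: remove it from V
        solution.append((current, nxt, t))   # step 7: extend with the arc
        t += durations[nxt]                  # step 8: advance time by the stay
        current = nxt                        # step 9: update current node
    solution.append((current, v_end, t))     # step 11: final arc
    return solution


# Instance of Figure 3.1: start and return node X, single start date t = 0.
sol = pseudo_random_construction(
    "X", "X", ["A", "B", "C"], {"A": 1, "B": 2, "C": 3}, [0], random.Random(1)
)
```

Whatever order is drawn, the final arc of this instance is always traversed at $t = 6$, since the total duration does not depend on the order of the visits.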
The proposed pseudo-random construction procedure is expected to be adequate for single flights and round-trips, as well as small multi-city requests. As the number of cities to visit increases, the quality of the solutions produced by this procedure is expected to fall. In any case, regardless of the size of the instance being solved, this procedure is expected to be very fast and able to return a solution to a request in a very short time.
3.5.1.B Nearest neighbour procedure
The nearest neighbour is a solution construction procedure that starts with an initial empty solution and, at each step of the algorithm, updates the current solution by extending it with a solution component (i.e., an arc). Thus, this construction procedure is very similar to the one described in subsection 3.5.1.A. However, while the previous construction procedure selects the next node to visit in a pseudo-random way, the nearest neighbour heuristic takes a different approach, selecting the next node according to some particular objective.
During the development of this work, two different nearest neighbour heuristics were used. The first only takes into account the distance between nodes, always visiting the node closest to the current one. This is exactly the nearest neighbour procedure applied to the Traveling Salesman Problem. The second approach, instead of considering the distance between nodes, considers the targeted objective function. That is, if the objective is to minimize the total cost, then this heuristic will always select the node reached by the minimum cost arc. In turn, if the objective is to minimize the flight time, or any other criterion, then it is this criterion that is used when selecting a node to visit, always choosing the node which minimizes the increase in the current objective function.
In order to distinguish the two nearest neighbour heuristics, we will denote them as dNN and rNN,
that is, distance nearest neighbour and refined nearest neighbour, respectively. Note that the former only
takes into account the distance between nodes, and not directly the objective function, while the latter
takes into account the objective function, and completely disregards the distance between nodes.
The nearest neighbour construction procedure can be adapted from the previously introduced pseudo-random construction procedure by simply replacing construction step 5). Thus, the distance nearest neighbour considers:
• select the next node by choosing the one closest to the current node:
$v_i \in V : d(v_c, v_i) \le d(v_c, v_j), \ \forall v_j \in V \setminus \{v_i\}$
while the refined nearest neighbour considers:
• select the next node by choosing the one which increases the objective function the least:
$v_i \in V : f(v_c, v_i) \le f(v_c, v_j), \ \forall v_j \in V \setminus \{v_i\}$
It is worth noting that the distance and the refined nearest neighbour heuristics require different levels of information. On the one hand, the distance nearest neighbour requires knowledge of the distances between each pair of cities. On the other hand, the refined nearest neighbour requires a complete weight matrix with respect to the objective function.
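Both variants can be sketched with a single Python function in which only the selection criterion changes. The sketch below is illustrative and rests on assumptions: `weight(i, j, t)` is a hypothetical callable returning the value used to rank the arc from `i` to `j` at time `t`, arcs are `(origin, destination, t)` tuples, and the sample prices and distances are invented for the example.

```python
def nearest_neighbour(v0, v_end, cities, durations, t_start, weight):
    """Greedy construction: always extend with the arc of minimum weight."""
    solution, t, current = [], t_start, v0
    to_visit = list(cities)
    while to_visit:
        # Replaces the random choice of the pseudo-random construction procedure.
        nxt = min(to_visit, key=lambda v: weight(current, v, t))
        to_visit.remove(nxt)
        solution.append((current, nxt, t))
        t += durations[nxt]
        current = nxt
    solution.append((current, v_end, t))
    return solution


# rNN: the weight is the time-dependent objective value (here, invented prices).
prices = {("X", "A", 0): 5, ("X", "B", 0): 3, ("B", "A", 1): 7, ("A", "X", 2): 4}
rnn = nearest_neighbour(
    "X", "X", ["A", "B"], {"A": 1, "B": 1}, 0,
    lambda i, j, t: prices.get((i, j, t), 10**9),
)

# dNN: the weight is a time-independent distance between the two cities.
dist = {frozenset(("X", "A")): 1, frozenset(("X", "B")): 9, frozenset(("A", "B")): 4}
dnn = nearest_neighbour(
    "X", "X", ["A", "B"], {"A": 1, "B": 1}, 0,
    lambda i, j, t: dist[frozenset((i, j))],
)
```

In this small example the two heuristics produce different routes: rNN flies to B first because that ticket is cheaper, while dNN flies to A first because it is closer.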
3.5.2 Metaheuristic algorithms
3.5.2.A Ant Colony Optimization procedure
The considered Ant Colony Optimization (ACO) algorithm receives, as input, a weight matrix with the information regarding all solution components of the problem. It also receives other relevant parameters for the solution construction process, such as the initial and final nodes, $v_0$ and $v_f$, the start period $T_0$, and the set of waiting periods $D$.
The initialization of the ACO metaheuristic requires the construction of an initial pheromone matrix.
Each entry of this matrix is set to an initial pheromone value, according to Eq. 3.4, where n is the number
of nodes and Cnn is the cost of the nearest neighbor heuristic [84].
\[
\tau^t_{ij} = \tau_0 = \frac{1}{n \, C_{nn}} \tag{3.4}
\]
The initialization of the metaheuristic also requires the definition of a variety of algorithm-specific
parameters, such as the number of ants m, the pheromone evaporation rate ρ, the heuristic relative
influence β, the pheromone relative influence α, and the exploration rate Q0.
After the initialization, and until the termination condition is met, the algorithm enters an iterative
cycle, where every ant belonging to the colony constructs a solution to the problem. This is followed by
a pheromone update phase, to reflect the colony search experience. A new iteration may only start after
all ants have finished the solution construction process and the pheromone matrix has been updated.
The construction process undertaken by each ant is as follows. First, the current time is set to a
value belonging to the allowable trip start dates, t ∈ T0, and the current node is set to the start node
v0. Each ant enters an iterative cycle until all nodes belonging to the set of nodes to visit, V , are
visited exactly once. At every step of this cycle, an ant chooses one of the remaining valid solution
components. After the selection of a solution component, the current time is updated, according to the
duration of the selected city. Furthermore, it is also necessary to apply a local pheromone update (see
equation 3.8), after the selection of each solution component, so as to reduce the probability of other ants
selecting the same one in the current iteration [84]. By following this iterative construction process, a
valid but incomplete solution is found. To complete this solution, it is necessary to add an extra solution
component, which closes the route by adding the return node, vn+1.
In the construction process described above, each ant selects the next solution component by either
exploiting or exploring the search space. That is, exploitation is the process of selecting the next solution
component mostly based on the previous ants’ search experience, while exploration intends to diversify
the traversed search space. The decision of exploiting or exploring depends on the algorithm parameter
Q0 and on a pseudo-random value q, calculated at run-time. The selection of the solution component
vj , which identifies the next city to be visited, is given by Eq. 3.5.
\[
v_j =
\begin{cases}
\text{exploitation (Eq. 3.6)}, & \text{if } q \le Q_0 \\
\text{exploration (Eq. 3.7)}, & \text{otherwise}
\end{cases}
\tag{3.5}
\]
The exploitation of the search space uses the pseudorandom proportional rule, defined by Eq. 3.6, which determines the next solution component of the ant's solution. The term $J_k(i, t)$ represents the set of solution components that may be selected by an ant in its current state while still forming a valid solution, where the state refers to the current position of the ant and the trip it has constructed so far. In the presented equations, $\tau$ is the pheromone value and $\eta$ is the inverse of the weight matrix value.
\[
\arg\max_{j \in J_k(i,t)} \; [\tau(i, j, t)] \, [\eta(i, j, t)]^{\beta} \tag{3.6}
\]
On the other hand, the exploration is given by Eq. 3.7, with $p_a(i, j, t)$ representing the probability that
ant $a$ (which is currently at node $i$ at time $t$) selects $j$ as the next node to visit.
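\[
p_a(i, j, t) =
\begin{cases}
\dfrac{[\tau(i, j, t)] \, [\eta(i, j, t)]^{\beta}}{\sum_{u \in J_k(i,t)} [\tau(i, u, t)] \, [\eta(i, u, t)]^{\beta}}, & \text{if } j \in J_k(i, t) \\
0, & \text{otherwise}
\end{cases}
\tag{3.7}
\]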
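The selection rule of Eqs. 3.5 to 3.7 can be sketched in Python as follows. This is an illustrative sketch only: the function name `select_next`, the use of dictionaries keyed by `(i, j, t)` for the pheromone and heuristic values, and the sample numbers are assumptions.

```python
import random


def select_next(i, t, candidates, tau, eta, beta, q0, rng):
    """Choose the next node j from the candidate set, given node i and time t."""
    # Attractiveness of each candidate: [tau] * [eta]^beta.
    scores = {j: tau[(i, j, t)] * eta[(i, j, t)] ** beta for j in candidates}
    if rng.random() <= q0:
        # Exploitation (Eq. 3.6): take the most attractive candidate.
        return max(scores, key=scores.get)
    # Exploration (Eq. 3.7): sample proportionally to the attractiveness.
    nodes = list(scores)
    return rng.choices(nodes, weights=[scores[j] for j in nodes], k=1)[0]


# Invented values: equal pheromone, B is reached by a cheaper arc than C.
tau = {("A", "B", 0): 1.0, ("A", "C", 0): 1.0}
eta = {("A", "B", 0): 0.5, ("A", "C", 0): 0.25}
greedy = select_next("A", 0, ["B", "C"], tau, eta, beta=2, q0=1.0, rng=random.Random(0))
sampled = select_next("A", 0, ["B", "C"], tau, eta, beta=2, q0=0.0, rng=random.Random(0))
```

With `q0 = 1.0` the rule is purely greedy, while with `q0 = 0.0` every choice is sampled, so intermediate values of $Q_0$ trade exploitation against exploration.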
After each ant finishes its iterative solution construction process, the ACO metaheuristic enters its pheromone update step. The pheromone update may vary depending on the chosen ACO algorithm. This work follows the Ant Colony System (ACS) strategy, whose pheromone update requires both a deposit and an evaporation step. Unlike in many other ACO algorithms, the pheromone update applies only to the arcs belonging to the best solution found so far, $S_{bs}$. This results in the update of the pheromone values by means of Eq. 3.9, where $(\Delta\tau^t_{ij})_{bs}$ is given by $1/C_{bs}$, and $C_{bs}$ represents the objective function value of the best solution.
\[
\tau^t_{ij} = (1 - \rho) \, \tau^t_{ij} + \rho \, \tau_0 \tag{3.8}
\]
\[
\tau^t_{ij} = (1 - \rho) \, \tau^t_{ij} + \rho \, (\Delta\tau^t_{ij})_{bs} \tag{3.9}
\]
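The pheromone initialization (Eq. 3.4) and the two update rules (Eqs. 3.8 and 3.9) can be sketched in Python as follows. The function names and the representation of the pheromone matrix as a dictionary keyed by `(i, j, t)` are assumptions made for the example; the same evaporation rate $\rho$ is used in both rules, as in the equations above.

```python
def initial_pheromone(n, c_nn):
    """Eq. 3.4: tau_0 = 1 / (n * C_nn), with C_nn the nearest neighbour cost."""
    return 1.0 / (n * c_nn)


def local_update(tau, arc, rho, tau0):
    """Eq. 3.8: applied to an arc right after an ant selects it."""
    tau[arc] = (1 - rho) * tau[arc] + rho * tau0


def global_update(tau, best_solution, best_cost, rho):
    """Eq. 3.9: applied only to the arcs of the best solution found so far."""
    deposit = 1.0 / best_cost
    for arc in best_solution:
        tau[arc] = (1 - rho) * tau[arc] + rho * deposit


# Invented values, chosen so that the arithmetic is easy to follow.
tau = {("X", "A", 0): 1.0, ("A", "X", 1): 1.0}
local_update(tau, ("X", "A", 0), rho=0.5, tau0=0.0)       # 0.5 * 1.0 + 0.5 * 0.0
global_update(tau, [("A", "X", 1)], best_cost=2.0, rho=0.5)  # 0.5 * 1.0 + 0.5 * 0.5
```

The local update pulls the value of a selected arc towards $\tau_0$, which makes it less attractive to the following ants, while the global update pulls the arcs of the best solution towards $1/C_{bs}$.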
It is common (and often recommended) to combine ACO algorithms with local search heuristics, also denoted daemon actions, that try to improve the quality of the constructed solutions after each of the ants' iterative cycles. However, this was not applied to the proposed optimization, due to the nonexistence of adequate local search procedures for the time-dependent TSP. In fact, even the k-opt exchange procedures, widely used as local search in the classical TSP, are not efficient for the time-dependent TSP, because they require, at each step, the computation of the entire trip cost, as opposed to just the cost difference regarding the k arcs, as in the symmetric TSP.
3.5.2.B Simulated Annealing procedure
The considered Simulated Annealing algorithm receives, as input, a weight matrix with the information of the solution components of the problem. It must also receive other relevant parameters for the solution construction process, such as the initial and final nodes and the set of waiting periods D. This specific information about the instance under optimization is crucial, as it enables the validation of a solution and the calculation of the respective objective function value.
The general procedure of the Simulated Annealing metaheuristic is as follows. Given an initial so-
lution (x), at each step of the inner cycle (also called Markov chain), a new candidate solution (y) is
constructed based on a neighbourhood function, which is usually problem specific. Depending on the
quality of the new solution, and the current temperature of the algorithm, this solution may or may not be
accepted. This process of constructing and conditionally accepting a new solution occurs a fixed number
of times per outer cycle - this number is referred to as the Markov chain length. Having completed one
Markov chain, the temperature of the state is decreased, according to a predefined cooling schedule.
Given the above described procedure, the Simulated Annealing metaheuristic requires:
1. an initial solution - over which the local search operates;
2. a neighbourhood function - used to construct candidate solutions;
3. an acceptance criterion - used to conditionally accept the candidate solutions;
4. a cooling schedule - to decrease the temperature of the state.
As a local search metaheuristic, Simulated Annealing requires an initial solution. It is
possible to construct this initial solution by applying the pseudo-random construction procedure (section
3.5.1.A), or by using the nearest neighbour (section 3.5.1.B). In general, the quality of the initial solution
does not affect the quality of the best solution found by the algorithm [85].
The neighbourhood function selected for the generation of new candidate solutions is the 2-opt swap procedure. Hence, at each iteration step of the Markov chain, it selects two random nodes and swaps the corresponding path. It is also necessary to take into account both the initial and final nodes in order to produce a valid solution. Since this swapping procedure may change the dates on which each node is visited, which consequently changes the solution arcs, it is necessary to adjust the flight dates and recalculate the objective function value of the candidate solution.
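The following Python sketch illustrates this neighbourhood function together with the re-evaluation of the candidate. It relies on assumptions that go beyond the text: a solution is stored as the ordered list of visited cities (so the initial and final nodes stay fixed), the swap is implemented as the usual 2-opt segment reversal, durations are fixed, and `weight(i, j, t)` is a hypothetical callable returning the value of the arc from `i` to `j` at time `t`.

```python
import random


def two_opt_swap(order, rng):
    """Reverse the path between two randomly selected positions (needs >= 2 cities)."""
    i, j = sorted(rng.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]


def evaluate(order, v0, v_end, t_start, durations, weight):
    """Recompute the flight dates and the objective value of a candidate."""
    t, current, total = t_start, v0, 0
    for city in order:
        total += weight(current, city, t)   # arc taken at the adjusted date
        t += durations[city]                # the stay shifts all later dates
        current = city
    return total + weight(current, v_end, t)


# Invented time-dependent weight: every flight costs 10 plus the departure day.
durations = {"A": 1, "B": 2, "C": 3}
price = lambda i, j, t: 10 + t
cost_abc = evaluate(["A", "B", "C"], "X", "X", 0, durations, price)
cost_bac = evaluate(["B", "A", "C"], "X", "X", 0, durations, price)
candidate = two_opt_swap(["A", "B", "C", "D"], random.Random(3))
```

In this example the two orders use flights on days (0, 1, 3, 6) and (0, 2, 3, 6), respectively, so their objective values differ even though the price rule is the same for every pair of cities. This is why the whole trip has to be re-evaluated after each swap.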
The acceptance criterion is a function that determines the probability of accepting a candidate solution.
The developed SA algorithm uses the Metropolis acceptance criterion [86], presented in equation 3.10,
which can be summarized as follows: i) if a candidate solution y is better than the current solution x, it
is always accepted; ii) if a candidate solution is worse, it may or may not be accepted.
\[
p_y =
\begin{cases}
1, & \text{if } f(y) \le f(x), \\
e^{-\Delta f / t}, & \text{otherwise}
\end{cases}
\tag{3.10}
\]
The probability with which a worse solution is accepted depends upon the difference in the objective
function values (∆f) of the two solutions and the current temperature of the system (Eq. 3.10). As ∆f
increases, and as the temperature decreases, the probability of accepting a worse solution is reduced.
With such an approach, the Metropolis acceptance criterion allows up-hill moves, which enable the
algorithm to escape from local minima. Notwithstanding, as the temperature reaches very low values, the
algorithm becomes increasingly greedy.
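The Metropolis criterion of Eq. 3.10 can be sketched in a few lines of Python. The function names and the use of Python's `random` module are illustrative assumptions, not the implementation used in this work; the temperature `t` is assumed to be strictly positive.

```python
import math
import random


def acceptance_probability(f_x, f_y, t):
    """Probability of accepting a candidate of cost f_y over the current cost f_x."""
    if f_y <= f_x:
        return 1.0  # candidates that are no worse are always accepted
    return math.exp(-(f_y - f_x) / t)  # worse candidates: exp(-delta_f / t)


def accept(f_x, f_y, t, rng=random):
    """Draw a uniform number in [0, 1) and compare it with the probability."""
    return rng.random() < acceptance_probability(f_x, f_y, t)
```

For a fixed ∆f, lowering `t` lowers the returned probability, which is the increasingly greedy behaviour mentioned above.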
The developed SA optimization uses a geometric cooling schedule. It starts with an initial tempera-
ture t0, and at each outer iteration, the temperature is decreased, using equation 3.11, where k is the
iteration counter of the outer loop and λ is the cooling parameter.
\[
t_{k+1} = \lambda \, t_k \tag{3.11}
\]
The cooling schedule parameters t0, tf and λ must be calculated beforehand based on the prob-
ability of accepting a worse solution during the first iteration (p0) and during the last iteration (pf ), and
on the total number of outer iterations (k). The defined algorithm establishes p0 as 0.98 and pf as a
positive value close to zero. The total number of iterations is set according to the time available for the
optimization process, and the length of the Markov chain (M ) is set to the number of nodes m.
To calculate the values of t0 and tf, the algorithm starts by generating some candidate solutions
using the neighbourhood function and the current solution x [87]. These candidate solutions are used to
calculate the average absolute difference in the objective function, ∆avg. This allows the calculation of
the t0 value according to Eq. 3.12, based on the Metropolis criterion. The final temperature tf follows from
the same criterion applied to pf, that is, tf = −∆avg / ln(pf), and is also given by tf = λ^k t0. Combining
both expressions allows the calculation of λ with Eq. 3.13. Given t0, tf and λ, the geometric cooling
schedule is completely defined.
\[
t_0 = \frac{-\Delta_{avg}}{\ln(p_0)} \tag{3.12}
\]
\[
\lambda = \left( \frac{-\Delta_{avg}}{\ln(p_f)\, t_0} \right)^{1/k} \tag{3.13}
\]
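The calculation of the cooling parameters can be sketched in Python as follows. The function name `cooling_parameters` is hypothetical, and the values used in the usage note (∆avg = 50, pf = 0.01, k = 100) are assumptions chosen only for illustration; p0 = 0.98 is the value stated in the text.

```python
import math


def cooling_parameters(delta_avg, p0, pf, k):
    """Return (t0, tf, lam) for a geometric schedule with k outer iterations."""
    t0 = -delta_avg / math.log(p0)  # Eq. 3.12
    tf = -delta_avg / math.log(pf)  # Metropolis criterion applied to pf
    lam = (tf / t0) ** (1.0 / k)    # Eq. 3.13, because tf = lam**k * t0
    return t0, tf, lam
```

A call such as `cooling_parameters(50.0, 0.98, 0.01, 100)` returns a high initial temperature, a much lower final temperature and a cooling parameter between 0 and 1, so that applying Eq. 3.11 one hundred times takes the temperature from t0 to tf.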
3.6 Summary
This chapter enabled the definition of a problem (FTP) which is adequate for the characterization of both
simple and complex trips. The instantiation of this problem requires a weight matrix where each of its en-
tries corresponds to a single flight. This problem was shown to be a generalization of the TSP and thus,
it belongs to the class of NP-hard problems. Due to its computational complexity, the methodologies
proposed to solve the problem rely on heuristic and metaheuristic optimization algorithms.
4 System Design and Implementation
Contents
4.1 System Architecture
4.2 Underlying technologies
4.3 Client Side Application implementation
4.4 Server Side Application implementation
4.5 Optimization system
4.6 Summary
While chapter 3 presented a formal definition of the problem and proposed some methodologies
to solve it, this chapter addresses the design choices and implementation details related to the devel-
opment of a flight search application to solve unconstrained multi-city flight requests. It presents an
overview of the architecture (section 4.1), structure and design choices (subsections 4.1.1 and 4.1.2).
It also presents the implementation details of the developed application. In particular, it describes the
underlying technologies (section 4.2), the structure of the CSA (section 4.3) as well as the current
user interface (subsection 4.3.3), and the SSA API and DMS (section 4.4).
4.1 System Architecture
The intended web application should allow users to search for the best schedule, route and set of flights, for
both one-way and round-trip flights, as well as unconstrained multi-city trips. During the formal definition
of the problem, in section 3.1, it was shown that these problems can be converted into an FTP instance,
which is a generalization of the TSP. Because of this, as the number of cities increases, the problem
becomes increasingly difficult to solve. In order to cope with this, each user defined request is
solved using the optimization algorithms previously presented in section 3.5.
The proposed system is structured into two separate applications: the Client-side application (CSA)
and the Server-side application (SSA). The client side application is designed solely to interact with
the user, redirecting requests to the server side application, whose goal is to solve these requests.
Thus, there is a complete separation of concerns between both applications: the CSA serves only as
an input/output port, and the application logic and intelligence are handled by the SSA. The communication
between both applications relies on the Hypertext Transfer Protocol (HTTP) and on Asynchronous
JavaScript and XML (AJAX). This means that the CSA may request data from the SSA using simple
HTTP requests, and that this communication is asynchronous, allowing the user to continue to interact
with the application, even while the response is being prepared. The structure of the proposed application,
and the data-flow associated to the resolution of a user defined request, are presented in figure
4.1.
Figure 4.1: Structure and data flow of the proposed application.
4.1.1 Client Side Application
The CSA is a web application designed to interact with the user, allowing the definition of flight requests
and the presentation of meaningful solutions. These requests are not processed directly
by the CSA, as this would consume too many resources (CPU and RAM) of the user’s device, and are,
instead, redirected to the SSA to be solved. Thus, the CSA is intended to be only an input/output port
between the user and the SSA.
Each user defined request is constructed in such a way that it can be used to instantiate an FTP, as
defined in section 3.1. Each FTP request is a specific resource, uniquely identified by a particular Uniform
Resource Identifier (URI). This will be further detailed in section 4.4.2. Each of these resources is
used to request a solution from the server side application. Following this convention,
the user interface must collect the following information from the user:
• the start city and return city, v0 and vn+1, respectively;
• a list of cities to be visited V , and the durations D associated to each;
• the start dates T0 associated to the trip.
Upon receiving a solution to a user request, the User Interface (UI) must be updated, displaying the
relevant information of these solutions. Each solution to a request is an object which contains at least
one set of flights that satisfy the user defined itinerary. However, a response to a request should contain
several valid solutions, so that the user might choose the most adequate to his needs. Each solution
is composed of one or several flights, and each of the presented flights should contain, at least, the
following information:
• the flight cost;
• the flight duration;
• the date, departure and arrival time;
• the number of layover flights.
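The information listed above can be captured with a few plain data structures. The following Python sketch is only one possible representation: the class and field names are hypothetical, durations are assumed to be expressed in days, and the flight duration is derived from the departure and arrival times without considering time zones.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, timedelta
from typing import List


@dataclass
class TripRequest:
    start_city: str          # v0
    return_city: str         # v_{n+1}
    cities: List[str]        # V, the cities to be visited
    durations: List[int]     # D, one duration per city (assumed to be in days)
    start_dates: List[date]  # T0, the possible start dates of the trip


@dataclass
class Flight:
    cost: float
    departure: datetime
    arrival: datetime
    layovers: int

    @property
    def duration(self) -> timedelta:
        # Simplification: both timestamps are assumed to share a time zone.
        return self.arrival - self.departure


@dataclass
class Solution:
    flights: List[Flight] = field(default_factory=list)

    @property
    def total_cost(self) -> float:
        return sum(f.cost for f in self.flights)
```

A response to a request would then carry a list of `Solution` objects, so that the user can compare several valid itineraries.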
Having a clear idea of the objectives and structure of the CSA, it is possible to discuss its actual
design. The User Interface can be separated into two independent views: the Request and the Re-
sponse view. These views allow the definition of user requests, and the visualization of the constructed
response, respectively. It would also be useful to define and implement a third view: a Map view. While
the request and response views are essential to the overall function of the application, the map view is
not. However, a map capable of displaying the routes of the selected flights, as well as other relevant
information, would certainly contribute to a better and more complete user experience.
A web application can be accessed from multiple devices, such as phones, tablets and computers, and
each of these devices has different screen dimensions. Because of this, the proposed web application
should be responsive. That is, the design and dimensions of the user interface should adapt
to the size of the device rendering it.
With this in mind, the proposed UI should follow the design illustrated in figure 4.2. Notice that there
are a total of three views: the request and response views, which are essential, and thus are always
present; and the map view, which is not, and thus can be discarded on some smaller devices.
Figure 4.2: Proposed User Interface layout for small/medium and large devices. There are two essential views and one optional view.
4.1.2 Server Side Application
The SSA is responsible for producing a solution to a user defined FTP instance. To do so, the SSA
shall be implemented as an Application Programming Interface (API), listening for requests to particular
resources, which uniquely identify these FTP instances. Each received request corresponds to a shallow
FTP instance, that is, a problem in which the cities are specified, but the flights are not. To overcome
this problem, and collect the relevant flight data, the SSA will rely on a Data Management System (DMS),
discussed in section 4.1.2.A. Having a complete FTP instance, the SSA will use the Optimization system
discussed in section 4.1.2.B to construct a solution to these user requests.
4.1.2.A Data Management System
Upon receiving a user defined request, the set of nodes, as well as the start time and durations,
are well defined. On the other hand, the set of arcs which connect these nodes is not. For example,
a user request may correspond to a single flight from A to B, at time t, and, upon receiving this request,
there is no available information regarding the possible flights (arcs) between these two cities. In fact,
that is exactly what the user is searching for. Thus, the goal of the Data Management System is to
collect the necessary information to construct the list of arcs associated to a user request.
The communication with a flight API relies on HTTP, and every request is identified
by a Uniform Resource Identifier, whose syntax is defined by the API provider. A response
to a request usually consists of a data tree, encoded in a structured data format such as JavaScript Object
Notation (JSON). Every response includes a list of possible flights, and each flight has a large number of
attributes, such as the cost, flight duration, departure time, and so on.
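To make the shape of such a response concrete, the following Python sketch parses a small JSON document and selects the cheapest flight. The field names (`flights`, `price`, `duration_min`, `departure`) and the sample values are invented for this example; each provider defines its own schema.

```python
import json

# Hypothetical response body; real providers use their own field names.
SAMPLE_RESPONSE = """
{"flights": [
  {"from": "LIS", "to": "MAD", "price": 54.0, "duration_min": 80,
   "departure": "2018-10-01T07:30"},
  {"from": "LIS", "to": "MAD", "price": 41.5, "duration_min": 85,
   "departure": "2018-10-01T18:05"}
]}
"""


def cheapest_flight(raw_json):
    """Parse a response and return the flight with the lowest price, if any."""
    flights = json.loads(raw_json)["flights"]
    if not flights:
        return None
    return min(flights, key=lambda flight: flight["price"])
```

The same pattern applies to other attributes: replacing the key function by the flight duration would select the fastest flight instead of the cheapest one.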
Although there are several publicly available flight data APIs, the information provided by each of
them might be considerably different. Multiple flight search applications were compared in chapter 1
from the perspective of the services offered, and the results were presented in tables 1.1 and 1.2. The
analysis of this comparison from the corresponding API perspective shows that the flight data presented
by each of these APIs varies considerably. For example, the cost of a single flight may be up to 44%
higher, depending on the API flight data provider.
Hence, one of the main goals of the proposed web application is to find the best set of flights for
a given query, according to some objective function and, in particular, the minimization of the total
flight cost. Given that there are considerable differences among flight APIs, ideally, the proposed web
application should query multiple APIs.
The role of the Data Management System is of crucial importance for the development of a high
quality flight search web application. This is because every user request seeks to find a given set of
flights to satisfy a particular itinerary. Thus, it is of extreme importance to have the most up-to-date
flight data for each of the possible flights.
4.1.2.B Optimization system
The main goal of the Optimization System (OS) is to produce solutions to user defined FTP instances.
Upon defining an itinerary and translating it to an FTP instance, the arcs which connect the nodes are not
defined, and thus, no valid solution can be produced. Thus, the OS is heavily dependent on the Data
Management System, as it requires relevant flight data to process the requests.
Depending on the particular request being processed, the required time to collect all the necessary
flights, or the time to run a computationally heavy optimization algorithm, might be very high. Because of
this, the optimization system is distributed into different layers, as illustrated in figure 4.3, which require
different amounts of information regarding the flights, and produce multiple solutions using different
heuristics, at different times. The goal of this is to reduce the latency perceived by the user, by
producing an initial solution as soon as possible, and continuing to search for a better one afterwards.
Figure 4.3: Simplified illustration of the optimization system, which utilizes different algorithms to produce a solution to a user defined request.
The first layer of the optimization system implements the random construction procedure proposed in section
3.5.1.A. This procedure allows the construction of a solution very quickly, as it randomly selects
one node after another, and only requires information regarding a very limited number of flights. Although
this solution is produced very quickly, its quality is expected to be low.
The second layer of the OS implements the nearest neighbour heuristic proposed in section 3.5.1.B.
The implementation of this heuristic requires information regarding the entire weight matrix associated
to the problem. Because of this, it must run only after constructing the initial solution. It is expected that
the solution constructed by this procedure is of higher quality than the initial one.
In its turn, the third layer of the OS implements the meta-heuristic optimization algorithms proposed in
sections 3.5.2.A and 3.5.2.B. Similarly to the nearest neighbour heuristic, these optimization algorithms
require information regarding the entire weight matrix. Thus, they must also run only after the initial
solution construction procedure, but they might run in parallel with the nearest neighbour heuristic. It
is expected that the solutions constructed by these meta-heuristics are of much better quality than the
initial and the nearest neighbour solutions.
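The layered behaviour described above can be sketched as a Python generator that yields one solution per layer, so that a caller can forward each solution as soon as it is available. This is a simplified illustration under strong assumptions: the weight matrix `w` is static (the real FTP weights depend on the flight dates), node 0 is the start city and node n-1 the return city, the random layer is evaluated on the full matrix for convenience, and the meta-heuristic layer is omitted. All function names are hypothetical.

```python
import random


def route_cost(route, w):
    """Sum of the arc weights along the route."""
    return sum(w[a][b] for a, b in zip(route, route[1:]))


def random_route(n, rng):
    """Layer 1: visit the inner nodes in a random order."""
    inner = list(range(1, n - 1))
    rng.shuffle(inner)
    return [0] + inner + [n - 1]


def nearest_neighbour_route(w):
    """Layer 2: always move to the cheapest unvisited inner node."""
    n = len(w)
    route, left = [0], set(range(1, n - 1))
    while left:
        nxt = min(left, key=lambda c: w[route[-1]][c])
        route.append(nxt)
        left.remove(nxt)
    return route + [n - 1]


def layered_solutions(w, rng=None):
    """Yield (layer name, route, cost) as each layer finishes."""
    rng = rng or random.Random()
    layers = (
        ("random", lambda: random_route(len(w), rng)),
        ("nearest neighbour", lambda: nearest_neighbour_route(w)),
    )
    for name, build in layers:
        route = build()
        yield name, route, route_cost(route, w)
```

A third entry in `layers` could wrap a meta-heuristic in the same way; the generator form is what lets the first, low-quality solution reach the user before the slower layers have finished.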
4.2 Underlying technologies
By following the architecture proposed in section 4.1 and illustrated in figure 4.1, the developed system
consists of two web applications: the CSA and SSA. The CSA is the application responsible for rendering
the User Interface, allowing the definition of user requests, which are processed by the SSA. Thus,
although these two applications run separately, the CSA is dependent upon the SSA.
The developed applications are hosted on Heroku, a cloud platform which, upon request, creates
two separate runtime environments, one for each application. Each application requires a server to
listen to requests and serve content. In particular, the CSA and SSA run on Node.js and Django servers,
respectively (see figure 4.4). Furthermore, since the User Interface will render a map, the CSA requires
access to the Google Maps API. To enable the implementation of a modern web application, the CSA
also uses other libraries, in particular, React and Redux. In its turn, the SSA requires access
to real flight data and thus interacts with Kiwi’s flight data API. The described application structure
is illustrated in figure 4.4, where Bfly App and Bfly API denote, respectively, the CSA and SSA.
The previously mentioned underlying technologies will be discussed in further detail in the following
subsections.
Figure 4.4: Technology stack used by the developed application.
4.2.1 Heroku
Heroku is a cloud platform as a service (PaaS), which allows applications to be built, deployed, monitored
and scaled. Customers who use Heroku do not need to worry about implementation details specific to
infrastructure and software, such as the hardware and the servers [88]. In terms of services, Heroku
competes directly with other cloud platform services, such as Google’s App Engine [89] and Amazon’s
Web Services [90]. A succinct comparison of the advantages and drawbacks of these different
services can be consulted in [91], while a more comprehensive (although somewhat outdated) overview
is available at [92].
Heroku provides a detailed explanation of its services and how to use them in order to deploy and
manage an application [93]. Following this overview, a Heroku application can be defined as a slug
which runs on dynos.
A slug is typically the result of the deployment and bundling of an application. An application is
defined by its source code, and in order to run it on Heroku, it is necessary to specify the programming
language runtime, its dependencies, and any compiled output of the build system. An application
can be deployed to Heroku via git or using Heroku’s own API. Upon deployment, Heroku builds the
application, and a ready-to-execute slug is produced.
A dyno is a virtualized UNIX container that provides the necessary environment to run the slug of the
application. It is possible to run applications at no cost by using a single free dyno. As
the number of requests to an application increases, a single dyno may not be sufficient to serve all users
without compromising the connection time. To overcome this, it is possible to scale the developed
application by increasing the total number of dynos. This is particularly useful for the SSA, as the optimization
process of the FTP requests requires a significant amount of RAM and CPU.
During the development of this work, Heroku is used as the host service for both the client and
the server side applications. Each of these applications runs on a single free dyno, and does not use any
database or other add-ons. It should be noted that the single free dyno has one drawback: after
30 minutes of inactivity, the application will sleep. This means that the next connection to a sleeping
application requires some additional time.
4.2.2 Node.js
Node.js [94] is an open-source and cross-platform JavaScript runtime environment, which executes
JavaScript code on the server [95]. Node enables the development of fast and scalable web servers
using JavaScript only, by implementing an asynchronous event loop, a low-level input/output API, and
Google’s V8 JavaScript engine. This last technology is a fundamental part of the Node.js stack, since it
allows the compilation of JavaScript source code into native machine code.
When the first web browsers were created, JavaScript was used as a scripting language to
modify, at run time, the content of a web page. This enabled the creation of the first dynamic web pages.
Today, due to Node.js, this scripting language is not restricted to the browser, and can be used on the
server to create dynamic web pages, even before the page is served to the client [96].
Thus, Node.js enables the unification of the web development environments around a single
programming language, which may be used both on the client and the server. It also includes a package
manager (npm) that allows the installation of third-party packages, which can be used, for example, in the
management of databases, networking, file system I/O and data streams.
In the development of this work, Node.js is used to create a web server for the Client Side Application.
Upon receiving a client request, the web server responds with a JavaScript and HyperText Markup
Language (HTML) bundle file. This bundle is the result of the compilation of the dynamic React
application, illustrated in figure 4.6 in the following section. This bundle contains the necessary information
for the browser to render, manage and update the User Interface upon interaction with the user and
third-party APIs.
4.2.3 Django
Django is a Python framework for the development of web servers and APIs [97]. During the development
of this work, Django was used to develop the SSA, by creating a web server to run an API to process
requests from the CSA. This API, described in further detail in subsection 4.4.1, interprets the requests
using regular expression matching, following the protocol defined in subsection 4.4.2. Once a request has
been interpreted, it is possible to execute particular sets of functions, such as the flight data requests to
third party APIs, and the optimization algorithms to solve the FTP requests.
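The idea of interpreting a request through regular expression matching can be illustrated with Python's standard `re` module. The URI layout used below (`/search/<origin>/<cities>/<date>`) is a hypothetical stand-in, not the protocol of subsection 4.4.2, and the sketch does not use Django itself.

```python
import re

# Hypothetical URI layout: /search/<origin>/<city>,<city>,.../<YYYY-MM-DD>
REQUEST_PATTERN = re.compile(
    r"^/search/(?P<origin>[A-Z]{3})"
    r"/(?P<cities>[A-Z]{3}(?:,[A-Z]{3})*)"
    r"/(?P<date>\d{4}-\d{2}-\d{2})/?$"
)


def parse_request(path):
    """Return the request attributes, or None when the path does not match."""
    match = REQUEST_PATTERN.match(path)
    if match is None:
        return None  # malformed request: nothing to execute
    return {
        "origin": match["origin"],
        "cities": match["cities"].split(","),
        "date": match["date"],
    }
```

A request that does not match the pattern is rejected before any flight data is requested, which keeps meaningless queries away from the more expensive steps.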
4.2.4 React and Redux
React is a JavaScript library for the development of user interfaces for web and mobile applications [98].
React is based on some core principles, which include the concepts of components, props and state, a
JavaScript language extension called JSX and the concept of a virtual Document Object Model (DOM)
[99].
The interfaces rendered by React are built using components, which combine the markup of HTML
with the dynamic utilities of JavaScript, and the styling of CSS. This is achieved by using JSX, which
bundles these three technologies under the same file, creating a single independent component. These
components may receive input arguments, denoted as props, allowing them to be flexible and reusable.
Usually, a component is a pure function, rendering always the same content for the same input. Com-
ponents may also be called by other components, allowing the creation of complex architectures. In
general, an application consists of multiple different components.
The state of an application is a data structure with some relevant information for the construction
of the user interface. In a flight search application as the one being developed, the state contains
information regarding the user request and the solutions associated to it. In general, components do not
access the state directly, but they may receive specific parts of it as props.
As applications grow bigger, it becomes increasingly difficult to manage their state. React, by itself,
is not well prepared to do so, but it is possible to use third party libraries for this effect. One of these
libraries is Redux [100]. Redux is a JavaScript library for the management of the state of an application.
It is often called a predictable state container, because it does not allow the state to be changed directly,
but instead requires a description of these changes using a plain object, called action. Dispatching an
action triggers the execution of a function to manipulate the state, called reducer. Reducers are always
pure functions, producing the same result for the same input. Because of this, the state of the application
is deterministic or, in other words, predictable.
The user interface of a React application is always up to date with the current state. However, it does
not re-render the entire page every time the state is updated. Instead, it compares the current DOM
structure to a virtual DOM introduced by React, and identifies the components that require an update.
This enables a fast and effective update of the user interface.
During the development of this application, React and Redux were used together to build the client
side application. This means that React is responsible for rendering the user interface, while Redux
manages the state and interacts with the SSA over HTTP.
4.3 Client Side Application implementation
The Client Side Application is implemented as a web service to interact with the user and allow the
resolution of complex flight requests. The CSA is constructed as a single page application, and is
hosted on Heroku, where it is publicly available (see footnote 1).
The CSA, built using React and Redux, revolves around a concept called state. The state of the
application is stored in the Redux store, and it contains the relevant information associated to the application
at each instant. In the developed application, the state tree is divided into requests and responses.
Requests are associated to the user input, while responses come from the SSA. The state cycle of the
developed application is illustrated in figure 4.5.
Figure 4.5: Block diagram of the state cycle of the Client Side Application.
Each branch of the state (requests and responses) is associated with a container (or a panel
view). Containers are top level components, which access the state directly, and call specific
components, often using parts of the state as input (props) for them. Thus, the developed application has two
main views, as proposed in section 4.1.1. There is also a third container, for the map, although it
does not have a state branch of its own, but instead derives the necessary information from the other two.
1 The developed CSA has the following URL: https://desolate-castle-31305.herokuapp.com/
The complete architecture of the implemented React/Redux application, including some of the
concepts previously discussed, is illustrated in figure 4.6. This figure illustrates the top down hierarchy,
with a single component at the top of the hierarchy (denoted app.js), which instantiates the Redux store,
and renders the complete application, by invoking the containers which are connected to the relevant
presentational components. This figure also illustrates the interaction with the store and with third-party
APIs.
Figure 4.6: Building blocks of an application built using React and Redux.
The remainder of this section shows how to collect user input (subsection 4.3.1) and
how to communicate with the SSA (subsection 4.3.2). Finally, subsection 4.3.3 illustrates the developed
web application, by presenting some actual screenshots of it.
4.3.1 User Input
There are three types of requests that will be processed by the application: single flight, round trip and
multi-city trip. A user must initially select the desired type of search, and follow this with the completion
of a series of forms that collect the relevant input.
Every request requires at least an origin and a destination, as well as a start date. There are other
search attributes which might be relevant, according to the selected trip type, such as the duration associated
to each city that will be visited. A complete list of the input that may be collected from the user is
presented in table 4.1 (left column).
Table 4.1: Parallelism between User Input, Actions and Reducers. To each user defined input corresponds an
action, declaring the intent of changing the state with some specific data, and a reducer, which actually
modifies the state.

User input    Action                  Reducer
origin        actOrigin(origin)       setOrigin(state, action)
destination   actDest(dest, index)    setDest(state, action)
duration      actDur(dur, index)      setDuration(state, action)
start date    actDate(date, index)    setStartDate(state, action)
submit        actRequest(request)     setResponse(state, action)
Thus, every user input is associated with an action and a reducer. Table 4.1 also defines the action
that is dispatched each time the input is updated (center column), and the corresponding reducer that is
responsible for updating the state of the application (right column). Note that the last user input (submit)
is processed by an asynchronous action. Thus, after the submission of the request, the reducer will be
called only upon receiving the response from the SSA, updating the state by storing the received data,
and triggering the update of the user interface.
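The action/reducer pairing of table 4.1 is not specific to JavaScript. The following sketch reproduces the idea in Python, for consistency with the other examples in this chapter: each reducer is a pure function that returns a new state instead of modifying the old one. The action type strings, the state layout and the `dispatch` helper are assumptions made for illustration and do not reproduce the code of the CSA.

```python
def set_origin(state, action):
    """Pure reducer: builds a new state with the origin replaced."""
    return {**state, "origin": action["origin"]}


def set_dest(state, action):
    """Pure reducer: replaces (or appends) the destination at `index`."""
    destinations = list(state.get("destinations", []))  # copy, never mutate
    index = action["index"]
    destinations[index:index + 1] = [action["dest"]]
    return {**state, "destinations": destinations}


# Hypothetical mapping from action types to reducers.
REDUCERS = {"ORIGIN": set_origin, "DEST": set_dest}


def dispatch(state, action):
    """Apply the reducer registered for the action type, if there is one."""
    reducer = REDUCERS.get(action["type"])
    return reducer(state, action) if reducer else state
```

Because the previous state is never mutated, the same sequence of actions always produces the same state, which is the predictability property attributed to Redux above.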
The developed application requires the user to explicitly submit the request, by clicking on a button that
dispatches an action. During the development of the application, the possibility of removing this button,
and automatically dispatching a request, was considered. However, it was subsequently rejected
because of the difficulty of knowing whether a certain request is complete. For example, given a single flight
request, knowing if the request is complete is simply a matter of verifying if an origin, destination and departure
date exist. For a round-trip, the duration would also be a requirement. However, for multi-city requests,
unless the user specifies how many cities are to be visited, there is no way of knowing when a request is
complete. Furthermore, any change to an already complete request would trigger another new request.
Thus, the intention of submitting a request must be declared by a user defined action.
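The difficulty described above can be made explicit with a small Python sketch of a completeness check. The dictionary keys and the trip type labels are hypothetical; the point is only that the check is decidable for single and round-trip requests, and undecidable for multi-city requests.

```python
def is_complete(request):
    """Return True or False when completeness can be decided, None otherwise."""
    basics = all(request.get(key) for key in ("origin", "destination", "start_date"))
    kind = request.get("type")
    if kind == "single":
        return basics
    if kind == "round":
        # A round trip additionally needs the duration of the stay.
        return basics and bool(request.get("duration"))
    # Multi-city: the number of cities is unknown, so the answer is undefined.
    return None
```

Returning `None` for multi-city requests reflects the reasoning in the text: without an explicit submission, the application cannot tell whether the user intends to add another city.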
4.3.2 Communication with the SSA API
Upon the submission of a user defined trip, the CSA communicates with the SSA in order to produce a
solution to it. This communication relies on HTTP, and utilizes the SSA API protocol that will be described
in subsection 4.4.2. The URL extension of the request, called the Uniform Resource Identifier (URI),
enables the specification of each particular resource under request. Thus, the URI may be composed of
the necessary fields that specify the attributes which characterize the user selected resource. It follows
that a particular resource is identified by a collection of (key, value) pairs, which identify the resource
attribute and its user defined value.
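One common way of encoding a collection of (key, value) pairs in a URI is a query string. The Python sketch below uses the standard `urllib.parse` module for this purpose; the base path and the attribute names are hypothetical, and the query-string encoding itself is an assumption that may differ from the protocol of subsection 4.4.2.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request_uri(base, attributes):
    """Append the attributes to `base` as a query string.

    List values are expanded into repeated keys (doseq=True).
    """
    return base + "?" + urlencode(attributes, doseq=True)


def read_request_uri(uri):
    """Recover the (key, list of values) pairs from a URI."""
    return parse_qs(urlsplit(uri).query)
```

For instance, `build_request_uri("/api/search", {"origin": "LIS", "cities": ["MAD", "PAR"]})` produces a URI in which `cities` appears once per city, and `read_request_uri` recovers both values as a list.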
Although the application logic of the developed CSA is managed by Redux, this library is not able,
by itself, to make HTTP requests or provide the desired asynchronous behaviour. In order to do so, it is
necessary to use two third-party libraries: redux-thunk and superagent. Redux-thunk is a store enhancer, or
middleware, providing additional functionality to the store; superagent is a library for AJAX
requests. Together, these two libraries enable the dispatch of asynchronous actions, which are used to
make HTTP requests for specific resources to the SSA API.
4.3.3 User Interface
The User Interface consists of a single page application, divided into three main views: the Request,
the Response and the Map view, as proposed in section 4.1.1 and illustrated in figure 4.7. Since the
application is developed using React, every view is managed by a container, which reads the state from
the store, calls the rendering of presentational components, and may dispatch actions on user input
or other events. It is also worth noting that, due to the modular nature of React components, they
are reusable. This is particularly useful in the construction of the request view. As an example, one
component (city form) is used to collect information regarding all cities: origin, return city and cities to
visit. This is possible because the underlying structure is the same. The only difference is the reference
to these components, which update different parts of the state.
Figure 4.7: Structure of the developed user interface
The user interface was designed to be mobile friendly, by being responsive to the device size. This
was achieved using the Bootstrap grid system, a web application design paradigm in which the user
screen is divided into 12 columns, and each element of the user interface may specify a variable number
of columns, depending on the screen size.
Figures 4.8 and 4.9 illustrate two possible views of the developed application. The first image
illustrates the application on a desktop device and the second on a mobile device. It is worth noting that these two
screenshots correspond to the same application. The design differences between these two views are a
result of the responsiveness of the application, which is responsible for resizing the application elements
according to the device size. It also includes toggles in the Request and Response views, so as to free
more space for the map view.
Figure 4.8: Application rendered on a desktop computer.
Figure 4.9: Application rendered on a mobile device.
4.4 Server Side Application implementation
The Server Side Application is the module that is responsible for producing a solution to a user request.
The architecture and implementation details of the SSA are presented in section 4.4.1. Each request
processed by the SSA is defined by a set of parameters that specify the intended trip, as will be described
in section 4.4.2. Producing a solution to a user request also involves the communication with third
party APIs, in order to obtain the necessary flight data, which is handled by the Data Management System,
described in section 4.4.3. The actual production of a solution is managed by the Optimization System,
described in section 4.5.
4.4.1 SSA dataflow
As defined in section 4.1.2, the goal of the Server Side Application is to serve the CSA with the solution
to FTP requests. Its corresponding API is publicly accessible² and is used by the CSA to communicate
with the SSA. Upon establishing a connection, the Uniform Resource Identifier (URI) of the FTP request
is used to identify a particular resource corresponding to the set of user-selected input arguments.
This is discussed in more detail in subsection 4.4.2.
² The developed API has the following URL: https://safe-plateau-51528.herokuapp.com/documentation
To construct a solution to a given request, it is necessary to sequentially execute a particular set of
steps, denoted as the main cycle of the SSA, which can be summarized as follows:
1. generate the list of necessary flights;
2. request flight data from third-party API;
3. construct the weight matrix for the objective functions;
4. execute the optimization algorithms;
5. construct a solution to the request.
Each step of the main cycle is executed by a particular set of functions. Steps 1, 3 and 5
are managed by a predefined Python class, which implements the logic associated with each FTP request.
This class is also responsible for the communication with the DMS and OS, which execute steps
2 and 4, respectively. It should be noted that the execution of the optimization step (4) is only required
for multi-city flight requests, since, for single-flight and round-trip requests, determining the best solution is a
trivial task.
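The main cycle described above can be sketched as a small Python class. This is only an illustrative sketch, not the thesis implementation: the names `FTPRequest`, `fetch_prices` and `optimize` are hypothetical, dates are ignored (each arc has a single price), and a brute-force search stands in for the Optimization System.

```python
from itertools import permutations


class FTPRequest:
    """Hypothetical request object that runs the five steps of the main cycle."""

    def __init__(self, origin, cities, fetch_prices, optimize):
        self.origin = origin
        self.cities = list(cities)
        self.fetch_prices = fetch_prices  # stands in for the DMS (step 2)
        self.optimize = optimize          # stands in for the OS (step 4)

    def necessary_flights(self):
        # Step 1: every ordered pair of distinct cities is a candidate flight.
        nodes = [self.origin] + self.cities
        return list(permutations(nodes, 2))

    def weight_matrix(self, prices):
        # Step 3: nested dict acting as the weight matrix (one price per arc).
        nodes = [self.origin] + self.cities
        return {a: {b: prices[(a, b)] for b in nodes if b != a} for a in nodes}

    def solve(self):
        flights = self.necessary_flights()           # step 1
        prices = self.fetch_prices(flights)          # step 2
        weights = self.weight_matrix(prices)         # step 3
        if len(self.cities) > 1:                     # step 4, multi-city only
            order = self.optimize(self.origin, self.cities, weights)
        else:
            order = list(self.cities)                # trivial case
        route = [self.origin] + order + [self.origin]
        legs = list(zip(route, route[1:]))           # step 5
        return legs, sum(weights[a][b] for a, b in legs)


def brute_force(origin, cities, weights):
    """Stand-in optimizer: exhaustive search, only viable for a few cities."""
    def cost(order):
        route = [origin] + list(order) + [origin]
        return sum(weights[a][b] for a, b in zip(route, route[1:]))
    return list(min(permutations(cities), key=cost))
```

Injecting `fetch_prices` and `optimize` as callables mirrors the separation between the request class, the DMS and the OS, and makes the cycle easy to exercise with fake flight data.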
The implemented SSA dataflow is illustrated in figure 4.10. By using the utilities offered by Django, the
SSA implements a server with a single endpoint, where the requests from the CSA are received. These
requests must be validated, in order to discard meaningless queries. This validation occurs immediately
after receiving the request. It uses regular expressions to parse the request, followed by an analysis of
the input arguments, according to the rules defined in table 4.2 of the following subsection. After the
validation, the resources are translated into an FTP request and, in particular, into a Python object that
executes the previously described main cycle.
4.4.2 SSA interface protocol
Every CSA request may be defined as an FTP instance, as proposed in section 3.1. Thus, each request
is described by a limited number of attributes (origin, start date, etc.), and it may be uniquely identified
according to its uniform resource identifier (URI).
Hence, each flight resource is described by a predefined list of pairs of keywords and values. This is
illustrated in table 4.2, where the first and second columns represent an FTP variable name and symbol,
respectively. In turn, the third column defines the keyword for the URI, while the fourth column
describes the acceptable values for each keyword.
It should be noted that the defined pairs of keywords and values must undergo a validation process,
not only to verify if the values provided for each keyword are acceptable, but also to assert that the
request is a valid FTP formulation.
Figure 4.10: Server Side Application dataflow
Table 4.2: Keyword/value pairs of the SSA protocol to uniquely identify a resource.
Name | Symbol | Keyword | Details
start city | v0 | flyFrom | City name or ICAO code.
return city | vn+1 | returnTo | City name or ICAO code.
destinations | V | cities | Cities to be visited. Accepts multiple cities, separated by commas. Accepts city name or ICAO code.
durations | D | duration | Length of stay (in days) in each city. Accepts positive integers. Must be the same length as the number of cities.
start date | T0 | minDate, maxDate | minDate is the earliest start date (T0i), maxDate is the latest start date (T0f). Accepts date format dd/mm/yyyy.
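The validation rules of table 4.2 could be checked along the following lines. This is a minimal sketch under stated assumptions: the request is taken to arrive as a URL query string, every keyword is treated as mandatory, and the function name `validate` is hypothetical. The actual URI layout of the SSA is not reproduced here.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs

KEYWORDS = ("flyFrom", "returnTo", "cities", "duration", "minDate", "maxDate")
DATE_RE = re.compile(r"^\d{2}/\d{2}/\d{4}$")            # dd/mm/yyyy
DURATIONS_RE = re.compile(r"^[1-9]\d*(,[1-9]\d*)*$")    # positive integers


def validate(query):
    """Return the parsed arguments, or raise ValueError for a meaningless query."""
    raw = parse_qs(query, keep_blank_values=True)
    missing = [k for k in KEYWORDS if k not in raw]
    if missing:
        raise ValueError(f"missing keywords: {missing}")
    args = {k: raw[k][0] for k in KEYWORDS}
    if not args["flyFrom"] or not args["returnTo"]:
        raise ValueError("start and return cities must not be empty")
    cities = [c.strip() for c in args["cities"].split(",")]
    if not all(cities):
        raise ValueError("empty city name")
    if not DURATIONS_RE.match(args["duration"]):
        raise ValueError("durations must be positive integers")
    durations = [int(d) for d in args["duration"].split(",")]
    if len(durations) != len(cities):
        raise ValueError("one duration per city is required")
    dates = {}
    for key in ("minDate", "maxDate"):
        if not DATE_RE.match(args[key]):
            raise ValueError(f"{key} must follow dd/mm/yyyy")
        # strptime also rejects impossible dates such as 31/02/2018.
        dates[key] = datetime.strptime(args[key], "%d/%m/%Y").date()
    if dates["minDate"] > dates["maxDate"]:
        raise ValueError("minDate must not be after maxDate")
    return {"flyFrom": args["flyFrom"], "returnTo": args["returnTo"],
            "cities": cities, "durations": durations, **dates}
```

The last two checks (matching lengths and ordered dates) go beyond per-keyword validation and illustrate what asserting "a valid FTP formulation" might involve.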
4.4.3 Data Management System
The Data Management System (DMS) is responsible for the communication with third-party flight data
APIs. It requests flight data on behalf of the CSA. This implementation of the DMS is based
on a low-level API wrapper around the KIWI flight API, and it is constructed as a worker/consumer
factory to execute concurrent HTTP requests (see figure 4.11).
Given a list of necessary flights, usually requested by the CSA, the DMS communicates with third-party
APIs to request the relevant flight data. This communication relies on the HTTP protocol, and
uses the URI to identify a particular resource. This identification follows the protocol defined by the
third-party API [8]. A response to a request is given by a structured data tree, usually in JSON
format. Hence, each worker of the DMS worker/consumer factory executes the following tasks:
1. generate the URL for the intended resource;
2. execute an HTTP request to the given URL;
3. parse the response.
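The three worker tasks listed above can be sketched as follows. Everything specific here is an assumption made for illustration: the base URL, the query parameter names and the JSON layout are hypothetical and do not reproduce the protocol of the KIWI flight API. The HTTP call is injected as a function so the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/flights"


def build_url(origin, destination, date):
    # Task 1: generate the URL that identifies the intended resource.
    params = {"from": origin, "to": destination, "date": date}
    return BASE_URL + "?" + urlencode(params)


def parse_response(body):
    # Task 3: decode the JSON tree and keep the cheapest flight, if any.
    flights = json.loads(body).get("data", [])
    return min(flights, key=lambda f: f["price"]) if flights else None


def worker(flight, http_get):
    # http_get is any callable mapping a URL to the response body (task 2).
    url = build_url(*flight)
    return parse_response(http_get(url))
```

In a real deployment, `http_get` could wrap a standard HTTP client; in tests it can be replaced by a function returning a canned JSON string.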
When communicating with third-party APIs, the bottleneck of the system is usually the time necessary
for the server to process the request and produce a response to it. The implemented worker/consumer
approach takes advantage of this bottleneck. Instead of waiting for a request to complete, this approach
uses the waiting period to spawn more requests. Thus, the waiting period associated with one request is
not imposed on the others. Despite this, executing multiple requests increases the server's workload,
and therefore the time necessary to respond to one (isolated) request. This approach will be discussed,
and its results illustrated, in subsection 5.2.4.
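The benefit of overlapping waiting periods can be reproduced with a small simulation. This is a sketch only: it uses Python's standard `ThreadPoolExecutor` rather than the worker/consumer factory of the DMS, and `fake_request` replaces a real HTTP call with a fixed, assumed latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(flight, latency=0.05):
    # Simulates the time the remote server needs to answer one query.
    time.sleep(latency)
    return flight, 100  # (flight, dummy price)


def fetch_all(flights, workers):
    # While one thread sleeps (waits for the server), the others issue requests.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_request, flights))


def timed(flights, workers):
    start = time.perf_counter()
    results = fetch_all(flights, workers)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    flights = [("LIS", f"C{i}") for i in range(10)]
    _, serial = timed(flights, workers=1)
    _, concurrent = timed(flights, workers=10)
    print(f"serial: {serial:.2f} s, concurrent: {concurrent:.2f} s")
```

Because the simulated server latency does not grow with load, this sketch shows only the favourable side of the trade-off; a real server slows down as the number of simultaneous requests increases.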
Figure 4.11: The Data Management System uses concurrent HTTP requests to communicate with third-party flight APIs.
Table 4.3: Algorithm specific parameters.
Alg. | Parameter | Value
ACO | Pheromone relative influence (α) | 1
ACO | Heuristic relative influence (β) | 5
ACO | Pheromone evaporation rate (ρ) | 0.1
ACO | Exploration rate (Q0) | 0.9
ACO | Number of ants (m) | 10
SA | First iter. acceptance prob. (p0) | 0.98
SA | Last iter. acceptance prob. (pf) | 10^-300
SA | Initial temperature (t0) | see Eq. 3.12
SA | Final temperature (tf) | see Eq. 3.11
SA | Cooling parameter (λ) | see Eq. 3.13
SA | Markov chain length (M) | N
4.5 Optimization system
Given an FTP request, the Optimization System (OS) is the module that determines the respective solutions,
using the strategies described in subsection 4.1.2.B. This module was implemented using the
Python3 programming language, and uses the numpy library [101] for the construction and management
of the multi-dimensional (and often very large) arrays, which describe an FTP instance.
The developed optimization system implements each of the three proposed optimization strategies:
the nearest neighbour heuristic, and the ACO and SA metaheuristics (section 3.5.2). The adopted implementation
of the nearest neighbour algorithm closely follows the steps presented in subsection 2.2.2.B.
The remainder of this section focuses on the implementation details of the ACO and SA
metaheuristics.
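For reference, the nearest neighbour heuristic mentioned above can be sketched in a few lines. This is the textbook form of the heuristic on a static weight matrix, written under the assumption that arc weights do not depend on the departure date; the time-dependent weights of an FTP instance are not modelled here.

```python
def nearest_neighbour(weights, start=0):
    """Greedy tour: from the current node, always move to the cheapest unvisited one."""
    n = len(weights)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: weights[last][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    # Close the tour by returning to the start node.
    cost = sum(weights[a][b] for a, b in zip(tour, tour[1:] + [start]))
    return tour, cost
```

The heuristic terminates after visiting each node exactly once, which is why it needs no stop criterion, unlike the two metaheuristics.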
The implementation of the ACO follows the methodology described in subsection 3.5.2.A and is
illustrated in figure 4.12. This metaheuristic receives, as input, a data structure describing the FTP
request, which includes the weight matrix and the constraints of the problem. It starts
by initializing the algorithm-specific parameters, using the values presented in table 4.3. It is also
necessary to initialize the pheromone matrix, by setting each entry to a value defined by equation 3.4.
Figure 4.12: Flowchart of the implemented Ant Colony System procedure.
This is followed by the initialization of all ants belonging to the colony. Each ant is implemented as
an independent thread, which allows the colony to run in parallel. Thus, all ants construct a solution
to the problem autonomously, using the transition rules described in equations 3.5, 3.6 and
3.7. At each step of the construction process, it is necessary to apply a local pheromone update, by
using equation 3.8, which decreases the probability of that solution component being selected by other
ants in the same iteration. After each ant finishes the construction of a solution, it is necessary to perform
the pheromone update (equation 3.9), to reflect the search experience. Given that each ant is an
independent thread, it is necessary to lock the pheromone matrix, to protect it from being accessed
and manipulated by multiple threads at the same time.
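The locking scheme can be illustrated with a shared numpy matrix guarded by a `threading.Lock`. This is a sketch under assumptions: the class name `PheromoneMatrix` is hypothetical, the local rule shown is the standard Ant Colony System form (pulling the entry back towards its initial value), and the deposit step is a simplified stand-in. Neither is claimed to match equations 3.8 and 3.9 exactly.

```python
import threading

import numpy as np


class PheromoneMatrix:
    """Pheromone matrix shared by all ant threads; every write holds the lock."""

    def __init__(self, n, tau0, rho):
        self.tau0 = tau0
        self.rho = rho
        self.values = np.full((n, n), tau0, dtype=float)
        self._lock = threading.Lock()

    def local_update(self, i, j):
        # Assumed ACS-style local rule: tau <- (1 - rho) * tau + rho * tau0.
        with self._lock:
            self.values[i, j] = (1 - self.rho) * self.values[i, j] + self.rho * self.tau0

    def deposit(self, tour, amount):
        # Simplified reinforcement of every arc of a finished tour.
        with self._lock:
            for i, j in zip(tour, tour[1:]):
                self.values[i, j] += amount


def ant(matrix, tour, repeats):
    # Stand-in for an ant thread that repeatedly reinforces the same tour.
    for _ in range(repeats):
        matrix.deposit(tour, 1.0)
```

The read-modify-write on a matrix entry is not atomic, so without the lock two ants updating the same arc could overwrite each other's contribution.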
In turn, the implementation of the SA follows the methodology described in subsection 3.5.2.B, as
illustrated in figure 4.13. As with the ACO, the SA metaheuristic receives as input the
relevant data to describe the FTP, and initializes its algorithm-specific parameters according to table
4.3. This is followed by the determination of the cooling schedule. To do so, it is necessary to construct
a number of solutions, to calculate the average difference in the objective function, which allows the
calculation of the initial temperature (equation 3.12) and cooling parameter (equation 3.13).
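One common way to derive such a cooling schedule is sketched below. It rests on assumptions that may differ from equations 3.11 to 3.13: the temperatures are obtained by inverting the Metropolis probability for an average cost difference, and cooling is assumed geometric over a fixed number of steps. The function names and the `steps` parameter are hypothetical.

```python
import math
import random


def tour_cost(tour, w):
    # Cost of the closed tour, including the arc back to the first node.
    return sum(w[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def average_delta(w, samples, rng):
    """Average absolute cost difference between consecutive random tours."""
    costs = []
    for _ in range(samples):
        tour = list(range(len(w)))
        rng.shuffle(tour)
        costs.append(tour_cost(tour, w))
    diffs = [abs(a - b) for a, b in zip(costs, costs[1:])]
    return sum(diffs) / len(diffs)


def cooling_schedule(avg_delta, p0=0.98, pf=1e-300, steps=1000):
    # Invert p = exp(-delta / t) so that an average worsening move is
    # accepted with probability p0 at the start and pf at the end.
    t0 = -avg_delta / math.log(p0)
    tf = -avg_delta / math.log(pf)
    # Assumed geometric cooling: t_k = t0 * lam**k, reaching tf after `steps`.
    lam = (tf / t0) ** (1.0 / steps)
    return t0, tf, lam
```

With the values of table 4.3 (p0 = 0.98 and pf = 10^-300), the initial temperature is roughly 50 times the average cost difference, while the final one is a small fraction of it, so the search ends almost purely greedy.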
Figure 4.13: Flowchart of the implemented Simulated Annealing procedure.
After this initialization step, the SA system starts the optimization cycle, by constructing a new candidate
solution y based on the current solution x, using a 2-opt move, as described in section 2.2.2.C. To
decide whether to accept the candidate solution y, the Metropolis acceptance criterion is applied (see
equation 3.10). Unlike the ACO implementation, in which the solution construction process is parallel,
the developed Markov cycle follows a serial approach, given that the candidate solution y is always
constructed based on the current solution x. If there were multiple threads constructing solutions at the
same time, the current solution would be an uncertain entity. After completing a cycle, the temperature
of the system is reduced, by applying equation 3.11. As the temperature decreases, the algorithm
becomes increasingly greedy and converges to a local optimum.
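The optimization cycle described above can be sketched as follows. This is a simplified illustration on a static weight matrix, not the thesis implementation: geometric cooling (t multiplied by λ after each Markov chain) is an assumption, the start node is kept fixed at position 0, the candidate cost is recomputed from scratch, and only an iteration limit is used as stop criterion.

```python
import math
import random


def tour_cost(tour, w):
    return sum(w[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def two_opt_move(tour, rng):
    # Reverse a random segment; position 0 (the start city) is never moved.
    i, j = sorted(rng.sample(range(1, len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def metropolis_accept(delta, temperature, rng):
    # Improving moves are always accepted; worsening moves with prob. exp(-delta/t).
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(w, t0, lam, chain_length, max_iterations, seed=0):
    rng = random.Random(seed)
    current = list(range(len(w)))
    current_cost = tour_cost(current, w)
    best, best_cost = current[:], current_cost
    temperature = t0
    for _ in range(max_iterations):
        for _ in range(chain_length):          # serial Markov chain
            candidate = two_opt_move(current, rng)
            candidate_cost = tour_cost(candidate, w)
            if metropolis_accept(candidate_cost - current_cost, temperature, rng):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current[:], current_cost
        temperature *= lam                     # assumed geometric cooling
    return best, best_cost
```

Each candidate is derived from `current`, which is why the inner loop cannot be split across threads without making the current solution ambiguous.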
Both metaheuristics have a stop criterion that is evaluated at the end of each solution construction
cycle (the ants' construction procedure, and the Markov cycle). This criterion is either set to a maximum
number of iterations, or to a maximum allowable execution time. When the stop criterion is met, these
metaheuristics return the best solution found so far.
It is worth noting that the solutions returned by the ACO and the SA modules correspond to a set
of solution components, which define both the pair of cities and the schedule of the arc traversal. If
the problem being solved is a user request issued by the CSA, it is necessary to construct the set
of flights which characterize the solution to the FTP request. Thus, the SSA must map each solution
component to a particular flight, to produce the itinerary for the requested trip.
4.6 Summary
This chapter presented the design and implementation details of the proposed flight search web application.
The first part, covering the sections related to the design process, presented the
architecture considered for the web application, which separates the user interface from the logic associated with the
resolution of FTP requests. It also established a design goal for the user interface, and proposed an
architecture for the optimization algorithm that reduces the latency perceived by the users. The
remainder of the chapter provided the implementation details regarding these topics.
5 Experimental Results
Contents
5.1 Optimization module evaluation
5.2 Flying Tourist Problem evaluation
5.3 Comparison with Kiwi's Nomad
In order to validate and evaluate the performance of the developed system, several tests were developed
and executed. First, by using a set of benchmarks, the implemented optimization algorithms are
tested to evaluate their overall quality (section 5.1). Then, the overall utility of the proposed system is
evaluated, by performing a series of tests on the Flying Tourist Problem (section 5.2). Finally, a thorough
comparison with the only known state-of-the-art alternative for the devised FTP is performed (section
5.3). This is achieved by considering a comprehensive set of real-world multi-city flight problems, using
different objective functions to evaluate the system's performance.
These experiments were executed on a 2.6GHz Intel i7-6700 CPU, with 8GB of RAM, and all the
code was developed using the Python3 programming language.
5.1 Optimization module evaluation
The main difficulty behind the aimed evaluation of the developed optimization module results from the
absence of publicly available FTP benchmarks (with a priori known optimal solutions) that could be used
to validate the obtained results. To circumvent this adversity, it was decided to conduct such evaluation
using closely-related NP-hard problems, such as the Asymmetric TSP (subsection 5.1.1) and Time-
Dependent TSP (subsection 5.1.2).
5.1.1 Asymmetric TSP evaluation
The set of benchmarks (and respective input data) considered for the Asymmetric TSP were collected
from the publicly available TSPLib and correspond to problems with 17, 35, 53, 124 and 323 nodes
(br17, ftv35, ft53, kro124p, rbg323, respectively). The adoption of such graph dimension ranges allows
the evaluation of the developed optimization module with different complexity levels: small instances (17
nodes), corresponding to a dimension more closely-related to the targeted flying tourist application sce-
nario; medium instances (35-53 nodes), used to evaluate more computationally demanding scenarios;
and large instances (124-323 nodes), used to evaluate the scalability of the developed system.
Each benchmark was independently solved by the two considered optimization algorithms (ACO and
SA). The execution of each problem was repeated 5 times and the obtained results were averaged.
Furthermore, since there are strict limitations on the maximum time a user is willing to wait for a
result in a real-world application, a stop criterion was adopted to limit the total optimization time.
Accordingly, during the execution of these tests, each algorithm may run for no longer than 60 seconds.
The results of both meta-heuristic algorithms on the considered set of asymmetric benchmarks are
presented in Table 5.1.
For small instances (17 nodes), both ACO and SA consistently present optimal solutions. For medium
instances (35-53 nodes), both algorithms perform in the range of 5-16% relative error. As for bigger
problems (124-323 nodes), the performance of the ACO slightly decreases (17-25%), followed closely
by the SA (22-29%).
By observing the number of iterations that were executed during the 60-seconds interval, it is clear
that the SA is much faster than the ACO, performing 38 to 68 times more iterations in the same time
interval. However, this greater number of iterations does not seem to directly contribute to a better final
result of the SA. This happens because the ACO search strategy is guided, taking into account the
previous search experience in the selection of the next solution. This does not occur with the classical
implementation of the SA, which utilizes a simple and non-guided local search.
On the other hand, while the considered stop-criteria value may be sufficient to reach the meta-
heuristic stationary result for small problem instances, meaning that continuing the optimization further
may not affect the final error in a significant way, the higher relative error and lower number of iterations
for bigger problem instances suggest that improvements may still occur.
To analyze the advantages of further optimization, all problems were solved once more, by using
the same procedure previously described, this time for a total of 300 seconds. The obtained results are
presented in Table 5.2, which allows the comparison of the final result as a function of the execution time.
Columns RE60 and RE300 present the relative error after 60 and 300 seconds, respectively. As it can be
seen, increasing the optimization time leads to an improvement in the final result for both metaheuristics
and for almost every benchmark problem. These improvements are more significant for bigger problem
instances, and affect the small instances only slightly.
Hence, the observed performance of the implemented metaheuristic algorithms on the asymmetric
TSP, for the level of complexity of the targeted flying tourist application scenario, proved to be highly
promising, leading to final solutions which are either optimal or whose relative error is close to 10%.
Table 5.1: Performance of the ACO and SA on the asymmetric TSP benchmarks, taken from the TSPLib (stop-criteria = 60 seconds).
Alg. | Benchmark | #Iterat. | Optimal solution | Comput. solution | Error [%]
ACO | br17 | 6893 | 39 | 39 | 0
ACO | ftv35 | 1469 | 1473 | 1552.4 | 5.39
ACO | ft53 | 689 | 6905 | 8269.6 | 16.13
ACO | kro124p | 275 | 36230 | 44102 | 17.00
ACO | rbg323 | 20 | 1326 | 1660.4 | 25.21
SA | br17 | 264319 | 39 | 39 | 0
SA | ftv35 | 63659 | 1473 | 1641 | 11.41
SA | ft53 | 36189 | 6905 | 7963 | 15.32
SA | kro124p | 17533 | 36230 | 44102 | 21.73
SA | rbg323 | 1361 | 1326 | 1706.2 | 28.67
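As a worked example of how the error column can be read, the sketch below assumes the usual definition of relative error, i.e. the gap between the computed and the optimal solution divided by the optimal one. The function name is hypothetical; the input values are the ftv35 entries of Table 5.1.

```python
def relative_error(computed, optimal):
    """Relative gap to the optimal solution, in percent."""
    return 100.0 * (computed - optimal) / optimal


# ftv35 benchmark, optimal solution 1473 (values taken from Table 5.1)
print(round(relative_error(1552.4, 1473), 2))  # ACO -> 5.39
print(round(relative_error(1641, 1473), 2))    # SA  -> 11.41
```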
Table 5.2: Effects of increased optimization time on the final result.
Benchmark | ACO RE60 | ACO RE300 | SA RE60 | SA RE300
br17 | 0 | 0 | 0 | 0
ftv35 | 5.39 | 4.75 | 11.41 | 11.73
ft53 | 16.13 | 13.71 | 15.32 | 13.20
kro124p | 17.00 | 13.91 | 21.73 | 17.21
rbg323 | 25.21 | 18.78 | 28.67 | 10.27
5.1.2 Time-dependent TSP evaluation
Due to the absence of standardized benchmarks for the time-dependent case in the TSPLib, it was
necessary to define specific problems (whose best solutions are known a priori) by using a
method described in [102], based on the duality principle over the Integer Linear formulation for the
time-dependent TSP. The defined benchmarks were constructed to have the same dimensions as
those used for the Asymmetric TSP, and their names were appended with an asterisk suffix to distinguish
them from the asymmetric TSP case.
Executing the same evaluation strategy on the time-dependent TSP problems leads to the results
presented in Table 5.3. Once again, it is possible to compare the efficiency of the ACO and the SA, by
evaluating the relative error of each set of problems.
For small instances (17 nodes), ACO and SA present a small relative error, ranging from 7% to
11%. For medium instances (35-53 nodes), the relative error decreases for both algorithms, ranging
from 4% to 7%, with SA offering better results than ACO. This may occur because the high number
of performed iterations in small and medium instances leads to an intensive search space exploration.
However, when bigger problems (124-323 nodes) are considered, the performance of the ACO improves
(possibly due to its guided search exploration), reducing the relative error to close to 3% and surpassing
that of the SA.
Table 5.3: Performance of the ACO and SA on the time-dependent TSP benchmarks (stop-criteria = 60 seconds).
Alg. | Benchmark | #Iterat. | Optimal solution | Comput. solution | Error [%]
ACO | br17* | 9720 | 2458 | 2729 | 11.03
ACO | ftv35* | 2590 | 5131 | 5500 | 7.20
ACO | ft53* | 1099 | 7930 | 8370 | 5.53
ACO | kro124p* | 71 | 25483 | 26402 | 3.61
ACO | rbg323* | 13 | 48991 | 50261 | 2.59
SA | br17* | 219517 | 2458 | 2631 | 7.04
SA | ftv35* | 80262 | 5131 | 5406 | 5.38
SA | ft53* | 37450 | 7930 | 8265 | 4.23
SA | kro124p* | 3521 | 25483 | 26427 | 3.70
SA | rbg323* | 901 | 48991 | 51926 | 5.99
In any case, the performance of these two algorithms on the time-dependent TSP is highly promising,
leading to final solutions which are consistently below the 10% relative error mark.
5.1.3 Discussion
A precise interpretation of the factors involved in these results is not straightforward. Still, the following
conclusions seem plausible. When comparing the time-dependent TSP with the asymmetric TSP, the
former is usually expected to be more heavily constrained than the latter. In those cases, the
problem itself simplifies the search for the overall optimum, as bad solutions are much easier to identify.
In particular, both ACO and SA seem to exploit this property well, as can be inferred from the fact that, for
the corresponding benchmarks, the time-dependent version obtains a better relative error rate with fewer
iterations. Moreover, the better relative error rates themselves seem to be problem induced, meaning
that the asymmetric TSP contains big gaps in the objective function. This happens, in particular, from
the optimal value to a nearby maximum, whereas the time-dependent TSP contains smaller gaps and
a higher concentration of near maxima.
5.2 Flying Tourist Problem evaluation
To demonstrate and quantify the actual benefits of the proposed system, a series of FTP instances were
defined, ranging from just 1 city to visit (which corresponds to a round-trip), up to a total of 20 cities. For
each problem instance, several solutions were obtained using four different approaches:
• a pseudo-random approach, i.e., a randomly generated closed tour that connects all the nodes;
• a metric nearest-neighbor heuristic that uses the proximity between nodes to define the traveling route
(this approach closely approximates the strategy usually followed by a human solver);
• a regular nearest-neighbor heuristic using the flight price as the objective function;
• the proposed ACO and SA meta-heuristics (where the best of these two results is chosen),
considering two different objective functions (minimization of the total cost and of the entropy).
In this experiment, the number of nodes was varied and multiple requests (more than 5) were made
for each set of nodes, averaging their results. In every case, the trip starts from and returns to Lisbon (Portugal)
and visits a given set of cities, randomly chosen from the following set: Abuja, Atlanta, Barcelona,
Beijing, Cairo, Casablanca, Dubai, Dublin, Frankfurt, Hong Kong, Istanbul, Johannesburg, Kiev, Los Angeles,
Madrid, Miami, Moscow, New York (JFK), Oslo, San Francisco, Sydney, Singapore. The start date
was set to be the same for all requests (1 November 2018), which, upon the execution of the tests, was
Table 5.4: Comparison of different Flying Tourist Problem solutions obtained with distinct algorithms and optimization criteria, considering the Metric Nearest Neighbor approach (shaded gray) as reference.
#Nodes | 1 | 3 | 5 | 7 | 10 | 15 | 20
PRICE
Random | 635 | 1455 (+1.2%) | 2194 (+7.0%) | 3436 (+26.6%) | 4791 (+28.8%) | 7222 (+63.5%) | 9154 (+68.8%)
Metric Nearest Neighbor | 635 | 1438 | 2051 | 2715 | 3721 | 4417 | 5422
Regular Nearest Neighbor | 635 | 1438 | 1993 (-2.8%) | 2553 (-6.0%) | 3412 (-8.3%) | 3911 (-11.5%) | 4678 (-13.7%)
Proposed (Price) | 635 | 1398 (-2.8%) | 1727 (-15.8%) | 1911 (-29.6%) | 2466 (-33.7%) | 3051 (-30.9%) | 3699 (-31.8%)
Proposed (Entropy) | 876 (+38.0%) | 1761 (+22.5%) | 2203 (+7.4%) | 2749 (+1.3%) | 3687 (-0.9%) | 4123 (-6.7%) | 4707 (-13.2%)
DURATION
Random | 61 | 125 (+5.0%) | 183 (+3.4%) | 270 (+31.7%) | 369 (+48.2%) | 497 (+63.0%) | 638 (+92.7%)
Metric Nearest Neighbor | 61 | 119 | 177 | 205 | 249 | 305 | 331
Regular Nearest Neighbor | 61 | 119 | 181 (+2.3%) | 212 (+3.4%) | 257 (+3.2%) | 323 (+5.9%) | 358 (+8.2%)
Proposed (Price) | 61 | 121 (+1.7%) | 151 (-14.7%) | 179 (-12.7%) | 258 (+3.6%) | 292 (-4.3%) | 319 (-3.6%)
Proposed (Entropy) | 29 (-52.5%) | 57 (-52.1%) | 82 (-53.7%) | 92 (-55.1%) | 104 (-58.2%) | 140 (-54.1%) | 160 (-51.7%)
50 days into the future. The staying period in each city was set to a random value between 1 and 5
days.
5.2.1 Quantitative evaluation and improvement
The result of the execution of the described test is summarized in Table 5.4, where both the total flight
cost and duration are presented, as a function of the optimization approaches and the number of nodes.
This table also presents the observed improvement, relative to the metric nearest-neighbor solution,
which is used as reference due to its proximity to the human search approach.
A first look at these results allows a preliminary evaluation of the utility of the proposed system.
It can be seen that, for a small number of nodes (1 and 3), the metric and regular nearest neighbour
present the same results. However, for greater instances (5-20 nodes), the metric nearest neighbour
starts to present worse results when compared to the regular variant. The regular variant performs
better (reducing the cost between ≈ 3% and 14%) because it always selects the node with the lowest
cost. Despite the positive results presented by the regular nearest neighbour approach, the proposed
meta-heuristics are still capable of improving them. The metaheuristic approach enables the cost to be
reduced by an extra 3% to 25%, when compared to the regular nearest neighbour, and up to a total of
34%, when compared to the metric one.
5.2.2 Balancing the total flight price and duration
It is possible to execute the proposed metaheuristics with different objective functions, as proposed in
subsection 3.1. In particular, it is possible to introduce both the price and flight duration in the objective
function (see equation 3.3). With this approach, a better balance between price and flight duration is
envisaged, by minimizing an entropic metric defined as a weighted value, where the price and flight
duration contribute with 70% and 30%, respectively.
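The effect of such a weighted objective can be illustrated with a small sketch. The exact form of equation 3.3 is not reproduced here: the normalization by the largest price and duration, the function name `entropy_weight` and the two example flights are assumptions made only so that the two quantities become comparable.

```python
def entropy_weight(price, duration, max_price, max_duration, alpha=0.7):
    """Weighted sum of normalized price (70%) and flight duration (30%)."""
    return alpha * price / max_price + (1 - alpha) * duration / max_duration


# Two hypothetical flights on the same arc: (price, duration in hours)
cheap_slow = (100, 20)
dear_fast = (120, 5)
max_price, max_duration = 120, 20

for name, (price, duration) in (("cheap/slow", cheap_slow), ("dear/fast", dear_fast)):
    score = entropy_weight(price, duration, max_price, max_duration)
    print(name, round(score, 3))
```

Under a price-only objective the first flight wins, whereas the weighted metric prefers the second one, accepting a moderate price increase in exchange for a much shorter flight.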
As it can be observed in Table 5.4, such an entropic minimization leads to significant improvements
Figure 5.1: Variation of the total flight price and duration when minimizing the entropy objective function.
in the flight duration, although it introduces some penalization in the flight price for a small number of
nodes (1-5). However, as the number of nodes increases, the flight price significantly decreases, but
it is always higher than that provided by the former price-only meta-heuristics. This happens because
the two objective functions (price and duration) can hardly be simultaneously minimized, and thus, a
compromise has to be reached. In this case, compromising means slightly increasing the price, to
significantly reduce the duration.
This compromise between flight price and duration is also illustrated in Fig. 5.1, which presents the
relative duration improvement as a consequence of the increase in price. This figure shows that, in
general, increasing the price by around 20% leads to a decrease in the flight duration by around 50%,
when compared to the price-only metaheuristic approach.
5.2.3 Impact of the trip start interval
To evaluate the influence of the trip start interval on the obtained results, the same queries and data
sets were used to solve these FTPs using start periods of different lengths. These results are illustrated
in Fig. 5.2, where NN refers to the metric nearest neighbour heuristic and M − 1, M − 15 and M − 31
represent the proposed meta-heuristic algorithms, with start periods of length 1, 15 and 31 days,
respectively.
The analysis of Fig. 5.2 shows that increasing the interval of the start date may lead to great im-
provements, with flight price reductions as high as 15%, even for medium size instances with up to 20
nodes.
5.2.4 Response time
The response time to a request depends mostly on two different procedures: i) the data gathering,
which is handled by the DMS; and ii) the optimization, handled by the OS. This subsection evaluates
the response time to a given request, by comparing the relative influence of these two steps.
Figure 5.2: Price improvement as a function of the trip start interval.
To collect the flight data necessary for a request, it is necessary to communicate with third-party
APIs. The module responsible for this is the DMS, which implements a concurrent architecture to make
HTTP requests. Figure 5.3 illustrates the required time to receive the response to 100 flight queries,
using the KIWI flight API. By varying the number of threads, it also shows the speed-up obtained by
implementing the concurrent system.
Figure 5.3: Required time to perform 100 HTTP requests to a third-party API, as a function of the number of concurrent requests.
Given a serial approach, which corresponds to the case in which there is only one worker thread,
the system requires approximately 100 seconds to perform all queries. Thus, the time for the remote
API server to respond to one query is, on average, one second. It is possible to increase the number
of requests per second, by opting for a concurrent approach. Figure 5.3 shows that by performing two
concurrent requests, the response time drops to half. As the number of concurrent requests increases,
the response time decreases.
However, this decrease is not always linear. While the first concurrent requests have a very positive
effect in the reduction of the response time, this behavior eventually reaches a stagnation point, in which
Figure 5.4: Total response time to a request, as a function of the number of nodes and length of start period.
continuing to increase the number of concurrent requests has a negative effect. Thus, it is recommended
to be sensible upon defining the number of concurrent requests. Despite this, the proposed DMS allows
the collection of 100 requests in less than 5 seconds, which is over 20 times faster than the serial
approach.
After collecting the necessary set of flights, the proposed system determines the response to a
request by running the appropriate optimization algorithms. Each request is solved using the nearest
neighbour heuristic, and the SA and ACO metaheuristics. For each request, it is necessary to run the
optimization algorithms three times, once for each objective function (price, flight duration
and entropy). While the nearest neighbour runs until a solution is found, it is possible to define multiple
stop criteria for the metaheuristics, as mentioned in section 5.1. In this particular evaluation,
each optimization algorithm was allowed to run for a maximum of 1 second, or 10,000 iterations.
As a result, the total time that is necessary to respond to a request, as a function of the number of
nodes and length of the start period, is illustrated in figure 5.4. The analysis of this figure shows that
requests with up to 10 nodes are solved in less than 60 seconds. It also shows that the response time
increases non-linearly as the number of nodes increases. On the other hand, increasing the length of
the start interval has low influence for small instances (up to 10 nodes), but has a significant impact for
greater instances.
5.3 Comparison with Kiwi's Nomad
At the present time, Kiwi's Nomad is the only (non-disclosed) tool that is capable of addressing the
formalized Flying Tourist Problem in the form of an unconstrained multi-city routing problem, although
limited to only 10 different nodes. To facilitate the comparison of the conceived optimization system with
this tool, the definition of the user requests of the proposed FTP (see section 3.1) was kept as similar
as possible to Kiwi's Nomad interface. The user is asked to specify the departing/arriving city, together
with the start date, the set of cities to be visited and the duration of the stay in each city.
The results provided by both applications were directly compared against each other, according to
each considered objective function. The difference in the total flight price and duration (for each query)
was also measured and analyzed as a function of the query parameters. The former evaluation will be
called absolute comparison (subsection 5.3.1), while the latter quantitative evaluation (subsection 5.3.2).
The execution of these tests involved over 100 different queries, varying not only the number of
nodes (2-10), but also the length of the trip start interval (1-15 days). All queries performed
on both applications had their start and return city set to Lisbon (Portugal), while each city to be visited
belonged to the same set of hub airports that was considered in the previous subsections. These queries
were executed on 15 and 16 June 2018 and the base start date was set to
1 August 2018, which, at the time of the tests, was 45 days in the future. The staying period in
each city was set to a random value between 1 and 5 days. For extended start periods, the start
interval was extended to 31 days from the base start date.
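The query set described above could be generated along the following lines. This is only a minimal sketch under stated assumptions, not the script used in the tests: the `HUBS` pool, the `make_query` function and the dictionary layout are hypothetical, and the number of nodes is assumed to count the cities to be visited. Only the parameter ranges (2-10 nodes, stays of 1 to 5 days, Lisbon as start and return city, 1 August 2018 as base start date) come from the text.

```python
import random
from datetime import date

# Hypothetical placeholder names; the actual set of hub airports is not reproduced here.
HUBS = [f"HUB_{i:02d}" for i in range(1, 21)]

BASE_START = date(2018, 8, 1)  # base start date stated in the text


def make_query(n_nodes, start_interval_days, rng):
    """Build one illustrative request (assumed layout, not the thesis' format)."""
    if not 2 <= n_nodes <= 10:
        raise ValueError("the comparison only covers 2 to 10 nodes")
    cities = rng.sample(HUBS, n_nodes)  # distinct cities to visit
    return {
        "origin": "Lisbon",                          # start and return city
        "start_date": BASE_START,
        "start_interval_days": start_interval_days,  # length of the start interval
        "stays": {city: rng.randint(1, 5) for city in cities},  # days per city
    }


rng = random.Random(0)  # fixed seed so that the query set is reproducible
queries = [make_query(n, s, rng) for n in range(2, 11) for s in (1, 15)]
```

Fixing the seed is a design choice that makes it possible to submit exactly the same requests to both applications.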
5.3.1 Absolute comparison
Both applications respond to each query with three different sets of flights, each serving a different
optimization criterion: the cheapest, the fastest and the recommended. For each query, a winner was
determined as follows. The cheapest set of flights is determined according to
the total flight price, while the fastest depends solely on the total flight duration. The recommended set
of flights depends on both the price and the duration, and the winner for this criterion must have both
a lower price and a lower duration. While the optimization criteria used by Kiwi are not known, it is
reasonable to assume that the criterion for the cheapest set of flights is solely its price. For the same
reason, the flight duration is assumed to be the only criterion used to determine the fastest set of flights.
However, for the recommended set of flights, both the price and the flight duration should be considered,
and other optimization criteria may also be included, each with a heuristic weight (e.g., the departure
time may be considered a relevant factor for the recommended flight).
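The winner rules stated above can be sketched as follows. The `Itinerary` class, the `winner` function and the `tol` parameter are illustrative names that do not appear in the original work. In particular, the threshold used to classify two responses as very similar is not stated, so the relative tolerance below is an assumption (with the default `tol=0.0`, only exact ties count as similar).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Itinerary:
    price: float     # total flight price
    duration: float  # total flight duration, in any consistent unit


def _compare(a, b, tol):
    """Return -1 if a is lower than b, 1 if higher, 0 if within the tolerance."""
    if abs(a - b) <= tol * max(abs(a), abs(b)):
        return 0
    return -1 if a < b else 1


def winner(criterion, ours, kiwi, tol=0.0):
    """Return 'ours', 'kiwi' or 'none' for one query and one criterion."""
    p = _compare(ours.price, kiwi.price, tol)
    d = _compare(ours.duration, kiwi.duration, tol)
    if criterion == "cheapest":
        outcome = p                   # price only
    elif criterion == "fastest":
        outcome = d                   # duration only
    elif criterion == "recommended":
        outcome = p if p == d else 0  # must be better on both price and duration
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return {-1: "ours", 0: "none", 1: "kiwi"}[outcome]
```

For instance, an itinerary that is cheaper but slower than its counterpart wins the cheapest criterion, loses the fastest one, and yields no winner for the recommended criterion.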
Figure 5.5: Comparison of the results provided by the proposed tool and by Kiwi’s Nomad application.
Fig. 5.5 illustrates the obtained comparison, by presenting the total number of times that one
application outperformed the other, for each of the three optimization criteria. It also shows the number
of cases in which the responses were very similar.
The analysis of this figure indicates that the developed application presents better solutions for a
significant number of queries. In fact, while it only achieves the fastest set of flights in 42% of the
queries, it presents the cheapest set of flights 95% of the time and the best recommended result 75%
of the time.
The developed application consistently presents high-quality results for two of the three objective
functions. However, it does not perform particularly well in the minimization of the total flight
duration. This is due to an implementation detail in the DMS. Upon receiving a list of flights for
a query, the DMS keeps only a subset of these flights, so as to reduce the required amount of memory.
This subset always consists of the cheapest flights. However, fast flights generally have high
prices. Thus, when the subset of flights is selected, the most promising solution components for the
minimization of the total flight duration are discarded.
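The effect described above can be reproduced with a toy example. The sketch below is not the DMS code: the `Flight` record, the function names and the flight data are made up for illustration. `keep_cheapest` mimics the price-only selection, while `keep_mixed` is a hypothetical variant (not part of the developed system) that reserves part of the subset for the fastest flights.

```python
from typing import NamedTuple


class Flight(NamedTuple):
    flight_id: str
    price: float
    duration: float  # hours


def keep_cheapest(flights, k):
    """Price-only selection: keep the k cheapest flights."""
    return sorted(flights, key=lambda f: f.price)[:k]


def keep_mixed(flights, k):
    """Hypothetical variant: about half cheapest, remainder fastest of the rest."""
    kept = sorted(flights, key=lambda f: f.price)[: (k + 1) // 2]
    rest = [f for f in flights if f not in kept]
    kept += sorted(rest, key=lambda f: f.duration)[: k - len(kept)]
    return kept


# Made-up data in which the fastest flight is also the most expensive one.
flights = [
    Flight("A", 40.0, 9.5),
    Flight("B", 55.0, 7.0),
    Flight("C", 70.0, 6.0),
    Flight("D", 180.0, 2.5),
]

cheap = keep_cheapest(flights, 2)
mixed = keep_mixed(flights, 2)
fastest_overall = min(flights, key=lambda f: f.duration)
```

With this data the price-only subset keeps flights A and B, so the fastest flight (D) is no longer available to the optimizer, whereas the mixed subset keeps A and D at the same memory cost.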
5.3.2 Quantitative evaluation
To evaluate the difference between the responses provided by both applications, the total flight price and
duration of the recommended set of flights were also quantitatively measured (see Fig. 5.6). The values
presented in these graphs refer to the response of the developed application and were normalized using
the response of Kiwi’s Nomad as reference.
(a) Single start date. (b) Extended start period (31 days).
Figure 5.6: Comparison of the recommended flights price and duration, as a function of the number of nodes and the length of start interval. The presented values refer to the proposed application response, and were normalized with respect to Kiwi’s Nomad response value.
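The normalization used in these graphs can be read as a simple ratio. The following sketch assumes that each plotted value is the response of the proposed application divided by the corresponding Kiwi’s Nomad value; the function names and the numbers in the example are illustrative, not measured data.

```python
def normalized(ours, kiwi):
    """Value of the proposed application relative to the Kiwi's Nomad reference."""
    if kiwi <= 0:
        raise ValueError("the reference value must be positive")
    return ours / kiwi


def relative_change_pct(ours, kiwi):
    """Signed difference to the reference, in percent (negative means lower)."""
    return (normalized(ours, kiwi) - 1.0) * 100.0


# Made-up example: a price of 440 against a reference of 400 is 10% higher,
# while a duration of 12 h against a reference of 18 h is about 33% lower.
price_change = relative_change_pct(440.0, 400.0)
duration_change = relative_change_pct(12.0, 18.0)
```

Under this reading, a normalized value below 1 means that the proposed application returned a lower price or duration than the reference.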
Figure 5.6a presents the results of the queries performed for a single start date. Its analysis shows
that, for a small number of nodes (2 and 3), the developed application recommends flights that are
slightly more expensive (≈ 10%-19%) but have a much lower flight duration (≈ 33%-46%). For
requests with more nodes (5 to 10), the results presented by the developed application have both lower
prices (≈ 2%-18%) and lower flight durations (≈ 9%-24%).
Figure 5.6b depicts the obtained results when the length of the start interval was extended to 31
days. With such an extended start period, every recommended set of flights provided by the proposed
application has a lower price and duration. The price presents the most significant change: the minimum
improvement is 8%, while the maximum is 29%.
Finally, it is worth noting that all the presented experiments consider at most 10 different cities to be
visited by the traveler. This limit does not arise from the developed application (which could easily
accommodate more cities), but from a strict limit imposed by Kiwi’s Nomad application, which does not
support more than 10 nodes in the planned route.
6 Conclusions
Contents
6.1 Conclusions
6.2 Achievements
6.3 Future work
6.1 Conclusions
The present work was developed with the aim of simplifying the planning and scheduling of complex
trips. In particular, its main goal is to find the best schedule, route and set of flights, for any given flight
request. This includes the resolution of the unconstrained multi-city flight routing problem.
Despite the proximity to the Traveling Salesman Problem (TSP), the problem under resolution has
several attributes that distinguish it from the TSP and its time-dependent variation. These differences
led to the proposal of a formal definition of the problem, denoted by the Flying Tourist Problem (FTP),
that better describes the characteristics of the problem under resolution.
In order to solve the problem in an efficient manner, the presented work proposes different optimization
strategies that address different goals. The first goal relates to the ability of the system to present
an initial solution in a very short time. This is achieved by implementing simple heuristics, such as the
nearest neighbour. The second goal considers the ability of the system to produce high-quality solutions.
To achieve this, this work implements two different meta-heuristic optimization algorithms: Ant
Colony Optimization and Simulated Annealing. The final goal of the system is to produce different
solutions for different objective functions. To achieve this, the optimization system is run multiple times,
using different representations of the problem, according to the objective function under consideration.
The presented work also includes an analysis of the quality of the solutions constructed by the
different optimization algorithms. The analysis of these results shows that the solutions of the developed
meta-heuristics present a much higher quality than those provided by simpler heuristics, such as the
nearest neighbour. This improvement is modest for very small instances (3 nodes), but it is significant
for instances of medium and large sizes (5-20 nodes).
The discussion presented above leads to the conclusion that the developed system completely achieves
its main goals. First of all, it is successful in the resolution of unconstrained multi-city flight requests.
Second, the developed application is successful in the integration of the proposed solution. This,
together with the analysis of the success rate of the different optimization algorithms, leads to the
conclusion that the devised system is successful in the resolution of the FTP, and that it is capable of
saving time and money in the planning and scheduling of trips of different sizes and complexities.
6.2 Achievements
The main achievements of the present work can be summarized as follows:
• The definition of the Flying Tourist Problem (FTP), which allows the construction of single-flight,
round-trip and multi-city flight requests. This definition also integrates the concepts of extended start
dates, variable durations and time-windows, so as to allow the construction of personalized requests.
• The development of an optimization system that implements the Ant Colony Optimization and
Simulated Annealing meta-heuristics, as well as other optimization methods, in the resolution of
the FTP.
• The development of a web application that allows the construction and resolution of user defined
FTP requests.
• The development of a back-end system, available via API, that integrates the developed optimization
system, and constructs solutions to the FTP requests.
• A comprehensive analysis of the success rate of different optimization algorithms in the resolution
of FTP instances with different sizes and parameters.
6.3 Future work
Given the present work, and considering the goal of developing better search tools for the planning and
scheduling of complex trips, there are some interesting extensions and possible improvements to the
developed system:
• Since the developed optimization system is implemented in the Python 3 programming language,
there could be significant speedups in the optimization time by rewriting this module in a
language like C++, and by using parallel programming techniques to exploit multiple processors
and accelerate this task.
• It is necessary to devise a more efficient way to collect the flight data necessary for the resolution
of FTP requests. For large instances, the bottleneck of the system is usually the data collection
process, and not the actual optimization.
• Finally, it would be particularly useful and interesting to extend the developed system by considering
different public transportation means, such as bus and railroad services, in the resolution of
the proposed routing problem. This would extend the search space of the problem, and possibly
contribute to the construction of better itineraries and solutions to the requests.
Bibliography
[1] Kiwi.com, “Travelling Salesman Challenge,” https://travellingsalesman.cz/, June 2018.
[2] N. Rosenberg, “Innovation and economic growth,” Innovation and Growth in Tourism, OECD, pp.
43–52, 06 2006.
[3] W. Travel and T. Council, “Travel and Tourism: world economic impact in 2017,” https://www.wttc.
org/-/media/files/reports/economic-impact-research/regions-2017/world2017.pdf, June 2018.
[4] M. Westcott, Innovation and Growth in Tourism. B.C. Open Textbook project, 2006. [Online].
Available: https://www.oecd-ilibrary.org/content/publication/9789264025028-en
[5] A. M. Research, “Online Travel Market by Mode of Booking, Types of Platform, Service Types, and
Age Group: Global Opportunity Analysis and Industry Forecast, 2014-2022,” 2016.
[6] E. Inc., “Expedia API documentation,” https://hackathon.expedia.com/docs/public/api/, June 2018.
[7] A. I. group, “Amadeus for developers,” https://developers.amadeus.com/enterprise, June 2018.
[8] Kiwi.com, “API documentation,” https://docs.kiwi.com/, April 2018.
[9] E. Lawler, “Combinatorial Optimization: Networks and Matroids,” University of California at Berke-
ley, 1976.
[10] A. Schrijver, “Combinatorial Optimization: Polyhedra and Efficiency,” Springer Berlin Heidelberg,
2002.
[11] M. Dorigo and T. Stutzle, “Ant Colony Optimization: Overview and Recent Advances,” IRIDIA –
Technical Report Series ISSN 1781-3794, 2009.
[12] A. Punne, “The Traveling Salesman problem: applications, formulations and variations,” Science
+ Business, Springer, 2007.
[13] G. Dantzig, R. Fulkerson, and S. Johnson, “Solution of a Large-Scale Traveling-Salesman Prob-
lem,” Journal of the Operations Research Society of America, 1954.
[14] L. Shen, “Computer Solutions of the Traveling Salesman Problem,” The Bell System Technical
Journal ( Volume: 44, Issue: 10, Dec. 1965 ), 1965.
[15] L. Paquete, M. Chiarandini, and T. Stutzle, “Pareto Local Optimum Sets in the Biobjective Traveling
Salesman Problem: An Experimental Study,” Metaheuristics for Multiobjective Optimisation pp
177-199, 2007.
[16] Y. Caseau and F. Laburte, “Solving Small TSPs with Constraints,” Proceedings of the 14th inter-
national conference on logic programming, 1997.
[17] A. Punne, “A travelling salesman problem with allocation, time window and precedence constraints
and application to ship scheduling,” Intl. Trans. in Operation Research 7 (2000) 231-244, 2000.
[18] M. Savelsbergh, “Local search in routing problems with time windows,” Annals of Operations
Research 4 (1986/6) 285-305, 1985.
[19] J. Kruskal, “On the shortest spanning subtree of a graph and the traveling salesman problem,”
American Mathematical Society, 1956.
[20] R. Karp, “Reducibility among combinatorial problems,” Complexity of Computer Computations,
vol. 40, pp. 85–103, 01 1972.
[21] ——, “Reducibility among combinatorial problems,” The IBM Research Symposia Series, 1972.
[22] T. Oncan, I. Altinel, and G. Laporte, “A comparative analysis of several asymmetric traveling
salesman problem formulations,” Computers Operation Research 36 (2009) 637-654, 2007.
[23] A. Punne, “A Traveling-Salesman-Based Approach to Aircraft Scheduling in the Terminal Area,”
Nasa Technical memorandum 100062, 1988.
[24] N. Agatz, P. Bouman, and M. Schmidt, “Optimization Approaches for the Traveling Salesman
Problem with Drone,” SSRN Electronic Journal. 10.2139/ssrn.2639672., 2016.
[25] F. Furini, “The Time Dependent Traveling Salesman Planning Problem in Controlled Airspace,”
Transportation Research Part B: Methodological. 90. 38-55. 10.1016/j.trb.2016.04.009., 2015.
[26] “The air Traveling Salesman,” https://sites.google.com/site/travellingcudasalesman/, March 2018.
[27] J. Schneider, “The time-dependent traveling salesman problem,” Physica A 314 (2002) 151 – 155,
2002.
[28] L. Gouveia and S. Vob, “A classification of formulations for the (time-dependent) traveling sales-
man problem,” European Journal of Operational Research 83 (1995) 69-82, 1993.
[29] D. Venkatesan, K. Kannan, and S. Balachandar, “A New Genetic Algorithm for Time Depen-
dent Combinatorial Optimization Problem,” Natl. Acad. Sci. Lett. DOI 10.1007/s40009-016-0433-5,
2015.
[30] C. Malandraki, “Time dependent vehicle routing problems: formulations, properties and heuristic
algorithms,” Transportation Science 26(3):185-200 · August 199, 1992.
[31] S. Porta, P. Crucitti, and V. Latora, “The network analysis of urban streets: A dual approach,”
Physica A 369 (2006) 853–866, 2006.
[32] J. MacGregor and Y. Chu, “Human performance of the traveling salesman and related problems:
a review,” The Journal of Problem Solving • volume 3, no. 2 (Winter 2011), 2011.
[33] G. Gutin and A. Punne, “The Traveling Salesman Problem and its variations,” Science + Business,
Springer, 2007.
[34] H. Abeledo and R. Fukasawa, “The time dependent traveling salesman problem: polyhedra and
algorithm,” Springer and Mathematical Optimization Society 2012, DOI: DOI 10.1007/s12532-012-
0047-y, 2013.
[35] J. Picard and M. Queyranne, “The Time-Dependent Traveling Salesman Problem and Its Applica-
tion to the Tardiness Problem in One-Machine Scheduling,” Operations Research, 1978.
[36] E. M. Loiola, N. M. M. de Abreu, P. O. Boaventura-Netto, P. Hahn, and T. Querido, “A survey
for the quadratic assignment problem,” European Journal of Operational Research, vol. 176,
no. 2, pp. 657 – 690, 2007. [Online]. Available: http://www.sciencedirect.com/science/article/pii/
S0377221705008337
[37] J. Albiach, D. Soler, and E. Martines, “A way to optimally solve a time-dependent Vehicle Routing
Problem with Time Windows,” Operations Research Letters 37 (2009) 37-42, 2008.
[38] C. Cheng and C. Mao, “A modified ant colony system for solving the travelling salesman problem
with time windows,” Mathematical and Computer Modelling 46 (2007) 1225–1235, 2007.
[39] N. Ascheuer, M. Fischetti, and M. Grotschel, “A Polyhedral Study of the Asymmetric Traveling
Salesman Problem with Time Windows,” Networks, Vol. 36(2), 69-79 2000, 2000.
[40] E. Ulungu and J. Tehhem, “Multi-objective Combinatorial Optimization Problems: A Survey,” Jour-
nal of Multi-Criteria Decision Analysis, 1994.
[41] N. Jozefowiez, F. Semet, and E. Talbi, “Multi-objective vehicle routing problems,” European Journal
of Operational Research, 2008.
[42] Z. Yan, L. Zhang, L. Kang, and G. Lin, “A New MOEA for Multi-objective TSP and Its Convergence
Property Analysis,” Springer-Verlag Berlin Heidelberg, 2003.
[43] D. Veldhuizen and G. Lamont, “Multiobjective Evolutionary Algorithms: Analyzing the State-of-the-
Art,” Evolutionary Computation, 2000.
[44] F. Choobineh, E. Mohebbi, and H. Khoo, “A multi-objective tabu search for a single-machine
scheduling problem with sequence-dependent setup times,” European Journal of Operational Re-
search, 2006.
[45] G. Dantzig and J. Ramser, “The Truck dispatching problem,” Management Science, Vol. 6, No. 1
(Oct., 1959), pp. 80-91, 1959.
[46] J. Lenstra and A. Kan, “Complexity of vehicle routing and scheduling problem,” Networks 11, 221-
227, 1981.
[47] G. Laporte, “The vehicle routing problem: An overview of exact and approximate algorithms,”
European Journal of Operational Research 59 (1992) 345-358, 1992.
[48] D. Soler, J. Albiach, and E. Martinez, “A way to optimally solve the time-dependent Vehicle Routing
Problem with time windows,” Operations Research Letters 37 (2009) 37–42, 2009.
[49] L. Gambardella, E. Taillard, and G. Agazzi, “MACS-VRPTW: Vehicle Routing Problem with time
windows,” New Ideas in Optimization, McGraw-Hill, London, 1999, pp. 63–76, 1999.
[50] C. Blum, “Metaheuristics in Combinatorial Optimization: Overview and Conceptual Comparison,”
ACM Computing Surveys, Vol. 35, No. 3, September 2003, pp. 268–308, 2003.
[51] C. Malandraki and M. Daskin, “Time Dependent Vehicle Routing Problems: Formulations, proper-
ties and heuristic algorithms,” Transportation Science 26(3):185-200, 1992.
[52] A. Donati, M. R, N. Casagrande, A. Rizzoli, and L. Gambardella, “Time dependent vehicle routing
problem with a multi ant colony system,” European Journal of Operational Research 185 (2008)
1174–1191, 2006.
[53] M. Gendreau, J. Potvin, O. Braysy, G. Hasle, and A. Looketagen, “Metaheuristics for the Vehicle
Routing Problem and Its Extensions: A Categorized Bibliography,” The Vehicle Routing Problem:
Latest Advances and New Challenges pp 143-169. Operations Research/Computer Science In-
terfaces, vol 43. Springer, Boston, MA, 2007.
[54] Y. Kuo, “Using simulated annealing to minimize fuel consumption for the time-dependent vehicle
routing problem,” Computers & Industrial Engineering 59 (2010) 157–165, 2010.
[55] J. Soojung and A. Haghani, “Genetic algorithm for the time-dependent vehicle routing problem,”
Transportation Research Record 1771 Paper No. 01-0261, 2001.
[56] G. Laporte, “The vehicle routing problem: An overview of exact and approximate algorithms,”
European Journal of Operational Research, vol. 59, pp. 345–358, 06 1992.
[57] C. Malandraki and M. Daskin, “Time Dependent Vehicle Routing Problems: Formulations, Prop-
erties and Heuristic Algorithms,” Transportation Science 26(3):185-200., 2017.
[58] M. Figliozzi, “The time dependent vehicle routing problem with time windows: Benchmark prob-
lems, an efficient solution algorithm, and solution characteristics,” Transportation Research Part E
48 (2012) 616–636, 2012.
[59] H. Chen, C. Hsueh, and M. Chang, “The real-time time-dependent vehicle routing problem,” Trans-
portation Research Part E 42 (2006) 383–408, 2006.
[60] A. Haghani and S. Jung, “A dynamic vehicle routing problem with time-dependent travel times,”
Computers & Operations Research 32 (2005) 2959–2986, 2004.
[61] D. Jones, S. Mirrazavi, and M. Tamiz, “Multi-objective meta-heuristics: An overview of the current
state-of-the-art,” European Journal of Operational Research 137 (2002) 1-9, 2002.
[62] K. Deb, “Multi-objective optimization using evolutionary algorithms: an introduction,” KanGAL
Report Number 2011003, 2011.
[63] ——, “A Simulated Annealing-Based Multiobjective Optimization Algorithm: AMOSA,” IEEE trans-
actions on evolutionary computation, 2011.
[64] G. Laporte, “The traveling salesman problem: An overview of exact and approximate algorithms,”
European Journal of Operational Research, vol. 59, no. 2, pp. 231–247, June 1992. [Online].
Available: https://ideas.repec.org/a/eee/ejores/v59y1992i2p231-247.html
[65] J. Albiach, M. Sanchis, and D. Soler, “An asymmetric TSP with time windows and with time-
dependent travel times and costs: An exact solution through a graph transformation,” European
Journal of Operational Research 189 (2008) 789–802, 2007.
[66] M. Soloman, “An exact algorithm for the traveling salesman problem with time windows,” Opera-
tions Research, Vol. 43, No. 2 (Mar. - Apr., 1995), pp. 367-371, 1992.
[67] G. Pataki, “Teaching integer programming formulations using the traveling salesman problem,”
Siam Review - SIAM REV, vol. 45, pp. 116–123, 03 2003.
[68] J. Clausen, “Branch and bound algorithms – principles and examples,” 1999.
[69] B. Golden, L. Bodin, T. Doyle, and W. Stewart, “Approximate traveling salesman
algorithms,” Operations Research, vol. 28, no. 3, pp. 694–711, 1980. [Online]. Available:
http://www.jstor.org/stable/170036
[70] M. Held and R. M. Karp, “The traveling-salesman problem and minimum spanning
trees,” Operations Research, vol. 18, no. 6, pp. 1138–1162, 1970. [Online]. Available:
https://doi.org/10.1287/opre.18.6.1138
[71] D. Johnson and L. McGeoch, “The traveling salesman problem: A case study in local optimization,”
Local Search in Combinatorial Optimization, vol. 1, 01 1997.
[72] S. Lin and B. Kernighan, “An Effective Heuristic Algorithm for the Traveling-Salesman Problem,”
Bell Telephone Laboratories, Incorporated, Murray Hill, N.J., 1971.
[73] R. Jonker and T. Volgenant, “Transforming asymmetric into symmetric traveling salesman prob-
lems,” Operations Research Letters Volume 2, Issue 4, November 1983, Pages 161-163, 1983.
[74] I. Osman and J. Kelly, “Meta-Heuristics: An Overview,” Kluwer academic publishers, 1996.
[75] M. Dorigo and G. Caro, “Ant Colony Optimization: a new meta-heuristic,” Evolutionary
Computation, 1999. CEC 99. Proceedings of the 1999 Congress on, 1999.
[76] M. Dorigo and T. Stutzle, “Ant Colony Optimization,” MIT Press, Cambridge, MA, 2004, 2004.
[77] S. Goss, S. Aron, J. L. Deneubourg, and P. J. M, “The Self-Organizing Exploratory Pattern of the
Argentine Ant,” Journal of Insect Behavior, Vol. 3, No. 2, 1990, 1989.
[78] ——, “Self-organized shortcuts in the Argentine Ant,” Naturwissenschaften 76: 579-581, 1989.
[79] E. Aarts, J. Kost, and W. Michiels, “Simulated annealing,” Naturwissenschaften 76: 579-581, 1989.
[80] S. Radhakrishnan and J. Venture, “Simulated annealing for parallel machine scheduling with
earliness-tardiness penalties and sequencedependent set-up times,” International Journal of Pro-
duction Research, 38:10, 2233-2252, 2010.
[81] D. Henderson, S. H. Jacobson, and A. W. Johnson, The Theory and Practice of Simulated An-
nealing. Boston, MA: Springer US, 2003, pp. 287–319.
[82] Y. Nourani and B. Andresen, “A comparison of simulated annealing cooling strategies,” Journal
of Physics A: Mathematical and General, vol. 31, no. 41, pp. 8373–8385, oct 1998. [Online].
Available: https://doi.org/10.1088%2F0305-4470%2F31%2F41%2F011
[83] J.-C. Picard and M. Queyranne, “The time-dependent traveling salesman problem and its
application to the tardiness problem in one-machine scheduling,” Operations Research, vol. 26,
no. 1, pp. 86–110, 1978. [Online]. Available: https://doi.org/10.1287/opre.26.1.86
[84] M. Dorigo and L. M. Gambardella, “Ant colony system: a cooperative learning approach to the
traveling salesman problem,” Central European Researchers Journal, vol. 1, no. 1, pp. 53–66,
April 1997.
[85] J. Szabo, “Comparison of methods for generating initial solution for simulated annealing,” Faculty
of Management Science and Informatics, University of Zilina, vol. 2, no. 1, pp. 37–41, June 2016.
[86] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, “Equation of state calculations
by fast computing machines,” Journal of Chemical Physics, vol. 21, pp. 1087–1092, 01 1953.
[87] C. Y. Wang, M. Lin, Y. Zhong, and H. Zhang, “Solving travelling salesman problem using multiagent
simulated annealing algorithm with instance-based sampling,” International Journal of Computing
Science and Mathematics, vol. 6, pp. 336–353, 09 2015.
[88] H. D. Center, “What is Heroku,” https://www-staging.heroku.com/what, June 2018.
[89] G. Cloud, “Google App Engine,” https://cloud.google.com/appengine/, June 2018.
[90] Amazon.com, “Amazon Web Services,” https://aws.amazon.com/, June 2018.
[91] Stackshare, “Heroku vs. Google App Engine vs. AWS Elastic Beanstalk,” https://stackshare.io/
stackups/aws-elastic-beanstalk-vs-google-app-engine-vs-heroku, June 2018.
[92] H. Jin, S. Ibrahim, T. Bell, W. Gao, D. Huang, and S. Wu, Cloud Types and Services. Springer
US, 2010, pp. 335–355. [Online]. Available: https://doi.org/10.1007/978-1-4419-6524-0_14
[93] H. D. Center, “How Heroku Works,” https://devcenter.heroku.com/articles/how-heroku-works,
June 2018.
[94] N. Foundation, “Node.js,” https://nodejs.org/en/, June 2018.
[95] S. Tilkov and S. Vinoski, “Node.js: Using javascript to build high-performance network programs,”
vol. 14, pp. 80 – 83, 01 2011.
[96] M. Cantelon, M. Harter, T. Holowaychuk, and N. Rajlich, Node.Js in Action, 1st ed. Greenwich,
CT, USA: Manning Publications Co., 2013.
[97] D. S. Foundation, “Django,” https://www.djangoproject.com/, June 2018.
[98] F. Inc., “React: a JavaScript library for building User Interfaces,” https://reactjs.org/, June 2018.
[99] M. K. Caspers, “React and redux,” Rich Internet Applications, Carl von Ossietzky University, Old-
enburg, pp. 14–18, 02 2017.
[100] D. Abramov, “Redux: a predictable state container for JavaScript applications,” https://redux.js.
org/, June 2018.
[101] T. E. Oliphant, “A guide to NumPy,” USA: Trelgol Publishing, 2006.
[102] R. J. V. Wiel and N. V. Sahinidis, “Heuristic bounds and test problem generation for the
time-dependent traveling salesman problem,” Transportation Science, vol. 29, no. 2, pp. 167–183,
1995. [Online]. Available: https://doi.org/10.1287/trsc.29.2.167