Flight Time and Cost Minimization in Complex Routes

Rafael Alexandre da Silva Marques

Thesis to obtain the Master of Science Degree in

Aerospace Engineering

Supervisors: Doutor Nuno Filipe Valentim Roma, Doutor Luís Manuel Silveira Russo

Examination Committee

Chairperson: Doutor José Fernando Alves da Silva

Supervisor: Doutor Nuno Filipe Valentim Roma

Member of the Committee: Doutor Vasco Miguel Gomes Nunes Manquinho

October 2018

Declaration

I declare that this document is an original work of my own authorship and that it fulfills all the requirements of the Code of Conduct and Good Practices of the Universidade de Lisboa.

Acknowledgments

I would like to thank my parents and all my family, friends and both of my professors for the support,

patience and discussions that allowed for the development and conclusion of this work.

This work was partially supported by national funds through Fundação para a Ciência e a Tecnologia

(FCT) under projects UID/CEC/50021/2013 and PTDC/EEI-HAC/30485/2017.

Abstract

This work formalizes and addresses the Flying Tourist Problem (FTP), an NP-hard generalization of the Traveling Salesman Problem (TSP) whose goal is to find the best schedule, route, and set of flights for any given unconstrained multi-city flight request. Despite the existence of numerous flight search applications, most of them cannot properly handle unconstrained multi-city flight requests, since the problem is generally intractable. Accordingly, the main goal of this research is to develop a methodology that solves this demanding problem efficiently. To accomplish this, different heuristic and meta-heuristic optimization algorithms were considered, including Ant Colony Optimization and Simulated Annealing, allowing solutions to be identified in real time, even for large instances. The developed methods were integrated into a web application prototype that quickly resolves user-defined requests. The implemented system was evaluated using different criteria, including the quality of its optimization system, the utility of the devised problem, and its performance compared to other similar systems. Furthermore, when the developed system was compared to the only known (but undisclosed) alternative, it provided the cheapest and the best-recommended solutions 95% and 74% of the time, respectively. As a result, when planning a complex multi-city trip, the developed system allows the user to save a significant amount of time and money.

Keywords

Flying Tourist Problem, Traveling Salesman Problem, Combinatorial Optimization, Ant Colony Optimization, Simulated Annealing, Web Application.

Summary

The present work formalizes and addresses a problem that was named the Flying Tourist Problem, which arises as a generalization of the well-known Traveling Salesman Problem, and whose goal is to determine the best schedule, route and set of flights that fulfil an itinerary passing through several cities, without constraints, and carried out solely on the basis of commercial flights. The main objective of this work is the development of an efficient methodology for solving this problem. To achieve this objective, optimization algorithms based on Ant Colony Optimization and Simulated Annealing techniques were considered, which allows solutions to be determined in real time, even for large problems. The developed methods were integrated and prototyped in a web service, allowing the resolution of real problems defined by the user. The implemented system was evaluated using different criteria, including the quality of its optimization system; the utility of the proposed problem; and its performance when compared with other similar systems. Furthermore, when comparing the developed system with the only (non-open) alternative currently in existence, it was found that in 95% of cases the solution found is the cheapest and that in 74% of cases it corresponds to the best recommended solution. Consequently, the developed system offers significant advantages in planning trips involving several cities, making it possible to save significant amounts of time and money.

Keywords

Flying Tourist Problem, Traveling Salesman Problem, Combinatorial Optimization, Ant Colony Optimization, Simulated Annealing, Web service.

Contents

1 Introduction
1.1 Motivation
1.2 Existing flight search services
1.2.1 User search tools
1.2.2 Developer search tools
1.3 Objectives
1.4 Main contributions
1.5 Document structure

2 Literature review on the Traveling Salesman Problem
2.1 Common Combinatorial Optimization Problems
2.1.1 Traveling Salesman Problem
2.1.1.A Problem definition
2.1.1.B Time Dependent TSP
2.1.1.C TSP with time-windows
2.1.1.D Multi-objective TSP
2.1.2 Vehicle Routing Problem
2.1.2.A Problem definition
2.1.2.B Time-dependent VRP
2.1.2.C Multi-objective VRP
2.2 Common Optimization Methods overview
2.2.1 Exact algorithms
2.2.1.A Integer Linear Programming
2.2.1.B Branch and Bound
2.2.2 Heuristic algorithms
2.2.2.A Held-Karp Lower Bound
2.2.2.B Tour construction
2.2.2.C Tour improvement
2.2.3 Meta-Heuristic algorithms
2.2.3.A Ant Colony Optimization
2.2.3.B Simulated Annealing

3 Flying Tourist Problem
3.1 Problem formulation
3.2 Relation to the Traveling Salesman Problem
3.3 Graph representation
3.4 Dimensional overview
3.5 Optimization methodology
3.5.1 Heuristic algorithms
3.5.1.A Pseudo-random construction procedure
3.5.1.B Nearest neighbour procedure
3.5.2 Metaheuristic algorithms
3.5.2.A Ant Colony Optimization procedure
3.5.2.B Simulated Annealing procedure
3.6 Summary

4 System Design and Implementation
4.1 System Architecture
4.1.1 Client Side Application
4.1.2 Server Side Application
4.1.2.A Data Management System
4.1.2.B Optimization system
4.2 Underlying technologies
4.2.1 Heroku
4.2.2 Node.js
4.2.3 Django
4.2.4 React and Redux
4.3 Client Side Application implementation
4.3.1 User Input
4.3.2 Communication with the SSA API
4.3.3 User Interface
4.4 Server Side Application implementation
4.4.1 SSA dataflow
4.4.2 SSA interface protocol
4.4.3 Data Management System
4.5 Optimization system
4.6 Summary

5 Experimental Results
5.1 Optimization module evaluation
5.1.1 Asymmetric TSP evaluation
5.1.2 Time-dependent TSP evaluation
5.1.3 Discussion
5.2 Flying Tourist Problem evaluation
5.2.1 Quantitative evaluation and improvement
5.2.2 Balancing the total flight price and duration
5.2.3 Impact of the trip start interval
5.2.4 Response time
5.3 Comparison with Kiwi’s Nomad
5.3.1 Absolute comparison
5.3.2 Quantitative evaluation

6 Conclusions
6.1 Conclusions
6.2 Achievements
6.3 Future work

List of Figures

2.1 Illustration of the classes P, NP, NP-complete and NP-hard.

2.2 The 2-opt local search reconnects two edges, hoping to fold possible crossovers, decreasing the overall tour cost. In the left image, a crossover is identified. In the middle image, the edges belonging to this crossover are removed, and in the figure to the right, they are reconnected, forming a new valid tour.

2.3 The crossing bridges can only be solved by reordering 4 edges. The resolution of this problem with local search is only possible with 4-opt or higher.

2.4 The double bridge experiment. On the left, two bridges with the same length. Experimental results show that ants distribute themselves evenly amongst both bridges. On the right, one of the bridges is longer than the other. Experimental results show that ants use the shorter bridge more often.

3.1 Illustration of a Flying Tourist Problem using a multipartite graph. Each node (A, B, C) is associated with a waiting period of (1, 2, 3) time units, respectively. The red arrows represent a possible solution to the problem.

3.2 Illustration of the distribution of the initial, final and transition arcs.

4.1 Structure and data flow of the proposed application.

4.2 Proposed User Interface layout for small/medium and large devices. There are two essential views and one optional view.

4.3 Simplified illustration of the optimization system, which utilizes different algorithms to produce a solution to a user-defined request.

4.4 Technology stack used by the developed application.

4.5 Block diagram of the state cycle of the Client Side Application.

4.6 Building blocks of an application built using React and Redux.

4.7 Structure of the developed user interface.

4.8 Application rendered on a desktop computer.

4.9 Application rendered on a mobile device.

4.10 Server Side Application dataflow.

4.11 The Data Management System uses concurrent HTTP requests to communicate with third-party flight APIs.

4.12 Flowchart of the implemented Ant Colony System procedure.

4.13 Flowchart of the implemented Simulated Annealing procedure.

5.1 Variation of the total flight price and duration when minimizing the entropy objective function.

5.2 Price improvement as a function of the trip start interval.

5.3 Required time to perform 100 HTTP requests to a third-party API, as a function of the number of concurrent requests.

5.4 Total response time to a request, as a function of the number of nodes and length of start period.

5.5 Comparison of the results provided by the proposed tool and by Kiwi’s Nomad application.

5.6 Comparison of the recommended flights price and duration, as a function of the number of nodes and the length of start interval. The presented values refer to the proposed application response, and were normalized with respect to Kiwi’s Nomad response value.

List of Tables

1.1 Search results across several applications. Search tools provided according to application.

1.2 Multi-city search results and tools.

4.1 Parallelism between User Input, Actions and Reducers. To each user-defined input corresponds an action, declaring the intent of changing the state with some specific data, and a reducer, which actually modifies the state.

4.2 Keyword/value pairs of the SSA protocol to uniquely identify a resource.

4.3 Algorithm specific parameters.

5.1 Performance of the ACO and SA on the asymmetric TSP benchmarks, taken from the TSPLib (stop-criteria = 60 seconds).

5.2 Effects of increased optimization time on the final result.

5.3 Performance of the ACO and SA on the time-dependent TSP benchmarks (stop-criteria = 60 seconds).

5.4 Comparison of different Flying Tourist Problem solutions obtained with distinct algorithms and optimization criteria, considering the Metric Nearest Neighbor approach (shaded gray) as reference.

List of Algorithms

1 ACO metaheuristic

2 SA metaheuristic

Acronyms

TSP Traveling Salesman Problem

ATSP Asymmetric Traveling Salesman Problem

TDTSP Time-dependent Traveling Salesman Problem

ACO Ant Colony Optimization

DTS Direct Travel Supplier

API Application Programming Interface

FTP Flying Tourist Problem

COP Combinatorial Optimization Problem

VRP Vehicle Routing Problem

ILP Integer Linear Programming

BB Branch and Bound

SA Simulated Annealing

LK Lin-Kernighan algorithm

CSA Client-side application

SSA Server-side application

HTTP HyperText transfer protocol

AJAX Asynchronous JavaScript and XML

URI Uniform Resource Identifier

UI User Interface

JSON JavaScript Object Notation

OS Optimization System

DMS Data Management System

HTML HyperText Markup Language

DOM Document Object Model

OTA Online Travel Agency

OS Operating System

List of Symbols

The following list describes several symbols that are used later in the body of the document.

α The relative pheromone influence

∆f The objective function difference

∆avg The average objective function difference

η The inverse of artificial pheromone value

λ The cooling coefficient

Ω The set of constraints amongst the decision variables

Π A Combinatorial Optimization Problem (COP)

ρ The pheromone evaporation ratio

σ A permutation of the set of nodes

τ The artificial pheromone value

A The set of arcs

aij The arc transition from node i to node j

atij The arc transition from node i to node j at time unit t

cij The cost of transition of arc aij

ctij The cost of transition of the arc atij at time unit t

cwtk The cost of the waiting time at time k

D The set of durations of stay

f The objective function

G The graph of a COP

i, j Index components of a node

M The Markov chain length

m The number of artificial ants

p The flight duration (processing time)

py The probability of accepting a candidate solution y

q Pseudo-random value calculated at run-time

Q0 The exploration rate

S Search space of a COP

s A solution to a COP

s∗ An optimal solution

t The time index

T0 The allowable start period

tk Temperature of the state at time k

TW The set of time windows associated with each node

V The set of nodes

wc The relative weight of the flight price

wp The relative weight of flight duration

x, y Solutions to a COP

1 Introduction

1.1 Motivation

Online Travel Agencies (OTA) are online applications that sell travel products such as commercial flight tickets. Although many consumers retain the option of buying flights directly from airlines, the majority opt to use OTA. The main reason is that these agencies aggregate flight data from multiple airlines instead of being limited to a single one, which ultimately widens the consumer's options. Furthermore, many OTA work as meta-search engines, searching a variety of websites to find the best flights that satisfy the consumer's requirements. However, while OTA usually provide very complete search functionality for simple flights, most fail to offer the same search options for a trip composed of multiple cities.

As an example, consider a trip that starts and ends at a given city and must visit every other city in a particular list. If there are no constraints on the order in which these cities are visited, this problem is well known in the scientific community as the Traveling Salesman Problem (TSP). It is considered very difficult to solve, since the total number of possible solutions grows factorially with the number of cities.

However, when commercial flights are the means of transportation between every pair of cities, the problem can no longer be considered the classic TSP, but rather its time-dependent counterpart. This is because some of the major flight characteristics, such as price and duration, cannot be considered constant over time. Rather, they depend on the particular flight that is selected, and the characteristics of that flight may follow no apparent logical rule, at least from the consumer's perspective.
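The effect of time-dependent prices can be illustrated with a toy example. The sketch below is not the method developed in this thesis: the fares, the one-day stay in each city and all function names are invented for illustration. It enumerates every visiting order for a trip from Lisbon through Amsterdam and Berlin, looking up each fare by departure day rather than using a single fixed cost per city pair.

```python
from itertools import permutations

HOME = "LIS"
CITIES = ("AMS", "BER")

# FARES[(origin, destination)][day] -> price in euros.
# Every value here is invented for illustration only.
FARES = {
    ("LIS", "AMS"): [60, 90, 90],
    ("LIS", "BER"): [80, 80, 80],
    ("AMS", "BER"): [50, 40, 70],
    ("BER", "AMS"): [45, 30, 60],
    ("AMS", "LIS"): [70, 70, 100],
    ("BER", "LIS"): [75, 75, 55],
}


def tour_price(order, start_day=0, stay=1):
    """Total price of HOME -> order -> HOME, with one flight every `stay` days."""
    stops = (HOME, *order, HOME)
    total = 0
    day = start_day
    for origin, destination in zip(stops, stops[1:]):
        # The fare depends on the day on which the flight departs.
        total += FARES[(origin, destination)][day]
        day += stay
    return total


def cheapest_tour(cities=CITIES):
    """Brute force: evaluate every visiting order and keep the cheapest."""
    candidates = ((order, tour_price(order)) for order in permutations(cities))
    return min(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(cheapest_tour())
```

Because the same city pair has a different fare on each day, the price of a route cannot be read from a single cost matrix; it has to be evaluated together with the schedule. Exhaustive enumeration is only practical for very small lists of cities.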

Consequently, finding the most efficient set of flights, from the consumer's point of view, tends to be a repetitive and time-consuming task. Only the most persistent consumer will be able to find the best solution, and only for a very small list of cities. As the number of cities increases, even the most persistent consumer can hardly check all possible solutions. Ultimately, this means the consumer will pay more than necessary for the requested service.

This thesis also arises as a response to the public contest called Traveling Salesman Challenge [1], issued by Kiwi, a well-established OTA, even though this work began before the challenge was issued. In this challenge, Kiwi recognizes that, in most cases, users do not care about the order in which they visit a given list of cities. The challenge was set because most OTA do not offer this type of search tool, due to the computational complexity of the problem, even though there is a market niche interested in this type of service.

Accordingly, this work addresses the problem of solving unconstrained multi-city trips by studying it and by developing the tools necessary to solve it effectively and in a time-efficient manner. It also aims to develop a proof-of-concept online flight search application that implements these technologies in order to ultimately provide high-quality search for complex flight requests.

1.2 Existing flight search services

The tourism industry dates back to the 19th century, but several technological milestones have since driven sustained growth in its market size. First, during the 1920s, the development of commercial aviation had a significant positive impact on the industry, shifting the transportation focus to the airplane. Much later, during the 1990s, the establishment of the internet changed the market, because airlines could sell directly to passengers [2]. More recently, the widespread use of mobile phones led to a further increase in market size. In 2016, the direct contribution of the tourism industry to GDP was over 2.3 trillion dollars, while the total contribution was over 7.6 trillion dollars [3].

The market growth of the tourism industry is sustained by travel agencies, whose main function is to serve as agents, advertising and selling products and services on behalf of others [4]. These services usually include, but are not limited to, transportation, accommodation, insurance, tours and other tourism-related products. In recent years, Online Travel Agencies (OTA) have become particularly important to the industry, because they allow a fast, direct and rich interaction with the user. Using only a cell phone, most people can, in a matter of minutes, search for and book a flight, hotel services and, if necessary, even a rental car.

Most OTA operate as meta-search engines, that is, they perform a search across multiple independent travel service providers. This is a significant difference from the services provided by Direct Travel Suppliers (DTS), which are limited to offering their own services. Examples of DTS are airlines, hotels and car rental companies, which usually sell only their own product or service directly to the client. OTA, on the other hand, usually do not own any travel services, but serve solely as intermediaries between the traveler and the travel service provider. Recent reports show that OTA are increasing their market share, but direct travel suppliers still account for 57% of total online travel consumption [5].

Hence, this section presents a brief overview of the search tools provided by different OTA for single-flight, round-trip and multi-city requests. Since the search tools provided by OTA may vary according to the entity making the request, this overview is given from both the user and the developer point of view, in sections 1.2.1 and 1.2.2, respectively.

1.2.1 User search tools

This section provides an overview of the search tools available for users to find and book commercial flights online. It presents a brief survey of the services provided by different meta-search engines, in particular Google Flights, Booking.com, Tripadvisor, Cheapflights.com, Momondo, Expedia, Edreams, Kayak, Kiwi and Skyscanner. The overview is grouped according to trip size: single flights and round trips are considered together, and multi-city trips are considered separately.

Table 1.1: Search results across several applications. Search tools provided according to application.

Columns: Application, Date-range, Price overview, Multiple results, Map, Price [€] single-flight, Price [€] round-trip

Google Flights X X X 65 127
Expedia 81 167
Booking.com X X 67 126
Momondo X X 62 102
TripAdvisor 56 124
cheapflights.com 75 150
Adioso X X 69 137
Kiwi X X X X 76 224
SkyScanner X X 65 127
Edreams X X 56 116

To summarize the characteristics of the solutions presented by each of the tested applications, the same flight queries were performed on each of them. In every application, three different queries were performed, one for each type of flight search. First, a single flight from Lisbon to Amsterdam on 8/6/2018. Second, a round trip involving the same two cities and start date, with a stay of 3 days. Finally, the third query considers three different single flights: i) from Lisbon to Amsterdam (8/6/2018); ii) from Amsterdam to Berlin (11/6/2018); and iii) from Berlin to Lisbon (14/6/2018). Given that different applications have access to different flight data, each application is expected to present different results. This overview intends to analyze and quantify these differences. It is also worth noting that a round trip is not necessarily composed of two flights from the same airline; it may instead be composed of two completely independent flights. In general, a flight search application will consider both cases and suggest the best option.

The results of the single-flight and round-trip queries are presented in table 1.1. The first column indicates the application in which the previously described queries were performed. The second column denotes the ability of that application to search over multiple start dates, instead of forcing a single one. The third column indicates whether the application presents an overview of the prices for different start dates, usually as a visual aid such as a bar chart. The fourth column indicates whether the application responds to the query with different results for different objective functions, such as the cheapest, the fastest and the recommended set of flights. The fifth column indicates whether the application integrates a map in its user interface. Finally, the sixth column indicates the price, in euros, of the single-flight query, while the seventh and last column indicates the price of the round-trip query.

Given the single-flight and round-trip queries performed on all applications, and the results presented in table 1.1, it is possible to conclude that:

• different applications provide significantly different results:

– Single-flight prices vary up to 44%;

– Round-trip prices vary up to 120%;

• only half of the applications provide date range support;

• only half of the applications provide multiple results for different objective functions;

• some applications present an overview of the different prices as a function of time;

• some applications present an interactive map.
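The price variations quoted above can be checked directly against the prices in table 1.1. The following sketch assumes that the variation is measured as the gap between the most and the least expensive offer, relative to the least expensive one; this definition is an assumption, chosen because it reproduces the reported figures. The function and variable names are hypothetical.

```python
def relative_spread(prices):
    """Percentage by which the highest price exceeds the lowest price."""
    lowest, highest = min(prices), max(prices)
    return 100.0 * (highest - lowest) / lowest


# Prices in euros, copied from table 1.1 in row order
# (Google Flights, Expedia, Booking.com, Momondo, TripAdvisor,
#  cheapflights.com, Adioso, Kiwi, SkyScanner, Edreams).
SINGLE_FLIGHT = [65, 81, 67, 62, 56, 75, 69, 76, 65, 56]
ROUND_TRIP = [127, 167, 126, 102, 124, 150, 137, 224, 127, 116]

if __name__ == "__main__":
    print(f"single-flight spread: {relative_spread(SINGLE_FLIGHT):.1f}%")
    print(f"round-trip spread: {relative_spread(ROUND_TRIP):.1f}%")
```

Under this definition the single-flight spread is about 44.6% and the round-trip spread about 119.6%, in line with the rounded values given in the list.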

The results of the multi-city query are presented in table 1.2. Columns 3-6 present an overview of the search tools available in the respective applications, as previously described. The seventh and final column indicates the price, in euros, of the multi-city flight request. The second column indicates the application's ability to perform unconstrained multi-city flight requests. Note that when this set of tests was executed, no application provided this tool, but during the development of this work Kiwi launched a service (called Nomad) that addresses this problem. As far as we know, this is the only application that provides such a service at the present time.

In fact, while the majority of the applications present a multi-city flight search option, this option usu-

ally forces the user to constrain the query to a predefined route and schedule. However, if the requested

trip involves visiting n different cities, then there are n! different routes to visit all cities. Moreover, if the

schedule of the trip considers an extended start period of d days, then there are n! ∗ d combination of

different routes and schedules for the trip. Hence, by forcing the user to a predefined route and schedule,

the application is effectively searching for only 1 of the n! ∗ d different available solutions to the request.
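To give a sense of how quickly this search space grows, the following minimal Python sketch counts the n! ∗ d combinations of routes and start dates described above. The function name and the sample request sizes are illustrative assumptions, not values taken from the performed tests.

```python
from math import factorial


def count_route_schedules(n_cities: int, start_days: int) -> int:
    """Return n! * d: every ordering of the cities combined with every start day."""
    return factorial(n_cities) * start_days


# Hypothetical request sizes, chosen only to show the growth rate.
for n in (3, 5, 8, 10):
    print(f"{n} cities, 7 start days: {count_route_schedules(n, 7)} combinations")
```

Under these assumptions, a constrained multi-city query evaluates a single one of these combinations.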

Given the considered multi-city queries, which were performed on all applications, and whose results are

presented in table 1.2, it is possible to conclude that:

• different applications provide significantly different results, with prices varying up to 135%;

• most applications only provide constrained multi-city trip search utilities;

• only one application provides unconstrained multi-city trip search utilities;

• most applications do not present different results for different search criteria;

• no application offers the ability to perform multi-city trips on a range of start dates;

• no application displays an interactive map to help the user.

1.2.2 Developer search tools

From the perspective of the developer who intends to create a flight search engine, there are several

OTAs that provide access to their Application Programming Interface (API), allowing direct access to flight

data. These APIs usually offer the possibility of searching for cached and, sometimes, real-time flight

data. In some cases, these APIs also extend their range of services, and include endpoints for the

access of hotel information, car-rental and railroad services, and even cruise ship itineraries [6,7].

Table 1.2: Multi-city search results and tools.

Application, Flexible multi-city, Date-range, Price overview, Multiple results, Map, Price [€] multi-city

Google Flights 184

Expedia 420

Booking.com 191

Momondo X 178

TripAdvisor X 238

cheapflights.com -

Adioso -

Kiwi X X 226

SkyScanner X 238

Edreams 333

An API may be classified as public, limited or private according to the restrictions imposed on its

access. Private APIs are those which are only accessible to enterprises. An example of a company

whose API is private is Skyscanner. In its turn, Google offers a service which can be classified as

limited, since it only offers up to 50 free queries per day. Finally, there are other APIs whose access is

completely free and unlimited, and therefore, are classified as public. Kiwi is an example of a company

that provides a public API.

Given the academic purpose of this work, the selection of a flight API is restricted to those that

are public. Unfortunately, the number of public flight APIs is very small. Among the 10 companies

enumerated in the previous section, only three have public APIs: Expedia, Amadeus and Kiwi. The first

two companies operate mostly in North America and their flight data is mostly restricted to that continent.

In its turn, Kiwi is a European company, but does not limit its services to this continent. Instead, its API

operates as a meta-search engine, aggregating data from different content providers.

As a consequence, during the development of this work, the necessary flight data will be provided by

the Kiwi public API. The major search utilities that this particular API offers are [8]:

• access to relevant information about cities, airports and airline companies;

• access to single-flight and round-trip information, with flexible queries that include:

– flexible start dates;

– flexible durations of stay;

– queries with undefined destination;

– queries with multiple origins and destinations;

• the possibility to aggregate up to 9 single-flight and round-trip queries in a single request.

It is worth noting that, although Kiwi (and other companies) enable the query of multiple flights at

once, each flight must specify a particular pair of cities and date. Thus, this corresponds to a constrained

multi-city search, as discussed in the previous section, and does not actually provide information regard-

ing the best route, schedule or set of flights for an unconstrained multi-city trip.

1.3 Objectives

This thesis has two main goals:

• The study of the unconstrained multi-city flight routing problem, and the development of an effective

optimization strategy to address this problem;

• The development of a web application prototype, that integrates a working solution to the afore-

mentioned problem.

The first goal of this work can be accomplished by studying similar routing problems, such as the Traveling Salesman Problem. This should provide the theoretical background that is necessary to better understand the problem, and to develop and implement the optimization algorithms required to solve it. Finally, to address the second goal, it is necessary to develop the tools needed to create a web application, and integrate it with the optimization solution devised for the resolution of the problem.

1.4 Main contributions

The main contributions of this work are included in the following article, which will be submitted for publication in the journal Expert Systems with Applications, from Elsevier:

Rafael Marques, Luís Russo, and Nuno Roma, “Flying Tourist Problem: flight time and cost minimization in complex routes”, submitted to Expert Systems with Applications, Elsevier, October 2018.

1.5 Document structure

The presented work can be divided into two main parts: the study of the unconstrained multi-city routing

problem, and the implementation of a web application that integrates a solution to this problem. Thus,

this document is structured as follows.

Chapter 2 presents a literature review on problems that are closely related to the one under study, in

particular, the Traveling Salesman Problem. Chapter 3 presents a formal definition of the problem and

defines the optimization strategies that will be implemented to solve it. Chapter 4 presents the develop-

ment and implementation details of a web application that allows the resolution of unconstrained multi-

city flight requests. This is achieved by implementing the previously mentioned optimization strategies

to solve the user requests. Chapter 5 evaluates the implemented system, by analyzing the performance

of its different parts, and the utility of the developed solution as a whole. Finally, Chapter 6 presents the

main conclusions of this work, and addresses the future work that may improve the developed solution

with further utilities.

2 Literature review on the Traveling Salesman Problem

Contents

2.1 Common Combinatorial Optimization Problems

2.2 Common Optimization Methods overview

The problem that this work aims to address, and which will be formally introduced as the Flying

Tourist Problem (FTP) in chapter 3, is closely related to the Traveling Salesman Problem (TSP). Both

of these problems belong to broader classes of optimization problems, particularly, Combinatorial Opti-

mization Problems (COPs) and Routing problems. This chapter will start by introducing the concept of

Combinatorial Optimization Problems in section 2.1 and, in particular, the Traveling Salesman Problem

and the Vehicle Routing Problem in subsections 2.1.1 and 2.1.2, respectively. This is followed by an

overview of the most common methods to solve the Traveling Salesman Problem, presented in section

2.2.

2.1 Common Combinatorial Optimization Problems

Lawler (1976) [9] defined combinatorial analysis as “the mathematical study of the arrangement, grouping, ordering, or selection of discrete objects, usually finite in number”. Schrijver (2002) [10], following Lawler's definition, extends it with the important concept of an optimal solution: “Combinatorial optimization searches for an optimum object in a finite collection of objects.” This definition is followed by a remark, stating that “typically, the collection (...) grows exponentially in the size of the representation”, and concluding that “scanning all objects one by one and selecting the best is not an option”.

Following a more concise definition [11], a combinatorial optimization problem (COP) may be defined

as follows:

Definition 1) A combinatorial optimization model P = (S,Ω, f) consists of:

1. a search space S, defined by a finite set of decision variables, each with a domain;

2. a set Ω of constraints amongst the decision variables;

3. an objective function $f : S \to \mathbb{R}_0^+$, to be minimized.

The search space is defined by a set of decision variables $X_i$, $i = 1, \dots, n$, each associated with a domain $D_i$, which specifies the possible values of each decision variable. An instantiation of a variable is an assignment of a value $v_i^j \in D_i$ to a variable $X_i$. This leads to the definition of a feasible solution $s \in S$, which corresponds to the assignment of a value to each decision variable, according to its domain, in such a way that all constraints in $\Omega$ are satisfied. Finally, the objective of the problem is to find a global minimum of $P$, that is, a solution $s^* \in S$ such that $f(s^*) \le f(s)\ \forall s \in S$. The set of all global minima is denoted by $S^* \subseteq S$.
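The definition above can be illustrated with a minimal Python sketch that enumerates the whole search space, discards the assignments that violate a constraint, and collects the set of global minima. The function name `solve_cop` and the toy instance are assumptions introduced only for illustration; exhaustive enumeration is only viable for very small instances.

```python
from itertools import product


def solve_cop(domains, constraints, objective):
    """Enumerate every assignment, keep the feasible ones, return all global minima."""
    best_value, minima = None, []
    for assignment in product(*domains):
        if not all(check(assignment) for check in constraints):
            continue  # the assignment violates some constraint in Omega
        value = objective(assignment)
        if best_value is None or value < best_value:
            best_value, minima = value, [assignment]
        elif value == best_value:
            minima.append(assignment)
    return best_value, minima


def all_different(s):
    """Constraint: no two decision variables share the same value."""
    return len(set(s)) == len(s)


def weighted_sum(s):
    """Objective f: a non-negative weighted sum of the three variables."""
    return 3 * s[0] + 2 * s[1] + s[2]


# Toy instance (hypothetical): three variables, each with domain {0, 1, 2}.
domains = [range(3)] * 3
best_value, minima = solve_cop(domains, [all_different], weighted_sum)
print(best_value, minima)
```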

When working on combinatorial optimization problems, it is useful to have an idea of how difficult the

problem is. This characterization is provided by a field called computational complexity. A COP, Π, is said

to have worst-case time complexity O(g(n)), if the best algorithm for solving Π finds an optimal solution

to any instance of size n of Π, in a computation time upper bounded by g(n).

A problem Π is said to be solvable in polynomial time if the maximum amount of computing time that

is necessary to solve any instance of size n is bounded by a polynomial in n. If k is the largest exponent

of such a polynomial, then the combinatorial optimization problem is said to be solvable in $O(n^k)$ time.

Hence, a polynomial time algorithm is characterized by a computation time bounded by O(p(n)),

for some polynomial function p, where n is the size of the problem instance. An algorithm whose

computation time cannot be bounded by a polynomial function is called an exponential time

algorithm. Any problem that can be solved in polynomial time is said to be tractable, while problems that

are not solvable in polynomial time are called intractable.

In the field of computational complexity, there is also another important concept called polynomial time reduction, which transforms a problem into another problem, in polynomial time. If the latter is solvable in polynomial time, then so is the first one. The class of problems which are solvable in polynomial time is called P. On the other hand, there is a class of problems called NP, which stands for non-deterministic polynomial time, comprising the problems for which a given solution can be verified in polynomial time. The class NP-complete refers to the most difficult problems in NP.

It is worth mentioning that there exists another class, called NP-hard, in which each problem is at least as hard as the hardest problems in NP. More precisely, a problem H is NP-hard when every problem in NP can be reduced to H using a polynomial time transformation. This definition of the NP-hard class leads to the logical conclusion that finding a polynomial time algorithm for any NP-hard problem would imply the resolution of all NP problems in polynomial time. However, until now, no polynomial time algorithm has been found for any NP-hard problem. Note that the class NP-hard does not necessarily coincide with the class NP, since an NP-hard problem need not belong to NP.

In this context, there is an ongoing discussion among the scientific community about the question “P =

NP?”, which is one of the major unsolved problems in computer science. Figure 2.1 illustrates this

discussion by depicting the several classes, according to both possible solutions to the aforementioned

question.

2.1.1 Traveling Salesman Problem

Given a list of cities and the distances between them, the Traveling Salesman Problem (TSP) is the

combinatorial optimization problem of finding a minimum length route which connects every city. With

this original definition, proposed by A. Punnen [12], the focus of the TSP is to perform optimization on

routing problems, such as the school bus problem, studied by Merrill Flood in 1942 [13], minimizing the total

distance of a tour. Some variations of the original formulation allow the adaptation of the problem to suit

different optimization goals [14]. For example, instead of distance, the focus may be the minimization

of the total cost, travel time, or some other attribute associated to the problem under consideration. It

is also possible to search for a route which minimizes two, or more, objective functions at once [15].

Figure 2.1: Illustration of the classes P, NP, NP-complete and NP-hard

In some routing problems, the tour under consideration must satisfy some constraints [16]. Most often,

these constraints refer to scheduling conflicts which must be satisfied [17]. A practical application of this

is the resolution of routing problems with time windows [18].

When stated in the field of graph theory, the TSP is the problem of finding a minimum cost Hamiltonian cycle over a complete, undirected, weighted graph [19]. The problem of deciding whether a graph contains a Hamiltonian cycle was shown to be NP-complete [20]. This implies the NP-hardness of the TSP [21]. Furthermore, in several graph problems, considering a symmetric cost between two points is not suitable. The corresponding variation is known as the asymmetric TSP, and it considers a directed graph instead [22].

In the flight industry, the Traveling Salesman Problem has vast applications. As an example, it was applied to

aircraft scheduling in the terminal area, enabling an increase in the airport's capacity [23]. More recently,

the TSP, and its time dependent variation, have been the focus of attention in fields related to Unmanned

Aerial Vehicle routing [24, 25]. There is also an online website which introduces the Air Traveling

Salesman, whose goal is to find the layover airports when no direct route is available [26].

In some cases, the classic formulation of the Traveling Salesman does not adequately describe the

characteristics of the problem under consideration. To overcome this, different problem formulations are

considered. An example of this is the time dependent TSP [27]. In this formulation, the cost of each

arc is not constant, but varies as a function of time. In general, this problem is harder to solve than

the classic TSP [28]. There are several other combinatorial optimization problems which benefit from

considering a time dependent approach [29]. Vehicle routing is a field which particularly focuses on this

problem, due to the characteristics of street traffic [30]. In fact, the Traveling Salesman Problem can be

regarded as a special case of the Vehicle Routing Problem, in which the fleet is composed of only one

vehicle, the salesman [30]. Because of this, works related to the Vehicle Routing Problem may also be

relevant for the resolution of the Traveling Salesman Problem.

In their day-to-day activities, many people are faced with similar routing problems.

Consider the problem of walking or driving from point A to point B. This is a graph problem, in which

the arcs are the streets and the nodes are the street intersections [31]. In turn, the weights refer to the

distance or travel time, which may be affected by other parameters, such as traffic [27]. If the person

is familiar with the graph, they are capable of finding a good route mentally, in a very fast manner [32]. If

the undertaken route is to visit a set of points exactly once, before returning to the original starting point,

this is known as the Traveling Salesman Problem.

This section is structured as follows. First, we formally define the classic Traveling Salesman Prob-

lem, in section 2.1.1.A, and present some of its most common variations. This is followed by a more

in-depth overview of the time-dependent TSP, in section 2.1.1.B, as this problem is particularly relevant

for the work under development. For the same reason, the following subsections 2.1.1.C and 2.1.1.D

present the TSP with time-windows and with multiple objectives, respectively.

2.1.1.A Problem definition

As previously mentioned, the Traveling Salesman Problem may be defined both as a combinatorial optimization problem and as a graph problem. In either case, the TSP is defined by a graph $G = (N, A)$, where $N$ is the set of nodes, and $A$ is the set of arcs connecting those nodes. The set of nodes is of size $n = |N|$, while the size of the set of arcs is $n^2$. Each arc $a_{ij} \in A$ has an associated weight, $c_{ij}$, which represents, for example, the distance between cities. The graph is fully connected, that is, each node is capable of directly reaching any other node, without visiting a third node. When two nodes cannot be connected by an arc, the cost of that arc is considered to be very high. In the classical TSP formulation, the graph is undirected, which implies symmetry in the costs of the arcs, that is: $c_{ij} = c_{ji}\ \forall i, j \in N$. Because of the characteristics of this TSP formulation, the graph is said to be fully connected, weighted and undirected.

The objective of the TSP is to find the minimum cost Hamiltonian cycle, that is, a path which visits each node exactly once, and returns to the initial node, closing the path. A generic solution to the TSP is any permutation $\sigma = (\sigma_1, \dots, \sigma_n)$ over the set of nodes $N$, where $\sigma_i$, $i \in \{1, \dots, n\}$, represents the node in the $i$-th position of the cycle. The cost of a cycle is given by the sum of the weights of each arc by which it is composed, that is,

$C(\sigma) = \sum_{i=1}^{n} c_{\sigma_i \sigma_{i+1}}$, where $\sigma_{n+1} = \sigma_1$.
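As a minimal sketch of this cost function, the Python code below evaluates the cost of a cycle on a small weight matrix and finds the cheapest cycle by exhaustive enumeration. The function names and the four-city matrix are hypothetical, and the exhaustive search is shown only to make the definition concrete, since it is not practical beyond a handful of nodes.

```python
from itertools import permutations


def tour_cost(cost, tour):
    """C(sigma): sum of c[sigma_i][sigma_{i+1}], with the last node linked back to the first."""
    n = len(tour)
    return sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))


def brute_force_tsp(cost):
    """Try every permutation that starts at node 0 and keep the cheapest cycle."""
    tours = ((0,) + p for p in permutations(range(1, len(cost))))
    best = min(tours, key=lambda t: tour_cost(cost, t))
    return list(best), tour_cost(cost, best)


# Hypothetical symmetric instance with four cities.
COST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
best_tour, best_cost = brute_force_tsp(COST)
print(best_tour, best_cost)
```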

The TSP may solve different types of problems by optimizing different parameters. In the classic

formulation, the weight of an arc connecting two nodes represents the distance between two cities.

However, the weight of an arc can represent different things, particularly, travel time or travel cost.

Although changing the parameter may leave the TSP formulation intact, in some cases, it changes the

problem. This occurs, for example, when the costs are considered to be time-dependent, a case which is

addressed in section 2.1.1.B.

Up until now, only the symmetric Traveling Salesman Problem was addressed. However, the Travel-

ing Salesman Problem usually refers to a broader class of problems, which include, but are not limited,

to the symmetric case. These problems are often called variations of the symmetric TSP. Below, the

most common variations will be presented briefly, by providing a description for the asymmetric, metric,

Euclidean and the bottleneck TSP, as well as the messenger problem. Other TSP variations, such as the

time-dependent one, will be discussed in their own subsections, as they are particularly relevant for the

work under development. The definitions provided here follow those formalized in [33].

Asymmetric TSP

In the Asymmetric Traveling Salesman Problem (ATSP), the weight matrix associated to the problem

is not symmetric. That is, there is no constraint imposing that $c_{ij} = c_{ji}$, $\forall i, j \in N$, $i \neq j$, as happens with

the classical TSP.

For some particular real world problems, the ATSP may describe the problem more adequately

than the TSP. For example, when considering a routing problem over a city, some roads may not be

connected in both ways. In this case, the weight of an arc connecting two points is different, depending

on the direction of traversal of the arc.

Metric TSP

The metric TSP is a special case of the TSP, in which the arc costs, in addition to being symmetric,

also respect the triangle inequality. That is, $c_{ij} \le c_{ik} + c_{kj}$, $\forall\, i, j, k \in N$.

Euclidean TSP

In the Euclidean TSP, the set of nodes is placed in a d-dimensional space, and the weight of each

arc is given by the Euclidean distance. This distance is calculated based on equation 2.1, for two points

$x = (x_1, x_2, \dots, x_d)$ and $y = (y_1, y_2, \dots, y_d)$ (the positions of nodes $i$ and $j$).

$d_{ij} = \left( \sum_{k=1}^{d} (x_k - y_k)^2 \right)^{1/2}$ (2.1)

The euclidean TSP is a variation which is both symmetric and metric.
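The relation between the Euclidean, symmetric and metric variations can be checked numerically. The following minimal Python sketch builds a weight matrix from point coordinates and tests both properties; the function names, the sample points and the numerical tolerance are assumptions made for illustration.

```python
from itertools import product
from math import dist


def euclidean_costs(points):
    """Weight matrix whose entry (i, j) is the Euclidean distance between points i and j."""
    return [[dist(p, q) for q in points] for p in points]


def is_symmetric(cost, tol=1e-9):
    """True when c_ij = c_ji for every pair of nodes."""
    n = len(cost)
    return all(abs(cost[i][j] - cost[j][i]) <= tol
               for i, j in product(range(n), repeat=2))


def is_metric(cost, tol=1e-9):
    """True when c_ij <= c_ik + c_kj for every triple of nodes."""
    n = len(cost)
    return all(cost[i][j] <= cost[i][k] + cost[k][j] + tol
               for i, j, k in product(range(n), repeat=3))


# Hypothetical instance: four points on the corners of a 3 x 4 rectangle.
points = [(0, 0), (3, 0), (3, 4), (0, 4)]
cost = euclidean_costs(points)
print(is_symmetric(cost), is_metric(cost))
```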

Bottleneck TSP

In the Bottleneck TSP, the objective is to find a valid route which minimizes the cost of the highest

cost arc of the tour. According to the characteristic of the graph, the Bottleneck TSP may either be

symmetric, asymmetric, metric or time-dependent.
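The difference with respect to the classic objective can be sketched in a few lines of Python: the sum over the arcs of the tour is replaced by a maximum. The function names and the four-city matrix are hypothetical, and the exhaustive search is used only because the instance is tiny.

```python
from itertools import permutations


def bottleneck_cost(cost, tour):
    """Cost of the most expensive arc of the closed tour."""
    n = len(tour)
    return max(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))


def brute_force_bottleneck_tsp(cost):
    """Enumerate the tours that start at node 0 and keep the one with the cheapest worst arc."""
    tours = ((0,) + p for p in permutations(range(1, len(cost))))
    best = min(tours, key=lambda t: bottleneck_cost(cost, t))
    return list(best), bottleneck_cost(cost, best)


# Hypothetical symmetric instance with four cities.
COST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
bottleneck_tour, bottleneck_value = brute_force_bottleneck_tsp(COST)
print(bottleneck_tour, bottleneck_value)
```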

The Messenger Problem

The Messenger problem, also known as the wandering salesman problem, is the problem of finding

a minimum cost Hamiltonian path connecting nodes u and v of the graph G. It can be seen as a Traveling

Salesman Problem in which the tour is not closed, but ends on a specific node, different from the initial

one. The Messenger problem can be transformed into the TSP, by considering a cost of −M for the

arc (v, u), where M is a large number. If the nodes u and v are not specified, and one wishes to find a

minimum cost Hamiltonian path in G, this can be achieved by a graph transformation, adding one node

and connecting it to all other nodes by arcs of cost −M . The optimal solution to the TSP on this modified

graph can be used to produce the optimal solution to the original problem.
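The first transformation described above (a cost of −M on the arc (v, u)) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: arc costs are non-negative, M is chosen as one more than the sum of all costs, and the TSP on the modified matrix is solved by exhaustive enumeration. All names and the sample matrix are hypothetical.

```python
from itertools import permutations


def tour_cost(cost, tour):
    """Cost of the closed tour, including the arc from the last node back to the first."""
    n = len(tour)
    return sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))


def messenger_path(cost, u, v):
    """Minimum cost Hamiltonian path from u to v, obtained by solving a TSP
    on a copy of the matrix in which the arc (v, u) costs -M."""
    n = len(cost)
    big_m = 1 + sum(sum(row) for row in cost)  # assumes non-negative costs
    modified = [row[:] for row in cost]
    modified[v][u] = -big_m
    others = [k for k in range(n) if k != u]
    tours = ((u,) + p for p in permutations(others))
    best = min(tours, key=lambda t: tour_cost(modified, t))
    # The optimal tour closes with the arc (v, u); removing it yields the path.
    return list(best), tour_cost(modified, best) + big_m


# Hypothetical symmetric instance with four cities.
COST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
path, path_cost = messenger_path(COST, u=0, v=2)
print(path, path_cost)
```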

2.1.1.B Time Dependent TSP

The Time-dependent Traveling Salesman Problem (TDTSP) is a generalization of the TSP, where arc

costs depend on their position in the tour [34, 35]. This section first introduces the TDTSP as a graph

problem, followed by its definition as a sequencing problem.

TDTSP as a graph problem

Let $N = \{1, 2, \dots, n\}$ and let $N_0 = N \cup \{0\}$. The TDTSP on a complete graph $K(N_0)$ can be modeled as an optimization problem over a layered graph $(V, A)$. $V$ is the set composed of the source node $0$, the termination node $T$, and intermediate nodes $n_{i,t}$ for $i, t \in N$. In this representation of the intermediate nodes, the first index of $n_{i,t}$ identifies the node of the graph $K(N)$, and the second index represents the position of the node $i$ in the path between nodes $0$ and $T$. In turn, $A$ is the set of arcs connecting the nodes. This set is composed of initiation, intermediate, and termination arcs. For $i \in N$, $a_{0,i,0}$ denotes an initiation arc from node $0$ to node $n_{i,1}$, and $a_{i,T,n}$ denotes a termination arc from node $n_{i,n}$ to node $T$. Given $i, j \in N$ such that $i \neq j$, and $1 \le t \le n - 1$, $a_{i,j,t}$ denotes an intermediate arc from node $n_{i,t}$ to node $n_{j,t+1}$. The third index of an arc represents its layer, that is, the position in which it occurs in the path, or in other words, the time of the arc traversal, if we consider 1 time unit for each of these traversals.

When working on the TDTSP, it is often convenient to define $G(n)$ as the subgraph of $(V, A)$ induced by $V \setminus \{0, T\}$. This way, $G(n)$ has the $n^2$ nodes $\{n_{i,t} : i, t \in N\}$ and all the $n(n-1)^2$ intermediate arcs of $A$. A path with $n$ nodes in $G(n)$ is of the form $\{(v_t, t) : v_t \in N, 1 \le t \le n\}$. Since consecutive nodes are in consecutive layers, the path can be described by an ordered array $(v_t : t \in N)$. This path can be extended to a $0 - T$ path of $(V, A)$ by appending nodes $0$ and $T$ to the beginning and end of the path, respectively.
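The size of the layered graph can be verified with a short Python sketch that builds the intermediate nodes and arcs explicitly. The representation of a node as a pair (i, t) and the function name are assumptions made for illustration.

```python
def build_layered_graph(n):
    """Return the intermediate nodes (i, t) and intermediate arcs of the layered graph G(n)."""
    labels = range(1, n + 1)
    nodes = [(i, t) for i in labels for t in labels]
    # An intermediate arc in layer t links node i in position t to node j != i in position t + 1.
    arcs = [((i, t), (j, t + 1))
            for t in range(1, n)
            for i in labels
            for j in labels
            if i != j]
    return nodes, arcs


nodes, arcs = build_layered_graph(4)
print(len(nodes), len(arcs))
```

For n = 4, the counts match the expressions given in the text, n^2 nodes and n(n − 1)^2 intermediate arcs.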

The classical TSP and its time-dependent variation share the same objective, which is to find the

minimum cost Hamiltonian cycle over graph (V,A). Another property they share is the possibility of

working over a symmetric or asymmetric problem.

TDTSP as one-machine sequencing problem

In operation scheduling, the time dependent TSP can also be stated as a one-machine sequencing

problem [35]. Consider a set of $n$ jobs, $J_1, \dots, J_n$, to be executed on a single machine. Each job has a setup cost $C^t_{i,j}$, occurring when job $J_i$, processed in the $t$-th time unit, is followed by job $J_j$, processed in the $(t+1)$-th time unit. Consider that each job completion takes exactly one time unit. The machine is in some given initial state, denoted by 0, before the job processing begins. As happens with the classical TSP, the machine has to be returned to its original state, after the job processing ends. The problem is to find a sequence, $J_{w(1)}, \dots, J_{w(n)}$, that minimizes the total setup cost $C(w)$, defined by:

$C(w) = C^0_{0,w(1)} + \sum_{i=1}^{n-1} C^i_{w(i),w(i+1)} + C^n_{w(n),0}$ (2.2)

It is important to note some characteristics of this formulation. First, problems with an unspecified initial/final state can be formulated in the same way, using 0 as the initiation/termination cost, that is, $C^0_{0,w(1)} = 0$ and $C^n_{w(n),0} = 0$. Secondly, particular choices of the setup costs reduce the above formulation to the classical TSP or to the classical Assignment Problem [36]. The first case is achieved by considering that setup costs are not time-dependent, that is, $C^t_{i,j} = C_{i,j}$. The latter is accomplished by considering that setup costs depend on only one of the two consecutive jobs, that is, $C^t_{i,j} = C^t_i$.
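A minimal Python sketch of the total setup cost of equation 2.2 is given below. The setup cost is passed as a function of the time unit and of the two jobs involved, which is an assumption about the data representation; the sample matrix and the time-dependent cost rule are hypothetical.

```python
def sequencing_cost(setup, order):
    """Total setup cost C(w) of processing the jobs in `order` (job labels 1..n).

    `setup(t, i, j)` returns the cost of switching from job i, processed in time
    unit t, to job j; label 0 stands for the initial (and final) machine state.
    """
    n = len(order)
    total = setup(0, 0, order[0])                 # initiation term
    for t in range(1, n):
        total += setup(t, order[t - 1], order[t])  # intermediate terms
    total += setup(n, order[-1], 0)               # termination term
    return total


# Hypothetical base costs between the machine state 0 and three jobs.
BASE = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]


def static_setup(t, i, j):
    """Time-independent costs: the formulation reduces to the classical TSP."""
    return BASE[i][j]


def growing_setup(t, i, j):
    """Hypothetical time-dependent costs that grow with the time unit."""
    return BASE[i][j] * (1 + t)


print(sequencing_cost(static_setup, [1, 3, 2]))
print(sequencing_cost(growing_setup, [1, 3, 2]))
```

With the time-independent costs, the value coincides with the cost of the closed tour 0, 1, 3, 2 over the same matrix.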

2.1.1.C TSP with time-windows

The Traveling Salesman Problem with Time Windows (TSPTW) is a generalization of the TSP, in which

the objective is to find a minimum cost Hamiltonian cycle which visits every city in its requested time

window. This problem has important applications in the field of routing and scheduling. It is also particularly relevant for the Vehicle Routing Problem, as these constraints may be imposed by customers, whose operation hours are limited to a time window [37]. Being a generalization of the TSP, the TSPTW is also NP-complete [38]. This section introduces two definitions, one for the asymmetric TSP with

time-windows [39], and a second one for the time-dependent variation [38] of this problem.

Consider a complete digraph $G = (N, A)$, where $N$ is the set of $n = |N|$ nodes, and $A$ the set of arcs, each arc $a_{ij} \in A$ being associated with a non-negative cost, $c_{ij}$, and a non-negative setup time, $t_{ij}$. The nodes correspond to jobs to be processed (as described in the single-machine sequencing problem), and arcs correspond to job transitions, where the setup times, $t_{ij}$, define the changeover time needed to process node $j$ immediately after node $i$. Each node $i \in N$ has an associated processing time $p_i \ge 0$, a release date $r_i \ge 0$ and a deadline $d_i \ge 0$, where the release date and deadline denote, respectively, the earliest and latest possible starting time for the processing of node $i$. The minimal time delay for processing node $j$ immediately after node $i$ is given by $v_{ij} = p_i + t_{ij}$. The interval $[r_i, d_i]$ is called the time window for node $i$. The time window is said to be relaxed if $r_i = 0$ and $d_i \to +\infty$. On the contrary, a time window is called active if $r_i > 0$ and $d_i < +\infty$. It is possible to reach a node $i \in N$ at a time $t \in \mathbb{Z}^+ \cup \{0\}$, sooner than its release date $r_i$. In this case, it will undergo a waiting time $r_i - t$, before leaving node $i$ at time $r_i$.

When dealing with routing problems with time-windows, it is often necessary to define whether the time-windows are hard or soft constraints. Hard time windows consider that a node $i \in N$ cannot be visited after its deadline $d_i$. On the contrary, when considering soft constraints, the node $i$ might be visited after the deadline $d_i$, but, in this case, a penalty is incurred.

The objective function of the problem under consideration depends on the specific definition of the

problem, according to its time-window constraints. When dealing with hard constraints, the objective

function is defined by the sum of the costs of each arc belonging to that tour, while when using soft

constraints, the objective function depends on the specific problem definition and the values associated

to the aforementioned penalties.
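For the hard time window case, the evaluation of a candidate tour can be sketched in Python as follows. This is a minimal illustration, assuming that the tour starts at time 0 at its first node, that an early arrival waits until the release date, and that the return arc to the first node is charged but not checked against a deadline. The function name and the sample data are hypothetical.

```python
def evaluate_tour(tour, cost, setup_time, processing, release, deadline):
    """Return (feasible, total_cost) for a tour under hard time windows."""
    time, total, prev = 0, 0, None
    for node in tour:
        if prev is not None:
            # Minimal delay between consecutive nodes: processing time plus setup time.
            time += processing[prev] + setup_time[prev][node]
            total += cost[prev][node]
        time = max(time, release[node])  # wait until the release date if early
        if time > deadline[node]:
            return False, None           # hard constraint violated
        prev = node
    total += cost[tour[-1]][tour[0]]     # close the cycle
    return True, total


# Hypothetical instance with three nodes.
cost = [[0, 4, 6], [4, 0, 3], [6, 3, 0]]
setup_time = [[0, 2, 5], [2, 0, 1], [5, 1, 0]]
processing = [1, 1, 1]
release = [0, 5, 0]
deadline = [0, 10, 6]

print(evaluate_tour([0, 1, 2], cost, setup_time, processing, release, deadline))
print(evaluate_tour([0, 2, 1], cost, setup_time, processing, release, deadline))
```

With soft constraints, the early return in the deadline check would be replaced by adding a problem-specific penalty to the total cost.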

There are several versions of the TSPTW which introduce time-dependent variations. These varia-

tions usually focus on time-dependent arc costs, setup times, or processing time. This generalization

may occur, for example, as a result of considering the traffic effects associated to real world routing

problems. Below is the formal definition of the ATSPTW with time-dependent travel times and costs

(ATSPTW-TDC), as defined in [38].

Let $G = (V, A)$ be a simple directed graph, $V = \{v_i\}_{i=0}^{n}$ its set of vertices, where $v_0$ is the depot vertex. Each vertex $v_i$ has an associated time window (or service time) $[a_i, b_i]$, verifying that $a_i, b_i \in \mathbb{Z}^+ \cup \{0\}$ and $[a_i, b_i] \subseteq [a_0, b_0]\ \forall i \in \{1, \dots, n\}$. Every time window $[a_i, b_i]$ has associated $p_i = b_i - a_i$ instants of time $\{a_i + k - 1\}_{k=1}^{p_i}$. For simplicity, we will denote $t_i^k = a_i + k - 1$, and therefore $t_i^k \in \mathbb{Z}^+ \cup \{0\}$. The time and the cost of traversing an arc $(v_i, v_j) \in A$ depend on the instant of time $t_i^k$ at which the traversal is started. Consider $c_{ij}^t \ge 0$ and $t_{ij}^t \in \mathbb{Z}^+ \cup \{0\}$, respectively, the cost and the time of traversing an arc $(v_i, v_j)$ starting at instant $t_i^k$. Furthermore, the waiting times, $t \in \mathbb{Z}^+$, at every vertex $v_i$ have an associated waiting time cost $cwt_i(t) \ge 0$.

The proposed goal in this formulation of the ATSPTW-TDC is to find a Hamiltonian cycle in $G$, starting and ending at $v_0$, and respecting the time-window $[a_0, b_0]$, such that:

• Starting the circuit at time $t_0^k \ge a_0$ involves a waiting time cost $cwt_0(t_0^k - a_0) \ge 0$, with $cwt_0(0) = 0$;

• The circuit leaves each vertex $v_i \in V$ during its associated time window;

• If the circuit arrives at vertex $v_i \in V$ at time $t \in \mathbb{Z}^+$, such that $t \le a_i$ (before the beginning of the service time), a waiting time $a_i - t$ is allowed, with cost $cwt_i(a_i - t) \ge 0$, with $cwt_i(0) = 0$. In this case the circuit leaves vertex $v_i$ at time $a_i$;

• The sum of the costs of traversing arcs and of the waiting time costs is to be minimized.

The authors of the work introduced in [38] propose an exact algorithm for the previously defined

ATSPTW-TDC using several graph transformations, which successively reduce the problem into an

asymmetric TSP, for which several efficient exact algorithms already exist.

2.1.1.D Multi-objective TSP

The multi-objective Traveling Salesman Problem is a generalization of the classic TSP, and is part of a much broader class of problems, comprising the multi-objective combinatorial optimization problems [40] and, in particular, the multi-objective vehicle routing problems [41].

The multi objective TSP is defined as follows [42]:

Given a list of $n$ cities and a set $D = (D_1, D_2, \dots, D_k)$ of $n \times n$ weight matrices, the objective is to minimize $f(\pi) = (f_1(\pi), f_2(\pi), \dots, f_k(\pi))$, with

$f_i(\pi) = \left( \sum_{j=1}^{n-1} d^i_{\pi(j),\pi(j+1)} \right) + d^i_{\pi(n),\pi(1)}$,

where $\pi$ is a permutation over the set $(1, 2, \dots, n)$.

Note that when $D = (D_1)$, this corresponds to the single-objective TSP. Note also that the above

formulation considers that all objective functions calculate the weight of the Hamiltonian cycle, according

to the respective weight matrix.

The quality of the results of the multi-objective TSP is usually measured according to the Pareto criteria, defined as follows [43]:

Pareto dominance

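A vector $\vec{u} = (u_1, \dots, u_k)$ is said to dominate $\vec{v} = (v_1, \dots, v_k)$, denoted by $\vec{u} \preceq \vec{v}$, if and only if $\vec{u}$ is partially less than $\vec{v}$, i.e. $\forall i \in \{1, \dots, k\}: u_i \le v_i \ \wedge\ \exists i \in \{1, \dots, k\} : u_i < v_i$.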

Pareto Optimality

Pareto optimality is defined as a concept of allocation optimality. An allocation is not Pareto optimal

if there is at least one alternative allocation which improves at least one objective without worsening any other.

A solution $x \in \Omega$ is said to be Pareto optimal with respect to the solution space $\Omega$ if, and only if, there is no $x' \in \Omega$ for which $\vec{v} = F(x') = (f_1(x'), \dots, f_k(x'))$ dominates $\vec{u} = F(x) = (f_1(x), \dots, f_k(x))$.

Pareto Optimal Set

For a given multi-objective problem $F(x)$, the Pareto optimal set ($P^*$) is defined as:

$P^* = \{ x \in \Omega \mid \neg \exists\, x' \in \Omega : F(x') \preceq F(x) \}$
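The Pareto definitions above translate directly into code. The following minimal Python sketch implements the dominance test for minimization and filters a set of candidate solutions down to its Pareto optimal set. The function names and the candidate objective vectors (for instance, a travel time and a price) are hypothetical.

```python
def dominates(u, v):
    """True when u is no worse than v in every objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def pareto_set(candidates):
    """Keep the solutions whose objective vector is not dominated by any other candidate."""
    return [name for name, fx in candidates.items()
            if not any(dominates(fy, fx) for fy in candidates.values())]


# Hypothetical objective vectors (travel time in hours, price in euros).
candidates = {
    "A": (10, 200),
    "B": (12, 150),
    "C": (11, 210),  # dominated by A
    "D": (15, 150),  # dominated by B
}
print(pareto_set(candidates))
```

In this example, neither A nor B dominates the other, so both belong to the Pareto optimal set.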

Although the above mentioned problem refers to the multi-objective TSP, without loss of generality,

the multi-objective optimization can be performed on a time-dependent TSP. There is very little direct research on the multi-objective time-dependent TSP, but one can cite [44], which proposes a multi-objective tabu search for single machine scheduling problems with sequence-dependent setup times.

2.1.2 Vehicle Routing Problem

The Vehicle Routing Problem (VRP) is the problem of finding the optimal set of routes for a fleet of

vehicles, to serve a given set of customers. The VRP is believed to have been introduced by Dantzig, in

1959, in a work entitled The Truck Dispatching Problem [45]. Later, it was shown that the VRP, being a

generalization of the TSP, also belongs to the NP-hard class [46].

Being an NP-hard problem, the focus of the research usually revolves around heuristic algorithms,

although there are some procedures which are known to produce optimal solutions ([47], [48]). As

noted by Donati in [49], citing the work of Blum [50], even when an exact procedure is available, it

usually requires large computational time, which is not viable in the time-scale of hours, as is required

by the industry.

In 1992, Malandraki [51] stated that the assumption of constant and deterministic costs is an ap-

proximation of the actual conditions of routing problems, and thus, a time-dependent formulation of

the problem should be considered. In 1999, Gambardella and colleagues proposed a multiple ant colony

system for solving vehicle routing problems with a meta-heuristic approach [49]. Years later, Gambardella

expanded this research to include time-dependent variations [52], as proposed by Malandraki

and Daskin [51]. Several other works propose meta-heuristic solutions to the

time-dependent VRP [53], including the use of simulated annealing [54] and genetic algorithms [55].

The rest of this section is structured as follows. Subsection 2.1.2.A presents a formal definition of the

Vehicle Routing Problem. Its time-dependent and multi-objective variations are covered, respectively, in

subsections 2.1.2.B and 2.1.2.C. Since the TSP arises only as a special case of the non-capacitated

vehicle routing problem, the capacitated vehicle routing problem is outside the scope of this work.

2.1.2.A Problem definition

Following the definition proposed by Laporte [56], let $G = (V, A)$ be a graph, where $V = \{1, \ldots, n\}$ is a set of vertices, representing nodes/customers/cities, with the depot located at vertex 1, and A is the set of arcs fully connecting the nodes. Each arc $(i, j)$, $i \neq j$, is associated with a non-negative weight $c_{ij}$. Depending on the context of the work, this weight might represent the distance between nodes, the travel time, or even the travel cost. It is assumed that a fleet of m vehicles is available. The Vehicle Routing Problem consists in finding the set of optimal routes such that:

1. each city in $V \setminus \{1\}$ is visited exactly once, by exactly one vehicle;

2. all routes start and finish at the depot;

3. some constraints must be satisfied;

The most common constraints associated with condition 3 include: capacity restrictions associated with each

vehicle; limit on the number of nodes that each route might visit; total time restrictions; time-windows in

which each node must be visited; precedence relations between nodes.

The goal of the Vehicle Routing Problem usually consists in finding a set of routes that minimizes the total cost, where the cost depends on the total distance covered and on the fixed costs associated with each vehicle. However, depending on the problem under study, the goal may be different, such as minimizing the total travel time, minimizing the total number of vehicles, or even both at the same time [52].
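As an illustration of conditions 1 and 2 and of the usual cost-based goal, the following Python sketch checks whether a set of routes is feasible and evaluates its total cost. It is only a sketch under stated assumptions: routes are lists of vertex numbers, the side constraints of condition 3 are ignored, and the function names, the fixed cost per vehicle and the toy weights are hypothetical.

```python
from typing import Dict, List, Tuple


def is_feasible(routes: List[List[int]], n: int, depot: int = 1) -> bool:
    """Check conditions 1 and 2 for vertices numbered 1..n."""
    visited: List[int] = []
    for route in routes:
        # Condition 2: every route starts and finishes at the depot.
        if len(route) < 3 or route[0] != depot or route[-1] != depot:
            return False
        visited.extend(route[1:-1])
    # Condition 1: every non-depot vertex appears exactly once overall.
    customers = [v for v in range(1, n + 1) if v != depot]
    return sorted(visited) == customers


def total_cost(routes: List[List[int]], c: Dict[Tuple[int, int], float],
               fixed_cost: float = 0.0) -> float:
    """Sum of the arc weights of all routes plus a fixed cost per vehicle used."""
    return sum(
        fixed_cost + sum(c[(i, j)] for i, j in zip(route, route[1:]))
        for route in routes
    )


# Hypothetical instance: 5 vertices on a line, with weight c_ij = |i - j|.
n = 5
c = {(i, j): abs(i - j) for i in range(1, n + 1) for j in range(1, n + 1) if i != j}
routes = [[1, 2, 3, 1], [1, 4, 5, 1]]
```

With these toy weights, the two routes cost 4 and 8, respectively, so `total_cost(routes, c)` evaluates to 12, and adding a fixed cost of 10 per vehicle raises it to 32.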

2.1.2.B Time-dependent VRP

The Vehicle Routing Problem is a very wide class of optimization problems, whose precise problem

definition usually depends on the characteristics of the problem under consideration. Thus, introducing

time-dependencies on the problem also depends on the specificities of the situation. There are several

authors who consider time-dependent travel costs [54], where the objective is to minimize the total cost,

while others introduce time-dependent travel times [57], where the objective is to minimize the total travel

time. There are also those who consider that the objective function is a function of both travel time and

travel costs, and at least one of these (travel time, travel cost) is time-dependent [58]. The definition

herein proposed follows this last time-dependent variation.

Following the work of Figliozzi [58], the time-dependent VRP is defined as follows. Let $G = (V, A)$ be a graph where $A = \{(v_i, v_j) : i \neq j \wedge i, j \in V\}$ is the set of arcs, and $V = \{v_0, \ldots, v_{n+1}\}$ is the set of vertices. Vertices $v_0$ and $v_{n+1}$ denote the depot at which the vehicles are based. It is considered that each vehicle has a uniform capacity of $q_{max}$. Each vertex $i \in V$ has an associated demand $q_i \geq 0$ and a service time $g_i \geq 0$, with the depot having $q_0 = 0$ and $g_0 = 0$. The set of vertices $C = \{v_1, \ldots, v_n\}$ specifies the set of n customers. The arrival time of a vehicle at customer i, $i \in C$, is denoted by $a_i$, and its departure time by $b_i$. Each arc $(v_i, v_j)$ has an associated distance $d_{ij} \geq 0$ and a travel time $t_{ij}(b_i) \geq 0$. Note that the travel time is a function of the departure time from customer i. The set of available vehicles is denoted by K. The cost per unit of route duration is denoted by $c_t$, and the cost per unit of route distance is denoted by $c_d$.

In this formulation, there are two goals for the time-dependent VRP. The first corresponds to the

minimization of the total number of vehicles used. The second corresponds to the minimization of the

total cost, which is a function of both distance and travel time.
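To make the role of the time-dependent travel time $t_{ij}(b_i)$ more concrete, the sketch below evaluates the cost of a single route as $c_t$ times the route duration plus $c_d$ times the route distance. It is not the formulation of [58]: it assumes that a vehicle leaves each customer as soon as the service ends ($b_j = a_j + g_j$), it ignores capacities and time windows, and the function names, the rush-hour rule and all numeric values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def route_cost(route: List[int],
               dist: Dict[Tuple[int, int], float],
               travel_time: Callable[[int, int, float], float],
               service: Dict[int, float],
               c_t: float, c_d: float, start: float = 0.0) -> float:
    """Cost of one route: c_t * duration + c_d * distance."""
    depart = start  # b_0, the departure time from the depot
    distance = 0.0
    for i, j in zip(route, route[1:]):
        arrive = depart + travel_time(i, j, depart)  # a_j = b_i + t_ij(b_i)
        distance += dist[(i, j)]
        depart = arrive + service.get(j, 0.0)        # b_j = a_j + g_j (assumed)
    duration = depart - start
    return c_t * duration + c_d * distance


# Hypothetical data: vertices 0 and 3 are the two copies of the depot.
dist = {(0, 1): 10.0, (1, 2): 5.0, (2, 3): 8.0}
service = {1: 0.25, 2: 0.25}


def travel_time(i: int, j: int, b: float) -> float:
    """Travel time at 10 distance units per hour, doubled for departures in [8, 9)."""
    base = dist[(i, j)] / 10.0
    return 2.0 * base if 8.0 <= b < 9.0 else base


cost_early = route_cost([0, 1, 2, 3], dist, travel_time, service, c_t=20.0, c_d=1.0, start=7.0)
cost_late = route_cost([0, 1, 2, 3], dist, travel_time, service, c_t=20.0, c_d=1.0, start=10.0)
```

The same route is cheaper when it starts at 10.0 than when it starts at 7.0, because in the latter case the arc from customer 1 to customer 2 is traversed during the congested period. This dependence on the departure time is precisely what distinguishes the time-dependent variant.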

The complete definition of the problem follows a mixed integer programming approach, with a total

of 11 constraints. These will not be covered in detail here, as the VRP is not the primary object of study

of this work. It is nevertheless important to state under which circumstances the TDVRP can be transformed into

the TDTSP. This is possible by considering only one vehicle, with infinite capacity, and by adapting the

objective function according to the problem under consideration.

We conclude this section with the final remark that the above presented definition of the time-

dependent VRP corresponds to a static version of the time-dependent case. There is a lot of research

around the dynamic case, in which the problem is updated during the execution of the program. This

has major applications in the routing industry, and it is often referred to as real-time Vehicle Routing. For

more information regarding this problem, we refer to [59], [60].

2.1.2.C Multi-objective VRP

Multi-objective optimization corresponds to the resolution of a combinatorial optimization problem in

which more than one goal is defined. In the case of the Vehicle Routing Problem [41], the most

common objectives include minimizing the fleet size, the total traveled distance, the total required time,

the total tour cost, and/or maximizing the quality of the service or the collected profit. Note that in most

problems, when multiple objectives are identified, the different objectives often conflict with each other.

Multi-objective optimization usually relies on the use of meta-heuristics [61]. There are several works

focusing on this problem, and the most promising meta-heuristics for multi-objective optimization include

Evolutionary Optimization [62] and Simulated Annealing [63]. There is also some work considering the

Ant Colony Optimization. In particular, a modified ant colony was designed to solve a bi-objective time-dependent

vehicle routing problem, in which the main goal was the minimization of the fleet size, followed

by the minimization of the total cost [52].

2.2 Common Optimization Methods overview

The algorithms that address the Traveling Salesman and other Combinatorial Optimization problems can

be classified as exact, heuristic or meta-heuristic. Some of the algorithms belonging to each of these

three classes will be discussed in subsections 2.2.1, 2.2.2 and 2.2.3, respectively. Exact algorithms

are those that always provide an optimal solution to the problem. Although these might seem the first

choice, exact algorithms are usually too slow for solving large problems. In turn, heuristic algorithms

aim to be efficient, giving up the guarantee of finding the best solution and focusing on finding

near-optimal solutions in a short time. Heuristic algorithms are usually problem-specific, while meta-heuristic

algorithms are designed in such a way that they can be used for a variety of Combinatorial

Optimization problems, in a fast and efficient way.

2.2.1 Exact algorithms

There are multiple exact algorithms available for the Traveling Salesman Problem [64], including its time-

dependent [65] and time windows [66] variations. These algorithms usually require the problem to be

formulated as an Integer Linear Programming (ILP) instance. In this section, we present ILP definitions

for both the classic and the time-dependent TSP. We also present a brief introduction regarding the

Branch and Bound algorithm, often used to determine optimal or near optimal solutions.

2.2.1.A Integer Linear Programming

The Traveling Salesman Problem, defined over the graph G = (V,A), may be formulated as an integer

linear programming problem [64], by associating one binary decision variable $x_{ij}$ to every arc $a_{ij}$. Let $c_{ij}$

represent the weight of the arc $a_{ij}$. When a decision variable has a value of 1, the corresponding arc

belongs to the solution. The ILP formulation for the TSP is as follows:

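$$
\begin{align}
\min \quad & \sum_{i} \sum_{j,\, j \neq i} c_{ij} x_{ij} \tag{2.3} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = 1, \quad i = 1, \ldots, n, \tag{2.4} \\
& \sum_{i} x_{ij} = 1, \quad j = 1, \ldots, n, \tag{2.5} \\
& x_{ij} \in \{0, 1\}, \quad i, j = 1, \ldots, n, \; i \neq j. \tag{2.6}
\end{align}
$$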

The objective function of the TSP is described by equation 2.3. Equations 2.4 and 2.5 represent the constraints imposed over the variables. In particular, they state that a tour must leave and enter, respectively, each node exactly once. The final constraint, defined in equation 2.6, forces the decision variables to binary values. This set of constraints is called the assignment constraints.

It is worth noting that although the assignment constraints force each node to be entered and left

exactly once, the formation of subtours is still possible. For example, if we consider two disjoint subtours,

the assignment constraints all hold, but this does not form a single closed cycle, and thus does not constitute a

valid solution to the Traveling Salesman Problem. Because of this, it is necessary to introduce subtour

elimination constraints. There are several possible formulations for this purpose, in particular the Dantzig-Fulkerson-Johnson

(DFJ) and the Miller-Tucker-Zemlin (MTZ) formulations.

To exclude subtours, Dantzig and colleagues propose a strategy that introduces an exponential number

of subtour elimination constraints [64] (approximately $2^n$), presented in equation 2.7.

The exponential number of constraints makes this strategy undesirable, even for medium and small size

instances.

$$\sum_{i \in S} \sum_{j \in S} x_{ij} \leq |S| - 1, \quad S \subset V, \; |S| \geq 2 \tag{2.7}$$

In turn, the MTZ formulation includes fewer subtour elimination constraints, at the expense of the introduction of a new set of variables $u_i$, $i = 1, \ldots, n$, where $u_i$ denotes the position of node $n_i$ in the tour. There are approximately $n^2/2$ subtour elimination constraints [67], presented in equation 2.8. Although the MTZ formulation introduces a new set of variables, compared to the DFJ strategy, the number of subtour elimination constraints is significantly reduced.

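$$
\begin{aligned}
u_1 &= 1, \\
2 \leq u_i &\leq n \quad \forall\, i \in \{2, \ldots, n\}, \\
u_i - u_j + n x_{ij} &\leq n - 1 \quad \forall\, i, j \in \{2, \ldots, n\}, \; i \neq j
\end{aligned}
\tag{2.8}
$$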
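The need for subtour elimination constraints can be illustrated with a short Python sketch, which does not rely on any ILP solver. It represents a candidate solution by the set of arcs whose decision variable equals 1, checks the assignment constraints, extracts the cycles, and evaluates the DFJ inequality for a given subset S. The nodes are numbered from 0 for convenience, and all function names are hypothetical.

```python
from typing import List, Set, Tuple

Arc = Tuple[int, int]


def satisfies_assignment(arcs: Set[Arc], n: int) -> bool:
    """Every node is left exactly once and entered exactly once."""
    out_degree = [0] * n
    in_degree = [0] * n
    for i, j in arcs:
        out_degree[i] += 1
        in_degree[j] += 1
    return all(d == 1 for d in out_degree) and all(d == 1 for d in in_degree)


def find_subtours(arcs: Set[Arc], n: int) -> List[List[int]]:
    """Split a solution that satisfies the assignment constraints into its cycles."""
    successor = dict(arcs)
    seen: Set[int] = set()
    cycles: List[List[int]] = []
    for start in range(n):
        if start in seen:
            continue
        cycle, node = [], start
        while node not in seen:
            seen.add(node)
            cycle.append(node)
            node = successor[node]
        cycles.append(cycle)
    return cycles


def violates_dfj(arcs: Set[Arc], subset: Set[int]) -> bool:
    """True if the arcs inside `subset` exceed |S| - 1."""
    inside = sum(1 for i, j in arcs if i in subset and j in subset)
    return inside > len(subset) - 1


two_subtours = {(0, 1), (1, 0), (2, 3), (3, 2)}
single_tour = {(0, 1), (1, 2), (2, 3), (3, 0)}
```

Both arc sets satisfy the assignment constraints, but only `single_tour` forms one Hamiltonian cycle. For `two_subtours`, the subset S = {0, 1} contains two selected arcs, which violates the DFJ inequality for that subset.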

In turn, the time-dependent Traveling Salesman Problem variant may be formulated as an integer linear programming problem by associating one binary decision variable $x^t_{ij}$ to every arc $a^t_{ij}$. Let $c^t_{ij}$ represent the weight of the arc $a^t_{ij}$. A decision variable $x^t_{ij}$ takes a value of 1 when the arc $a^t_{ij}$, which represents the transition from node $n_i$ to node $n_j$ at time t, belongs to the solution. Hence, the ILP formulation for the TDTSP is as follows:

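$$
\begin{align}
\min \quad & \sum_{i} \sum_{j} \sum_{t} c^{t}_{ij} x^{t}_{ij} \tag{2.9} \\
\text{s.t.} \quad & \sum_{j} \sum_{t} x^{t}_{ij} = 1, \quad i = 1, \ldots, n, \tag{2.10} \\
& \sum_{i} \sum_{t} x^{t}_{ij} = 1, \quad j = 1, \ldots, n, \tag{2.11} \\
& \sum_{i} \sum_{j} x^{t}_{ij} = 1, \quad t = 1, \ldots, n, \tag{2.12} \\
& \sum_{j=1}^{n} \sum_{t=2}^{n} t\, x^{t}_{ij} - \sum_{j=1}^{n} \sum_{t=1}^{n} t\, x^{t}_{ij} = 1, \quad i = 1, \ldots, n, \tag{2.13} \\
& x^{t}_{ij} \in \{0, 1\}, \quad i, j, t = 1, \ldots, n, \; i \neq j. \tag{2.14}
\end{align}
$$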

The objective function is presented in equation 2.9. Equations 2.10, 2.11 and 2.12 represent the constraints imposed over the decision variables. In particular, they state that each city must be left exactly once, entered exactly once, and that exactly one arc is traversed in each time period, respectively. As occurs with the classical TSP, the ILP formulation needs a constraint to eliminate the formation of subtours. This is presented in equation 2.13. Finally, equation 2.14 guarantees that the decision variables take binary values.

2.2.1.B Branch and Bound

Branch and Bound (B&B) is one of the most used tools to solve large NP-hard combinatorial optimization problems. As an example, the software tool Concorde uses a Branch and Bound algorithm, and it was used to solve all 110 instances of the TSPLib, reporting exact solutions for every problem, including an instance with 85,900 nodes, although it required more than 110 CPU years. To be precise, B&B should be classified as an algorithm paradigm, constituted by 3 main parts, which have to be chosen according to the problem under consideration, and for which many options may exist [68].

The strength of B&B comes from it being a search algorithm which (indirectly) searches the complete search space of the problem. Since this is not directly feasible, due to the common exponential growth of the solution space, B&B takes advantage of bounds, combined with information regarding the current best solution, to safely discard certain solutions from the search space.

At any point of the algorithm, there is always a current solution and a pool of unexplored subsets

of the solution space. At the beginning of the algorithm, this pool consists of (only) the root node, and

at the end of the algorithm, it will be empty, meaning that the entire search space was

successfully explored.

The initialization of the B&B requires the incumbent, which denotes the objective function value of the current best solution, to be initialized as ∞. In many cases, it is possible to generate an initial feasible solution using some heuristic method (for example, the nearest neighbour heuristic, discussed in subsection 2.2.2.B). In that case, this solution is recorded and its objective value is set as the incumbent. Generating an initial solution usually has a positive impact on the B&B algorithm, since a tighter incumbent allows more subproblems to be discarded.

After the initialization, this algorithm enters an iterative process, until the pool of unexplored subsets

is empty. This process consists of three main steps: i) selection of a node to process; ii) the bound

calculation; and iii) branching.

Branch and Bound algorithms vary according to the established strategies for each of the three main

steps of the iterative process, as well as the initial heuristic. In any case, the selected bounding function

is the key to any good Branch and Bound algorithm, because a poor bounding function cannot

be compensated for by good choices of the selection and branching strategies.

As an example, consider the trivial case where the bounding function is the constant value of zero.

It is obvious that this will always be a lower bound to the problem, but it does not produce any quality

information of which solutions to discard. Ideally, the value of the bounding function for a given sub-

problem should be equal to the value of the best feasible solution to that problem. This is usually not

possible, since subproblems may also be NP-hard. Thus, bounding functions are chosen according to

the proximity to the best possible value, and to its time complexity - usually restricted to polynomial time.

To complete this overview, it is necessary to mention the most relevant search strategies for the Branch and Bound algorithm, which usually revolve around the Best First, Depth First and Breadth First Search [68]. For a more detailed overview of the Branch and Bound algorithm, we refer to J. Clausen's Branch and Bound algorithms - principles and examples [68].
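The three steps of the iterative process (node selection, bound calculation and branching) can be sketched for a small asymmetric TSP as follows. This is only an illustrative sketch and not the method used by Concorde: it assumes a depth-first selection strategy, an incumbent initialized as ∞, and a deliberately simple bounding function (the cost of the partial path plus the cheapest outgoing arc of every node that still has to be left). The weight matrix is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def branch_and_bound_tsp(c: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Depth-first Branch and Bound for a TSP with at least two nodes, starting at node 0."""
    n = len(c)
    # Cheapest outgoing arc of each node, used by the bounding function.
    min_out = [min(c[i][j] for j in range(n) if j != i) for i in range(n)]
    best_cost, best_tour = math.inf, []   # incumbent
    pool = [(0.0, [0])]                   # unexplored subproblems: (cost so far, partial path)
    while pool:
        cost, path = pool.pop()           # i) selection (depth-first)
        remaining = [v for v in range(n) if v not in path]
        if not remaining:                 # complete tour: close the cycle
            total = cost + c[path[-1]][0]
            if total < best_cost:
                best_cost, best_tour = total, path
            continue
        # ii) bound: every node still to be left uses at least its cheapest arc.
        bound = cost + min_out[path[-1]] + sum(min_out[v] for v in remaining)
        if bound >= best_cost:
            continue                      # discard: cannot improve the incumbent
        for v in remaining:               # iii) branching
            pool.append((cost + c[path[-1]][v], path + [v]))
    return best_cost, best_tour


# Hypothetical asymmetric instance with 4 nodes.
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
best_cost, best_tour = branch_and_bound_tsp(c)
```

For this instance the search returns the tour 0, 2, 3, 1 (closing back at node 0) with a cost of 21. Seeding the incumbent with a heuristic solution, or replacing the bounding function with a tighter one, would allow more subproblems to be discarded.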

2.2.2 Heuristic algorithms

In some cases, exact algorithms cannot be used to solve the TSP instance under consideration.

This usually occurs when dealing with very large instances, or when solutions must be obtained

very quickly. In these cases, the usage of approximation algorithms may

be a good choice. These algorithms are not guaranteed to produce an optimal solution; however, with a

good heuristic, they produce high quality solutions in a reasonable time. Generally,

heuristics may be classified into one of two classes: construction or improvement heuristics [69].

2.2.2.A Held-Karp Lower Bound

In some cases, the quality of a heuristic solution cannot be directly measured, as no exact solution for

the problem under consideration is known. In these cases, it is important to have a way of evaluating

its performance. The standard way of doing this is by comparing the heuristic solution with the solution

generated by the Held-Karp (HK) lower bound [70].

The HK lower bound is the solution of the linear programming relaxation of the ILP formulation of

the TSP. This solution can be found in polynomial time for moderate instance sizes. However, for a

very large problem, solving the relaxed problem directly is not feasible. In these cases, Held and Karp

propose an iterative algorithm to approximate the solution. This method involves computing

a large number of minimum spanning trees. This iterative version of the algorithm will often keep the

solution within 0.01% of the HK lower bound [69].
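A minimal sketch of such an iterative scheme is given below, assuming a symmetric instance with at least three nodes. Each iteration computes a minimum 1-tree (a minimum spanning tree over nodes 1, ..., n-1 plus the two cheapest edges incident to node 0) under weights modified by node penalties, and then increases the penalty of nodes with degree above 2 and decreases it for nodes with degree below 2. The step size rule and the number of iterations are illustrative assumptions, not the values used in [70].

```python
import math
from typing import List, Sequence, Tuple


def min_one_tree(w: Sequence[Sequence[float]], pi: Sequence[float]) -> Tuple[float, List[int]]:
    """Cost and node degrees of a minimum 1-tree under weights w_ij + pi_i + pi_j."""
    n = len(w)

    def mod(i: int, j: int) -> float:
        return w[i][j] + pi[i] + pi[j]

    degree = [0] * n
    in_tree = [False] * n
    key = [math.inf] * n
    parent = [-1] * n
    key[1] = 0.0
    total = 0.0
    for _ in range(n - 1):  # Prim's algorithm over nodes 1..n-1
        u = min((v for v in range(1, n) if not in_tree[v]), key=lambda v: key[v])
        in_tree[u] = True
        total += key[u]
        if parent[u] != -1:
            degree[u] += 1
            degree[parent[u]] += 1
        for v in range(1, n):
            if not in_tree[v] and mod(u, v) < key[v]:
                key[v], parent[v] = mod(u, v), u
    # Attach node 0 through its two cheapest edges.
    first, second = sorted(range(1, n), key=lambda v: mod(0, v))[:2]
    total += mod(0, first) + mod(0, second)
    degree[0] = 2
    degree[first] += 1
    degree[second] += 1
    return total, degree


def held_karp_bound(w: Sequence[Sequence[float]], iterations: int = 50, step: float = 1.0) -> float:
    """Best lower bound found by a simple penalty update scheme (assumed step rule)."""
    pi = [0.0] * len(w)
    best = -math.inf
    for k in range(iterations):
        cost, degree = min_one_tree(w, pi)
        best = max(best, cost - 2.0 * sum(pi))
        if all(d == 2 for d in degree):
            break  # the 1-tree is a tour, so the bound is tight
        t = step / (k + 1)
        pi = [p + t * (d - 2) for p, d in zip(pi, degree)]
    return best


# Hypothetical instance: the four corners of a unit square.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
w = [[math.dist(p, q) for q in square] for p in square]
bound = held_karp_bound(w)
```

For the unit square, the first 1-tree is already a tour, so the bound equals the optimal tour length of 4. In general, every value `cost - 2 * sum(pi)` is a valid lower bound, because any tour is itself a 1-tree whose modified cost exceeds its original cost by exactly twice the sum of the penalties.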

2.2.2.B Tour construction

A construction algorithm is based on the generation of a valid solution, by using some heuristic function

to guide the construction process. These algorithms stop when a valid solution is found, and they do not

attempt to improve the solution any further. Two different tour construction heuristics for the TSP will be

presented: the nearest neighbour and the greedy heuristics.

The nearest neighbour [71] is a simple and intuitive heuristic for the TSP. It starts with the selection

of a random node. This is followed by the selection of the closest node, belonging to the set of nodes

not yet visited. This step is repeated until all nodes have been visited. Finally, the solution construction

process is completed by returning to the initial node. The computational complexity of the nearest

neighbour is $O(n^2)$, and the solutions generated with this heuristic are often within 25% of the

optimal solution [69]. To construct a solution using the nearest neighbour heuristic, the following steps

can be taken:

1. Select a random city

2. Select the nearest unvisited node

3. If there are unvisited nodes, repeat step (2)

4. Return to first node
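The four steps above can be sketched in Python as follows. The function names and the weight matrix are hypothetical; the optional `start` argument is an addition that makes the example reproducible, since the heuristic itself selects the first node at random.

```python
import random
from typing import List, Optional, Sequence


def nearest_neighbour_tour(c: Sequence[Sequence[float]],
                           start: Optional[int] = None,
                           rng: Optional[random.Random] = None) -> List[int]:
    """Build a tour by always moving to the closest unvisited node."""
    n = len(c)
    rng = rng or random.Random()
    current = rng.randrange(n) if start is None else start  # step 1
    tour = [current]
    unvisited = set(range(n)) - {current}
    while unvisited:                                         # steps 2 and 3
        current = min(sorted(unvisited), key=lambda v: c[tour[-1]][v])
        unvisited.remove(current)
        tour.append(current)
    return tour  # step 4: the arc back to tour[0] is implied


def tour_cost(c: Sequence[Sequence[float]], tour: List[int]) -> float:
    """Cost of the closed tour, including the return to the first node."""
    return sum(c[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


# Hypothetical asymmetric instance with 4 nodes.
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
tour = nearest_neighbour_tour(c, start=0)
cost = tour_cost(c, tour)
```

Starting at node 0, the heuristic builds the tour 0, 1, 3, 2 with a cost of 33. The early greedy choices force expensive arcs at the end of the construction, which is why the result is usually improved afterwards by a local search.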

In its turn, the greedy heuristic [71] is a construction algorithm which creates a valid solution by

repeatedly selecting the arc with the lowest weight, always taking into account the problem constraints.

Although this heuristic is similar to the nearest neighbour, there are some differences in the initialization

step. The nearest neighbour randomly selects the initial node, while the greedy heuristic is greedy

at every step of the algorithm, including the initialization. The computational complexity of the greedy

heuristic is $O(n^2 \log_2 n)$, and the solutions generated by this heuristic are often within 20% of the

optimal solution [69]. The pseudocode for the greedy heuristic is presented below.

1. Sort all arcs according to their weight

2. Select the lowest weight arc, if it does not violate any constraint

3. If the constructed solution is not complete, repeat (2)
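A possible Python rendering of these steps for a symmetric instance with at least three nodes is sketched below. It assumes that the problem constraints amount to two checks: no node may have more than two selected edges, and no cycle may be closed before all nodes are connected. The cycle check uses a union-find structure; the function name and the weight matrix are hypothetical.

```python
from typing import List, Sequence, Tuple


def greedy_tour_edges(w: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Select edges in order of increasing weight until a Hamiltonian cycle is formed."""
    n = len(w)
    edges = sorted((w[i][j], i, j) for i in range(n) for j in range(i + 1, n))  # step 1
    degree = [0] * n
    parent = list(range(n))

    def find(x: int) -> int:
        """Representative of the fragment that contains x (with path halving)."""
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen: List[Tuple[int, int]] = []
    for _, i, j in edges:                                    # steps 2 and 3
        if degree[i] >= 2 or degree[j] >= 2:
            continue                                         # a node would get a third edge
        root_i, root_j = find(i), find(j)
        if root_i == root_j and len(chosen) < n - 1:
            continue                                         # would close a subtour too early
        parent[root_i] = root_j
        degree[i] += 1
        degree[j] += 1
        chosen.append((i, j))
        if len(chosen) == n:
            break
    return chosen


# Hypothetical symmetric instance with 4 nodes.
w = [[0, 1, 4, 3],
     [1, 0, 2, 5],
     [4, 2, 0, 6],
     [3, 5, 6, 0]]
edges = greedy_tour_edges(w)
cost = sum(w[i][j] for i, j in edges)
```

In this example the three cheapest edges are accepted, the next two are rejected because they would give a node a third edge, and the most expensive edge (2, 3) is finally needed to close the tour, for a total cost of 12. The sorting step dominates the running time of the heuristic.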

2.2.2.C Tour improvement

An improvement heuristic is an algorithm that works on a valid and complete solution, in order to

improve it. The most common improvement heuristics are the 2-opt and 3-opt local search procedures

[72], and the Lin-Kernighan algorithm (LK). The latter generalizes the former two

local search methods: a k-opt local search is employed, but the value of k varies during the

algorithm execution. The LK algorithm is very efficient and capable of presenting high quality solutions

for the symmetric TSP.

The 2-opt is a simple iterative local search procedure, in which two arcs of the solution are removed,

and a new solution is constructed by reconnecting the nodes in a different way. Given that only 2 arcs

are exchanged, there is only one way of creating a different cycle. Note that by performing a 2-opt ex-

change, the path between the two exchanged arcs gets reversed. If the graph is symmetric, the objective

function difference can be calculated by analyzing two arcs only. However, if the graph is asymmetric, it

is necessary to calculate the cost of the new path, as to evaluate the objective function difference. This

makes the 2-opt procedure particularly efficient for symmetric problems, but less adequate for asymmet-

ric problems and, as a consequence, the asymmetric and time-dependent variations of the TSP.

Figure 2.2: The 2-opt local search reconnects two edges, hoping to fold possible crossovers, decreasing the overall tour cost. In the left image, a crossover is identified. In the middle image, the edges belonging to this crossover are removed, and in the figure to the right, they are reconnected, forming a new valid tour.
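A minimal Python sketch of the 2-opt procedure for the symmetric case is shown below. It evaluates each exchange by comparing only the two removed and the two added edges, and applies the exchange by reversing the path between them. The function names and the example coordinates are hypothetical.

```python
import math
from typing import List, Sequence


def tour_length(w: Sequence[Sequence[float]], tour: List[int]) -> float:
    """Length of the closed tour."""
    return sum(w[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def two_opt(tour: List[int], w: Sequence[Sequence[float]]) -> List[int]:
    """Apply improving 2-opt exchanges until none is left (symmetric weights assumed)."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for k in range(i + 2, n):
                if i == 0 and k == n - 1:
                    continue  # the two edges share a node
                a, b = tour[i], tour[i + 1]
                c, d = tour[k], tour[(k + 1) % n]
                # Replace edges (a, b) and (c, d) by (a, c) and (b, d).
                delta = w[a][c] + w[b][d] - w[a][b] - w[c][d]
                if delta < -1e-12:
                    tour[i + 1:k + 1] = reversed(tour[i + 1:k + 1])
                    improved = True
    return tour


# Hypothetical instance: unit square visited in an order that creates a crossover.
points = [(0, 0), (1, 1), (1, 0), (0, 1)]
w = [[math.dist(p, q) for q in points] for p in points]
initial = [0, 1, 2, 3]
improved_tour = two_opt(initial, w)
```

The initial tour uses both diagonals of the square and has a length of about 4.83. A single exchange removes the crossover and yields the tour 0, 2, 1, 3, which follows the perimeter and has a length of 4.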

The 3-opt search is very similar to the 2-opt, but instead of selecting two edges and reconnecting

the path, the 3-opt selects 3 edges. In this case, there are multiple ways of forming a new valid tour. A

3-opt move can also be seen as two or three 2-opt moves combined in the formation of a new tour. The

iterative cycle of the 3-opt search works in the same way as the 2-opt.

More generally, the k-opt local search is a method for rearranging a tour, by taking k edges and

reconnecting the paths in order to form a new valid tour. Any tour that is known to be k-opt is also

(k − 1)-opt. Some particular problems, such as the crossing bridges, illustrated in figure 2.3, can only be

solved with a 4-opt or higher method.

Figure 2.3: The crossing bridges can only be solved by reordering 4 edges. The resolution of this problem with local search is only possible with 4-opt or higher.

In its turn, the Lin-Kernighan heuristic [72] is an algorithm for the resolution of the symmetric TSP,

and it was the state of the art for over 15 years. LK is known for producing high quality solutions, some

of them optimal, and for having a time complexity of approximately $O(n^{2.2})$ [69]. This heuristic is restricted

to the symmetric TSP, and using it for asymmetric instances requires a graph transformation, which

transforms the asymmetric instance with n nodes into an equivalent symmetric one with 2n − 1 nodes

[73]. Thus, for the same number of nodes, solving an asymmetric TSP with the LK heuristic is usually

4 times harder than solving the symmetric case.

To understand the Lin-Kernighan heuristic, it is necessary to think about the TSP in a slightly different

manner. Consider the following way of defining a combinatorial optimization problem: "find, from a set

S, a subset T that satisfies some criterion C and minimizes an objective function f." In the TSP, the

objective is to find, from the set of all edges (S) of a complete graph, the subset (T ) which forms a valid

tour (C) and minimizes the objective function (f ).

Any non-optimal but feasible solution T is non-optimal because k elements x1, ..., xk in T are out of

place. To improve this solution, and make it optimal, one would have to substitute the set of k elements

x1, ..., xk with the elements y1, ..., yk of S\T . Because there is no knowledge about how many elements

are misplaced, Lin and Kernighan consider that setting the value of k a priori would seem artificial. Thus,

they propose an iterative procedure in which the algorithm dynamically estimates the best value for k. In

order to do this, the LK first estimates the most out of place elements, x1 and y1. With these values set

aside, it tries to repeat this process for x2 and y2, and so on. This inner loop stops when no improvement

seems plausible and, at this point, it replaces the current solution T with the new solution generated from

replacing the now selected elements, and restarts the whole process.

2.2.3 Meta-Heuristic algorithms

Unlike classic heuristics, meta-heuristic algorithms are designed to be applied to any combinatorial optimization

problem, and not to a specific problem of this class. Meta-heuristics gained importance during the

1990s, and have become one of the most important classes of algorithms in computer science.

More formally, a meta-heuristic is an iterative generation process which guides an underlying heuristic

by intelligently combining different concepts for exploring and exploiting the search space, using

learning strategies to structure information in order to efficiently find optimal or near-optimal solutions [74].

This subsection will introduce a few of the most relevant meta-heuristics in the resolution of the

Traveling Salesman Problem. In particular, it will focus on the Ant Colony Optimization (ACO) and the

Simulated Annealing (SA), presented in subsections 2.2.3.A and 2.2.3.B, respectively. There is a vari-

ety of meta-heuristics which are not discussed here, but which have also been successfully applied to

the TSP. Examples of these meta-heuristics are the Tabu Search, Evolutionary Algorithms (in particular

the Genetic Algorithm), and many other Swarm Intelligence algorithms, of which the Ant Colony

Optimization is the oldest and most widely used.

2.2.3.A Ant Colony Optimization

The Ant Colony Optimization [11, 75, 76] is based on the behaviour of real ants, and was developed by

M. Dorigo et al., based on the generalization of the double bridge experiment [77], [78], illustrated in

figure 2.4. This led to an adaptation of this experiment, substituting the double bridge with a graph, the

pheromone trail with artificial pheromone, and the real ants with artificial ants, which presented some

extra capabilities intended to facilitate the resolution of more complex problems [11].

Figure 2.4: The double bridge experiment. On the left, two bridges with the same length. Experimental results show that ants distribute themselves evenly amongst both bridges. On the right, one of the bridges is longer than the other. Experimental results show that ants use the shorter bridge more often.

By using the model of a static combinatorial optimization problem, as defined in section 2.1, it is pos-

sible to derive a generic pheromone model that can be exploited by the Ant Colony Optimization. This

means that both the classical and the time-dependent TSP, which can be formulated by the aforementioned

model, may be solved by the ACO metaheuristic. The following steps represent the algorithmic

skeleton of the ACO model, and each of its parts will be explained in more detail below.

Algorithm 1 ACO metaheuristic

procedure ANTCOLONYOPTIMIZATION
    Initialization
    while (termination condition not met) do
        ConstructAntSolutions
        ApplyLocalSearch    ▷ optional
        GlobalPheromonesUpdate

The general process of the Ant Colony Optimization algorithms is as follows. The algorithm starts

with a parameter initialization step, which also sets the pheromone levels to some initial

value τ0. This value is usually chosen using a heuristic function. For the TSP case, the chosen heuristic

is often the nearest neighbour.

After initialization, and until some specific termination condition is met, the ACO algorithm runs

in a loop, which consists of 3 main steps: i) solution construction; ii) local search (optional); and iii)

pheromone update. Each of these steps will be detailed below.

The solution construction is a process that is carried out by each of a specified number of ants. Each ant starts with an initially empty solution, $s^p$, and at each iteration step expands its solution with a valid solution component, $c_i^j$. This construction function is what differentiates the ACO algorithm for every combinatorial optimization problem. By restricting the construction method to the agents (the ants), the rest of the algorithm does not have to be heavily adapted to the specific model. However, the function that is responsible for selecting the feasible solution components has to be aware of the decision variables and the set of constraints. It has to determine those components whose addition to the partial solution does not violate the set of constraints of the model. This set is represented by $N(s^p)$.

Having the set of all feasible solution components, $N(s^p)$, it is necessary to choose a single component, $c_i^j$. This selection is done probabilistically, and the choice takes into account both pheromone (exploitation) and heuristic information (exploration). The algorithmic parameter $q_0$ is responsible for defining the relative importance of both methods. The selection function is outlined in equation 2.15, and works as follows. A random value, q, is drawn. If this value is lower than or equal to the algorithm parameter $q_0$, the selection of the node is done with the heuristic rule (see equation 2.16). Otherwise, the Ant System rule (presented in equation 2.17) is used.

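$$
c_i^j =
\begin{cases}
\text{heuristic rule}, & \text{if } q \leq q_0 \\
\text{ant system rule}, & \text{otherwise}
\end{cases}
\tag{2.15}
$$

$$\arg\max_{l \in N_i^k} \; \tau_{il} \, [\eta_{il}]^{\beta} \tag{2.16}$$

$$p(c_i^j \mid s^p) = \frac{\tau_{ij}^{\alpha} \, [\eta(c_i^j)]^{\beta}}{\sum_{c_i^l \in N(s^p)} \tau_{il}^{\alpha} \, [\eta(c_i^l)]^{\beta}} \tag{2.17}$$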
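The selection rule of equations 2.15 to 2.17 can be sketched in Python as follows. The pheromone values and the heuristic information are stored in matrices `tau` and `eta`, and the set of feasible components is given as a list of candidate nodes. The choice of the inverse arc weight as heuristic information, the function names and the parameter values are assumptions made only for this illustration.

```python
import random
from typing import List, Sequence


def selection_probabilities(i: int, feasible: List[int],
                            tau: Sequence[Sequence[float]],
                            eta: Sequence[Sequence[float]],
                            alpha: float, beta: float) -> List[float]:
    """Ant System rule (equation 2.17): one probability per feasible node."""
    weights = [tau[i][l] ** alpha * eta[i][l] ** beta for l in feasible]
    total = sum(weights)
    return [weight / total for weight in weights]


def choose_next(i: int, feasible: List[int],
                tau: Sequence[Sequence[float]],
                eta: Sequence[Sequence[float]],
                alpha: float, beta: float, q0: float,
                rng: random.Random) -> int:
    """Selection of the next node from node i (equation 2.15)."""
    if rng.random() <= q0:
        # Equation 2.16: deterministic choice of the most attractive node.
        return max(feasible, key=lambda l: tau[i][l] * eta[i][l] ** beta)
    probabilities = selection_probabilities(i, feasible, tau, eta, alpha, beta)
    return rng.choices(feasible, weights=probabilities, k=1)[0]


# Hypothetical data: uniform pheromone and eta_ij = 1 / c_ij.
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
n = len(c)
tau = [[1.0] * n for _ in range(n)]
eta = [[0.0 if i == j else 1.0 / c[i][j] for j in range(n)] for i in range(n)]
rng = random.Random(0)
greedy_choice = choose_next(0, [1, 2, 3], tau, eta, alpha=1.0, beta=2.0, q0=1.0, rng=rng)
probabilities = selection_probabilities(0, [1, 2, 3], tau, eta, alpha=1.0, beta=2.0)
```

With $q_0 = 1$ the choice is always made by equation 2.16, so the ant at node 0 moves to node 1, its closest neighbour. With lower values of $q_0$, the remaining nodes are selected with the probabilities of equation 2.17, which here still favour node 1 but leave room for the other candidates.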

After finalizing the construction of valid solutions, the ACO algorithm may implement a local search.

Although this step is optional, it has been demonstrated that ACO algorithms reach their best per-

formance when local search is applied. The ant’s construction method is biased by the pheromone

information, while the pheromone values are biased by the quality of the solutions. Local search procedures, also

called Daemon Actions, are techniques that work on the existing solutions, exploring and

expanding the search space, and ultimately improving the quality of the solutions. The most widely used

local search methods are the 2-opt search, the 3-opt search, and the Lin-Kernighan heuristic. The final

step of each loop of the algorithm is the pheromone update, presented in equation 2.18.

$$\tau_{ij} = (1 - \rho)\,\tau_{ij} + \sum_{s \in S_{upd} \mid c_i^j \in s} g(s) \tag{2.18}$$

The pheromone update is responsible for making solution components that belong to good solutions

more desirable in the following iterations. To achieve this objective, two methods are implemented. The

first is pheromone deposition, which increases the pheromone intensity of the solution components belonging

to the most promising solutions. The number of solutions that are used to deposit pheromone is

a parameter of the algorithm. The second method to achieve this step's goal is pheromone evaporation.

While it may seem counter-intuitive to deposit pheromones and, at the same time, also evaporate them,

this step is crucial to avoid a rapid convergence to sub-optimal solutions. Pheromone deposition alone

is responsible for making good solutions more desirable, while pheromone evaporation reduces both

the desirability of bad solutions and the sub optimal convergence of good solutions, favoring a better

exploration of the search space.
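The update of equation 2.18 can be sketched in Python as shown below, with the evaporation applied to every arc and the deposition applied only to the arcs of the solutions in $S_{upd}$. The quality function g is passed as an argument; the constant value used in the example is purely illustrative (a common choice, assumed here and not taken from the text above, is the inverse of the tour cost).

```python
from typing import Callable, List


def update_pheromones(tau: List[List[float]],
                      update_set: List[List[int]],
                      rho: float,
                      g: Callable[[List[int]], float]) -> None:
    """In-place pheromone update: evaporation followed by deposition."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)          # evaporation on every arc
    for tour in update_set:
        deposit = g(tour)
        for i, j in zip(tour, tour[1:] + tour[:1]):
            tau[i][j] += deposit              # deposition on the arcs of the tour


# Hypothetical example: 3 nodes, uniform initial pheromone, one solution in S_upd.
tau = [[1.0] * 3 for _ in range(3)]
update_pheromones(tau, update_set=[[0, 1, 2]], rho=0.1, g=lambda tour: 0.5)
```

After the update, the arcs (0, 1), (1, 2) and (2, 0) hold a pheromone value of 1.4, while all the other arcs drop to 0.9. Repeating the update without further deposits makes the unused arcs decay geometrically, which is the mechanism that limits premature convergence.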

2.2.3.B Simulated Annealing

The Simulated Annealing metaheuristic was developed using an analogy between the physical anneal-

ing in solids, and finding the minimum cost configuration in combinatorial optimization problems. In the

physical world, annealing is the process of heating a metal until the melting point, and reducing the

temperature slowly and in a controlled way. The decrease of the temperature results in a particle rearrangement, in which lower energy states are reached. When the heating temperature is very high and the temperature is decreased very slowly, the solid reaches its ground state, that is, its minimum energy state. The analogy between the physical world and combinatorial optimization problems is achieved by considering that the energy of the metal corresponds to the cost of the solution, and that the particle rearrangement corresponds to the selection of a neighbouring solution [79].

Most Simulated Annealing algorithms consist of an iterative improvement algorithm that stochastically accepts up-hill moves. More precisely, the procedure starts with the selection of a feasible solution,

together with the initialization of some algorithmic specific parameters, such as the temperature, which

is used as a control variable. After this, the SA enters an iterative process (Markov chain). At the core of

this iterative process is a local search heuristic, which is executed a fixed number of times per iteration.

After this local search procedure is complete, the temperature is decreased according to the predefined

cooling schedule. After this, the local search restarts and the cycle continues, until either the execution time limit is reached, or the temperature reaches the value of zero.

Simulated Annealing differs from other iterative improvement algorithms due to its ability to escape

local minima, by accepting up-hill moves (as also happens, for example, with Tabu Search). More precisely, at each stage of the local search procedure, the difference in the energy level (∆f = f(y) − f(x)) between the newly generated state (f(y)) and the current state (f(x)) is calculated. If ∆f is negative, the new state is better than the current one, and it is always accepted. On the contrary, if ∆f is positive, the new state is accepted with the probability given by the Metropolis criterion, presented in equation 2.19. Otherwise, the new solution is rejected. Note that by using this criterion, as the temperature approaches zero, the metaheuristic becomes increasingly greedy.

\[ p_y = \begin{cases} 1, & \text{if } f(y) \le f(x), \\ e^{-\Delta f / t}, & \text{otherwise} \end{cases} \tag{2.19} \]

Time-dependent scheduling problems can also be solved using this meta-heuristic [80], as the algo-

rithm solely relies on the search of a neighbourhood set. The following algorithmic skeleton defines the

Simulated Annealing procedure.

Algorithm 2 SA metaheuristic

procedure SIMULATED ANNEALING
    Initialization
    while Termination condition not met do
        while Length of Markov chain not reached do
            Generate candidate solution
            Apply Metropolis acceptance criteria
        Update temperature
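The skeleton of Algorithm 2 can be turned into a short runnable sketch. The Python code below is illustrative only: the geometric cooling factor `alpha`, the stopping temperature `t_min`, the default parameter values and the toy cost function are assumptions made for this example, and `simulated_annealing` is a hypothetical name.

```python
import math
import random


def simulated_annealing(x0, cost, neighbour, t0=10.0, alpha=0.9,
                        chain_len=50, t_min=1e-3, rng=None):
    """Generic SA loop following the structure of Algorithm 2."""
    rng = rng or random.Random(0)
    x, fx = x0, cost(x0)                  # Initialization
    best, f_best = x, fx
    t = t0
    while t > t_min:                      # Termination condition (assumed)
        for _ in range(chain_len):        # Markov chain of fixed length
            y = neighbour(x, rng)         # Generate candidate solution
            fy = cost(y)
            delta = fy - fx
            # Metropolis acceptance criterion (equation 2.19)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                x, fx = y, fy
                if fx < f_best:
                    best, f_best = x, fx
        t *= alpha                        # Update temperature (assumed geometric)
    return best, f_best


# Toy problem: minimize (x - 3)^2 over the integers, moving one unit at a time.
cost = lambda x: (x - 3) ** 2
step = lambda x, rng: x + rng.choice((-1, 1))
best, f_best = simulated_annealing(20, cost, step)
```

The best solution is tracked separately from the current one, since the current solution may temporarily get worse when an up-hill move is accepted.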

In the first published works about the Simulated Annealing [81], it was proven that if the temperature

is cooled very slowly, the process will converge to the optimal solution. More precisely, convergence is guaranteed if the temperature drops no more quickly than C/log(n), where C is a constant and n is the number of steps taken so far [71]. However, this result is not as relevant as it first seems, because this cooling schedule is very slow. Some authors note that it is faster to do an exhaustive search than to follow this cooling schedule [71].
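To give a sense of how slow this schedule is, the following Python sketch computes the number of steps needed before a temperature of the form C/log(n) falls to a given target. It assumes a natural logarithm and C = 1, both of which are choices made only for this illustration; `steps_to_reach` is a hypothetical helper.

```python
import math


def steps_to_reach(temperature, c=1.0):
    """Smallest integer n such that c / ln(n) <= temperature."""
    # c / ln(n) <= T  is equivalent to  n >= exp(c / T)
    return math.ceil(math.exp(c / temperature))


for target in (1.0, 0.5, 0.1, 0.05):
    print(target, steps_to_reach(target))
```

Under these assumptions the required number of steps grows exponentially as the target temperature decreases, reaching the order of 10^8 steps for a target of 0.05.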

Hence, the Simulated Annealing metaheuristic requires the definition of a cooling schedule, a neigh-

bourhood function, and the Markov chain length. There are several reports that describe the influence

of these modules in the overall performance of the SA procedure [82].


3 Flying Tourist Problem

Contents

3.1 Problem formulation
3.2 Relation to the Traveling Salesman Problem
3.3 Graph representation
3.4 Dimensional overview
3.5 Optimization methodology
3.6 Summary

As referred to in chapter 1, the main goal of this work is to develop an application to solve uncon-

strained multi-city flight requests, as well as simpler flight requests, such as the one-way and round-trip

flights. In order to develop a system capable of addressing both simple and complex requests, we will

formally introduce, in section 3.1, a problem definition called the Flying Tourist Problem (FTP). In section

3.2 it will be shown that the FTP occurs as a generalization of the Traveling Salesman Problem, and it is

thus NP-hard. In turn, section 3.3 shows how to construct a graph to represent this problem. Finally, section 3.5 presents the strategies to solve the problem, which include both heuristic and metaheuristic optimization algorithms (subsections 3.5.1 and 3.5.2, respectively).

3.1 Problem formulation

Consider a tourist who wishes to take a trip that visits every node (city) $v_i$ in the set of nodes $V$, $|V| = N$, in no particular order. The start node will be denoted as $v_0$ and the return node as $v_{n+1}$, so the complete set of nodes is given by $V_c = V \cup \{v_0\} \cup \{v_{n+1}\}$. The trip must start at a time $t \in T_0 = [T_{0m}, T_{0M}]$. Upon visiting a node, the tourist will stay there for a duration of $d$ time-units (days). For each node to be visited, there is a range of values that $d$ might take, that is, $d_i \in [d_{im}, d_{iM}]$ with $d_{iM} \ge d_{im} \ge 1$ (nights). The complete set of durations associated to each city is given by $D$, with $|D| = N$. Furthermore, to each city $v_i \in V$ there is an associated time-window $TW(v_i)$, which defines the set of dates on which the city $v_i$ may be visited. The set of time-windows is given by $TW$, with $|TW| = |V| = N$.

By following this definition, the FTP is completely defined by a structure $G = (V_c, A, T_0, D, TW)$, used to create a multipartite graph describing the request. This multipartite graph is divided into $k$ layers, where each layer corresponds to a particular moment in time. Besides this, every node in a layer is connected to all nodes in the subsequent layer. The set of arcs that connects these nodes is given by $A$. Each arc $a^t_{ij} \in A$, which connects node $v_i$ to node $v_j$ at time $t$, has an associated cost $c^t_{ij}$ (ticket cost) and a processing time $p^t_{ij}$ (flight duration). Both depend upon the routed nodes, as well as on the time at which the arc transition is initiated, and both are non-negative, that is, $\forall a^t_{ij} \in A$, $c^t_{ij} \ge 0$ and $p^t_{ij} \ge 0$.

A valid solution $s$ to the FTP is a set of arcs (commercial flights) which start from node $v_0$ during the defined start period, visit every node $v_i$ in $V$ during its defined time-window $TW(v_i)$, and respect the duration of stay in each node, defined by $D(v_i)$, before finally returning to node $v_{n+1}$. The set of all valid solutions is given by $S$. The goal of the FTP is to find the global minimum $s^* \in S$, with respect to the considered objective function.

The objective function associated to this problem depends on the user criteria. While some users

might consider the total cost to be the most important factor, others may consider that the total flight

duration is of crucial importance. Thus, a total of three different objective functions shall be herein considered: (i) the expended cost (see eq. 3.1), (ii) the flight duration (see eq. 3.2), and (iii) the resulting entropy (see eq. 3.3), where the latter corresponds to a weighted sum of the former two.

Figure 3.1: Illustration of a Flying Tourist Problem using a multipartite graph. To each node (A, B, C) is associated a waiting period of, respectively, (1, 2, 3) time units. The red arrows represent a possible solution to the problem.

\[ F_c(s) = \sum_{n=0}^{N+1} c(s[n]) \tag{3.1} \]

\[ F_t(s) = \sum_{n=0}^{N+1} p(s[n]) \tag{3.2} \]

\[ F_e(s) = \sum_{n=0}^{N+1} \left[ w_c \, c(s[n]) + w_p \, p(s[n]) \right] \tag{3.3} \]
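The three objective functions can be sketched in Python as simple sums over the arcs of a solution. The `Arc` record, the function names and all ticket costs and flight durations below are hypothetical values chosen only to illustrate equations 3.1 to 3.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    origin: str
    destination: str
    day: int          # time t at which the arc is traversed
    cost: float       # c: ticket cost (hypothetical values below)
    duration: float   # p: flight duration in hours (hypothetical values below)


def total_cost(s):
    """Equation 3.1: sum of the ticket costs of the arcs in s."""
    return sum(arc.cost for arc in s)


def total_duration(s):
    """Equation 3.2: sum of the flight durations of the arcs in s."""
    return sum(arc.duration for arc in s)


def weighted_objective(s, w_c, w_p):
    """Equation 3.3: weighted sum of cost and duration over the arcs in s."""
    return sum(w_c * arc.cost + w_p * arc.duration for arc in s)


# A four-arc trip X -> B -> A -> C -> X with made-up costs and durations.
solution = [
    Arc("X", "B", 0, 120.0, 2.5),
    Arc("B", "A", 2, 80.0, 1.5),
    Arc("A", "C", 3, 95.0, 3.0),
    Arc("C", "X", 6, 150.0, 4.0),
]
print(total_cost(solution), total_duration(solution),
      weighted_objective(solution, w_c=1.0, w_p=10.0))
```

The weights `w_c` and `w_p` let a user trade money against time; setting one of them to zero recovers the corresponding single-criterion objective up to a scale factor.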

Figure 3.1 illustrates the multipartite graph associated to a simple instance of the FTP with $v_{n+1} = v_0 = X$, one possible start date ($t = 0$), 3 nodes to visit (A, B, C), with a fixed duration of respectively (1, 2, 3) time-units, and no constraints relative to the time-window of each city. A possible solution to this problem instance corresponds to the set of arcs $(a^0_{X,B}, a^2_{B,A}, a^3_{A,C}, a^6_{C,X})$.

Despite the apparent complexity of the proposed definition, it can be used to state very simple flight searches, including one-way and round-trip flights. For example, the problem of finding a single flight from A to B at date $T$ can be instantiated as a FTP given by $v_0 = A$, $v_{n+1} = B$, $T_0 = T$, and $V = D = TW = \emptyset$. In turn, a round-trip flight involving the same two cities and the same start date, in which the staying period in B is $b$ days, is given by $v_0 = v_{n+1} = A$, $T_0 = T$, $V = \{B\}$, $D = \{b\}$ and $TW = \emptyset$. Thus, this definition is adequate for both simple and complex trips, which can be customized according to the user search criteria, by setting either an extended range of start dates or flexible durations.
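A request of this kind could be represented by a small data structure. The Python sketch below is one possible encoding, given only as an illustration: the class `FTPRequest`, its field names and the helper functions are assumptions, not part of the formal definition.

```python
from dataclasses import dataclass, field


@dataclass
class FTPRequest:
    start: str                      # v_0
    end: str                        # v_{n+1}
    start_dates: range              # T_0, as a range of day indices
    # city -> (minimum nights, maximum nights); the keys define V, the values D
    stays: dict = field(default_factory=dict)
    # city -> (first allowed day, last allowed day); empty means unconstrained
    windows: dict = field(default_factory=dict)


def one_way(origin, destination, day):
    """Single flight: no cities to visit, one start date."""
    return FTPRequest(origin, destination, range(day, day + 1))


def round_trip(home, city, day, nights):
    """Round trip: start and end coincide, one city with a fixed stay."""
    return FTPRequest(home, home, range(day, day + 1), stays={city: (nights, nights)})


single = one_way("A", "B", day=5)
there_and_back = round_trip("A", "B", day=5, nights=3)
```

A flexible request would simply widen `start_dates` or give a city different minimum and maximum numbers of nights.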


3.2 Relation to the Traveling Salesman Problem

The Traveling Salesman Problem is the problem of, given a list of cities, finding the best route to visit them all, according to some objective function. In turn, the Flying Tourist Problem proposed in section 3.1 intends to find the best route, schedule, and set of flights to visit a given list of cities. This section will explore the characteristics which distinguish the FTP from the TSP.

It is possible to reduce the Flying Tourist Problem into the Traveling Salesman Problem, by consider-

ing the following set of restrictions:

1. $v_{n+1} = v_0$;
2. $T_0 = 0$;
3. $TW(v_i) = [0, +\infty[$, $\forall v_i \in V$;
4. $D(v_i) = 1$, $\forall v_i \in V$;
5. $c^t_{ij} = c_{ij}$, $\forall v_i, v_j \in V$, $\forall t$.

Constraint 1) operates over the depot, forcing the initial and final node to be the same. While this

constraint is not forced in the FTP, because a user might not necessarily want to finish the trip where it

was initiated, the TSP considers a single depot.

Constraint 2) is responsible for limiting the start-period to a single time-unit. In a real-world flight

search application, this constraint is extremely undesirable, as it reduces the overall quality of the search.

Constraint 3) removes the time-window constraints imposed on each city. This means that any city might be visited during any time period.

Constraint 4) forces each city to be visited during exactly one time-unit. Once again, this constraint

is extremely undesirable in a flight search application, since in most cases, users do not want to spend

only one night in a destination.

Finally, constraint 5) removes the time-dependencies of the cost matrix. It should be noted that the

characteristics of commercial flights contradict this constraint.

Applying constraints (1-4) to the proposed FTP leads to the time-dependent Traveling Salesman

Problem [83], as defined in section 2.2.1.A.

Constraints (1-4) together with (5) reduce the proposed FTP into the asymmetric Traveling Salesman

Problem. In order to reduce it to the classical symmetric TSP, it would be necessary to apply an additional

constraint which would force the symmetry of the cost matrix.

Given that the FTP occurs as a generalization of the TSP, and given that the latter belongs to the

class of NP-hard problems [20], then so does the former one.


3.3 Graph representation

As described in section 3.1, a FTP instance is completely described by a structure $G = (V_c, A, T_0, D, TW)$.

Note that the previously referenced structure requires a set of arcs A connecting every pair of nodes

belonging to Vc. Thus, it is possible to represent this information in a weight matrix, where each of its

entries corresponds to a particular arc.

Since the presented formulation of the FTP aims to address a real-world situation involving

commercial flights, upon constructing the weight matrix, it is necessary to take the characteristics of

commercial flights into account. For any pair of cities, at any given moment in time, there are several

commercial flights that connect these two cities in that particular moment. These flights are denoted as a

family of arcs. Given that there is a direct mapping between commercial flights and FTP arcs, then there

are multiple arcs for each weight matrix entry. Moreover, a commercial flight connecting two cities does not have a constant price over time. Thus, the value of an arc connecting two cities varies over time. This means that the weight matrix of a FTP instance is three-dimensional, indexed by three variables $(i, j, t)$, where $i$ is the origin node, $j$ the destination node, and $t$ the moment in time at which the transition occurs.

In particular, it is possible to consider a weight matrix in which, for every entry, there are multiple

values, corresponding to the different possible arcs for that pair of nodes and time. Consequently,

accessing a particular arc requires not only the triplet (i, j, t), but also some information about which

particular arc to select.

Given that the dimensions of a FTP weight matrix are considerable, considering a family of arcs

for each weight matrix entry is not recommended, since it would increase the size of the weight matrix even more.

Instead, a pragmatic strategy is followed. Upon constructing a weight matrix for the FTP instance, the

objective function is taken into account. This means that, instead of considering that there are multiple

arcs for the triplet (i, j, t), it is considered that there is only one: the one that has the minimum value

according to the objective function. For example, if the objective function intends to minimize the total

trip cost, upon constructing the weight matrix, for each family of arcs (i, j, t), only the minimum cost arc

would be selected. By using this strategy there is only one available arc, for each cost matrix entry, and

it is the one which minimizes the objective function.
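The reduction of each family of arcs to a single arc can be sketched as follows. This Python example is a simplified illustration under stated assumptions: flights are represented as tuples `(origin, destination, day, cost, duration)`, the weight matrix is stored sparsely as a dictionary, missing entries are treated as infinitely expensive, and all names and values are hypothetical.

```python
import math


def build_weight_matrix(flights, objective):
    """Keep, for each (origin, destination, day), the best value of the objective."""
    weights = {}
    for flight in flights:
        key = flight[:3]                       # the triplet (i, j, t)
        value = objective(flight)
        if value < weights.get(key, math.inf):
            weights[key] = value
    return weights


def lookup(weights, i, j, t, missing=math.inf):
    """Weight of arc (i, j, t); non-existent flights get a prohibitive value."""
    return weights.get((i, j, t), missing)


# Two competing flights from A to B on day 0, one flight from B to A.
flights = [
    ("A", "B", 0, 120.0, 2.0),
    ("A", "B", 0, 90.0, 5.5),
    ("B", "A", 0, 70.0, 2.0),
]
by_cost = build_weight_matrix(flights, objective=lambda f: f[3])
by_time = build_weight_matrix(flights, objective=lambda f: f[4])
```

Note that the cheapest and the fastest flight for the same triplet may differ, which is why the matrix has to be built for the objective function that is actually being minimized.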

The price of a commercial flight usually depends on the cities, dates, and direction of traversal. This means that the weight matrix of the FTP is not necessarily symmetric. Moreover, there is also no guarantee that a commercial flight between two cities exists. In fact, there are many cities which do not have a direct flight connection. Fortunately, many commercial flight providers take this into consideration, and try to establish an indirect connection between any two cities by adding connecting flights. However, this is not always the case, and it may be necessary to initialize each entry of the weight matrix to a very high value, in order to discard these non-existent flights from a possible solution.

To conclude the analysis of the characteristics of the FTP weight matrix, it is worth mentioning that the matrix does not need to be complete, because not every arc is relevant for the construction of a solution. While it is necessary to have every arc connecting each pair of nodes belonging to the set of

nodes to be visited, it is not necessary to have every arc connecting the initial and final nodes to the

others. In reality, the arcs leaving from and returning to the initial and final node, respectively, are only

necessary in particular moments of time. To better understand this, consider figure 3.2, which illustrates

the necessary arcs to construct a solution to a FTP instance. Every arc of a FTP instance can be

classified into three different groups, according to their characteristics: initial, transition and final arcs.

Figure 3.2: Illustration of the distribution of the initial, final and transition arcs.

The initial arcs are those which might initiate the trip. Consequently, they must start at node $v_0$, at $t \in T_0 = [T_{0m}, T_{0M}]$, connecting $v_0$ to every node in $V$. In turn, the final arcs are those which connect every node in $V$ to the return node $v_{n+1}$ at $t \in [T_{fm}, T_{fM}]$, where $T_{fm} = T_{0m} + \sum D$ and $T_{fM} = T_{0M} + \sum D$, and $\sum D$ is the total trip duration, calculated by adding the number of nights in every destination. There are a total of $k_i = T_{0M} - T_{0m} + 1 = T_{fM} - T_{fm} + 1$ initial and final arc layers. In the example depicted in Figure 3.1, there is a single initial and final layer, since there is only one possible start date.

The transition arcs are those which fully connect the $N$ nodes belonging to $V$. In the example presented in figure 3.2, the earliest transition arc occurs at a time no sooner than $t_1 = T_{0m} + \min(D)$, where $\min(D)$ corresponds to the lowest value of the set of durations. Hence, if the trip starts by traversing an initial arc at time $T_{0m}$, the first transition arc can only be traversed $\min(D)$ time-units later. By following a similar approach, the latest transition arc can occur no later than $t_2 = T_{0M} + \sum D - \min(D)$. Thus, there are a total of $k_2 = t_2 - t_1 + 1$ transition layers, and $k_2 \cdot n \cdot (n-1)$ transition arcs.
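The layer counts defined above can be checked with a few lines of Python. The sketch assumes fixed durations (one value per city) and uses the hypothetical helper `layer_counts`; the values in the example are those of the instance in Figure 3.1.

```python
def layer_counts(t0m, t0M, durations):
    """Number of layers and transition arcs for fixed stay durations."""
    n = len(durations)
    total = sum(durations)                 # total trip duration, sum of D
    k_i = t0M - t0m + 1                    # initial (and final) arc layers
    t1 = t0m + min(durations)              # earliest transition arc
    t2 = t0M + total - min(durations)      # latest transition arc
    k2 = t2 - t1 + 1                       # transition layers
    return {
        "k_i": k_i,
        "final_window": (t0m + total, t0M + total),
        "t1": t1,
        "t2": t2,
        "k2": k2,
        "transition_arcs": k2 * n * (n - 1),
    }


# Instance of Figure 3.1: a single start date and stays of 1, 2 and 3 time-units.
counts = layer_counts(t0m=0, t0M=0, durations=[1, 2, 3])
print(counts)
```

For this instance there is a single initial layer, the final arcs occur at t = 6, and the 5 transition layers between t = 1 and t = 5 hold 30 transition arcs.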

The union of the initial, transition and final arcs gives the set A of all the arcs, which may be used to

construct a solution to the requested trip.


3.4 Dimensional overview

This section presents a brief analysis of the weight matrix dimensions of an FTP instance, in particu-

lar, the number of entries and its size in memory. It also includes an overview of the solution space

associated to a FTP request.

The FTP is described by a tri-dimensional array (weight matrix), with shape $(|V_c|, |V_c|, |T_0| + \sum D)$, where $V_c$ is the complete set of nodes (note that $|V_c| = n + 2$), $T_0$ is the length of the start period and $\sum D$ is the total trip duration. The total number of entries ($n_e$) of the weight matrix is given by $n_e = (n+2)^2 \times (T_0 + \sum D)$.

Consequently, the total number of entries of the weight matrix depends mostly on the number of cities to visit, since there is a quadratic relation. Furthermore, it is always true that $1 \le (T_0 + \sum D) \le 365$, since commercial flights are not sold more than one year in advance. Thus, it is possible to simplify the expression of the number of entries of a weight matrix by considering $n_e = k(n+2)^2$, where $k = T_0 + \sum D$.

Each entry of the weight matrix shall be represented with a 32-bit integer and thus, it is possible to calculate the size of the weight matrix in memory ($m_{graph}$), given by $m_{graph} = 4 \cdot n_e$ (bytes).

When considering a TSP instance with $n$ cities to visit, there are $n!$ possible routes. The goal of the FTP is to find both the best route and schedule for a trip. Note that in a FTP instance, the length of the start period is given by $T_0$, and thus, there are at most $T_0$ different possible schedules. Thus, the size of the solution space of a FTP is given by $|S| = T_0 \cdot n!$.

From the above, it is possible to conclude that the size of the search space of an FTP instance is

closely related to that of the TSP. In particular, if there is a single start date, the size is the same. As the

length of the start period increases, then so does the search space. This increase is always linear, and

it is usually below 3 orders of magnitude.
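The expressions of this section can be evaluated directly. The Python sketch below uses a hypothetical helper, `ftp_dimensions`, and a made-up request (10 cities, a 7-day start period and 30 nights in total) purely as a worked example.

```python
import math


def ftp_dimensions(n, start_len, total_stay, bytes_per_entry=4):
    """Return (number of entries, size in bytes, size of the solution space)."""
    k = start_len + total_stay               # number of time layers
    entries = (n + 2) ** 2 * k               # n_e = k (n + 2)^2
    memory = bytes_per_entry * entries       # m_graph = 4 n_e for 32-bit integers
    solutions = start_len * math.factorial(n)  # |S| = T_0 * n!
    return entries, memory, solutions


entries, memory, solutions = ftp_dimensions(n=10, start_len=7, total_stay=30)
print(entries, memory, solutions)
```

For this request the weight matrix has 5328 entries and occupies 21312 bytes, while the solution space contains 25401600 candidate solutions, which shows that the search space, and not the storage, is the limiting factor.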

Note that the procedures considered above are only true after reducing the family of arcs into a single

arc, according to the objective function, as described in the previous section. Prior to this, instead of

having a single 32-bit integer representing a weight matrix entry, there is a structured data object (JSON)

with flight information. In general, each flight roughly occupies 3500 bytes, and there are multiple flights

(≈1-100) for each pair of cities, and for each date.

3.5 Optimization methodology

This section introduces the considered optimization algorithms to produce a solution to a Flying Tourist

Problem, as defined in section 3.1. Given the real-world application under development, and its goals,

the devised optimization system should be capable of providing a stream of responses in finite time. Due to these objectives, the considered optimization strategies are based on heuristic algorithms (subsection 3.5.1) and metaheuristic algorithms (subsection 3.5.2), as the characteristics of these algorithms fit the goals of the system.

3.5.1 Heuristic algorithms

3.5.1.A Pseudo-random construction procedure

Formally, the method introduced in this subsection is not an optimization algorithm, but rather a solution

construction procedure for the Flying Tourist Problem. This procedure is relevant for two reasons. First,

it can be used to construct a preliminary solution to a given request in a very fast manner. Naturally,

the quality of this solution may be extremely poor, but it is very useful as an initial and fast response

to a user request. Furthermore, this construction procedure is also relevant, because some optimiza-

tion algorithms, like the Simulated Annealing (discussed in section 3.5.2.B), require an initial valid and

complete solution.

The method introduced below will be hereinafter called pseudo-random construction procedure, and requires an instance of the Flying Tourist Problem $G = (V_c, A, T_0, D, TW)$ where there are no restrictions regarding the time-windows, that is, $TW(i) = [0, +\infty[$, $\forall i \in V$. The steps of this procedure can be summarized as follows:

1. set an initial empty solution: $s = ()$;
2. set the current time to one of the possible start dates: $t \in T_0 = [T_{0i}, T_{0f}]$;
3. set the current node to the start node: $v_c = v_0$;
4. if the set of nodes to visit, $V$, is empty, go to step 11); else, continue to step 5);
5. select the next node by choosing a random node from the set of nodes to visit: $v_i \in V$;
6. remove the selected node from the set of nodes to visit: $V = V \setminus \{v_i\}$;
7. extend the solution with the arc $a^t_{v_c, v_i}$;
8. increment the time according to the duration of visit of the selected node: $t = t + d(v_i)$;
9. update the current node: $v_c = v_i$;
10. go to step 4);
11. extend the solution with the final arc $a^t_{v_c, v_{n+1}}$.
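The eleven steps above can be sketched in Python as a single loop. This is a minimal illustration under stated assumptions: an arc is represented as a tuple `(origin, destination, day)`, stay durations are fixed per city, and the function and parameter names are hypothetical.

```python
import random


def random_construction(v0, v_end, cities, stay, start_dates, rng=None):
    """Build a complete trip by visiting the cities in a random order."""
    rng = rng or random.Random()
    solution = []                              # step 1: empty solution
    t = rng.choice(list(start_dates))          # step 2: pick a start date
    current = v0                               # step 3: start node
    remaining = list(cities)
    while remaining:                           # step 4: stop when V is empty
        # steps 5 and 6: pick a random node and remove it from the set
        nxt = remaining.pop(rng.randrange(len(remaining)))
        solution.append((current, nxt, t))     # step 7: extend with the arc
        t += stay[nxt]                         # step 8: advance the time
        current = nxt                          # step 9: update current node
    solution.append((current, v_end, t))       # step 11: final arc
    return solution


stay = {"A": 1, "B": 2, "C": 3}
trip = random_construction("X", "X", ["A", "B", "C"], stay,
                           start_dates=[0], rng=random.Random(42))
```

Whatever order is drawn, the trip always has one arc more than the number of cities, and with fixed durations the return flight always falls on the start date plus the total number of nights.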

The proposed pseudo-random construction procedure is expected to be adequate for single-flights

and round-trips, as well as small multi-city requests. As the number of cities to visit increases, the quality of the solutions produced by this procedure is expected to fall. In any case, regardless of the size of the instance being solved, this procedure is expected to be very fast and able to return a solution to a request in a very short time.


3.5.1.B Nearest neighbour procedure

The nearest neighbour is a solution construction procedure that starts with an initial empty solution, and

at each step of the algorithm updates the current solution by extending it with a solution component (i.e.,

an arc). Thus, this construction procedure is very similar to the one described in subsection 3.5.1.A.

However, while the previous construction procedure selects the next node to visit in a pseudo-random

way, the nearest neighbour heuristic takes a different approach, by selecting the next node according to

some particular objective.

During the development of this work, two different nearest neighbour heuristics were used. The first

only takes into account the distance between nodes, visiting always the closest node relative to the

current one. This is exactly the nearest neighbour procedure applied to the Traveling Salesman Problem.

The second approach, instead of considering the distance between nodes, considers the targeted objective function. That is, if the objective is to minimize the total cost, then this heuristic will always select the node according to the minimum cost arc. In turn, if the objective is to minimize the flight time, or any other criterion, then it is this criterion that is used upon selecting a node to visit, always choosing the node which minimizes the increase in the current objective function.

In order to distinguish the two nearest neighbour heuristics, we will denote them as dNN and rNN,

that is, distance nearest neighbour and refined nearest neighbour, respectively. Note that the first only

takes into account the distance between nodes, and not directly the objective function, while the latter

takes into account the objective function, and completely disregards the distance between nodes.

The nearest neighbour construction procedure can be adapted from the previously introduced pseudo-

random construction procedure by simply replacing the construction step number 5). Thus, the distance

nearest neighbour considers:

• select the next node by choosing the one closest to the current node:

$v_i \in V : d(v_c, v_i) \le d(v_c, v_j), \; \forall v_j \in V \setminus \{v_i\}$

while the refined nearest neighbour considers:

• select the next node by choosing the one which increases the objective function the least:

$v_i \in V : f(v_c, v_i) \le f(v_c, v_j), \; \forall v_j \in V \setminus \{v_i\}$
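Both variants differ only in the function used to rank the candidate nodes, which the following Python sketch makes explicit. It is an illustration under stated assumptions: the ranking function `weight(i, j, t)` is supplied by the caller (a distance for dNN, a weight matrix lookup for rNN), stays are fixed, ties are broken alphabetically, and all names and prices are hypothetical.

```python
import math


def nearest_neighbour_construction(v0, v_end, cities, stay, start, weight):
    """Greedy construction: always move to the candidate with the lowest weight."""
    t, current = start, v0
    remaining = set(cities)
    solution = []
    while remaining:
        # Replaces the random choice of step 5 by a greedy choice.
        nxt = min(sorted(remaining), key=lambda v: weight(current, v, t))
        remaining.remove(nxt)
        solution.append((current, nxt, t))
        t += stay[nxt]
        current = nxt
    solution.append((current, v_end, t))
    return solution


# rNN example: the weight is a time-dependent ticket price (made-up values).
prices = {("X", "A", 0): 50, ("X", "B", 0): 30, ("B", "A", 2): 40, ("A", "X", 3): 60}
price = lambda i, j, t: prices.get((i, j, t), math.inf)
trip = nearest_neighbour_construction("X", "X", ["A", "B"], {"A": 1, "B": 2}, 0, price)
```

For dNN the same function could be called with a `weight` that ignores `t` and returns the geographic distance between the two cities.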

It is worth noting that the application of the distance and the refined nearest neighbour heuristics requires different levels of information. On one hand, the distance nearest neighbour requires the knowledge of the distances between each pair of cities. On the other hand, the refined nearest neighbour

requires a complete weight matrix regarding the objective function.


3.5.2 Metaheuristic algorithms

3.5.2.A Ant Colony Optimization procedure

The considered Ant Colony Optimization (ACO) algorithm receives, as input, a weight matrix with the

information regarding all solution components of the problem. It also receives other relevant parameters for the solution construction process, such as the initial and final nodes, $v_0$ and $v_f$, the start period $T_0$, and the set of waiting periods $D$.

The initialization of the ACO metaheuristic requires the construction of an initial pheromone matrix.

Each entry of this matrix is set to an initial pheromone value, according to Eq. 3.4, where $n$ is the number of nodes and $C^{nn}$ is the cost of the nearest neighbor heuristic [84].

\[ \tau^t_{ij} = \tau_0 = \frac{1}{n\,C^{nn}} \tag{3.4} \]

The initialization of the metaheuristic also requires the definition of a variety of algorithm-specific

parameters, such as the number of ants m, the pheromone evaporation rate ρ, the heuristic relative

influence β, the pheromone relative influence α, and the exploration rate Q0.

After the initialization, and until the termination condition is met, the algorithm enters an iterative

cycle, where every ant belonging to the colony constructs a solution to the problem. This is followed by

a pheromone update phase, to reflect the colony search experience. A new iteration may only start after

all ants have finished the solution construction process and the pheromone matrix has been updated.

The construction process undertaken by each ant is as follows. First, the current time is set to a

value belonging to the allowable trip start dates, t ∈ T0, and the current node is set to the start node

v0. Each ant enters an iterative cycle until all nodes belonging to the set of nodes to visit, V , are

visited exactly once. At every step of this cycle, an ant chooses one of the remaining valid solution

components. After the selection of a solution component, the current time is updated, according to the

duration of the selected city. Furthermore, it is also necessary to apply a local pheromone update (see

equation 3.8), after the selection of each solution component, so as to reduce the probability of other ants

selecting the same one in the current iteration [84]. By following this iterative construction process, a

valid but incomplete solution is found. To complete this solution, it is necessary to add an extra solution

component, which closes the route by adding the return node, vn+1.

In the construction process described above, each ant selects the next solution component by either

exploiting or exploring the search space. That is, exploitation is the process of selecting the next solution

component mostly based on the previous ants’ search experience, while exploration intends to diversify

the traversed search space. The decision of exploiting or exploring depends on the algorithm parameter

Q0 and on a pseudo-random value q, calculated at run-time. The selection of the solution component

vj , which identifies the next city to be visited, is given by Eq. 3.5.

\[ v_j = \begin{cases} \text{exploitation (Eq. 3.6)}, & \text{if } q \le Q_0 \\ \text{exploration (Eq. 3.7)}, & \text{otherwise} \end{cases} \tag{3.5} \]

The exploitation of the search space utilizes the pseudorandom proportional rule, defined by Eq. 3.6, which determines the next solution component of the ant's solution. The $J_k(i, t)$ term represents the set of solution components that might be selected by an ant in its current state to form a valid solution, where the state refers to the current position of the ant and the trip it has constructed so far. In the presented equations, $\eta$ is the inverse of the weight matrix value.

\[ \arg\max_{j \in J_k(i,t)} \; [\tau(i,j,t)]\,[\eta(i,j,t)]^{\beta} \tag{3.6} \]

On the other hand, the exploration is given by Eq. 3.7, with $p_a(i, j, t)$ representing the probability of ant $a$ (which is currently at node $i$ at time $t$) selecting $j$ as the next node to visit.

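\[ p_a(i,j,t) = \begin{cases} \dfrac{[\tau(i,j,t)]\,[\eta(i,j,t)]^{\beta}}{\sum_{u \in J_k(i,t)} [\tau(i,u,t)]\,[\eta(i,u,t)]^{\beta}}, & \text{if } j \in J_k(i,t) \\ 0, & \text{otherwise} \end{cases} \tag{3.7} \]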
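The selection rule of Eqs. 3.5 to 3.7 can be sketched in Python as follows. The sketch assumes that the pheromone and heuristic values are stored in dictionaries keyed by the triplet `(i, j, t)`; the function name `select_next` and the numeric values in the example are hypothetical.

```python
import random


def select_next(i, t, candidates, tau, eta, beta, q0, rng):
    """Choose the next node for an ant located at node i at time t."""
    # Attractiveness of each candidate: [tau(i, j, t)] * [eta(i, j, t)]^beta
    scores = {j: tau[(i, j, t)] * eta[(i, j, t)] ** beta for j in candidates}
    if rng.random() <= q0:
        # Exploitation (Eq. 3.6): take the most attractive candidate.
        return max(scores, key=scores.get)
    # Exploration (Eq. 3.7): sample proportionally to the attractiveness.
    nodes = list(scores)
    return rng.choices(nodes, weights=[scores[j] for j in nodes], k=1)[0]


# Two candidates from X at t = 0; eta is the inverse of the (made-up) ticket price.
tau = {("X", "A", 0): 1.0, ("X", "B", 0): 1.0}
eta = {("X", "A", 0): 1 / 50, ("X", "B", 0): 1 / 25}
rng = random.Random(1)
greedy_choice = select_next("X", 0, ["A", "B"], tau, eta, beta=2, q0=1.0, rng=rng)
sampled_choice = select_next("X", 0, ["A", "B"], tau, eta, beta=2, q0=0.0, rng=rng)
```

With `q0 = 1.0` the rule is purely greedy, while lower values of `q0` let less attractive candidates be selected with a probability proportional to their score.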

After each ant finishes its iterative solution construction process, the ACO metaheuristic enters into

its pheromone update step. Depending on the chosen ACO algorithm, the pheromone update may vary.

This work follows the Ant Colony System (ACS) strategy, whose pheromone update requires both a

deposit and an evaporation step. Unlike many other ACO algorithms, the pheromone update applies

only to the arcs belonging to the best solution found so far, $S_{bs}$. This results in the update of the pheromone values by means of Eq. 3.9, where $(\Delta\tau^t_{ij})_{bs}$ is given by $1/C_{bs}$, and $C_{bs}$ represents the objective function value of the best solution.

\[ \tau^t_{ij} = (1-\rho)\,\tau^t_{ij} + \rho\,\tau_0 \tag{3.8} \]

\[ \tau^t_{ij} = (1-\rho)\,\tau^t_{ij} + \rho\,(\Delta\tau^t_{ij})_{bs} \tag{3.9} \]
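The initial pheromone value (Eq. 3.4) and the two update rules (Eqs. 3.8 and 3.9) can be sketched in Python as follows. The dictionary-based pheromone matrix, the function names and the numeric values are assumptions made for this illustration only.

```python
def initial_pheromone(n, c_nn):
    """Eq. 3.4: tau_0 = 1 / (n * C_nn)."""
    return 1.0 / (n * c_nn)


def local_update(tau, arc, rho, tau0):
    """Eq. 3.8: applied to an arc right after an ant selects it."""
    tau[arc] = (1.0 - rho) * tau[arc] + rho * tau0


def global_update(tau, best_solution, best_cost, rho):
    """Eq. 3.9: applied only to the arcs of the best solution found so far."""
    for arc in best_solution:
        tau[arc] = (1.0 - rho) * tau[arc] + rho * (1.0 / best_cost)


# Made-up instance: 4 nodes, nearest neighbour cost of 100, best cost of 50.
tau0 = initial_pheromone(n=4, c_nn=100.0)
tau = {("X", "A", 0): tau0, ("X", "B", 0): 3 * tau0}
local_update(tau, ("X", "B", 0), rho=0.1, tau0=tau0)
global_update(tau, [("X", "A", 0)], best_cost=50.0, rho=0.1)
```

The local update pulls the pheromone of a selected arc back towards `tau0`, which makes it less attractive to the following ants, whereas the global update reinforces the arcs of the best solution.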

It is common (and often recommended) to combine ACO algorithms with local search heuristics,

also denoted daemon actions, that try to improve the quality of the constructed solutions, after each of the ants' iterative cycles. However, this was not applied to the proposed optimization, due to the nonexistence of adequate local search procedures for the time-dependent TSP. In fact, even the k-opt exchange procedures, widely used in the classical TSP as local search, are not efficient for the time-dependent TSP because they require, at each step, the computation of the entire trip cost, as opposed to just the cost difference regarding the k arcs, as in the symmetric TSP.


3.5.2.B Simulated Annealing procedure

The considered Simulated Annealing algorithm receives, as input, a weight matrix with the information

of the solution components of the problem. It must also receive other relevant parameters for the solution construction process, such as the initial and final nodes and the set of waiting periods D. This specific

information about the instance under optimization is crucial, as it enables the validation of a solution and

the calculation of the respective objective function value.

The general procedure of the Simulated Annealing metaheuristic is as follows. Given an initial so-

lution (x), at each step of the inner cycle (also called Markov chain), a new candidate solution (y) is

constructed based on a neighbourhood function, which is usually problem specific. Depending on the

quality of the new solution, and the current temperature of the algorithm, this solution may or may not be

accepted. This process of constructing and conditionally accepting a new solution occurs a fixed number

of times per outer cycle - this number is referred to as the Markov chain length. Having completed one

Markov chain, the temperature of the state is decreased, according to a predefined cooling schedule.

Given the above described procedure, the Simulated Annealing metaheuristic requires:

1. an initial solution - over which the local search operates;

2. a neighbourhood function - used to construct candidate solutions;

3. an acceptance criterion - used to conditionally accept the candidate solutions;

4. a cooling schedule - to decrease the temperature of the state.

As a local search metaheuristic, the Simulated Annealing requires an initial solution. It is

possible to construct this initial solution by applying the pseudo-random construction procedure (section

3.5.1.A), or by using the nearest neighbour (section 3.5.1.B). In general, the quality of the initial solution

does not affect the quality of the best solution found by the algorithm [85].

The neighbourhood function selected for the generation of new candidate solutions is the

2-opt swap procedure. Hence, at each iteration step of the Markov chain, it selects two random nodes

and swaps the corresponding path. It is also necessary to take into account both the initial and final

nodes in order to produce a valid solution. Since this swapping procedure may change the dates at

which each node is visited, which consequently changes the solution arcs, it is necessary to adjust the

flight dates and calculate the objective function value of the candidate solution.
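A minimal sketch of such a move is given below, assuming that a route is stored as a list whose first and last entries are the fixed initial and final nodes. The function names, the city codes and the simplification that flights take no time are assumptions made for illustration only.

```python
import random


def two_opt_neighbour(route, rng):
    """Reverse a random inner segment of `route`, keeping both endpoints fixed."""
    n = len(route)
    if n < 4:                      # fewer than two inner nodes: nothing to swap
        return list(route)
    i, j = sorted(rng.sample(range(1, n - 1), 2))
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]


def departure_days(route, start_day, stay):
    """Day on which each node is left, given the waiting period (in days) per city."""
    days, day = {}, start_day
    for node in route[:-1]:
        day += stay.get(node, 0)   # the origin has no waiting period
        days[node] = day
    return days


stay = {"PAR": 2, "ROM": 3, "BER": 1}
route = ["LIS", "PAR", "ROM", "BER", "LIS"]
candidate = two_opt_neighbour(route, random.Random(1))
before = departure_days(route, 0, stay)
after = departure_days(candidate, 0, stay)
```

Comparing `before` and `after` shows why the flight dates must be adjusted: reversing a segment changes the day on which each affected city is left, so different arcs (flights) are used and the whole trip cost has to be recomputed.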

The acceptance criterion is a function that determines the probability of accepting a candidate solution.

The developed SA algorithm uses the Metropolis acceptance criterion [86], presented in equation 3.10,

which can be summarized as follows: i) if a candidate solution y is no worse than the current solution x, it

is always accepted; ii) if a candidate solution is worse, it may, or may not, be accepted.

$$
p_y =
\begin{cases}
1, & \text{if } f(y) \le f(x), \\
e^{-\Delta f / t}, & \text{otherwise}
\end{cases}
\tag{3.10}
$$


The probability by which a worse solution is accepted depends upon the difference in the objective

function values (∆f ) of the two solutions and the current temperature of the system (Eq. 3.10). As ∆f

increases, and as the temperature decreases, the probability of accepting a worse solution is reduced.

With such an approach, the Metropolis acceptance criterion allows up-hill moves, which enable the algorithm

to escape from local minima. Nevertheless, as the temperature reaches very low values, the

algorithm becomes increasingly greedy.
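The effect of the temperature on Eq. 3.10 can be checked numerically with the short sketch below. The function name and the cost and temperature values are illustrative assumptions.

```python
import math


def acceptance_probability(f_x, f_y, t):
    """Metropolis rule of Eq. 3.10 for a minimisation problem."""
    if f_y <= f_x:
        return 1.0
    return math.exp(-(f_y - f_x) / t)


# Same cost increase (delta_f = 20) at a high and at a low temperature.
p_hot = acceptance_probability(100.0, 120.0, t=100.0)   # exp(-0.2)
p_cold = acceptance_probability(100.0, 120.0, t=5.0)    # exp(-4)
```

For the same deterioration of 20 cost units, the candidate is accepted with a probability of roughly 0.82 at t = 100, but only about 0.018 at t = 5, which is the increasingly greedy behaviour described above.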

The developed SA optimization uses a geometric cooling schedule. It starts with an initial tempera-

ture t0, and at each outer iteration, the temperature is decreased, using equation 3.11, where k is the

iteration counter of the outer loop and λ is the cooling parameter.

$$
t_{k+1} = \lambda \, t_k \tag{3.11}
$$

The cooling schedule parameters t0, tf and λ must be calculated beforehand based on the prob-

ability of accepting a worse solution during the first iteration (p0) and during the last iteration (pf ), and

on the total number of outer iterations (k). The defined algorithm establishes p0 as 0.98 and pf as a

positive value close to zero. The total number of iterations is set according to the time available for the

optimization process, and the length of the Markov chain (M ) is set to the number of nodes m.

To calculate the value of t0 and tf , the algorithm starts by generating some candidate solutions

using the neighbourhood function and the current solution x [87]. These candidate solutions are used to

calculate the average absolute difference in the objective function ∆avg. This allows the calculation of

the t0 value according to Eq. 3.12, based on the Metropolis criterion. The final temperature follows from

the same criterion applied to pf, that is, tf = −∆avg / ln(pf), and from the geometric schedule, tf = λ^k t0.

Equating the two allows the calculation of λ with Eq. 3.13. Given t0, tf and λ, the geometric cooling

schedule is completely defined.

$$
t_0 = \frac{-\Delta_{avg}}{\ln(p_0)} \tag{3.12}
$$

$$
\lambda = \left( \frac{-\Delta_{avg}}{\ln(p_f)\, t_0} \right)^{1/k} \tag{3.13}
$$
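As a worked example of Eqs. 3.12 and 3.13, the sketch below computes the three schedule parameters. Only p0 = 0.98 comes from the text; the function name and the values used for ∆avg, pf and the number of outer iterations are assumptions chosen for illustration.

```python
import math


def geometric_schedule(delta_avg, p0, pf, n_outer):
    """Return (t0, tf, lam) following Eqs. 3.12 and 3.13."""
    t0 = -delta_avg / math.log(p0)          # Eq. 3.12
    tf = -delta_avg / math.log(pf)          # same criterion applied to pf
    lam = (tf / t0) ** (1.0 / n_outer)      # Eq. 3.13, since tf = lam**k * t0
    return t0, tf, lam


# Assumed values: delta_avg = 50 cost units, pf = 0.001, 100 outer iterations.
t0, tf, lam = geometric_schedule(delta_avg=50.0, p0=0.98, pf=1e-3, n_outer=100)
```

With these assumed inputs the schedule starts at t0 ≈ 2475, ends at tf ≈ 7.24 and uses λ ≈ 0.943. Applying Eq. 3.11 one hundred times to t0 reproduces tf, which is a convenient sanity check on the parameters.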

3.6 Summary

This chapter enabled the definition of a problem (FTP) which is adequate for the characterization of both

simple and complex trips. The instantiation of this problem requires a weight matrix where each of its en-

tries corresponds to a single flight. This problem was shown to be a generalization of the TSP and thus,

it belongs to the class of NP-hard problems. Due to its computational complexity, the methodologies

proposed to solve the problem rely on heuristic and metaheuristic optimization algorithms.


4 System Design and Implementation

Contents

4.1 System Architecture

4.2 Underlying technologies

4.3 Client Side Application implementation

4.4 Server Side Application implementation

4.5 Optimization system

4.6 Summary


While chapter 3 presented a formal definition of the problem and proposed some methodologies

to solve it, this chapter addresses the design choices and implementation details related to the devel-

opment of a flight search application to solve unconstrained multi-city flight requests. It presents an

overview of the architecture (section 4.1), structure and design choices (subsections 4.1.1 and 4.1.2).

It also presents the implementation details of the developed application. In particular, it describes the

underlying technologies (section 4.2), the structure of the CSA (section 4.3) as well as the current

user interface (subsection 4.3.3), and the SSA API and DMS (section 4.4).

4.1 System Architecture

The aimed web application should allow users to search for the best schedule, route and set of flights, for

both one-way and round-trip flights, as well as unconstrained multi-city trips. During the formal definition

of the problem, in section 3.1, it was shown that these problems can be converted into a FTP instance,

which is a generalization of the TSP. Because of this, as the number of cities increases, the problem

becomes increasingly difficult to solve. To cope with this, each user-defined request is

solved using the optimization algorithms previously presented in section 3.5.

The proposed system is structured into two separate applications: the Client-side application (CSA)

and the Server-side application (SSA). The client side application is designed to solely interact with

the user, redirecting requests to the server side application, whose goal is to solve these requests.

Thus, there is a complete separation of concerns between both applications: the CSA serves only as

an input/output port, and the application logic and intelligence are handled by the SSA. The communication

between both applications relies on the Hypertext Transfer Protocol (HTTP) and on Asynchronous

JavaScript and XML (AJAX). This means that the CSA may request data from the SSA using plain

HTTP requests, and that this communication is asynchronous, allowing the user to continue to interact

with the application, even while the response is being prepared. The structure of the proposed appli-

cation, and the data-flow associated to the resolution of a user defined request are presented in figure

4.1.

Figure 4.1: Structure and data flow of the proposed application.


4.1.1 Client Side Application

The CSA is a web application designed to interact with the user, allowing the definition of flight requests

and the presentation of meaningful solutions. These requests are not processed directly

by the CSA, as this would consume too many resources (CPU and RAM) of the user's device; they are,

instead, redirected to the SSA to be solved. Thus, the CSA is intended to be only an input/output port

between the user and the SSA.

Each user-defined request is constructed in such a way that it can be used to instantiate an FTP, as defined

in section 3.1. Each FTP request is a specific resource, uniquely identified by a particular Uniform

Resource Identifier (URI). This will be further detailed in section 4.4.2. Each of these resources is

used to request a solution from the server side application. Following this convention,

the user interface must enable the collection of the following information from the user:

• the start city and return city, v0 and vn+1, respectively;

• a list of cities to be visited V , and the durations D associated to each;

• the start dates T0 associated to the trip.
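The information listed above can be gathered in a small record before it is encoded into a request. The sketch below is only an illustration written in Python: the class name, field names and validation rules are assumptions, not the data model of the developed CSA.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlightRequest:
    """Hypothetical container for the fields that instantiate an FTP request."""
    origin: str            # start city, v0
    return_city: str       # return city, v_{n+1}
    stays: dict            # city to visit -> duration in days (V and D)
    start_dates: tuple     # candidate start dates (T0)

    def validate(self):
        if not self.start_dates:
            raise ValueError("at least one start date is required")
        if any(days < 0 for days in self.stays.values()):
            raise ValueError("durations must be non-negative")
        return self


request = FlightRequest(
    origin="LIS",
    return_city="LIS",
    stays={"MAD": 2, "ROM": 3},
    start_dates=(date(2018, 10, 1), date(2018, 10, 2)),
).validate()
```

A single flight corresponds to an empty `stays` mapping with different origin and return cities, so the same record covers the three trip types handled by the application.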

Upon receiving a solution to a user request, the User Interface (UI) must be updated, displaying the

relevant information of these solutions. Each solution to a request is an object which contains at least

one set of flights that satisfy the user defined itinerary. However, a response to a request should contain

several valid solutions, so that users may choose the one best suited to their needs. Each solution

is composed of one or several flights, and each of the presented flights should contain, at least, the

following information:

• the flight cost;

• the flight duration;

• the date, departure and arrival time;

• the number of layover flights.

Having a clear idea of the objectives and structure of the CSA, it is possible to discuss its actual

design. The User Interface can be separated into two independent views: the Request and the Re-

sponse view. These views allow the definition of user requests, and the visualization of the constructed

response, respectively. It would also be useful to define and implement a third view: a Map view. While

the request and response views are essential to the overall function of the application, the map view is

not. However, a map capable of displaying the routes of the selected flights, as well as other relevant

information, would certainly contribute to a better and more complete user experience.


A web application can be accessed by multiple devices, such as phones, tablets and computers, and

each of these devices has different screen dimensions. Because of this, the proposed web application

should be responsive. That is, the design and dimensions of the user interface should be adequate and

responsive to the size of the device rendering it.

With this in mind, the proposed UI should follow the design illustrated in figure 4.2. Notice that there

are a total of three views: the request and response views, which are essential, and thus are always

present; and the map view, which is not, and thus can be discarded on some smaller devices.

Figure 4.2: Proposed User Interface layout for small/medium and large devices. There are two essential views and one optional view.

4.1.2 Server Side Application

The SSA is responsible for producing a solution to a user defined FTP instance. To do so, the SSA

shall be implemented as an Application Programming Interface (API), listening for requests to particular

resources, which uniquely identify these FTP instances. Each received request corresponds to a shallow

FTP instance, that is, a problem in which the cities are specified, but the flights are not. To overcome

this problem, and collect the relevant flight data, the SSA will rely on a Data Management System (DMS),

discussed in section 4.1.2.A. Having a complete FTP instance, the SSA will use the Optimization system

discussed in section 4.1.2.B to construct a solution to these user requests.

4.1.2.A Data Management System

Upon receiving a user-defined request, the set of nodes, as well as the start time and durations,

are already well defined. On the other hand, the set of arcs which connects these nodes is not. For example,


a user request may correspond to a single flight from A to B, at time t, and, upon receiving this request,

there is no available information regarding the possible flights (arcs) between these two cities. In fact,

that is exactly what the user is searching for. Thus, the goal of the Data Management System is to

collect the necessary information to construct the list of arcs associated to a user request.

The communication with a flight API relies on plain HTTP requests, and every request is identified

according to a Uniform Resource Identifier, whose syntax is defined by the API provider. A response

to a request usually consists of a data tree in a structured data format, such as JavaScript Object

Notation (JSON). Every response includes a list of possible flights, and each flight has a vast number of

attributes, such as the cost, flight duration, departure time, and so on.
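The sketch below illustrates how such a JSON response could be turned into a list of arcs. The field names (`flights`, `from`, `to`, `price`, `duration_min`, `departure`) and the sample values are hypothetical, since the actual syntax is defined by each API provider.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    origin: str
    destination: str
    price: float
    duration_min: int
    departure: str      # ISO 8601 timestamp, kept as text in this sketch


def parse_flights(payload):
    """Turn a JSON response body into a list of arcs (hypothetical field names)."""
    doc = json.loads(payload)
    return [
        Arc(f["from"], f["to"], float(f["price"]), int(f["duration_min"]), f["departure"])
        for f in doc.get("flights", [])
    ]


# Illustrative response body, not real flight data.
sample = """{"flights": [
  {"from": "LIS", "to": "MAD", "price": 45.0, "duration_min": 80, "departure": "2018-10-01T07:30"},
  {"from": "LIS", "to": "MAD", "price": 39.5, "duration_min": 85, "departure": "2018-10-01T18:10"}
]}"""
arcs = parse_flights(sample)
cheapest = min(arcs, key=lambda a: a.price)
```

Each parsed `Arc` corresponds to one candidate flight between two nodes, from which an entry of the weight matrix can be selected according to the objective function.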

Although there are several publicly available flight data APIs, the information provided by each of

those might be considerably different. Multiple flight search applications were compared in chapter 1

from a simple services offer perspective, and the results were presented in tables 1.1 and 1.2. The

analysis of this comparison from the corresponding API perspective shows that the flight data presented

by each of these APIs varies considerably. For example, the cost of a single flight may be up to 44%

higher, depending on the flight data API provider.

One of the main goals of the proposed web application is to find the best set of flights for

a given query, according to some objective function and, in particular, the minimization of the total

flight cost. Given that there are considerable differences among flight APIs, ideally, the proposed web

application should query multiple APIs.
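Querying several providers only helps if their answers are merged. A minimal sketch of such a merge is shown below, assuming that each provider's quotes have already been reduced to a mapping from (origin, destination, date) to price. The function name and all prices are illustrative assumptions, with one quote deliberately set 44% above the other.

```python
def cheapest_per_arc(*provider_quotes):
    """Merge quotes from several providers, keeping the lowest price per arc.

    Each argument maps (origin, destination, date) -> price. The result maps
    the same key to (price, index of the provider that offered it).
    """
    best = {}
    for provider, quotes in enumerate(provider_quotes):
        for arc, price in quotes.items():
            if arc not in best or price < best[arc][0]:
                best[arc] = (price, provider)
    return best


# Illustrative prices only.
api_a = {("LIS", "MAD", "2018-10-01"): 50.0, ("MAD", "ROM", "2018-10-03"): 70.0}
api_b = {("LIS", "MAD", "2018-10-01"): 72.0, ("MAD", "ROM", "2018-10-03"): 64.0}
merged = cheapest_per_arc(api_a, api_b)
```

In this toy case neither provider is cheaper on every arc, so the merged weight matrix is better than what either provider alone would yield.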

The role of the Data Management System is of crucial importance for the development of a high

quality flight search web application. This is because every user request seeks to find a given set of

flights to satisfy a particular itinerary. Thus, it is of extreme importance to have the most up-to-date

flight data for each of the possible flights.

4.1.2.B Optimization system

The main goal of the Optimization System (OS) is to produce solutions to user-defined FTP instances.

Upon defining an itinerary and translating it to an FTP instance, the arcs which connect the nodes are not

defined, and thus, no valid solution can be produced. Hence, the OS is heavily dependent on the Data

Management System, as it requires relevant flight data to process the requests.

Depending on the particular request being processed, the required time to collect all the necessary

flights, or the time to run a computationally heavy optimization algorithm, might be very high. Because of

this, the optimization system is distributed into different layers, as illustrated in figure 4.3, which require

different amounts of information regarding the flights, and produce multiple solutions using different

heuristics, at different times. The goal of this is to reduce the latency that is sensed by the user, by

producing an initial solution as soon as possible, and continue to search for a better one afterwards.


Figure 4.3: Simplified illustration of the optimization system, which utilizes different algorithms to produce a solution to a user defined request.

The first layer of the optimization system implements the pseudo-random construction procedure proposed in section

3.5.1.A. This procedure allows the construction of a solution in a very fast manner, as it randomly selects

one node after another, and only requires information regarding a very limited number of flights. Although

a solution is produced very quickly, its quality is expected to be low.

The second layer of the OS implements the nearest neighbour heuristic proposed in section 3.5.1.B.

The implementation of this heuristic requires information regarding the entire weight matrix associated

to the problem. Because of this, it must run only after constructing the initial solution. It is expected that

the solution constructed by this procedure is of higher quality than the initial one.

In its turn, the third layer of the OS implements the meta-heuristic optimization algorithms proposed in

sections 3.5.2.A and 3.5.2.B. Similarly to the nearest neighbour heuristic, these optimization algorithms

require information regarding the entire weight matrix. Thus, they must also run only after the initial

solution construction procedure, but they might run in parallel with the nearest neighbour heuristic. It

is expected that the solutions constructed by these meta-heuristics are of much better quality than the

initial and the nearest neighbour solutions.
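The layered idea can be sketched as a generator that hands back one solution per layer as soon as it is available. This is a simplified illustration, not the developed system: the function names are hypothetical, the weights are time-independent, the layers run sequentially rather than in parallel, and the third layer is replaced by an exhaustive search, which is only viable for tiny instances and merely stands in for the metaheuristics.

```python
import itertools
import random


def route_cost(route, w):
    return sum(w[a][b] for a, b in zip(route, route[1:]))


def random_route(start, cities, end, rng):
    inner = list(cities)
    rng.shuffle(inner)
    return [start] + inner + [end]


def nearest_neighbour_route(start, cities, end, w):
    route, left = [start], set(cities)
    while left:
        nxt = min(left, key=lambda c: (w[route[-1]][c], c))   # cheapest next arc
        route.append(nxt)
        left.remove(nxt)
    return route + [end]


def layered_solutions(start, cities, end, w, rng):
    """Yield (layer name, route, cost) as each layer finishes."""
    layers = [
        ("random", lambda: random_route(start, cities, end, rng)),
        ("nearest neighbour", lambda: nearest_neighbour_route(start, cities, end, w)),
        # Stand-in for the metaheuristic layer (assumption, see lead-in).
        ("exhaustive stand-in", lambda: min(
            ([start, *p, end] for p in itertools.permutations(cities)),
            key=lambda r: route_cost(r, w))),
    ]
    for name, build in layers:
        route = build()
        yield name, route, route_cost(route, w)


# Illustrative weight matrix for a home city H and three cities to visit.
w = {
    "H": {"A": 10, "B": 40, "C": 30},
    "A": {"B": 15, "C": 50, "H": 10},
    "B": {"A": 15, "C": 20, "H": 40},
    "C": {"A": 50, "B": 20, "H": 25},
}
results = list(layered_solutions("H", ["A", "B", "C"], "H", w, random.Random(0)))
```

Because the generator yields after every layer, a caller can forward the first (low quality) solution immediately and replace it as better ones arrive, which is how the perceived latency is reduced.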

4.2 Underlying technologies

By following the architecture proposed in section 4.1 and illustrated in figure 4.1, the developed system

consists of two web applications: the CSA and SSA. The CSA is the application responsible for rendering

the User Interface, allowing the definition of user requests, which are processed by the SSA. Thus,

although these two applications run separately, the CSA is dependent upon the SSA.

The developed applications are hosted on Heroku, a cloud platform which, upon request, creates

two separate runtime environments, one for each application. Each application requires a server to


listen to requests and serve content. In particular, the CSA and SSA run on Node.js and Django servers,

respectively (see figure 4.4). Furthermore, since the User Interface will render a map, the CSA requires

access to the Google Maps API. To enable the implementation of a modern web application, the CSA

also uses several other frameworks, in particular, React and Redux. In its turn, the SSA requires access

to real flight data and thus, will interact with Kiwi’s flight data API. The described application structure

is illustrated in figure 4.4, where Bfly App and Bfly API denote, respectively, the CSA and the SSA.

The previously mentioned underlying technologies will be discussed in further detail in the following

subsections.

Figure 4.4: Technology stack used by the developed application.

4.2.1 Heroku

Heroku is a cloud platform as a service (PaaS), which allows applications to be built, deployed, monitored

and scaled. Customers who use Heroku do not need to worry about implementation details specific to

infrastructure and software, such as the hardware and the servers [88]. In terms of services, Heroku

competes directly with other cloud platform services, such as Google’s App Engine [89] and Amazon’s

Web Services [90]. A succinct comparison of the advantages and inconveniences among these different

services can be consulted in [91], while a more comprehensive (although somewhat outdated) overview

is available at [92].

Heroku provides a detailed explanation of its services and how to use them in order to deploy and

manage an application [93]. Following this overview, a Heroku application can be defined as a slug

which runs on dynos.


A slug is typically the result of the deployment and bundling of an application. An application is

defined by its source code, and in order to run it on Heroku, it is necessary to specify the programming

language runtime, its dependencies, and any compiled output of the build system. An application

can be deployed to Heroku via git or using Heroku’s own API. Upon deployment, Heroku builds the

application, and a ready to be executed slug is produced.

A dyno is a virtualized UNIX container that provides the necessary environment to run the slug of the

application. It is possible to run applications at no cost by using a single free dyno. As

the number of requests to an application increases, a single dyno may not be sufficient to serve all users

without compromising the connection time. To overcome this, it is possible to scale the developed application

by increasing the total number of dynos. This is particularly useful for the SSA, as the optimization

process of the FTP requests requires a significant amount of RAM and CPU.

During the development of this work, Heroku is used as the host service for both the client and

the server side applications. Both of these applications run on a single free dyno, and do not use any

database or other add-ons. It should be noted that the single free dyno has one drawback: upon

30 minutes of inactivity, the application will sleep. This means that the next connection to a sleeping

application requires some additional time.

4.2.2 Node.js

Node.js [94] is an open-source and cross-platform JavaScript runtime environment, which executes

JavaScript code on the server [95]. Node enables the development of fast and scalable web servers

using JavaScript only, by implementing an asynchronous event loop, a low-level input/output API, and

Google’s V8 JavaScript engine. This last technology is a fundamental part of the Node.js stack, since it

allows the compilation of JavaScript source code into native machine code.

Upon the creation of the first web browsers, JavaScript was utilized as a scripting language used to

modify, in run time, the content of a web page. This enabled the creation of the first dynamic web pages.

Today, due to Node.js, this scripting language is not restricted to the browser, and can be used in the

server to create dynamic web pages, even before the page is served to the client [96].

Thus, Node.js enables the unification of the web development environments around a single pro-

gramming language, which may be used both on the client and the server. It also includes a package

manager (npm) that allows the installation of third-party packages, which can be used, for example, in the

management of databases, networking, file system I/O and data streams.

In the development of this work, Node.js is used to create a web server for the Client Side Applica-

tion. Upon receiving a client request, the web server responds with a JavaScript and HyperText Markup

Language (HTML) bundle file. This bundle is the result of the compilation of the dynamic React appli-

cation, illustrated in figure 4.6 in the following section. This bundle contains the necessary information


for the browser to render, manage and update the User Interface upon interaction with the user and

third-party APIs.

4.2.3 Django

Django is a Python framework for the development of web servers and APIs [97]. During the development

of this work, Django was used to develop the SSA, by creating a web server to run an API to process

requests from the CSA. This API, described with further detail in subsection 4.4.1, interprets the requests

using regular expression matching, following the protocol defined in subsection 4.4.2. Upon interpreting

the request, it is possible to execute particular sets of functions, such as the flight data requests to third party

APIs, and the optimization algorithms to solve the FTP requests.
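The idea of interpreting a request through regular expression matching can be sketched with Python's standard `re` module alone. The URI layout used here is hypothetical and is not the protocol of subsection 4.4.2. In a Django project, a pattern with named groups of this kind would typically be registered through `re_path`, which passes the captured groups to the view as keyword arguments.

```python
import re

# Hypothetical layout: /api/trip/<origin>/<YYYY-MM-DD>/<CITY:days,...>/<return>
TRIP_URI = re.compile(
    r"^/api/trip/(?P<origin>[A-Z]{3})/(?P<start>\d{4}-\d{2}-\d{2})/"
    r"(?P<stops>(?:[A-Z]{3}:\d+)(?:,[A-Z]{3}:\d+)*)/(?P<ret>[A-Z]{3})/?$"
)


def parse_trip_uri(path):
    """Return the request fields encoded in `path`, or None if it does not match."""
    m = TRIP_URI.match(path)
    if m is None:
        return None
    stops = [s.split(":") for s in m.group("stops").split(",")]
    return {
        "origin": m.group("origin"),
        "start": m.group("start"),
        "stays": {city: int(days) for city, days in stops},
        "return": m.group("ret"),
    }


request = parse_trip_uri("/api/trip/LIS/2018-10-01/MAD:2,ROM:3/LIS")
```

A path that does not follow the expected layout yields `None`, which a server would normally translate into an error response instead of starting the data collection and optimization steps.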

4.2.4 React and Redux

React is a JavaScript library for the development of user interfaces for web and mobile applications [98].

React is based on some core principles, which include the concepts of components, props and state, a

JavaScript language extension called JSX and the concept of a virtual Document Object Model (DOM)

[99].

The interfaces rendered by React are built using components, which combine the markup of HTML

with the dynamic utilities of JavaScript, and the styling of CSS. This is achieved by using JSX, which

bundles these three technologies under the same file, creating a single independent component. These

components may receive input arguments, denoted as props, allowing them to be flexible and reusable.

Usually, a component is a pure function, rendering always the same content for the same input. Com-

ponents may also be called by other components, allowing the creation of complex architectures. In

general, an application consists of multiple different components.

The state of an application is a data structure with some relevant information for the construction

of the user interface. In a flight search application such as the one being developed, the state contains

information regarding the user request and the solutions associated to it. In general, components do not

access the state directly, but they may receive specific parts of it as props.

As applications grow bigger, it becomes increasingly difficult to manage their state. React, by itself,

is not well prepared to do so, but it is possible to use third party libraries for this effect. One of these

libraries is Redux [100]. Redux is a JavaScript library for the management of the state of an application.

It is often called a predictable state container, because it does not allow the state to be changed directly,

but instead requires a description of these changes using a plain object, called action. Dispatching an

action triggers the execution of a function to manipulate the state, called reducer. Reducers are always

pure functions, producing the same result for the same input. Because of this, the state of the application


is deterministic or, in other words, predictable.
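The action and reducer pattern is independent of JavaScript, so it can be sketched in a few lines of Python. The `Store` class, the action types and the state fields below are assumptions made for illustration, and do not reproduce the Redux API or the state tree of the developed application.

```python
def reducer(state, action):
    """Pure function: returns a new state and never mutates its inputs."""
    if action["type"] == "SET_ORIGIN":
        return {**state, "origin": action["payload"]}
    if action["type"] == "ADD_CITY":
        return {**state, "cities": state["cities"] + (action["payload"],)}
    return state                      # unknown actions leave the state untouched


class Store:
    """Minimal stand-in for a Redux-style store (hypothetical)."""

    def __init__(self, reducer, initial_state):
        self._reducer = reducer
        self._state = initial_state

    def get_state(self):
        return self._state

    def dispatch(self, action):
        # The only way to change the state is to describe the change as an action.
        self._state = self._reducer(self._state, action)
        return action


store = Store(reducer, {"origin": None, "cities": ()})
store.dispatch({"type": "SET_ORIGIN", "payload": "LIS"})
store.dispatch({"type": "ADD_CITY", "payload": "MAD"})
```

Replaying the same sequence of actions from the same initial state always leads to the same final state, which is what makes the container predictable.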

The user interface of a React application is always up to date with the current state. However, it does

not re-render the entire page every time the state is updated. Instead, it compares the current DOM

structure to a virtual DOM introduced by React, and identifies the components that require an update.

This enables a fast and effective update of the user interface.

During the development of this application, React and Redux were used together to build the client

side application. This means that React is responsible for rendering the user interface, while Redux

manages the state and interacts with the SSA over HTTP.

4.3 Client Side Application implementation

The Client Side Application is implemented as a web service to interact with the user and allow the

resolution of complex flight requests. The CSA is constructed as a single page application, and is

hosted on Heroku, where it is publicly available 1.

The CSA, built using React and Redux, revolves around a concept called state. The state of the application

is stored in the Redux store, and it contains the relevant information associated to the application

at each instant. In the developed application, the state tree is divided into requests and responses.

Requests are associated to the user input, while responses come from the SSA. The state cycle of the

developed application is illustrated in figure 4.5.

Figure 4.5: Block diagram of the state cycle of the Client Side Application.

Each branch of the state (requests and responses) is associated with a container (or a panel

view). Containers are top level components, which access the state directly, and call specific components,

often using parts of the state as input (props) for them. Thus, the developed application has two

main views, as proposed in section 4.1.1. There is also a third container, for the map, although it

does not have a state branch for itself, but instead derives the necessary information from the other two.

1 The developed CSA has the following URL: https://desolate-castle-31305.herokuapp.com/

The complete architecture of the implemented React/Redux application, including some of the concepts

previously discussed, is illustrated in figure 4.6. This figure illustrates the top down hierarchy,

with a single component at the top of the hierarchy (denoted app.js), which instantiates the Redux store,

and renders the complete application, by invoking the containers which are connected to the relevant

presentational components. This figure also illustrates the interaction with the store and with third-party

APIs.

Figure 4.6: Building blocks of an application built using React and Redux.

In the remainder of this section, it will be shown how to collect user input (subsection 4.3.1) and

how to communicate with the SSA (subsection 4.3.2). Finally, subsection 4.3.3 illustrates the developed

web application, by presenting some actual screenshots of it.

4.3.1 User Input

There are three types of requests that will be processed by the application: single flight, round trip and

multi-city trip. A user must initially select the desired type of search, and follow this with the completion

of a series of forms that collect the relevant input.

Every request requires at least an origin and a destination, as well as a start date. There are other

search attributes which might be relevant, according to the selected trip type, such as the duration associated

to each city that will be visited. A complete list of the input that may be collected from the user is

presented in table 4.1 (left column).

Table 4.1: Parallelism between User Input, Actions and Reducers. To each user defined input corresponds an action, declaring the intent of changing the state with some specific data, and a reducer, which actually modifies the state.

User input     Action                   Reducer
origin         actOrigin(origin)        setOrigin(state, action)
destination    actDest(dest, index)     setDest(state, action)
duration       actDur(dur, index)       setDuration(state, action)
start date     actDate(date, index)     setStartDate(state, action)
submit         actRequest(request)      setResponse(state, action)

Thus, each user input is associated with an action and a reducer. Table 4.1 also defines the action

that is dispatched each time the input is updated (center column), and the corresponding reducer that is

responsible for updating the state of the application (right column). Note that the last user input (submit)

is processed by an asynchronous action. Thus, after the submission of the request, the reducer will be

called only upon receiving the response from the SSA, updating the state by storing the received data,

and triggering the update of the user interface.

The developed application forces the user to submit the request, by clicking on a button that dis-

patches an action. During the development of the application, the possibility of removing this button,

and to automatically dispatch a request was considered. However, it was subsequently rejected be-

cause of the difficulty to know if a certain request is complete. For example, given a single flight request,

knowing if the request is complete is simply a matter of verifying if an origin, destination and departure

date exist. For a round-trip, the duration would also be a requirement. However, for multicity requests,

unless the user specifies how many cities are to be visited, there is no way of knowing when a request is

complete. Furthermore, any change to an already complete request would trigger another new request.

Thus, the intention of submitting a request must be declared through an explicit user action.

4.3.2 Communication with the SSA API

Upon the submission of a user defined trip, the CSA communicates with the SSA in order to produce a

solution to it. This communication relies on HTTP, and utilizes the SSA API protocol that will be described

in subsection 4.4.2. The URL extension of the request, called the Uniform Resource Identifier (URI),

enables the specification of each particular resource under request. Thus, the URI may be composed of

the necessary fields that specify the attributes which characterize the user selected resource. It follows

that a particular resource is identified by a collection of (key, value) pairs, which identify the resource


attribute and its user defined value.
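A collection of (key, value) pairs can be encoded into, and recovered from, a URI with standard library calls, as in the sketch below. The base path and the key names are hypothetical and do not reproduce the SSA API protocol of subsection 4.4.2; the example is written in Python although the CSA itself is a JavaScript application.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_request_uri(base, attributes):
    """Encode (key, value) pairs as the query part of a resource URI."""
    return f"{base}?{urlencode(attributes)}"


def read_request_uri(uri):
    """Recover the (key, value) pairs from a URI built as above."""
    return parse_qsl(urlsplit(uri).query)


# Hypothetical attribute names for a multi-city request.
pairs = [("origin", "LIS"), ("start", "2018-10-01"), ("city", "MAD"), ("days", "2"),
         ("city", "ROM"), ("days", "3"), ("return", "LIS")]
uri = build_request_uri("/api/trip", pairs)
```

Keeping the pairs in an ordered list, rather than in a dictionary, allows repeated keys, which is convenient when the number of cities to visit is not known in advance.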

Although the application logic of the developed CSA is managed by Redux, this framework is not able

to make HTTP requests or provide the desired asynchronous behaviour. In order to do so, it is necessary

to use two third-party libraries: redux-thunk and superagent. Redux-thunk is a store enhancer, or

a middleware, providing additional functionality to the store, and superagent is a library for AJAX

requests. Together, these two libraries enable the dispatch of asynchronous actions, which are used to

make HTTP requests for specific resources to the SSA API.

4.3.3 User Interface

The User Interface consists of a single page application, divided into three main views: the Request,

the Response and the Map view, as proposed in section 4.1.1 and illustrated in figure 4.7. Since the

application is developed using React, every view is managed by a container, which reads the state from

the store, calls the rendering of presentational components, and may dispatch actions on user input

or other events. It is also worth noting that, due to the modular nature of React components, they

are reusable. This is particularly useful in the construction of the request view. As an example, one

component (city form) is used to collect information regarding all cities: origin, return city and cities to

visit. This is possible because the underlying structure is the same. The only difference is the reference

to these components, which update different parts of the state.

Figure 4.7: Structure of the developed user interface

The user interface was designed to be mobile friendly, by being responsive to the device size. This

was achieved using the Bootstrap grid system, a web application design paradigm in which the user

screen is divided into 12 columns, and each element of the user interface may specify a variable number

of columns, depending on the screen size.

Figures 4.8 and 4.9 illustrate two possible views of the developed application. The first image

illustrates the application on a desktop device and the second on a mobile device. It is worth noting that these two

screenshots correspond to the same application. The design differences between these two views are a

result of the responsiveness of the application, which is responsible for resizing the application elements

according to the device size. The application also includes toggles in the Request and Response views, so as to free up

more space for the map view.

Figure 4.8: Application rendered on a desktop computer.

Figure 4.9: Application rendered on a mobile device.

4.4 Server Side Application implementation

The Server Side Application is the module that is responsible for producing a solution to a user request.

The architecture and implementation details of the SSA are presented in section 4.4.1. Each request

processed by the SSA is defined by a set of parameters that specify the aimed trip, as will be described

in section 4.4.2. Producing a solution to a user request also involves the communication with third

party APIs, so as to obtain the necessary flight data, which is handled by the Data Management System,

described in section 4.4.3. The actual production of a solution is managed by the Optimization System,

described in section 4.5.

4.4.1 SSA dataflow

As defined in section 4.1.2, the goal of the Server Side Application is to serve the CSA with the solution

to FTP requests. Its corresponding API is publicly accessible² and is used by the CSA to communicate

with the SSA. Upon establishing a connection, the Uniform Resource Identifier (URI) of the FTP request

is used to identify a particular resource corresponding to the set of user selected input arguments.

This is discussed in more detail in subsection 4.4.2.

² The developed API has the following URL: https://safe-plateau-51528.herokuapp.com/documentation

To construct a solution to a given request, it is necessary to sequentially execute a particular set of

steps, denoted as the main cycle of the SSA, which can be summarized as follows:

1. generate the list of necessary flights;

2. request flight data from third-party API;

3. construct the weight matrix for the objective functions;

4. execute the optimization algorithms;

5. construct a solution to the request.

Each step of the main cycle is executed by a particular set of functions. Steps number 1), 3) and 5)

are managed by a predefined Python class, which implements the logic associated with each FTP request.

This class is also responsible for the communication with the DMS and OS, which execute steps number

2) and 4), respectively. It should be noted that the execution of the optimization step (4) is only required

for multi-city flight requests, since, for the single-flight and round trip, determining the best solution is a

trivial task.
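The main cycle described above can be sketched as follows. This is only a minimal illustration: the class name `FTPRequest`, its methods, the two callables standing in for the DMS and the OS, and the toy `brute_force` optimizer are all hypothetical and are not taken from the actual SSA code.

```python
from itertools import permutations


class FTPRequest:
    """Hypothetical container for one FTP request (names are illustrative)."""

    def __init__(self, origin, cities, fetch_prices, optimize):
        self.origin = origin
        self.cities = list(cities)
        self.fetch_prices = fetch_prices  # stands in for the DMS (step 2)
        self.optimize = optimize          # stands in for the OS (step 4)

    def necessary_flights(self):
        # Step 1: every ordered pair of distinct nodes is a candidate flight.
        nodes = [self.origin] + self.cities
        return list(permutations(nodes, 2))

    def solve(self):
        flights = self.necessary_flights()           # step 1
        prices = self.fetch_prices(flights)          # step 2 (DMS)
        # Step 3: weights, here a dict keyed by (origin, destination).
        weights = {f: prices[f] for f in flights}
        if len(self.cities) > 1:
            order = self.optimize(self.origin, self.cities, weights)  # step 4
        else:
            order = list(self.cities)  # single flight / round trip: trivial
        # Step 5: turn the visiting order into a closed itinerary.
        route = [self.origin] + order + [self.origin]
        legs = list(zip(route, route[1:]))
        return legs, sum(weights[leg] for leg in legs)


def brute_force(origin, cities, weights):
    # Toy optimizer used only in this example: exhaustive search.
    def cost(order):
        route = [origin] + list(order) + [origin]
        return sum(weights[leg] for leg in zip(route, route[1:]))
    return list(min(permutations(cities), key=cost))
```

In this sketch the optimization step is skipped when there is a single city to visit, mirroring the remark that single-flight and round-trip requests are trivial.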

The implemented SSA dataflow is illustrated in figure 4.10. By using the utilities offered by Django, the

SSA implements a server with a single endpoint, where the requests from the CSA are received. These

requests must be validated, in order to discard meaningless queries. This validation occurs immediately

after receiving the request. It uses regular expressions to parse the request, followed by an analysis of

the input arguments, according to the rules defined in table 4.2 of the following subsection. After the

validation, the resources are translated into an FTP request, and, in particular, into a Python object that

executes the previously described main cycle.

4.4.2 SSA interface protocol

Every CSA request may be defined as a FTP instance, as proposed in section 3.1. Thus, each request

is described by a limited number of attributes (origin, start date, etc.), and it may be uniquely identified

according to its uniform resource identifier (URI).

Hence, each flight resource is described by a predefined list of pairs of keywords and values. This is

illustrated in table 4.2, where the first and second column represent a FTP variable name and symbol,

respectively. In its turn, the third column defines the keyword for the URI, while the fourth column

provides a description about the acceptable values for each keyword.

It should be noted that the defined pairs of keywords and values must undergo a validation process,

not only to verify if the values provided for each keyword are acceptable, but also to assert that the

request is a valid FTP formulation.


Figure 4.10: Server Side Application dataflow

Table 4.2: Keyword/value pairs of the SSA protocol to uniquely identify a resource.

Name          Symbol    Keyword            Details
start city    v_0       flyFrom            City name or ICAO code.
return city   v_{n+1}   returnTo           City name or ICAO code.
destinations  V         cities             Cities to be visited. Accepts multiple cities, separated by commas. Accepts city name or ICAO code.
durations     D         duration           Length of stay (in days) in each city. Accepts positive integers. Must be the same length as the number of cities.
start date    T_0       minDate, maxDate   minDate is the earliest start date (T_0i), maxDate is the latest start date (T_0f). Accepts date format dd/mm/yyyy.
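As an illustration of this protocol, the sketch below parses and validates a query string built from the keywords of table 4.2. The function name, the error messages, the regular expressions and the assumption that durations are comma-separated (like the cities) are choices made for this example only; the rules enforced by the actual SSA may differ.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, urlencode

DATE_RE = re.compile(r"^\d{2}/\d{2}/\d{4}$")  # dd/mm/yyyy
KEYWORDS = ("flyFrom", "returnTo", "cities", "duration", "minDate", "maxDate")


def validate_query(query):
    """Return a dict of parsed values, or raise ValueError (illustrative)."""
    raw = {k: v[0] for k, v in parse_qs(query).items()}
    for key in KEYWORDS:
        if key not in raw:
            raise ValueError(f"missing keyword: {key}")
    cities = [c.strip() for c in raw["cities"].split(",") if c.strip()]
    durations = raw["duration"].split(",")  # assumed comma-separated
    if not all(re.fullmatch(r"[1-9]\d*", d) for d in durations):
        raise ValueError("durations must be positive integers")
    if len(durations) != len(cities):
        raise ValueError("one duration per city is required")
    dates = {}
    for key in ("minDate", "maxDate"):
        if not DATE_RE.match(raw[key]):
            raise ValueError(f"{key} must use dd/mm/yyyy")
        dates[key] = datetime.strptime(raw[key], "%d/%m/%Y").date()
    if dates["minDate"] > dates["maxDate"]:
        raise ValueError("minDate must not be later than maxDate")
    return {
        "flyFrom": raw["flyFrom"],
        "returnTo": raw["returnTo"],
        "cities": cities,
        "durations": [int(d) for d in durations],
        **dates,
    }


# Example query, built with the keywords of table 4.2.
query = urlencode({
    "flyFrom": "Lisbon", "returnTo": "Lisbon",
    "cities": "Madrid,Oslo", "duration": "2,3",
    "minDate": "01/11/2018", "maxDate": "15/11/2018",
})
params = validate_query(query)
```

Checking the value format first and the FTP consistency second (matching list lengths, ordered dates) follows the two validation goals stated above.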


4.4.3 Data Management System

The Data Management System (DMS) is responsible for the communication with third-party flight data

APIs. It allows the request of flight data on behalf of the CSA. This implementation of the DMS is based

on a low-level API wrapper, around the KIWI flight API, and it is constructed as a worker/consumer

factory to execute concurrent HTTP requests (see figure 4.11).

Given a list of necessary flights, usually requested by the CSA, the DMS communicates with

third-party APIs to request the relevant flight data. This communication relies on the HTTP protocol, and

uses the URI to identify a particular resource. This identification follows the protocol defined by the

third-party API [8]. A response to a request is given by a structured data tree, usually as a JSON data

type. Hence, each worker of the DMS worker/consumer factory executes the following tasks:

1. generate the URL for the intended resource;

2. execute an HTTP request to the given URL;

3. parse the response.

When communicating with third-party APIs, the bottleneck of the system is usually the time necessary

for the server to process the request and produce a response to it. The implemented worker/consumer

approach takes advantage of this bottleneck. Instead of waiting for a request to complete, this approach

uses the waiting period to spawn more requests. Thus, the waiting period associated to one request is

not imposed on the others. Despite this, executing multiple requests will increase the server's workload,

increasing the time necessary to respond to one (isolated) request. This approach will be discussed,

and its results illustrated, in subsection 5.2.4.
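The idea of overlapping waiting periods can be sketched with a thread pool. This is not the worker/consumer factory of the DMS: `ThreadPoolExecutor` from the Python standard library is used here as a stand-in, and `fetch_flight` is a hypothetical placeholder that sleeps instead of performing a real HTTP request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_flight(flight, delay=0.05):
    """Stand-in for one HTTP request; sleeps to mimic server latency."""
    time.sleep(delay)
    origin, destination = flight
    # No real data is fetched in this sketch.
    return {"from": origin, "to": destination, "price": None}


def fetch_all(flights, workers=8):
    # While one worker waits for its response, the others issue requests.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the results in the order of the input list.
        return list(pool.map(fetch_flight, flights))
```

With a single worker the total time is roughly the sum of the individual delays; with several workers the delays overlap, which is the effect the DMS relies on.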

Figure 4.11: The Data Management System uses concurrent HTTP requests to communicate with third-party flight APIs.


Table 4.3: Algorithm specific parameters.

Alg.  Parameter                               Value
ACO   Pheromone relative influence (α)        1
      Heuristic relative influence (β)        5
      Pheromone evaporation rate (ρ)          0.1
      Exploration rate (Q0)                   0.9
      Number of ants (m)                      10
SA    First iter. acceptance prob. (p0)       0.98
      Last iter. acceptance prob. (pf)        10^-300
      Initial temperature (t0)                see Eq. 3.12
      Final temperature (tf)                  see Eq. 3.11
      Cooling parameter (λ)                   see Eq. 3.13
      Markov chain length (M)                 N

4.5 Optimization system

Given a FTP request, the Optimization System (OS) is the module that determines the respective so-

lutions, using the strategies described in subsection 4.1.2.B. This module was implemented using the

Python3 programming language, and uses the numpy library [101] for the construction and management

of the multi-dimensional (and often very large) arrays, which describe an FTP instance.

The developed optimization system implements each of the three proposed optimization strategies:

the nearest neighbour heuristic, the ACO and the SA metaheuristics (section 3.5.2). The adopted imple-

mentation of the nearest neighbour algorithm closely follows the steps presented in subsection 2.2.2.B.

The remainder of this section will focus on the implementation details of the ACO and the SA

metaheuristics.

The implementation of the ACO follows the methodology described in subsection 3.5.2.A and is

illustrated in figure 4.12. This metaheuristic receives, as input, a data structure describing the FTP

requests, which includes the weight matrix and the constraints of the problem. This metaheuristic starts

by initializing the algorithm-specific parameters, using the values presented in table 4.3. It is also

necessary to initialize the pheromone matrix, by setting each entry to a value defined by equation 3.4.

Figure 4.12: Flowchart of the implemented Ant Colony System procedure.


This is followed by the initialization of all ants belonging to the colony. Each ant is implemented as

an independent thread, which allows for the colony to run in parallel. Thus, all ants construct a solution

to the problem in an autonomous way, using the transition rules described in equations 3.5, 3.6 and

3.7. At each step of the construction process, it is necessary to apply a local pheromone update, by

using equation 3.8, which decreases the probability of that solution component being selected by other

ants, in the same iteration. After each ant finishes the construction of a solution, it is necessary to perform

the pheromone update (equation 3.9), as to reflect the search experience. Given that each ant is an

independent thread, it is necessary to lock the pheromone matrix, as to protect it from being accessed

and manipulated by multiple threads at the same time.
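A compact sketch of a threaded colony is given below. Equations 3.4 to 3.9 are not reproduced here, so the code assumes the standard Ant Colony System rules from the literature (pseudo-random proportional transition, local update towards an initial value `tau0`, global update on the best tour). The function names, the value of `tau0`, the plain (time-independent) weight matrix, and the choice to apply the global update once per iteration are all assumptions of this example.

```python
import random
import threading


def tour_cost(tour, w):
    return sum(w[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def ant(w, tau, lock, tau0, results, alpha=1, beta=5, rho=0.1, q0=0.9):
    n = len(w)
    tour = [0]
    while len(tour) < n:
        i = tour[-1]
        cand = [j for j in range(n) if j not in tour]
        with lock:  # read a consistent view of the pheromone row
            score = {j: tau[i][j] ** alpha * (1.0 / w[i][j]) ** beta
                     for j in cand}
        if random.random() < q0:      # exploitation: best-scoring arc
            j = max(cand, key=score.get)
        else:                         # biased exploration
            j = random.choices(cand, weights=[score[c] for c in cand])[0]
        with lock:  # local update: make arc (i, j) less attractive
            tau[i][j] = (1 - rho) * tau[i][j] + rho * tau0
        tour.append(j)
    with lock:
        results.append(tour)


def acs(w, iterations=20, m=10, rho=0.1, tau0=0.01):
    n = len(w)
    tau = [[tau0] * n for _ in range(n)]
    lock = threading.Lock()
    best = list(range(n))
    for _ in range(iterations):
        results = []
        threads = [threading.Thread(target=ant,
                                    args=(w, tau, lock, tau0, results))
                   for _ in range(m)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        best = min(results + [best], key=lambda t: tour_cost(t, w))
        # Global update on the arcs of the best tour found so far.
        gain = 1.0 / tour_cost(best, w)
        for a, b in zip(best, best[1:] + best[:1]):
            tau[a][b] = (1 - rho) * tau[a][b] + rho * gain
    return best, tour_cost(best, w)
```

The lock guards every read and write of the shared pheromone matrix while the ants run; the global update happens after all threads have joined, so it needs no lock in this sketch.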

In its turn, the implementation of the SA follows the methodology described in subsection 3.5.2.B, as

it is illustrated in figure 4.13. As it happens with the ACO, the SA metaheuristic receives as input the

relevant data to describe the FTP, and initializes its algorithm-specific parameters according to table

4.3. This is followed by the determination of the cooling schedule. To do so, it is necessary to construct

a number of solutions, as to calculate the average difference in the objective function, which allows the

calculation of the initial temperature (equation 3.12) and cooling parameter (equation 3.13).
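One common way of deriving such a schedule is sketched below. Equations 3.11 to 3.13 are not reproduced in this section, so the formulas used here are assumptions: the temperatures are chosen so that a move that worsens the objective by the average difference is accepted with probability p0 at the start and pf at the end, and the temperature is reduced geometrically. The function name and the `n_steps` parameter are hypothetical.

```python
import math


def cooling_schedule(avg_delta, n_steps, p0=0.98, pf=1e-300):
    """Derive (t0, tf, lam) from acceptance probabilities (assumed formulas)."""
    t0 = -avg_delta / math.log(p0)      # exp(-avg_delta / t0) == p0
    tf = -avg_delta / math.log(pf)      # exp(-avg_delta / tf) == pf
    lam = (tf / t0) ** (1.0 / n_steps)  # geometric cooling: t <- lam * t
    return t0, tf, lam
```

Under these assumptions, applying the cooling factor `n_steps` times takes the temperature from `t0` to `tf`.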

Figure 4.13: Flowchart of the implemented Simulated Annealing procedure.

After this initialization step, the SA system starts the optimization cycle, by constructing a new

candidate solution y based on the current solution x, using a 2-opt move, as described in section 2.2.2.C. To

accept the candidate solution y, it is necessary to apply the Metropolis acceptance criterion (see

equation 3.10). Unlike the ACO implementation, in which the solution construction process is parallel,

the developed Markov cycle follows a serial approach, given that the candidate solution y is always

constructed based on the current solution x. If there were multiple threads constructing solutions at the

same time, the current solution would be an uncertain entity. After completing a cycle, the tempera-

ture of the system is reduced, by applying equation 3.11. As the temperature decreases, the algorithm

becomes increasingly greedy and converges to a local optimum.
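The serial optimization cycle can be sketched as follows. The code assumes the usual Metropolis rule (accept a worse candidate with probability exp(-Δ/t)) and a geometric temperature reduction; since equations 3.10 and 3.11 are not shown here, both are assumptions, as are the function names and the plain weight matrix. Because the weights are asymmetric, the cost of each candidate is simply recomputed in full.

```python
import math
import random


def tour_cost(tour, w):
    return sum(w[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def two_opt_move(tour):
    # Reverse a random segment, keeping the start node fixed at position 0.
    i, j = sorted(random.sample(range(1, len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def simulated_annealing(w, t0, lam, cycles, chain_length):
    x = list(range(len(w)))
    fx = tour_cost(x, w)
    best, fbest = x, fx
    t = t0
    for _ in range(cycles):
        for _ in range(chain_length):  # Markov chain at temperature t
            y = two_opt_move(x)
            fy = tour_cost(y, w)
            # Metropolis: always accept improvements, sometimes accept worse.
            if fy <= fx or random.random() < math.exp((fx - fy) / t):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = x, fx
        t *= lam  # cool down after each completed cycle
    return best, fbest
```

Each candidate `y` is derived from the current `x`, which is why the inner loop is inherently sequential.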

Both metaheuristics have a stop criterion that is evaluated at the end of each solution construction

cycle (the ants construction procedure, and the Markov cycle). This criterion is either set to a maximum

number of iterations, or to a maximum allowable execution time. When the stop criterion is reached, these

heuristics return the best solution found so far.

It is worth noting that the solutions returned by the ACO and the SA modules correspond to a set

of solution components, which define both the pair of cities and the schedule of the arc traversal. If

the problem under resolution is a user request issued by the CSA, it is necessary to construct the set

of flights which characterize the solution to the FTP request. Thus, the SSA must map each solution

component to a particular flight, as to produce the itinerary for the requested trip.

4.6 Summary

This chapter presented the design and implementation details of the proposed flight search web

application. The first part, covering the sections related to the design process, presented the considered

architecture for the web application, which separates the user interface from the logic associated with the

resolution of FTP requests. It also established a design goal for the user interface, and proposed an

architecture for the optimization algorithm, so as to reduce the latency that is sensed by the users. The

remainder of the chapter provided the implementation details regarding these topics.

5 Experimental Results

Contents

5.1 Optimization module evaluation

5.2 Flying Tourist Problem evaluation

5.3 Comparison with Kiwi's Nomad

In order to validate and evaluate the performance of the developed system, several tests were devel-

oped and executed. First, by using a set of benchmarks, the implemented optimization algorithms are

tested as to evaluate their overall quality (section 5.1). Then, the overall utility of the proposed system is

evaluated, by performing a series of tests on the Flying Tourist Problem (section 5.2). Finally, a thorough

comparison with the only known state-of-the-art alternative for the devised FTP is performed (section

5.3). This is achieved by considering a comprehensive set of real-world multi-city flight problems, using

different objective functions so as to evaluate the system's performance.

These experiments were executed on a 2.6GHz Intel i7-6700 CPU, with 8GB of RAM, and all the

code was developed using the Python3 programming language.

5.1 Optimization module evaluation

The main difficulty behind the aimed evaluation of the developed optimization module results from the

absence of publicly available FTP benchmarks (with a priori known optimal solutions) that could be used

to validate the obtained results. To circumvent this adversity, it was decided to conduct such evaluation

using closely-related NP-hard problems, such as the Asymmetric TSP (subsection 5.1.1) and Time-

Dependent TSP (subsection 5.1.2).

5.1.1 Asymmetric TSP evaluation

The set of benchmarks (and respective input data) considered for the Asymmetric TSP were collected

from the publicly available TSPLib and correspond to problems with 17, 35, 53, 124 and 323 nodes

(br17, ftv35, ft53, kro124p, rbg323, respectively). The adoption of such graph dimension ranges allows

the evaluation of the developed optimization module with different complexity levels: small instances (17

nodes), corresponding to a dimension more closely-related to the targeted flying tourist application sce-

nario; medium instances (35-53 nodes), used to evaluate more computationally demanding scenarios;

and large instances (124-323 nodes), used to evaluate the scalability of the developed system.

Each benchmark was independently solved by the two considered optimization algorithms (ACO and

SA). The execution of each problem was repeated 5 times and the obtained results were averaged.

Furthermore, since there are strict limitations on the maximum time a user is willing to wait for a

result in a real-world application, a stop criterion was adopted to limit the total optimization time.

Accordingly, during the execution of these tests, each algorithm may run for no longer than 60 seconds.

The results of both meta-heuristic algorithms on the considered set of asymmetric benchmarks are

presented in Table 5.1.

For small instances (17 nodes), both ACO and SA consistently present optimal solutions. For medium

instances (35-53 nodes), both algorithms perform in the range of 5-16% relative error. As for bigger


problems (124-323 nodes), the performance of the ACO slightly decreases (17-25%), followed closely

by the SA (22-29%).
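For reference, the relative error quoted in these results is assumed here to follow the usual definition, i.e., the percentage by which the computed cost exceeds the optimal one. The helper name is hypothetical; the two calls use the ftv35 values reported in Table 5.1.

```python
def relative_error(computed, optimal):
    """Relative error, in percent, of a computed cost w.r.t. the optimum."""
    return 100.0 * (computed - optimal) / optimal


print(round(relative_error(1552.4, 1473), 2))  # ACO on ftv35 -> 5.39
print(round(relative_error(1641, 1473), 2))    # SA on ftv35  -> 11.41
```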

By observing the number of iterations that were executed during the 60-seconds interval, it is clear

that the SA is much faster than the ACO, performing 38 to 68 times more iterations in the same time

interval. However, this greater number of iterations does not seem to directly contribute to a better final

result of the SA. This happens because the ACO search strategy is guided, taking into account the

previous search experience in the selection of the next solution. This does not occur with the classical

implementation of the SA, which utilizes a simple and non-guided local search.

On the other hand, while the considered stop-criteria value may be sufficient to reach the meta-

heuristic stationary result for small problem instances, meaning that continuing the optimization further

may not affect the final error in a significant way, the higher relative error and lower number of iterations

for bigger problem instances suggest that improvements may still occur.

To analyze the advantages of further optimization, all problems were solved once more, by using

the same procedure previously described, this time for a total of 300 seconds. The obtained results are

presented in Table 5.2, which allow the comparison of the final result as a function of the execution time.

Columns RE60 and RE300 present the relative error after 60 and 300 seconds, respectively. As it can be

seen, increasing the optimization time leads to an improvement in the final result for both metaheuristics

and for almost every benchmark problem. These improvements are more significant for bigger problem

instances, and affect the small instances only slightly.

Hence, the observed performance of the implemented metaheuristic algorithms on the asymmetric

TSP, for the level of complexity of the targeted flying tourist application scenario, showed to be highly

promising, leading to final solutions which are either optimal or whose relative error is close to 10%.

Table 5.1: Performance of the ACO and SA on the asymmetric TSP benchmarks, taken from the TSPLib (stop-criteria = 60 seconds).

Alg.  Benchmark  #Iterat.  Optimal solution  Comput. solution  Error [%]
ACO   br17       6893      39                39                0
      ftv35      1469      1473              1552.4            5.39
      ft53       689       6905              8269.6            16.13
      kro124p    275       36230             44102             17.00
      rbg323     20        1326              1660.4            25.21
SA    br17       264319    39                39                0
      ftv35      63659     1473              1641              11.41
      ft53       36189     6905              7963              15.32
      kro124p    17533     36230             44102             21.73
      rbg323     1361      1326              1706.2            28.67


Table 5.2: Effects of increased optimization time on the final result.

             ACO              SA
Benchmark    RE60    RE300    RE60    RE300
br17         0       0        0       0
ftv35        5.39    4.75     11.41   11.73
ft53         16.13   13.71    15.32   13.20
kro124p      17.00   13.91    21.73   17.21
rbg323       25.21   18.78    28.67   10.27

5.1.2 Time-dependent TSP evaluation

Due to the absence of standardized benchmarks for the time-dependent case in the TSPLib, it was

necessary to define specific problems (whose optimal solutions are known a priori) by using a

method described in [102], based on the duality principle over the Integer Linear formulation for the

time-dependent TSP. The defined benchmarks were constructed as to have the same dimensions of

those used for the Asymmetric TSP, and their names were appended with an asterisk suffix to distinguish

from the asymmetric TSP case.

Executing the same evaluation strategy on the time-dependent TSP problems leads to the results

presented in Table 5.3. Once again, it is possible to compare the efficiency of the ACO and the SA, by

evaluating the relative error of each set of problems.

For small instances (17 nodes), ACO and SA present a small relative error, ranging from 7% to

11%. For medium instances (35-53 nodes), the relative error decreases for both algorithms, ranging

from 4% to 7%, with SA offering better results than ACO. This may occur because the high number

of performed iterations in small and medium instances leads to an intensive search space exploration.

However, when bigger problems (124-323 nodes) are considered, the performance of the ACO increases

(possibly due to its guided search exploration), reducing the relative error to close to 3% and surpassing

that of the SA.

Table 5.3: Performance of the ACO and SA on the time-dependent TSP benchmarks (stop-criteria = 60 seconds).

Alg.  Benchmark  #Iterat.  Optimal solution  Comput. solution  Error [%]
ACO   br17*      9720      2458              2729              11.03
      ftv35*     2590      5131              5500              7.20
      ft53*      1099      7930              8370              5.53
      kro124p*   71        25483             26402             3.61
      rbg323*    13        48991             50261             2.59
SA    br17*      219517    2458              2631              7.04
      ftv35*     80262     5131              5406              5.38
      ft53*      37450     7930              8265              4.23
      kro124p*   3521      25483             26427             3.70
      rbg323*    901       48991             51926             5.99


In any case, the performance of these two algorithms on the time-dependent TSP is highly promising,

leading to final solutions whose relative error is below 10% in all but one case (the ACO on br17*, at 11.03%).

5.1.3 Discussion

A precise interpretation of the factors involved in these results is not straightforward. Still, the following

conclusions seem plausible. When comparing the time-dependent TSP with the asymmetric TSP, the

former is usually expected to be more heavily constrained than the latter. In those cases, the structure of the

problem itself simplifies the search for the overall optimum, as bad solutions are much easier to identify.

In particular, both ACO and SA seem to exploit this property well, as can be inferred from the fact that, for

the corresponding benchmarks, the time-dependent version obtains a better relative error rate with fewer

iterations. Moreover, the better relative error rates themselves seem to be problem induced, meaning

that the asymmetric TSP contains big gaps in the objective function. This happens, in particular, from

the optimal value to a close-by maximum, whereas the time-dependent TSP contains smaller gaps and

a higher concentration of near maximums.

5.2 Flying Tourist Problem evaluation

To demonstrate and quantify the actual benefits of the proposed system, a series of FTP instances were

defined, ranging from just 1 city to visit (which corresponds to a round-trip), up to a total of 20 cities. For

each problem instance, several solutions were obtained using four different approaches:

• a pseudo-random approach, i.e., a randomly generated closed tour that connects all the nodes;

• a metric nearest-neighbor heuristic that promotes the nodes proximity to define the traveling route

(this approach closely approximates the strategy usually followed by a human solver);

• a regular nearest-neighbor heuristic using the flight price as the objective function;

• the proposed ACO and SA meta-heuristics (where the best of these two results is chosen), con-

sidering two different objective functions (minimization of the total cost and of the entropy).
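Both nearest-neighbour baselines listed above share the same greedy skeleton and differ only in the matrix they are given: distances between cities for the metric variant, flight prices for the regular one. The sketch below uses a plain static matrix and a hypothetical function name; it ignores the dates and waiting periods that the actual FTP instances carry.

```python
def nearest_neighbour(weights, start=0):
    """Greedy tour: from each node, move to the cheapest unvisited node."""
    n = len(weights)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        current = tour[-1]
        nxt = min(unvisited, key=lambda j: weights[current][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour + [start]  # close the tour at the start node
```

The heuristic never revisits a decision, which is why it is fast but can be improved upon by the metaheuristics.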

In this experiment, the number of nodes was varied and multiple requests (more than 5) were made

for each set of nodes, averaging their results. In every case, the trip starts and returns to Lisbon (Portu-

gal) and visits a given set of cities, randomly chosen from the following set: Abuja, Atlanta, Barcelona,

Beijing, Cairo, Casablanca, Dubai, Dublin, Frankfurt, Hong Kong, Istanbul, Johannesburg, Kiev, Los

Angeles, Madrid, Miami, Moscow, New York (JFK), Oslo, San Francisco, Sydney, Singapore. The start date

was set to be the same for all requests (1 November 2018), which, upon the execution of the tests, was

50 days into the future. The waiting period in each city was set to a random value between 1 and 5

days.

Table 5.4: Comparison of different Flying Tourist Problem solutions obtained with distinct algorithms and optimization criteria, considering the Metric Nearest Neighbor approach (shaded gray) as reference.

PRICE
#Nodes                     1             3              5              7              10             15             20
Random                     635           1455 (+1.2%)   2194 (+7.0%)   3436 (+26.6%)  4791 (+28.8%)  7222 (+63.5%)  9154 (+68.8%)
Metric Nearest Neighbor    635           1438           2051           2715           3721           4417           5422
Regular Nearest Neighbor   635           1438           1993 (-2.8%)   2553 (-6.0%)   3412 (-8.3%)   3911 (-11.5%)  4678 (-13.7%)
Proposed (Price)           635           1398 (-2.8%)   1727 (-15.8%)  1911 (-29.6%)  2466 (-33.7%)  3051 (-30.9%)  3699 (-31.8%)
Proposed (Entropy)         876 (+38.0%)  1761 (+22.5%)  2203 (+7.4%)   2749 (+1.3%)   3687 (-0.9%)   4123 (-6.7%)   4707 (-13.2%)

DURATION
#Nodes                     1             3              5              7              10             15             20
Random                     61            125 (+5.0%)    183 (+3.4%)    270 (+31.7%)   369 (+48.2%)   497 (+63.0%)   638 (+92.7%)
Metric Nearest Neighbor    61            119            177            205            249            305            331
Regular Nearest Neighbor   61            119            181 (+2.3%)    212 (+3.4%)    257 (+3.2%)    323 (+5.9%)    358 (+8.2%)
Proposed (Price)           61            121 (+1.7%)    151 (-14.7%)   179 (-12.7%)   258 (+3.6%)    292 (-4.3%)    319 (-3.6%)
Proposed (Entropy)         29 (-52.5%)   57 (-52.1%)    82 (-53.7%)    92 (-55.1%)    104 (-58.2%)   140 (-54.1%)   160 (-51.7%)

5.2.1 Quantitative evaluation and improvement

The result of the execution of the described test is summarized in Table 5.4, where both the total flight

cost and duration are presented, as a function of the optimization approaches and the number of nodes.

This table also presents the observed improvement, relative to the metric nearest-neighbor solution,

which is used as reference due to its proximity to the human search approach.

The first insight into these results allows a preliminary evaluation of the utility of the proposed system.

It can be seen that, for a small number of nodes (1 and 3), the metric and regular nearest neighbour

present the same results. However, for larger instances (5-20 nodes), the metric nearest neighbour

starts to present worse results when compared to the regular variant. The regular variant performs

better (reducing the cost by approximately 3% to 14%) because it always selects the node with the lowest

cost. Despite the positive results presented by the regular nearest neighbour approach, the proposed

meta-heuristics are still capable of improving them. The metaheuristic approach enables the cost to be

reduced by an extra 3% to 25%, when compared to the regular nearest neighbour, and up to a total of

34%, when compared to the metric one.
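The percentages shown in Table 5.4 can be reproduced with a one-line helper (the function name is hypothetical). The example uses the price row for 5 nodes, with the metric nearest neighbour (2051) as reference.

```python
def improvement(value, reference):
    """Signed change, in percent, of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference


# 5 nodes, price: metric NN = 2051, regular NN = 1993, proposed = 1727
print(round(improvement(1993, 2051), 1))  # -2.8
print(round(improvement(1727, 2051), 1))  # -15.8
```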

5.2.2 Balancing the total flight price and duration

It is possible to execute the proposed metaheuristics with different objective functions, as proposed in

section 3.1. In particular, it is possible to introduce both the price and flight duration in the objective

function (see equation 3.3). With this approach, a better balance between price and flight duration is

envisaged, by minimizing an entropic metric defined as a weighted value, where the price and flight

duration contribute with 70% and 30%, respectively.
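A weighted objective of this kind can be sketched as below. Equation 3.3 is not reproduced in this section, so the exact form is an assumption: here the price and the duration are each divided by a reference value (to make them comparable) before being combined with the 70%/30% weights. The function and parameter names are hypothetical.

```python
def entropy(price, duration, price_ref, duration_ref,
            w_price=0.7, w_duration=0.3):
    """Weighted combination of normalized price and duration (assumed form)."""
    return w_price * price / price_ref + w_duration * duration / duration_ref


# Itinerary A is the reference; B costs 20% more but takes half the time.
score_a = entropy(1000, 100, price_ref=1000, duration_ref=100)
score_b = entropy(1200, 50, price_ref=1000, duration_ref=100)
```

With these weights, itinerary B scores slightly lower than A (about 0.99 against 1.0), so a minimizer of this metric would accept the higher price in exchange for the shorter duration.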

As it can be observed in Table 5.4, such an entropic minimization leads to significant improvements

in the flight duration, although it introduces some penalization in the flight price for a small number of

nodes (1-5). However, as the number of nodes increases, the flight price decreases significantly relative to the reference, but

it is always higher than that provided by the former price-only meta-heuristics. This happens because

the two objective functions (price and duration) can hardly be simultaneously minimized, and thus, a

compromise has to be reached. In this case, compromising means slightly increasing the price, to

significantly reduce the duration.

Figure 5.1: Variation of the total flight price and duration when minimizing the entropy objective function.

This compromise between flight price and duration is also illustrated in Fig. 5.1, which presents the

relative duration improvement as a consequence of the increase in price. This figure shows that, in

general, increasing the price by around 20% leads to a decrease in the flight duration by around 50%,

when compared to the price-only metaheuristic approach.

5.2.3 Impact of the trip start interval

To evaluate the influence of the trip start interval on the obtained results, the same queries and data

sets were used to solve these FTPs using start periods of different lengths. These results are illustrated

in Fig. 5.2, where NN refers to the metric nearest neighbour heuristic and M−1, M−15 and M−31

represent the proposed meta-heuristic algorithms, with start periods of length 1, 15 and 31 days,

respectively.

The analysis of Fig. 5.2 shows that increasing the interval of the start date may lead to great im-

provements, with flight price reductions as high as 15%, even for medium size instances with up to 20

nodes.

5.2.4 Response time

The response time to a request depends mostly on two different procedures: i) the data gathering,

which is handled by the DMS; and ii) the optimization, handled by the OS. This subsection evaluates

the response time to a given request, by comparing the relative influence of these two steps.


Figure 5.2: Price improvement as a function of the trip start interval.

To collect the necessary flight data for a request, it is necessary to communicate with third-party

APIs. The module responsible for this is the DMS, which implements a concurrent architecture to make

HTTP requests. Figure 5.3 illustrates the required time to receive the response to 100 flight queries,

using the KIWI flight API. By varying the number of threads, it also shows the speed-up obtained by

implementing the concurrent system.

Figure 5.3: Required time to perform 100 HTTP requests to a third-party API, as a function of the number of concurrent requests.

Given a serial approach, which corresponds to the case in which there is only one worker thread,

the system requires approximately 100 seconds to perform all queries. Thus, the time for the remote

API server to respond to one query is, on average, one second. It is possible to increase the number

of requests per second, by opting for a concurrent approach. Figure 5.3 shows that by performing two

concurrent requests, the response time drops to half. As the number of concurrent requests increases,

the response time decreases.

However, this decrease is not always linear. While the first concurrent requests have a very positive

effect in the reduction of the response time, this behavior eventually reaches a stagnation point, in which

continuing to increase the number of concurrent requests has a negative effect. Thus, the number of

concurrent requests should be chosen sensibly. Despite this, the proposed DMS allows

the collection of 100 requests in less than 5 seconds, which is over 20 times faster than the serial

approach.

Figure 5.4: Total response time to a request, as a function of the number of nodes and length of start period.

After collecting the necessary set of flights, the proposed system determines the response to a

request by running the appropriate optimization algorithms. Each request is solved using the nearest

neighbour heuristic, and the SA and ACO metaheuristics. For each request, it is necessary to run the

optimization algorithms for a total of three times, one for each objective function (price, flight duration

and entropy). While the nearest neighbour runs until a solution is found, it is possible to define multiple

stop criteria for the metaheuristics, as it was referred in section 5.1. In this particular evaluation, it was

defined that each optimization algorithm may run for a maximum of 1 second, or 10,000 iterations.
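A combined stop criterion of this kind (whichever limit is reached first) can be sketched as follows. The helper name and its interface are hypothetical; `step` stands for one solution construction cycle of either metaheuristic.

```python
import time


def run_until(step, max_seconds=1.0, max_iterations=10_000):
    """Call step() until either limit is hit; return the iteration count."""
    deadline = time.monotonic() + max_seconds
    iterations = 0
    while iterations < max_iterations and time.monotonic() < deadline:
        step()
        iterations += 1
    return iterations
```

The limits are checked only between cycles, so a cycle that has already started is always allowed to finish.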

As a result, the total time that is necessary to respond to a request, as a function of the number of

nodes and length of the start period, is illustrated in figure 5.4. The analysis of this figure shows that

requests with up to 10 nodes are solved in less than 60 seconds. It also shows that the response time

increases non-linearly as the number of nodes increases. On the other hand, increasing the length of the start interval has little influence on small instances (up to 10 nodes), but has a significant impact on larger instances.

5.3 Comparison with Kiwi's Nomad

At the present time, Kiwi's Nomad is the only (non-disclosed) tool that is capable of addressing the formalized Flying Tourist Problem in the form of an unconstrained multi-city routing problem, although it is limited to 10 different nodes. To facilitate the comparison of the conceived optimization system with this tool, the definition of the user requests of the proposed FTP (see section 3.1) was kept as similar as possible to the interface of Kiwi's Nomad. The user is asked to specify the departing/arriving city, together with the start date, the set of cities to be visited and the duration of the stay in each city.


The results provided by both applications were directly compared against each other, according to each considered objective function. The difference in the total flight price and duration (for each query) was also measured and analyzed as a function of the query parameters. The former evaluation is called the absolute comparison (subsection 5.3.1), and the latter the quantitative evaluation (subsection 5.3.2).

The execution of these tests involved over 100 different queries, varying not only the number of nodes (2-10), but also the length of the trip start interval (1-15 days). All queries performed on both applications had their start and return city set to Lisbon (Portugal), while each city to be visited belongs to the same set of hub airports that was considered in the previous subsections. These queries were executed on 15 and 16 June 2018, and the base start date was set to 1 August 2018, which, at the time of the tests, was approximately 45 days in the future. The staying period in each city was set to a random value between 1 and 5 days. For extended start periods, the start interval was extended to 31 days from the base start date.

5.3.1 Absolute comparison

Both applications respond to each query with three different sets of flights, one for each of the following optimization criteria: the cheapest, the fastest and the recommended. For each query, a winner was determined as follows. The cheapest set of flights is judged by the total flight price, while the fastest is judged solely by the total flight duration. The recommended set of flights depends on both the price and the duration, and the winner for this criterion must have both a lower price and a shorter duration. While the optimization criteria used by Kiwi are not known, it is reasonable to assume that the only criterion for the cheapest set of flights is its price. For the same reason, the flight duration is assumed to be the only criterion for the fastest set of flights. However, for the recommended set of flights, both the price and the flight duration should be considered, and it is also possible to include other optimization criteria, giving each a heuristic weight (e.g., the departure time may be considered a relevant factor for the recommended flight).
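The winner rules described above can be written down as a short Python sketch. The names `Itinerary` and `compare` are hypothetical. Declaring a tie whenever neither recommended result is better on both price and duration is an assumption, since the text only states the condition for winning. The notion of "very similar" responses is not modelled.

```python
from typing import NamedTuple


class Itinerary(NamedTuple):
    price: float     # total flight price of the set of flights
    duration: float  # total flight duration of the set of flights


def compare(ours, theirs, criterion):
    """Return "ours", "theirs" or "tie" for one query and one criterion."""
    if criterion == "cheapest":
        a, b = ours.price, theirs.price
    elif criterion == "fastest":
        a, b = ours.duration, theirs.duration
    elif criterion == "recommended":
        # The winner must be better on both price and duration.
        if ours.price < theirs.price and ours.duration < theirs.duration:
            return "ours"
        if theirs.price < ours.price and theirs.duration < ours.duration:
            return "theirs"
        return "tie"  # assumption: no winner when neither dominates
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    if a < b:
        return "ours"
    if b < a:
        return "theirs"
    return "tie"
```

Counting the outcomes of `compare` over all queries, for each criterion, gives the kind of tally that a comparison chart would display.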

Figure 5.5: Comparison of the results provided by the proposed tool and by Kiwi's Nomad application.

Fig. 5.5 illustrates the obtained comparison, by presenting the total number of times that one application outperformed the other, for each of the three different optimization criteria. It also shows the number of cases in which the responses were very similar.

The analysis of this figure indicates that the developed application presents better solutions for a significant number of queries. In fact, while it achieves the fastest set of flights in only 42% of the queries, it presents the cheapest set of flights 95% of the time and the best recommended result 75% of the time.

The developed application consistently presents high-quality results for two of the three objective functions. However, it does not perform particularly well in the minimization of the total flight duration. This occurs because of an implementation detail in the DMS. Upon receiving a list of flights for a query, the DMS keeps only a subset of these flights, so as to reduce the required amount of memory. This subset always consists of the cheapest flights. However, fast flights are, in general, expensive. Thus, when the subset is selected, the most promising solution components for the minimization of the total flight duration are discarded.
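A small example makes this effect concrete. The sketch below is not the DMS code: `Flight`, `keep_cheapest` and the flight data are made up for illustration, and the subset size `k` is an assumed parameter.

```python
from typing import NamedTuple


class Flight(NamedTuple):
    code: str      # hypothetical flight identifier
    price: float   # price in an arbitrary currency
    duration: int  # flight duration in minutes


def keep_cheapest(flights, k):
    """Return the k cheapest flights, mimicking a price-only pruning step."""
    return sorted(flights, key=lambda f: f.price)[:k]


# Made-up flights for one origin/destination pair on one date.
FLIGHTS = [
    Flight("A", 50.0, 300),
    Flight("B", 60.0, 280),
    Flight("C", 90.0, 150),
    Flight("D", 200.0, 95),
]

kept = keep_cheapest(FLIGHTS, k=2)
fastest_overall = min(FLIGHTS, key=lambda f: f.duration)
fastest_kept = min(kept, key=lambda f: f.duration)
```

Here the fastest flight overall (`D`) is removed by the pruning step, so the best duration that any optimization algorithm can reach afterwards is that of flight `B`.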

5.3.2 Quantitative evaluation

To evaluate the difference between the responses provided by both applications, the total flight price and duration of the recommended set of flights were also quantitatively measured (see Fig. 5.6). The values presented in these graphs refer to the response of the developed application and were normalized using the response of Kiwi's Nomad as the reference.
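The exact normalization formula is not stated here. A plausible reading, assumed in the sketch below, is the ratio between the value obtained by the developed application and the corresponding value of the reference response. The function names are hypothetical.

```python
def normalized(value, reference):
    """Return value / reference; results below 1.0 are lower than the reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


def percent_change(value, reference):
    """Relative difference in percent; negative means lower than the reference."""
    return 100.0 * (normalized(value, reference) - 1.0)
```

Under this assumption, a recommended set of flights costing 90 against a reference of 100 has a normalized price of 0.9, i.e. it is 10% cheaper.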

Figure 5.6: Comparison of the recommended flights' price and duration, as a function of the number of nodes and the length of the start interval: (a) single start date; (b) extended start period (31 days). The presented values refer to the proposed application response, and were normalized with respect to Kiwi's Nomad response value.

Figure 5.6a presents the results of the queries performed for a single start date. Its analysis shows that, for a small number of nodes (2 and 3), the developed application recommends flights that are slightly more expensive (≈ 10%-19%) but have a much lower flight duration (≈ 33%-46%). For requests with more nodes (5 to 10), the results presented by the developed application have both lower prices (≈ 2%-18%) and shorter flight durations (≈ 9%-24%).

Figure 5.6b depicts the obtained results when the length of the start interval was extended to 31 days. With such an extended start period, every recommended set of flights provided by the proposed application has a lower price and duration. The price presents the most significant change: the minimum improvement is 8%, while the maximum is 29%.

Finally, it is worth noting that all the presented experiments only consider up to 10 different cities to be visited by the traveler. The reason why more nodes were not considered arises not from the developed application (which could easily accommodate more cities), but from a strict limit imposed by Kiwi's Nomad application, which does not support more than 10 nodes in the planned route.


6 Conclusions

Contents

6.1 Conclusions

6.2 Achievements

6.3 Future work

6.1 Conclusions

The present work was developed with the aim of simplifying the planning and scheduling of complex

trips. In particular, its main goal is to find the best schedule, route and set of flights, for any given flight

request. This includes the resolution of the unconstrained multi-city flight routing problem.

Despite its proximity to the Traveling Salesman Problem (TSP), the problem addressed here has several attributes that distinguish it from the TSP and its time-dependent variation. These differences led to the proposal of a formal definition of the problem, named the Flying Tourist Problem (FTP), that better describes its characteristics.

In order to solve the problem in an efficient manner, the presented work proposes different optimization strategies that address different goals. The first goal relates to the ability of the system to present an initial solution in a very short time. This is achieved by implementing simple heuristics, such as the nearest neighbour. The second goal considers the ability of the system to produce high-quality solutions. To achieve this, this work implements two different meta-heuristic optimization algorithms: the Ant Colony Optimization and the Simulated Annealing. The final goal of the system is to produce different solutions for different objective functions. To achieve this, the optimization system is run multiple times, using different representations of the problem, according to the objective function under consideration.
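As a reminder of how the simplest of these strategies operates, the following Python sketch applies the nearest neighbour heuristic to a static cost matrix. This is a deliberate simplification: in the FTP the cost of a flight depends on the departure date, which is ignored here. The function names and the cost matrix are hypothetical.

```python
def nearest_neighbour_tour(cost, start=0):
    """Build a closed tour by always moving to the cheapest unvisited node."""
    unvisited = set(range(len(cost))) - {start}
    tour = [start]
    current = start
    while unvisited:
        nxt = min(unvisited, key=lambda j: cost[current][j])
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    tour.append(start)  # return to the departure city
    return tour


def tour_cost(cost, tour):
    """Sum the cost of every consecutive pair of nodes in the tour."""
    return sum(cost[a][b] for a, b in zip(tour, tour[1:]))


# Made-up symmetric costs between four cities (node 0 is the home city).
COST = [
    [0, 10, 50, 40],
    [10, 0, 20, 60],
    [50, 20, 0, 30],
    [40, 60, 30, 0],
]
```

Running the same routine once per objective function, each time with a cost matrix built from the corresponding attribute (for example price or duration), mirrors the idea of solving different representations of the same request.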

The presented work also includes an analysis of the quality of the solutions constructed by the different optimization algorithms. The analysis of these results shows that the solutions of the developed meta-heuristics present a much higher quality than those provided by simpler heuristics, such as the nearest neighbour. This improvement is modest for very small instances (3 nodes), but it is significant for instances of medium and large sizes (5-20 nodes).

The discussion presented above leads to the conclusion that the developed system completely achieves its main goals. First of all, it is successful in the resolution of unconstrained multi-city flight requests. Second, the developed application is successful in the integration of the proposed solution. This, together with the analysis of the success rate of the different optimization algorithms, leads to the conclusion that the devised system is successful in the resolution of the FTP, and that it is capable of saving time and money in the planning and scheduling of trips of different sizes and complexities.

6.2 Achievements

The main achievements of the present work can be summarized as follows:

• The definition of the Flying Tourist Problem (FTP), which allows the construction of single-flight, round-trip and multi-city flight requests. This definition also integrates the concepts of extended start dates, variable durations and time-windows, so as to allow the construction of personalized requests.

• The development of an optimization system that implements the Ant Colony Optimization and Simulated Annealing meta-heuristics, as well as other optimization methods, for the resolution of the FTP.

• The development of a web application that allows the construction and resolution of user-defined FTP requests.

• The development of a back-end system, available via API, that integrates the developed optimization system and constructs solutions to the FTP requests.

• A comprehensive analysis of the success rate of different optimization algorithms in the resolution of FTP instances with different sizes and parameters.

6.3 Future work

Given the present work, and considering the goal of developing better search tools for the planning and scheduling of complex trips, there are some interesting extensions and possible improvements to the developed system:

• Since the developed optimization system is implemented in the Python 3 programming language, the optimization time could be significantly reduced by rewriting this module in a language like C++, and by using parallel programming techniques to exploit multiple processors.

• It is necessary to devise a more efficient way to collect the flight data required for the resolution of an FTP request. For large instances, the bottleneck of the system is usually the data collection process, and not the actual optimization.

• Finally, it would be particularly useful and interesting to extend the developed system by considering other public transportation means, such as bus and railroad services, in the resolution of the proposed routing problem. This would extend the search space of the problem, and possibly contribute to the construction of better itineraries and solutions to the requests.


