Centre for Geo-Information
Thesis Report GIRS-2017-07

Site selection for small retailers in the food industry using demographic data and network-constrained kernel density estimation

Bob Houtkooper

26-01-2017



Site selection for small retailers in the food industry using demographic data and network-constrained kernel density estimation

Bob Houtkooper

Registration number 93 06 23 368 090

Supervisor:

dr.ir. S (Sytze) de Bruin

A thesis submitted in partial fulfilment of the degree of Master of Science at Wageningen University and Research Centre, The Netherlands.

26-01-2017

Wageningen, The Netherlands

Thesis code number: GRS-80436

Thesis Report: GIRS-2017-07

Wageningen University and Research Centre

Laboratory of Geo-Information Science and Remote Sensing


Foreword

It was quite odd to work on an individual project for such a long time, but I found it a valuable learning experience. I want to thank everyone who helped me through the thesis process. Sytze de Bruin, thank you for your help and contributions.

Bob Houtkooper

23-01-2017, Wageningen


Abstract

Despite the high financial risks of site selection, the use of spatial data to support site selection for small food retailers is still limited. This thesis proposes a site selection method that helps locate multiple stores of a food venture using a simultaneous approach. The objective function incorporates variables concerning competition, demand, and site attractiveness. Competition was computed with a network kernel density estimation (KDE). Socio-demographic data, which describe the demand and attractiveness variables, were retrieved and overlaid on a road network. The road segments with the highest utility were found with an optimisation method; two heuristic optimisation methods were examined: simulated annealing and a greedy algorithm. A case study was executed to find the optimal locations of three bakeries in Amersfoort, the Netherlands. A user interface was developed that allows adjusting the weights of the individual variables of the objective function, as well as the number of stores and the food retailer type. A sensitivity analysis was conducted to assess the sensitivity of the approach to parameter settings. The main findings indicated that the weights for the variables are unknown and differ among food specialists. Owing to the computational load of the network KDE, the method could only be applied at the scale of a city. Also, not all available data were up to date, and future demographic predictions were not included. Therefore, the experimental results should be considered an illustration of the approach. The greedy algorithm was an applicable optimisation method for finding single optimal locations; however, the chance of finding the global optimum decreased with an increasing number of locations. Therefore, when multiple store locations were to be found, the simulated annealing optimisation method was preferable. Under the default settings, the optimal locations were found around the west and south sides of the city centre. This area can be considered attractive because it is populated by people with higher incomes, demand is high, and competition is generally zero or close to zero.

Keywords: site selection, small food retail stores, network-constrained point patterns, spatial optimisation, simulated annealing, greedy algorithm, sensitivity analysis.


Table of contents

1. Introduction
  1.1. Context and background
  1.2. Problem definition
  1.3. Research questions
  1.4. Structure of the report
2. Review
  2.1. Site preferences in relation to space
  2.2. Spatial networks
  2.3. Network kernel density estimation
  2.4. Spatial optimisation
  2.5. Food retailer location data
3. Methods
  3.1. Objective function
    3.1.1. Composite utility
    3.1.2. Study area
  3.2. Optimisation method
    3.2.1. Optimal locations
    3.2.2. Greedy algorithm
    3.2.3. Simulated annealing
    3.2.4. From road segments to areas
  3.3. Case study
    3.3.1. Optimal solutions
    3.3.2. User interface
  3.4. Sensitivity analysis
4. Results
  4.1. Objective function
    4.1.1. Explorative analysis
    4.1.2. Composition
  4.2. Optimisation method
    4.2.1. Greedy algorithm
    4.2.2. Simulated annealing
  4.3. Case results
    4.3.1. Optimal solutions
    4.3.2. User interface
  4.4. Sensitivity analysis
5. Discussion
  5.1. Objective function
  5.2. Optimisation method
  5.3. Case
    5.3.1. Optimal solutions
    5.3.2. User interface
  5.4. Sensitivity analysis
6. Conclusions, limitations and recommendations
  6.1. Conclusions
  6.2. Limitations
  6.3. Recommendations
7. References
8. Appendix I: Network KDE Python-code
9. Appendix II: JSON to CSV R-code
10. Appendix III: Location points for Google Places API query
11. Appendix IV: R-code for interactive map


List of figures

Figure 3.1: Diagram of the composite objective function; hexagons represent input, squares are functions, and ellipses represent outputs.
Figure 3.2: Study area.
Figure 3.3: Flowchart of the network KDE calculation; hexagons represent data, rectangles represent actions, and the diamond represents a conditional statement.
Figure 3.4: Fishnet over the study area.
Figure 4.1: Locations of retrieved food retailers within the study area, based on Google Maps.
Figure 4.2: The Amersfoort study area.
Figure 4.3: Location of bakeries, butchers, supermarkets and greengroceries in the Amersfoort study area.
Figure 4.4: A road network of the city centre of Amersfoort, categorised on the number of retailers that are closest by.
Figure 4.5: Service areas around source points. Notice that service areas may overlap.
Figure 4.6: Network KDE densities of Amersfoort with a bandwidth of 250 metres.
Figure 4.7: Network KDE densities of Amersfoort with a bandwidth of 500 metres.
Figure 4.8: Normalised demand in the Amersfoort study area.
Figure 4.9: Normalised demand variable on a network space.
Figure 4.10: Normalised number of elderly people.
Figure 4.11: Normalised number of people with high income.
Figure 4.12: Normalised urbanity classification.
Figure 4.13: Normalised attractiveness variable.
Figure 4.14: Existing food retailers on the attractiveness variable map in the city centre of Amersfoort.
Figure 4.15: Composite utility map of Amersfoort corresponding to scenario 1.
Figure 4.16: Probability trace of simulated annealing for five simultaneous store locations.
Figure 4.17: Trace of the joint utility of the accepted road segment composition.
Figure 4.18: Three optimal locations for new bakeries in Amersfoort determined with the greedy algorithm.
Figure 4.19: Competition variable map, prior to (left) and after (right) adding two stores by optimisation.
Figure 4.20: The road network visualised together with the optimal sites and optimal locations for four iterations; GA stands for greedy algorithm and the number presents the store location number.
Figure 4.21: Five optimal locations for new bakeries in Amersfoort determined with simulated annealing.
Figure 4.22: GUI showing a map of Amersfoort with an information bar, a demand raster, demand legend, disabled parameter inputs, and a disabled execute button.
Figure 4.23: GUI showing a map of Amersfoort with two optimal locations, two optimal areas, optimal area legend, enabled parameter inputs, and an enabled execute button.
Figure 4.24: Optimal locations if the number of stores is set to five (greedy algorithm).
Figure 4.25: Visualisation of results, neglecting one variable; GA stands for greedy algorithm, com0 means the competition variable was set to zero, dem0 means the demand variable was set to zero, and att0 means the attractiveness variable was set to zero.

List of tables

Table 3.1: Simulated annealing settings.
Table 3.2: Scenario one case study.
Table 3.3: Scenario two case study.
Table 3.4: Scenarios to assess the sensitivity of the food retailer type.
Table 3.5: Scenarios to assess the sensitivity of the variables.
Table 4.1: Number of food retailers in the study area.
Table 10: Location points Google Places API query.

List of boxes

Box 3.1: Pseudo code greedy algorithm in a network space.
Box 3.2: Pseudo code simulated annealing.


List of abbreviations

AHP = Analytic Hierarchy Process

API = Application Programming Interface

CBS = Central Bureau of Statistics

CSV = Comma Separated Values

GDAL = Geospatial Data Abstraction Library

GIS = Geo-Information Science

GUI = Graphical User Interface

GWR = Geographically Weighted Regression

JSON = JavaScript Object Notation

KDE = Kernel Density Estimation

MGI = Master Geo-Information science

NWB = Nationaal Wegen Bestand

OAT = One-factor-At-a-Time

OGR = OpenGIS simple features Reference implementation

OS = Operating System

OSM = Open Street Map

RD = RijksDriehoekstelsel

URL = Uniform Resource Locator

XML = eXtensible Markup Language

HTML = HyperText Markup Language


1. Introduction

In sections 1.1 and 1.2, the background of the problem is defined and the major findings in this field are stated. Subsequently, the knowledge gap is identified, which describes the general need for this study, and the target audience is defined. In section 1.3, the research questions are stated. Finally, in section 1.4, the structure of this report is presented.

1.1. Context and background

Analysis of spatial data is useful for several strategic business activities. Strategic management refers to the overall direction of a company (Sarkar, 2007). An example of a strategic business activity is conducting a market analysis, which is done to find out whether a certain market is attractive for a certain organisation; the distribution of competitors and customers is critical information here. Besides strategic management, spatial data are used to optimise operational business activities, for instance resource allocation, the reduction of operating costs, and the monitoring of business activities (Sarkar, 2007). These operational activities control the day-to-day operations. Geo-information system (GIS) concepts and techniques are suitable for regional business, because they can support decision-making in regional marketing, spatial planning, and logistics (Cui et al., 2012). GIS is already widely used in various industries, for example the insurance, retail, transportation, tourism, real estate, and telecommunication industries (Cui et al., 2012).

1.2. Problem definition

Small food enterprises need support (ING, 2014; Central Bureau of Statistics, 2015; Rabobank, 2007). The economic bureau of the ING bank (ING, 2014) revealed that the numbers of bakeries, butchers, and greengroceries are declining in the Netherlands, owing to competition from supermarkets. The number of greengroceries declined the most: between 2008 and 2013, 15% of the greengroceries disappeared. In the same period, butchers and bakeries faced declines of 10% and 7%, respectively (ING, 2014). Data of the Central Bureau of Statistics endorse the declining numbers of poultry shops, butchers and greengroceries between 2007 and 2014 (CBS, 2015). The Rabobank (2007) found that various consumer trends negatively affect specialised food retailers, for instance the growth of hard discount and online food, changing consumer preferences (ready-to-eat meals), and a decline of customer loyalty. In conclusion, traditional food shops are disappearing in the Netherlands.


Today, most consumers buy everything at one place: the supermarket. This trend of buying all groceries at the supermarket causes a decline in social contact (Raven, Lang & Dumonteil, 1995; Pettinger, Holdsworth & Gerber, 2008; Blythman, 2012). Studies confirm that small food enterprises encounter problems due to competition from supermarkets (Raven et al., 1995; Blythman, 2012): multiple small food ventures can easily be swallowed by one single supermarket. The decline of small food retailers is deplored, mainly because they hold communities together (Pettinger et al., 2008). Pettinger et al. (2008) conducted research on differences in shopping behaviour between England and France. Results pointed to a single difference: the French often shop in individual traditional shops, whereas the English prefer shopping in supermarkets. England faces higher obesity rates than France, and a correlation was assumed between obesity rates and shopping culture (Pettinger et al., 2008). The authors claimed that supermarkets generally offer less healthy products and that they tempt consumers to buy cheaper products. A more far-fetched reason was that consumers get less exercise when they visit only one store for their grocery shopping. However, the authors warned that more in-depth research is required to infer causal relations between health and shopping at supermarkets.

Traditional food shops need help surviving the growing competition from supermarkets. One way to help entrepreneurs is to assist them in determining the best locations for new shops. This can be done with a site selection method. An aid for site selection is useful, since opening a new food specialist store carries a high financial risk in the retail sector (Roig-Tierno et al., 2013).

Church (2002) mentioned that it is easy to conclude that the success of many location applications in the future may be intimately tied to GIS. Sarkar (2007) emphasised that the barriers to implementing GIS in business are diminishing: computing power is increasing, data availability is wider, software is more extensively available, integration with corporate databases is easier, and the internet is increasingly used to share data and software. Therefore, a GIS method is proposed.

Especially in the Netherlands, a lot of potentially useful spatial data are available that can be applied in a site selection method. GIS can turn these data into information by combining different data sources and methods. The method I propose is aimed at small retail owners in the food industry, for finding a single location or multiple locations for their venture. This will help small retail owners in selecting a site for their venture. Other studies primarily focussed on site selection for supermarkets outside the Netherlands, with inputs that are not easily accessible (Suárez-Vega et al., 2014; Turk et al., 2014; Rui et al., 2016). Suárez-Vega et al. (2014) utilised a commercial and an industrial index, Turk et al. (2014) distributed 1,100 questionnaires, and Rui et al. (2016) used store brands and store types.

In conclusion, a site selection method designed for bakeries, butchers and greengroceries in the Netherlands was developed in this research. As mentioned by Rui et al. (2016), network KDE in combination with demographic data can be applied to explore site selection for retail stores. To my knowledge, no site selection method based on demographic data and a network-constrained point pattern yet exists for small food ventures in the Netherlands. Other studies required inputs for which the data are not easy to retrieve and are therefore not easily applicable. The method is also feasible when a retail owner wants to open more than one store, as it makes use of a heuristic spatial optimisation method.

This thesis targets small retailers in the food industry, especially those in the Netherlands. The focus is mainly on the Netherlands, since the required data may not be readily available in other countries. Besides that, the attractiveness factors were based on market research in the Netherlands. The site selection method was established with the following research design. First, a literature study was conducted to define an objective function. Second, data sources were retrieved and processed, and methods were implemented; an explorative analysis was conducted to examine whether the data could be attained and transformed and to examine the methods. Third, a user interface was developed, in which users can adjust weights and other settings, which in turn affect the objective function, and visualise the results. Fourth, a case study was conducted to test the method. Finally, a sensitivity analysis was conducted to assess the sensitivity of the approach to parameter settings. The research aims to answer the research questions described below.

1.3. Research questions

The objective of this thesis is to create a site selection method for small retail owners in the food industry. Retail owners can use the method as an aid in their site selection decision process.


To reach the main objective, four research questions need to be answered:

1. Which objective function is usable for site selection for food retail stores?
2. Which optimisation method can be implemented to find optimal locations?
3. Where in a case study area lie opportunities to start a new, small multi-location food venture?
4. How sensitive are the outcomes of the site selection to different parameter settings?

1.4. Structure of the report

This thesis consists of six sections. Chapter 2 is a review of the literature on location analysis

and GIS. Also, the methods which were used in this thesis are reviewed. Chapter 3 explains the

methodology. Section 3.1.1 presents the composition of the objective function. Then, section

3.1.2 describes how the data of the variables of the objective function were retrieved and

calculated. Moreover, the study area was shown. Section 3.2 presents the methodology of the

optimisation methods. Section 3.3 explains how the case was executed and section 3.4 presents

the sensitivity analysis. In chapter 4 the methods are applied on a case study. First, an

explorative analysis is described that aimed to check if all the data could be retrieved. Chapter

5 discusses the results and compares them to results found in literature. Chapter 6 covers the

conclusion, limitations and recommendations. The methods, results, discussion, and conclusion

sectors are ordered the same as the sequence of the research questions.


2. Review

The review chapter presents the results and methods from relevant literature. Ideas and methods from various researchers are summarised to give an overview of the main concepts in the site selection field.

2.1. Site preferences in relation to space

One of the main factors influencing the feasibility of a retail store is the spatial dispersion of retailers and consumers (Davis, 2006). Geo-demand and geo-competition hinge on this factor: geo-demand is the position of customers, and geo-competition relates to the location of competitors (Roig-Tierno, Baviera-Puig, Buitrago-Vera & Mas-Verdu, 2013). However, site selection depends on more than just spatial dispersion (Wood & Reynolds, 2012).

Suárez-Vega et al. (2012) noted that there are two frequently conflicting objectives in site selection: maximisation of the total market share captured by the firm and minimisation of the market share its existing facilities lose to the new store. The authors combined location models and GIS to create tools that help locate one new store in a franchise distribution system in a continuous space (Suárez-Vega et al., 2012). They concluded that GIS tools can broaden the vision of an entrepreneur when opening a new store. Suárez-Vega et al. (2014) used two methods: a weighting and a constraint method. In the weighting method, a weight was assigned to each objective function and the weighted sum of these functions was optimised. The constraint method accounted for the cannibalisation cost. Cannibalisation occurs when a new chain store causes a decline in customers for another store in the same chain. The cannibalisation cost was subtracted from the weighted value.
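The weighting idea described above can be sketched in a few lines; this is an illustrative sketch with hypothetical names and values, not the implementation of Suárez-Vega et al. (2014).

```python
# Hypothetical sketch: score a candidate site as a weighted sum of
# normalised objective values, minus an estimated cannibalisation cost.

def weighted_utility(values, weights, cannibalisation=0.0):
    """Weighted sum of objective values minus a cannibalisation cost."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have equal length")
    return sum(v * w for v, w in zip(values, weights)) - cannibalisation

# Example: market-share gain 0.8 (weight 0.7), loss avoidance 0.6
# (weight 0.3), estimated cannibalisation cost 0.1.
score = weighted_utility([0.8, 0.6], [0.7, 0.3], cannibalisation=0.1)
```

The optimiser would then simply prefer candidate sites with higher scores.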

In contrast, Turk et al. (2014) did not consider competition. Their regional model only considered socio-demographic variables, which were used to create consumption maps for finding optimal store locations. Turk et al. (2014) distributed questionnaires to obtain information about consumption at a regional scale.

Another well-known site selection method, commonly used in the field of retail distribution, is the Huff model (Huff, 1964). It is the most common way to delimit a trade area (Baray & Cliquet, 2007). The Huff model assumes that two factors determine the trade area: the distance between customers and stores, and the attractiveness of the stores (Huff, 1964). Suárez-Vega et al. (2015) presented an adjusted Huff model with spatially nonstationary parameters: distance, attractiveness, and competition. While global models ignore individual customer preferences, their local Huff model did account for them. Suárez-Vega et al. (2015) calibrated the Huff model via geographically weighted regression (GWR). GWR is a local regression technique that estimates parameters for every point in the study area, showing the variability over the analysed space (Brunsdon et al., 1996). The technique is based on the principle that nearby things tend to have similar values: locally observed data are more influential than data from more remote locations.
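To make the idea of GWR concrete, the following is a minimal sketch of a single local fit. The Gaussian distance-decay kernel and all names are assumptions for illustration, not the calibration used by Suárez-Vega et al. (2015).

```python
import numpy as np

def gaussian_weights(distances, bandwidth):
    """Distance-decay weights: observations near the regression point
    receive weights close to 1, remote observations close to 0."""
    return np.exp(-0.5 * (distances / bandwidth) ** 2)

def local_fit(X, y, distances, bandwidth):
    """Weighted least squares at one regression point:
    beta = (X' W X)^-1 X' W y."""
    W = np.diag(gaussian_weights(distances, bandwidth))
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Example: four observations of y = 2 + 3x; with all observations at the
# regression point (distance 0) the fit reduces to ordinary least squares.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 5.0, 8.0, 11.0])
beta = local_fit(X, y, distances=np.zeros(4), bandwidth=500.0)
```

Repeating the fit at every regression point, each with its own weights, yields the spatially varying parameter surfaces that GWR is used for.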

In another study (Sadler, 2016), local expert knowledge was used to optimise new sites for healthy food ventures in socially distressed areas. A three-step process was introduced, which included an analytic hierarchy process (AHP), direct mapping, and point allocation of key variables. AHP is a method to assess the importance of variables in multi-criteria decision making: experts score variables pair-wise, deciding which of the two variables is more important. In the direct mapping process, experts were asked to mark where they thought a new location would be of most value. In the point allocation process, experts were asked to distribute a certain number of points among variables.
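A common way to turn such pair-wise scores into weights is the principal-eigenvector method. The sketch below is an assumed illustration with invented judgements, not Sadler's (2016) procedure.

```python
import numpy as np

# Minimal AHP sketch: derive criterion weights from a reciprocal pair-wise
# comparison matrix via its principal eigenvector.

def ahp_weights(pairwise):
    """Normalised weights from a reciprocal pair-wise comparison matrix."""
    A = np.asarray(pairwise, dtype=float)
    eigvals, eigvecs = np.linalg.eig(A)
    principal = eigvecs[:, np.argmax(eigvals.real)].real
    w = np.abs(principal)
    return w / w.sum()

# Invented judgements: criterion 1 is 3x as important as criterion 2 and
# 6x as important as criterion 3; criterion 2 is 2x as important as 3.
A = [[1.0, 3.0, 6.0],
     [1/3, 1.0, 2.0],
     [1/6, 1/2, 1.0]]
weights = ahp_weights(A)  # proportional to [6, 2, 1]
```

Because the example matrix is perfectly consistent, the weights come out proportional to 6:2:1; with real expert judgements a consistency check would normally accompany this step.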

Roig-Tierno et al. (2013) combined a kernel density estimation with an AHP. The kernel density estimation was used to find hotspots of potential customers; those hotspots were ranked with the help of an AHP.

Scott and He (2012) developed a constrained model to predict destination choice for shopping. Two of their results are interesting for this thesis: first, seniors prefer shopping at bakeries; second, income determines where people prefer to shop. These statements coincide with findings of Detailhandel Nederland, a Dutch foundation which provides information on retail in the Netherlands (Raatgever, 2014). Detailhandel Nederland expects that in the future only small food retailers in city centres will be feasible (Raatgever, 2014). The market research company GfK claimed that food retail specialists gain large turnovers from wealthy elderly people (Holla, 2013), who tend to live in certain neighbourhoods.

Some studies (Suárez-Vega et al., 2015; Goodchild, 1984) use store attractiveness instead of site attractiveness. For example, Suárez-Vega et al. (2015) only consider store size: the bigger the store, the higher the attractiveness value. This may apply to supermarkets, but perhaps not to specialist food shops; besides, store size data are not easily available at larger scales. Goodchild (1984) states that store attractiveness characteristics are not only physical, but also include short-term variables, like pricing and advertising. According to Goodchild (1984), two problems occur when implementing store attractiveness characteristics. First, when more than one store is located, the assumed locations influence the competition variable, and a store size can never be assigned to a hypothetical location. Second, again, store size is hard to retrieve. Therefore, I have chosen to use site attractiveness criteria.

2.2. Spatial networks

Cui et al. (2012) proposed a method to precisely delineate a trade area, focusing on chain businesses: individual stores with the same brand name, such as “Bakker Bart” or “Keurslager”. The authors stressed the importance of measuring trade areas, since this improves the understanding of existing market opportunities and facilitates predicting sales. As a consequence, decision-making entails lower risks. To determine the trade area, Cui et al. (2012) used spatial networks.

Spatial networks predict accessibility to stores in a more realistic way than a planar space (Cui

et al., 2012). A spatial network is a street network, which is turned into a graph. Graphs consist

of nodes and edges. Nodes are created on every road intersection and the edges are the links

between two nodes. All edges in the network space have attributes, such as travel time or

distance. The time attribute is defined by the time it would take to get from one node to another

node over an edge. The distance attribute equals the distance between two linked nodes.

2.3. Network kernel density estimation

Spatial networks can be used to create network-constrained density maps. Rui et al. (2016)

studied retail hot-spot areas using two network-constrained point pattern analysis methods:

network kernel density estimation (KDE) and network K-function. The network KDE and

network K-function were applied as indicators of the distribution of retail stores.

KDE is a popular method for analysing point-event distributions. Xie and Yan (2008) introduced a KDE in a network space. It is computed using equation 1:

λ(s) = ∑_{i=1}^{n} (1/r) k(d_is / r) , (1)

where

λ is the density,
s is the point location,
d_is is the distance between point i and s; this distance is measured over the network rather than in planar space,
r is the bandwidth; the distance within which points are taken into account when calculating the density for a particular location,
k is the kernel.

The density is calculated for point units. Point i is a source point: the point nearest to a store. The kernel determines the weight that point i contributes at point s, given the distance d_is between them. Thus, the kernel function returns a weight value based on the distance between point i and point s. The three most commonly used kernel functions are the Gaussian function, the quartic function, and the minimum variance function (Schabenberger & Gotway, 2005).
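As a sketch, the three kernels can be written as one-dimensional weight functions of the normalised distance u = d_is/r; the constants below are the common textbook normalisations, which may differ from the form used later in this thesis (equation 7):

```python
import math

# Sketch of the three kernel functions, expressed in the normalised
# distance u = d_is / r (assumption: compact kernels return zero for |u| > 1).

def gaussian(u):
    # Unbounded kernel: weight decays smoothly but never reaches zero.
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def quartic(u):
    # Compact kernel: full weight at the source, zero at the bandwidth edge.
    return (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def minimum_variance(u):
    # Compact kernel that can take small negative weights near the edge.
    return (3.0 / 8.0) * (3.0 - 5.0 * u * u) if abs(u) <= 1.0 else 0.0

print(quartic(0.0), quartic(1.0))  # → 0.9375 0.0
```

All three give their largest weight at the source point, which is why nearby stores dominate the estimated density.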

2.4. Spatial optimisation

Optimisation algorithms select the best results for a given objective function (Neun et al., 2008).

According to Tong and Murray (2012), there are two possible methods to find an optimal

solution of a site selection objective function: exact and heuristic. The exact search method

extensively calculates all possibilities, which results in finding an optimal solution (Tong &

Murray, 2012). Heuristic methods are based on strategies and rules of thumb procedures to

solve an optimisation problem (Tong & Murray, 2012).

Simulated annealing is an example of a heuristic method for spatial optimisation. Kirkpatrick

et al. (1983) and Cerny (1985) independently introduced simulated annealing. When an

objective function is available, but no solution can be calculated within a feasible time limit,

then simulated annealing is a possibility. Simulated annealing iteratively tries to find the

maximum or minimum of a function (Steiniger & Weibel, 2005). Simulated annealing starts

with an initial state. Then, another state is proposed. If the utility of that state is better, the proposed state is accepted and becomes the current state. However, if the utility of the proposed state is lower than the utility of the current state, the proposed state is accepted with a certain probability. The probability is determined by a temperature, a variable that decreases in every iteration, and by the difference in utility between the proposed and current state.

The acceptance probability function (Kirkpatrick et al., 1983) is shown in equation 2. Note that equation 2 strongly depends on the temperature:

Probability = e^((S′−S)/T) , (2)

where

S’ is the utility of the proposed state,

S is the utility of the current state,

T is the current temperature.

If the difference between the utility of the proposed state and the utility of the current state is

close to zero, then the probability of acceptance increases. Also, the higher the temperature, the higher the probability that a proposed state is accepted. In every iteration, the temperature decreases by a certain cooling ratio. Therefore, with every iteration, the probability decreases that a state with a lower utility is accepted as the current state.
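The acceptance rule and the cooling schedule can be sketched in a few lines of Python; the initial temperature and cooling ratio below are illustrative, not the settings used later in this thesis:

```python
import math
import random

def accept(proposed_utility, current_utility, temperature, rng=random.random):
    """Decide whether to accept a proposed state when maximising utility."""
    diff = proposed_utility - current_utility
    if diff >= 0:
        return True  # better (or equal) states are always accepted
    # Worse states are accepted with probability e^(diff/T), which shrinks
    # as the temperature cools and as the utility gap widens.
    return rng() < math.exp(diff / temperature)

# The cooling schedule multiplies the temperature by a ratio each iteration,
# so the same utility drop becomes less likely to be accepted over time.
temperature, cooling_ratio = 4.0, 0.999
p_early = math.exp(-0.5 / temperature)
p_late = math.exp(-0.5 / (temperature * cooling_ratio ** 5000))
assert p_late < p_early
```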

Other heuristic optimisation methods are known as greedy algorithms. Aboolian, Berman and Krass (2007) showed that for store siting problems the greedy algorithm is an efficient method to obtain close-to-optimal solutions. In site selection, a greedy algorithm iteratively finds

the location with the highest utility derived from an objective function, given the current

locations of existing stores. This location is saved as an optimal solution and is added to the

stock of existing stores. The process is repeated until the number of selected locations is equal

to n new stores. Hence, the greedy algorithm method is only locally optimal; it finds the true

optimal location if only one store needs to be found. The greedy algorithm may not find the

jointly optimal solution if multiple locations need to be found.
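As a sketch of such a greedy selection loop (a simplification: a fixed penalty on neighbouring segments stands in for re-computing the competition density, and the segment names, utilities, and neighbourhood structure are hypothetical):

```python
def greedy_select(segments, utility, n_stores, penalise):
    """Pick n_stores segments one at a time, each time taking the segment with
    the highest utility given the stores chosen so far (locally optimal)."""
    chosen = []
    scores = {seg: utility(seg) for seg in segments}
    for _ in range(n_stores):
        best = max(scores, key=scores.get)
        chosen.append(best)
        # Adding a store raises local competition; penalising nearby segments
        # stands in for re-running the full density computation.
        scores = penalise(scores, best)
    return chosen

# Toy example: choosing a segment halves the utility of its neighbours
# (hypothetical neighbourhood structure).
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

def penalise(scores, picked):
    new = dict(scores)
    new[picked] = float("-inf")  # a segment cannot be chosen twice
    for nb in neighbours[picked]:
        new[nb] = new[nb] * 0.5
    return new

result = greedy_select(["a", "b", "c"], {"a": 3.0, "b": 2.0, "c": 1.9}.get, 2, penalise)
print(result)  # → ['a', 'c']
```

The example also shows the local optimality noted above: after “a” is picked, “b” is penalised to 1.0, so “c” wins the second round even though {“a”, “b”} might have been jointly better under a different penalty.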

2.5. Food retailer location data

Sienkiewicz, a former Master Geo-Information (MGI) student, described a method to extract Google Places data and save it in CSV format (Sienkiewicz, 2015). She compared the Google Places data to two other food retailer location sources and found that Google Places is the most complete data source for food retailer locations.


3. Methods

The method chapter clarifies how the research questions were answered. It ensures that this

research is reproducible, by explaining how the data were obtained and which methods were

used. Moreover, justifications are provided why certain methods were used.

3.1. Objective function

3.1.1. Composite utility

Three commonly used variables in site selection are demand, competition, and attractiveness (Goodchild, 1984; Huff, 1964). With these variables, it is possible to predict the number of consumers likely to shop at a particular retail venture. The proposed site selection method considers site attractiveness rather than store attractiveness, because the target store(s) are unknown, and because of the scale of the research and the availability of data. Suárez-Vega et al. (2015) claim that traveling

cost is more important than the size of a supermarket. In this thesis, it is assumed that this also

holds for small traditional food retailers. For food retailers and convenience stores, it is known

that demand is elastic with respect to distance, meaning that when the distance to a store

becomes greater, the chance is bigger that a customer goes to a competitor (Goodchild, 1984).

Food is a primary good, therefore my method assumes that demand is proportional to the

population. In this thesis’ utility assessment, network KDE and the demographic data were

combined with a site attractiveness variable.

Each individual road segment in the road network received a site suitability value, also called

utility value. A weighted sum criterion is proposed, based on the variables which are used in

the market share model (Goodchild, 1984) and the variables of the multiplicative competitive

interaction model (Suárez-Vega et al., 2015). The weighted sum function has three normalised

terms, each pondered by a weight. A single utility was computed, using equation 3:

U_j = w1 · (1 − d_j) + w2 · p_j + w3 · a_j , (3)

where

U is the utility,

j is a road segment,

w are the weights,

d is a normalised density variable of food retailers,

p is a normalised demand/population variable,

a is a normalised site attractiveness variable.


The input data were normalised to range between zero and one. The weights can be set according to the individual preferences of the user, because the importance of the criteria may differ among entrepreneurs, especially across specialisations. For example, a butcher could be more interested in the demographic characteristics, while a bakery may be more affected by local competition. The weights are also limited to a range between zero and one. After setting the selected number of stores to ≥1, the joint utility is calculated. The joint utility equals the sum of the utilities of all selected locations when optimising multiple ventures (equation 4):

JU = ∑_{i=1}^{n} U_{j_i} , (4)

where

JU is the joint utility,
j_i is the i-th selected road segment,
n is the number of stores,
U is the utility.
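Equations 3 and 4 can be sketched directly in Python; the weights and the per-segment (density, demand, attractiveness) triples below are illustrative:

```python
def utility(d, p, a, w1=1.0, w2=1.0, w3=1.0):
    """Equation 3: density counts against a segment; demand and
    attractiveness count for it. Inputs are assumed normalised to [0, 1]."""
    return w1 * (1.0 - d) + w2 * p + w3 * a

def joint_utility(segments, **weights):
    """Equation 4: the sum of the utilities of the selected road segments."""
    return sum(utility(d, p, a, **weights) for d, p, a in segments)

# Two hypothetical road segments as (density, demand, attractiveness) triples.
selected = [(0.2, 0.9, 0.6), (0.7, 0.4, 0.8)]
# (0.8 + 0.9 + 0.6) + (0.3 + 0.4 + 0.8) = 3.8, up to floating-point rounding
```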

Figure 3.1 shows a diagram of the objective function composition; in the diagram, the site selection objective function appears both as a function and as an output. An explorative analysis was required to discover whether the required input variables could be acquired, to test the methods used, and to provide the data for a case study and a sensitivity analysis.


Figure 3.1: Diagram composite objective function; hexagons represent input, squares are functions, and

ellipses represent outputs.

Composite attractiveness

Areas with a high urban classification, a high percentage of elderly people (65+), and high incomes were deemed attractive sites for small food retailers. Households were categorised as high-income households when they belong to the top 20% of incomes in the Netherlands. The data for the attractiveness variable were retrieved from the CBS (2014). The attractiveness criterion did not assign separate weights to its elements, for the simple reason that their true relative importance was not known; it was not found in the literature or in any available market research. Therefore, the elements of the attractiveness variable contributed approximately proportionately, which made data normalisation necessary. The percentage of high incomes and the number of elderly people were normalised using equation 5:

x′ = (x − min(x)) / (max(x) − min(x)) , (5)

where

x′ is the normalised value,
x is the original value, with min(x) and max(x) taken over all regions.


This normalisation technique was not usable for the urban classification, since the original urban classification is an ordinal CBS category between one and five, divided as follows:

Dense urban areas (address density of more than 2,500 addresses/km²) have a value of one,
Urban areas (address density of 1,500 to 2,500 addresses/km²) have a value of two,
Moderate urban areas (address density of 1,000 to 1,500 addresses/km²) have a value of three,
Minor urban areas (address density of 500 to 1,000 addresses/km²) have a value of four,
Non-urban areas (address density of less than 500 addresses/km²) have a value of five.

To normalise the results, the classification values one, two, three, four, and five were turned

into one, 0.75, 0.50, 0.25, and zero, respectively. The areas, classified as one by the CBS,

represent the most attractive sites for the urbanity factor. Small food retailers are only feasible

in urban areas as explained in section 2.6. Therefore, non-urban areas were assigned a value of

zero.

The site attractiveness variable was again normalised, because it was an input of the utility formula, which uses weights to determine the impact per element. Accordingly, the attractiveness variable (equation 6) is the sum of three elements, divided by the maximum value:

a_region = (Hinc_region + Eld_region + Urb_region) / max(a) , (6)

where

region is one 500 by 500 metre square,
a is the attractiveness variable,
Hinc is the normalised high income value,
Eld is the normalised number of elderly people,
Urb is the normalised urbanity category.
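As a sketch, equations 5 and 6 plus the urbanity remapping can be combined as follows; the per-region input values are made up for illustration:

```python
def min_max(values):
    """Equation 5: rescale a list of values to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# CBS urbanity classes 1..5 remapped to [0, 1]; most urban (class 1) scores highest.
URBANITY = {1: 1.0, 2: 0.75, 3: 0.50, 4: 0.25, 5: 0.0}

def attractiveness(high_income_pct, elderly, urban_class):
    """Equation 6 per region: sum the three normalised elements,
    then divide by the maximum so the most attractive region scores 1."""
    hinc = min_max(high_income_pct)
    eld = min_max(elderly)
    urb = [URBANITY[c] for c in urban_class]
    summed = [h + e + u for h, e, u in zip(hinc, eld, urb)]
    peak = max(summed)
    return [v / peak for v in summed]

# Three hypothetical 500 x 500 m regions.
a = attractiveness([10, 25, 40], [50, 200, 120], [3, 1, 2])
print(max(a))  # → 1.0
```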

Section 1.2 claimed that greengroceries, bakeries, and butchers suffer from the competition of supermarkets. Therefore, store location data from those food retailers were required. Competition also occurs within the same branch. The point input data (the locations of food retailers) for the network KDE therefore consisted of the supermarket locations plus the locations of the specific food specialist, to fully represent the competition. For instance, bakery locations and supermarket locations were taken into account when calculating the network KDE for a bakery.

3.1.2. Study area

Location

The data collection was executed for a region in the Netherlands, covering the cities

Amersfoort, Arnhem, Apeldoorn, as well as several smaller cities and villages. This area covers

a wide variation in population and food retailer densities. The study area has a size of

approximately 1,900 square kilometres and is visualised in figure 3.2.

Figure 3.2: Study area.


Explorative analysis

An explorative analysis was conducted to test the feasibility of the calculation of the network

KDE and to examine whether the required data and analyses are feasible. For convenience

purposes, the maximum time required for the network KDE calculations was set to 48 hours.

The network KDE was applied to find hotspots of roads near food retailers. Hotspots are roads

which are located near large numbers of stores. How far someone can travel on a network space

is determined by constraints. The constraint can be distance or time. With a distance constraint

you can calculate how far someone can travel for a certain distance. When a time constraint is

utilised, then you can find out how far someone can travel within a certain time. Only a distance

attribute, the length of a road, was considered as a constraint, thus not a time attribute. This had

two reasons. First, because of the scale of the study. Second, some consumers shop by car,

others by bike or afoot, which made it really hard to successfully determine the traveling time

of roads. Network KDE was computed according to the method of Xie and Yan (2008).

The midpoints of existing road segments were determined to construct the network KDE. To

avoid excessive computation time, the segments were not split into equal lengths. Since stores

are typically located along short segments in areas with many road intersections, this strategy

was assumed to have at most a minor effect on model outcomes. For every store, the closest road segment was determined. If at least one store mapped to a segment, that segment was saved as a source segment. The count values, i.e. the numbers of stores, were assigned to the midpoints of the source segments.

Also, a bandwidth, r, was established. Every source midpoint got a so-called service area or catchment area: a polygon containing all roads located within the bandwidth of a source midpoint. Only the KDE of midpoints that fall within the service areas was calculated; the KDE of the remaining midpoints was set to zero.

A Python script was developed to perform the network KDE (Appendix I) with the ArcPy

package. For the explorative analysis I used the location of supermarkets and bakeries as the

input point data. Two density maps were developed. One with a bandwidth set to 250 metres

and the other one with a bandwidth set to 500 metres. The kernel was set to quartic, which is

defined by equation 7:

K_j = (3/π) · (1 − dis_jv² / r²) , (7)

where

K is the kernel value,
j is a road segment,
dis_jv is the distance-decay term; the network distance between road segment j and the corresponding source road segment v,
r is the bandwidth.
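A minimal sketch of the density computation per midpoint, assuming the network distances inside each service area have already been derived (a hand-made dictionary stands in for the ArcGIS service-area step); the quartic kernel follows equation 7:

```python
import math

BANDWIDTH = 500.0  # metres

def quartic(dist, r=BANDWIDTH):
    """Quartic kernel of equation 7; zero beyond the bandwidth."""
    if dist > r:
        return 0.0
    return (3.0 / math.pi) * (1.0 - (dist * dist) / (r * r))

def network_kde(service_areas, counts):
    """service_areas: source midpoint -> {reachable midpoint: network distance};
    counts: source midpoint -> number of stores on that source segment.
    Returns the kernel-weighted store count per midpoint (equation 1,
    up to the constant 1/r factor)."""
    density = {}
    for source, reachable in service_areas.items():
        for midpoint, d in reachable.items():
            density[midpoint] = density.get(midpoint, 0.0) + counts[source] * quartic(d)
    return density

# Hypothetical example: two source segments, three reachable midpoints.
areas = {"s1": {"m1": 50.0, "m2": 400.0}, "s2": {"m2": 120.0, "m3": 600.0}}
dens = network_kde(areas, {"s1": 1, "s2": 2})
# m3 lies beyond the 500 m bandwidth, so its density is zero.
```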

The KDE values were normalised in order to fit the composition of the objective function. The network KDE is computation-intensive: with the bandwidth set to 500 metres, approximately 300 road segments were found in each service area, and a distance decay had to be calculated for every one of these road segments.

A summary of the network KDE calculations is shown in a flowchart in figure 3.3:


Figure 3.3: Flowchart network KDE calculation; the hexagons represent data, rectangles represent actions,

and the diamond represents a conditional statement.


Socio-demographic data were retrieved from the Central Bureau of Statistics (2014). Network data were acquired through OpenStreetMap (OSM). The network data provided by OSM use data from the Nationaal Wegen Bestand (NWB) (OSM, 2016). The network data were clipped to the study area. OSM also has data on shops, such as supermarkets and bigger chains like Hema and Action. It is questionable, however, whether OSM contains enough data to build a site selection method. The Google Places API is more complete; its problem is that a restricted number of food retailers is returned per query.

CBS provides demographic data at a resolution of 500 × 500 metres. This was deemed a suitable size, because most consumers travel between 250 and 500 metres to a food retailer (Veenstra et al., 2010). The corresponding squares, called regions in this thesis, provided extensive socio-demographic data. For the demand variable, only the most recent population counts (2014) were required. Data on the percentage of high incomes, the urban classification, and the number of people aged over 65 originate from the years 2011, 2013, and 2014, respectively. The spatial extent of the socio-demographic data did not exactly match that of the study area. Demand values of -99,998, denoting zero population, were converted into zero. The data squares were overlaid on the midpoints of road segments. Subsequently, with the help of a spatial join, the values were added to the road network.

Google Maps retailer location data were retrieved using the Google Places API (Google, 2017). The API returns data in XML or JSON. You can search nearby by adding a location to the URL, but it is also possible to do a text search using keywords. A third option was available: radar search. The radar search is almost identical to the nearby search; the difference is that it returns more results in a single query, but with less attribute information. The radar search was the most convenient, since the area hosts a large number of food retailers and extra attribute information was not needed. Three compulsory parameters need to be defined in the URL query: key, location, and radius. A personal API key was mandatory to use the Google data; the key is easily requested on Google's developer website. The location is defined by WGS84 longitude and latitude, and the search radius defines the buffer around this location within which retailers are to be searched. The query returned the geographic coordinates of the retailer store locations.
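As an illustration of how such a query could be assembled (the thesis used an R script for this step; the coordinates and key below are placeholders, and Google has since deprecated the radar search endpoint):

```python
from urllib.parse import urlencode

BASE = "https://maps.googleapis.com/maps/api/place/radarsearch/json"

def radar_search_url(lat, lon, keyword, api_key, radius=4000, types=None):
    """Assemble a radar search URL with the three compulsory parameters
    (key, location, radius) plus a keyword, and optionally a types filter."""
    params = {
        "location": f"{lat},{lon}",  # WGS84 latitude,longitude
        "radius": radius,            # search radius in metres
        "keyword": keyword,
        "key": api_key,
    }
    if types:  # e.g. "bakery", to filter out the Dutch family name "Bakker"
        params["types"] = types
    return f"{BASE}?{urlencode(params)}"

url = radar_search_url(52.0, 5.5, "bakker", "YOUR_API_KEY", types="bakery")
```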

An R script (Appendix II) was developed to obtain the data and to transform them from JSON into CSV. A fishnet, a net of rectangular polygons, was created in ArcGIS. This was needed because the Google API returned a maximum of 200 results per query and the study area contains more food retailers than that. The fishnet consisted of 70 points, each placed in the centre of a square with sides of 5,200 metres. The identification numbers, latitudes and longitudes of these points can be found in Appendix III. It was assumed that the number of food retailers situated within a search radius never exceeded 200. An edge length of 5,200 metres for the fishnet was convenient, because the resulting 70 squares approximately fit the data collection area.

The search radius parameter in the Google API query had to be properly set to fully cover the

study area. Figure 3.4 shows the used fishnet. The green dots represent the points which were

used as location input in the query.

Figure 3.4: Fishnet over the study area.

Square areas would avoid overlap between the queries, whereas circular areas give overlapping results. The Google Places API only returns results for circular areas, although square areas would have been more convenient. Therefore, results outside the study area and duplicate retailer stores occurred; these were removed.
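Removing the duplicates and out-of-area hits can be sketched as follows; the `place_id` identity key and the rectangular bounds are assumptions for illustration:

```python
def clean_results(results, bounds):
    """Drop duplicate stores (returned by overlapping circular queries) and
    stores outside the rectangular study area.
    bounds = (min_lat, max_lat, min_lon, max_lon)."""
    min_lat, max_lat, min_lon, max_lon = bounds
    seen, cleaned = set(), []
    for store in results:
        inside = (min_lat <= store["lat"] <= max_lat
                  and min_lon <= store["lon"] <= max_lon)
        if inside and store["place_id"] not in seen:
            seen.add(store["place_id"])
            cleaned.append(store)
    return cleaned

# One duplicate and one out-of-area store are removed.
raw = [
    {"place_id": "a", "lat": 52.0, "lon": 5.5},
    {"place_id": "a", "lat": 52.0, "lon": 5.5},  # duplicate from an overlapping circle
    {"place_id": "b", "lat": 53.9, "lon": 5.5},  # outside the study area
]
cleaned = clean_results(raw, (51.8, 52.3, 5.2, 6.1))
print(len(cleaned))  # → 1
```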

A second problem also occurred. The bakery query required an extra filter, because otherwise companies using the common Dutch family name “Bakker” would also be returned. The “types” filter was therefore added to the bakery query. No types filter was available for butchers and greengroceries; fortunately, a short examination on Google Maps showed that only butchers and greengroceries were returned for the corresponding keywords. Four URL queries were generated to extract supermarket, bakery, butcher, and greengrocery locations:

https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=supermarkt&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4

https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=bakker&types=bakery&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4

https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=slager&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4

https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=groentewinkel&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4

3.2. Optimisation method

3.2.1. Optimal locations

Numerous methods are available for location optimisation. Every allocated location changes the competition variable. Therefore, an exhaustive search optimising three locations in a study area with 24,000 road segments would have to evaluate 2.3 × 10¹² potential solutions, each requiring a re-computation of the density, which is computationally expensive. An optimisation method was implemented to support a food specialist retailer who wants to open multiple stores; in that case, more than one location needs to be found simultaneously.

In this case, only heuristic optimisation methods were feasible, because there is no analytic

solution while exhaustive search over the solution space is infeasible. The heuristic approaches

which were tested in this thesis were a greedy algorithm and simulated annealing.

3.2.2. Greedy algorithm

In this greedy algorithm, one store is added at a time, each time re-computing the competition variable and selecting the next optimal solution until n stores are allocated. In case of multiple equally optimal sites, the location was chosen by random assignment. If only a single store is to be allocated, the greedy algorithm used is optimal; otherwise the approach is only locally optimal, since the allocated stores influence the competition variable without taking the next to-be-allocated stores into account.


I developed a script which calculated the utility for every road segment in the study area; the pseudo code for this script can be found in box 3.1. The road segments with the highest utilities were found and their ID numbers were put in a list. A random optimal solution was picked if more than one optimal solution was listed. The randomly picked road segment was saved as the optimal location and its midpoint was found. If more than one optimal location was required, a service area was generated around that midpoint. The count value was set to four, to make sure that no cannibalisation effect would occur; the competition density thus increased heavily around the newly found location. The KDE values were added to the competition variable, and the newly calculated competition variable was used for finding the next location.

Set bandwidth for KDE and define kernel functions
Set number of stores and variable weights
Loop through number of stores
o Compute KDE given the current spatial configuration of stores
o Loop through all road segments
- Find variable values
- Calculate utility
- If utility is larger than current optimal utility, then
-- Save road segment in a new list
-- Utility becomes the optimal utility
- If utility is equal to current optimal utility, then
-- Add road segment to optimal solution list
o Pick random optimal solution from optimal solution list
o Save found road segment as optimal location and add it to current configuration

Box 3.1: Pseudo code greedy algorithm in a network space.

3.2.3. Simulated annealing

Besides the greedy algorithm optimisation method, a simulated annealing approach was developed for optimising multiple store locations simultaneously. The script was quite different from the greedy algorithm script (box 3.1). Due to time constraints in combination with computation time, the maximum number of iterations was set to 20,000; more iterations would have increased the quality of the results. Besides the maximum number of iterations, a maximum number of non-accepted solutions was set to 5,000. The number of non-accepted solutions is the number of times that a proposed composite of road segments was not selected as the new current composite of road segments. The cooling ratio was set to 0.9995, which

decreases the temperature in each iteration. The approach was only tested using data on bakeries

and supermarkets for the simultaneous allocation of five bakeries. The initial temperature was

set to four and the bandwidth was set to 500 (table 3.1).

Table 3.1: Simulated annealing settings.

Number of iterations 20,000

Maximum number of non-selected solutions 5,000

Number of stores 5

Cooling ratio 0.9995

Location data Bakeries and supermarkets

Initial temperature 4

Bandwidth 500

Box 3.2 shows the pseudo code of the simulated annealing process utilised in this thesis. Initially, five randomly chosen road segments were selected and the joint utility was calculated for this composite of stores. Then, an iterative process started, which ran until the maximum number of iterations or the maximum number of non-selected solutions was reached. In each iteration, one randomly selected location was moved to a different randomly selected road. To avoid a cannibalisation effect, points within 500 metres of the other four points were not allowed to be selected. Accordingly, no new density map had to be computed, which would have been a computationally expensive procedure. As no cannibalisation effect was allowed to occur, the new proposed joint utility was readily calculated. The proposed road segments were accepted when the proposed joint utility was higher than the current joint utility. If the proposed joint utility was lower than the current joint utility, the proposed road segments were accepted with a certain probability. The last accepted combination of road segments was saved as the optimal solution. The script also returned two lists: a list with the probabilities that a lower utility would be selected and a list containing the successively found joint utilities. These lists were used to create traces, which were utilised to examine the simulated annealing settings.


Run simulated annealing function, select number of stores and variable weights:
o Set cooling ratio, initial temperature, maximum number of iterations, maximum number of non-accepted solutions, and the bandwidth
o Randomly find initial road segments
o Loop through initial road segments:
- Calculate utility
- Add utility to initial joint utility variable
o Loop until number of iterations or number of non-selected solutions has reached its maximum
- Find random segment out of the selected road segments, which moves to another proposed road segment
- Find random proposed road segment
- Check that no cannibalisation occurs:
-- Find distances between the proposed segment and the selected segments which did not move
-- If every distance between the proposed midpoint and the other selected midpoints is larger than the bandwidth, then this point is proposed
-- Else, propose another segment until no cannibalisation occurs
- Calculate new proposed joint utility
- Find difference in joint utility (proposed joint utility − current joint utility) and calculate probability (equation 2)
- If difference is smaller than zero, then add probability to probability list
- If difference is larger than or equal to zero, then proposed road segments are accepted
- If difference is smaller than zero and pseudo-random number is smaller than the probability, then proposed road segments are accepted
- If difference is smaller than zero and pseudo-random number is larger, then proposed road segments are not accepted
- Temperature cools down (cooling ratio)
- Add joint utility value to joint utility list
o Save last accepted road segments
Open text files to write the list with the consecutive joint utilities and the list with the probabilities that a worse solution is accepted

Box 3.2: Pseudo code simulated annealing.
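The pseudo code of box 3.2 can be condensed into a runnable sketch; the segment names, utilities, coordinates, and the planar distance used for the spacing check are illustrative stand-ins for the real road network:

```python
import math
import random

def simulated_annealing(utilities, coords, n_stores, bandwidth,
                        t0=4.0, cooling=0.9995, max_iter=2000, seed=42):
    """Simultaneously pick n_stores segments maximising the joint utility,
    keeping every pair of chosen segments more than `bandwidth` apart."""
    rng = random.Random(seed)
    segs = list(utilities)

    def dist(a, b):
        (x1, y1), (x2, y2) = coords[a], coords[b]
        return math.hypot(x1 - x2, y1 - y2)  # planar stand-in for network distance

    def spaced(candidate, others):
        return all(dist(candidate, o) > bandwidth for o in others)

    # Initial state: a random configuration that satisfies the spacing rule.
    current = []
    while len(current) < n_stores:
        c = rng.choice(segs)
        if c not in current and spaced(c, current):
            current.append(c)

    joint = sum(utilities[s] for s in current)
    temperature = t0
    for _ in range(max_iter):
        i = rng.randrange(n_stores)        # move one randomly chosen store...
        candidate = rng.choice(segs)       # ...to a randomly chosen segment
        others = current[:i] + current[i + 1:]
        if candidate in others or not spaced(candidate, others):
            temperature *= cooling
            continue                       # cannibalisation: try again next iteration
        new_joint = joint - utilities[current[i]] + utilities[candidate]
        diff = new_joint - joint
        if diff >= 0 or rng.random() < math.exp(diff / temperature):
            current[i], joint = candidate, new_joint
        temperature *= cooling             # temperature cools down every iteration
    return current, joint

# Toy example: four well-separated hypothetical segments.
utilities = {"a": 1.0, "b": 5.0, "c": 4.0, "d": 0.5}
coords = {"a": (0, 0), "b": (2000, 0), "c": (0, 2000), "d": (2000, 2000)}
chosen, joint = simulated_annealing(utilities, coords, n_stores=2, bandwidth=500)
assert abs(joint - sum(utilities[s] for s in chosen)) < 1e-9
```

Unlike the greedy loop, all selected segments remain free to move until the temperature has cooled, which is what allows the jointly better configuration to be found.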


3.2.4. From road segments to areas

Not every road segment is suitable for establishing a small food venture; there is a possibility that no space is available to open a new store. Therefore, an optimal area was created around each optimal road segment, which increases the chance that store space is available. Besides that, entrepreneurs are able to take other variables, like rental costs or store attractiveness, into consideration. Using a normal buffer function was not logical, because the prior distance calculations were done in a network space. Therefore, once again, service areas were generated, this time with break values.

3.3. Case study

3.3.1. Optimal solutions

A scenario was created, named “scenario 1”. In this scenario, three optimal locations were to

be determined for new bakery ventures with the variable weights set to one (table 3.2). The

locations were found with the greedy algorithm method.

Table 3.2: Scenario 1 case study.

Scenario 1

Retailer type Bakery

Number of stores 3

Density weight 1

Demand weight 1

Attractiveness weight 1

Optimisation method Greedy algorithm

After the optimal locations were found, I wanted to assess whether the cannibalisation effect occurred. Therefore, I visualised the optimal locations in two different maps: the original density map and the newly calculated density map.

Every optimal road segment was part of an optimal site: a composite of consecutive road segments with the highest utilities in the study area. Optimal sites arose because the road segments with the highest utilities were always neighbours. The size of the optimal sites varied.

The second scenario was conducted with simulated annealing. The variable weights remained one. Again, the case was executed with location data of supermarkets and bakeries. The number of stores was set to five. The settings varied slightly from those in table 3.1: the number of iterations was set to 50,000 and the maximum number of non-selected solutions was set to 20,000, to increase the quality of the results.

Table 3.3: Scenario 2 case study.

Scenario 2

Number of iterations 50,000

Maximum number of non-selected solutions 20,000

Number of stores 5

Cooling ratio 0.9995

Retailer type Bakery

Initial temperature 4

Bandwidth 500

Density weight 1

Demand weight 1

Attractiveness weight 1

Optimisation method Simulated annealing

3.3.2. User interface

A site selection method should enable users to adjust parameter values and data (Benoit & Clarke, 1997), and professionals in the sector can indicate the important site criteria (Roig-Tierno et al., 2013). Therefore, an interactive map was developed to show the results to users, who are mainly small food company owners. The user of the interactive map indicates the food retailer type and the number of stores, and adjusts the weights of the criteria. This section helped me think about how to visualise the information and how to distribute the results to stakeholders.

The Python script to generate the optimal location(s) and the adjusted competition variable was run from within the R code. The internal function “system”, which invokes an OS command, was required to start the Python script. The input command was “python file path”, where the file path was the location where the optimisation Python script was saved. The system function only worked when R and Python were added to the environment variables; this was only required for the Windows operating system. Environment variables affect the behaviour of running processes. The variable weights and the type of food retailer were retrieved from a text file generated by the main R script (Appendix IV). The Python script required some minor changes to cope with the GUI; complementary code was implemented to read the values from the files. The input road data and the input midpoint data were determined by those values. Also, earlier generated

optimal solutions were deleted. The optimal points and areas were transformed to WGS84,

because this is required for the background map. The optimisation function returned two lists:

an optimal road midpoint list and the optimal area list. All files in the list were converted into

shapefiles, because shapefiles can be loaded in the R script.
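The hand-off described above can be illustrated from the Python side. The sketch below round-trips parameters through a plain text file, analogous to the file written by the main R script; the file name and the "key value" line format are assumptions for illustration, not the thesis' actual files.

```python
import tempfile
from pathlib import Path

def write_params(path, retailer="Bakery", stores=5,
                 density_w=1.0, demand_w=1.0, attract_w=1.0):
    """Write one 'key value' pair per line, as a caller (e.g. R) might."""
    lines = [f"retailer {retailer}", f"stores {stores}",
             f"density_weight {density_w}", f"demand_weight {demand_w}",
             f"attractiveness_weight {attract_w}"]
    Path(path).write_text("\n".join(lines), encoding="utf-8")

def read_params(path):
    """Parse the parameter file back into a dict, converting numbers."""
    params = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        key, value = line.split(maxsplit=1)
        try:
            value = float(value)
        except ValueError:
            pass  # non-numeric values (e.g. the retailer type) stay strings
        params[key] = value
    return params

# Round trip through a temporary file, mimicking the R-to-Python hand-off.
param_file = Path(tempfile.mkdtemp()) / "params.txt"
write_params(param_file, retailer="Bakery", stores=5)
params = read_params(param_file)

# On the R side the script was started with system("python file path");
# the equivalent call from Python itself would be:
# subprocess.run(["python", "optimise.py"], check=True)
```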

The R programme to visualise the optimal locations consists of three elements. One element executes the Python script, the second retrieves the optimal solution files created by the Python script, and the last visualises the results in an interactive map view. Furthermore, the resulting raster maps were imported into R for visualisation to the user.

A combination of the Leaflet and Shiny packages was used to create the interactive map. The Shiny package, developed by Chang et al. (2016), creates interactive web applications in R; Shiny applications bind inputs and outputs together in a reactive way. Cheng et al. (2016) developed the Leaflet package, which uses the Leaflet JavaScript library and the HTMLwidgets package to create interactive maps. The maps can be used directly from RStudio or as a web service.

The user can select the type of food specialist for which optimal locations are to be found: bakeries, butchers, or greengroceries. Subsequently, the variable weights and the number of stores are filled in via the GUI. The weights can be set between zero and one, in steps of 0.1. The number of stores must be an integer between one and ten. An action button named generate optimal location(s) starts the Python script that calculates the optimal solutions. Two checkboxes can be used to turn the demand and site attractiveness raster maps, with their corresponding legends, on and off; these give the user extra information about the study area. Generating optimal solutions takes considerable time, certainly if the number of stores is set to more than one. To prevent multiple simultaneous calls to the Python function, or changes to the parameters while an optimisation is running, the inputs and the generate button are frozen during execution of the Python scripts.
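The input constraints described above (weights between zero and one in steps of 0.1, an integer number of stores between one and ten) can be checked before the optimisation starts. The function below is a minimal, hypothetical validation sketch, not part of the thesis application:

```python
def validate_inputs(retailer, n_stores, weights):
    """Check GUI inputs against the constraints described above."""
    allowed = {"bakery", "butcher", "greengrocery"}
    if retailer.lower() not in allowed:
        raise ValueError(f"unknown retailer type: {retailer!r}")
    if not (isinstance(n_stores, int) and 1 <= n_stores <= 10):
        raise ValueError("number of stores must be an integer in [1, 10]")
    for name, w in weights.items():
        # Weights move in steps of 0.1 between zero and one; allow for
        # floating-point noise when checking the step size.
        if not (0.0 <= w <= 1.0 and abs(round(w * 10) - w * 10) < 1e-9):
            raise ValueError(f"{name} weight must be 0.0, 0.1, ..., 1.0")
    return True
```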

3.4. Sensitivity analysis

A sensitivity analysis was executed to find out whether the outputs were stable when a different food retailer type was selected and when one weight was set to zero. Sensitivity analysis tests how uncertainty in the output can be apportioned to different parameter settings (Saltelli, 2002). It can also be used to assess to what extent the variables and the variable weights influence the end result.

One-factor-at-a-time (OAT) analysis is the most used form of sensitivity analysis: one input factor is changed while the others are kept fixed (Saltelli & Annoni, 2010). Several scenarios were created to test the effect on the result; the variable weights and the food specialist type were adjusted one at a time to assess the impact on the result.
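The OAT scheme amounts to generating one scenario per perturbed factor while all other factors keep their baseline values. The generator below is an illustrative skeleton; the baseline and perturbations only approximate the scenarios of tables 3.4 and 3.5:

```python
def oat_scenarios(baseline, perturbations):
    """Yield (name, settings) pairs: one factor changed at a time."""
    yield "baseline", dict(baseline)
    for factor, values in perturbations.items():
        for value in values:
            scenario = dict(baseline)
            scenario[factor] = value  # all other factors keep baseline values
            yield f"{factor}={value}", scenario

baseline = {"retailer": "greengrocery", "density_w": 1,
            "demand_w": 1, "attractiveness_w": 1}
perturbations = {"retailer": ["bakery", "butcher"],
                 "density_w": [0], "demand_w": [0], "attractiveness_w": [0]}
scenarios = list(oat_scenarios(baseline, perturbations))
# 1 baseline + 2 retailer swaps + 3 zeroed weights = 6 scenarios
```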

For convenience, the sensitivity analyses of the weights and the food retail store type were executed with the greedy algorithm; its computation time was lower and its settings were easier to configure. I assessed visually whether the patterns of the new venture locations corresponded when the parameters were changed. A fishnet, like the one already shown in figure 3.4, was used as a template to assess in which area an optimal solution was located.

First, the sensitivity to the food retailer type was determined, to assess whether the site selection method gave different optimal locations for bakeries, butchers, and greengroceries. For every retailer type, five stores were calculated and all weights were set to one (table 3.4).

Table 3.4: Scenarios to assess the sensitivity of the food retailer type.

Retailer type   Number of stores   Density weight   Demand weight   Attractiveness weight   Optimisation method
Bakery          5                  1                1               1                       Greedy algorithm
Butcher         5                  1                1               1                       Greedy algorithm
Greengrocery    5                  1                1               1                       Greedy algorithm

The second test was conducted to evaluate how the variables influenced the selected locations. This test was executed for the retailer type "greengrocery"; the decision to test with greengroceries was arbitrary. In turn, the competition, demand, and attractiveness weights were set to zero to examine whether the optimal site changed. This was assessed visually. Table 3.5 summarises the scenarios created to assess the sensitivity of the variables.


Table 3.5: Scenarios to assess the sensitivity of the variables.

Retailer type   Number of stores   Density weight   Demand weight   Attractiveness weight   Optimisation method
Greengrocery    1                  1                1               0                       Greedy algorithm
Greengrocery    1                  1                0               1                       Greedy algorithm
Greengrocery    1                  0                1               1                       Greedy algorithm
Bakery          1                  1                1               0                       Greedy algorithm
Bakery          1                  1                0               1                       Greedy algorithm
Bakery          1                  0                1               1                       Greedy algorithm
Butcher         1                  1                1               0                       Greedy algorithm
Butcher         1                  1                0               1                       Greedy algorithm
Butcher         1                  0                1               1                       Greedy algorithm


4. Results

The results chapter visualises the results of the research questions individually. No

interpretation or discussion is included in this chapter.

4.1. Objective function

4.1.1. Explorative analysis

Competition

Figure 4.1 shows the spatial distribution of the considered food retailers in the study area. The data were retrieved on 25-08-2016 from the Google Places API. All four food retailer types showed similar patterns: most retailers were located in the three biggest cities, but clusters of shops were also found in the smaller cities and villages. Results outside the data collection area were not yet removed in figure 4.1.

The retailer location data were not validated in this thesis, since Sienkiewicz (2015) already found that this food retailer data retrieval was not perfect: some food ventures were not found, while some non-existent stores popped up. For the development of the site selection method this was not an issue. However, to use the method in practice, testing the retailer location data for errors is inevitable.


Figure 4.1: Locations of retrieved food retailers within the study area based on Google maps.

The numbers of shops per food retailer type are shown in table 4.1. The data collection area contained more supermarkets and bakeries than butchers and greengroceries.


Table 4.1: Number of food retailers in the study area.

Type of food retailer   Number of shops found
Supermarkets            396
Bakeries                356
Butchers                173
Greengroceries          171
Total                   1,096

During the explorative analysis it was found that the study area included too many stores to compute the network KDE within two days. The large number of supermarkets and bakeries meant that approximately 700 unique service areas had to be generated for the original study area, with approximately 300 points each when the bandwidth was set to 500 metres. Therefore, the study area was adjusted to represent only Amersfoort and its surroundings. As a consequence, the number of generated service areas was reduced to 101, which reduced the computation time drastically. The adjusted study area is visualised in figure 4.2.


Figure 4.2: The Amersfoort study area.

Figure 4.3 shows the spatial distribution of the considered food retailers in the Amersfoort study

area. The four different food retailer types had similar location patterns; shops are spread all

over the study area, but densities are highest near the city centre.


Figure 4.3: Location of bakeries, butchers, supermarkets and greengroceries in the Amersfoort study area.

Figure 4.4 presents an example of the roads in the centre of Amersfoort, where the source count is visualised together with the midpoints of every road segment. The grey lines represent roads without nearby retailers. The slightly thicker green lines symbolise roads with one store alongside, and the even thicker lines represent roads with more nearby stores (two, three, or more).


Figure 4.4: Road network of the city centre of Amersfoort, categorised by the number of retailers that are closest by.

Figure 4.5 is an example of multiple overlapping service areas. The bandwidth was set to 500 metres; the polygons are not circular because a network space was used.


Figure 4.5: Service areas around source points. Notice that service areas may overlap.

The outcome of the network KDE was a quantification of the density of retail stores on the road network. Figures 4.6 and 4.7 show the results of executing the network KDE for a road network clipped to Amersfoort. The food retailer hotspots are shown as green and blue areas; the individual road segment values are not visible in these figures. A zoom-in of the city centre, with supermarkets and bakeries added, was included in figure 4.7, where the individual road segment values are better visible. The computation time was around half a day for the 250 metre bandwidth and approximately two days for the 500 metre bandwidth. Figure 4.7 shows that the city centre of Amersfoort is considered a hotspot for food retailers.
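Per road segment midpoint, the network KDE sums the kernel contributions of all stores whose network distance lies within the bandwidth. The sketch below is a simplified illustration: network distances are assumed to be precomputed (in the thesis they follow from the service areas), and the quartic kernel is an assumption for illustration, not necessarily the kernel used in the thesis.

```python
def kernel_density(midpoints, stores, network_dist, bandwidth=500.0):
    """Sum kernel contributions of all stores within the bandwidth of each
    road segment midpoint, using network (not Euclidean) distance."""
    densities = {}
    for mid in midpoints:
        total = 0.0
        for store in stores:
            d = network_dist(store, mid)
            if d <= bandwidth:
                # Quartic (biweight) kernel, assumed for illustration.
                total += (15.0 / 16.0) * (1.0 - (d / bandwidth) ** 2) ** 2
        densities[mid] = total
    return densities

# Toy network: distances given directly as a lookup table.
dist_table = {("s1", "m1"): 100.0, ("s1", "m2"): 600.0,
              ("s2", "m1"): 450.0, ("s2", "m2"): 50.0}
lookup = lambda s, m: dist_table[(s, m)]
dens = kernel_density(["m1", "m2"], ["s1", "s2"], lookup)
```

Store s1 at 600 metres from m2 falls outside the 500 metre bandwidth and therefore contributes nothing there, which is exactly why the bandwidth choice shapes the competition variable.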


Figure 4.6: Network KDE densities of Amersfoort with a bandwidth of 250 metres.


Figure 4.7: Network KDE densities of Amersfoort with a bandwidth of 500 metres.

Demand

The normalised demand computed from CBS (2014) data is presented in figure 4.8. The highest

value was 2,560 inhabitants per 500 x 500 square metres. The city centre and some newer

38

residential areas, are presented by clusters of dark green squares. Figure 4.9 presents the

demand in the network space.

Figure 4.8: Normalised demand in the Amersfoort study area.


Figure 4.9: Normalised demand variable on a network space.

Attractiveness

Figure 4.10 shows the normalised number of elderly people in the Amersfoort study area.

Figure 4.11 visualises the normalised number of people with high income. People with higher

incomes tend to inhabit at the border of the city. Especially, the south-west of Amersfoort is the

residence for families with higher income. Figure 4.12 presents the city centres and the shape

of Amersfoort. The area in the right-bottom corner is Leusden and the part on the east side of

Amersfoort is Hoevelaken. Figure 4.13 is the result of equation 6 and the image presents the

most attractive locations for small food retailers. The attractiveness map shows that the most

attractive locations are always in urban areas. The south of Amersfoort was slightly more

attractive than the north.
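The attractiveness layers are combined after normalisation. The sketch below illustrates min-max normalisation to [0, 1] and an equally weighted combination of three layers; the equal weights and the example values are assumptions for illustration (the actual combination is given by equation 6).

```python
def normalise(values):
    """Min-max normalisation to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant layer carries no information
    return [(v - lo) / (hi - lo) for v in values]

def attractiveness(elderly, income, urbanity):
    """Equally weighted mean of three normalised layers (illustrative)."""
    layers = [normalise(elderly), normalise(income), normalise(urbanity)]
    return [sum(vals) / len(layers) for vals in zip(*layers)]

# Toy regions: three cells with raw counts / urbanity classes.
att = attractiveness(elderly=[120, 40, 80],
                     income=[300, 100, 200],
                     urbanity=[5, 1, 3])
```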


Figure 4.10: Normalised number of elderly people.

Figure 4.11: Normalised number of people with high income.

Figure 4.12: Normalised urbanity classification.

Figure 4.13: Normalised attractiveness variable.

Figure 4.14 represents the attractiveness variable in the network space. The figure shows that existing food retailers mostly spatially coincide with areas that are assumed to be attractive according to this thesis' approach.


Figure 4.14: Existing food retailers on attractiveness variable map in the city centre of Amersfoort.

4.1.2. Composition

Figure 4.15 presents the utility map of scenario 1 (table 3.2). A block pattern is visible, with small differences inside the blocks. Due to the competition variable, the city centre does not have the highest utilities; the dark blue areas on the west and south sides of the city centre had the highest utilities for this scenario.
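The composite utility behind figure 4.15 is a weighted sum of the three variables, in which competition lowers the utility. The sketch below is an assumed formulation for illustration; the exact form and sign conventions are given by the thesis' objective function.

```python
def utility(density, demand, attract,
            w_density=1.0, w_demand=1.0, w_attract=1.0):
    """Weighted-sum utility per road segment. Competition (density) is
    assumed to enter negatively; this sign convention is an assumption."""
    return w_demand * demand + w_attract * attract - w_density * density

# Toy segments with (density, demand, attractiveness) values in [0, 1].
segments = {
    "city_centre": (0.9, 0.8, 0.9),   # high competition offsets high demand
    "west_edge":   (0.1, 0.7, 0.8),   # little competition, good demand
}
scores = {name: utility(*vals) for name, vals in segments.items()}
```

With equal weights, the segment just outside the centre outscores the centre itself, mirroring the pattern in figure 4.15.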


Figure 4.15: Composite utility map of Amersfoort corresponding to scenario 1.

4.2. Optimisation method

4.2.1. Greedy algorithm

The greedy algorithm has no settings to adjust and was therefore easy to configure; only the scenario of the composite utility function had to be chosen. The computation was quite fast: it took around a minute to find one optimal location.
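The greedy procedure can be sketched as follows: repeatedly take the road segment with the highest utility, then update the competition variable around it as if a store had been placed there (in the thesis this is done by recomputing the network KDE). The fixed penalty and the neighbour lists below are simplifying assumptions for illustration.

```python
def greedy_select(utilities, neighbours, n_stores, penalty=0.5):
    """Iteratively select the best segment, then penalise its neighbourhood
    to mimic the increased competition after placing a store there."""
    utilities = dict(utilities)  # work on a copy
    chosen = []
    for _ in range(n_stores):
        best = max(utilities, key=utilities.get)
        chosen.append(best)
        utilities[best] = float("-inf")        # never pick the same segment
        for nb in neighbours.get(best, []):    # segments within the bandwidth
            if utilities[nb] != float("-inf"):
                utilities[nb] -= penalty
    return chosen

# Toy example: four segments, "a" and "b" are neighbours, as are "b" and "c".
u = {"a": 1.4, "b": 1.3, "c": 1.0, "d": 0.9}
nbrs = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
picks = greedy_select(u, nbrs, n_stores=3)
```

Segment "b" starts as the second-best option but is penalised twice (once per chosen neighbour) and is never selected, which is the cannibalisation-avoiding behaviour described in section 3.2.2.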

4.2.2. Simulated annealing

The computation time for the simulated annealing with the settings of table 3.1 was approximately 7.5 hours. Figure 4.16 shows the trace of the probability that a combination of road segments with a lower joint utility than the current combination was selected as the new current state. The sequence number is the sequential number of every iteration where the proposed joint utility was lower than the current joint utility.


Figure 4.16: Probability trace of simulated annealing for five simultaneous store locations.

In the beginning, the probability was between 0.7 and one. Until around sequence number 3,200, there was always a chance that a worse joint configuration of five store locations was selected. The average probability decreased until it reached zero around sequence number 10,000; after that, only road segment compositions with higher joint utilities were accepted.
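The shape of this trace follows from the Metropolis acceptance rule, p = exp(ΔU/T), combined with geometric cooling. The short sketch below reproduces the qualitative behaviour for an arbitrary, assumed utility decrease of ΔU = -0.5; it is an illustration, not the thesis' actual trace.

```python
import math

t0, cooling = 4.0, 0.9995   # settings from table 3.3
delta_u = -0.5              # an arbitrary, assumed utility decrease

def acceptance(iteration):
    """Metropolis probability of accepting a worse solution at an iteration."""
    temperature = t0 * cooling ** iteration
    return math.exp(delta_u / temperature)

early, late = acceptance(0), acceptance(20_000)
# early is high; late is essentially zero, matching the collapse in figure 4.16.
```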

The joint utility of the consecutively chosen joint configurations of road segments is visualised in figure 4.17.

[Figure 4.16 chart: probability (y-axis) against sequence number (x-axis), titled "Probability trace".]


Figure 4.17: Trace of the joint utility of the accepted road segment composition.

Until iteration 10,000 there was a chance that a joint configuration of five store locations with a lower joint utility was accepted as the new current state. The last found joint utility was 13.98, slightly lower than the joint utility of the greedy algorithm (14.20). It was expected that the simulated annealing approach would outperform the greedy algorithm if the maximum number of iterations were increased and the other settings optimised.

4.3. Case results

4.3.1. Optimal solutions

Figure 4.18 presents the selected locations for scenario 1 (table 3.2); the weights were set to one, the number of stores to three, and the optimisation method to the greedy algorithm.

[Figure 4.17 chart: joint utility (y-axis) against iteration number (x-axis), titled "Accepted road segment composition trace".]


Figure 4.18: Three optimal locations for new bakeries in Amersfoort determined with the greedy algorithm.

Figure 4.19 shows how store locations added by the greedy algorithm affect the KDE. The right image is the density map after the network KDE was recomputed for a newly added location. It can be observed that roads around the first selected location were classified differently in the two images; this is also visible around the second optimal location, where neighbouring roads were assigned to a higher category.


Figure 4.19: Competition variable map, prior (left) and after (right) adding two stores by optimisation.

In scenario one, the greedy algorithm performed better than expected, since the gaps between the optimal sites were larger than 500 metres and the optimal sites were mostly smaller than 500 metres. If an optimal site is larger than 500 metres, it can occur that a first location is selected in such a way that the increase in the competition variable makes the rest of the site no longer optimal. In such a case, it is more beneficial for the joint utility to select two road segments on the edges of the optimal site. Figure 4.20 shows the optimal sites. The optimal site of the third iteration was larger than 500 metres; fortunately, optimal location "GA-3" was on the edge of this site, so road segments were still available for "GA-4".


Figure 4.20: The road network is visualised together with the optimal sites and optimal locations for four

iterations; GA stands for greedy algorithm and the number presents the store location number.

Figure 4.21 presents the selected locations for scenario 2 (table 3.3); the weights were set to one, the number of stores to five, and the optimisation method to simulated annealing. The joint utility equalled 14.20. A joint configuration of five road segments generated with the greedy algorithm gave approximately the same joint utility.


Figure 4.21: Five optimal locations for new bakeries in Amersfoort determined with simulated annealing.

4.3.2. User interface

Figure 4.22 is an image of the interactive map after pressing the generate optimal location(s) button. The background map is from OSM. A green loading bar at the top of the screen reports that the optimal solutions are being calculated. The input and generate buttons on the right side of the screen are frozen; however, it is still possible to show the demand or attractiveness raster on the map. The legends are visible in the bottom-right corner. The application automatically zooms to the study area; manual zooming and panning are also supported. In figure 4.22 the demand raster layer is checked, and that map is shown as the coarse raster layer.

Figure 4.23 shows a manually zoomed-in result within the user interface. The two blue markers represent the selected locations, which were outside the city centre on the west side of Amersfoort. To easily distinguish the demand and attractiveness rasters from the optimal areas, the optimal areas were colourised with a blue palette.


Figure 4.22: GUI showing a map of Amersfoort with an information bar, a demand raster, demand legend, disabled parameter inputs, and a disabled execute button.


Figure 4.23: GUI showing a map of Amersfoort with two optimal locations, two optimal areas, optimal area legend, enabled parameter inputs, and an enabled execute button.


4.4. Sensitivity analysis

Figure 4.24 presents a part of the fishnet that was used to find the patterns of the optimal locations. The rows are indicated by numbers and the columns by letters. The map was zoomed in to make the optimal locations distinguishable. Figure 4.24 visualises the results of the scenarios in table 3.4. Selected locations 1 and 2 were the same for all three food retailer types.

Figure 4.24: Optimal locations if the number of stores is set to five (greedy algorithm).

The demand and attractiveness variables were equal for all three retailer types; only the competition variable can cause a difference in the patterns between retailer types. Noticeably, the first two optimal solutions were the same for all types ("11E" and "10E"). The third optimal location was different for the butcher ("13G"), because an existing butcher lies south of the third optimal location of the bakery and greengrocery ("12G"). The fourth locations of the greengrocery and bakery are on neighbouring road segments in area "13G". The fourth location of the butcher was in an entirely different area ("11H"), because the third selected location for the butcher was already found in area "13G". The fifth location of the bakery is located between optimal locations 1 and 2 ("11E"), on the edge of the service area of optimal location 1. The fifth locations of the butcher and greengrocery are located at "11H".


The second test examined the impact of neglecting one variable (table 3.5). Figure 4.25 shows the result of this analysis; not the whole study area is visualised, because otherwise the symbols would be hard to distinguish. Some symbols were enlarged to show that they are at the same location as another symbol.

Figure 4.25: Visualisation of the results when neglecting one variable; GA stands for greedy algorithm, com0 indicates the competition weight set to zero, dem0 the demand weight set to zero, and att0 the attractiveness weight set to zero.

When the competition variable was neglected, the selected locations did not differ much from the selected locations for scenario 1 (section 4.3). The competition variable did not change the 500 by 500 metre square in which the location was found. On the contrary, the value of the competition variable did cause considerable changes in the selected store locations within the 500 by 500 metre squares. When the competition variable is neglected, the 500 by 500 metre squares have an identical utility, so the optimal location within a square was always a random choice. The competition variable had large areas with zero density and more variability in general than the two other variables. This affected the results, because the variables contributed differently to the utility value. The competition values were more variable because a density value was calculated for every individual road segment, whereas the demand and attractiveness values were the same for a whole region.


Area 4E had the best spatial dispersion of retailers and consumers in Amersfoort; however, the area was not as attractive as the surroundings of the city centre (figure 4.14), because in area 4E the population had a lower percentage of elderly and high-income inhabitants.


5. Discussion

This chapter discusses the individual results of the research questions. Links are made between the findings of this thesis and the findings of other studies. Shortcomings and other special circumstances are described in the sections below.

5.1. Objective function

The demand (figure 4.9), attractiveness (figure 4.14), and competition (figures 4.6 and 4.7) variables were outdated, and future demographic predictions were not included. This applies to both the socio-demographic data and the retailer location data. As a result, the method is a short-run approach. The site selection method does not anticipate a possible response by other retailers in the study area; competitors could react to or anticipate the site selection of the user of the method, but the method assumed that the locations of competitors were fixed.

Almost 1,100 retailers were found (table 4.1) for the initially proposed study area (figure 3.2). The computation time was approximately one day when the study area was set to city scale (figure 4.2) and the bandwidth to 250 metres (figure 4.6). Setting the bandwidth to 500 metres increased the computation time drastically: it was around three days with otherwise equal settings (figure 4.7). This means that the computation time of the network KDE grew very strongly with the bandwidth. When a bandwidth of 250 metres was selected, areas with a value of zero predominated the study area, so most areas had zero competition and little variation was found in the competition variable. Since the calculation of a feasible density map of one city already took three days, the network KDE was not suitable for a regional study area.

The choice of the bandwidth (section 2.3) affected the competition variable (figures 4.6 and 4.7). Rui et al. (2016) also selected a bandwidth of 500 metres for the calculation of a network KDE at city scale, because otherwise their results turned out to be spiky. There are two main differences between my network KDE method and theirs. The first is that I chose not to divide the road segments into equal sizes, because of the extra computation time and the frequency of road intersections in the Amersfoort study area. As a result, fewer segment midpoints fell within the bandwidth of a given point: longer road segments had a smaller chance of being within the bandwidth of a store, because their midpoints were further away. Outskirts generally have fewer road intersections than city centres, so this effect occurred mainly in the outskirts.


Another difference is that stores and roads were unweighted in my method, which also affected the results of the competition variable. Weighting stores would have been useful, because some stores attract more customers than others. The retail location data (figures 4.1 and 4.3) did not contain information on the number of customers a store attracts; however, the location data were easily obtained and updated.

Rui et al. (2016) also used weights for roads, assigning higher values to bigger roads. However, it is not always the case that bigger roads are faster, certainly not in a city centre. Moreover, the mode of transport differs among consumers. Therefore, my proposed method did not include weighted roads.

5.2. Optimisation method

As mentioned in section 4.2, the greedy algorithm only finds local optima; the simulated annealing approach can find the joint optimal solution given the right settings and number of iterations. A good configuration of the settings is essential for implementing simulated annealing. The initial temperature (T0), for example, is important for the acceptance probability function (Ben-Ameur, 2004). Kirkpatrick et al. (1983) suggest that the initial temperature should be equal to the maximum difference in utility. Improved, newer functions that find better initial temperatures are also available (Ben-Ameur, 2004), but these were not implemented in this thesis.

For the simulated annealing, 20,000 iterations were executed (figure 4.18). However, the study area consists of 24,000 road segments, so the number of iterations was fairly small. The computation time was already 7.5 hours for 20,000 iterations; for that reason, the simulated annealing optimisation was only executed a few times. During those executions, the initial temperature and the cooling ratio were adjusted.

Simulated annealing (section 3.2.3) and the greedy algorithm (section 3.2.2) prevent the cannibalisation effect in different ways. The greedy algorithm calculates a new density map, in which a newly selected store location adds four times as much density to the competition variable as an existing store; it is, however, not excluded that a store location is selected within 500 metres of another selected store location. This is not possible in simulated annealing: there, store locations cannot be selected within 500 metres of another store location. This difference gives the greedy algorithm an advantage in terms of joint utility.
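The separation constraint used in the simulated annealing can be sketched as a feasibility check on a proposed configuration. The distance function below is a stand-in for the network distance used in the thesis; the one-dimensional toy positions are an assumption for illustration.

```python
def feasible(locations, distance, min_separation=500.0):
    """Reject configurations with two stores within the minimum separation."""
    for i, a in enumerate(locations):
        for b in locations[i + 1:]:
            if distance(a, b) < min_separation:
                return False  # two proposed stores are too close together
    return True

# Toy 1-D positions standing in for network distances between segments.
pos = {"a": 0.0, "b": 300.0, "c": 900.0}
dist = lambda x, y: abs(pos[x] - pos[y])
```

In the annealing loop, an infeasible candidate configuration would simply be rejected before the acceptance probability is evaluated.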


5.3. Case

5.3.1. Optimal solutions

Figure 4.21 showed the selected locations for the case; the bandwidth for the competition variable was set to 500 metres. Some locations are close to each other. The literature states that most people shop within 500 metres of their residence (Veenstra et al., 2010). However, this does not imply that shops just outside 500 metres of another shop cannot be considered competition; unfortunately, a limit had to be set to avoid excessive computation times. An equivalent bandwidth value was used to simultaneously update the density map for every found optimal location in the greedy algorithm (section 3.2.2). Therefore, the distance between two selected locations can be short.

5.3.2. User interface

The site selection user interface (figures 4.22 and 4.23) relied on three different components: Python, R, and ArcGIS. The application could only run on a computer with Python, R, and an ArcGIS licence with an additional network licence installed.

The rPython package was developed to execute Python functions in R. However, I used the system function (section 3.3.2) because the rPython package has issues on computers running Windows. There is an rPython package established especially for Windows systems, but it is mostly experimental and not widely tested. Another option would be to run the R code from within Python with the package "rpy2".

Most site selection studies only developed a method and did not consider a way to show the results to stakeholders or to handle user input. An exception is Cui et al. (2012); they combined the ArcGIS Engine developer kit with Flex and Java. ArcGIS Engine was used for the basic GIS functions (zooming and geographical querying); Flex is a mark-up language, and the programming language Java handled the different inputs and calculated the outputs. I chose the Leaflet-Shiny combination because of prior knowledge of these packages and their ease of implementation.

5.4. Sensitivity analysis

The sensitivity analysis (figures 4.24 and 4.25) was only conducted with the greedy algorithm; selecting locations with simulated annealing could have changed the sensitivity analysis results. Also, the chosen kernel for the network KDE was not varied, which could also affect the sensitivity analysis results. Moreover, the bandwidth for the competition variable was kept constant, since setting the bandwidth higher than 500 metres meant days of extra computation time, and setting it lower than 500 metres made the competition variable infeasible (section 4.1.1).


6. Conclusions, limitations and recommendations

This chapter presents the conclusions regarding the objective and the research questions, lists the limitations, and gives recommendations for future research.

6.1. Conclusions

1. Which objective function is usable for site selection for food retail stores? An objective function for store site selection should accommodate competition, demand, and attractiveness. In my research, a weighted sum function was selected so that the variable weights can be chosen by a small food retailer; moreover, it is hard to retrieve the true weights, and no studies have been conducted to find those weights. An explorative analysis showed that the required inputs were attainable.

2. Which optimisation method can be implemented to find optimal locations? Multiple heuristic methods can be implemented, but only the greedy algorithm and simulated annealing were examined in this thesis. The greedy algorithm gave the optimal solution for a single store; however, if multiple store locations are to be found, the greedy algorithm is unlikely to produce an overall optimal solution. Simulated annealing can find the joint optimal solution for multiple stores, but it requires a good configuration of the settings and a sufficient number of iterations.

3. Where in a case study area are opportunities to start a new small multi-location food venture? This depended on the input of the user. Yet, in most cases the area just outside the west and south of the city centre of Amersfoort was selected. Those areas were attractive locations with high demand and sometimes zero or close-to-zero competition; they were attractive because they were classified as urban areas with a high demand from relatively wealthy inhabitants.

4. How sensitive are the outcomes of the site selection to different parameter settings? I found that the food retailer type setting hardly changed the optimal location; it remained in the same region, mainly because competition was the only factor that took different values among the food retailer types, and the competition factor influenced the optimal region less than the demand and attractiveness factors.
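The weighted-sum objective of conclusion 1 can be sketched as follows. This is a minimal illustration, not the exact implementation used in the thesis: the function name `site_score`, the normalisation of all inputs to [0, 1], and the candidate values are assumptions.

```python
def site_score(demand, competition, attractiveness,
               w_demand=1.0, w_competition=1.0, w_attractiveness=1.0):
    # All inputs are assumed normalised to [0, 1]; competition lowers the
    # score, while demand and attractiveness raise it.
    return (w_demand * demand
            - w_competition * competition
            + w_attractiveness * attractiveness)

# Hypothetical candidate locations (values made up for illustration)
candidates = [
    {"id": "A", "demand": 0.8, "competition": 0.6, "attractiveness": 0.7},
    {"id": "B", "demand": 0.6, "competition": 0.1, "attractiveness": 0.5},
]
best = max(candidates, key=lambda c: site_score(c["demand"],
                                                c["competition"],
                                                c["attractiveness"]))
print(best["id"])  # prints "B" (score 1.0 versus 0.9 for "A")
```

Raising the competition weight penalises contested locations more heavily, which is how a retailer's own preferences enter the selection.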


6.2. Limitations

If more time and money had been available, experts could also have been involved in selecting the variables for the site selection. Now, only market research and literature were used to determine the variables.

Recent studies in microeconomics state that the top-performing entrepreneurs are mostly responsible for economic value creation, and that other entrepreneurs would stimulate the economy more as employees than as venture owners (Van Praag & Van Stel, 2013). One can therefore question whether a site selection method for small food entrepreneurs benefits the national economy: small food retailers lose the competition against the supermarkets, meaning that they are not the top-performing business owners.

A compromise among quality, study area size, and computation time was inevitable for the competition variable, and a large number of variables was not, and could not have been, taken into account. Perhaps the most important missing variables were the rental and other location costs. Other variables that were not considered are store space, building attractiveness, other stores that attract consumers, parking space, and local laws and regulations, simply because the data could not easily be retrieved for this study area and because their effect on the objective function is minor or unknown.

The method assumed that most food shopping is done closest to the residence of consumers. Out-of-work shopping and combine-shopping are concepts that contradict this assumption. People travel between home and work every day and often combine this commute with a shop visit, which is what out-of-work shopping covers. Consumers also combine-shop: when they need non-food products, they buy their groceries at the food retailer close to the particular non-food shop.

An ArcGIS licence with the Network Analyst extension is required to execute the site selection method. The method would benefit in computation time and usability if the script used in this thesis relied only on open-source functions. I tried to develop an equivalent script to produce the network KDE with the GDAL/OGR package in combination with the NetworkX package, because those packages are open source, but they lacked the standard network functions that ArcPy offers and which were convenient for calculating the network KDE. Okabe (2009) developed, and still updates, a set of ArcGIS tools that includes a network kernel density technique. The package is named SANET, and its user reference was written by Shiode, Okunuki and Okabe (2006). I created my own script (Appendix I) because the SANET functions are only available for academic use and only for ArcGIS Desktop versions 9 to 10.2; relying on the network KDE tool by Okabe (2009) would have meant that my site selection method could not be applied commercially in the retail industry.

6.3. Recommendations

Firstly, it is recommended that future research apply the method in the real world. A small food retailer should implement this method in his or her site selection procedure; an evaluation with this entrepreneur would help to find out whether the variables and objective function are adequate. The success rates of companies that used the method can be compared with those of companies that did not, to get a realistic view of its effectiveness. However, it remains hard to test the feasibility of this method, as many other variables also influence the performance of a retailer. A large sample of small food retailers is recommended to keep the results unbiased.

Secondly, a market research could be conducted to find default weights for various business types. A questionnaire could be distributed among small food retail owners and the results quantified as weights. However, this is a time-expensive method, and it could be hard to find retail owners who are willing to provide this information, because they do not want to help their own competitors.

Thirdly, demand was set equal to population, but future research could introduce an approach more in line with the concepts of combine-shopping and out-of-work shopping. Consumption maps visualise consumer activity flows and are based on the assumption that consumers are more likely to do their shopping in certain areas than in others.

Fourthly, it is recommended to add future predictions. Entrepreneurs do not want to start ventures in areas with a declining population; conversely, people in newly built residential areas can form an interesting new target audience. Future changes in demand can therefore be valuable to include in a site selection method.

Fifthly, it is recommended to test the site selection method in other study areas. The explorative analysis, the data retrieval, and the implementation of the method were all executed in the same area. The method proposed in this thesis uses variables for which the data can be retrieved nation-wide in the Netherlands.


Sixthly, it is recommended to try other optimisation methods. In this thesis only the greedy algorithm and simulated annealing were considered, but there are plenty of other feasible optimisation methods, such as genetic algorithms or neural networks, which could be implemented and tested in the site selection method.
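As a concrete illustration of the kind of heuristic that such a comparison would involve, a minimal simulated-annealing loop over a discrete set of candidate sites is sketched below. The cooling schedule, initial temperature, and iteration count are illustrative assumptions, not the settings used in this thesis, and the toy one-dimensional objective stands in for the real weighted-sum scores.

```python
import math
import random

def simulated_annealing(score, candidates, n_iter=5000, t0=1.0, cooling=0.999):
    # Illustrative settings; not the configuration used in the thesis.
    random.seed(42)  # fixed seed so the sketch is reproducible
    current = random.choice(candidates)
    best = current
    temp = t0
    for _ in range(n_iter):
        neighbour = random.choice(candidates)  # simple random neighbourhood
        delta = score(neighbour) - score(current)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools down.
        if delta > 0 or random.random() < math.exp(delta / temp):
            current = neighbour
        if score(current) > score(best):
            best = current
        temp *= cooling
    return best

# Toy objective over a grid of candidate "sites"; its maximum lies near x = 8.
def f(x):
    return -(x - 8) ** 2 + 0.5 * math.sin(5 * x)

xs = [i / 10.0 for i in range(101)]
best_x = simulated_annealing(f, xs)
```

The key design choice is the acceptance rule: worse candidates are accepted with probability exp(delta/temp), which lets the search escape local optima early on and behaves increasingly greedily as the temperature drops.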

Seventhly and finally, it is recommended that future research build or find a network function equivalent to the network functions in ArcPy, so that the method could work with open-source functions only. pgRouting is an extension of PostGIS that adds routing functions to a geospatial database; Graser (2011) created a service area with pgRouting, and her approach could be explored to examine whether it can replace the ArcPy generate-service-area function. NetworkX is an extensive Python package with many network functions; however, nothing as convenient as the ArcPy generate-service-area function was found in that package.
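As a sketch of what such a replacement could look like, the reachable-within-bandwidth part of a service area can be computed with a plain Dijkstra search using only the Python standard library. The adjacency-list graph, the edge lengths, and the function name `service_area` are assumptions for illustration; a real replacement would also need to build the graph from road data and trace partial edges, as the ArcPy tool does.

```python
import heapq

def service_area(graph, source, cutoff):
    # Network distances from source to every node within the cutoff,
    # via Dijkstra's algorithm; a rough node-based stand-in for the
    # polygon that the ArcPy service-area tool returns.
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbour, length in graph.get(node, []):
            nd = d + length
            if nd <= cutoff and nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist

# Toy undirected road network; edge lengths in metres (illustrative values).
edges = [("store", "a", 200), ("a", "b", 250), ("b", "c", 300),
         ("store", "d", 400), ("d", "e", 150)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))

print(sorted(service_area(graph, "store", 500)))  # prints ['a', 'b', 'd', 'store']
```

On this toy network the call returns the network distance to every node within the 500 m bandwidth of the store; nodes beyond the cutoff are simply never added.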


7. References

Aboolian, R., Berman, O. & Krass, D. (2007). Competitive facility location model with concave demand. European Journal of Operational Research, 181(2), 598-619.

Baray, J. & Cliquet, G. (2007). Delineating store trade areas through morphological analysis. European Journal of Operational Research, 182(2), 886-898.

Ben-Ameur, W. (2004). Computing the initial temperature of simulated annealing. Computational Optimization and Applications, 29(3), 369-385.

Benoit, D. & Clarke, G. P. (1997). Assessing GIS for retail location planning. Journal of Retailing and Consumer Services, 4(4), 239-258.

Blythman, J. (2012). Shopped: The shocking power of British supermarkets. Harper Collins UK.

Brunsdon, C., Fotheringham, A. S. & Charlton, M. E. (1996). Geographically weighted regression: a method for exploring spatial nonstationarity. Geographical Analysis, 28(4), 281-298.

Central Bureau for Statistics (2014, October). Kaart met statistieken per vierkant van 500 bij 500 meter [Data file]. Retrieved from https://www.cbs.nl/nl-nl/dossier/nederland-regionaal/geografische%20data/kaart-met-statistieken-per-vierkant-van-500-bij-500-meter

Central Bureau for Statistics (2015, June). Aantal viswinkels in 7 jaar met kwart toegenomen. Retrieved from https://www.cbs.nl/nl-nl/nieuws/2015/25/aantal-viswinkels-in-7-jaar-met-kwart-toegenomen

Černý, V. (1985). Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm. Journal of Optimization Theory and Applications, 45(1), 41-51.

Chang, W., Cheng, J., Allaire, J. J., Xie, Y. & McPherson, J. (2016). Shiny: Web Application Framework for R. R package version 0.14.2.

Cheng, J., Xie, Y., Wickham, H. & Agafonkin, V. (2016). Create Interactive Web Maps with the JavaScript 'Leaflet' Library. R package version 1.0.1.

Church, R. L. (2002). Geographical information systems and location science. Computers & Operations Research, 29(6), 541-562.

Cui, C., Wang, J., Pu, Y., Ma, J. & Chen, G. (2012). GIS-based method of delimitating trade area for retail chains. International Journal of Geographical Information Science, 26(10), 1863-1879.

Davis, P. (2006). Spatial competition in retail markets: movie theaters. The RAND Journal of Economics, 37(4), 964-982.

Goodchild, M. F. (1984). ILACS: A location-allocation model for retail site selection. Journal of Retailing, 60, 84-100.

Google (2017). Google Places API. Retrieved from https://developers.google.com/places/

Graser, A. (2011). Catchment areas with pgRouting driving distance. Retrieved from https://anitagraser.com/2011/05/13/catchment-areas-with-pgrouting-driving_distance/

Holla, J. (2013). GfK Supermarktkengetallen. Retrieved from http://megaslides.com/doc/39860/gfk-supermarktkengetallen-februari-2013

Huff, D. L. (1964). Defining and estimating a trading area. The Journal of Marketing, 34-38.

ING (2014, April). Bakker, slager en groenteboer op hun retour. Retrieved from https://www.ing.nl/nieuws/nieuws_en_persberichten/2014/04/bakker-slager-en-groenteboer-op-hun-retour.html

Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680.

Neun, M., Burghardt, D. & Weibel, R. (2009). Automated processing for map generalization using web services. GeoInformatica, 13(4), 425-452.

Okabe, A. (2009). About SANET. Retrieved from http://sanet.csis.u-tokyo.ac.jp/

OSM (2016). NWB. Retrieved from http://wiki.openstreetmap.org/wiki/NWB

Pettinger, C., Holdsworth, M. & Gerber, M. (2008). 'All under one roof?' Differences in food availability and shopping patterns in Southern France and Central England. The European Journal of Public Health, 18(2), 109-114.

Raatgever, A. (2014). Winkelgebied van de toekomst. Bouwstenen voor publiek-private samenwerking, 41. Retrieved from http://www.detailhandel.nl/images/pdf/Winkelgebied_vd_toekomst_volledig_lowres_website_2.pdf

Rabobank (2007, October). Rabobank Cijfers & Trends: Visie op negen sectoren in het Nederlandse bedrijfsleven. Retrieved from https://www.rabobank.nl/images/ct_kwartaalbericht_okt07_2966688.pdf

Raven, H., Lang, T. & Dumonteil, C. (1995). Off our trolleys: food retailing and the hypermarket economy. Institute for Public Policy Research.

Roig-Tierno, N., Baviera-Puig, A., Buitrago-Vera, J. & Mas-Verdu, F. (2013). The retail site location decision process using GIS and the analytical hierarchy process. Applied Geography, 40, 191-198.

Rui, Y., Yang, Z., Qian, T., Khalid, S., Xia, N. & Wang, J. (2016). Network-constrained and category-based point pattern analysis for Suguo retail stores in Nanjing, China. International Journal of Geographical Information Science, 30(2), 186-199.

Sadler, R. C. (2016). Integrating expert knowledge in a GIS to optimize siting decisions for small scale healthy food retail interventions. International Journal of Health Geographics, 15(1), 1.

Saltelli, A. (2002). Sensitivity analysis for importance assessment. Risk Analysis, 22(3), 579-590.

Sarkar, A. (2007). GIS applications in logistics: a literature review. School of Business, University of Redlands, 1200.

Schabenberger, O. & Gotway, C. A. (2005). Statistical Methods for Spatial Data Analysis. Chapman & Hall/CRC, Boca Raton, Florida.

Scott, D. M. & He, S. Y. (2012). Modeling constrained destination choice for shopping: a GIS-based, time-geographic approach. Journal of Transport Geography, 23, 60-71.

Shiode, S., Okunuki, K. I. & Okabe, A. (2006). User Reference for SANET: A Toolbox for Spatial Analysis on a Network.

Sienkiewicz, B. (2015). The impact of local food environments on diet: do neighbouring food retailers influence what you eat? (MSc thesis). Wageningen University.

Steiniger, S. & Weibel, R. (2005). A conceptual framework for automated generalization and its application to geologic and soil maps. In Proceedings of XXII International Cartographic Conference, 11-16.

Suárez-Vega, R., Gutiérrez-Acuña, J. L. & Rodríguez-Díaz, M. (2015). Locating a supermarket using a locally calibrated Huff model. International Journal of Geographical Information Science, 29(2), 217-233.

Suárez-Vega, R., Santos-Peñate, D. R. & Dorta-González, P. (2012). Location models and GIS tools for retail site location. Applied Geography, 35(1), 12-22.

Suárez-Vega, R. & Santos-Peñate, D. R. (2014). The use of GIS tools to support decision making in the expansion of chain stores. International Journal of Geographical Information Science, 28(3), 553-569.

Tong, D. & Murray, A. T. (2012). Spatial optimization in geography. Annals of the Association of American Geographers, 102(6), 1290-1309.

Turk, T., Kitapci, O. & Dortyol, I. T. (2014). The usage of Geographical Information Systems (GIS) in the marketing decision making process: A case study for determining supermarket locations. Procedia - Social and Behavioral Sciences, 148, 227-235.

Van Praag, M. & Van Stel, A. (2013). The more business owners, the merrier? The role of tertiary education. Small Business Economics, 41(2), 335-357.

Veenstra, S. A., Thomas, T. & Tutert, S. I. A. (2010). Trip distribution for limited destinations: a case study for grocery shopping trips in the Netherlands. Transportation, 37(4), 663-676.

Wood, S. & Reynolds, J. (2012). Leveraging locational insights within retail store development? Assessing the use of location planners' knowledge in retail marketing. Geoforum, 43(6), 1076-1087.

Xie, Z. & Yan, J. (2008). Kernel density estimation of traffic accidents in a network space. Computers, Environment and Urban Systems, 32(5), 396-406.


8. Appendix I: Network KDE Python-code

from __future__ import division  # Python 2.7

# Import system modules
import arcpy
import math

# Set the workspace
arcpy.env.workspace = "C:/Users/Bob/Desktop/ArcGIS/A_Preprocessing.gdb"
# Data can be overwritten
arcpy.env.overwriteOutput = True
# Check out the network license
arcpy.CheckOutExtension("Network")

# Set input/output
inPointData = "lxcenter_bak_amers"  # Bakery example
outSourcePointData = "lxcenter_source_bak"
inNetworkDataset = "Road_Network_bak/ND_bak"
outPointData = "lxcenter_bak_amers_KDE_500_gaus"


## Functions
# Gaussian kernel function
def gaussian(distdecayeffect, bandwidth, source):
    gausdensity = source * ((1 / bandwidth) * (1 / math.sqrt(2 * math.pi)) *
                            math.exp(-(distdecayeffect ** 2) /
                                     (2 * (bandwidth ** 2))))
    return gausdensity

# Quartic kernel function
def quartic(distdecayeffect, bandwidth, source):
    quarticdensity = source * ((1 / bandwidth) * (3 / math.pi) *
                               (1 - ((distdecayeffect ** 2) / (bandwidth ** 2))))
    return quarticdensity

# Minimum variance kernel function
def minvar(distdecayeffect, bandwidth, source):
    minvardensity = source * ((1 / bandwidth) * (3 / 8) *
                              (3 - (5 * ((distdecayeffect ** 2) /
                                         (bandwidth ** 2)))))
    return minvardensity


# Set parameters
bandwidth = 500
kernel = "gaussian"

# Make a layer from the feature class
arcpy.MakeFeatureLayer_management(inPointData, "lyr")
# Within the selected features, select only those which are source points
arcpy.SelectLayerByAttribute_management("lyr", "NEW_SELECTION", '"Source" >= 1')
# Write the selected features to a new feature class
arcpy.CopyFeatures_management("lyr", outSourcePointData)

# Create facilities
facilities = arcpy.FeatureSet()
facilities.load(outSourcePointData)
# Create facilities layer
arcpy.MakeFeatureLayer_management(facilities, "facilities_lyr")
arcpy.CopyFeatures_management("facilities_lyr", "facilities_points")

# Generate service areas
serviceareas = arcpy.na.GenerateServiceAreas(facilities, bandwidth, "Meters",
                                             inNetworkDataset,
                                             "ServiceAreasSource")
# Create layer out of the source service areas
arcpy.CopyFeatures_management("ServiceAreasSource",
                              "in_memory/ServiceAreasSourceFeatureClass")
arcpy.MakeFeatureLayer_management("in_memory/ServiceAreasSourceFeatureClass",
                                  "ServiceAreasSource_lyr")
# Create layer out of the lxcentres
arcpy.CopyFeatures_management(inPointData, "in_memory/lxcenter_amers_fc")
arcpy.MakeFeatureLayer_management("in_memory/lxcenter_amers_fc",
                                  "lxcenter_amers_fc_lyr")

# Loop through the service areas
polygons = arcpy.da.SearchCursor("ServiceAreasSource_lyr", ["OBJECTID"])
polygonnr = arcpy.GetCount_management("ServiceAreasSource")
# Find the fields with name OBJECTID
field = arcpy.AddFieldDelimiters("ServiceAreasSource_lyr", "OBJECTID")
fac_field = arcpy.AddFieldDelimiters("facilities_points", "OBJECTID")
for polygon in polygons:
    # Create single polygon layer
    objectID = polygon[0]
    print("polygon nr: {0} of {1}".format(objectID, polygonnr))
    selection = "{field} = {objectID}".format(field=field, objectID=objectID)
    arcpy.Select_analysis("ServiceAreasSource_lyr",
                          "in_memory/ServiceAreas_feat", selection)
    # Loop through the single service area polygon
    rows = arcpy.SearchCursor("in_memory/ServiceAreas_feat")
    for row in rows:
        facID = row.getValue("FacilityID")
        fac_selection = "{field} = {facID}".format(field=fac_field, facID=facID)
        # Find the source point
        arcpy.Select_analysis("facilities_points", "facilities_point_cur",
                              fac_selection)
        rows2 = arcpy.SearchCursor("facilities_point_cur")
        # Get the source value
        for row2 in rows2:
            source = row2.getValue("Source")
    # Clip the lxcenters on the polygon
    arcpy.Clip_analysis(inPointData, "in_memory/ServiceAreas_feat",
                        "in_memory/Lxcenter_inPolygon")
    # Loop through all lxcenters in the service area polygon
    points = arcpy.da.SearchCursor("in_memory/Lxcenter_inPolygon", ["OBJECTID"])
    for point in points:
        # Set the point density to zero
        density = 0
        pointID = point[0]
        print("point: {}".format(pointID))
        # Create single point layer
        pointfield = arcpy.AddFieldDelimiters("in_memory/Lxcenter_inPolygon",
                                              "OBJECTID")
        point_selection = "{pointfield} = {pointID}".format(
            pointfield=pointfield, pointID=pointID)
        arcpy.Select_analysis("in_memory/Lxcenter_inPolygon",
                              "in_memory/Lxcenter_inPolygon_point",
                              point_selection)
        # Find the distance to the source lxcenter
        facilities = arcpy.FeatureSet()
        facilities.load("in_memory/Lxcenter_inPolygon_point")
        incidents = arcpy.FeatureSet()
        incidents.load("facilities_point_cur")
        # Try if a route can be created
        try:
            # Find the distance from point to point
            arcpy.FindClosestFacilities_na(incidents, facilities, "Meters",
                                           inNetworkDataset, "in_memory",
                                           "Distance_route",
                                           "Output_directions",
                                           "Output_Closest_Facilities")
            rows3 = arcpy.SearchCursor("in_memory/Distance_route")
            for row3 in rows3:
                # Set the distance decay effect
                distdecayeffect = row3.getValue("Total_Meters")
                # Calculate the kernel density for the lxcenter
                if kernel == "gaussian":
                    density = gaussian(distdecayeffect, bandwidth, source)
                elif kernel == "quartic":
                    density = quartic(distdecayeffect, bandwidth, source)
                elif kernel == "minvar":
                    density = minvar(distdecayeffect, bandwidth, source)
        # A route cannot always be generated
        except Exception:
            density = 0
            print('No KDE calculated for this point')
        # Get the geometry of the point
        with arcpy.da.SearchCursor("in_memory/Lxcenter_inPolygon_point",
                                   ["SHAPE@XY"]) as cur:
            for row4 in cur:
                geom_point_2 = row4[0]
        # Find the point with equal geometry and add the lxcenter density
        with arcpy.da.UpdateCursor("lxcenter_amers_fc_lyr",
                                   ["SHAPE@XY", "KDE"]) as cur2:
            for row5 in cur2:
                geom_point = row5[0]
                # If the geometry is the same, add the density
                if geom_point == geom_point_2:
                    row5[1] = row5[1] + density
                    cur2.updateRow(row5)
                    break

# Copy the results of lxcenter_amers_fc_lyr to outPointData
arcpy.CopyFeatures_management("lxcenter_amers_fc_lyr", outPointData)


9. Appendix II: JSON to CSV R-code

# Set working directory
setwd("C:\\Users\\Bob\\Desktop")

# Load libraries
library(RCurl)
library(RJSONIO)

# Initialise data frames, one per retailer type
df1 <- data.frame(lat=double(), lon=double())
df2 <- data.frame(lat=double(), lon=double())
df3 <- data.frame(lat=double(), lon=double())
df4 <- data.frame(lat=double(), lon=double())
dflist <- list(df1, df2, df3, df4)

# Load the location points for the Google Places API query
latlondata <- read.csv("lat_lon_points.csv")

for (i in latlondata$OBJECTID) {
  datalat <- latlondata$Latitude[i]
  datalon <- latlondata$Longitude[i]
  latlon <- paste(datalat, datalon, sep = ",")
  # Set the data query URLs
  url1 <- sprintf('https://maps.googleapis.com/maps/api/place/radarsearch/json?location=%s&radius=4000&keyword=supermarkt&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4', latlon)
  url2 <- sprintf('https://maps.googleapis.com/maps/api/place/radarsearch/json?location=%s&radius=4000&keyword=bakker&types=bakery&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4', latlon)
  url3 <- sprintf('https://maps.googleapis.com/maps/api/place/radarsearch/json?location=%s&radius=4000&keyword=slager&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4', latlon)
  url4 <- sprintf('https://maps.googleapis.com/maps/api/place/radarsearch/json?location=%s&radius=4000&keyword=groentewinkel&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4', latlon)
  # List the data query URLs
  urllist <- list(url1, url2, url3, url4)
  for (url in urllist) {
    web <- getURL(url)
    raw <- fromJSON(web)
    if (length(raw$results) != 0) {
      # Find latitude values
      lat <- NULL
      for (j in c(1:length(raw$results))) {
        lat <- c(lat, raw$results[[j]]$geometry$location[[1]])
      }
      # Find longitude values
      lon <- NULL
      for (k in c(1:length(raw$results))) {
        lon <- c(lon, raw$results[[k]]$geometry$location[[2]])
      }
      # Combine the separate coordinate vectors into a matrix
      latlonretailer <- list(lat, lon)
      latlonmatrix <- do.call(cbind, latlonretailer)
      # Append to the data frame of the corresponding retailer type
      listnr <- match(url, urllist)
      if (listnr == 1) {
        df1 <- rbind(df1, latlonmatrix)
      } else if (listnr == 2) {
        df2 <- rbind(df2, latlonmatrix)
      } else if (listnr == 3) {
        df3 <- rbind(df3, latlonmatrix)
      } else if (listnr == 4) {
        df4 <- rbind(df4, latlonmatrix)
      }
    }
  }
}

# Write to CSV
write.csv(df1, 'location_supermarkets.csv')
write.csv(df2, 'location_bakers.csv')
write.csv(df3, 'location_butchers.csv')
write.csv(df4, 'location_greengroceries.csv')


10. Appendix III: Location points for Google Places API query

Table 10: Location points Google Places API query

ID Latitude Longitude ID Latitude Longitude

1 51,95397708 5,339082416 36 52,09373605 5,718338897

2 51,95398351 5,414726979 37 52,0934984 5,794218405

3 51,95394139 5,490371547 38 52,09321178 5,870096826

4 51,9538506 5,566016184 39 52,09287632 5,945974078

5 51,95371104 5,641660399 40 52,09249183 6,02184994

6 51,95352273 5,717303559 41 52,14092969 5,338882146

7 51,95328588 5,792945795 42 52,14093623 5,414843466

8 51,95300055 5,868587142 43 52,14089378 5,490804631

9 51,95266653 5,944227395 44 52,14080244 5,56676553

10 51,95228373 6,019866325 45 52,14066219 5,642726024

11 52,00071579 5,339032494 46 52,14047301 5,718685797

12 52,00072216 5,414755938 47 52,14023488 5,794644658

13 52,0006799 5,490479365 48 52,13994792 5,870602501

14 52,00058891 5,566202704 49 52,13961213 5,946559209

15 52,00044913 5,641925635 50 52,13922722 6,022514502

16 52,00026062 5,717647781 51 52,18766704 5,338831913

17 52,00002355 5,793368976 52 52,18767358 5,414872891

18 51,99973791 5,869089144 53 52,1876311 5,490913608

19 51,99940341 5,944808165 54 52,18753963 5,566953977

20 51,99902005 6,020525887 55 52,1873991 5,642994077

21 52,04745407 5,338982435 56 52,18720946 5,719033505

22 52,04746046 5,41478497 57 52,18697085 5,795071956

23 52,04741808 5,490587453 58 52,18668356 5,871109358

24 52,04732695 5,566389745 59 52,18634739 5,947145623

25 52,04718706 5,642191627 60 52,18596211 6,023180613

26 52,04699847 5,717992864 61 52,23440385 5,338781447

27 52,04676115 5,793793148 62 52,23441044 5,414902364

28 52,04647502 5,86959233 63 52,23436795 5,491022862

29 52,04614001 5,945390329 64 52,23427637 5,567142856

30 52,04575608 6,021187043 65 52,23413559 5,643262729

31 52,09419198 5,338932294 66 52,2339456 5,719382134

32 52,09419848 5,414814143 67 52,23370666 5,795500434

33 52,09415607 5,49069589 68 52,23341893 5,871617515

34 52,09406484 5,566577395 69 52,23308224 5,947733446

35 52,09392482 5,642458471 70 52,23269647 6,023848217


11. Appendix IV: R-code for interactive map

# Import libraries
library(leaflet)
library(shiny)
library(shinyjs)
library(raster)
library(rgdal)

# Set working directory
setwd("C:\\Users\\Bob\\Google Drive\\MSc Thesis\\script\\Case_visualisation")

# Source functions
source('open_python.R')
source('vis_optimal_results.R')

# Colour scheme for the raster images
raster_pal <- colorNumeric(c("lightgreen", "green", "Darkgreen"), c(0, 1),
                           na.color = "transparent")
# Colour palette for the optimal area
area_pal <- colorFactor(
  palette = "Blues",
  domain = c(100, 200, 300, 400, 500)
)

# Import demand raster
demand_raster <- raster("input_images\\dem_raster.tif")
# Set and change the coordinate system of the demand raster
WGS84 <- "+proj=longlat +ellps=WGS84 +datum=WGS84"
RD_new <- "+init=epsg:28992"
proj4string(demand_raster) <- CRS(RD_new)
demand_raster <- projectRaster(demand_raster, crs = CRS(WGS84))

# Import attractiveness raster
attractiveness_raster <- raster("input_images\\att_raster.tif")
# Set and change the coordinate system of the attractiveness raster
proj4string(attractiveness_raster) <- CRS(RD_new)
attractiveness_raster <- projectRaster(attractiveness_raster, crs = CRS(WGS84))

# Create bootstrap user interface
ui <- bootstrapPage(
  # Set the style of the page + load message
  tags$style(type = "text/css", "html, body {width:100%;height:100%}",
             "#loadmessage {
               position: fixed;
               top: 0px;
               left: 0px;
               width: 100%;
               padding: 5px 0px 5px 0px;
               text-align: center;
               font-weight: bold;
               font-size: 100%;
               color: #000000;
               background-color: #CCFF66;
               z-index: 105;
             }"),
  # Conditional panel which shows a load message while a function is running
  conditionalPanel(condition = "$('html').hasClass('shiny-busy')",
                   tags$div("Optimal solutions are being calculated...",
                            id = "loadmessage")),
  # Create Leaflet output
  leafletOutput("mymap", width = "100%", height = "100%"),
  # Place an absolute panel over the map view
  absolutePanel(top = 10, right = 10,
    # Implement Shiny JavaScript functions
    shinyjs::useShinyjs(),
    # Set the input parameters
    selectInput("food_specialist_type", "Choose a food specialist type:",
                choices = c("Bakery", "Butcher", "Greengrocery")),
    numericInput("weight1", "Demand weight (0-1):", 1, min = 0, max = 1,
                 step = 0.1),
    numericInput("weight2", "Competition weight (0-1):", 1, min = 0, max = 1,
                 step = 0.1),
    numericInput("weight3", "Location attractiveness weight (0-1):", 1,
                 min = 0, max = 1, step = 0.1),
    numericInput("storenr", "Indicate the number of stores (1-10):", 1,
                 min = 1, max = 10),
    # Create the action button
    actionButton("generateoptimal", "Generate optimal location(s)"),
    # Additional text output, only used for testing purposes
    textOutput("text")
  )
)

# Create the server to change information based on the inputs
server <- function(input, output, session) {
  # Set the leaflet map
  output$mymap <- renderLeaflet({
    leaflet() %>%
      # Add OpenStreetMap (default) map
      addTiles(group = "openstreetmap") %>%
      # Add a legend for the raster images
      addLegend("bottomright", pal = raster_pal, values = c(0, 1),
                title = "Raster score") %>%
      # Add a legend for the polygons
      addLegend("bottomright", values = c(100, 200, 300, 400, 500),
                pal = area_pal, title = "Metres to optimal location") %>%
      # Add the demand raster
      addRasterImage(demand_raster, opacity = 0.5, colors = raster_pal,
                     group = "Demand map") %>%
      # Add the attractiveness raster
      addRasterImage(attractiveness_raster, opacity = 0.5, colors = raster_pal,
                     group = "Attractiveness map") %>%
      # Add on/off control for the different layers
      addLayersControl(overlayGroups = c("Demand map", "Attractiveness map"),
                       options = layersControlOptions(collapsed = FALSE),
                       position = 'bottomleft') %>%
      # Hide the rasters by default
      hideGroup(c("Demand map", "Attractiveness map"))
  })
  # If the generate button is clicked, then..
  observeEvent(input$generateoptimal, {
    # Open the leaflet proxy of mymap and clear prior markers and areas
    leafletProxy("mymap") %>%
      clearShapes() %>%
      clearMarkers()
    # Disable inputs while the optimisation runs
    shinyjs::disable("generateoptimal")
    shinyjs::disable("food_specialist_type")
    shinyjs::disable("weight1")
    shinyjs::disable("weight2")
    shinyjs::disable("weight3")
    shinyjs::disable("storenr")
    # Write the inputs to text files
    write(input$weight1, 'input_data\\weigth1.txt')
    write(input$weight2, 'input_data\\weigth2.txt')
    write(input$weight3, 'input_data\\weigth3.txt')
    write(input$storenr, 'input_data\\storenr.txt')
    write(input$food_specialist_type, 'input_data\\food_specialist_type.txt')
    # Execute the optimisation method
    openPython('GreedyAlgorithm')  # or 'SimulatedAnnealing'
    # Enable the inputs again
    shinyjs::enable("generateoptimal")
    shinyjs::enable("food_specialist_type")
    shinyjs::enable("weight1")
    shinyjs::enable("weight2")
    shinyjs::enable("weight3")
    shinyjs::enable("storenr")
    # Find all optimal location/area files
    opt_list <- opt_location()
    loop_nr <- 1
    # Loop through all objects in opt_list
    for (i in opt_list) {
      # If an optimal location, then...
      if (class(i[1]) == "SpatialPointsDataFrame") {
        # Set the optimal point based on the loop number
        optimal_lxcenter <- opt_list[[loop_nr]]
        # Add markers where the optima were found
        leafletProxy("mymap") %>%
          addMarkers(lng = optimal_lxcenter$coords.x1,
                     lat = optimal_lxcenter$coords.x2)
        loop_nr <- loop_nr + 1
      } else {
        # Otherwise set the optimal area based on the loop number
        optimal_area <- opt_list[[loop_nr]]
        # Add polygons based on the service areas of the optimal points
        leafletProxy("mymap", data = optimal_area) %>%
          addPolygons(stroke = FALSE,
                      color = ~colorQuantile("Blues",
                                             optimal_area$ToBreak)(ToBreak),
                      fillOpacity = 0.6)
        loop_nr <- loop_nr + 1
      }
    }
  })
}

# Execute the shinyApp
shinyApp(ui, server)