Centre for Geo-Information Thesis Report GIRS-2017-07
26-01-2017
Site selection for small retailers in the food industry using
demographic data and network-constrained kernel density
estimation
Bob Houtkooper
Registration number 93 06 23 368 090
Supervisor:
dr.ir. S (Sytze) de Bruin
A thesis submitted in partial fulfilment of the degree of Master of Science
at Wageningen University and Research Centre,
The Netherlands.
26-01-2017
Wageningen, The Netherlands
Thesis code number: GRS-80436
Thesis Report: GIRS-2017-07
Wageningen University and Research Centre
Laboratory of Geo-Information Science and Remote Sensing
Foreword
Working on an individual project for such a long time was a new experience for me, and a very
instructive one. I want to thank everyone who helped me through the thesis process. Sytze de Bruin,
thank you for your help and contributions.
Bob Houtkooper
23-01-2017, Wageningen
Abstract
Despite the high financial risks of site selection, spatial data are still rarely used to
support site selection for small food retailers. This thesis proposes a site selection
method that finds locations for multi-location food ventures using a simultaneous
approach. The objective function incorporates variables for competition, demand,
and site attractiveness. Competition was computed with a network kernel density estimation
(KDE). Socio-demographic data, which describe the demand and attractiveness variables, were
retrieved and overlaid on a road network. The road segments with the highest utility were then
found with an optimisation method. Two heuristic optimisation methods were
examined: simulated annealing and a greedy algorithm. A case study was executed that
concerned finding the optimal locations of three bakeries in Amersfoort, the Netherlands.
A user interface was developed to allow the adjustment of weights for the individual
variables of the objective function, as well as the number of stores and the food retailer
type. A sensitivity analysis was conducted to assess the sensitivity of the approach to
parameter settings. The main findings indicate that the weights for the variables are
unknown and differ among food specialists. Owing to the computational load of the network
KDE, the method could only be applied at the scale of a city. Moreover, not all available
data were up to date, and future demographic predictions were not included. The experimental
results should therefore be considered an illustration of the approach. The greedy
algorithm proved an applicable optimisation method for finding single optimal locations;
however, the chance of finding the global optimum decreased with an increasing number of
locations. When multiple store locations were to be found, the simulated annealing
method was therefore preferable. Under the default settings, the optimal locations were
found around the west and south side of the city centre. This area can be considered
attractive because it is populated by people with higher incomes, demand is high, and
competition is generally zero or close to zero.
Keywords: site selection, small food retail stores, network-constrained point patterns, spatial
optimisation, simulated annealing, greedy algorithm, sensitivity analysis.
Table of contents
1. Introduction ....................................................................................................................... 1
1.1. Context and background .............................................................................................. 1
1.2. Problem definition ....................................................................................................... 1
1.3. Research questions ...................................................................................................... 3
1.4. Structure of the report .................................................................................................. 4
2. Review ................................................................................................................................ 5
2.1. Site preferences in relation to space ............................................................................ 5
2.2. Spatial networks .......................................................................................................... 7
2.3. Network kernel density estimation .............................................................................. 7
2.4. Spatial optimisation ..................................................................................................... 8
2.5. Food retailer location data ........................................................................................... 9
3. Methods ............................................................................................................................ 10
3.1. Objective function ..................................................................................................... 10
3.1.1. Composite utility ................................................................................................ 10
3.1.2. Study area ........................................................................................................... 14
3.2. Optimisation method ................................................................................................. 20
3.2.1. Optimal locations ............................................................................................... 20
3.2.2. Greedy algorithm ................................................................................................ 20
3.2.3. Simulated annealing ........................................................................................... 21
3.2.4. From road segments to areas .............................................................................. 24
3.3. Case study .................................................................................................................. 24
3.3.1. Optimal solutions ............................................................................................... 24
3.3.2. User interface ..................................................................................................... 25
3.4. Sensitivity analysis .................................................................................................... 26
4. Results .............................................................................................................................. 29
4.1. Objective function ..................................................................................................... 29
4.1.1. Explorative analysis ........................................................................................... 29
4.1.2. Composition ....................................................................................................... 41
4.2. Optimisation method ................................................................................................. 42
4.2.1. Greedy algorithm ................................................................................................ 42
4.2.2. Simulated annealing ........................................................................................... 42
4.3. Case results ................................................................................................................ 44
4.3.1. Optimal solutions ............................................................................................... 44
4.3.2. User interface ..................................................................................................... 48
4.4. Sensitivity analysis .................................................................................................... 51
5. Discussion ......................................................................................................................... 54
5.1. Objective function ..................................................................................................... 54
5.2. Optimisation method ................................................................................................. 55
5.3. Case ........................................................................................................................... 56
5.3.1. Optimal solutions ............................................................................................... 56
5.3.2. User interface ..................................................................................................... 56
5.4. Sensitivity analysis .................................................................................................... 56
6. Conclusions, limitations and recommendations ........................................................... 58
6.1. Conclusions ............................................................................................................... 58
6.2. Limitations ................................................................................................................. 59
6.3. Recommendations ..................................................................................................... 60
7. References ........................................................................................................................ 62
8. Appendix I: Network KDE Python-code ...................................................................... 67
9. Appendix II: JSON to CSV R-code ............................................................................... 70
10. Appendix III: Location points for Google Places API query ...................................... 72
11. Appendix IV: R-code for interactive map .................................................................... 73
List of figures
Figure 3.1: Diagram composite objective function; hexagons represent input, squares are
functions, and ellipses represent outputs. ................................................................................. 12
Figure 3.2: Study area. ............................................................................................................. 14
Figure 3.3: Flowchart network KDE calculation; the hexagons represent data, rectangles
represent actions, and the diamond represents a conditional statement. .................................. 17
Figure 3.4: Fishnet over the study area. ................................................................................... 19
Figure 4.1: Locations of retrieved food retailers within the study area based on Google maps.
.................................................................................................................................................. 30
Figure 4.2: The Amersfoort study area. ................................................................................... 32
Figure 4.3: Location of bakeries, butchers, supermarkets and greengroceries in the Amersfoort
study area. ................................................................................................................................. 33
Figure 4.4: A categorised road network of the city centre of Amersfoort on the number of
retailers that are closest by. ...................................................................................................... 34
Figure 4.5: Service areas around source points. Notice that service areas may overlap. ......... 35
Figure 4.6: Network KDE densities of Amersfoort with a bandwidth of 250 metres. ............. 36
Figure 4.7: Network KDE densities of Amersfoort with a bandwidth of 500 metres. ............. 37
Figure 4.8: Normalised demand in the Amersfoort study area. ............................................... 38
Figure 4.9: Normalised demand variable on a network space ................................................. 39
Figure 4.10: Normalised number of elderly people. .......................................... 40
Figure 4.11: Normalised number of people with high income. ............................................... 40
Figure 4.12: Normalised urbanity classification. ..................................................................... 40
Figure 4.13: Normalised attractiveness variable. ..................................................................... 40
Figure 4.14: Existing food retailers on attractiveness variable map in the city centre of
Amersfoort. .............................................................................................................................. 41
Figure 4.15: Composite utility map of Amersfoort corresponding to scenario 1..................... 42
Figure 4.16: Probability trace of simulated annealing for five simultaneous store locations .. 43
Figure 4.17: Trace of the joint utility of the accepted road segment composition ................... 44
Figure 4.18: Three optimal locations for new bakeries in Amersfoort determined with the
greedy algorithm. ..................................................................................................................... 45
Figure 4.19: Competition variable map, prior (left) and after (right) adding two stores by
optimisation. ............................................................................................................................. 46
Figure 4.20: The road network is visualised together with the optimal sites and optimal locations
for four iterations; GA stands for greedy algorithm and the number presents the store location
number. ..................................................................................................................................... 47
Figure 4.21: Five optimal locations for new bakeries in Amersfoort determined with simulated
annealing. ................................................................................................................................. 48
Figure 4.22: GUI showing a map of Amersfoort with an information bar, a demand raster,
demand legend, disabled parameter inputs, and a disabled execute button. ............................ 49
Figure 4.23: GUI showing a map of Amersfoort with two optimal locations, two optimal areas,
optimal area legend, enabled parameter inputs, and an enabled execute button. ..................... 50
Figure 4.24: Optimal locations if the number of stores is set to five (greedy algorithm). ....... 51
Figure 4.25: Visualisation of results, neglecting one variable; GA stands for greedy algorithm,
com0 means the competition variable was set to zero, dem0 means the demand variable was set to
zero, and att0 means the attractiveness variable was set to zero. ................................ 52
List of tables
Table 3.1: Simulated annealing settings. .................................................................................. 22
Table 3.2: Scenario one case study. ......................................................................................... 24
Table 3.3: Scenario two case study. ......................................................................................... 25
Table 3.4: Scenarios to assess the sensitivity of the food retailer type. ................................... 27
Table 3.5: Scenarios to assess the sensitivity of the variables. ................................................ 28
Table 4.1: Number of food retailers in the study area. ............................................................. 31
Table 10: Location points Google Places API query ................................................ 72
List of boxes
Box 3.1: Pseudo code greedy algorithm in a network space. ................................................... 21
Box 3.2: Pseudo code simulated annealing. ............................................................................. 23
List of abbreviations
AHP = Analytic Hierarchy Process
API = Application Programming Interface
CBS = Central Bureau of Statistics
CSV = Comma-Separated Values
GDAL = Geospatial Data Abstraction Library
GIS = Geo-Information Science
GUI = Graphical User Interface
GWR = Geographically Weighted Regression
JSON = JavaScript Object Notation
KDE = Kernel Density Estimation
MGI = Master Geo-Information science
NWB = Nationaal Wegen Bestand
OAT = One-factor-At-a-Time
OGR = OpenGIS simple features Reference implementation
OS = Operating System
OSM = Open Street Map
RD = RijksDriehoekstelsel
URL = Uniform Resource Locator
XML = eXtensible Markup Language
HTML = HyperText Markup Language
1. Introduction
In sections 1.1 and 1.2, the background of the problem is described and the major findings
in this field are summarised. Subsequently, the knowledge gap is identified, which describes
the general need for this study, and the target audience is defined. In section 1.3, the
research questions are stated. Finally, in section 1.4, the structure of this report is presented.
1.1. Context and background
Analysis of spatial data is useful for several strategic business activities. Strategic management
refers to the overall direction of a company (Sarkar, 2007). An example of a strategic business
activity is doing market analysis. A market analysis is conducted to find out if a certain market
is attractive for a certain organisation. The distribution of competitors and customers is critical
information. Besides strategic management, spatial data are used to optimise operational business
activities, for instance resource allocation, the reduction of operating costs, and monitoring of
business activities (Sarkar, 2007). These operational activities control the day-to-day
operations. Geo-information system (GIS) concepts and techniques are suitable for regional
business, because they can support decision-making in regional marketing, spatial planning,
and logistics (Cui et al., 2012). GIS is already widely used in various industries, for example in
the insurance, retail, transportation, tourism, real estate, and telecommunication industries (Cui
et al., 2012).
1.2. Problem definition
Small food enterprises need support (ING, 2014; Central Bureau of Statistics, 2015; Rabobank,
2007). The economic bureau of the ING bank (ING, 2014) revealed that the numbers of bakeries,
butchers, and greengroceries are declining in the Netherlands, owing to competition from
supermarkets. Greengroceries declined the most; between 2008 and 2013, 15% of the
greengroceries disappeared. In the same period, butchers and bakeries faced declines of 10%
and 7%, respectively (ING, 2014). Data of the Central Bureau of Statistics endorse the declining
numbers of poultry shops, butchers and greengroceries between 2007 and 2014 (CBS, 2015).
The Rabobank (2007) found that various consumer trends negatively affect specialised food
retailers, for instance growth of hard discount and online food, changing consumer preferences
(ready-to-eat meals), and a decline of customer loyalty. In conclusion, traditional food shops
are disappearing in the Netherlands.
Today, most consumers buy everything at one place: the supermarket. This trend of buying all
groceries at the supermarket causes a decline in social contact (Raven, Lang & Dumonteil,
1995; Pettinger, Holdsworth & Gerber, 2008; Blythman, 2012). Studies confirm that small
food enterprises encounter problems due to competition from supermarkets (Raven et al., 1995;
Blythman, 2012). Multiple small food ventures can easily be swallowed by one single
supermarket. The decline of small food retailers is deplored, mainly because they hold
communities together (Pettinger et al., 2008). Pettinger et al. (2008) conducted research on
shopping behaviour differences between England and France. Results pointed to a single
difference in shopping behaviour: the French often shop in individual traditional shops, whereas
the English prefer shopping in supermarkets. England faces higher obesity rates than France. A
correlation was assumed between obesity rates and shopping culture in a country (Pettinger et
al., 2008). The authors claimed that supermarkets generally offer less healthy products and that
they tempt consumers to buy cheaper products. A more far-fetched reason was that consumers
get less exercise when they visit only one store for their grocery shopping. However, the
authors warned that more in-depth research is required to infer causal relations between health
and shopping at supermarkets.
Traditional food shops need help to survive the growing competition from supermarkets. One
way to help entrepreneurs is to assist them in determining the best locations for
new shops. This can be done with a site selection method. An aid for site selection is useful,
since opening a new food specialist store faces high financial risk in the retail sector (Roig-
Tierno et al., 2013).
Church (2002) mentioned it is easy to conclude that the success of many location applications
in the future may be intimately tied to GIS. Sarkar (2007) emphasised that the barriers to
implementing GIS in business are diminishing. He stated that computing power is increasing, data
availability is wider, software is more widely available, integration with corporate databases
is easier, and the internet is increasingly used to share data and software. Therefore, a GIS
method is proposed.
Especially in the Netherlands, a lot of potentially useful spatial data are available, which can be
applied in a site selection method. GIS could turn these data into information by combining
different data sources and methods. The method I propose is aimed at small retail owners in the
food industry and helps them find one or more locations for their venture. Other studies primarily
focused on site selection for supermarkets outside the Netherlands, with inputs that are not
easily accessible (Suárez-Vega et al., 2014; Turk et al., 2014; Rui et al., 2016). Suárez-Vega et
al. (2014) utilised a commercial and an industrial index. Turk et al. (2014) distributed 1,100
questionnaires. Rui et al. (2016) used store brands and store types.
In conclusion, a site selection method, designed for bakeries, butchers and greengroceries in
the Netherlands, was developed in this research. As mentioned by Rui et al. (2016), network
KDE in combination with demographic data can be applied to explore site selection for
retail stores. To my knowledge, no site selection method based on demographic data
and a network-constrained point pattern yet exists for small food ventures in the Netherlands.
Other studies required input data that are not easy to retrieve and are therefore not
easily applicable. The method also remains feasible when a retail owner wants to open more
than one store, as it makes use of a heuristic spatial optimisation method.
This thesis targets small retailers in the food industry, especially those from the Netherlands.
The focus is mainly on the Netherlands, since the required data may not be readily available in
other countries. Besides that, the attractiveness factors were based on market research in the
Netherlands. The site selection method was established by the following research design. First,
a literature study was conducted to define an objective function. Second, the required data
sources were retrieved and processed, and the methods were implemented. To this end, an
explorative analysis was conducted to examine whether the data could be obtained and
transformed and to examine the methods. Third, a user interface was developed in which users
can adjust weights and other settings, which in turn affect the objective function, and
visualise the results. Fourth,
a case study was conducted to test the method. Finally, a sensitivity analysis was conducted to
assess the sensitivity of the approach to parameter settings. The research aims to answer the
research questions described below.
1.3. Research questions
The objective of this thesis is to create a site selection method for small retail owners in the
food industry. Retail owners can use the method as an aid in their site selection decision process.
To reach the main objective, four research questions need to be answered:
1. Which objective function is usable for site selection for food retail stores?
2. Which optimisation method can be implemented to find optimal locations?
3. Where in a case study area lie opportunities to start a new, small multi-location food
venture?
4. How sensitive are the outcomes of the site selection to different parameter settings?
1.4. Structure of the report
This thesis consists of six chapters. Chapter 2 is a review of the literature on location analysis
and GIS; the methods used in this thesis are also reviewed. Chapter 3 explains the
methodology. Section 3.1.1 presents the composition of the objective function. Then, section
3.1.2 describes how the data for the variables of the objective function were retrieved and
calculated; the study area is also presented there. Section 3.2 presents the methodology of the
optimisation methods. Section 3.3 explains how the case was executed and section 3.4 presents
the sensitivity analysis. In chapter 4 the methods are applied to a case study. First, an
explorative analysis is described that aimed to check whether all the data could be retrieved.
Chapter 5 discusses the results and compares them to results found in the literature. Chapter 6
covers the conclusions, limitations and recommendations. The methods, results, discussion, and
conclusion chapters follow the order of the research questions.
2. Review
The review chapter presents the results and methods from relevant literature. Ideas and
methods from various researchers are summarised to give an overview of the main concepts in
the site selection field.
2.1. Site preferences in relation to space
One of the main factors which influences the feasibility of a retail store is the spatial dispersion
of retailers and consumers (Davis, 2006). Geo-demand and geo-competition hinge on this
factor. Geo-demand refers to the position of customers and geo-competition to the location of
competitors (Roig-Tierno, Baviera-Puig, Buitrago-Vera & Mas-Verdu, 2013). However, the
site selection depends on more than just spatial dispersion (Wood & Reynolds, 2012).
Suárez-Vega et al. (2012) noted that there are two frequently conflicting objectives in site
selection: maximisation of the total market share captured by the firm and minimisation of
the market share that its existing facilities lose to the new store. The authors
combined location models and GIS to create tools to help locating one new store in a franchise
distribution system in a continuous space (Suárez-Vega et al., 2012). They concluded that GIS
tools can broaden the vision of an entrepreneur when opening a new store. Suárez-Vega et al.
(2014) used two methods: a weighting and a constraint method. In the weighting method a
weight was assigned to each function and the weighted sum of these functions was optimised.
The constraint method showed the cannibalisation cost. Cannibalisation occurs when a new
chain store causes a decline in customers for another store in the same chain. The
cannibalisation cost was subtracted from the weight value.
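The weighting method described above reduces to optimising a weighted sum of criterion scores. As a minimal sketch (using the competition, demand, and attractiveness criteria that this thesis later adopts; the function name, weights, and values below are invented for illustration, not taken from Suárez-Vega et al.):

```python
def site_utility(competition, demand, attractiveness,
                 w_comp=1.0, w_dem=1.0, w_att=1.0):
    """Composite utility of a candidate site.

    All inputs are assumed normalised to [0, 1]; competition
    lowers utility, demand and attractiveness raise it.
    """
    return -w_comp * competition + w_dem * demand + w_att * attractiveness

# A site with no nearby competitors scores higher than a contested
# site with identical demand and attractiveness.
good = site_utility(competition=0.0, demand=0.9, attractiveness=0.6)
bad = site_utility(competition=0.8, demand=0.9, attractiveness=0.6)
```

Adjusting the weights shifts the trade-off between avoiding competitors and chasing demand, which is exactly what the user interface developed in this thesis exposes.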
In contrast, Turk et al. (2014) did not consider competition. Their regional model only
considered socio-demographic variables which were used for creating consumption maps to
find optimal store locations. Turk et al. (2014) distributed questionnaires to obtain information
about consumption at a regional scale.
Another well-known site selection method commonly used in the field of retail distribution is
the Huff model (Huff, 1964). It is the most common way to delimit a trade area (Baray &
Cliquet, 2007). The Huff model assumes that two factors determine the trade area: the distance
between customers and stores and the attractiveness of the stores (Huff, 1964). Suárez-Vega et
al. (2015) presented an adjusted Huff model with spatial nonstationary parameters: distance,
attractiveness, and competition. While global models ignore individual customer preferences,
their local Huff model accounted for them. Suárez-Vega et al. (2015) calibrated the Huff model
via geographically weighted regression (GWR). GWR is a local regression technique that
estimates parameters for every point in the study area, showing the variability over the analysed
space (Brunsdon et al., 1996). This technique is based on the principle that things that are
close together tend to have similar values: locally observed data are more influential than
data from more remote locations.
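In the original (global) Huff model, the probability that a customer patronises store j is proportional to the store's attractiveness divided by a power of the distance to it. A minimal sketch, where the distance-decay exponent beta is an assumed value rather than one calibrated in the cited studies:

```python
def huff_probabilities(attractiveness, distances, beta=2.0):
    """Huff (1964) patronage probabilities for one customer.

    attractiveness: attractiveness value A_j of each store
    distances: distance d_j from the customer to each store
    beta: distance-decay exponent (an assumed value here)
    """
    scores = [a / d**beta for a, d in zip(attractiveness, distances)]
    total = sum(scores)
    return [s / total for s in scores]

# With equal attractiveness, the store at half the distance captures
# four times the probability mass when beta = 2.
p = huff_probabilities([1.0, 1.0], [500.0, 1000.0])
```

Making attractiveness, competition, and the decay exponent vary over space, as in the local model of Suárez-Vega et al. (2015), amounts to re-estimating these parameters at every location.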
In another study (Sadler, 2016), local expert knowledge was used to optimise new sites for
healthy food ventures in socially distressed areas. A three-step process was introduced, which
included an analytic hierarchy process (AHP), direct mapping, and point allocation of key
variables. AHP is a method to assess the importance of variables in multi-criteria decision
making: experts score variables pair-wise, deciding which of the two variables is more
important. In the direct mapping process, experts were asked to mark where they thought a new
location would be of most value. In the point allocation process, experts were asked to
distribute a fixed number of points among the variables.
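A common way to turn AHP pairwise judgements into criterion weights is the principal-eigenvector method. The 3x3 comparison matrix below is invented for illustration and does not come from Sadler (2016); entry A[i, j] expresses how much more important criterion i is than j on Saaty's 1-9 scale, with A[j, i] = 1 / A[i, j]:

```python
import numpy as np

# Hypothetical pairwise comparison matrix for three criteria.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

# The priority weights are the principal eigenvector of A,
# normalised so the weights sum to one.
eigvals, eigvecs = np.linalg.eig(A)
principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
weights = principal / principal.sum()
```

For this matrix the first criterion receives the largest weight, reflecting that it dominated both pairwise comparisons.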
Roig-Tierno et al. (2013) combined a kernel density estimation with an AHP. The kernel
density estimation was used to find hotspots of potential customers. Those hotspots were ranked
with the help of an AHP.
Scott and He (2012) developed a constrained model to predict destination choice for shopping.
Two of their results are interesting for this thesis. First, seniors prefer shopping at bakeries.
Second, the income determines where people prefer to shop. These statements coincide with
findings of Detailhandel Nederland, a Dutch foundation which provides information on retail
in the Netherlands (Raatgever, 2014). Detailhandel Nederland expects that in the future, only
small food retailers in city centres will be feasible (Raatgever, 2014). The market research
company GfK claimed that food retail specialists gain large turnovers from wealthy elderly
(Holla, 2013), who tend to live in certain neighbourhoods.
Some studies (Suárez-Vega et al., 2015; Goodchild, 1984) use store attractiveness instead of
site attractiveness. For example, Suárez-Vega et al. (2015) only consider store size. The bigger
the store size, the higher the attractiveness value, which may apply to supermarkets but perhaps
not to specialist food shops. Besides, store size data are not easily available on a bigger scale.
Goodchild (1984) states that store attractiveness characteristics are not only physical, but also
short-term variables, like pricing and advertising. According to Goodchild (1984), two
problems occur when implementing store attractiveness characteristics. First, when locating
more than one store, the assumed locations influence the competition variable, and a store
size cannot be assigned to a hypothetical location. Second, as noted before, store size data
are hard to retrieve. Therefore, I have chosen to use site attractiveness criteria.
2.2. Spatial networks
Cui et al. (2012) proposed a method to precisely delimitate a trade area, focusing on chain
businesses. These are individual stores with the same brand name, such as “Bakker Bart” or
“Keurslager”. The authors stressed the importance of measuring trade areas since this improves
the understanding of existing market opportunities and it facilitates predicting sales. As a
consequence, decision-making entails lower risks. To determine the trade area, Cui et al. (2012)
used spatial networks.
Spatial networks predict accessibility to stores in a more realistic way than a planar space (Cui
et al., 2012). A spatial network is a street network, which is turned into a graph. Graphs consist
of nodes and edges. Nodes are created on every road intersection and the edges are the links
between two nodes. All edges in the network space have attributes, such as travel time or
distance. The time attribute is defined by the time it would take to get from one node to another
node over an edge. The distance attribute equals the distance between two linked nodes.
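As a minimal sketch of this idea, a network can be stored as an adjacency map and shortest network distances computed with Dijkstra's algorithm; the three-intersection network below is purely hypothetical.

```python
import heapq

# A tiny hypothetical street network: (intersection, intersection, distance in metres).
edges = [("A", "B", 120), ("B", "C", 80), ("A", "C", 260)]

# Build an undirected adjacency map: node -> list of (neighbour, edge length).
graph = {}
for u, v, d in edges:
    graph.setdefault(u, []).append((v, d))
    graph.setdefault(v, []).append((u, d))

def network_distance(graph, start, goal):
    """Shortest-path distance in the network space (Dijkstra)."""
    dist = {start: 0}
    queue = [(0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue
        for nbr, w in graph[node]:
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return float("inf")

print(network_distance(graph, "A", "C"))  # 200: the route via B beats the 260 m direct edge
```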
2.3. Network kernel density estimation
Spatial networks can be used to create network-constrained density maps. Rui et al. (2016)
studied retail hot-spot areas using two network-constrained point pattern analysis methods:
network kernel density estimation (KDE) and network K-function. The network KDE and
network K-function were applied as indicators of the distribution of retail stores.
KDE is a popular method for point-event distributions. Xie and Yan (2008) introduced a KDE
in a network space. It is computed using equation 1:

λ(s) = Σ_{i=1}^{n} (1/r) · k(d_is / r), (1)

where
d_is is the network distance between point i and point s (it matters that this distance is measured in the network space, not in planar space),
λ is the density,
s is the point location,
r is the bandwidth; the distance within which points are taken into account when calculating the density for a particular location,
k is the kernel.
The density is calculated per point. Point i is a source point, i.e. the network point closest to a
store. The kernel determines the weight of point s at distance d_is; thus, the kernel function
returns a weight value based on the distance between point i and point s. The three most
commonly used kernel functions are the Gaussian function, the quartic function, and the
minimum variance function (Schabenberger & Gotway, 2005).
2.4. Spatial optimisation
Optimisation algorithms select the best results for a given objective function (Neun et al., 2008).
According to Tong and Murray (2012), there are two possible methods to find an optimal
solution of a site selection objective function: exact and heuristic. The exact search method
extensively calculates all possibilities, which results in finding an optimal solution (Tong &
Murray, 2012). Heuristic methods are based on strategies and rule-of-thumb procedures to
solve an optimisation problem (Tong & Murray, 2012).
Simulated annealing is an example of a heuristic method for spatial optimisation. Kirkpatrick
et al. (1983) and Cerny (1985) independently introduced simulated annealing. When an
objective function is available, but no solution can be calculated within a feasible time limit,
then simulated annealing is a possibility. Simulated annealing iteratively tries to find the
maximum or minimum of a function (Steiniger & Weibel, 2005). Simulated annealing starts
with an initial state. Then, another state is proposed. If the utility of that state is better, then the
proposed state is accepted. The proposed state becomes the current state. However, if the utility
of the proposed state is lower than the utility of the current state, the proposed state is accepted
with a certain probability. The probability is determined by the temperature, a variable that
decreases in every iteration, and by the difference in utility between the proposed and current state.
The acceptance probability function (adapted from Kirkpatrick et al., 1983) is shown in equation 2.
Noticeably, equation 2 strongly depends on the temperature:

Probability = e^((S′ − S)/T), (2)

where
S′ is the utility of the proposed state,
S is the utility of the current state,
T is the current temperature.
If the difference between the utility of the proposed state and the utility of the current state is
close to zero, then the probability of acceptance increases. Also, the higher the temperature, the
higher the possibility that a proposed state is accepted. In every iteration, the temperature
decreases with a certain cooling ratio. Therefore, with every iteration, the probability decreases
that a state with a lower value is accepted as the current state.
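The acceptance rule of equation 2 can be sketched in a few lines; the utilities and temperatures below are made-up numbers, chosen only to show how the temperature controls the acceptance of worse states.

```python
import math

def acceptance_probability(proposed, current, temperature):
    """Equation 2 for a maximisation problem: worse states are accepted
    with probability exp((S' - S) / T); better states are always accepted."""
    if proposed >= current:
        return 1.0
    return math.exp((proposed - current) / temperature)

# The same utility drop of 0.1 is far more acceptable at a high temperature.
p_hot = acceptance_probability(0.4, 0.5, temperature=4.0)    # ≈ 0.975
p_cold = acceptance_probability(0.4, 0.5, temperature=0.05)  # ≈ 0.135
print(p_hot, p_cold)
```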
Other heuristic optimisation methods are known as greedy algorithms. Aboolian, Berman and
Krass (2007) showed that for store siting problems the greedy algorithm is an efficient method
to obtain close-to-optimal quality solutions. In site selection, a greedy algorithm iteratively finds
the location with the highest utility derived from an objective function, given the current
locations of existing stores. This location is saved as an optimal solution and is added to the
stock of existing stores. The process is repeated until the number of selected locations is equal
to n new stores. The greedy algorithm finds the true optimal location if only one store needs to
be found, but it may not find the jointly optimal solution if multiple locations need to be found;
in that sense, the method is only locally optimal.
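A greedy selection of this kind can be sketched as follows; the segments, base utilities, and neighbour penalty are hypothetical, standing in for the competition update a real implementation would perform after each allocation.

```python
def greedy_select(segments, utility, n_stores):
    """Greedily pick n locations; utility(seg, chosen) may depend on the
    stores already allocated (the competition update)."""
    chosen = []
    for _ in range(n_stores):
        best = max((s for s in segments if s not in chosen),
                   key=lambda s: utility(s, chosen))
        chosen.append(best)
    return chosen

# Toy example: base utilities, with a penalty for siting next to a chosen store.
base = {0: 0.9, 1: 0.8, 2: 0.3, 3: 0.7}
def utility(seg, chosen):
    penalty = 0.5 * sum(1 for c in chosen if abs(seg - c) == 1)
    return base[seg] - penalty

print(greedy_select(list(base), utility, 2))  # [0, 3]
```

Segment 1 has the second-highest base utility, but after segment 0 is allocated it is penalised, so the locally best next choice becomes segment 3.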
2.5. Food retailer location data
Sienkiewicz, a former Master Geo-Information (MGI) student, described a method to extract
Google Places data and to save it in CSV format (Sienkiewicz, 2015). She compared
the Google Places data to two other food retailer location sources and found that the Google
Places data is the most complete data source for food retailer locations.
3. Methods
This chapter clarifies how the research questions were answered. It supports the reproducibility
of this research by explaining how the data were obtained and which methods were used.
Moreover, it justifies why these methods were chosen.
3.1. Objective function
3.1.1. Composite utility
Three commonly used variables in site selection are demand, competition, and attractiveness
(Goodchild, 1984; Huff, 1964). With these variables, it is possible to predict the number of
consumers likely to shop at a particular retail venture. The proposed site selection method
considers site attractiveness rather than store attractiveness, because the target store(s) are
unknown, and because of the scale of the research and the availability of data. Suárez-Vega et
al. (2015) claim that traveling
cost is more important than the size of a supermarket. In this thesis, it is assumed that this also
holds for small traditional food retailers. For food retailers and convenience stores, it is known
that demand is elastic with respect to distance, meaning that when the distance to a store
becomes greater, the chance is bigger that a customer goes to a competitor (Goodchild, 1984).
Food is a primary good, therefore my method assumes that demand is proportional to the
population. In this thesis’ utility assessment, network KDE and the demographic data were
combined with a site attractiveness variable.
Each individual road segment in the road network received a site suitability value, also called
utility value. A weighted sum criterion is proposed, based on the variables which are used in
the market share model (Goodchild, 1984) and the variables of the multiplicative competitive
interaction model (Suárez-Vega et al., 2015). The weighted sum function has three normalised
terms, each pondered by a weight. A single utility was computed, using equation 3:
U_j = w1 · (1 − d_j) + w2 · p_j + w3 · a_j, (3)
where
U is the utility,
j is a road segment,
w are the weights,
d is a normalised density variable of food retailers,
p is a normalised demand/population variable,
a is a normalised site attractiveness variable.
The input data were normalised to range between zero and one. The weights can be set
according to individual preferences of the user, because the importance of the criteria may shift
among entrepreneurs, especially among specialisations. For example, a butcher could have
more interest in the demographic characteristics while a bakery may be more affected by local
competition. The weights are also limited to a range between zero and one. After setting the
selected number of stores to ≥1, the joint utility is calculated. The joint utility equals the sum of
the utilities of all selected road segments when optimising multiple ventures (equation 4):
JU = Σ_{i=1}^{n} U_j, (4)
where
JU is the joint utility,
j is a road segment,
n is the number of stores,
U is the utility.
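Equations 3 and 4 amount to the following sketch; the weights and the (density, demand, attractiveness) triples are hypothetical illustration values.

```python
def utility(d, p, a, w1=1.0, w2=1.0, w3=1.0):
    """Equation 3: U_j = w1*(1 - d_j) + w2*p_j + w3*a_j, with all inputs
    normalised to [0, 1]; a low competition density raises the utility."""
    return w1 * (1 - d) + w2 * p + w3 * a

def joint_utility(segments, **weights):
    """Equation 4: the joint utility is the sum of the utilities of the
    road segments selected for the n stores."""
    return sum(utility(d, p, a, **weights) for d, p, a in segments)

u1 = utility(0.2, 0.6, 0.5)   # ≈ 1.9
u2 = utility(0.1, 0.4, 0.3)   # ≈ 1.6
ju = joint_utility([(0.2, 0.6, 0.5), (0.1, 0.4, 0.3)])  # ≈ 3.5
```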
Figure 3.1 shows a diagram of the objective function composition. The site selection objective
function is both an output of this composition and the function to be optimised. An explorative analysis was required to discover whether
the required input variables can be acquired, to test the used methods, and to provide the data
for a case and a sensitivity analysis.
Figure 3.1: Diagram composite objective function; hexagons represent input, squares are functions, and
ellipses represent outputs.
Composite attractiveness
Areas with a high urban classification, a high percentage of elderly people (65+), and high
incomes were deemed attractive sites for small food retailers. Households were categorised as
households with high incomes, when they belong to the top 20% incomes in the Netherlands.
The data for the attractiveness variable were retrieved from the CBS (2014). The attractiveness
criterion did not distinguish weight values for the separate elements in the attractiveness
criterion, because their true relative importance was not known; it was not found in the literature
or in any available market research. Therefore, the elements of the attractiveness variable were
to contribute approximately equally, which required data normalisation. The percentage of high
incomes and the number of elderly people were normalised using equation 5:
x′ = (x − min(x)) / (max(x) − min(x)), (5)
where
x’ is the normalised value,
x is the original value.
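In code, equation 5 is a one-liner over the list of regional values; the numbers below are arbitrary.

```python
def min_max(values):
    """Equation 5: x' = (x - min(x)) / (max(x) - min(x))."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

print(min_max([10, 20, 30, 50]))  # [0.0, 0.25, 0.5, 1.0]
```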
This normalisation technique was not usable for the urban classification, since the original
urban classification was categorised by the CBS as an ordinal value between one and five,
divided as follows:
- Dense urban areas (address density of more than 2,500 addresses/km²) have a value of one,
- Urban areas (address density of 1,500 to 2,500 addresses/km²) have a value of two,
- Moderate urban areas (address density of 1,000 to 1,500 addresses/km²) have a value of three,
- Minor urban areas (address density of 500 to 1,000 addresses/km²) have a value of four,
- Non-urban areas (address density of fewer than 500 addresses/km²) have a value of five.
To normalise the results, the classification values one, two, three, four, and five were turned
into one, 0.75, 0.50, 0.25, and zero, respectively. The areas, classified as one by the CBS,
represent the most attractive sites for the urbanity factor. Small food retailers are only feasible
in urban areas, as explained in section 2.1. Therefore, non-urban areas were assigned a value of
zero.
The site attractiveness variable was again normalised, because it was an input of the utility
formula, which worked with weights to determine the impact per element. Accordingly, the
attractiveness variable (equation 6) is the sum of three elements, divided by the maximum
value:
a_region = (Hinc_region + Eld_region + Urb_region) / max(a), (6)
where
region is one 500 by 500 metres square,
a is the attractiveness variable,
Hinc is the normalised high income value,
Eld is the normalised number of elderly people,
Urb is the normalised urbanity category.
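The urbanity recoding and equation 6 can be sketched as below; note that dividing by 3 assumes max(a) equals the maximum attainable sum of the three normalised terms, which is an illustrative assumption rather than the thesis's exact computation over observed regions.

```python
def recode_urbanity(cbs_class):
    """Map CBS urbanity classes 1..5 onto 1, 0.75, 0.5, 0.25, 0."""
    return (5 - cbs_class) / 4

def attractiveness(hinc, eld, urb_class, max_a=3.0):
    """Equation 6: summed normalised terms divided by max(a); max_a = 3
    (the maximum attainable sum) is assumed here for illustration."""
    return (hinc + eld + recode_urbanity(urb_class)) / max_a

print(recode_urbanity(1), recode_urbanity(5))  # 1.0 0.0
print(attractiveness(1.0, 1.0, 1))             # 1.0
```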
Section 1.2 claimed that greengroceries, bakeries, and butchers suffer from the competition of
supermarkets. Therefore, store location data from those food retailers were required.
Competition also occurs within the same branch. The point input data (the locations of food
retailers) for the network KDE therefore consisted of the supermarket locations together with
the locations of the specific food specialists, to fully represent the competition. For instance,
bakery locations and supermarket locations were taken into account when calculating the
network KDE for a bakery.
3.1.2. Study area
Location
The data collection was executed for a region in the Netherlands, covering the cities
Amersfoort, Arnhem, Apeldoorn, as well as several smaller cities and villages. This area covers
a wide variation in population and food retailer densities. The study area has a size of
approximately 1,900 square kilometres and is visualised in figure 3.2.
Figure 3.2: Study area.
Explorative analysis
An explorative analysis was conducted to test the feasibility of the calculation of the network
KDE and to examine whether the required data and analyses are feasible. For convenience
purposes, the maximum time required for the network KDE calculations was set to 48 hours.
The network KDE was applied to find hotspots of roads near food retailers. Hotspots are roads
which are located near large numbers of stores. How far someone can travel in a network space
is determined by a constraint, expressed in either distance or time: a distance constraint defines
which part of the network is reachable within a certain travel distance, a time constraint which
part is reachable within a certain travel time. Only a distance attribute, the length of a road, was
considered as a constraint, not a time attribute, for two reasons. First, because of the scale of
the study. Second, some consumers shop by car, others by bike or on foot, which made it very
hard to determine the traveling times of roads. The network KDE was computed according to
the method of Xie and Yan (2008).
The midpoints of existing road segments were determined to construct the network KDE. To
avoid excessive computation time, the segments were not split into equal lengths. Since stores
are typically located along short segments in areas with many road intersections, this strategy
was assumed to have at most a minor effect on model outcomes. For every store, the closest
road segment was determined. If the number of stores assigned to a segment was at least one,
the segment was saved as a source segment. The count values, i.e. the numbers of stores, were
assigned to the midpoints of the source segments.
Also, a bandwidth, r, was established. Every source midpoint got a so-called service area or
catchment area. A service area is a polygon which presents all roads that are located within the
bandwidth of a source midpoint. Only the KDE of midpoints falling within the service areas
was calculated; the KDE of the remaining midpoints was set to zero.
A Python script was developed to perform the network KDE (Appendix I) with the ArcPy
package. For the explorative analysis I used the location of supermarkets and bakeries as the
input point data. Two density maps were developed. One with a bandwidth set to 250 metres
and the other one with a bandwidth set to 500 metres. The kernel was set to quartic, which is
defined by equation 7:
K_j = (3/π) · (1 − dis_jv² / r²)², (7)

where
K is the kernel value,
j is a road segment,
dis_jv is the network distance between road segment j and the corresponding source road
segment, which produces the distance-decay effect,
r is the bandwidth.
The KDE values were normalised, in order to fit in the composition of the objective function.
The network KDE is computationally intensive: with the bandwidth set to 500 metres,
approximately 300 road segments were found in the service area of every source segment, and
for all these road segments a distance-decay value had to be calculated.
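The kernel evaluation at the heart of this computation can be sketched by combining equations 1 and 7; the distances below are hypothetical network distances from one midpoint to the source midpoints.

```python
import math

def quartic(d, r):
    """Quartic kernel of equation 7: (3/pi) * (1 - d^2/r^2)^2 within the
    bandwidth, zero outside it."""
    if d >= r:
        return 0.0
    return (3 / math.pi) * (1 - d ** 2 / r ** 2) ** 2

def network_kde(distances, r=500):
    """Equation 1: density at a midpoint, summing (1/r) * k(d/r) over the
    source midpoints whose network distance d lies within the bandwidth."""
    return sum(quartic(d, r) / r for d in distances)

# A midpoint with sources at 100 m, 400 m, and 800 m network distance;
# the last source falls outside the 500 m bandwidth and contributes nothing.
density = network_kde([100, 400, 800])
```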
A summary of the network KDE calculations is shown in a flowchart in figure 3.3:
Figure 3.3: Flowchart network KDE calculation; the hexagons represent data, rectangles represent actions,
and the diamond represents a conditional statement.
Socio-demographic data were retrieved from the Central Bureau of Statistics (2014). Network
data were acquired through Open Street Map (OSM). The network data provided by OSM uses
data from the Nationaal Wegen Bestand (NWB) (OSM, 2016). The network data were clipped
to the study area. OSM also contains data on shops, such as supermarkets and bigger chains
like Hema and Action. It is questionable, however, whether OSM contains enough data to
support a site selection method. The Google Places API is more complete; its problem is that a
restricted number of food retailers is returned per query.
CBS provides demographic data at a resolution of 500 × 500 metres. This was deemed a
suitable cell size, because most consumers visit a food retailer at a distance between 250 and
500 metres (Veenstra et al., 2010). These corresponding squares, called regions in this thesis,
provided extensive socio-demographic data. For the demand variable, only the most recent
population quantities (2014) were required. Data on the percentage of high incomes, the urban
classification, and the number of people aged above 65 years originate from the year 2011,
2013, and 2014, respectively. It turned out that the spatial extent of the socio-demographic data
did not exactly match that of the study area. Demand values of -99,998, denoting zero
population, were converted into zero. The data squares were overlaid on the midpoints of road
segments and, using a spatial join, their values were added to the road network.
Google Maps retailer location data were retrieved using the Google Places API (Google, 2017).
The API returns data in XML or JSON. You can choose to search nearby by adding a location
to the URL, but it is also possible to do a text search using keywords. Even a third option was
available: radar search. The radar search is almost identical to the search nearby option. The
difference is that it returns more results in a single query, but with less attribute information.
The radar search was the most convenient, since the area hosts a large number of food retailers
and extra attribute information was not needed. Three compulsory parameters need to be defined
in the URL query: key, location, and radius. A personal API key was mandatory to use the
Google data. The key is easily requested at the developers’ website of Google. The location is
defined by WGS84 longitude and latitude, and the search radius defines the buffer around this
location in which retailers are to be searched. The query returned the geographic coordinates of
the location of retailer stores.
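A sketch of how such a query can be built and its JSON response parsed is shown below (the radar search endpoint has since been deprecated by Google); the coordinates and the mocked response are invented, and YOUR_API_KEY is a placeholder.

```python
import json
from urllib.parse import urlencode

def radar_search_url(lat, lon, keyword, key="YOUR_API_KEY", radius=4000):
    """Assemble a radar-search URL; YOUR_API_KEY is a placeholder."""
    base = "https://maps.googleapis.com/maps/api/place/radarsearch/json"
    params = {"location": f"{lat},{lon}", "radius": radius,
              "keyword": keyword, "key": key}
    return f"{base}?{urlencode(params)}"

# Parsing a (mocked) JSON response into (lat, lon) coordinate pairs.
mock_response = json.dumps({"results": [
    {"geometry": {"location": {"lat": 52.09, "lng": 5.39}}},
    {"geometry": {"location": {"lat": 52.10, "lng": 5.40}}},
]})
coords = [(r["geometry"]["location"]["lat"], r["geometry"]["location"]["lng"])
          for r in json.loads(mock_response)["results"]]
print(coords)  # [(52.09, 5.39), (52.1, 5.4)]
```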
An R-script (Appendix II) was developed to obtain the data and to transform these from JSON
into CSV. A fishnet was created in ArcGIS. A fishnet is a net of rectangular polygons. This was
utilised, because the Google API returned a maximum of 200 results per query and the study
area contains more food retailers than that. The fishnet consisted of 70 points, each placed in
the centre of a square with sides of 5,200 metres. The identification numbers and the latitude
and longitude of these points can be found in Appendix III. It was assumed that the number of
food retailers situated within a search radius never exceeded 200. An edge length of 5,200
metres for the fishnet was convenient, because the resulting 70 squares approximately fit the
data collection area.
The search radius parameter in the Google API query had to be properly set to fully cover the
study area. Figure 3.4 shows the used fishnet. The green dots represent the points which were
used as location input in the query.
Figure 3.4: Fishnet over the study area.
Square search areas would avoid overlap between queries; circular areas, by contrast, give
overlapping results. The Google Places API only supports circular search areas, although square
areas would have been more convenient. Therefore, results outside the study area and duplicate
retailer stores occurred; these were removed.
A second problem occurred as well. The bakery query required an extra filter because,
otherwise, companies using the common Dutch family name “Bakker” would also be returned.
The “types” filter was added to the bakery query. No types filter was available for the butcher
and greengrocery queries. Fortunately, a short examination on Google Maps showed that only
butchers and greengroceries were found with the corresponding keywords. Four URL queries
were generated to extract supermarket, bakery, butcher, and greengrocery locations:
https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=supermarkt&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4
https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=bakker&types=bakery&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4
https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=slager&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4
https://maps.googleapis.com/maps/api/place/radarsearch/json?location=LOCATIONINPUT&radius=4000&keyword=groentewinkel&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4
3.2. Optimisation method
3.2.1. Optimal locations
Numerous methods are available for location optimisation. Every allocated location changes
the competition variable. Therefore, an exhaustive search optimising three locations in a study
area with 24,000 road segments would have to evaluate about 2.3 × 10^12 potential solutions,
each requiring a re-computation of the density, which is computationally expensive. An
optimisation method was implemented to support a food specialist retailer who wants to open
multiple stores. In that case, more than one location had to be found simultaneously.
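The size of the solution space follows directly from the number of combinations of 3 segments out of 24,000:

```python
from math import comb

# Choosing 3 store locations out of 24,000 road segments.
print(comb(24000, 3))  # 2303712008000, i.e. about 2.3 x 10^12
```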
In this case, only heuristic optimisation methods were feasible, because there is no analytic
solution while exhaustive search over the solution space is infeasible. The heuristic approaches
which were tested in this thesis were a greedy algorithm and simulated annealing.
3.2.2. Greedy algorithm
In this greedy algorithm, one store is added at a time, each time re-computing the competition
variable and selecting the next optimal solution until n stores are allocated. In case of multiple
equally optimal sites, the location was chosen by random assignment. If only a single store is to
be allocated, the used greedy algorithm is optimal. Otherwise the approach is only locally
optimal, since the allocated stores influence the competition variable without taking the next
to-be-allocated stores into account.
I developed a script, which calculated the utility for every road segment in the study area. The
pseudo code for this script can be found in box 3.1. The road segments with the highest utilities
were found and their ID number was put in a list. A random optimal solution was picked if
more than one optimal solution was listed. The random optimal solution, a road segment, was
saved as the optimal location and the corresponding midpoint was found. If more than one
optimal location was required, then a service area was generated around the corresponding
midpoint. The count value was set to four, to ascertain that no cannibalisation effect would
occur; thus, the competition density increased strongly around the newly found location. The
KDE values were added to the competition variable, and the newly calculated competition
variable was used for the next location.
Set bandwidth for KDE and define kernel functions
Set number of stores and variable weights
Loop through number of stores
o Compute KDE given the current spatial configuration of stores
o Loop through all road segments
  - Find variable values
  - Calculate utility
  - If utility is larger than current optimal utility, then
    - Save road segment in a new list
    - Utility becomes the optimal utility
  - If utility is equal to current optimal utility, then
    - Add road segment to optimal solution list
o Pick random optimal solution from optimal solution list
o Save found road segment as optimal location and add it to current configuration

Box 3.1: Pseudo code greedy algorithm in a network space.

3.2.3. Simulated annealing
Besides the greedy algorithm optimisation method, a simulated annealing approach was
developed for optimising multiple store locations simultaneously. Its script was quite different
from the greedy algorithm script (box 3.1). Due to time constraints in combination with the
computation time, the maximum number of iterations was set to 20,000. More iterations would
have increased the quality of the results. Besides the maximum number of iterations, a
maximum number of non-accepted solutions was set to 5,000. The number of non-accepted
solutions is the number of times that a proposed composite of road segments was not selected
as the new current composite of road segments. The cooling ratio was set to 0.9995, which
decreases the temperature in each iteration. The approach was only tested using data on bakeries
and supermarkets for the simultaneous allocation of five bakeries. The initial temperature was
set to four and the bandwidth was set to 500 (table 3.1).
Table 3.1: Simulated annealing settings.
Number of iterations 20,000
Maximum number of non-selected solutions 5,000
Number of stores 5
Cooling ratio 0.9995
Location data Bakeries and supermarkets
Initial temperature 4
Bandwidth 500
Box 3.2 shows the pseudo code of the simulated annealing process utilised in this thesis. Initially
five randomly chosen road segments were selected. The joint utility was calculated for this
composite of stores. Then, an iterative process started until the maximum number of iterations
or the maximum number of non-selected solutions was reached. In each iteration, one randomly
selected location was moved to a different, randomly selected road segment. To avoid a
cannibalisation effect, road segments within 500 metres of the other four locations were not
allowed to be selected.
Accordingly, no new density map had to be computed, which would have been a
computationally expensive procedure. As no cannibalisation effect was allowed to occur, the
new proposed joint utility was readily calculated. The proposed road segments were accepted
when the proposed joint utility was higher than the current joint utility. If the proposed joint
utility was lower than the current joint utility, the proposed road segments were accepted with
a certain probability. The last found combination of road segments was saved as the optimal
solution. The script also returned two lists: a list with the probabilities that a lower utility was
selected and a list containing the successively found joint utilities. These lists were used to
create traces; the traces were utilised to examine the simulated annealing settings.
Run simulated annealing function, select number of stores and variable weights:
o Set cooling ratio, initial temperature, maximum number of iterations, maximum number of non-
accepted solutions and the bandwidth
o Randomly find initial road segments
o Loop through initial road segments:
Calculate utility
Add utility to initial joint utility variable
o Loop until number of iterations or number of non-selected solutions has reached its maximum
Find random segment out of selected road segments, which moves to another proposed
road segment
Find random proposed road segment
Check if no cannibalisation occurs:
Find distances between new proposed segment and the selected segment
which did not move
If every distance between the proposed midpoint and the other selected
midpoints is larger than the bandwidth, then
o This point is proposed
Else, then
o Propose another road segment until no cannibalisation occurs
Calculate new proposed joint utility
Find difference in joint utility (proposed joint utility – joint utility) and calculate
probability (equation 2)
If difference is smaller than zero, then
Add probability to probability list
If difference is greater than or equal to zero, then
Proposed road segments are accepted
If difference is smaller than zero and the pseudo-random number is smaller than the
probability, then
Proposed road segments are accepted
If difference is smaller than zero and the pseudo-random number is larger, then
Proposed road segments are not accepted
Temperature cools down (cooling ratio)
Add joint utility value to joint utility list
o Save last accepted road segments
Open text files to write the list with the consecutive joint utilities, and the list with the probability that a
worse solution is accepted
Box 3.2: Pseudo code simulated annealing.
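For a single location, the loop in Box 3.2 can be condensed into the following sketch; the candidate segments and the toy utility function are hypothetical, and the cannibalisation check is omitted for brevity.

```python
import math
import random

def simulated_annealing(utility, candidates, n_iter=20000,
                        t0=4.0, cooling=0.9995, seed=42):
    """Sketch of the annealing loop: propose a random candidate,
    accept per equation 2, then cool the temperature."""
    rng = random.Random(seed)
    current = rng.choice(candidates)
    temp = t0
    for _ in range(n_iter):
        proposed = rng.choice(candidates)
        diff = utility(proposed) - utility(current)
        # Accept improvements always; worse moves with probability e^(diff/T).
        if diff >= 0 or rng.random() < math.exp(diff / temp):
            current = proposed
        temp *= cooling
    return current

# Toy utility with a single maximum at segment 70.
best = simulated_annealing(lambda s: -abs(s - 70), range(200))
print(best)  # converges to the maximum at segment 70
```

Because the temperature shrinks towards zero, late iterations only accept improvements, so the chain settles on the best candidate it encounters.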
3.2.4. From road segments to areas
Not every road segment is suitable for establishing a small food venture; there may be no space
available to open a new store. Therefore, an optimal area was created around each optimal road
segment, which increases the chance that store space is available. Besides that, entrepreneurs
are able to take other variables, like rental costs or store attractiveness, into consideration. Using
a normal (planar) buffer function was not appropriate, because the prior distance calculations
were done in a network space. Therefore, once again, service areas were generated, this time
with break values.
3.3. Case study
3.3.1. Optimal solutions
A scenario was created, named “scenario 1”. In this scenario, three optimal locations were to
be determined for new bakery ventures with the variable weights set to one (table 3.2). The
locations were found with the greedy algorithm method.
Table 3.2: Scenario 1 case study.
Scenario 1
Retailer type Bakery
Number of stores 3
Density weight 1
Demand weight 1
Attractiveness weight 1
Optimisation method Greedy algorithm
After the optimal locations were found, I wanted to assess whether the cannibalisation effect
occurred. Therefore, I visualised the optimal locations in two different maps: the original density
map and the newly calculated density map.
Every optimal road segment was part of an optimal site. The optimal site is a composite of
consecutive road segments with the highest utilities in the study area. Optimal sites arose,
because the road segments with the highest utilities were always neighbours. The size of the
optimal sites varied.
The second scenario was conducted with simulated annealing. The variable weights remained
one. Again, the case was executed with location data of supermarkets and bakeries. The number
of stores was set to five. The settings slightly varied from the settings in table 3.1. The number
of iterations was set to 50,000 and the maximum number of non-selected solutions was set to
20,000 to increase the quality of the results.
Table 3.3: Scenario 2 case study.
Scenario 2
Number of iterations 50,000
Maximum number of non-selected solutions 20,000
Number of stores 5
Cooling ratio 0.9995
Retailer type Bakery
Initial temperature 4
Bandwidth 500
Density weight 1
Demand weight 1
Attractiveness weight 1
Optimisation method Simulated annealing
3.3.2. User interface
A site selection method should enable users to adjust parameter values and data (Benoit &
Clarke, 1997). Professionals in the sector can indicate the important site criteria (Roig-Tierno
et al., 2013). Therefore, an interactive map was developed to show the results to users, who are
mainly small food company owners. The user of the interactive map indicates the food retailer
type, the number of stores, and adjusts the weights of the criteria. This section helped me think
about how to visualise the information and how to distribute the results to stakeholders.
The Python script to generate (an) optimal location(s) and the adjusted competition variable
was run from within the R code. The internal R function “system”, which invokes an OS
command, was required to start the Python script. The input command was “python <file path>”,
where the file path is the location where the optimisation Python script was saved. The system
function only worked when R and Python were added to the environment variables (the system
PATH). This was
only required for the Windows operating system. Environment variables affect the behaviour
of running processes. The variable weights and the type of food retailer were retrieved from a
text file, which was generated by the main R script (Appendix IV). The Python script required
some minor changes to cope with the GUI; complementary script was implemented to read the
values from the files. The input road data and the input midpoint data were determined by those
values. Moreover, the variable weights were retrieved from text files. Also, earlier generated
optimal solutions were deleted. The optimal points and areas were transformed to WGS84,
because this is required for the background map. The optimisation function returned two lists:
an optimal road midpoint list and the optimal area list. All files in the list were converted into
shapefiles, because shapefiles can be loaded in the R script.
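The exchange of settings between the R GUI and the Python script can be sketched as follows. This is an illustration only: the file name `params.txt` and the `key value` layout are assumptions, not the exact format used in the thesis scripts.

```python
# Sketch: read the variable weights and retailer type written by the R GUI.
# The file name and the "key value" layout are illustrative assumptions.

def read_params(path):
    params = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            key, value = line.split(maxsplit=1)
            # Weights are numeric; the retailer type stays a string.
            try:
                params[key] = float(value)
            except ValueError:
                params[key] = value
    return params

# Example with a file as the R script might have written it:
with open("params.txt", "w") as f:
    f.write("retailer_type bakery\ndensity_weight 1\ndemand_weight 0.5\n")

print(read_params("params.txt"))
```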
The R programme to visualise the optimal locations consists of three elements. One element
was used to execute the Python script. The second element retrieved the optimal solution files
created by the Python script. The last element visualised the results in an interactive map view.
Furthermore, the result raster maps were imported into R for visualisation to the user.
A combination of the Leaflet and Shiny packages was used to create the interactive map. The
Shiny package, developed by Chang et al. (2016), creates interactive web applications in R.
Shiny applications bind inputs and outputs together in a reactive way. Cheng et al. (2016) are
the developers of the Leaflet package, which uses the Leaflet JavaScript library and the
HTMLwidgets package to create interactive maps. The maps can be used directly from
RStudio or as a web service.
The user selects the type of food specialist for which optimal locations are sought: bakeries,
butchers, or greengroceries. Subsequently, the variable weights and the number of stores are
entered in the GUI. The weights can be set between zero and one, in steps of 0.1. The number
of stores must be an integer value between one and ten. An action button named generate
optimal location(s) starts the Python script that calculates the optimal solutions. Two
checkboxes can be used to turn the demand and site attractiveness raster maps, with their
corresponding legends, on and off. These are included to give the user extra information about
the study area. Generating optimal solutions takes considerable time, certainly if the number of
stores is set to more than one. To prevent multiple simultaneous calls to the Python function, or
changes to the parameters while an optimisation is running, the inputs and the generate button
are frozen during execution of the Python scripts.
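The input constraints above can be expressed as simple checks; a sketch, with illustrative function names that are not taken from the thesis scripts:

```python
# Sketch of the GUI input constraints: weights between zero and one in
# steps of 0.1, and an integer number of stores between one and ten.
# Function names are illustrative assumptions.

def valid_weight(w):
    # Valid when w lies in [0, 1] and is a multiple of 0.1.
    return 0.0 <= w <= 1.0 and abs(w * 10 - round(w * 10)) < 1e-9

def valid_store_count(n):
    # Valid when n is an integer between one and ten.
    return isinstance(n, int) and 1 <= n <= 10

print(valid_weight(0.3), valid_store_count(5))    # True True
print(valid_weight(0.35), valid_store_count(12))  # False False
```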
3.4. Sensitivity analysis
A sensitivity analysis was executed to find out whether the outputs were stable when a different
food retailer type was selected and when one weight was set to zero. Sensitivity analysis is used
to test how uncertainties in the output can be attributed to different parameter settings (Saltelli,
2002). It can also be used to assess to what extent the variables and the variable weights
influence the end result.
One-factor-at-a-time (OAT) is the most commonly used form of sensitivity analysis: one input
factor is changed while the others are kept fixed (Saltelli & Annoni, 2010). Several scenarios
were created to test the effect on the result. The variable weights and the food specialist type
were adjusted one at a time to assess the impact on the result.
For convenience, the sensitivity analyses of the weights and the food retail store type were
executed with the greedy algorithm; its computation time was lower and its settings were easier
to configure. I assessed visually whether the patterns of the new venture locations corresponded
when the parameters were changed. A fishnet, like the one already shown in figure 3.4, was
used as a template to assess in which area an optimal solution was located.
First, the sensitivity to the food retailer type was determined, to assess whether the site selection
method gave different optimal locations for bakeries, butchers, and greengroceries. For every
retailer type, five store locations were computed and all weights were set to one (table 3.4).
Table 3.4: Scenarios to assess the sensitivity of the food retailer type.
Retailer type   Number of stores   Density weight   Demand weight   Attractiveness weight   Optimisation method
Bakery          5                  1                1               1                       Greedy algorithm
Butcher         5                  1                1               1                       Greedy algorithm
Greengrocery    5                  1                1               1                       Greedy algorithm
The second test was conducted to evaluate how the variables influenced the selected locations.
The example discussed below concerns the retailer type "greengroceries"; this type was chosen
arbitrarily. In turn, the competition, demand, and attractiveness weights were set to zero to
examine whether the optimal site changed. This was assessed visually. Table 3.5 summarises
the scenarios created to assess the sensitivity of the variables.
Table 3.5: Scenarios to assess the sensitivity of the variables.
Retailer type   Number of stores   Density weight   Demand weight   Attractiveness weight   Optimisation method
Greengrocery    1                  1                1               0                       Greedy algorithm
Greengrocery    1                  1                0               1                       Greedy algorithm
Greengrocery    1                  0                1               1                       Greedy algorithm
Baker           1                  1                1               0                       Greedy algorithm
Baker           1                  1                0               1                       Greedy algorithm
Baker           1                  0                1               1                       Greedy algorithm
Butcher         1                  1                1               0                       Greedy algorithm
Butcher         1                  1                0               1                       Greedy algorithm
Butcher         1                  0                1               1                       Greedy algorithm
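The scenarios of table 3.5 follow a regular OAT pattern and could be generated programmatically. A sketch, where the dictionary layout is an assumption for illustration and the order of the zeroed weights may differ from the table:

```python
# Sketch: one-factor-at-a-time (OAT) scenarios as in table 3.5.
# All weights start at one; each scenario sets exactly one weight to zero.

BASE = {"density": 1, "demand": 1, "attractiveness": 1}

def oat_scenarios(retailer_types):
    scenarios = []
    for rtype in retailer_types:
        for factor in BASE:
            weights = dict(BASE)
            weights[factor] = 0          # zero out one factor at a time
            scenarios.append({"retailer": rtype, "stores": 1,
                              "method": "greedy", **weights})
    return scenarios

rows = oat_scenarios(["greengrocery", "baker", "butcher"])
print(len(rows))  # 9 scenarios, as in table 3.5
```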
4. Results
This chapter presents the results for each research question individually. No interpretation or
discussion is included in this chapter.
4.1. Objective function
4.1.1. Explorative analysis
Competition
Figure 4.1 shows the spatial distribution of the considered food retailers in the study area. The
data were retrieved on 25-08-2016 from the Google Places API. All four food retailer types
showed similar patterns. Most retailers were located in the three biggest cities, but clusters of
shops were also found in the smaller cities and villages. The results outside the data collection
area had not yet been removed in figure 4.1.
The retailer location data were not validated in this thesis, since Sienkiewicz (2015) already
found that this retrieval of food retailer data was imperfect: some food ventures were not found,
while some non-existent stores popped up. For the development of the site selection method
this was not an issue. However, before using the method in practice, it is inevitable to test the
retailer location data for errors.
Figure 4.1: Locations of retrieved food retailers within the study area based on Google maps.
The number of shops per food retailer type is shown in table 4.1. The data collection area
contained more supermarkets and bakeries than butchers and greengroceries.
Table 4.1: Number of food retailers in the study area.
Type of food retailer Number of shops found
Supermarkets 396
Bakeries 356
Butchers 173
Greengroceries 171
Total 1,096
During the explorative analysis it was found that the study area included too many stores to
compute the network KDE within two days. The large number of supermarkets and bakeries
meant that approximately 700 unique service areas had to be generated for the original study
area, with approximately 300 points each when the bandwidth was set to 500 metres. Therefore,
the study area was adjusted to represent only Amersfoort and surroundings. As a consequence,
the number of generated service areas was reduced to 101; this reduced the computation
time drastically. The adjusted study area is visualised in figure 4.2.
Figure 4.2: The Amersfoort study area.
Figure 4.3 shows the spatial distribution of the considered food retailers in the Amersfoort study
area. The four different food retailer types had similar location patterns; shops are spread all
over the study area, but densities are highest near the city centre.
Figure 4.3: Location of bakeries, butchers, supermarkets and greengroceries in the Amersfoort study area.
Figure 4.4 presents an example of the roads in the centre of Amersfoort where the source count
is visualised. It also shows the midpoints of every road segment. The grey lines represent roads
without nearby retailers. The slightly thicker green lines symbolise roads with one store
alongside. The even thicker lines represent roads with two, three, or more nearby stores.
Figure 4.4: A categorised road network of the city centre of Amersfoort on the number of retailers that are
closest by.
Figure 4.5 is an example of multiple overlapping service areas. The bandwidth was set to 500
metres; the polygons are not circular because a network space was used.
Figure 4.5: Service areas around source points. Notice that service areas may overlap.
The outcome of the network KDE was a quantification of the density of retail stores on the road
network. Figure 4.6 and figure 4.7 are the results of executing the network KDE for a road
network clipped to Amersfoort. The food retailer hotspots are shown as green and blue areas.
The individual road segment values are not visible in those figures. A zoom-in of the city centre,
with supermarkets and bakeries added to the map, was included in figure 4.7; there the
individual road segment values are better visible. The computation time was around half a day
for the 250 metre bandwidth and approximately two days for the 500 metre bandwidth. Figure
4.7 shows that the city centre of Amersfoort is considered a hotspot for food retailers.
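The thesis computed the network KDE with ArcPy tools. Purely to illustrate the principle of a network-constrained density, the following standalone sketch applies a triangular kernel to shortest-path (Dijkstra) distances on a toy graph; the graph data, the kernel choice, and the absence of equal-split corrections are all simplifying assumptions, not the thesis implementation.

```python
import heapq

# Toy road network: adjacency list with edge lengths in metres (invented).
GRAPH = {
    "a": [("b", 200)],
    "b": [("a", 200), ("c", 200), ("e", 300)],
    "c": [("b", 200), ("d", 200)],
    "d": [("c", 200)],
    "e": [("b", 300), ("f", 300)],
    "f": [("e", 300)],
}

def network_distances(graph, source, cutoff):
    """Dijkstra shortest-path distances over the network, limited to cutoff."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue
        for nbr, length in graph[node]:
            nd = d + length
            if nd <= cutoff and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

def network_kde(graph, stores, bandwidth=500):
    """Per-node density: triangular kernel over network distance to stores."""
    density = {n: 0.0 for n in graph}
    for store in stores:
        for node, d in network_distances(graph, store, bandwidth).items():
            density[node] += 1 - d / bandwidth  # triangular kernel
    return density

print(network_kde(GRAPH, ["a"]))
```

With a store at node "a" and a 500 metre bandwidth, density decays along the network and nodes beyond the bandwidth keep a density of zero.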
Figure 4.7: Network KDE densities of Amersfoort with a bandwidth of 500 metres.
Demand
The normalised demand computed from CBS (2014) data is presented in figure 4.8. The highest
value was 2,560 inhabitants per 500 x 500 metre cell. The city centre and some newer
residential areas are represented by clusters of dark green squares. Figure 4.9 presents the
demand in the network space.
Figure 4.8: Normalised demand in the Amersfoort study area.
Figure 4.9: Normalised demand variable on a network space
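The demand and attractiveness layers above are normalised versions of the raw counts. The exact normalisation is defined earlier in the thesis; as a generic illustration, a min-max rescaling to the range zero to one (an assumption about the rescaling used) looks like:

```python
# Generic min-max normalisation to the range [0, 1]; whether the thesis
# used exactly this rescaling is an assumption for illustration.

def normalise(values):
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant input: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# e.g. inhabitants per 500 x 500 m cell (invented numbers):
print(normalise([0, 640, 1280, 2560]))  # [0.0, 0.25, 0.5, 1.0]
```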
Attractiveness
Figure 4.10 shows the normalised number of elderly people in the Amersfoort study area.
Figure 4.11 visualises the normalised number of people with a high income. People with higher
incomes tend to live at the edge of the city. Especially the south-west of Amersfoort is the
residence of families with higher incomes. Figure 4.12 presents the city centres and the shape
of Amersfoort. The area in the bottom-right corner is Leusden and the part on the east side of
Amersfoort is Hoevelaken. Figure 4.13 is the result of equation 6; the image presents the
most attractive locations for small food retailers. The attractiveness map shows that the most
attractive locations are always in urban areas. The south of Amersfoort was slightly more
attractive than the north.
Figure 4.10: Normalised number of elderly people.
Figure 4.11: Normalised number of people with high income.
Figure 4.12: Normalised urbanity classification.
Figure 4.13: Normalised attractiveness variable.
Figure 4.14 represents the attractiveness variable in a network space. The figure shows that
existing food retailers mostly spatially coincide with areas which are assumed to be attractive
according to this thesis' approach.
Figure 4.14: Existing food retailers on attractiveness variable map in the city centre of Amersfoort.
4.1.2. Composition
Figure 4.15 presents the utility map of scenario 1 (table 3.2). A block pattern is visible, with
small differences inside the blocks. Due to the competition variable the city centre does not
show the highest utilities. The dark blue areas on the west and south sides of the city centre had
the highest utilities for this scenario.
Figure 4.15: Composite utility map of Amersfoort corresponding to scenario 1.
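The composite utility underlying figure 4.15 is a weighted sum of the three variables. The sketch below illustrates that composition; the negative sign on the competition (density) term is my assumption, reflecting that high retailer density lowers a site's utility, and the exact equation is the one given in section 3.

```python
# Sketch of a weighted-sum utility per road segment. The negative sign on
# the competition term is an assumption for illustration; the thesis gives
# the exact equation in section 3.

def utility(density, demand, attractiveness,
            w_density=1.0, w_demand=1.0, w_attract=1.0):
    return w_demand * demand + w_attract * attractiveness - w_density * density

# A competitive city-centre segment versus a quiet high-demand segment
# (all values invented, assumed normalised to [0, 1]):
print(utility(density=0.9, demand=1.0, attractiveness=1.0))
print(utility(density=0.0, demand=0.8, attractiveness=0.7))
```

Under this sign convention the second, competition-free segment scores higher, matching the observation that the city centre does not show the highest utilities.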
4.2. Optimisation method
4.2.1. Greedy algorithm
The greedy algorithm has no settings to adjust and was therefore easy to configure. Only the
scenario of the composite utility function had to be chosen. Computation was fast: finding one
optimal location took around a minute.
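The greedy procedure of section 3.2.2 can be sketched as follows; the utilities and the penalty structure are invented toy values, and the real method recomputes a full network KDE after each selection rather than applying a simple penalty table.

```python
# Illustrative greedy selection: repeatedly pick the road segment with the
# highest utility, then lower the utility of segments near the pick, which
# mimics the updated competition density. All values are toy assumptions.

def greedy_select(utilities, n_stores, penalty):
    """utilities: dict segment -> utility; penalty: dict of dicts giving the
    utility decrease each selection imposes on nearby segments."""
    chosen = []
    utilities = dict(utilities)
    for _ in range(n_stores):
        best = max(utilities, key=utilities.get)
        chosen.append(best)
        for seg, drop in penalty.get(best, {}).items():
            if seg in utilities:
                utilities[seg] -= drop
        del utilities[best]
    return chosen

utils = {"s1": 3.0, "s2": 2.9, "s3": 2.0}
near = {"s1": {"s2": 1.5}}            # s2 lies within the bandwidth of s1
print(greedy_select(utils, 2, near))  # ['s1', 's3']: s2 was penalised
```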
4.2.2. Simulated annealing
The computation time for simulated annealing with the settings of table 3.1 was approximately
7.5 hours. Figure 4.16 shows the trace of the probability that a combination of road segments
with a lower joint utility than the current combination was selected as the new current state.
The sequence number is the sequential number of every iteration in which the proposed joint
utility was lower than the current joint utility.
Figure 4.16: Probability trace of simulated annealing for five simultaneous store locations.
In the beginning the probability was between 0.7 and one. Until around sequence number 3,200
there was always a chance that a worse joint configuration of five store locations was selected.
The average probability decreased until it reached zero around sequence number 10,000. After
sequence number 10,000 only road segment compositions with higher joint utilities were
accepted.
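The acceptance behaviour traced in figure 4.16 follows from an acceptance rule of this kind. The sketch below assumes the standard Metropolis form exp(delta/T) for worse moves, which may differ in detail from the thesis implementation, combined with the case settings (initial temperature 4, cooling ratio 0.9995).

```python
import math
import random

# Sketch of a simulated annealing acceptance rule: a worse joint utility is
# accepted with probability exp(delta / T); T shrinks by the cooling ratio
# each iteration. The exact function used in the thesis is assumed to be
# this standard Metropolis form.

def accept(current_utility, proposed_utility, temperature, rng=random.random):
    delta = proposed_utility - current_utility
    if delta >= 0:          # better or equal solutions are always accepted
        return True
    return rng() < math.exp(delta / temperature)

# With T0 = 4 and cooling ratio 0.9995, the acceptance probability of a
# move that is 0.5 worse shrinks as the run progresses:
T0, RATIO = 4.0, 0.9995
for step in [0, 5000, 10000, 20000]:
    t = T0 * RATIO ** step
    print(step, math.exp(-0.5 / t))
```

This reproduces the qualitative shape of the trace: high acceptance probabilities early on, falling to effectively zero around the end of the run.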
The joint utility of the consecutive chosen joint configuration of road segments is visualised in
figure 4.17.
[Chart of figure 4.16: probability (y-axis, 0 to 1) plotted against sequence number (x-axis); title: Probability trace.]
Figure 4.17: Trace of the joint utility of the accepted road segment composition.
Until sequence number 10,000 there was a chance that a joint configuration of five store
locations with a lower joint utility was accepted as the new current state. The last found joint
utility was 13.98, slightly lower than the joint utility of the greedy algorithm (14.20). It was
expected that the simulated annealing approach would outperform the greedy algorithm if the
maximum number of iterations was increased and the other settings were optimised.
4.3. Case results
4.3.1. Optimal solutions
Figure 4.18 presents the selected locations for scenario 1 (table 3.2); the weights were set to
one, the number of stores to three, and the optimisation method to the greedy algorithm.
[Chart of figure 4.17: joint utility (y-axis, 0 to 14) plotted against iteration number (x-axis); title: Accepted road segment composition trace.]
Figure 4.18: Three optimal locations for new bakeries in Amersfoort determined with the greedy algorithm.
Figure 4.19 shows how a store location added by the greedy algorithm affects the KDE. The
right image is the density map after running the network KDE for a newly added location. It
can be observed that roads around the first selected location were classified differently in the
two images. This is also visible around the second optimal location, where neighbouring roads
were assigned to a higher category.
Figure 4.19: Competition variable map, prior (left) and after (right) adding two stores by optimisation.
In scenario one, the greedy algorithm performed better than expected, since the gaps between
the optimal sites were larger than 500 metres and the optimal sites were mostly smaller than
500 metres. If an optimal site is larger than 500 metres, it can occur that a first location is
selected in such a way that the resulting increase in the competition variable renders the whole
site no longer optimal. In such a case, it is more beneficial for the joint utility to select two road
segments on the edges of the optimal site. Figure 4.20 shows the optimal sites. The optimal site
of the third iteration was larger than 500 metres. Fortunately, optimal location "GA-3" was on
the edge of this site, so road segments were still available for "GA-4".
Figure 4.20: The road network is visualised together with the optimal sites and optimal locations for four
iterations; GA stands for greedy algorithm and the number presents the store location number.
Figure 4.21 presents the selected locations for scenario 2 (table 3.2); the weights were set to
one, the number of stores to five, and the optimisation method to simulated annealing. The joint
utility equalled 14.20. A joint configuration of five road segments generated with the greedy
algorithm gave approximately the same joint utility.
Figure 4.21: Five optimal locations for new bakeries in Amersfoort determined with simulated annealing.
4.3.2. User interface
Figure 4.22 is an image of the interactive map after pressing the generate optimal location(s)
button. The background map is from OSM. A green loading bar at the top of the screen reports
that the optimal solutions are being calculated. The input and generate buttons on the right side
of the screen are frozen. However, it was still possible to show the demand or the attractiveness
raster on the map. The legends are visible in the bottom-right corner. The application
automatically zooms to the study area; manual zooming and panning are also supported. In
figure 4.22 the demand raster layer is checked; that map is shown as the coarse raster layer.
Figure 4.23 shows a manually zoomed-in result within the user interface. The two blue markers
represent the selected locations, which were outside the city centre on the west side of
Amersfoort. To easily distinguish the optimal areas from the demand and attractiveness rasters,
the optimal areas were colourised with a blue palette.
Figure 4.22: GUI showing a map of Amersfoort with an information bar, a demand raster, demand legend, disabled parameter inputs, and a disabled execute button.
Figure 4.23: GUI showing a map of Amersfoort with two optimal locations, two optimal areas, optimal area legend, enabled parameter inputs, and an enabled execute
button.
4.4. Sensitivity analysis
Figure 4.24 presents a part of the fishnet which was utilised to find the patterns of the optimal
locations. The rows were indicated by numbers and the columns by letters. The map was
zoomed in to be able to distinguish the optimal locations. Figure 4.24 visualises the results of
the five-store scenarios in table 3.4. Selected locations 1 and 2 were the same for all three food
retailer types.
Figure 4.24: Optimal locations if the number of stores is set to five (greedy algorithm).
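The fishnet labels such as "11E" combine a row number with a column letter. A sketch of such a lookup, where the origin, the 500 metre cell size, and the counting direction are assumptions for illustration:

```python
import string

# Sketch: map a point to a fishnet cell label such as "11E" (row number
# plus column letter). Origin, cell size, and label order are assumptions.

def fishnet_cell(x, y, x0, y0, cell=500):
    col = int((x - x0) // cell)          # column index -> letter A, B, C...
    row = int((y - y0) // cell) + 1      # row index -> 1-based number
    return f"{row}{string.ascii_uppercase[col]}"

print(fishnet_cell(2300, 5200, 0, 0))   # "11E"
```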
The demand and attractiveness variables were equal for all three retailer types; only the
competition variable can cause a difference in the patterns between the retailer types. It is
noticeable that the first two optimal solutions were the same for all types ("11E" and "10E").
The third optimal location was different for the butcher ("13G"), because an existing butcher
lies to the south of the third optimal location of the bakery and greengrocery ("12G"). The
fourth locations of the greengrocery and bakery are at neighbouring road segments in area
"13G". The fourth location of the butcher was in an entirely different area ("11H"), because the
third selected location for the butcher was already found in area "13G". The fifth location of
the bakery is located between optimal locations 1 and 2 ("11E"), on the edge of the service area
of optimal location 1. The fifth locations of the butcher and greengrocery are located at "11H".
The second test examined the impact of neglecting one variable (table 3.5). Figure 4.25 shows
the result of this analysis; not the whole study area was visualised, as the symbols would
otherwise be hard to distinguish. Some symbols were enlarged to show that they are at the same
location as another symbol.
Figure 4.25: Visualisation of results when neglecting one variable; GA stands for greedy algorithm, com0 means
the competition weight was set to zero, dem0 means the demand weight was set to zero, and att0 means the
attractiveness weight was set to zero.
When the competition variable was neglected, the selected locations did not differ much from
the selected locations of scenario 1 (section 4.3). The competition variable did not change the
500 by 500 metre square in which the location was found in the study area. In contrast, the
value of the competition variable did cause considerable changes in the selected location of
stores within the 500 by 500 metre squares. When the competition variable was neglected, the
500 by 500 metre squares had an identical utility, which made the optimal location within a
square effectively a random choice. The competition variable had large areas with zero density
and more variability in general than the two other variables. This affected the results, because
the variables contributed differently to the utility value. The competition values were more
variable because a density value was calculated for every individual road segment, whereas the
demand and attractiveness values were the same for a whole region.
Area 4E had the best spatial dispersion of retailers and consumers in Amersfoort; however, the
area was not as attractive as the surroundings of the city centre (figure 4.14), because the
population in area 4E had lower percentages of elderly and high-income inhabitants.
5. Discussion
This chapter discusses the individual results of the research questions. Links are made between
the findings of this thesis and the findings of other studies. Shortcomings and other special
circumstances are described in the sections below.
5.1. Objective function
The demand (figure 4.9), attractiveness (figure 4.14), and competition (figure 4.6 and figure
4.7) variables were outdated and future demographic predictions were not included. This
applies to both the socio-demographic data and the retailer location data, which makes this
method a short-run approach. The site selection method does not anticipate a possible response
by other retailers in the study area; competitors could react to, or anticipate, the site selection
of the user of the method. The method assumed that the locations of competitors were fixed.
Almost 1,100 retailers were found (table 4.1) for the initially proposed study area (figure 3.2).
The computation time was approximately one day when the study area was set to city scale
(figure 4.2) and the bandwidth was set to 250 metres (figure 4.6). Setting the bandwidth to 500
metres enlarged the computation time drastically: the computation time was around three days
with otherwise equal settings (figure 4.7). This means that the computation time for the network
KDE increased sharply with the bandwidth. Areas with values of zero predominated the study
area when a bandwidth of 250 metres was selected; most areas then had zero competition and
little variation was found for the competition variable. The calculation of a feasible density map
of one city already took three days; as a result, the network KDE was not suitable for a regional
study area.
The choice of the bandwidth (section 2.3) affected the competition variable (figure 4.6 and
figure 4.7). Rui et al. (2016) also selected a bandwidth of 500 metres for the calculation of the
network KDE at city scale, because otherwise their results turned out to be spiky. There are
two main differences between my network KDE method and theirs. The first is that I chose not
to divide the road segments into equal sizes, due to the extra computation time and the
frequency of road intersections in the Amersfoort study area. As a result, fewer segment
midpoints were within the bandwidth of a given point. Longer road segments had a smaller
chance to be within the bandwidth of a store, because their midpoints were further away.
Outskirts generally have fewer road intersections than city centres, so this effect occurred
mainly in the outskirts.
Another difference is that stores and roads were unweighted in my method, which also affected
the results of the competition variable. Weighting stores would have been useful, because some
stores attract more customers than other stores. The retail location data (figure 4.1 and figure
4.3) did not contain information on the number of customers a store attracts. However, the
location data were easily obtained and updated.
Rui et al. (2016) also used weights for roads, arbitrarily assigning higher values to bigger roads.
However, it is not always the case that bigger roads are faster, certainly not in a city centre.
Moreover, the mode of transport differs among consumers. Therefore, my proposed method
did not include weighted roads.
5.2. Optimisation method
As mentioned in section 4.2, the greedy algorithm only finds local optima, whereas the
simulated annealing approach can find the joint optimal solution with the right settings and
number of iterations. A good configuration of the settings is essential for implementing
simulated annealing. The initial temperature (T0), for example, is important for the acceptance
probability function (Ben-Ameur, 2004). Kirkpatrick et al. (1983) suggest that the initial
temperature should be equal to the maximum difference in utility. Improved and newer
functions that find better initial temperatures are also available (Ben-Ameur, 2004); these were
not implemented in this thesis.
20,000 iterations were executed for the simulated annealing (figure 4.18). However, the study
area consists of 24,000 road segments, so the number of iterations was fairly small. The
computation time was already 7.5 hours for 20,000 iterations; for that reason, the simulated
annealing optimisation was only executed a few times. During those executions the initial
temperature and the cooling ratio were adjusted.
Simulated annealing (section 3.2.3) and the greedy algorithm (section 3.2.2) prevent
cannibalisation in different ways. The greedy algorithm calculates a new density map; a newly
selected store location adds four times as much density to the competition variable as the
existing stores. It is not excluded that a store location is selected within 500 metres of another
selected store location. In simulated annealing this is not possible: store locations cannot be
selected within 500 metres of another store location. This difference gives the greedy algorithm
an advantage in joint utility.
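The minimum-distance constraint used in simulated annealing can be sketched as a feasibility check on a proposed configuration. Euclidean distance is used here for brevity, whereas the thesis works with network distances:

```python
import math
import itertools

# Sketch: reject a proposed configuration when any two selected locations
# lie within the bandwidth (500 m) of each other. Euclidean distance is a
# simplifying assumption; the thesis uses network distances.

def feasible(locations, min_distance=500):
    """locations: list of (x, y) tuples for the selected road midpoints."""
    for (x1, y1), (x2, y2) in itertools.combinations(locations, 2):
        if math.hypot(x1 - x2, y1 - y2) < min_distance:
            return False
    return True

print(feasible([(0, 0), (600, 0), (0, 800)]))  # True: all pairs >= 500 m
print(feasible([(0, 0), (300, 300)]))          # False: pair about 424 m apart
```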
5.3. Case
5.3.1. Optimal solutions
Figure 4.21 showed the selected locations for the case; the bandwidth for the competition
variable was set to 500 metres. Some locations are located close to each other. Literature states
that most people shop within 500 metres of their residence (Veenstra et al., 2010). However,
this does not imply that shops just outside 500 metres of another shop cannot be considered
competition. Unfortunately, a limit had to be set to avoid excessive computation times. An
equivalent bandwidth value was utilised to simultaneously change the density map for every
found optimal location in the greedy algorithm (section 3.2.2). Therefore, the distance between
two selected locations can be short.
5.3.2. User interface
The site selection user interface (figure 4.22 and figure 4.23) utilised three different
components: Python, R and ArcGIS. It was only possible to run the application on a computer
with Python, R and an ArcGIS license with an additional network license installed.
The package Rpython was developed to execute Python functions in R. However, I used the
system function (section 3.3.2) because the Rpython package has some issues on computers
running Windows. There is an Rpython package especially built for Windows systems, but it
is mostly experimental and not widely tested. Another option could be to run the R code within
Python with the package "Rpy2".
Most site selection studies only developed a method and did not consider a way to show the
results to stakeholders or to handle user input. An exception is Cui et al. (2012); they combined
the ArcGIS Engine developer kit with Flex and Java. ArcGIS Engine was used for the basic
GIS functions (zooming and geographical querying), Flex is a mark-up language, and the
programming language Java is capable of handling the different inputs and calculating outputs.
I chose the Leaflet-Shiny combination because of prior knowledge of these packages and the
ease of implementation.
5.4. Sensitivity analysis
The sensitivity analyses (figure 4.24 and figure 4.25) were only conducted with the greedy
algorithm. Selecting locations with simulated annealing could have changed the sensitivity
analysis results. Also, the chosen kernel for the network KDE was not varied, which could also
affect the sensitivity analysis results. Moreover, the bandwidth for the competition variable was
kept constant, since setting the bandwidth higher than 500 metres meant days of extra
computation time and setting it lower than 500 metres made the competition variable infeasible
(section 4.1.1).
6. Conclusions, limitations and recommendations
This chapter presents the conclusions on the objective and the research questions. Also, the
limitations are listed. Furthermore, recommendations for future research are given.
6.1. Conclusions
1. Which objective function is usable for site selection for food retail stores? An objective
function for store site selection should accommodate competition, demand, and
attractiveness. In my research a weighted sum function was selected, so that the variable
weights can be chosen by a small food retailer; the true weights are hard to retrieve, and
no studies have been conducted to find them. An explorative analysis showed that the
required inputs were attainable.
2. Which optimisation method can be implemented to find optimal locations? Multiple
heuristic methods can be implemented, but only the greedy algorithm and simulated
annealing were examined in this thesis. The greedy algorithm gave the optimal solution
for a single store. However, if multiple store locations are to be found, the greedy
algorithm is unlikely to produce an overall optimal solution. Simulated annealing can
find the joint optimal solution for multiple stores, provided the settings are well
configured and a sufficient number of iterations is used.
3. Where in a case study area are opportunities to start a new small multi-location food
venture? This depended on the input of the user. Yet, in most cases the area just outside
the west and south of the city centre of Amersfoort was selected. This was caused by
the fact that those areas were attractive locations with high demand and sometimes zero
or close-to-zero competition. Those areas were attractive because they were classified
as urban areas with high demand from relatively wealthy inhabitants.
4. How sensitive are the outcomes of the site selection to different parameter settings? I
found that changing the food retailer type did not change the optimal locations much;
they remained in the same region, mainly because the competition factor was the only
parameter with different values among the food retailer types, and it did not influence
the optimal region as much as the demand and attractiveness factors.
6.2. Limitations
If more time and money had been available, experts could also have been involved in deciding
on the variables for the site selection. Now, only market research and literature were used to
determine the variables.
Recent studies in microeconomics state that top-performing entrepreneurs are mostly
responsible for economic value creation (Van Praag & Van Stel, 2013). Other entrepreneurs
would stimulate the economy more as employees than as venture owners (Van Praag & Van
Stel, 2013). Therefore, one can question whether a site selection method for small food
entrepreneurs benefits the national economy: small food retailers lose the competition against
the supermarkets, meaning that they are not the top-performing business owners.
A compromise among quality, study area size, and computation time was inevitable for the
competition variable, and many variables were not, and could not have been, taken into
account. Perhaps the most important missing variables were the rental and other location costs.
Other variables which were not considered were store space, building attractiveness, other
stores that attract consumers, parking space, and local laws and regulations, simply because
the data could not easily be retrieved for this study area and because of their minor and
unknown effect on the objective function.
The method assumed that most food shopping is done close to the residence of consumers.
Out-of-work shopping and combined shopping are concepts which contradict this assumption.
Every day, people travel between home and work and often combine this commute with a shop
visit; this is out-of-work shopping. Consumers also combine shopping trips: when they need
non-food products, they buy their groceries at the food retailer close to the particular non-food
shop.
Executing the site selection method requires ArcGIS with a network licence. The method
would benefit, in computation time and usability, if the script used in this thesis relied only on
open-source functions. I tried to develop an equivalent script to produce the network KDE
with the GDAL/OGR package in combination with the NetworkX package, because those
packages are open source, but they lacked the standard network functions that ArcPy offers
and that were convenient for calculating the network KDE. Okabe (2009) developed, and still
updates, a set of ArcGIS tools that includes a network kernel density technique. The whole
package is named SANET, and its user reference was written by Shiode, Okunuki and Okabe
(2006). I created my own script (Appendix I) because the SANET functions are only available
for academic use and only for ArcGIS Desktop versions 9-10.2. Using the network KDE tool
by Okabe (2009) would imply that my site selection method could not be applied in the retail
industry.
6.3. Recommendations
Firstly, future research should apply the method in the real world. A small food retailer should
implement the method in his or her site selection procedure; an evaluation with this
entrepreneur would help to find out whether the variables and objective function are adequate.
The success rates of companies that used the method can then be compared with those of
companies that did not, to get a realistic view of its effectiveness. However, it remains hard to
test the feasibility of the method, as many other variables also influence the performance of a
retailer. A large sample of small food retailers is recommended to keep the results unbiased.
Secondly, market research could be conducted to find default weights for various business
types. A questionnaire could be distributed among small food retail owners and the results
quantified as weights. However, this is a time-consuming approach, and it could be hard to
find retail owners willing to provide this information, as they do not want to help their own
competitors.
Thirdly, demand was set equal to population, but future research could introduce an approach
more in line with the concepts of combined shopping and out-of-work shopping. Consumption
maps visualise consumer activity flows; they are based on the assumption that consumers are
more likely to do their shopping in certain areas than in others.
Fourthly, it is recommended to add future predictions. Entrepreneurs do not want to start
ventures in areas with a declining population; conversely, people in newly planned residential
areas can be an interesting new target audience. Future changes in demand are therefore
valuable to include in a site selection method.
Fifthly, it is recommended to test the site selection method in other study areas. The
explorative analysis, the data retrieval, and the implementation of the method were all
executed in the same area. The method proposed in this thesis used variables for which the
data can be retrieved nation-wide in the Netherlands.
Sixthly, it is recommended to try other optimisation methods. In this thesis, only the greedy
algorithm and simulated annealing were considered, but there are plenty of other feasible
optimisation methods, such as a genetic algorithm. Neural networks are another example that
could be implemented and tested in the site selection method.
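To illustrate this recommendation, a minimal genetic algorithm for simultaneously selecting a fixed number of road segments could look as follows. This is only a sketch under assumed inputs: the `utilities` list is a hypothetical stand-in for the objective-function scores of the road segments (here filled with random numbers); a real implementation would evaluate the weighted demand, competition, and attractiveness factors instead.

```python
import random

random.seed(0)

# Hypothetical utility score per road segment (stand-in for the objective function)
utilities = [random.random() for _ in range(200)]
K = 3            # number of store locations to select simultaneously
POP = 30         # population size
GENERATIONS = 50

def fitness(sol):
    """Total utility of a candidate set of segment indices."""
    return sum(utilities[i] for i in sol)

def random_solution():
    return frozenset(random.sample(range(len(utilities)), K))

def crossover(a, b):
    """Child draws K distinct segments from the union of both parents."""
    return frozenset(random.sample(sorted(a | b), K))

def mutate(sol, rate=0.2):
    """Occasionally swap one segment for a random other segment."""
    sol = set(sol)
    if random.random() < rate:
        sol.remove(random.choice(sorted(sol)))
        sol.add(random.randrange(len(utilities)))
        while len(sol) < K:  # re-draw if the new segment was a duplicate
            sol.add(random.randrange(len(utilities)))
    return frozenset(sol)

population = [random_solution() for _ in range(POP)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]            # truncation selection
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(sorted(best), round(fitness(best), 3))
```

Because the top half of the population is carried over each generation, the best solution found so far is never lost, which mirrors the elitism that made the greedy algorithm and simulated annealing comparable in this thesis.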
Seventhly and finally, future research should build or find a network function equivalent to
the network functions from ArcPy, which would allow the method to work with open-source
functions only. pgRouting is an extension of PostGIS that provides routing functions to a
geospatial database; Graser (2011) created a service area with pgRouting, and her approach
could be explored to examine whether it can replace ArcPy's Generate Service Areas function.
NetworkX is an extensive Python package with many network functions, but nothing as
convenient as the Generate Service Areas function in ArcPy was found in that package.
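As a rough starting point for this recommendation, the part of a service area query that this thesis actually needs, namely which network locations lie within the bandwidth and at what network distance, can be approximated with NetworkX's Dijkstra routine. The toy graph below is hypothetical; a real application would load the NWB or OSM road network with edge lengths in metres.

```python
import networkx as nx

# Hypothetical toy road network: nodes are junctions, weights are lengths in metres
G = nx.Graph()
G.add_weighted_edges_from([
    ("a", "b", 200), ("b", "c", 250), ("c", "d", 400),
    ("b", "e", 250), ("e", "f", 150), ("a", "f", 900),
])

def service_area(graph, source, bandwidth):
    """Nodes reachable from `source` within `bandwidth` metres of network
    distance, with their distances -- an open-source stand-in for the part
    of ArcPy's GenerateServiceAreas used in this thesis."""
    return nx.single_source_dijkstra_path_length(
        graph, source, cutoff=bandwidth, weight="weight")

area = service_area(G, "a", 500)
print(sorted(area.items()))  # [('a', 0), ('b', 200), ('c', 450), ('e', 450)]
```

Unlike ArcPy's tool, this returns reachable nodes rather than service area polygons, so the subsequent clipping step would also have to be reformulated in terms of node or edge sets.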
7. References
Aboolian, R., Berman, O. & Krass, D. (2007). Competitive facility location model with
concave demand. European Journal of Operational Research, 181(2), 598-619.
Baray, J. & Cliquet, G. (2007). Delineating store trade areas through morphological analysis.
European Journal of Operational Research, 182(2), 886-898.
Ben-Ameur, W. (2004). Computing the initial temperature of simulated annealing.
Computational Optimization and Applications, 29(3), 369-385.
Benoit, D. & Clarke, G. P. (1997). Assessing GIS for retail location planning. Journal of
retailing and consumer services, 4(4), 239-258.
Blythman, J. (2012). Shopped: The shocking power of British supermarkets. Harper Collins
UK.
Brunsdon, C., Fotheringham, A. S. & Charlton, M.E. (1996). Geographically weighted
regression: a method for exploring spatial nonstationarity. Geographical analysis,
28(4), 281-298.
Central Bureau for Statistics (2014, October). Kaart met statistieken per vierkant van 500 bij
500 meter [Data file]. Retrieved from https://www.cbs.nl/nl-nl/dossier/nederland-
regionaal/geografische%20data/kaart-met-statistieken-per-vierkant-van-500-bij-500-
meter
Central Bureau for Statistics (2015, June). Aantal viswinkels in 7 jaar met kwart toegenomen.
Retrieved from https://www.cbs.nl/nl-nl/nieuws/2015/25/aantal-viswinkels-in-7-jaar-
met-kwart-toegenomen
Černý, V. (1985). Thermodynamical approach to the traveling salesman problem: An efficient
simulation algorithm. Journal of optimization theory and applications, 45(1), 41-51.
Chang, W., Cheng, J., Allaire, J. J., Xie, Y. & McPherson, J. (2016). Shiny: Web Application
Framework for R. R package version 0.14.2.
Cheng, J., Xie, Y., Wickham, H. & Agafonkin V. (2016). Create Interactive Web Maps with
the JavaScript ‘Leaflet’ Library. R package version 1.0.1.
Church, R. L. (2002). Geographical information systems and location science. Computers &
Operations Research, 29(6), 541-562.
Cui, C., Wang, J., Pu, Y., Ma, J. & Chen, G. (2012). GIS-based method of delimitating trade
area for retail chains. International Journal of Geographical Information Science,
26(10), 1863-1879.
Davis, P. (2006). Spatial competition in retail markets: movie theaters. The RAND Journal of
Economics, 37(4), 964-982.
Goodchild, M. F. (1984). ILACS: A Location Allocation Model for Retail Site Selection.
Journal of Retailing, 60, 84-100.
Google. (2017). Google Places API. Retrieved from: https://developers.google.com/places/
Graser, A. (2011). Catchment areas with pgRouting driving distance. Retrieved from
https://anitagraser.com/2011/05/13/catchment-areas-with-pgrouting-driving_distance/
Holla, J. (2013). GfK Supermarktkengetallen. Retrieved from
http://megaslides.com/doc/39860/gfk-supermarktkengetallen-februari-2013
Huff, D. L. (1964). Defining and estimating a trading area. The Journal of Marketing, 34-38.
ING (2014, April). Bakker, slager en groenteboer op hun retour. Retrieved from:
https://www.ing.nl/nieuws/nieuws_en_persberichten/2014/04/bakker-slager-en-
groenteboer-op-hun-retour.html
Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing.
Science, 220(4598), 671-680.
Neun, M., Burghardt, D. & Weibel, R. (2009). Automated processing for map generalization
using web services. GeoInformatica, 13(4), 425-452.
Okabe, A. (2009). About SANET. Retrieved from: http://sanet.csis.u-tokyo.ac.jp/
OSM. (2016). NWB. Retrieved from: http://wiki.openstreetmap.org/wiki/NWB
Pettinger, C., Holdsworth, M. & Gerber, M. (2008). 'All under one roof?' Differences in food
availability and shopping patterns in Southern France and Central England. The
European Journal of Public Health, 18(2), 109-114.
Raatgever, A. (2014). Winkelgebied van de toekomst: Bouwstenen voor publiek-private
samenwerking, 41. Retrieved from http://www.detailhandel.nl/images/pdf/
Winkelgebied_vd_toekomst_volledig_lowres_website_2.pdf
Rabobank (2007, October). Rabobank Cijfers & Trends: Visie op negen sectoren in het
Nederlandse bedrijfsleven. Retrieved from
https://www.rabobank.nl/images/ct_kwartaalbericht_okt07_2966688.pdf
Raven, H., Lang, T., & Dumonteil, C. (1995). Off our trolleys: food retailing and the
hypermarket economy. Institute for Public Policy Research.
Roig-Tierno, N., Baviera-Puig, A., Buitrago-Vera, J., & Mas-Verdu, F. (2013). The retail site
location decision process using GIS and the analytical hierarchy process. Applied
Geography, 40, 191-198.
Rui, Y., Yang, Z., Qian, T., Khalid, S., Xia, N. & Wang, J. (2016). Network-constrained and
category-based point pattern analysis for Suguo retail stores in Nanjing, China.
International Journal of Geographical Information Science, 30(2), 186-199.
Sadler, R. C. (2016). Integrating expert knowledge in a GIS to optimize siting decisions for
small scale healthy food retail interventions. International Journal of Health
Geographics, 15(1), 1.
Saltelli, A. (2002). Sensitivity analysis for importance assessment. Risk Analysis, 22(3),
579-590.
Sarkar, A. (2007). GIS applications in logistics: a literature review. School of Business,
University of Redlands, 1200.
Schabenberger, O. & Gotway, C. A. (2005). Statistical Methods for Spatial Data Analysis.
Chapman & Hall/CRC, Boca Raton, Florida.
Scott, D. M. & He, S. Y. (2012). Modeling constrained destination choice for shopping: a
GIS-based, time-geographic approach. Journal of Transport Geography, 23, 60-71.
Shiode, S., Okunuki, K. I. & Okabe, A. (2006). User Reference for SANET: A Toolbox for
Spatial Analysis on a Network.
Sienkiewicz, B. (2015) The impact of local food environments on diet: do neighbouring food
retailers influence what you eat? (MSc Thesis). Wageningen University.
Steiniger, S. & Weibel, R. (2005). “A Conceptual Framework for Automated Generalization
and Its Application to Geologic and Soil Maps.” In Proceedings of XXII Int.
Cartographic Conference, 11–16.
Suárez-Vega, R., Gutiérrez-Acuña, J. L. & Rodríguez-Díaz, M. (2015). Locating a
supermarket using a locally calibrated Huff model. International Journal of
Geographical Information Science, 29(2), 217-233.
Suárez-Vega, R., Santos-Peñate, D. R. & Dorta-González, P. (2012). Location models and
GIS tools for retail site location. Applied Geography, 35(1), 12-22.
Suárez-Vega, R. & Santos-Peñate, D. R. (2014). The use of GIS tools to support decision
making in the expansion of chain stores. International Journal of Geographical
Information Science, 28(3), 553-569.
Tong, D. & Murray, A. T. (2012). Spatial optimization in geography. Annals of the
Association of American Geographers, 102(6), 1290-1309.
Turk, T., Kitapci, O., & Dortyol, I. T. (2014). The Usage of Geographical Information
Systems (GIS) in the Marketing Decision Making Process: A Case Study for
Determining Supermarket Locations. Procedia-Social and Behavioral Sciences, 148,
227-235.
Van Praag, M. & van Stel, A. (2013). The more business owners, the merrier? The role of
tertiary education. Small Business Economics, 41(2), 335-357.
Veenstra, S. A., Thomas, T. & Tutert, S. I. A. (2010). Trip distribution for limited
destinations: a case study for grocery shopping trips in the Netherlands.
Transportation, 37(4), 663-676.
Wood, S. & Reynolds, J. (2012). Leveraging locational insights within retail store
development? Assessing the use of location planners’ knowledge in retail marketing.
Geoforum, 43(6), 1076-1087.
Xie, Z. & Yan, J. (2008). Kernel density estimation of traffic accidents in a network space.
Computers, Environment and Urban Systems, 32(5), 396-406.
8. Appendix I: Network KDE Python code
from __future__ import division  # Python 2.7
# Import system modules
import arcpy
import math
# Set the workspace
arcpy.env.workspace = "C:/Users/Bob/Desktop/ArcGIS/A_Preprocessing.gdb"
# Data can be overwritten
arcpy.env.overwriteOutput = True
# Check out the network license
arcpy.CheckOutExtension("Network")
# Set input/output
inPointData = "lxcenter_bak_amers"  # Bakery example
outSourcePointData = "lxcenter_source_bak"
inNetworkDataset = "Road_Network_bak/ND_bak"
outPointData = "lxcenter_bak_amers_KDE_500_gaus"

## Functions
# Gaussian kernel: (1 / sqrt(2*pi)) * exp(-d^2 / (2*h^2))
def gaussian(distdecayeffect, bandwidth, source):
    gausdensity = source * (1 / bandwidth) * (1 / math.sqrt(2 * math.pi)) * \
        math.exp(-(distdecayeffect ** 2) / (2 * (bandwidth ** 2)))
    return gausdensity

# Quartic kernel: (3 / pi) * (1 - d^2 / h^2)
def quartic(distdecayeffect, bandwidth, source):
    quarticdensity = source * (1 / bandwidth) * (3 / math.pi) * \
        (1 - (distdecayeffect ** 2) / (bandwidth ** 2))
    return quarticdensity

# Minimum variance kernel: (3 / 8) * (3 - 5 * d^2 / h^2)
def minvar(distdecayeffect, bandwidth, source):
    minvardensity = source * (1 / bandwidth) * (3 / 8) * \
        (3 - 5 * (distdecayeffect ** 2) / (bandwidth ** 2))
    return minvardensity

# Set parameters
bandwidth = 500
kernel = "gaussian"

# Make a layer from the feature class
arcpy.MakeFeatureLayer_management(inPointData, "lyr")
# Within selected features, select only those which are source points
arcpy.SelectLayerByAttribute_management("lyr", "NEW_SELECTION",
                                        '"Source" >= 1')
# Write the selected features to a new feature class
arcpy.CopyFeatures_management("lyr", outSourcePointData)
# Create facilities
facilities = arcpy.FeatureSet()
facilities.load(outSourcePointData)
# Create facilities layer
arcpy.MakeFeatureLayer_management(facilities, "facilities_lyr")
arcpy.CopyFeatures_management("facilities_lyr", "facilities_points")
# Generate service areas
serviceareas = arcpy.na.GenerateServiceAreas(facilities, bandwidth, "Meters",
                                             inNetworkDataset,
                                             "ServiceAreasSource")
# Create a layer out of the source service areas
arcpy.CopyFeatures_management("ServiceAreasSource",
                              "in_memory\ServiceAreasSourceFeatureClass")
arcpy.MakeFeatureLayer_management("in_memory\ServiceAreasSourceFeatureClass",
                                  "ServiceAreasSource_lyr")
# Create a layer out of the lxcentres
arcpy.CopyFeatures_management(inPointData, "in_memory\lxcenter_amers_fc")
arcpy.MakeFeatureLayer_management("in_memory\lxcenter_amers_fc",
                                  "lxcenter_amers_fc_lyr")
# Create a loop through the service areas
polygons = arcpy.da.SearchCursor("ServiceAreasSource_lyr", ["OBJECTID"])
polygonnr = arcpy.GetCount_management("ServiceAreasSource")
# Find the field with name OBJECTID
field = arcpy.AddFieldDelimiters("ServiceAreasSource_lyr", "OBJECTID")
fac_field = arcpy.AddFieldDelimiters("facilities_points", "OBJECTID")

for polygon in polygons:
    # Create a single-polygon layer
    objectID = polygon[0]
    print("polygon nr: {0} of {1}".format(objectID, polygonnr))
    selection = "{field} = {objectID}".format(field=field, objectID=objectID)
    arcpy.Select_analysis("ServiceAreasSource_lyr",
                          "in_memory\ServiceAreas_feat", selection)
    # Loop through the single service area polygon
    rows = arcpy.SearchCursor("in_memory\ServiceAreas_feat")
    for row in rows:
        facID = row.getValue("FacilityID")
        fac_selection = "{field} = {facID}".format(field=fac_field,
                                                   facID=facID)
        # Find the source point
        arcpy.Select_analysis("facilities_points", "facilities_point_cur",
                              fac_selection)
        rows2 = arcpy.SearchCursor("facilities_point_cur")
        # Get the source value
        for row2 in rows2:
            source = row2.getValue("Source")
    # Clip the lxcenters on the polygon
    arcpy.Clip_analysis(inPointData, "in_memory\ServiceAreas_feat",
                        "in_memory\Lxcenter_inPolygon")
    # Loop through all lxcenters in the service area polygon
    points = arcpy.da.SearchCursor("in_memory\Lxcenter_inPolygon",
                                   ["OBJECTID"])
    for point in points:
        # Set the density of the point to zero
        density = 0
        pointID = point[0]
        print("point: {}".format(pointID))
        # Create a single-point layer
        pointfield = arcpy.AddFieldDelimiters("in_memory\Lxcenter_inPolygon",
                                              "OBJECTID")
        point_selection = "{pointfield} = {pointID}".format(
            pointfield=pointfield, pointID=pointID)
        arcpy.Select_analysis("in_memory\Lxcenter_inPolygon",
                              "in_memory\Lxcenter_inPolygon_point",
                              point_selection)
        # Find the distance to the source lxcenter
        facilities = arcpy.FeatureSet()
        facilities.load("in_memory\Lxcenter_inPolygon_point")
        incidents = arcpy.FeatureSet()
        incidents.load("facilities_point_cur")
        # Try whether a route can be created
        try:
            # Find the distance from point to point
            arcpy.FindClosestFacilities_na(incidents, facilities, "Meters",
                                           inNetworkDataset, "in_memory",
                                           "Distance_route",
                                           "Output_directions",
                                           "Output_Closest_Facilities")
            rows3 = arcpy.SearchCursor("in_memory\Distance_route")
            for row3 in rows3:
                # Set the distance decay effect
                distdecayeffect = row3.getValue("Total_Meters")
                # Calculate the KDE for the lxcenter
                if kernel == "gaussian":
                    density = gaussian(distdecayeffect, bandwidth, source)
                elif kernel == "quartic":
                    density = quartic(distdecayeffect, bandwidth, source)
                elif kernel == "minvar":
                    density = minvar(distdecayeffect, bandwidth, source)
        # Not every route can be generated
        except Exception:
            density = 0
            print("No KDE calculated for this point")
        # Get the geometry of the point
        with arcpy.da.SearchCursor("in_memory\Lxcenter_inPolygon_point",
                                   ["SHAPE@XY"]) as cur:
            for row4 in cur:
                geom_point_2 = row4[0]
        # Find the point with equal geometry and add the lxcenter KDE
        with arcpy.da.UpdateCursor("lxcenter_amers_fc_lyr",
                                   ["SHAPE@XY", "KDE"]) as cur2:
            for row5 in cur2:
                geom_point = row5[0]
                # If the geometry is the same, add the lxcenter KDE
                if geom_point == geom_point_2:
                    row5[1] = row5[1] + density
                    cur2.updateRow(row5)
                    break

# Copy the results of lxcenter_amers_fc_lyr to outPointData
arcpy.CopyFeatures_management("lxcenter_amers_fc_lyr", outPointData)
9. Appendix II: JSON-to-CSV R code
# Set working directory
setwd("C:\\Users\\Bob\\Desktop")
# Load libraries
library(RCurl)
library(RJSONIO)
# Initialise data frames
df1 <- data.frame(lat=double(), lon=double())
df2 <- data.frame(lat=double(), lon=double())
df3 <- data.frame(lat=double(), lon=double())
df4 <- data.frame(lat=double(), lon=double())
dflist <- list(df1, df2, df3, df4)
# Google Places API data load
latlondata <- read.csv("lat_lon_points.csv")
for(i in latlondata$OBJECTID){
  datalat <- latlondata$Latitude[i]
  datalon <- latlondata$Longitude[i]
  latlon <- paste(datalat, datalon, sep = ",")
  # Set the data query URLs
  url1 <- sprintf(paste0('https://maps.googleapis.com/maps/api/place/radarsearch',
                         '/json?location=%s&radius=4000&keyword=supermarkt',
                         '&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4'), latlon)
  url2 <- sprintf(paste0('https://maps.googleapis.com/maps/api/place/radarsearch',
                         '/json?location=%s&radius=4000&keyword=bakker&types=bakery',
                         '&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4'), latlon)
  url3 <- sprintf(paste0('https://maps.googleapis.com/maps/api/place/radarsearch',
                         '/json?location=%s&radius=4000&keyword=slager',
                         '&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4'), latlon)
  url4 <- sprintf(paste0('https://maps.googleapis.com/maps/api/place/radarsearch',
                         '/json?location=%s&radius=4000&keyword=groentewinkel',
                         '&key=AIzaSyBwZvcMFCkzj3TZGwHP-MhbWH1QI7dwVA4'), latlon)
  # List the data query URLs
  urllist <- list(url1, url2, url3, url4)
  # The inner loop gets its own variable so it does not shadow i
  for(url in urllist){
    web <- getURL(url)
    raw <- fromJSON(web)
    if (length(raw$results) != 0){
      # Find the latitude values
      lat <- NULL
      for (j in seq_along(raw$results)){
        lat <- c(lat, raw$results[[j]]$geometry$location[[1]])
      }
      # Find the longitude values
      lon <- NULL
      for (k in seq_along(raw$results)){
        lon <- c(lon, raw$results[[k]]$geometry$location[[2]])
      }
      # Combine the separate coordinate vectors into a matrix
      latlonretailer <- list(lat, lon)
      latlonmatrix <- do.call(cbind, latlonretailer)
      # Append the matrix to the data frame of the matching retailer type
      listnr <- match(url, urllist)
      if(listnr == 1){
        df1 <- rbind(df1, latlonmatrix)
      } else if(listnr == 2){
        df2 <- rbind(df2, latlonmatrix)
      } else if(listnr == 3){
        df3 <- rbind(df3, latlonmatrix)
      } else if(listnr == 4){
        df4 <- rbind(df4, latlonmatrix)
      }
    }
  }
}
# Write to CSV
write.csv(df1, 'location_supermarkets.csv')
write.csv(df2, 'location_bakers.csv')
write.csv(df3, 'location_butchers.csv')
write.csv(df4, 'location_greengroceries.csv')
10. Appendix III: Location points for Google Places API query
Table 10: Location points Google Places API query
ID Latitude Longitude ID Latitude Longitude
1 51,95397708 5,339082416 36 52,09373605 5,718338897
2 51,95398351 5,414726979 37 52,0934984 5,794218405
3 51,95394139 5,490371547 38 52,09321178 5,870096826
4 51,9538506 5,566016184 39 52,09287632 5,945974078
5 51,95371104 5,641660399 40 52,09249183 6,02184994
6 51,95352273 5,717303559 41 52,14092969 5,338882146
7 51,95328588 5,792945795 42 52,14093623 5,414843466
8 51,95300055 5,868587142 43 52,14089378 5,490804631
9 51,95266653 5,944227395 44 52,14080244 5,56676553
10 51,95228373 6,019866325 45 52,14066219 5,642726024
11 52,00071579 5,339032494 46 52,14047301 5,718685797
12 52,00072216 5,414755938 47 52,14023488 5,794644658
13 52,0006799 5,490479365 48 52,13994792 5,870602501
14 52,00058891 5,566202704 49 52,13961213 5,946559209
15 52,00044913 5,641925635 50 52,13922722 6,022514502
16 52,00026062 5,717647781 51 52,18766704 5,338831913
17 52,00002355 5,793368976 52 52,18767358 5,414872891
18 51,99973791 5,869089144 53 52,1876311 5,490913608
19 51,99940341 5,944808165 54 52,18753963 5,566953977
20 51,99902005 6,020525887 55 52,1873991 5,642994077
21 52,04745407 5,338982435 56 52,18720946 5,719033505
22 52,04746046 5,41478497 57 52,18697085 5,795071956
23 52,04741808 5,490587453 58 52,18668356 5,871109358
24 52,04732695 5,566389745 59 52,18634739 5,947145623
25 52,04718706 5,642191627 60 52,18596211 6,023180613
26 52,04699847 5,717992864 61 52,23440385 5,338781447
27 52,04676115 5,793793148 62 52,23441044 5,414902364
28 52,04647502 5,86959233 63 52,23436795 5,491022862
29 52,04614001 5,945390329 64 52,23427637 5,567142856
30 52,04575608 6,021187043 65 52,23413559 5,643262729
31 52,09419198 5,338932294 66 52,2339456 5,719382134
32 52,09419848 5,414814143 67 52,23370666 5,795500434
33 52,09415607 5,49069589 68 52,23341893 5,871617515
34 52,09406484 5,566577395 69 52,23308224 5,947733446
35 52,09392482 5,642458471 70 52,23269647 6,023848217
11. Appendix IV: R code for the interactive map
# Import library
library(leaflet)
library(shiny)
library(shinyjs)
library(raster)
library(rgdal)
# Set working Directory
setwd("C:\\Users\\Bob\\Google Drive\\MSc Thesis\\script\\Case_visualisation")
# Source Function
source('open_python.R')
source('vis_optimal_results.R')
# Add color scheme for raster
raster_pal <- colorNumeric(c("lightgreen",'green' ,"Darkgreen"), c(0,1),
na.color = "transparent")
# Create color palette for optimal area
area_pal <- colorFactor(
palette = "Blues",
domain = c(100,200,300,400,500)
)
# Import demand raster
demand_raster <- raster("input_images\\dem_raster.tif")
# Set and change coordinate system of demand raster
WGS84 <- "+proj=longlat +ellps=WGS84 +datum=WGS84"
RD_new <- "+init=epsg:28992"
proj4string(demand_raster) <- CRS(RD_new)
demand_raster <- projectRaster(demand_raster, crs = CRS(WGS84))
# Import attractiveness raster
attractiveness_raster <- raster("input_images\\att_raster.tif")
# Set and change coordinate system of attractiveness raster
proj4string(attractiveness_raster) <- CRS(RD_new)
attractiveness_raster <- projectRaster(attractiveness_raster, crs = CRS(WGS84))
# Create bootstrap user interface
ui <- bootstrapPage(
# Set style of page + loadmessage
tags$style(type = "text/css", "html, body {width:100%;height:100%}",
"#loadmessage {
position: fixed;
top: 0px;
left: 0px;
width: 100%;
padding: 5px 0px 5px 0px;
text-align: center;
font-weight: bold;
font-size: 100%;
color: #000000;
background-color: #CCFF66;
z-index: 105;
}
"),
# Place a conditional panel which shows a load message when a function is running
conditionalPanel(condition="$('html').hasClass('shiny-busy')",
tags$div("Optimal solutions are being calculated...",
id="loadmessage")),
# Create Leaflet output
leafletOutput("mymap", width = "100%", height = "100%"),
# Place an absolute panel over the map view
absolutePanel(top = 10, right = 10,
# Implement Shiny JavaScript functions
shinyjs::useShinyjs(),
# Set input parameters
selectInput("food_specialist_type", "Choose a food specialist type:",
choices = c("Bakery", "Butcher", "Greengrocery")
),
numericInput("weight1", "Demand weight (0-1):", 1, min = 0, max = 1, step=0.1
),
numericInput("weight2", "Competition weight (0-1):", 1, min = 0, max = 1,
step=0.1),
numericInput("weight3", "Location attractiveness weight (0-1):", 1, min = 0,
max = 1, step=0.1),
numericInput("storenr", "Indicate the number of stores: (1-10)", 1, min = 1,
max = 10),
# Create actionbutton
actionButton("generateoptimal", "Generate optimal location(s)"
),
# Additional textouput for tests
textOutput("text") # Only used for testing purposes
)
)
# Create server to change information based on inputs
server <- function(input, output, session) {
# Set leaflet map
output$mymap <- renderLeaflet({ leaflet() %>%
# Add OpenStreetMap (default) map
addTiles(group = "openstreetmap") %>%
# Add a legend for the raster images
addLegend("bottomright", pal=raster_pal, values = c(0,1),
title = "Raster score")%>%
# Add a legend for the polygons
addLegend("bottomright", values=c(100,200,300,400,500),
pal=area_pal, title = "Metres to optimal location")%>%
# Add demand raster
addRasterImage(demand_raster, opacity=0.5, colors=raster_pal, group="Demand map")%>%
# Add attractiveness raster
addRasterImage(attractiveness_raster, opacity=0.5, colors=raster_pal,
group="Attractiveness map")%>%
# Add on/off control for the different layers
addLayersControl(
overlayGroups = c("Demand map", "Attractiveness map"),
options = layersControlOptions(collapsed = FALSE),position='bottomleft')%>%
# Hide the rasters by default
hideGroup(c("Demand map", "Attractiveness map"))
})
# If button generate button event is clicked, then..
observeEvent(input$generateoptimal, {
# Open leafletproxy of mymap
leafletProxy("mymap")%>%
# Clear prior markers and optimal areas
clearShapes()%>%
clearMarkers()
# Disable inputs
shinyjs::disable("generateoptimal")
shinyjs::disable("food_specialist_type")
shinyjs::disable("weight1")
shinyjs::disable("weight2")
shinyjs::disable("weight3")
shinyjs::disable("storenr")
# Write inputs to textfile
write(input$weight1, 'input_data\\weigth1.txt')
write(input$weight2, 'input_data\\weigth2.txt')
write(input$weight3, 'input_data\\weigth3.txt')
write(input$storenr, 'input_data\\storenr.txt')
write(input$food_specialist_type, 'input_data\\food_specialist_type.txt')
# Execute optimisation method
openPython('GreedyAlgorithm') # or 'SimulatedAnnealing'
# Enable inputs
shinyjs::enable("generateoptimal")
shinyjs::enable("food_specialist_type")
shinyjs::enable("weight1")
shinyjs::enable("weight2")
shinyjs::enable("weight3")
shinyjs::enable("storenr")
# Find all optimal location/area files
opt_list <- opt_location()
loop_nr <- 1
# Loop through all objects in opt_list
for (i in opt_list){
# If optimal location, then...
if (class(i[1]) == "SpatialPointsDataFrame"){
# Set optimal point based on loop number
optimal_lxcenter<-opt_list[[loop_nr]]
# Set leaflet proxy
leafletProxy("mymap")%>%
# Add markers where the optima are found
addMarkers(lng = optimal_lxcenter$coords.x1,
lat = optimal_lxcenter$coords.x2)
loop_nr = loop_nr + 1
}
# If not an optimal location, then..
else {
# Set optimal_area based on loop number
optimal_area <- opt_list[[loop_nr]]
# Set leaflet proxy
leafletProxy("mymap", data=optimal_area)%>%
# Add polygons based on the service areas of the optimal points
addPolygons(stroke=FALSE,
color = ~colorQuantile("Blues", optimal_area$ToBreak)(ToBreak),
fillOpacity = 0.6)
loop_nr = loop_nr + 1
}
}
})
}
# Execute shinyApp
shinyApp(ui, server)