Research Article Spatiotemporal Hotspots Analysis for Exploring...

8
Hindawi Publishing Corporation Advances in Fuzzy Systems Volume 2013, Article ID 385974, 7 pages http://dx.doi.org/10.1155/2013/385974 Research Article Spatiotemporal Hotspots Analysis for Exploring the Evolution of Diseases: An Application to Oto-Laryngopharyngeal Diseases Ferdinando Di Martino, 1 Roberta Mele, 1 Umberto E. S. Barillari, 2 Maria Rosaria Barillari, 2 Irina Perfilieva, 3 and Sabrina Senatore 4 1 Universit´ a degli Studi di Napoli Federico II, Dipartimento di Architettura, Via Toledo 402, 80134 Napoli, Italy 2 Seconda Universit` a degli Studi di Napoli, Dipartimento di Psichiatria, Neuropsichiatria Infantile, Audiofoniatria e Dermatovenereologia, L.go Madonna delle Grazie, 80138 Napoli, Italy 3 Centre of Excellence IT4 Innovations, Institute for Research and Applications of Fuzzy Modelling, University of Ostrava, 30. dubna 22, 70103 Ostrava, Czech Republic 4 Universit´ a degli Studi di Salerno, Dipartimento di Informatica, Via Ponte don Melillo, 80084 Fisciano, Salerno, Italy Correspondence should be addressed to Ferdinando Di Martino; [email protected] Received 1 May 2013; Revised 29 July 2013; Accepted 29 July 2013 Academic Editor: Salvatore Sessa Copyright © 2013 Ferdinando Di Martino et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. is paper presents a spatiotemporal analysis of hotspot areas based on the Extended Fuzzy C-Means method implemented in a geographic information system. is method has been adapted for detecting spatial areas with high concentrations of events and tested to study their temporal evolution. e data consist of georeferenced patterns corresponding to the residence of patients in the district of Naples (Italy) to whom a surgical intervention to the oto-laryngopharyngeal apparatus was carried out between the years 2008 and 2012. 1. Introduction In a GIS, the impact of phenomena in a specific area due to the proximity of the event (e.g., the study of the impact area of an earthquake, or the area constraint around a river basin) is performed using buffer area geoprocessing functions. Given a geospatial event topologically represented as a georeferenced punctual, linear, or areal element, an atomic buffer area is constituted by circular areas centered on the element. For example, if the event is the epicenter of an earthquake, georeferenced by a point, a set of buffer areas is formed by concentric circular areas around that point; the radius of each circular buffer area is defined a priori. When it is not possible to define statically an area of impact and we need to determine what is the area affected by the presence of a consistent set of events, we are faced with the problem of detecting this area as a cluster on which the georeferenced events are thickened as well. ese clusters are georeferenced, represented as polygons on the map, and called hotspot areas. e study of hotspot areas is vital in many disciplines such as crime analysis [13], which studies the spread on the territory of criminal events, fire analysis [4], which analyzes the phenomenon of spread of fires on forested areas, and disease analysis [57], which studies the localization of focuses of diseases and their temporal evolution. e clustering methods mainly used for detecting hotspot areas are the algorithms based on density (see [8, 9]); they can detect the exact geometry of the hotspots, but are highly expensive in terms of computational complexity, and in the great majority of cases, it is not necessary to determine exactly the shape of the clusters. e clustering algorithm more used for its linear computational complexity is the Fuzzy C-Means algorithm (FCM) [10], a partitive fuzzy clustering method that uses the Euclidean distance to determine prototypes cluster as points.

Transcript of Research Article Spatiotemporal Hotspots Analysis for Exploring...

Page 1: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

Hindawi Publishing CorporationAdvances in Fuzzy SystemsVolume 2013 Article ID 385974 7 pageshttpdxdoiorg1011552013385974

Research ArticleSpatiotemporal Hotspots Analysis for Exploring the Evolution ofDiseases An Application to Oto-Laryngopharyngeal Diseases

Ferdinando Di Martino1 Roberta Mele1 Umberto E S Barillari2 Maria Rosaria Barillari2

Irina Perfilieva3 and Sabrina Senatore4

1 Universita degli Studi di Napoli Federico II Dipartimento di Architettura Via Toledo 402 80134 Napoli Italy2 Seconda Universita degli Studi di Napoli Dipartimento di Psichiatria Neuropsichiatria InfantileAudiofoniatria e Dermatovenereologia Lgo Madonna delle Grazie 80138 Napoli Italy

3 Centre of Excellence IT4 Innovations Institute for Research and Applications of Fuzzy Modelling University of Ostrava30 dubna 22 70103 Ostrava Czech Republic

4Universita degli Studi di Salerno Dipartimento di Informatica Via Ponte don Melillo 80084 Fisciano Salerno Italy

Correspondence should be addressed to Ferdinando Di Martino fdimartiuninait

Received 1 May 2013 Revised 29 July 2013 Accepted 29 July 2013

Academic Editor Salvatore Sessa

Copyright copy 2013 Ferdinando Di Martino et alThis is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

This paper presents a spatiotemporal analysis of hotspot areas based on the Extended Fuzzy C-Means method implemented in ageographic information system This method has been adapted for detecting spatial areas with high concentrations of events andtested to study their temporal evolution The data consist of georeferenced patterns corresponding to the residence of patients inthe district of Naples (Italy) to whom a surgical intervention to the oto-laryngopharyngeal apparatus was carried out between theyears 2008 and 2012

1 Introduction

In a GIS the impact of phenomena in a specific area due tothe proximity of the event (eg the study of the impact area ofan earthquake or the area constraint around a river basin) isperformed using buffer area geoprocessing functions Given ageospatial event topologically represented as a georeferencedpunctual linear or areal element an atomic buffer area isconstituted by circular areas centered on the element Forexample if the event is the epicenter of an earthquakegeoreferenced by a point a set of buffer areas is formed byconcentric circular areas around that point the radius of eachcircular buffer area is defined a priori

When it is not possible to define statically an area ofimpact and we need to determine what is the area affectedby the presence of a consistent set of events we are facedwith the problem of detecting this area as a cluster on whichthe georeferenced events are thickened as well These clusters

are georeferenced represented as polygons on the map andcalled hotspot areas

The study of hotspot areas is vital in many disciplinessuch as crime analysis [1ndash3] which studies the spread onthe territory of criminal events fire analysis [4] whichanalyzes the phenomenon of spread of fires on forested areasand disease analysis [5ndash7] which studies the localizationof focuses of diseases and their temporal evolution Theclustering methods mainly used for detecting hotspot areasare the algorithms based on density (see [8 9]) they candetect the exact geometry of the hotspots but are highlyexpensive in terms of computational complexity and in thegreatmajority of cases it is not necessary to determine exactlythe shape of the clusters The clustering algorithmmore usedfor its linear computational complexity is the Fuzzy C-Meansalgorithm (FCM) [10] a partitive fuzzy clustering methodthat uses the Euclidean distance to determine prototypescluster as points

2 Advances in Fuzzy Systems

Let X = x1 x

119873 sub 119877

119899 be a dataset composed of119873 pattern x

119895= (119909

1119895 1199092119895 119909

119899119895)

119879 where 119909119896119895

is the 119896thcomponent (feature) of the pattern x

119895 The FCM algorithm

minimizes the following objective function

119869 (XUV) =

119862

sum

119894=1

119873

sum

119895=1

119906

119898

119894119895119889

2

119894119895 (1)

where 119862 is the number of clusters fixed a priori 119906119894119895is the

membership degree of the pattern x119895to the 119894th cluster (119894 =

1 119862) V = v1 v

119862 sub 119877

119899 is the set of points given bythe centers of the 119862 clusters (prototypes) 119898 is the fuzzifierparameter and 119889

119894119895is the distance between the center k

119894=

(V1119894 V2119894 V

119899119895)

119879 of the 119894th cluster and the 119895th vector x119895

calculated as the Euclidean norm

119889119894119895=

10038171003817100381710038171003817

(x119895minus k119894)

10038171003817100381710038171003817

= radic

119899

sum

119896=1

(x119896119895

minus k119896119894)

2

(2)

Using the Lagrange multipliers method for minimizing theobjective function (1) we obtain the following solution for thecenter of each cluster prototype

k119894=

sum

119873

119895=1119906

119898

119894119895x119895

sum

119873

119895=1119906

119898

119894119895

(3)

where 119894 = 1 119862 and for membership degrees 119906119894119895

119906119894119895=

1

(sum

119888

ℎ=1(119889

2

119894119895119889

2

ℎ119895))

2(119898minus1) (4)

subjected to the constraints

119862

sum

119894=1

119906119894119895= 1 forall119895 isin 1 119873

0 lt

119873

sum

119895=1

119906119894119895lt 119873 forall119894 isin 1 119862

(5)

Initially the 119906119894119895rsquos and the v

119894are assigned randomly and

updated in each iteration If 119880

(119897)= (119906

(119897)

119894119895) is the matrix U

calculated at the 119897-th step the iterative process stops when10038171003817100381710038171003817

U(119897) minus U(119897minus1)10038171003817100381710038171003817

= max119894119895

10038161003816100381610038161003816

119906

(119897)

119894119895minus 119906

(119897minus1)

119894119895

10038161003816100381610038161003816

lt 120576 (6)

where 120576 gt 0 is a prefixed parameterThis algorithm has a linear computational complexity

however it is sensitive to the presence of noise and outliersfurthermore the number of cluster 119862 is fixed a priori andneeds to use a validity index for determining an optimal valuefor the parameter 119862

In order to overcome these shortcomings in [11 12] theEFCM algorithm is proposed where the cluster prototypesare hyperspheres in the case of the Euclidean metric LikeFCM the EFCM algorithm is characterized by a linear com-putational complexity furthermore it is robust with respect

A minus (A cap B)

= A minus B

A cap BB minus (A cap B) =

B minus A

Figure 1 Intersections of two hotspots detected for events thathappened in two consecutive periods

to the presence of noise and outliers and the final number ofclusters is determined during the iterative process

In [13 14] the authors propose the use of the EFCMalgorithm for detecting hotspot areas The final hotspots areidentified as the detected cluster prototypes and shown on themap as circular areas In [4] the authors analyze the spatio-temporal evolution of the hotspots in the fire analysis Thepattern event dataset is partitioned according to the timeof the eventrsquos detection so each subset is corresponding toa specific time interval The authors compare the hotspotsobtained in two consecutive years by studying their inter-sections on the map In this way it is possible to follow theevolution of a particular phenomenon

The cluster prototypes detected from EFCM method arecircular areas on themap that can approximate a hotspot areaFigure 1 shows an example of two circular hotspots obtainedas clusters

Figure 1 shows three different regions

(i) An area in which the hotspot 119860 is not intersectedby the hotspot 119861 (corresponding to 119860 minus (119860 cap 119861) =

119860minus119861) this region can be considered as a geographicalarea in which prematurely detected event disappearssuccessively

(ii) The region of intersection of the two hotspots 119860 cap 119861this region can be considered a geographical area inwhich the event continues to persist

(iii) An area in which the hotspot B is not intersected bythe hotspotA (corresponding to 119861minus(119860cap119861) = 119861minus119860)this region can be considered as a geographical area inwhich the prematurely undetected event propagatessuccessively

We can study the spatio-temporal evolution of the hotspotsby analyzing the interactions between the correspondingcircular cluster prototypes obtained for consecutive periods

Advances in Fuzzy Systems 3

and detecting the presence of new hotspots in regions previ-ously not covered by hotspots and the absebce of hotspots inregions previously spatially included in hotspot areas

In this research we present a method for studyingthe spatio-temporal evolution of hotspots areas in diseaseanalysis we apply the EFCM algorithm for comparingin consecutive years event datasets corresponding to oto-laryngopharyngeal diseases diagnosis detected in the districtof Naples (I) Each event corresponds to the residence of thepatient who contracted the disease

We study the spatio-temporal evolution of the hotspotsanalyzing the intersections of hotspots corresponding totwo consecutive years the displacement of the centroidsthe increase or reduction of the hotspots areas and theemergence of new hotspots

In Section 2 we give an overview of the EFCM algo-rithm In Section 3 we present our method for studying thespatio-temporal evolution of hotspots in disease analysis InSection 4 we present the results of the spatio-temporal evolu-tion of hotspots for the otolaryngologist-laryngopharyngealdiseases diagnosis events detected in the district of Naples (I)Our conclusions are in Section 5

2 The EFCM Algorithm

In the EFCM algorithm we consider clustering prototypesgiven by hyperspheres in the 119899-dimensional featurersquos spaceThe 119894th hypersphere is characterized by a centroid v

119894=

(k1119894 k

119899119894) and a radius 119903

119894

Indeed if 119903119894is the radius of 119881

119894 we say that 119909

119895belongs to

119881119894if 119889119894119895le 119903119894

The radius 119903119894is obtained considering the covariance

matrix 119875119894associated with the 119894th cluster defined as

P119894=

sum

119873

119895=1119906

119898

119894119895(x119895minus k119894) (x119895minus k119894)

119879

sum

119873

119895=1119906

119898

119894119895

(7)

whose determinant gives the volume of the 119894th cluster SinceP119894is symmetric and positive it can be decomposed in the

following form

Pi = Q119894Λ119894Q119879119894 (8)

where 119876119894is an orthonormal matrix and Δ

119894= (120582

119894119895) is

a diagonal matrix The radius 119903119894is given by the following

formula (see [12])

119903119894=

1

119899

radic

119899

prod

119896=1

120582

1119899

119894119896=

radicdet (119875119894)

1119899

(9)

The objective function to be minimized is the following

119869 (119883119880 119881) =

119862

sum

119894=1

119873

sum

119895=1

119906

119898

119894119895(119889

2

119894119895minus 119903

2

119894) (10)

where the membership degrees 119906119894119895are updated as

119906119894119895= 1 times (

119862

sum

ℎ=1

((119889

2

119894119895max (0 1 minus 119903

2

119894119889

2

119894119895))

(119889

2

ℎ119895max (0 1 minus 119903

2

ℎ119889

2

ℎ119895)))

1(119898minus1)

)

minus1

=

1

sum

119862

ℎ=1(119889

2

119894119895119908119894119895119889

2

ℎ119895119908ℎ119895)

1(119898minus1)

(11)

We set 11988910158402119896119895

= 119889

2

119896119895119908119896119895

= max(0 1198892119896119895

minus 119903

2

119896) and define the

number 120593119895

= card119896 isin 1 119862 119889

1015840

119896119895= 0 for any 119895 =

1 119873 thus we obtain

119906119894119895=

1

sum

119862

ℎ=1(119889

1015840

119894119895119889

1015840

ℎ119895)

2(119898minus1)if 120593119895= 0

119906119894119895=

0 if 1198891015840119894119895gt 0

if 120593119895gt 0

1

120593119895

if 1198891015840119894119895= 0

(12)

However the usage of (12) produces the negative effect ofdiminishing the objective function (10) when a meaningfulnumber of features are placed in a cluster and this fact canprevent the separation of the clusters Then a solution to thisproblem consists in the assumption of a small starting valueof 119903119894and then it is increased gradually with the factor120573(119897)119862(119897)

where119862(119897) is the number of clusters at the 119897th iteration and120573

(119897)

is defined recursively as 120573(0) = 1 120573

(119897)= min(119862(119897minus1) 120573(119897minus1)) by

setting

119868119894119896

=

sum

119873

119895=1min (119906

119894119895 119906119896119894)

sum

119873

119895=1119906119894119895

(13)

and the symmetric matrix 119878 = (119878119894119896) where 119878

119894119896= max119868

119894119896 119868119896119894

is defined as well If 119878(119897) is the matrix 119878 at the 119897th iteration (119897 gt

1) and the threshold 120572

(119897)= 1(119862

(119897)minus 1) is introduced as limit

then two indexes 119894lowast and 119896

lowast are determined such that 119878(119897)119894lowast119896lowast

ge

120572

(119897) and thus 119894lowast and 119896

lowast are merged by setting

119906

(119897)

119894lowast119895= 119906

(119897)

119894lowast119895+ 119906

(119897)

119896lowast119895 forall119895 isin 1 119873

119862

(119897)= 119862

(119897minus1)+ 1

(14)

The 119896

lowastth row can be removed from the matrix 119880

(119897) Inother words the EFCM algorithm can be summarized in thefollowing steps

(1) The user assigns the initial number of clusters 119862

(0)

119898 gt 1 (usually119898 = 2) 120576 gt 0 the initial value 119878(0)119894119896

= 0and 120573

(0)= 1

(2) The membership degrees 119906

(0)

119894119895(119895 = 1 119873 and 119894 =

1 119862(0)) are assigned randomly

4 Advances in Fuzzy Systems

Figure 2 Example of events georeferenced on a road network

(3) The centers of the clusters V119894are calculated by using

(3)(4) The radii of the clusters are calculated by using (9)(5) 119906119894119895is calculated by using (12)

(6) The indexes 119894lowast and 119896

lowast are determined in such a waythat 119878(119897)119894lowast119896lowast assumes the possible greatest value at the 119897th

iteration(7) If |119878(119897)

119894lowast119896lowast minus 119878

(119897minus1)

119894lowast119896lowast | lt 120576 and 119878

(119897)

119894lowast119896lowast gt 120572(119897) = 1(119862

(119897minus1)minus 1)

then the 119894lowastth and 119896

lowastth clusters aremerged via (14) andthe 119896lowastth row is deleted from 119880

(119897)(8) If (6) is satisfied then the process stops otherwise go

to the step (3) for the (119897 + 1)th iteration

3 Hotspots Detection and Evolution inDisease Analysis

Each pattern is given by the event corresponding to theresidence of the patient to whom a specific disease has beendetected The two features of the pattern are the geographiccoordinates of the residence

The first step of our process is a geocoding activitynecessary for obtaining the event dataset starting by the streetaddress of the patients

To ensure an accurate matching for the geopositioning ofthe event we need the topologically correct road network andthe corresponding complete toponymic data

The starting data include the name of the street and thehouse number of the patientrsquos residence After the matchingprocess each data is converted in an event point georefer-enced on the map

In Figure 2 the road network of the district of Naples isshown the name of the street is labeled on themap the eventsare georeferenced as points on the map

Figure 3 shows the data corresponding to an eventselected on the map

After geo-referencing each event the event dataset canbe split partitioning them by time interval For example theevent in Figure 3 can be split by the field ldquoYearrdquo

For each subset of events we apply the EFCM algorithmto detect the final cluster prototypes

Figure 3 Data associated to an event on the map

Figure 4 A form created in theGIS Tool ESRIArcGIS formanagingthe EFCM process

In this research we point out the analysis of the temporalevolution and spread of oto-laryngo-pharyngeal diseasesdetected within the district of Naples The datasets dividedby time sequences corresponding to periods of one yearare made up of patterns for different events georeferencedcorresponding to ailments encountered in patients for whichan intervention and the subsequent histological examinationwere pointed out as well The event refers to the geoposition-ing of the location of the patient

The data have been further divided by the type of thedisease for analyzing the distribution and evolution of eachspecific disease on the area of the study

The EFCM algorithm has been encapsulated in the GISplatform ESRI ArcGIS Figure 4 shows the mask created forsetting the parameters and running the EFCM algorithm

We can set other numerical fields for adding otherfeatures to the geographical coordinates

Initially we set the initial number of clusters the fuzzifierm and the error threshold for stopping the iterations Afterrunning EFCM the number of iterations the final numberof clusters and the error calculated at the last iteration arereported The resultant clusters are shown as circular areason the map and can be saved in a new geographic layer

The final process concerns the comparative analysis ofthe hotspots obtained by the clusters corresponding to eachsubsets of events

Advances in Fuzzy Systems 5

Figure 5 Analysis of the spatio-temporal evolution of hotspotsdetected in two consecutive periods

Figure 6 Edema of bilateral Reinke diseasemdashyear 2011 display ofthe hotspots on the map

Figure 5 shows an example of display on the map ofhotspots obtained as final clusters for two consecutive subsetof events

In order to assess the expansion and the displacementof a hotspot we measure the radius of the hotspot and thedistance between the centroids of two intersecting hotspots

In the next section we present the results obtained byapplying this method for the data corresponding to surgicalinterventions to the oto-laryngo-pharyngeal apparatus inpatients residents in the district of Naples between the years2008 and 2012

We divide the dataset per year and analyze various typesof diseases

Among the types of the most frequent diseases thefollowing were analyzed

(i) carcinoma(ii) edema of bilateral Reinke(iii) hypertrophy of the inferior turbinate(iv) nasal polyposis(v) bilateral vocal fold prolapse

In the next section we show the most significant resultsobtained by applying this method to the each partitioneddataset of events

Figure 7 Edema of bilateral Reinke diseasemdashyear 2012 display ofthe hotspots on the map

Table 1 Endema of bilateral Reinke diseasemdashfinal number ofclusters and final error for year

Year Initial numberof clusters

Final numberof clusters |119880

(119897)minus 119880

(119897minus1)| 120576

2008 15 4 048 times 10

minus21 times 10

minus2

2009 15 4 055 times 10

minus21 times 10

minus2

2010 15 4 071 times 10

minus21 times 10

minus2

2011 15 4 067 times 10

minus21 times 10

minus2

2012 15 3 053 times 10

minus21 times 10

minus2

4 Test Results

Wepresent the results obtained on the event dataset describedabove in the period between the years 2008 and 2012

We consider first the subset of data corresponding to theedema of bilateral Reinke disease

We fix the fuzzifier parameter to 01 the initial number ofclusters to 15 and the final iteration error to 1 times 10minus2

Table 1 shows the results obtained for each yearWe present the details relating to the comparison of the

hotspots obtained by considering the event data for the years2011 and 2012

Figures 6 and 7 show respectively the hotspots obtainedby using the pattern subset of events that occurred in the years2011 and 2012

Figure 8 shows the overlap of the hotspots obtained forthe two years in red the hotspots corresponding to the year2011 in blue the ones corresponding to the year 2012

Table 2 shows in the first two columns the labels of thehotspots in 2011 and 2012 in third (resp fourth) column theradius obtained in 2011 (resp 2012) and the distance betweenthe centroids is given in the fifth column

The results show that only hotspot 3 obtained for the year2011 remains almost unchanged in the year 2012 Insteadhotspots 1 and 2 seem to merge into a single larger hotspot(the hotspot 1 obtained for the year 2012) and hotspot 4 thatshifts about 1 km is expanded the radius of this hotspot in2012 is about 65 km (hotspot 3 obtained for the year 2012 inFigure 8)

6 Advances in Fuzzy Systems

Figure 8 Edema of bilateral Reinke diseasemdashyears 2011 and 2012display of the two hotspotsrsquo series

Table 2 Endema of bilateral Reinke diseasemdashcomparison results ofthe hotspots obtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 1724 3848 17592 1 1943 3848 15073 2 3434 3453 00744 3 3591 6519 1115

Table 3 Nasal polyposismdashcomparison results of the hotspotsobtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 3087 4951 26562 2 4915 7103 1052

Now we show the results obtained for the disease nasalpolyposis

Figure 9 shows the overlap of the hotspots obtained forthe two years 2011 and 2012

In Table 3 the comparisonrsquos results are reportedThe results in Figure 9 show that in 2011 and 2012 there

are two hotspots the one covering an area of the city ofNaples and the other coveringmanyVesuvian townsThe twohotspots which in 2011 covered a circular area with a radii ofabout 3 and 5 km respectively in 2012 cover a circular areawith radii of about 5 and 7 km respectively

The histogram in Figure 10 shows the trend of the radii ofthe two hotspots in the course of time

It is relevant the spread in recent years of the hotspot thatsurrounds the Vesuvian towns (the radius of this hotspotfrom about 2 km in the year 2008 is about 7 km in the year2012)

Another significant trend concerns the hotspots obtainedfor the carcinoma disease

Also in this case the two main hotsposts cover the cityof Naples and many Vesuvian towns In in this case we havea very high spread of the hotspot covering the city of Naples

Figure 9 Nasal polyposis diseasemdashyears 2011 and 2012 display ofthe two hotspotsrsquo series

8

7

6

5

4

3

2

1

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 10 Nasal polyposis diseasemdashhistogram showing the varia-tion of the radius of the two hotspots over time

(cfr Figure 11) in recent years the radius of this hotspot isincreased up to 95 km

5 Conclusions

The hyperspheres obtained as clusters (circles in case oftwo dimensions) by using EFCM can represent hotspots inhotspot analysis this method has a linear computationalcomplexity and is robust to noises and outliers In hotspotsanalysis the patterns are bidimensional and the features areformed by geographic coordinates the cluster prototypes arecircles that can represent a good approximation of hotspotareas and can be displayed as circular areas on the map

In this paper we present a new method that uses theEFCM algorithm for studying the spatio-temporal evolutionof hotspots in disease analysis

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 2: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

2 Advances in Fuzzy Systems

Let X = x1 x

119873 sub 119877

119899 be a dataset composed of119873 pattern x

119895= (119909

1119895 1199092119895 119909

119899119895)

119879 where 119909119896119895

is the 119896thcomponent (feature) of the pattern x

119895 The FCM algorithm

minimizes the following objective function

119869 (XUV) =

119862

sum

119894=1

119873

sum

119895=1

119906

119898

119894119895119889

2

119894119895 (1)

where 119862 is the number of clusters fixed a priori 119906119894119895is the

membership degree of the pattern x119895to the 119894th cluster (119894 =

1 119862) V = v1 v

119862 sub 119877

119899 is the set of points given bythe centers of the 119862 clusters (prototypes) 119898 is the fuzzifierparameter and 119889

119894119895is the distance between the center k

119894=

(V1119894 V2119894 V

119899119895)

119879 of the 119894th cluster and the 119895th vector x119895

calculated as the Euclidean norm

119889119894119895=

10038171003817100381710038171003817

(x119895minus k119894)

10038171003817100381710038171003817

= radic

119899

sum

119896=1

(x119896119895

minus k119896119894)

2

(2)

Using the Lagrange multipliers method for minimizing theobjective function (1) we obtain the following solution for thecenter of each cluster prototype

k119894=

sum

119873

119895=1119906

119898

119894119895x119895

sum

119873

119895=1119906

119898

119894119895

(3)

where 119894 = 1 119862 and for membership degrees 119906119894119895

119906119894119895=

1

(sum

119888

ℎ=1(119889

2

119894119895119889

2

ℎ119895))

2(119898minus1) (4)

subjected to the constraints

119862

sum

119894=1

119906119894119895= 1 forall119895 isin 1 119873

0 lt

119873

sum

119895=1

119906119894119895lt 119873 forall119894 isin 1 119862

(5)

Initially the 119906119894119895rsquos and the v

119894are assigned randomly and

updated in each iteration If 119880

(119897)= (119906

(119897)

119894119895) is the matrix U

calculated at the 119897-th step the iterative process stops when10038171003817100381710038171003817

U(119897) minus U(119897minus1)10038171003817100381710038171003817

= max119894119895

10038161003816100381610038161003816

119906

(119897)

119894119895minus 119906

(119897minus1)

119894119895

10038161003816100381610038161003816

lt 120576 (6)

where 120576 gt 0 is a prefixed parameterThis algorithm has a linear computational complexity

however it is sensitive to the presence of noise and outliersfurthermore the number of cluster 119862 is fixed a priori andneeds to use a validity index for determining an optimal valuefor the parameter 119862

In order to overcome these shortcomings in [11 12] theEFCM algorithm is proposed where the cluster prototypesare hyperspheres in the case of the Euclidean metric LikeFCM the EFCM algorithm is characterized by a linear com-putational complexity furthermore it is robust with respect

A minus (A cap B)

= A minus B

A cap BB minus (A cap B) =

B minus A

Figure 1 Intersections of two hotspots detected for events thathappened in two consecutive periods

to the presence of noise and outliers and the final number ofclusters is determined during the iterative process

In [13 14] the authors propose the use of the EFCMalgorithm for detecting hotspot areas The final hotspots areidentified as the detected cluster prototypes and shown on themap as circular areas In [4] the authors analyze the spatio-temporal evolution of the hotspots in the fire analysis Thepattern event dataset is partitioned according to the timeof the eventrsquos detection so each subset is corresponding toa specific time interval The authors compare the hotspotsobtained in two consecutive years by studying their inter-sections on the map In this way it is possible to follow theevolution of a particular phenomenon

The cluster prototypes detected from EFCM method arecircular areas on themap that can approximate a hotspot areaFigure 1 shows an example of two circular hotspots obtainedas clusters

Figure 1 shows three different regions

(i) An area in which the hotspot 119860 is not intersectedby the hotspot 119861 (corresponding to 119860 minus (119860 cap 119861) =

119860minus119861) this region can be considered as a geographicalarea in which prematurely detected event disappearssuccessively

(ii) The region of intersection of the two hotspots 119860 cap 119861this region can be considered a geographical area inwhich the event continues to persist

(iii) An area in which the hotspot B is not intersected bythe hotspotA (corresponding to 119861minus(119860cap119861) = 119861minus119860)this region can be considered as a geographical area inwhich the prematurely undetected event propagatessuccessively

We can study the spatio-temporal evolution of the hotspotsby analyzing the interactions between the correspondingcircular cluster prototypes obtained for consecutive periods

Advances in Fuzzy Systems 3

and detecting the presence of new hotspots in regions previ-ously not covered by hotspots and the absebce of hotspots inregions previously spatially included in hotspot areas

In this research we present a method for studyingthe spatio-temporal evolution of hotspots areas in diseaseanalysis we apply the EFCM algorithm for comparingin consecutive years event datasets corresponding to oto-laryngopharyngeal diseases diagnosis detected in the districtof Naples (I) Each event corresponds to the residence of thepatient who contracted the disease

We study the spatio-temporal evolution of the hotspotsanalyzing the intersections of hotspots corresponding totwo consecutive years the displacement of the centroidsthe increase or reduction of the hotspots areas and theemergence of new hotspots

In Section 2 we give an overview of the EFCM algo-rithm In Section 3 we present our method for studying thespatio-temporal evolution of hotspots in disease analysis InSection 4 we present the results of the spatio-temporal evolu-tion of hotspots for the otolaryngologist-laryngopharyngealdiseases diagnosis events detected in the district of Naples (I)Our conclusions are in Section 5

2 The EFCM Algorithm

In the EFCM algorithm we consider clustering prototypesgiven by hyperspheres in the 119899-dimensional featurersquos spaceThe 119894th hypersphere is characterized by a centroid v

119894=

(k1119894 k

119899119894) and a radius 119903

119894

Indeed if 119903119894is the radius of 119881

119894 we say that 119909

119895belongs to

119881119894if 119889119894119895le 119903119894

The radius 119903119894is obtained considering the covariance

matrix 119875119894associated with the 119894th cluster defined as

P119894=

sum

119873

119895=1119906

119898

119894119895(x119895minus k119894) (x119895minus k119894)

119879

sum

119873

119895=1119906

119898

119894119895

(7)

whose determinant gives the volume of the 119894th cluster SinceP119894is symmetric and positive it can be decomposed in the

following form

Pi = Q119894Λ119894Q119879119894 (8)

where 119876119894is an orthonormal matrix and Δ

119894= (120582

119894119895) is

a diagonal matrix The radius 119903119894is given by the following

formula (see [12])

119903119894=

1

119899

radic

119899

prod

119896=1

120582

1119899

119894119896=

radicdet (119875119894)

1119899

(9)

The objective function to be minimized is the following

119869 (119883119880 119881) =

119862

sum

119894=1

119873

sum

119895=1

119906

119898

119894119895(119889

2

119894119895minus 119903

2

119894) (10)

where the membership degrees 119906119894119895are updated as

119906119894119895= 1 times (

119862

sum

ℎ=1

((119889

2

119894119895max (0 1 minus 119903

2

119894119889

2

119894119895))

(119889

2

ℎ119895max (0 1 minus 119903

2

ℎ119889

2

ℎ119895)))

1(119898minus1)

)

minus1

=

1

sum

119862

ℎ=1(119889

2

119894119895119908119894119895119889

2

ℎ119895119908ℎ119895)

1(119898minus1)

(11)

We set 11988910158402119896119895

= 119889

2

119896119895119908119896119895

= max(0 1198892119896119895

minus 119903

2

119896) and define the

number 120593119895

= card119896 isin 1 119862 119889

1015840

119896119895= 0 for any 119895 =

1 119873 thus we obtain

119906119894119895=

1

sum

119862

ℎ=1(119889

1015840

119894119895119889

1015840

ℎ119895)

2(119898minus1)if 120593119895= 0

119906119894119895=

0 if 1198891015840119894119895gt 0

if 120593119895gt 0

1

120593119895

if 1198891015840119894119895= 0

(12)

However the usage of (12) produces the negative effect ofdiminishing the objective function (10) when a meaningfulnumber of features are placed in a cluster and this fact canprevent the separation of the clusters Then a solution to thisproblem consists in the assumption of a small starting valueof 119903119894and then it is increased gradually with the factor120573(119897)119862(119897)

where119862(119897) is the number of clusters at the 119897th iteration and120573

(119897)

is defined recursively as 120573(0) = 1 120573

(119897)= min(119862(119897minus1) 120573(119897minus1)) by

setting

119868119894119896

=

sum

119873

119895=1min (119906

119894119895 119906119896119894)

sum

119873

119895=1119906119894119895

(13)

and the symmetric matrix 119878 = (119878119894119896) where 119878

119894119896= max119868

119894119896 119868119896119894

is defined as well If 119878(119897) is the matrix 119878 at the 119897th iteration (119897 gt

1) and the threshold 120572

(119897)= 1(119862

(119897)minus 1) is introduced as limit

then two indexes 119894lowast and 119896

lowast are determined such that 119878(119897)119894lowast119896lowast

ge

120572

(119897) and thus 119894lowast and 119896

lowast are merged by setting

119906

(119897)

119894lowast119895= 119906

(119897)

119894lowast119895+ 119906

(119897)

119896lowast119895 forall119895 isin 1 119873

119862

(119897)= 119862

(119897minus1)+ 1

(14)

The 119896

lowastth row can be removed from the matrix 119880

(119897) Inother words the EFCM algorithm can be summarized in thefollowing steps

(1) The user assigns the initial number of clusters 119862

(0)

119898 gt 1 (usually119898 = 2) 120576 gt 0 the initial value 119878(0)119894119896

= 0and 120573

(0)= 1

(2) The membership degrees 119906

(0)

119894119895(119895 = 1 119873 and 119894 =

1 119862(0)) are assigned randomly

4 Advances in Fuzzy Systems

Figure 2 Example of events georeferenced on a road network

(3) The centers of the clusters V119894are calculated by using

(3)(4) The radii of the clusters are calculated by using (9)(5) 119906119894119895is calculated by using (12)

(6) The indexes 119894lowast and 119896

lowast are determined in such a waythat 119878(119897)119894lowast119896lowast assumes the possible greatest value at the 119897th

iteration(7) If |119878(119897)

119894lowast119896lowast minus 119878

(119897minus1)

119894lowast119896lowast | lt 120576 and 119878

(119897)

119894lowast119896lowast gt 120572(119897) = 1(119862

(119897minus1)minus 1)

then the 119894lowastth and 119896

lowastth clusters aremerged via (14) andthe 119896lowastth row is deleted from 119880

(119897)(8) If (6) is satisfied then the process stops otherwise go

to the step (3) for the (119897 + 1)th iteration

3 Hotspots Detection and Evolution inDisease Analysis

Each pattern is given by the event corresponding to theresidence of the patient to whom a specific disease has beendetected The two features of the pattern are the geographiccoordinates of the residence

The first step of our process is a geocoding activitynecessary for obtaining the event dataset starting by the streetaddress of the patients

To ensure an accurate matching for the geopositioning ofthe event we need the topologically correct road network andthe corresponding complete toponymic data

The starting data include the name of the street and thehouse number of the patientrsquos residence After the matchingprocess each data is converted in an event point georefer-enced on the map

In Figure 2 the road network of the district of Naples isshown the name of the street is labeled on themap the eventsare georeferenced as points on the map

Figure 3 shows the data corresponding to an eventselected on the map

After geo-referencing each event the event dataset canbe split partitioning them by time interval For example theevent in Figure 3 can be split by the field ldquoYearrdquo

For each subset of events we apply the EFCM algorithmto detect the final cluster prototypes

Figure 3 Data associated to an event on the map

Figure 4 A form created in theGIS Tool ESRIArcGIS formanagingthe EFCM process

In this research we point out the analysis of the temporalevolution and spread of oto-laryngo-pharyngeal diseasesdetected within the district of Naples The datasets dividedby time sequences corresponding to periods of one yearare made up of patterns for different events georeferencedcorresponding to ailments encountered in patients for whichan intervention and the subsequent histological examinationwere pointed out as well The event refers to the geoposition-ing of the location of the patient

The data have been further divided by the type of thedisease for analyzing the distribution and evolution of eachspecific disease on the area of the study

The EFCM algorithm has been encapsulated in the GISplatform ESRI ArcGIS Figure 4 shows the mask created forsetting the parameters and running the EFCM algorithm

We can set other numerical fields for adding otherfeatures to the geographical coordinates

Initially we set the initial number of clusters the fuzzifierm and the error threshold for stopping the iterations Afterrunning EFCM the number of iterations the final numberof clusters and the error calculated at the last iteration arereported The resultant clusters are shown as circular areason the map and can be saved in a new geographic layer

The final process concerns the comparative analysis ofthe hotspots obtained by the clusters corresponding to eachsubsets of events

Advances in Fuzzy Systems 5

Figure 5 Analysis of the spatio-temporal evolution of hotspotsdetected in two consecutive periods

Figure 6 Edema of bilateral Reinke diseasemdashyear 2011 display ofthe hotspots on the map

Figure 5 shows an example of display on the map ofhotspots obtained as final clusters for two consecutive subsetof events

In order to assess the expansion and the displacementof a hotspot we measure the radius of the hotspot and thedistance between the centroids of two intersecting hotspots

In the next section we present the results obtained byapplying this method for the data corresponding to surgicalinterventions to the oto-laryngo-pharyngeal apparatus inpatients residents in the district of Naples between the years2008 and 2012

We divide the dataset per year and analyze various typesof diseases

Among the types of the most frequent diseases thefollowing were analyzed

(i) carcinoma(ii) edema of bilateral Reinke(iii) hypertrophy of the inferior turbinate(iv) nasal polyposis(v) bilateral vocal fold prolapse

In the next section we show the most significant resultsobtained by applying this method to the each partitioneddataset of events

Figure 7 Edema of bilateral Reinke diseasemdashyear 2012 display ofthe hotspots on the map

Table 1 Endema of bilateral Reinke diseasemdashfinal number ofclusters and final error for year

Year Initial numberof clusters

Final numberof clusters |119880

(119897)minus 119880

(119897minus1)| 120576

2008 15 4 048 times 10

minus21 times 10

minus2

2009 15 4 055 times 10

minus21 times 10

minus2

2010 15 4 071 times 10

minus21 times 10

minus2

2011 15 4 067 times 10

minus21 times 10

minus2

2012 15 3 053 times 10

minus21 times 10

minus2

4 Test Results

Wepresent the results obtained on the event dataset describedabove in the period between the years 2008 and 2012

We consider first the subset of data corresponding to theedema of bilateral Reinke disease

We fix the fuzzifier parameter to 01 the initial number ofclusters to 15 and the final iteration error to 1 times 10minus2

Table 1 shows the results obtained for each yearWe present the details relating to the comparison of the

hotspots obtained by considering the event data for the years2011 and 2012

Figures 6 and 7 show respectively the hotspots obtainedby using the pattern subset of events that occurred in the years2011 and 2012

Figure 8 shows the overlap of the hotspots obtained forthe two years in red the hotspots corresponding to the year2011 in blue the ones corresponding to the year 2012

Table 2 shows in the first two columns the labels of thehotspots in 2011 and 2012 in third (resp fourth) column theradius obtained in 2011 (resp 2012) and the distance betweenthe centroids is given in the fifth column

The results show that only hotspot 3 obtained for the year2011 remains almost unchanged in the year 2012 Insteadhotspots 1 and 2 seem to merge into a single larger hotspot(the hotspot 1 obtained for the year 2012) and hotspot 4 thatshifts about 1 km is expanded the radius of this hotspot in2012 is about 65 km (hotspot 3 obtained for the year 2012 inFigure 8)

6 Advances in Fuzzy Systems

Figure 8 Edema of bilateral Reinke diseasemdashyears 2011 and 2012display of the two hotspotsrsquo series

Table 2 Endema of bilateral Reinke diseasemdashcomparison results ofthe hotspots obtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 1724 3848 17592 1 1943 3848 15073 2 3434 3453 00744 3 3591 6519 1115

Table 3 Nasal polyposismdashcomparison results of the hotspotsobtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 3087 4951 26562 2 4915 7103 1052

Now we show the results obtained for the disease nasalpolyposis

Figure 9 shows the overlap of the hotspots obtained forthe two years 2011 and 2012

In Table 3 the comparisonrsquos results are reportedThe results in Figure 9 show that in 2011 and 2012 there

are two hotspots the one covering an area of the city ofNaples and the other coveringmanyVesuvian townsThe twohotspots which in 2011 covered a circular area with a radii ofabout 3 and 5 km respectively in 2012 cover a circular areawith radii of about 5 and 7 km respectively

The histogram in Figure 10 shows the trend of the radii ofthe two hotspots in the course of time

It is relevant the spread in recent years of the hotspot thatsurrounds the Vesuvian towns (the radius of this hotspotfrom about 2 km in the year 2008 is about 7 km in the year2012)

Another significant trend concerns the hotspots obtainedfor the carcinoma disease

Also in this case the two main hotsposts cover the cityof Naples and many Vesuvian towns In in this case we havea very high spread of the hotspot covering the city of Naples

Figure 9 Nasal polyposis diseasemdashyears 2011 and 2012 display ofthe two hotspotsrsquo series

8

7

6

5

4

3

2

1

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 10 Nasal polyposis diseasemdashhistogram showing the varia-tion of the radius of the two hotspots over time

(cfr Figure 11) in recent years the radius of this hotspot isincreased up to 95 km

5 Conclusions

The hyperspheres obtained as clusters (circles in case oftwo dimensions) by using EFCM can represent hotspots inhotspot analysis this method has a linear computationalcomplexity and is robust to noises and outliers In hotspotsanalysis the patterns are bidimensional and the features areformed by geographic coordinates the cluster prototypes arecircles that can represent a good approximation of hotspotareas and can be displayed as circular areas on the map

In this paper we present a new method that uses theEFCM algorithm for studying the spatio-temporal evolutionof hotspots in disease analysis

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

Advances in Fuzzy Systems 3

and detecting the presence of new hotspots in regions previ-ously not covered by hotspots and the absebce of hotspots inregions previously spatially included in hotspot areas

In this research we present a method for studyingthe spatio-temporal evolution of hotspots areas in diseaseanalysis we apply the EFCM algorithm for comparingin consecutive years event datasets corresponding to oto-laryngopharyngeal diseases diagnosis detected in the districtof Naples (I) Each event corresponds to the residence of thepatient who contracted the disease

We study the spatio-temporal evolution of the hotspotsanalyzing the intersections of hotspots corresponding totwo consecutive years the displacement of the centroidsthe increase or reduction of the hotspots areas and theemergence of new hotspots

In Section 2 we give an overview of the EFCM algo-rithm In Section 3 we present our method for studying thespatio-temporal evolution of hotspots in disease analysis InSection 4 we present the results of the spatio-temporal evolu-tion of hotspots for the otolaryngologist-laryngopharyngealdiseases diagnosis events detected in the district of Naples (I)Our conclusions are in Section 5

2 The EFCM Algorithm

In the EFCM algorithm we consider clustering prototypesgiven by hyperspheres in the 119899-dimensional featurersquos spaceThe 119894th hypersphere is characterized by a centroid v

119894=

(k1119894 k

119899119894) and a radius 119903

119894

Indeed if 119903119894is the radius of 119881

119894 we say that 119909

119895belongs to

119881119894if 119889119894119895le 119903119894

The radius 119903119894is obtained considering the covariance

matrix 119875119894associated with the 119894th cluster defined as

P119894=

sum

119873

119895=1119906

119898

119894119895(x119895minus k119894) (x119895minus k119894)

119879

sum

119873

119895=1119906

119898

119894119895

(7)

whose determinant gives the volume of the 119894th cluster SinceP119894is symmetric and positive it can be decomposed in the

following form

Pi = Q119894Λ119894Q119879119894 (8)

where 119876119894is an orthonormal matrix and Δ

119894= (120582

119894119895) is

a diagonal matrix The radius 119903119894is given by the following

formula (see [12])

119903119894=

1

119899

radic

119899

prod

119896=1

120582

1119899

119894119896=

radicdet (119875119894)

1119899

(9)

The objective function to be minimized is the following

119869 (119883119880 119881) =

119862

sum

119894=1

119873

sum

119895=1

119906

119898

119894119895(119889

2

119894119895minus 119903

2

119894) (10)

where the membership degrees 119906119894119895are updated as

119906119894119895= 1 times (

119862

sum

ℎ=1

((119889

2

119894119895max (0 1 minus 119903

2

119894119889

2

119894119895))

(119889

2

ℎ119895max (0 1 minus 119903

2

ℎ119889

2

ℎ119895)))

1(119898minus1)

)

minus1

=

1

sum

119862

ℎ=1(119889

2

119894119895119908119894119895119889

2

ℎ119895119908ℎ119895)

1(119898minus1)

(11)

We set 11988910158402119896119895

= 119889

2

119896119895119908119896119895

= max(0 1198892119896119895

minus 119903

2

119896) and define the

number 120593119895

= card119896 isin 1 119862 119889

1015840

119896119895= 0 for any 119895 =

1 119873 thus we obtain

119906119894119895=

1

sum

119862

ℎ=1(119889

1015840

119894119895119889

1015840

ℎ119895)

2(119898minus1)if 120593119895= 0

119906119894119895=

0 if 1198891015840119894119895gt 0

if 120593119895gt 0

1

120593119895

if 1198891015840119894119895= 0

(12)

However the usage of (12) produces the negative effect ofdiminishing the objective function (10) when a meaningfulnumber of features are placed in a cluster and this fact canprevent the separation of the clusters Then a solution to thisproblem consists in the assumption of a small starting valueof 119903119894and then it is increased gradually with the factor120573(119897)119862(119897)

where119862(119897) is the number of clusters at the 119897th iteration and120573

(119897)

is defined recursively as 120573(0) = 1 120573

(119897)= min(119862(119897minus1) 120573(119897minus1)) by

setting

119868119894119896

=

sum

119873

119895=1min (119906

119894119895 119906119896119894)

sum

119873

119895=1119906119894119895

(13)

and the symmetric matrix 119878 = (119878119894119896) where 119878

119894119896= max119868

119894119896 119868119896119894

is defined as well If 119878(119897) is the matrix 119878 at the 119897th iteration (119897 gt

1) and the threshold 120572

(119897)= 1(119862

(119897)minus 1) is introduced as limit

then two indexes 119894lowast and 119896

lowast are determined such that 119878(119897)119894lowast119896lowast

ge

120572

(119897) and thus 119894lowast and 119896

lowast are merged by setting

119906

(119897)

119894lowast119895= 119906

(119897)

119894lowast119895+ 119906

(119897)

119896lowast119895 forall119895 isin 1 119873

119862

(119897)= 119862

(119897minus1)+ 1

(14)

The 119896

lowastth row can be removed from the matrix 119880

(119897) Inother words the EFCM algorithm can be summarized in thefollowing steps

(1) The user assigns the initial number of clusters 119862

(0)

119898 gt 1 (usually119898 = 2) 120576 gt 0 the initial value 119878(0)119894119896

= 0and 120573

(0)= 1

(2) The membership degrees 119906

(0)

119894119895(119895 = 1 119873 and 119894 =

1 119862(0)) are assigned randomly

4 Advances in Fuzzy Systems

Figure 2 Example of events georeferenced on a road network

(3) The centers of the clusters V119894are calculated by using

(3)(4) The radii of the clusters are calculated by using (9)(5) 119906119894119895is calculated by using (12)

(6) The indexes 119894lowast and 119896

lowast are determined in such a waythat 119878(119897)119894lowast119896lowast assumes the possible greatest value at the 119897th

iteration(7) If |119878(119897)

119894lowast119896lowast minus 119878

(119897minus1)

119894lowast119896lowast | lt 120576 and 119878

(119897)

119894lowast119896lowast gt 120572(119897) = 1(119862

(119897minus1)minus 1)

then the 119894lowastth and 119896

lowastth clusters aremerged via (14) andthe 119896lowastth row is deleted from 119880

(119897)(8) If (6) is satisfied then the process stops otherwise go

to the step (3) for the (119897 + 1)th iteration

3 Hotspots Detection and Evolution inDisease Analysis

Each pattern is given by the event corresponding to theresidence of the patient to whom a specific disease has beendetected The two features of the pattern are the geographiccoordinates of the residence

The first step of our process is a geocoding activitynecessary for obtaining the event dataset starting by the streetaddress of the patients

To ensure an accurate matching for the geopositioning ofthe event we need the topologically correct road network andthe corresponding complete toponymic data

The starting data include the name of the street and thehouse number of the patientrsquos residence After the matchingprocess each data is converted in an event point georefer-enced on the map

In Figure 2 the road network of the district of Naples isshown the name of the street is labeled on themap the eventsare georeferenced as points on the map

Figure 3 shows the data corresponding to an eventselected on the map

After geo-referencing each event the event dataset canbe split partitioning them by time interval For example theevent in Figure 3 can be split by the field ldquoYearrdquo

For each subset of events we apply the EFCM algorithmto detect the final cluster prototypes

Figure 3 Data associated to an event on the map

Figure 4 A form created in theGIS Tool ESRIArcGIS formanagingthe EFCM process

In this research we point out the analysis of the temporalevolution and spread of oto-laryngo-pharyngeal diseasesdetected within the district of Naples The datasets dividedby time sequences corresponding to periods of one yearare made up of patterns for different events georeferencedcorresponding to ailments encountered in patients for whichan intervention and the subsequent histological examinationwere pointed out as well The event refers to the geoposition-ing of the location of the patient

The data have been further divided by the type of thedisease for analyzing the distribution and evolution of eachspecific disease on the area of the study

The EFCM algorithm has been encapsulated in the GISplatform ESRI ArcGIS Figure 4 shows the mask created forsetting the parameters and running the EFCM algorithm

We can set other numerical fields for adding otherfeatures to the geographical coordinates

Initially we set the initial number of clusters the fuzzifierm and the error threshold for stopping the iterations Afterrunning EFCM the number of iterations the final numberof clusters and the error calculated at the last iteration arereported The resultant clusters are shown as circular areason the map and can be saved in a new geographic layer

The final process concerns the comparative analysis ofthe hotspots obtained by the clusters corresponding to eachsubsets of events

Advances in Fuzzy Systems 5

Figure 5 Analysis of the spatio-temporal evolution of hotspotsdetected in two consecutive periods

Figure 6 Edema of bilateral Reinke diseasemdashyear 2011 display ofthe hotspots on the map

Figure 5 shows an example of display on the map ofhotspots obtained as final clusters for two consecutive subsetof events

In order to assess the expansion and the displacementof a hotspot we measure the radius of the hotspot and thedistance between the centroids of two intersecting hotspots

In the next section we present the results obtained byapplying this method for the data corresponding to surgicalinterventions to the oto-laryngo-pharyngeal apparatus inpatients residents in the district of Naples between the years2008 and 2012

We divide the dataset per year and analyze various typesof diseases

Among the types of the most frequent diseases thefollowing were analyzed

(i) carcinoma(ii) edema of bilateral Reinke(iii) hypertrophy of the inferior turbinate(iv) nasal polyposis(v) bilateral vocal fold prolapse

In the next section we show the most significant resultsobtained by applying this method to the each partitioneddataset of events

Figure 7 Edema of bilateral Reinke diseasemdashyear 2012 display ofthe hotspots on the map

Table 1 Endema of bilateral Reinke diseasemdashfinal number ofclusters and final error for year

Year Initial numberof clusters

Final numberof clusters |119880

(119897)minus 119880

(119897minus1)| 120576

2008 15 4 048 times 10

minus21 times 10

minus2

2009 15 4 055 times 10

minus21 times 10

minus2

2010 15 4 071 times 10

minus21 times 10

minus2

2011 15 4 067 times 10

minus21 times 10

minus2

2012 15 3 053 times 10

minus21 times 10

minus2

4 Test Results

Wepresent the results obtained on the event dataset describedabove in the period between the years 2008 and 2012

We consider first the subset of data corresponding to theedema of bilateral Reinke disease

We fix the fuzzifier parameter to 01 the initial number ofclusters to 15 and the final iteration error to 1 times 10minus2

Table 1 shows the results obtained for each yearWe present the details relating to the comparison of the

hotspots obtained by considering the event data for the years2011 and 2012

Figures 6 and 7 show respectively the hotspots obtainedby using the pattern subset of events that occurred in the years2011 and 2012

Figure 8 shows the overlap of the hotspots obtained forthe two years in red the hotspots corresponding to the year2011 in blue the ones corresponding to the year 2012

Table 2 shows in the first two columns the labels of thehotspots in 2011 and 2012 in third (resp fourth) column theradius obtained in 2011 (resp 2012) and the distance betweenthe centroids is given in the fifth column

The results show that only hotspot 3 obtained for the year2011 remains almost unchanged in the year 2012 Insteadhotspots 1 and 2 seem to merge into a single larger hotspot(the hotspot 1 obtained for the year 2012) and hotspot 4 thatshifts about 1 km is expanded the radius of this hotspot in2012 is about 65 km (hotspot 3 obtained for the year 2012 inFigure 8)

6 Advances in Fuzzy Systems

Figure 8 Edema of bilateral Reinke diseasemdashyears 2011 and 2012display of the two hotspotsrsquo series

Table 2 Endema of bilateral Reinke diseasemdashcomparison results ofthe hotspots obtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 1724 3848 17592 1 1943 3848 15073 2 3434 3453 00744 3 3591 6519 1115

Table 3 Nasal polyposismdashcomparison results of the hotspotsobtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 3087 4951 26562 2 4915 7103 1052

Now we show the results obtained for the disease nasalpolyposis

Figure 9 shows the overlap of the hotspots obtained forthe two years 2011 and 2012

In Table 3 the comparisonrsquos results are reportedThe results in Figure 9 show that in 2011 and 2012 there

are two hotspots the one covering an area of the city ofNaples and the other coveringmanyVesuvian townsThe twohotspots which in 2011 covered a circular area with a radii ofabout 3 and 5 km respectively in 2012 cover a circular areawith radii of about 5 and 7 km respectively

The histogram in Figure 10 shows the trend of the radii ofthe two hotspots in the course of time

It is relevant the spread in recent years of the hotspot thatsurrounds the Vesuvian towns (the radius of this hotspotfrom about 2 km in the year 2008 is about 7 km in the year2012)

Another significant trend concerns the hotspots obtainedfor the carcinoma disease

Also in this case the two main hotsposts cover the cityof Naples and many Vesuvian towns In in this case we havea very high spread of the hotspot covering the city of Naples

Figure 9 Nasal polyposis diseasemdashyears 2011 and 2012 display ofthe two hotspotsrsquo series

8

7

6

5

4

3

2

1

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 10 Nasal polyposis diseasemdashhistogram showing the varia-tion of the radius of the two hotspots over time

(cfr Figure 11) in recent years the radius of this hotspot isincreased up to 95 km

5 Conclusions

The hyperspheres obtained as clusters (circles in case oftwo dimensions) by using EFCM can represent hotspots inhotspot analysis this method has a linear computationalcomplexity and is robust to noises and outliers In hotspotsanalysis the patterns are bidimensional and the features areformed by geographic coordinates the cluster prototypes arecircles that can represent a good approximation of hotspotareas and can be displayed as circular areas on the map

In this paper we present a new method that uses theEFCM algorithm for studying the spatio-temporal evolutionof hotspots in disease analysis

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 4: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

4 Advances in Fuzzy Systems

Figure 2 Example of events georeferenced on a road network

(3) The centers of the clusters V119894are calculated by using

(3)(4) The radii of the clusters are calculated by using (9)(5) 119906119894119895is calculated by using (12)

(6) The indexes 119894lowast and 119896

lowast are determined in such a waythat 119878(119897)119894lowast119896lowast assumes the possible greatest value at the 119897th

iteration(7) If |119878(119897)

119894lowast119896lowast minus 119878

(119897minus1)

119894lowast119896lowast | lt 120576 and 119878

(119897)

119894lowast119896lowast gt 120572(119897) = 1(119862

(119897minus1)minus 1)

then the 119894lowastth and 119896

lowastth clusters aremerged via (14) andthe 119896lowastth row is deleted from 119880

(119897)(8) If (6) is satisfied then the process stops otherwise go

to the step (3) for the (119897 + 1)th iteration

3 Hotspots Detection and Evolution inDisease Analysis

Each pattern is given by the event corresponding to theresidence of the patient to whom a specific disease has beendetected The two features of the pattern are the geographiccoordinates of the residence

The first step of our process is a geocoding activitynecessary for obtaining the event dataset starting by the streetaddress of the patients

To ensure an accurate matching for the geopositioning ofthe event we need the topologically correct road network andthe corresponding complete toponymic data

The starting data include the name of the street and thehouse number of the patientrsquos residence After the matchingprocess each data is converted in an event point georefer-enced on the map

In Figure 2 the road network of the district of Naples isshown the name of the street is labeled on themap the eventsare georeferenced as points on the map

Figure 3 shows the data corresponding to an eventselected on the map

After geo-referencing each event the event dataset canbe split partitioning them by time interval For example theevent in Figure 3 can be split by the field ldquoYearrdquo

For each subset of events we apply the EFCM algorithmto detect the final cluster prototypes

Figure 3 Data associated to an event on the map

Figure 4 A form created in theGIS Tool ESRIArcGIS formanagingthe EFCM process

In this research we point out the analysis of the temporalevolution and spread of oto-laryngo-pharyngeal diseasesdetected within the district of Naples The datasets dividedby time sequences corresponding to periods of one yearare made up of patterns for different events georeferencedcorresponding to ailments encountered in patients for whichan intervention and the subsequent histological examinationwere pointed out as well The event refers to the geoposition-ing of the location of the patient

The data have been further divided by the type of thedisease for analyzing the distribution and evolution of eachspecific disease on the area of the study

The EFCM algorithm has been encapsulated in the GISplatform ESRI ArcGIS Figure 4 shows the mask created forsetting the parameters and running the EFCM algorithm

We can set other numerical fields for adding otherfeatures to the geographical coordinates

Initially we set the initial number of clusters the fuzzifierm and the error threshold for stopping the iterations Afterrunning EFCM the number of iterations the final numberof clusters and the error calculated at the last iteration arereported The resultant clusters are shown as circular areason the map and can be saved in a new geographic layer

The final process concerns the comparative analysis ofthe hotspots obtained by the clusters corresponding to eachsubsets of events

Advances in Fuzzy Systems 5

Figure 5 Analysis of the spatio-temporal evolution of hotspotsdetected in two consecutive periods

Figure 6 Edema of bilateral Reinke diseasemdashyear 2011 display ofthe hotspots on the map

Figure 5 shows an example of display on the map ofhotspots obtained as final clusters for two consecutive subsetof events

In order to assess the expansion and the displacementof a hotspot we measure the radius of the hotspot and thedistance between the centroids of two intersecting hotspots

In the next section we present the results obtained byapplying this method for the data corresponding to surgicalinterventions to the oto-laryngo-pharyngeal apparatus inpatients residents in the district of Naples between the years2008 and 2012

We divide the dataset per year and analyze various typesof diseases

Among the types of the most frequent diseases thefollowing were analyzed

(i) carcinoma(ii) edema of bilateral Reinke(iii) hypertrophy of the inferior turbinate(iv) nasal polyposis(v) bilateral vocal fold prolapse

In the next section we show the most significant resultsobtained by applying this method to the each partitioneddataset of events

Figure 7 Edema of bilateral Reinke diseasemdashyear 2012 display ofthe hotspots on the map

Table 1 Endema of bilateral Reinke diseasemdashfinal number ofclusters and final error for year

Year Initial numberof clusters

Final numberof clusters |119880

(119897)minus 119880

(119897minus1)| 120576

2008 15 4 048 times 10

minus21 times 10

minus2

2009 15 4 055 times 10

minus21 times 10

minus2

2010 15 4 071 times 10

minus21 times 10

minus2

2011 15 4 067 times 10

minus21 times 10

minus2

2012 15 3 053 times 10

minus21 times 10

minus2

4 Test Results

Wepresent the results obtained on the event dataset describedabove in the period between the years 2008 and 2012

We consider first the subset of data corresponding to theedema of bilateral Reinke disease

We fix the fuzzifier parameter to 01 the initial number ofclusters to 15 and the final iteration error to 1 times 10minus2

Table 1 shows the results obtained for each yearWe present the details relating to the comparison of the

hotspots obtained by considering the event data for the years2011 and 2012

Figures 6 and 7 show respectively the hotspots obtainedby using the pattern subset of events that occurred in the years2011 and 2012

Figure 8 shows the overlap of the hotspots obtained forthe two years in red the hotspots corresponding to the year2011 in blue the ones corresponding to the year 2012

Table 2 shows in the first two columns the labels of thehotspots in 2011 and 2012 in third (resp fourth) column theradius obtained in 2011 (resp 2012) and the distance betweenthe centroids is given in the fifth column

The results show that only hotspot 3 obtained for the year2011 remains almost unchanged in the year 2012 Insteadhotspots 1 and 2 seem to merge into a single larger hotspot(the hotspot 1 obtained for the year 2012) and hotspot 4 thatshifts about 1 km is expanded the radius of this hotspot in2012 is about 65 km (hotspot 3 obtained for the year 2012 inFigure 8)

6 Advances in Fuzzy Systems

Figure 8 Edema of bilateral Reinke diseasemdashyears 2011 and 2012display of the two hotspotsrsquo series

Table 2 Endema of bilateral Reinke diseasemdashcomparison results ofthe hotspots obtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 1724 3848 17592 1 1943 3848 15073 2 3434 3453 00744 3 3591 6519 1115

Table 3 Nasal polyposismdashcomparison results of the hotspotsobtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 3087 4951 26562 2 4915 7103 1052

Now we show the results obtained for the disease nasalpolyposis

Figure 9 shows the overlap of the hotspots obtained forthe two years 2011 and 2012

In Table 3 the comparisonrsquos results are reportedThe results in Figure 9 show that in 2011 and 2012 there

are two hotspots the one covering an area of the city ofNaples and the other coveringmanyVesuvian townsThe twohotspots which in 2011 covered a circular area with a radii ofabout 3 and 5 km respectively in 2012 cover a circular areawith radii of about 5 and 7 km respectively

The histogram in Figure 10 shows the trend of the radii ofthe two hotspots in the course of time

It is relevant the spread in recent years of the hotspot thatsurrounds the Vesuvian towns (the radius of this hotspotfrom about 2 km in the year 2008 is about 7 km in the year2012)

Another significant trend concerns the hotspots obtainedfor the carcinoma disease

Also in this case the two main hotsposts cover the cityof Naples and many Vesuvian towns In in this case we havea very high spread of the hotspot covering the city of Naples

Figure 9 Nasal polyposis diseasemdashyears 2011 and 2012 display ofthe two hotspotsrsquo series

8

7

6

5

4

3

2

1

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 10 Nasal polyposis diseasemdashhistogram showing the varia-tion of the radius of the two hotspots over time

(cfr Figure 11) in recent years the radius of this hotspot isincreased up to 95 km

5 Conclusions

The hyperspheres obtained as clusters (circles in case oftwo dimensions) by using EFCM can represent hotspots inhotspot analysis this method has a linear computationalcomplexity and is robust to noises and outliers In hotspotsanalysis the patterns are bidimensional and the features areformed by geographic coordinates the cluster prototypes arecircles that can represent a good approximation of hotspotareas and can be displayed as circular areas on the map

In this paper we present a new method that uses theEFCM algorithm for studying the spatio-temporal evolutionof hotspots in disease analysis

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

Advances in Fuzzy Systems 5

Figure 5 Analysis of the spatio-temporal evolution of hotspotsdetected in two consecutive periods

Figure 6 Edema of bilateral Reinke diseasemdashyear 2011 display ofthe hotspots on the map

Figure 5 shows an example of display on the map ofhotspots obtained as final clusters for two consecutive subsetof events

In order to assess the expansion and the displacementof a hotspot we measure the radius of the hotspot and thedistance between the centroids of two intersecting hotspots

In the next section we present the results obtained byapplying this method for the data corresponding to surgicalinterventions to the oto-laryngo-pharyngeal apparatus inpatients residents in the district of Naples between the years2008 and 2012

We divide the dataset per year and analyze various typesof diseases

Among the types of the most frequent diseases thefollowing were analyzed

(i) carcinoma(ii) edema of bilateral Reinke(iii) hypertrophy of the inferior turbinate(iv) nasal polyposis(v) bilateral vocal fold prolapse

In the next section we show the most significant resultsobtained by applying this method to the each partitioneddataset of events

Figure 7 Edema of bilateral Reinke diseasemdashyear 2012 display ofthe hotspots on the map

Table 1 Endema of bilateral Reinke diseasemdashfinal number ofclusters and final error for year

Year Initial numberof clusters

Final numberof clusters |119880

(119897)minus 119880

(119897minus1)| 120576

2008 15 4 048 times 10

minus21 times 10

minus2

2009 15 4 055 times 10

minus21 times 10

minus2

2010 15 4 071 times 10

minus21 times 10

minus2

2011 15 4 067 times 10

minus21 times 10

minus2

2012 15 3 053 times 10

minus21 times 10

minus2

4 Test Results

Wepresent the results obtained on the event dataset describedabove in the period between the years 2008 and 2012

We consider first the subset of data corresponding to theedema of bilateral Reinke disease

We fix the fuzzifier parameter to 01 the initial number ofclusters to 15 and the final iteration error to 1 times 10minus2

Table 1 shows the results obtained for each yearWe present the details relating to the comparison of the

hotspots obtained by considering the event data for the years2011 and 2012

Figures 6 and 7 show respectively the hotspots obtainedby using the pattern subset of events that occurred in the years2011 and 2012

Figure 8 shows the overlap of the hotspots obtained forthe two years in red the hotspots corresponding to the year2011 in blue the ones corresponding to the year 2012

Table 2 shows in the first two columns the labels of thehotspots in 2011 and 2012 in third (resp fourth) column theradius obtained in 2011 (resp 2012) and the distance betweenthe centroids is given in the fifth column

The results show that only hotspot 3 obtained for the year2011 remains almost unchanged in the year 2012 Insteadhotspots 1 and 2 seem to merge into a single larger hotspot(the hotspot 1 obtained for the year 2012) and hotspot 4 thatshifts about 1 km is expanded the radius of this hotspot in2012 is about 65 km (hotspot 3 obtained for the year 2012 inFigure 8)

6 Advances in Fuzzy Systems

Figure 8 Edema of bilateral Reinke diseasemdashyears 2011 and 2012display of the two hotspotsrsquo series

Table 2 Endema of bilateral Reinke diseasemdashcomparison results ofthe hotspots obtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 1724 3848 17592 1 1943 3848 15073 2 3434 3453 00744 3 3591 6519 1115

Table 3 Nasal polyposismdashcomparison results of the hotspotsobtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 3087 4951 26562 2 4915 7103 1052

Now we show the results obtained for the disease nasalpolyposis

Figure 9 shows the overlap of the hotspots obtained forthe two years 2011 and 2012

In Table 3 the comparisonrsquos results are reportedThe results in Figure 9 show that in 2011 and 2012 there

are two hotspots the one covering an area of the city ofNaples and the other coveringmanyVesuvian townsThe twohotspots which in 2011 covered a circular area with a radii ofabout 3 and 5 km respectively in 2012 cover a circular areawith radii of about 5 and 7 km respectively

The histogram in Figure 10 shows the trend of the radii ofthe two hotspots in the course of time

It is relevant the spread in recent years of the hotspot thatsurrounds the Vesuvian towns (the radius of this hotspotfrom about 2 km in the year 2008 is about 7 km in the year2012)

Another significant trend concerns the hotspots obtainedfor the carcinoma disease

Also in this case the two main hotsposts cover the cityof Naples and many Vesuvian towns In in this case we havea very high spread of the hotspot covering the city of Naples

Figure 9 Nasal polyposis diseasemdashyears 2011 and 2012 display ofthe two hotspotsrsquo series

8

7

6

5

4

3

2

1

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 10 Nasal polyposis diseasemdashhistogram showing the varia-tion of the radius of the two hotspots over time

(cfr Figure 11) in recent years the radius of this hotspot isincreased up to 95 km

5 Conclusions

The hyperspheres obtained as clusters (circles in case oftwo dimensions) by using EFCM can represent hotspots inhotspot analysis this method has a linear computationalcomplexity and is robust to noises and outliers In hotspotsanalysis the patterns are bidimensional and the features areformed by geographic coordinates the cluster prototypes arecircles that can represent a good approximation of hotspotareas and can be displayed as circular areas on the map

In this paper we present a new method that uses theEFCM algorithm for studying the spatio-temporal evolutionof hotspots in disease analysis

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

6 Advances in Fuzzy Systems

Figure 8 Edema of bilateral Reinke diseasemdashyears 2011 and 2012display of the two hotspotsrsquo series

Table 2 Endema of bilateral Reinke diseasemdashcomparison results ofthe hotspots obtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 1724 3848 17592 1 1943 3848 15073 2 3434 3453 00744 3 3591 6519 1115

Table 3 Nasal polyposismdashcomparison results of the hotspotsobtained for the years 2011 and 2012

2011hotspot

Intersecting2012 hotspot

Radius 2011hotspot (km)

Radius 2012hotspot (km)

Centroidrsquosdistance (km)

1 1 3087 4951 26562 2 4915 7103 1052

Now we show the results obtained for the disease nasalpolyposis

Figure 9 shows the overlap of the hotspots obtained forthe two years 2011 and 2012

In Table 3 the comparisonrsquos results are reportedThe results in Figure 9 show that in 2011 and 2012 there

are two hotspots the one covering an area of the city ofNaples and the other coveringmanyVesuvian townsThe twohotspots which in 2011 covered a circular area with a radii ofabout 3 and 5 km respectively in 2012 cover a circular areawith radii of about 5 and 7 km respectively

The histogram in Figure 10 shows the trend of the radii ofthe two hotspots in the course of time

It is relevant the spread in recent years of the hotspot thatsurrounds the Vesuvian towns (the radius of this hotspotfrom about 2 km in the year 2008 is about 7 km in the year2012)

Another significant trend concerns the hotspots obtainedfor the carcinoma disease

Also in this case the two main hotsposts cover the cityof Naples and many Vesuvian towns In in this case we havea very high spread of the hotspot covering the city of Naples

Figure 9 Nasal polyposis diseasemdashyears 2011 and 2012 display ofthe two hotspotsrsquo series

8

7

6

5

4

3

2

1

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 10 Nasal polyposis diseasemdashhistogram showing the varia-tion of the radius of the two hotspots over time

(cfr Figure 11) in recent years the radius of this hotspot isincreased up to 95 km

5 Conclusions

The hyperspheres obtained as clusters (circles in case oftwo dimensions) by using EFCM can represent hotspots inhotspot analysis this method has a linear computationalcomplexity and is robust to noises and outliers In hotspotsanalysis the patterns are bidimensional and the features areformed by geographic coordinates the cluster prototypes arecircles that can represent a good approximation of hotspotareas and can be displayed as circular areas on the map

In this paper we present a new method that uses theEFCM algorithm for studying the spatio-temporal evolutionof hotspots in disease analysis

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

Advances in Fuzzy Systems 7

12

10

8

6

4

2

0

(km

)

2008 2009 2010 2011 2012Year

NaplesVesuvian towns

Figure 11 Carcinoma diseasemdashhistogram showing the variation ofthe radius of the two hotspots over time

We consider the residencersquos information of patients inthe district of Naples (Italy) to whom a surgical interventionto the oto-laryngo-pharyngeal apparatus was carried outbetween the years 2008 and 2012 A geocoding process isused for geo-referencing the data then the georeferenceddataset is partitionedper year and type of disease we comparethe hotspots obtained for each pair of consecutive years andanalyze the trend of each hotspot over time measuring thevariation of the radius and the distance between intersectingcluster centroids concerning two consecutive years

The results show a consistent spread in the last years ofthe nasal polyposis disease hotspot covering some Vesuviantowns and of the carcinoma disease hotspot covering the cityof Naples

References

[1] S P Chainey S Reid and N Stuart ldquoWhen is a hotspot ahotspot A procedure for creating statistically robust hotspotgeo-graphic maps of crimerdquo in Innovations in GIS 9 Socioe-conomic Applications of Geographic Information Science DKidner G Higgs and S White Eds Taylor and FrancisLondon UK 2002

[2] K Harries Geographic Mapping Crime Principle and PracticeNational Institute of Justice Washington DC USA 1999

[3] A T Murray I McGuffog J S Western and P MullinsldquoExploratory spatial data analysis techniques for examiningurban crimerdquo British Journal of Criminology vol 41 no 2 pp309ndash329 2001

[4] F Di Martino and S Sessa ldquoThe extended fuzzy c-meansalgorithm for hotspots in spatio-temporal GISrdquo Expert Systemswith Applications vol 38 no 9 pp 11829ndash11836 2011

[5] R M Mullner K Chung K G Croke and E K MensahldquoIntroduction geographic information systems in public healthandmedicinerdquo Journal ofMedical Systems vol 28 no 3 pp 215ndash221 2004

[6] K Polat ldquoApplication of attribute weighting method based onclustering centers to discrimination of linearly non-separablemedical datasetsrdquo Journal of Medical Systems vol 36 no 4 pp2657ndash2673 2012

[7] C K Wei S Su and M C Yang ldquoApplication of data miningon the develoment of a disease distribution map of screenedcommunity residents of Taipei County in Taiwanrdquo Journal ofMedical Systems vol 36 no 3 pp 2021ndash2027 2012

[8] I Gath and A B Geva ldquoUnsupervised optimal fuzzy clus-teringrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 11 no 7 pp 773ndash780 1989

[9] R Krishnapuram and J Kim ldquoClustering algorithms based onvolume criteriardquo IEEE Transactions on Fuzzy Systems vol 8 no2 pp 228ndash236 2000

[10] J C Bezdek Pattern Recognition with Fuzzy Objective FunctionAlgorithms Plenum Press New York NY USA 1981

[11] U Kaymak R BabuskaM Setnes H B Verbruggen andHMvanNauta Lemke ldquoMethods for simplification of fuzzymodelsrdquoin Intelligent Hybrid Systems D Ruan Ed pp 91ndash108 KluwerAcademic Publishers Dordrecht The Netherlands 1997

[12] U Kaymak and M Setnes ldquoFuzzy clustering with volumeprototypes and adaptive cluster mergingrdquo IEEE Transactions onFuzzy Systems vol 10 no 6 pp 705ndash712 2002

[13] F Di Martino V Loia and S Sessa ldquoExtended fuzzy c-meansclustering algorithm for hotspot events in spatial analysisrdquoInternational Journal of Hybrid Intelligent Systems vol 4 pp 1ndash14 2007

[14] F Di Martino and S Sessa ldquoImplementation of the extendedfuzzy c-means algorithm in geographic information systemsrdquoJournal of Uncertain Systems vol 3 no 4 pp 298ndash306 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article Spatiotemporal Hotspots Analysis for Exploring …downloads.hindawi.com/journals/afs/2013/385974.pdf · 2019. 7. 31. · Advances in Fuzzy Systems Let X ={x1,...,x

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014