Hyperspherical cluster based distributed anomaly detection in wireless sensor networks


Accepted Manuscript

Hyperspherical cluster based distributed anomaly detection in wireless sensor networks

Sutharshan Rajasegarar, Christopher Leckie, Marimuthu Palaniswami

PII: S0743-7315(13)00201-3
DOI: http://dx.doi.org/10.1016/j.jpdc.2013.09.005
Reference: YJPDC 3232

To appear in: J. Parallel Distrib. Comput.

Received date: 25 July 2011
Revised date: 27 June 2013
Accepted date: 13 September 2013

Please cite this article as: S. Rajasegarar, C. Leckie, M. Palaniswami, Hyperspherical cluster based distributed anomaly detection in wireless sensor networks, J. Parallel Distrib. Comput. (2013), http://dx.doi.org/10.1016/j.jpdc.2013.09.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Highlights (for review)

• Detecting anomalies in data is challenging on resource constrained networks
• We propose a distributed algorithm using hyperspherical cluster based data models
• The scheme is capable of identifying global anomalies at an individual node level
• Comparable detection accuracy with significant reduction in communication overhead
• Implemented and demonstrated on a real wireless sensor network test-bed


Hyperspherical Cluster based Distributed Anomaly Detection in Wireless Sensor Networks✩

Sutharshan Rajasegarar a,∗, Christopher Leckie b, Marimuthu Palaniswami a

a Department of Electrical and Electronic Engineering, The University of Melbourne, Victoria 3010, Australia.
b Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Abstract

This article describes a distributed hyperspherical cluster based algorithm for identifying anomalies in measurements from a wireless sensor network, and an implementation on a real wireless sensor network testbed. The communication overhead incurred in the network is minimised by clustering sensor measurements and merging clusters before sending a compact description of the clusters to other nodes. An evaluation on several real and synthetic data sets demonstrates that the distributed hyperspherical cluster-based scheme achieves comparable detection accuracy with a significant reduction in communication overhead compared to a centralised scheme, where all the sensor node measurements are communicated to a central node for processing.

Keywords: Distributed processing, Wireless sensor networks, Anomaly detection

1. Introduction

Wireless sensor networks that are deployed for monitoring purposes aim to identify interesting events that have occurred in the area under observation. Furthermore, the sensor nodes are resource constrained, in terms of their energy, memory and computational capabilities. An important factor concerning the radio communication in sensor nodes is that it constitutes the majority of the power consumption in sensor networks [1, 2]. For example, the ratio between communication and computation energy consumption per bit ranges from 10^3 to 10^4 in Sensoria sensors and Berkeley motes [3]. Moreover, Mathur et al. have observed that using

✩We acknowledge the support from the Australian Research Council (ARC) Research Network on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP); the REDUCE project grant (EP/I000232/1) under the Digital Economy Programme run by Research Councils UK - a cross council initiative led by EPSRC and contributed to by AHRC, ESRC and MRC; the ARC Linkage project grant (LP120100529) and the ARC Linkage Infrastructure, Equipment and Facilities scheme (LIEF) grant (LE120100129).

∗Corresponding author: Address: Department of Electrical and Electronic Engineering, The University of Melbourne, Victoria 3010, Australia. Phone: +61 383440112

Email addresses: [email protected] (Sutharshan Rajasegarar), [email protected] (Christopher Leckie), [email protected] (Marimuthu Palaniswami)

parallel NAND Flash technology reduces energy consumption, and enables large storage capacities for sensor nodes [4]. Therefore, it is advantageous to perform in-network processing at the nodes and reduce communication overhead in the network in order to prolong the lifetime of the wireless sensor network.

For robust and reliable functioning of the network, it is essential to identify and mitigate any misbehaviors in the network in an accurate and timely manner. Misbehavior in sensor networks can occur due to faulty sensors, gradual drift in the sensor elements, loss of calibration, noisy sensor transducers, movement of sensors, changes in observed phenomena or malicious attacks such as denial of service attacks [5]. A key challenge in wireless sensor networks is to identify any misbehaviors or interesting events (anomalies) with minimal communication overhead within the network while achieving high detection accuracy.

An anomaly (or outlier) in a set of data is defined by Barnett et al. [6] as “an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data”. In sensor networks, anomalies can occur at the level of individual measurements with respect to the other measurements at the same sensor node, or at the level of the measurements of one node with respect to other sensor nodes in the network. Hence, we have identified three categories of anomalies in sensor networks [7, 8]. First are anomalies

Preprint submitted to Elsevier June 27, 2013



in individual measurements or network traffic attributes at a sensor node, i.e., some observations at a node are anomalous with respect to the rest of the data. Second, all of the data at a sensor node may be anomalous with respect to its neighbouring nodes. In this case that sensor node will be identified as an anomalous node. Third, a set of sensor nodes in the network can be anomalous. These three types of anomalies are called first, second and third order anomalies. In this paper we propose a distributed scheme to detect first and second order anomalies.

The approach proposed in this paper is a distributed, non-parametric anomaly detection algorithm that identifies anomalous measurements at nodes. It uses data clustering to model the data at each node. Data clustering (hereafter called clustering) is the process of finding groups of similar data points, such that each group of data points is well separated [9]. In sensor networks, measurements or traffic data collected by the sensor nodes can be clustered by identifying groups of similar measurements in the data. Here similarity means the closeness of data vectors to each other. Then anomaly detection can be performed on the clustered data to classify data vectors as either normal or anomalous. Rather than communicating raw sensor measurements to a central node for analysis, the data at each node are clustered using hyperspherical clusters. Sensor nodes then report cluster summaries, rather than individual measurements, to other nodes in the network. Intermediate sensor nodes then merge cluster summaries before communicating with other nodes. This distributed scheme minimises communication overhead, which is a major source of energy consumption for wireless sensor networks. Previous attempts at distributed data clustering in sensor networks such as [10] and [11] have not considered co-operation between nodes for anomaly detection.

The effectiveness of our proposed approach is demonstrated using several real and synthetic data sets. Comparison with a centralised approach shows that our distributed approach can achieve significant reductions in communication overhead, while achieving comparable detection accuracy. Further, we compare our detection scheme with other existing schemes in the literature. We also provide a mechanism to select the parameter settings of the algorithm such that an appropriate trade-off between the detection accuracy and the communication overhead can be found. Moreover, our distributed algorithm is implemented on a real wireless sensor network consisting of SunSPOT nodes.

2. Related work

Anomaly or outlier detection has long been a research topic in the data mining and machine learning communities. Several alternative definitions of anomalies have been proposed in the literature, as well as a variety of detection algorithms. Surveys of these algorithms can be found in [12, 13, 14, 15, 6]. Furthermore, data aggregation has received considerable attention in the sensor network community and several algorithms [16, 17, 18] have been proposed in the literature. However, anomaly detection that utilizes such in-network processing capability is still an open research challenge.

Anomaly or outlier detection mechanisms used in sensor nodes to detect anomalies can be categorised into three general approaches, depending on the type of background knowledge of the data that is available [13]. The first approach finds outliers without prior knowledge of the underlying data. This approach uses unsupervised learning or clustering, and assumes that the outliers are well separated from the data points that are normal. The second approach uses supervised classification, where a classifier is trained with labeled data, i.e., the training data is marked as normal or abnormal. Then the trained classifier can be used to classify new data as either normal or abnormal. This approach requires the classifier to be retrained if the characteristics of normal and abnormal data change in the system. The third approach is novelty detection, which is analogous to semi-supervised recognition. Here a classifier learns a succinct generalisation of a given set of data, which can then be used to recognise anomalies. This approach can incrementally learn the normal model as and when data become available, so that it can adapt to changes in the distribution of the data.

We can also categorise anomaly detection techniques in terms of the type of model that they learn, e.g., a parametric or non-parametric model. In parametric techniques, the underlying assumption is that the density distribution of the data points that are being analysed for anomalies is known a priori, e.g., a Gaussian distribution. Anomaly detection techniques that do not assume any prior knowledge about the distribution of the data are called non-parametric anomaly detection techniques. These techniques are suitable for resource constrained sensor networks where the data distribution may change frequently. For example, these changes can be caused by energy depletion of sensors over the lifetime of the network, which affects the stability of the routing topology, and can thus affect anomaly-based intrusion detection. Changes or uncertainty about the



type of monitored environment can also affect the distribution of measurement values. Below we look in more detail at some of the recent non-parametric approaches that have been proposed for sensor networks, namely, clustering and nearest neighbour based approaches. Further details about anomaly detection in wireless sensor networks can be found in [8, 19].

Data clustering is a process of finding groups of similar data points such that each group of data points is well separated [9]. In this approach the data are first clustered and then anomaly detection is performed using those clusters. In [11], a cluster based technique is used to detect routing attacks in sensor networks. In this approach, each sensor node monitors the routing messages that it receives. At regular intervals, each sensor node characterises the set of routing records it has seen in terms of a vector of features, such as the number of route requests, and the average hop count to the base station. Each vector of features can be viewed as a point in a multidimensional feature space. It is assumed that attack traffic occurs far less frequently than normal traffic, and that it is statistically different to normal traffic samples, hence the attacks appear as anomalies in the feature space.

The anomaly detection process comprises two phases. The first phase is the training phase, where a training data set (with normal and abnormal traffic) is used to model the distribution of the traffic using clusters. The clustering algorithm used here is called fixed-width clustering. This algorithm creates hyperspherical clusters of fixed radius for a data set. Then the clusters are labeled as normal if they contain more than a given fraction of the total data points, or are otherwise labeled as anomalous. If any new traffic data points fall outside the normal clusters, they are labeled as anomalous. It is demonstrated that this algorithm effectively detects sinkhole attacks in sensor networks. An issue that was identified for further work was how to introduce collaboration between sensors when making a decision about the anomalies. In [10], a distributed k-means clustering algorithm is proposed for peer to peer sensor network environments. Clusters are formed at a sensor node using cluster statistics communicated by neighboring nodes, but are not used for anomaly detection.

Nearest neighbour based approaches use distances among data vectors as the similarity metric. Examples of such metrics include the distance to the nearest neighbour data vector (NN), the distance to the kth nearest neighbour data vector (kNN), and the distance to the average of the k nearest data vectors (Average kNN). Note that k is a user defined parameter. These similarity measures are used to sort the data vectors and to classify them as normal or anomalous.

Branch et al. [20] proposed a distributed unsupervised anomaly detection algorithm to identify the top-n outliers in sensor network data. They considered a sensor network topology where all the nodes can communicate with each other in the network. They used several distance metrics, such as the nearest neighbour distance and the average k-nearest neighbour distance, to compute the similarity between data vectors at every node. The identified top-n anomalies and a subset of data, called a support set, are then broadcast to all the neighbour nodes. After several iterations of communication among the nodes, each node identifies the global anomalies. This involves multiple rounds of raw data communication in the network.

Zhang et al. [21] proposed an improved framework for the above scheme by Branch et al. [20] for detecting the top-n outliers using the k-nearest neighbour approach. In this approach, all the nodes are arranged in an aggregation tree topology, where a parent-child relationship exists. In order to find global anomalies in the network, it uses three phases iteratively, namely: commit, disseminate and verify.

During the first commit phase, each node identifies the top-n anomalies in its local data. It then sends the identified anomalies and a subset of data, called a support set, to its intermediate parent node. The intermediate parent node recomputes the top-n anomalies and the support set considering all the received information from its children. Intermediate parent nodes then forward this information to their parent. This process continues until the sink node (the top-most parent node of the aggregation tree) is reached. The sink node uses all the received information from its children and computes the global anomalies, and their deviation rank, which is a metric used to measure how far each anomaly deviates from its closest data in the support set.

During the disseminate phase, the sink node then broadcasts the global anomalies and the deviation rank to all its children along the tree for verification. Then, during the verify phase, each node verifies the received global anomalies from the sink node against its local data by using the deviation rank metric. If it finds that it has another set of data vectors in its local data that may modify the global results, it then sends those data vectors and the corresponding deviation rank to the sink node to compute the global anomalies again. After several iterations of the commit, disseminate and verify phases, all the nodes will agree on a global result for the anomalies. This scheme incurs lower communication overhead than the algorithm proposed by Branch et al. [20], even though it involves some form of raw



data communication along the tree.

In [22] a cluster based distributed scheme is proposed for anomaly detection. Here the data vectors are collected at individual nodes and clustered using hyperspherical clusters. These clusters are communicated to a gateway node along the hierarchy. Anomaly detection is performed at the gateway node to identify anomalous clusters. However, anomalous data vectors are not identified at each node level. Furthermore, the clusters are merged at the intermediate parent node level along the hierarchy using a single level merging operation. This can be further improved to reduce the communication overhead in the network. In this paper, we propose a distributed anomaly detection scheme using a hyperspherical clustering algorithm and a k-nearest neighbour scheme to collaboratively detect anomalies in wireless sensor network data. Further, the anomalous data vectors are identified at each node level, and no raw data measurements are communicated in the network. A recursive cluster merging operation is used to merge the maximum number of clusters at the intermediate parent node level, consequently reducing the communication overhead in the network. The proposed scheme is implemented and evaluated on a real wireless sensor network testbed using SunSPOT nodes.

Potential applications of our proposed anomaly detection scheme include identifying inefficient energy consumption behaviour in an indoor (work) environment [23, 24], monitoring the health of coral reefs such as the Great Barrier Reef, Australia [25], and Internet of Things (IoT) [26] applications such as Smart City monitoring of noise, pollution and environmental parameters in Melbourne, Australia [27, 28, 29] and SmartSantander, Spain [30]. Below we explain our proposed scheme in detail.

3. Network Architecture and Problem statement

We consider a set of sensor nodes S = {s_j : j = 1...s} having a hierarchical topology as shown in Figure 1. The sensors are deployed in a homogeneous environment, in which the measurements taken have the same unknown distribution. All the sensor nodes are time synchronised. Note that there are several time synchronisation algorithms [31] available for sensor networks that can be utilised for this purpose. At every time interval Δ_i each sensor node s_j measures a data vector x^j_i (we sometimes refer to x^j_i as a measurement). Each data vector is composed of attributes x^j_ik, where x^j_i = {x^j_ik : k = 1...p} and x^j_i ∈ ℜ^p. After a window of n measurements, each sensor s_j has collected a set of measurements X^j = {x^j_i : i = 1...n}.

The aim is to find local and global anomalies in the data measurements collected by the nodes in the network. Local anomalies are anomalous measurements in a sensor node's own (local) data measurements, where only the measurements at the same node are used as a basis for comparison. However, we are also interested in cases where the majority of measurements at a sensor node are anomalous in comparison to other nodes in the network. These global anomalies O ⊂ X are anomalous measurements in the union of the measurements collected from multiple sensor nodes in the network, X = ⋃_{j=1...s} X^j. Local anomalies can be detected by considering local measurements of a sensor node without incurring any energy intensive communication overhead in the network. However, detecting global anomalies requires all the measurements from multiple nodes to be considered. Below we describe the approaches to detect the local and global anomalies in detail.

4. Anomaly Detection Approaches

4.1. Centralised Approach

A simple approach to detecting global anomalies is to use a centralised approach. In this approach, at the end of every time window of measurements, each sensor node s_j sends all its data to its gateway node s_g ∈ S. The gateway node s_g combines its own data X^g with the received data set X^R = ⋃_{j=1...s, j≠g} X^j to form a combined data set X = X^R ∪ X^g. A clustering algorithm is run on X to find a set of clusters C = {c_r : r = 1...c}. The clustering algorithm used here is based on fixed-width clustering, used by Eskin et al. [32] for anomaly detection. Euclidean distance is used as the dissimilarity measure between pairs of data [9]. Once the clusters are formed, outlier clusters (anomalous clusters) are classified using an anomaly detection algorithm. This algorithm identifies anomalous clusters using the K nearest neighbour distance between clusters (refer to Section 4.2.3).
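Section 4.2.3 (the cluster-level anomaly detection algorithm) lies outside this excerpt, so the following Python sketch is an illustration only: it flags a cluster as anomalous when the average distance from its centroid to its K nearest cluster centroids is more than one standard deviation above the mean of that statistic. The function name and the exact threshold rule are our assumptions, not the paper's definition.

```python
import numpy as np

def anomalous_clusters(centroids, K=2):
    """Flag clusters whose average distance to their K nearest cluster
    centroids is unusually large. The mean-plus-one-standard-deviation
    threshold is an assumption for illustration; the paper's exact
    criterion is given in its Section 4.2.3."""
    centroids = np.asarray(centroids, dtype=float)
    avg_knn = []
    for c in centroids:
        d = np.linalg.norm(centroids - c, axis=1)
        d = np.sort(d)[1:K + 1]          # skip the zero distance-to-self
        avg_knn.append(d.mean())
    avg_knn = np.asarray(avg_knn)
    threshold = avg_knn.mean() + avg_knn.std()
    return [i for i, v in enumerate(avg_knn) if v > threshold]

# Three tight clusters near the origin and one far away:
print(anomalous_clusters([[0, 0], [1, 0], [0, 1], [30, 30]], K=2))  # -> [3]
```

Because the statistic is computed over cluster centroids rather than raw measurements, the cost grows with the number of clusters, not the number of data points.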

Figure 1(a) shows an example of the centralised approach for a multi-level hierarchical topology. The data vectors at each node S2, S3, S4, S5, S6 and S7 are transmitted to the gateway node S1. The combined data vectors at S1 are used for clustering the data. Finally, the anomaly detection algorithm is run at node S1 on those clusters. However, the centralised approach has several drawbacks, including its high communication overhead and processing delay at the central node. Hence, we propose a distributed anomaly detection scheme that overcomes these limitations by minimising the communication overhead requirement in the network.

Figure 1: (a) Centralised approach (b) Distributed approach: (i) Clusters formed at each node, (ii) Distributed detection.

4.2. Distributed Approach

In the distributed approach, the anomaly detection process is distributed to all sensors in the network. First, we give the steps involved in the distributed approach and then, in the following subsections, we explain the main functions of these steps in detail.

In the distributed approach, at every time window of n measurements the following operations are performed.

Step 1 Each sensor s_j ∈ S performs the clustering operation (see Section 4.2.1) on its data X^j and produces clusters C^j = {c^j_r : r = 1...ρ_j}. Note that the number of clusters ρ_j is determined algorithmically. Also, a unique identifier (ID) is assigned to each of the clusters produced at that node. This unique ID not only identifies the cluster uniquely, but also identifies the sensor node from which it is generated. This can be easily achieved by associating the cluster ID with the unique node ID. We assume that a localisation protocol used in the network assigns unique node IDs to each node.

In order to detect the local anomalous data vectors, the anomaly detection algorithm (see Section 4.2.3) can be applied to the clusters C^j to identify the local anomalous clusters, and subsequently the local anomalous data vectors in the node. In order to detect the global anomalies, the following need to be performed.

Step 2 Sensor s_j sends a summary of its clusters (including its unique ID) to its immediate parent s_u = Parent(s_j), i.e., each sensor sends a tuple <LS_jr, n_jr, ID_jr>, where LS_jr is the linear sum of the data vectors in cluster c^j_r, n_jr is the number of data vectors in cluster c^j_r, and ID_jr is the unique cluster ID of cluster c^j_r.

Each (hyperspherical) cluster c^j_r ∈ C^j can be completely represented by its centroid and the number of data vectors it contains [33]. Let the number of data vectors in c^j_r be n_jr ≤ n and the set of data vectors contained in c^j_r be X^j_r = {x^j_q : q = 1...n_jr}. Then the linear sum of the data vectors of that cluster can be defined as LS_jr = Σ_{q=1}^{n_jr} x^j_q. Hence, the centroid of c^j_r is LS_jr / n_jr. Thus, c^j_r is summarised by n_jr and the linear sum LS_jr of the data vectors of that cluster.
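The summary tuple <LS_jr, n_jr, ID_jr> of Step 2 can be illustrated with a minimal Python sketch (the class name ClusterSummary and its field names are ours, introduced only for illustration):

```python
import numpy as np

class ClusterSummary:
    """Compact description of a hyperspherical cluster: the linear sum LS
    of its member vectors, the member count n, and a unique cluster ID,
    as sent to the parent node in Step 2."""
    def __init__(self, ls, n, cid):
        self.ls = np.asarray(ls, dtype=float)  # linear sum of member vectors
        self.n = n                             # number of member vectors
        self.cid = cid                         # unique (node, cluster) ID

    @property
    def centroid(self):
        # The centroid is recovered as LS / n, so raw vectors need not be sent.
        return self.ls / self.n

# A cluster holding vectors (1, 2) and (3, 4), produced as cluster 0 at node 5:
summary = ClusterSummary(ls=[1 + 3, 2 + 4], n=2, cid=(5, 0))
print(summary.centroid)  # -> [2. 3.]
```

Keeping the linear sum rather than the centroid itself makes merging trivial: two summaries are combined by adding their linear sums and counts, with no loss of information about the combined centroid.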

Step 3 The parent node s_u combines its clusters C^u with the clusters C = ⋃_{j ∈ children(s_u)} C^j from its immediate children and forms a combined set of clusters C^ϑ = C ∪ C^u.

Step 4 The parent node s_u merges the combined cluster set C^ϑ to produce a merged cluster set C^h = {c^h_r : r = 1...e}, where e ≤ |C^ϑ|.



See Section 4.2.2 for the cluster merging algorithm. Also note that whenever a pair of similar clusters c1 and c2 are merged to produce a merged cluster c3, a new cluster ID is assigned for c3. In addition, a new field called <Merged cluster IDs> is also added. This field contains a combination of the cluster and node IDs of c1 and c2. Hence, if the cluster c3 is identified as anomalous, then the Merged cluster IDs reveal from which clusters and from which nodes this merged cluster c3 has been produced.

Step 5 Then the parent node s_u sends the summaries of C^h to its immediate parent.

Step 6 This process continues recursively up to the gateway node s_g ∈ S, where an anomaly detection algorithm is applied to its merged clusters C^g to identify the anomalous cluster set C^a ⊂ C^g (Section 4.2.3).

Step 7 Summaries of the anomalous global clusters C^a are communicated to all the nodes in the network.

Step 8 Each node s_j, using the anomalous global clusters C^a, identifies the corresponding globally anomalous data vectors at the node.

Figure 1(b) shows an example of our distributed approach for a multi-level hierarchical topology with leaf nodes S4, S5, S6 and S7, intermediate parent nodes S2 and S3, and gateway node S1. Initially, all sensor nodes perform the clustering operation on their own data. Nodes S4 and S5 send cluster summaries to their parent node S2. Nodes S6 and S7 send cluster summaries to their parent node S3. Nodes S2 and S3 combine the received summaries from their respective children with their own clusters, and then produce merged clusters C^h. Nodes S2 and S3 send their cluster summaries to the gateway node S1. Finally, node S1 merges the clusters from its children with its own clusters, producing a merged cluster set C^g. The anomaly detection algorithm is run on C^g to identify the globally anomalous cluster set C^a. Then the globally anomalous cluster summaries (C^a) are communicated back to all the sensor nodes. Each node from S2 to S7 uses C^a to identify the globally anomalous data vectors at its node.
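The upward aggregation over the tree can be sketched end-to-end as follows. This is a simplified stand-in (a greedy single-pass merge over (linear sum, count) summaries, one point per leaf cluster); the function name, node layout and threshold value are our own assumptions, not the paper's implementation:

```python
import numpy as np

def merge(summaries, tau):
    """Greedily merge (linear_sum, count) cluster summaries whose
    centroids lie within tau of each other -- a simplified stand-in
    for the recursive merging of Section 4.2.2."""
    out = []
    for ls, n in summaries:
        ls = np.array(ls, dtype=float)
        for s in out:
            if np.linalg.norm(s[0] / s[1] - ls / n) < tau:
                s[0] += ls      # absorb: add linear sums ...
                s[1] += n       # ... and counts
                break
        else:
            out.append([ls, n])
    return out

# Leaf summaries as in Figure 1(b): S4..S7 each report two clusters.
tau = 0.5
leaf = {name: [[np.array([i, 0.0]), 1], [np.array([i + 0.1, 0.0]), 1]]
        for i, name in enumerate(["S4", "S5", "S6", "S7"])}
s2 = merge(leaf["S4"] + leaf["S5"], tau)   # intermediate parents merge
s3 = merge(leaf["S6"] + leaf["S7"], tau)   # their children's summaries
s1 = merge(s2 + s3, tau)                   # gateway merges once more
print(len(s1))  # -> 4 summaries at the gateway, versus 8 sent by the leaves
```

Each hop halves the number of summaries here, which is the communication saving the scheme relies on: parents forward merged summaries, never raw measurements.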

In this approach, globally anomalous data vectors are identified at each node level. Further, better load balancing is achieved in the network by distributing the clustering process between all the nodes. Also, the communication overhead is reduced by only sending cluster summaries instead of (all of) the raw data. This helps to extend the lifetime of the network. Note that, in the centralised case, the gateway node has complete information about the data in the network, whereas in the distributed case, the gateway node only has the merged cluster information. Therefore, there may be a slight reduction in detection accuracy for the distributed case compared to the centralised case.

In general, the computation of the global clusters can be performed at any parent node (or at any level) of the hierarchy. For example, in Figure 1(b), the global model computed at the parent node S3 will consider the clusters from its children S6 and S7 only. Then, S6 and S7 will perform global detection using the global clusters sent by S3. In this case, the region considered for distributed detection is the region covered by the nodes S3, S6 and S7. If the global clusters are computed at the top most parent node, i.e., the base station S1, then it will consider the cluster information from all the children from S2 to S7. Here, the region considered for distributed detection will be all of the nodes in the topology. This method of distribution over a hierarchical topology provides flexibility to the user in selecting the coverage region for distributed computation.

Moreover, the distributed detection approach is not limited to hierarchical topologies, such as the Tenet architecture [34]. It is applicable to any network topology where a set of sensors can communicate with their neighbors in the network. In this case, any sensor node in a connected group of sensors can be selected as a leader node for performing the global clustering and the anomaly detection on the clusters. This also gives flexibility for the leader node to select or ignore any participating neighbors for the computation of the global clustering. This provides robustness against faulty or malicious nodes in the network.

In addition to detecting interesting events in the data collected from environmental monitoring applications, the proposed approach can be used for detecting some types of malicious attacks caused by adversaries, such as routing attacks in sensor networks. Periodic route error attacks, active sinkhole attacks and passive sinkhole attacks are some examples of routing attacks that can be detected using this scheme in a similar way to [11]. Moreover, alternative routing trees can be constructed to generate alternative ways of aggregating the data. If an inconsistency is found between these alternative aggregation results, then the presence of a malicious node can potentially be detected. A detailed investigation of such an approach is beyond the scope of this paper, and is a direction for future research.

The following sections describe the fixed-width clustering algorithm, the merging algorithm and the anomaly detection algorithm in detail.

4.2.1. Fixed-width Clustering Algorithm

The clustering algorithm used here is based on the fixed-width clustering algorithm used in [32, 11]. Fixed-width clustering creates a set of hyperspherical clusters of fixed radius (width) w. The width w is a parameter specified by the user. The procedure begins by randomly choosing any data point as the centroid (center) of the first cluster with radius w. In the general step, the Euclidean distance is computed from the centroids of the current clusters to the next remaining data vector. If the distance to the closest cluster center from a data vector is less than the radius w, the data vector is added to that cluster and the centroid is updated. If the distance to the closest cluster center is more than the radius w, then a new cluster is formed with that data vector as its centroid. This procedure is shown in Algorithm 1.

The principal advantage of this simple approach is that only one pass is required through the data vectors, thus minimising storage, computation and energy consumption. This efficiency is traded against the loss of flexibility and possible accuracy engendered by using a single threshold (w) to determine all the clusters. Another disadvantage of this scheme is the (implicit) assumption that clusters in the input space are hyperspherical in shape.

Algorithm 1 Fixed-width Clustering
Require: Cluster width w and data set X = {x_i : i = 1...n}
  C ⇐ ∅ {empty cluster set C}
  Nc ⇐ 0 {number of clusters}
  Create the first cluster c_1 with x_1 as the center
  C ⇐ C ∪ {c_1}
  Nc ⇐ 1
  for i = 2 to n do
    Compute the Euclidean distance d_i between x_i and the closest cluster c_r ∈ C with center m_r
    if d_i ≤ w then
      c_r ⇐ x_i {add x_i to c_r}
      m_r ⇐ mean of the data vectors in c_r
    else
      Create a new cluster c_v with x_i as the center
      C ⇐ C ∪ {c_v}
      Nc ⇐ Nc + 1
    end if
  end for
  Output: Cluster set C = {c_r : r = 1...Nc}
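As a concrete illustration, the one-pass procedure of Algorithm 1 can be sketched in Python as follows. This is not the authors' C++ implementation; it is a minimal sketch whose function name and data representation are our own. Each cluster is kept as a (linear sum, count) pair, so the centroid is the linear sum divided by the count, which anticipates the merging step of Section 4.2.2.

```python
import numpy as np

def fixed_width_clustering(X, w):
    """One-pass fixed-width clustering (sketch of Algorithm 1).

    Clusters are stored as (linear_sum, count) pairs; the centroid of
    cluster r is linear_sums[r] / counts[r].
    """
    X = np.asarray(X, dtype=float)
    linear_sums = [X[0].copy()]   # LS_r: linear sum of vectors in cluster r
    counts = [1]                  # n_r: number of vectors in cluster r
    for x in X[1:]:
        centers = np.array(linear_sums) / np.array(counts)[:, None]
        dists = np.linalg.norm(centers - x, axis=1)
        r = int(np.argmin(dists))
        if dists[r] <= w:
            linear_sums[r] += x        # add x to the closest cluster;
            counts[r] += 1             # the centroid updates implicitly
        else:
            linear_sums.append(x.copy())  # start a new singleton cluster
            counts.append(1)
    centers = np.array(linear_sums) / np.array(counts)[:, None]
    return centers, np.array(counts)
```

Note that, as in the original algorithm, the result depends on the order in which the data vectors are processed.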

4.2.2. Merging Clusters

Two clusters are merged when they are similar. We define a pair of clusters c1 and c2 to be similar if the inter-cluster distance d(c1, c2) between their centers is less than a given merging threshold τ (≤ w). If c1 and c2 are similar, then a new cluster c3 is produced whose center is the mean of the data vectors in the clusters with centers c1 and c2, and whose number of data vectors is the sum of those in c1 and c2. This merging procedure is performed recursively until no more merging is possible (refer to Figure 2). The merging algorithm is shown in Algorithm 2.

When two clusters are merged into a new cluster with fixed width w, a small error can be introduced because the new cluster is formed while keeping the cluster width fixed. When the merging operation is performed recursively until there are no more clusters to merge, this error can accumulate. The error can be reduced by setting a small value for τ, i.e., merging only if a pair of clusters are very close to each other. However, the trade-off is an increased communication overhead in the network. In the future, the merging algorithm and the clustering algorithm will be improved to accommodate clusters with varying widths, albeit with an increased computational overhead.

Figure 2: Cluster merging, where n1 and n2 are the numbers of data vectors in clusters c1 and c2 respectively.

4.2.3. Anomaly Detection Algorithm

The anomaly detection algorithm classifies clusters as either normal or anomalous. The average inter-cluster distance of the K nearest neighbor (KNN) clusters is used [35] to identify anomalous clusters. The algorithm is as follows.

• For each cluster cβ in the cluster set C, a set of inter-cluster distances Dcβ = {d(cβ, cγ) : γ = 1...(|C| − 1), γ ≠ β} is computed. Here d(cβ, cγ) is the Euclidean distance between the centroids of cβ and cγ, and |C| is the number of clusters in the cluster set C.

• Among the set of inter-cluster distances Dcβ for cluster cβ, the shortest K (the parameter of KNN) distances are selected and, using those, the average


Algorithm 2 Merge Clusters
Require: Cluster set C = {c_r : r = 1...Nc}, merging threshold τ and cluster width w.
  Cm ⇐ ∅ {empty merged cluster set Cm}
  Cm ⇐ C {copy all the clusters from C to Cm}
  MergeClusters(Cm) {recursively merges clusters}
  Output: Merged cluster set Cm

Procedure MergeClusters(Cm)
  l = |Cm| {store initial size of Cm}
  Cu ⇐ ∅ {empty merged cluster set Cu}
  for r = 1 to Nc do
    if c_r is not merged then
      IsMergingDone = false
      for j = r + 1 to Nc do
        if c_j is not merged then
          Compute the Euclidean distance D_r between c_r and c_j
          if D_r ≤ τ then
            n_v ⇐ n_r + n_j {n_r and n_j are the numbers of data vectors in clusters c_r and c_j respectively}
            m_v ⇐ (1/n_v)(LS_r + LS_j) {LS_r and LS_j are the linear sums of the data vectors of clusters c_r and c_j respectively}
            Create a merged cluster c_v using m_v and n_v {m_v is the cluster center and n_v is the number of data vectors}
            Cu ⇐ Cu ∪ {c_v}
            Mark c_r and c_j as merged
            IsMergingDone = true
            break
          end if
        end if
      end for
      if IsMergingDone == false then
        Cu ⇐ Cu ∪ {c_r} {add c_r to the merged cluster set Cu}
      end if
    end if
  end for
  if |Cu| < l then
    MergeClusters(Cu)
  end if
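The merging step can be sketched in Python as follows, again keeping clusters as (linear sum, count) pairs so that a merged center is (LS_r + LS_j)/(n_r + n_j), as in Algorithm 2. The iterative loop below plays the role of the recursive MergeClusters call; the function name and representation are our own.

```python
import numpy as np

def merge_clusters(linear_sums, counts, tau):
    """Repeatedly merge cluster pairs whose centers are within tau of
    each other (sketch of Algorithm 2). Stops when a full pass makes
    no merge, mirroring the recursion termination condition."""
    while True:
        n_before = len(counts)
        merged_ls, merged_n = [], []
        used = [False] * n_before
        centers = [ls / n for ls, n in zip(linear_sums, counts)]
        for r in range(n_before):
            if used[r]:
                continue
            for j in range(r + 1, n_before):
                if used[j]:
                    continue
                if np.linalg.norm(centers[r] - centers[j]) <= tau:
                    # merged center is (LS_r + LS_j) / (n_r + n_j)
                    merged_ls.append(linear_sums[r] + linear_sums[j])
                    merged_n.append(counts[r] + counts[j])
                    used[r] = used[j] = True
                    break
            if not used[r]:
                merged_ls.append(linear_sums[r])  # keep unmerged cluster
                merged_n.append(counts[r])
                used[r] = True
        linear_sums, counts = merged_ls, merged_n
        if len(counts) == n_before:   # no merge happened; stop
            return linear_sums, counts
```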

inter-cluster distance ICDβ of cluster cβ is computed as follows:

  ICDβ = (1/K) ∑_{γ=1, γ≠β}^{K} d(cβ, cγ)                 if K ≤ |C| − 1

  ICDβ = (1/(|C| − 1)) ∑_{γ=1, γ≠β}^{|C|−1} d(cβ, cγ)     if K > |C| − 1

where, in the first case, the sum runs over the K nearest neighbor clusters of cβ.

Our average inter-cluster distance computation differs from the one proposed by Chan et al. [36] in the following way. Chan et al. used the whole cluster set C to compute the average inter-cluster distance ICDβ for a cluster cβ, whereas here only the K nearest neighbor clusters of cβ are used to compute the average inter-cluster distance ICDβ. The advantage of this approach is that clusters at the edge of a dense region are not overly penalised compared to clusters in the center of the region.

• A cluster is identified as anomalous if its average inter-cluster distance ICDβ is more than ψ standard deviations SD(ICD) of the inter-cluster distances above the mean inter-cluster distance AVG(ICD), i.e., the set of anomalous clusters Ca ⊂ C is defined as

  Ca = {cβ ∈ C | ICDβ > AVG(ICD) + ψ × SD(ICD)}    (1)

where ICD is the set of average inter-cluster distances.

Figure 3 shows an example of normal and anomalous clusters.

Figure 3: Example of normal and anomalous clusters.
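A compact sketch of this detection rule, i.e., the K-nearest-neighbour average inter-cluster distance followed by the threshold of Eq. (1), might look like the following. The function name is ours, and a brute-force pairwise distance matrix over cluster centroids is used for clarity.

```python
import numpy as np

def anomalous_clusters(centers, K, psi):
    """Flag clusters whose average distance to their K nearest
    neighbor clusters exceeds AVG(ICD) + psi * SD(ICD) (Eq. (1))."""
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    # pairwise Euclidean distances between cluster centroids
    D = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    k = min(K, n - 1)                    # handles the K > |C| - 1 case
    icd = np.empty(n)
    for b in range(n):
        others = np.delete(D[b], b)      # distances to the other clusters
        icd[b] = np.sort(others)[:k].mean()  # average of the k shortest
    threshold = icd.mean() + psi * icd.std()
    return np.where(icd > threshold)[0], icd
```

For instance, four tightly packed cluster centers plus one distant center yield the distant one as the single anomalous cluster for K = 2 and ψ = 1.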

4.3. Complexity Analysis

The centralised approach requires memory for storing data measurements of O(np) at each node and O(snp) at the gateway node, where s is the number of sensor nodes in the network, p is the number of dimensions in a data vector, and n is the number of data vectors at a sensor node. The computational overhead is incurred only at the gateway node, and its complexity is O(snNc), where Nc (≪ n) is the number of clusters. The communication complexity per link (i.e., for communicating between a pair of nodes) is O(np).

The complexity for the distributed approach is as follows.

The fixed-width clustering algorithm requires a single pass over the data at each node. For each data vector, it computes the distance to each existing cluster. Hence the computational complexity is O(nNc). The memory complexity for each sensor node is O(Nc p), since each cluster requires a fixed-length record.

The cluster merging operation compares cluster pairs recursively in ς iterations, with computational complexity O(Nc²ς). The anomaly detection algorithm compares each cluster to all other clusters in the cluster set C to find the K nearest neighbors, with computational complexity O(Nc²).

In the distributed algorithm, each node j, for each cluster c_r^j, needs to communicate the linear sum of data vectors LS_r^j, the number of data vectors in the cluster n_r^j, and the cluster ID ID_r^j between nodes. Further, after the global anomalous clusters are found at the central node, Na anomalous clusters are sent back to the nodes. Therefore, the communication complexity per link is O(Nc p + 2Nc + Na p + 2Na).

In summary, in the distributed anomaly detection algorithm, each sensor node requires memory to keep Nc clusters, with maximum complexity O(Nc p), and memory for data measurements of O(np). Each sensor incurs a maximum O(nNc) computational complexity and a maximum O((Nc + Na)(p + 2)) communication complexity. Table 1 provides a summary of the complexity per sensor node.

5. Evaluation

The aim of this evaluation is to compare the performance of our proposed cluster-based distributed anomaly detection approach with the centralised approach and other existing schemes in the literature. Furthermore, we aim to provide a mechanism for selecting the parameter settings of the algorithm such that a suitable trade-off between the detection accuracy and the communication overhead is found. Moreover, we aim to investigate whether our proposed algorithm can be applied in a real-life scenario by implementing it on a network consisting of real wireless sensor nodes.

We use two real sensor network deployment data sets and two synthetic data sets for evaluation purposes, namely the IBRL, GDI, Banana and Gaussmix data sets. We now describe each data set in detail.

5.1. Methodology

The IBRL data is a publicly available set of sensor measurements from a wireless sensor network consisting of 55 Mica2Dot sensor nodes (including a gateway node), which was deployed in the Intel Berkeley Research Laboratory (IBRL) in Berkeley, California, USA [37, 38]. This deployment is an example of indoor environmental monitoring using sensor networks. The sensors collect five measurements at 31-second intervals: temperature in degrees Celsius, light in Lux, temperature-corrected relative humidity as a percentage, voltage in volts, and topology information. Figure 4(a) shows the sensor deployment in the laboratory. Sensor nodes are labeled from 0 to 54, where node 0 is the gateway node, which only transmits data to a host computer and does not collect any sensor measurements. The sensors collected measurements for a period between 28th February 2004 and 5th April 2004 [38]. All the measurements were communicated to the gateway node via multiple hops. We consider a four-hour window of measurements collected on 1st March 2004 during the time interval from 00:00 to 03:59. During this window, nodes 5 and 15 did not contain any data. For ease of visualisation, we focus on two types of features: temperature and humidity. Since the original data did not contain any labels as to which data is normal and anomalous, we use scatter plots to visually identify and label them as normal and anomalous using the following procedure. We first make a visual inspection of the observed data to highlight the presence of any apparent anomalies in the data. Figure 4(b) is the scatter plot of the combined measurements from all 54 nodes. In this plot, two apparent anomalies can be observed. One possible anomaly is a small collection of data vectors in the lower right-hand corner of the plot that differ significantly from the mass of the data. This anomalous data constitutes a part of the data from node 14. Second, there are three visual "islands" of data coming from node 37 (all of its data) which lie "above" the mass of the data. We label these visually apparent anomalous data vectors as anomalies, and the rest of the data vectors as normal for our evaluation purposes. We formed a single-level hierarchical topology with node 1 as the gateway node and the others (node 2 to node 54) as leaf nodes. Note that in our evaluation we omit node 0 (the gateway node) and nodes 5 and 15, which did not contain any data (see Figure 4(a)).

The other real data set that we used was the set of sensor measurements collected at the Great Duck Island


Table 1: Comparison of the complexities at a sensor node for the distributed and centralised schemes, where n is the number of data vectors at a node, p is the dimension of a data vector, Nc is the number of clusters at a node (p, Nc ≪ n), Na is the number of global anomalous clusters (Na ≪ n), and s is the number of sensors in the network.

Scheme      | Computation (per node)     | Memory (per node)         | Communication (per link)
Distributed | O(nNc)                     | O(Nc p + np)              | O((Nc + Na)(p + 2))
Centralised | O(snNc) (at central node)  | O(snp) (at central node)  | O(np)

Figure 4: (a) Sensor node locations in the IBRL deployment. Node locations are shown in black with their corresponding node IDs. Node 0 is the gateway node (reproduced from [38]). (b) Scatter plot of IBRL data at the centralised node S_H.

Project [39]. The network was deployed for monitoring the habitat of a sea bird called the Leach's Storm Petrel [40]. This is an example of an outdoor environmental monitoring deployment. In every five-minute interval, each sensor recorded light, temperature, humidity, and pressure readings. Seven sensor nodes were selected for the evaluation, namely nodes 101, 109, 111, 116, 118, 122 and 123, over a 24-hour period on 1st July 2003. Three features were used: the humidity, temperature and pressure readings from each sensor node. 3D scatter plots were made of the data at each node, and each data set was manually cleaned to remove grossly invalid attribute values, such as negative values for the humidity readings. The cleaned data were labeled as Normal and used for the evaluation. A three-level hierarchical topology as shown in Figure 1 was formed by using node 101 as the gateway node, nodes 109 and 111 as the intermediate parent nodes, and the other four nodes as leaf nodes.

We generated two types of anomalous data to use with this normal GDI data set. First, a uniformly distributed, randomly generated set of anomalous data (20 data vectors for each node) was introduced into the tails of the distribution of each feature for two of the nodes (nodes 118 and 123). These introduced anomalous data measurements were labeled as Anomalies, and we refer to this data set as GDI-ISO, which stands for GDI data with isolated anomalies. The second type of anomalous data is a randomly generated set of anomalous data added to the normal GDI data at each node. The anomalous data for each attribute comprised a set of vectors drawn from a uniform distribution over the normal measurements of each attribute. The number of introduced anomalous data vectors was 10% of the normal measurements, and they were labeled as Anomalies. We refer to this data set as GDI-UNI, which stands for GDI data with uniform anomalies. Both GDI-ISO and GDI-UNI were normalised to the range [0, 1]. The scatter plots of the combined data vectors from all seven nodes are shown in Figure 5. The first synthetic data set used was a Banana data set [41], where the normal and the anomalous data are distributed in the shape of a banana. A single-level hierarchical network topology consisting of 15 nodes was used for the evaluation. Each node comprises 100 data vectors of two dimensions, each with 10% anomalous data. This Banana data set (including the anomalous data) was generated using the gendatb function from DDtools [41]. Figure 6(a)


Table 2: Data set description

Data set type | Data set name | Network topology (levels of hierarchy) | Number of dimensions | Number of nodes
Real          | IBRL          | Single                                 | 2                    | 52
Real          | GDI-ISO       | Three                                  | 3                    | 7
Real          | GDI-UNI       | Three                                  | 3                    | 7
Synthetic     | Banana        | Single                                 | 2                    | 15
Synthetic     | Gaussmix      | Single                                 | 2                    | 15

Figure 5: Scatter plots of the GDI-UNI (left) and GDI-ISO (right) data sets, with humidity (%), temperature (deg C) and pressure axes. Blue crosses (x) represent normal data and red stars (*) represent anomalous data.

Figure 6: Scatter plots of the (a) Banana and (b) Gaussmix data sets (attribute 1 vs attribute 2). The blue stars (*) represent the normal vectors and the red crosses (x) represent the anomalous vectors.

shows a scatter plot of the Banana data set, which combines the data vectors from all 15 nodes.

The second synthetic data set is called Gaussmix. The Gaussmix data set is similar to the one used by Subramaniam et al. [42]. It consists of two features, each generated from a mixture of Gaussian distributions with means randomly selected from (0.3, 0.35, 0.45) and with a standard deviation of 0.03. Uniformly distributed noise (anomalies) ranging between [0.50, 1] was introduced into each feature of the data. The data for 15 sensor nodes were created, including 5% noise data at every node. Each node comprises 105 data vectors. The whole data set is normalised to the range [0, 1]. A single-level hierarchical network topology was formed using these nodes.
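As an illustration, one node's worth of Gaussmix-style data could be generated as follows. The paper does not give generation code, so the function below is our own sketch of the stated recipe: Gaussian mixture means chosen from (0.3, 0.35, 0.45) with standard deviation 0.03, plus uniform anomalies in [0.50, 1], giving 105 vectors per node with roughly 5% noise.

```python
import numpy as np

def make_gaussmix_node(n_normal=100, n_noise=5, seed=0):
    """Generate one node of Gaussmix-style data (sketch): two features,
    each drawn from a Gaussian whose mean is picked from (0.3, 0.35,
    0.45) with std 0.03, plus uniform noise (anomalies) in [0.50, 1]."""
    rng = np.random.default_rng(seed)
    # per-vector, per-feature mixture-component means
    means = rng.choice([0.3, 0.35, 0.45], size=(n_normal, 2))
    normal = rng.normal(means, 0.03)
    noise = rng.uniform(0.50, 1.0, size=(n_noise, 2))
    X = np.vstack([normal, noise])
    labels = np.r_[np.zeros(n_normal), np.ones(n_noise)]  # 1 = anomaly
    return X, labels
```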

Table 2 gives a summary of the data sets used in ourevaluation.

5.2. Results

The centralised and distributed cluster-based anomaly detection algorithms were implemented in C++. All the data sets were normalised by subtracting the mean and dividing by the standard deviation before use in the evaluation. First we evaluate the distributed and centralised schemes using the GDI-ISO data set, by changing one of the parameters (w or K) at a time while the other is fixed. The cluster width w ranged from 0.02 to 2.02 in intervals of 0.02, and the KNN parameter K ranged from 1 to 51. We used τ = w and ψ = 1 in our evaluations. For each simulation, the false positive and false negative rates were calculated. A false positive occurs when a normal measurement is identified as anomalous by an algorithm, and a false negative occurs when an anomalous measurement is identified as normal. The false positive rate (FPR) is the ratio between the number of false positives and the number of actual normal measurements, and the false negative rate (FNR) is the ratio between the number of false negatives and the number of actual anomalous measurements. The detection rate (DR) is given by DR = 100 − FNR. During the evaluation, the number of data vectors and the number of clusters communicated were also recorded as a measure of the reduction in communication overhead. Here, results are reported for the distributed and centralised detection at the top-most parent node (gateway node) of the network topology.
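These rate definitions translate directly into code; a small sketch (function name ours):

```python
def detection_metrics(pred_anom, true_anom):
    """FPR, FNR and DR (%) as defined in the text:
    FPR = FP / #normal, FNR = FN / #anomalous, DR = 100 - FNR."""
    fp = sum(p and not t for p, t in zip(pred_anom, true_anom))
    fn = sum(t and not p for p, t in zip(pred_anom, true_anom))
    n_normal = sum(not t for t in true_anom)
    n_anom = sum(bool(t) for t in true_anom)
    fpr = 100.0 * fp / n_normal
    fnr = 100.0 * fn / n_anom
    return fpr, fnr, 100.0 - fnr   # DR = 100 - FNR
```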

Figures 7(a) and 7(b) show graphs of the false positive and detection rates as a function of the cluster width for the centralised and distributed schemes. The KNN parameter is fixed at K = 3, and the trends seen in these two graphs are very similar. The lower cluster widths (e.g., w = 0.02 for the centralised scheme) yield higher false positive rates because at lower cluster widths, a larger number of clusters are produced (some


Figure 7: Detection rate (DR) and false positive rate (FPR) (%) vs cluster width (w) for the GDI-ISO data set: (a) centralised scheme, (b) distributed scheme.

of the clusters may even be singletons). Consequently, there is a greater chance of the anomalous data being divided into multiple clusters in close proximity. This results in the average inter-cluster distances being smaller for the anomalous data, rendering them undetectable.

Similarly, at higher cluster widths (> 0.7 for the centralised scheme) the detection rate again decreases while the false positive rate becomes zero. This is because, at larger cluster widths, a small number of large clusters are produced. Hence there is a higher chance of the anomalous data being included in the larger normal clusters. The range of cluster widths at which the system performs well is w ∈ [0.06, 0.46] for the centralised scheme and w ∈ [0.06, 0.22] for the distributed scheme. This shows that the cluster width (w) is an important parameter for achieving good detection performance. In practice, a good cluster width can be selected by training the system before deployment and then periodically adjusting w based on the detection accuracy and changes in the monitored phenomena in the environment of the network.

Figures 8(a) and 8(b) show graphs of the false positive and detection rates as a function of the KNN parameter for the centralised and distributed schemes. Here the cluster width is fixed at w = 0.18. These graphs show that there is a threshold for the KNN parameter beyond which detection performance begins to diminish. In this evaluation, the threshold KNN value for the centralised scheme is 10, and for the distributed scheme it is 3, i.e., for KNN values ≤ 10 (for the centralised scheme), false positive rates are much lower. This shows that the KNN value (K) is also an important parameter for achieving good detection performance. In practice, a good KNN value can be selected by training the system before deployment. Furthermore, the number of global clusters produced after the merging operation for the distributed scheme is 15 (see Figure 9(a)). From the empirical results, the KNN parameter may be selected as a percentage of the number of clusters produced in the scheme (e.g., 20% of the final number of clusters produced by the distributed detection algorithm).

When comparing the distributed and the centralised schemes in Figures 7 and 8, it can be noticed that high detection performance is observed over a larger range of cluster widths and a larger range of KNN values for the centralised scheme compared to the distributed scheme. This is because the number of clusters produced by the centralised scheme (24) is larger than the number produced by the distributed scheme (15) at cluster width 0.18 (see Figure 9(a)). Note that in the distributed scheme, the cluster merging process is performed to combine similar clusters in order to minimise the communication overhead in the network. Hence the useful window of values for the cluster width and the KNN parameter is smaller for the distributed scheme than for the centralised scheme.

The average false positive rate for both the distributed and the centralised case is about 3% (refer to Figure 7). From this it follows that the distributed and centralised schemes achieve roughly the same accuracy. Moreover, a significant saving in communication overhead is achieved by the distributed approach compared to the centralised approach. Figure 9(b) shows that the distributed approach, when compared to the centralised case (with K = 3), realises savings in communication overhead that range from 86.04% to 97.95% over the cluster width range 0.06 to 0.22.

Next, we compare the detection performance using the other data sets. We performed simulations with the following parameters. Both the distributed and centralised algorithms were evaluated for cluster width values w ranging from 0.001 to 0.201 in intervals of 0.002, and for KNN parameter values K ranging from 1 to 25. In each simulation, the FPR and DR are recorded for each (w, K) pair. Receiver operating characteristic (ROC) curves, which


Figure 8: Detection rate (DR) and false positive rate (FPR) (%) vs the KNN parameter (K) for the GDI-ISO data set: (a) centralised scheme, (b) distributed scheme.

Figure 9: (a) Number of clusters produced for the centralised and distributed scenarios vs cluster width (w). (b) Percentage reduction in the communication overhead in the network vs cluster width (w). The reduction in the communication overhead is calculated as (the number of data vectors transmitted in the centralised case − the number of clusters transmitted in the distributed case) / (the number of data vectors transmitted in the centralised case).

are DR vs FPR curves, are drawn by varying one of the parameters while the other is fixed. The area under the ROC curve (AUC) is computed while varying these parameters. The closer the AUC value is to 1, the better the classifier performance. AUC values of 0.5 and lower indicate that the performance of a classifier is poorer than random guessing. Here, results are reported for the distributed and centralised detection at the top-most parent node (gateway node) of the network topology.

Figures 10, 11, 12, 13 and 14 show these graphs for all five data sets separately. When the AUC values obtained for the centralised and distributed schemes are compared for each data set, they reveal that the distributed algorithm generally achieves a detection accuracy comparable to the centralised scheme, with significant savings in communication overhead. Overall, the savings in communication overhead depend on the cluster width parameter, which affects the number of clusters generated.

In the figures showing the percentage savings in communication overhead (SCO) vs cluster width (see Figures 10(c), 11(c), 12(c), 13(c) and 14(c)), negative values of the SCO can be observed for smaller cluster widths. The reason is as follows. In the centralised scheme, the data vectors from each node are communicated to the gateway node once. In the distributed scheme, by contrast, two communication overheads are incurred: first, when each sensor node sends its data clusters up the hierarchy to the gateway node, and second, when the gateway node sends the global anomalous clusters back down the hierarchy to all the sensor nodes. Therefore, if the cluster width is small, a larger number of clusters is produced at each node (sometimes singleton clusters for very small cluster widths). Hence, in total, the number of clusters communicated up and down the hierarchy in the distributed scheme becomes larger than the number of raw measurements communicated up the hierarchy (only once) in the centralised scheme. This results in negative values of the SCO. It is advantageous to operate the distributed scheme in the range of cluster widths that gives a positive value for the SCO and an AUC above 0.5.
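This up-and-down accounting can be written out directly; the following is a simplified per-link sketch (counting transmitted items rather than bytes, with a function name of our own):

```python
def savings_in_comm_overhead(n_vectors, n_clusters_up, n_anom_down):
    """Percentage savings in communication overhead (SCO): the
    centralised scheme sends the raw vectors up once, while the
    distributed scheme sends clusters up plus the global anomalous
    clusters back down. Negative SCO means the distributed scheme
    communicated more."""
    centralised = n_vectors
    distributed = n_clusters_up + n_anom_down
    return 100.0 * (centralised - distributed) / centralised
```

For example, 100 raw vectors versus 10 clusters up and 4 anomalous clusters down gives an SCO of 86%, while 110 near-singleton clusters up and 10 down gives −20%.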


Figure 10: Area under the ROC curve (AUC) vs cluster width (w) for (a) centralised detection and (b) distributed detection, and (c) the percentage savings in communication overhead, for the GDI-ISO data.

Figure 11: Area under the ROC curve (AUC) vs cluster width (w) for (a) centralised detection and (b) distributed detection, and (c) the percentage savings in communication overhead, for the GDI-UNI data.

The communication savings obtained for the GDI-ISO and IBRL data sets (Figures 10(c) and 12(c)) are higher than those obtained for the GDI-UNI, Banana, and Gaussmix data sets (refer to Figures 11(c), 13(c) and 14(c)). This is because, in the GDI-ISO and IBRL data sets, the anomalies lie a larger distance apart from the normal data, and are also concentrated in a particular region of the space (for example, see the anomalies shown in Figure 4(b)). For the Gaussmix and GDI-UNI data sets, by contrast, the anomalies are spread over a much larger space, hence requiring more clusters to capture them. For the Banana data set (refer to Figure 6(a)), the anomalies are not so distant from the normal data. Hence, to achieve a higher detection rate, smaller-width clusters are needed to capture the less distant anomalies while ensuring that normal data are not included. This is evident from the AUC plots in Figures 13(a) and 13(b), where AUC values above 0.7 are achieved for cluster width values w ≤ 0.015 in both the centralised and distributed cases. The smaller cluster widths needed for good detection performance result in a larger number of clusters being produced. This is the reason for the lower savings in communication overhead compared to the GDI-ISO and IBRL data sets.

The above evaluation demonstrates that our distributed algorithm is capable of detecting a variety of anomalies with detection accuracy comparable to that of a centralised scheme, but with varying efficiency in terms of communication overhead, depending on how the anomalies are spread and located in the data set. If the anomalies are distant and concentrated, our scheme achieves very high savings in communication overhead (e.g., 50% savings in communication overhead with an AUC of about 0.98 in the case of the GDI-ISO data set). If the anomalies are less concentrated and spread over a large space, or if they lie much closer to the normal data, the savings in communication will be smaller. Furthermore, the above graphs provide a mechanism to choose the right parameter values (e.g., cluster width) such that a given detection accuracy is achieved for a known communication overhead in the network. This is useful in practice, as a variety of commercial sensor nodes (sensor networks) exist with varying resources, such as computation and battery life. Hence this gives a guide for selecting the right parameter settings for a chosen sensor network platform, and for achieving the desired detection accuracy with knowledge of the amount of communication overhead that will be incurred in the network.


Figure 12: Area under the ROC curve (AUC) vs cluster width (w) for (a) centralised detection and (b) distributed detection, and (c) the percentage savings in communication overhead, for the IBRL data.

Figure 13: Area under the ROC curve (AUC) vs cluster width (w) for (a) centralised detection and (b) distributed detection, and (c) the percentage savings in communication overhead, for the Banana data.

Table 3: Detection rate (DR %) and false positive rate (FPR %) of various detection schemes for the IBRL, GDI-UNI and GDI-ISO data sets.

IBRL
  Scheme | HSCBS           | NN     | KNNM           | AvgKNN         | DBO
  DR     | 100             | 6.43   | 11.43          | 6.43           | 100
  FPR    | 1.05            | 5.87   | 5.87           | 5.98           | 2.41
  Params | w=0.027, KNN=4  | N=6%   | N=6%, KNN=10   | N=6%, KNN=10   | r=0.1105, α=282

GDI-UNI
  Scheme | HSCBS           | NN     | KNNM           | AvgKNN         | DBO
  DR     | 85.47           | 87.21  | 86.62          | 88.37          | 92.44
  FPR    | 1.48            | 4.27   | 4.33           | 4.15           | 7.29
  Params | w=0.005, KNN=4  | N=12%  | N=12%, KNN=10  | N=12%, KNN=10  | r=0.0601, α=5

GDI-ISO
  Scheme | HSCBS           | NN     | KNNM           | AvgKNN         | DBO
  DR     | 100             | 0      | 100            | 100            | 100
  FPR    | 1.462           | 10.11  | 7.73           | 7.73           | 0.24
  Params | w=0.059, KNN=4  | N=10%  | N=10%, KNN=60  | N=10%, KNN=60  | r=0.4098, α=63

6. Comparison with Other Schemes

Next we compare the effectiveness of our hyperspherical cluster based scheme (HSCBS) with other existing anomaly detection schemes from the literature. Three of the comparison methods search for the top N anomalies (in the terminology of [20]) using three different distance metrics: (1) the nearest neighbour distance (NN), (2) the Kth nearest neighbour distance method (KNNM), and (3) the average of the K nearest neighbour distances (AvgKNN). The fourth comparison method is the distance-based outliers algorithm proposed by Knorr et al. [43], which we denote as DBO. In DBO, a data vector x is anomalous if at most a given number α of the data vectors in the data set X lie within a given distance r of x. The N value for the top-N anomalies is set as a percentage of the number of data vectors. The K values for KNNM and AvgKNN are selected after performing a systematic search for each algorithm. All the results are reported for the centralised scenario.
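For reference, the DBO rule can be sketched as a brute-force Python function. Knorr et al.'s actual algorithm uses more efficient index or cell-based structures; the name and the O(n²) pairwise-distance implementation here are our own.

```python
import numpy as np

def dbo_outliers(X, r, alpha):
    """Distance-based outliers in the style of DBO: a vector is
    anomalous if at most alpha other vectors lie within distance r
    of it. Brute-force sketch over the full distance matrix."""
    X = np.asarray(X, dtype=float)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # count neighbours within r, excluding the point itself
    neighbours = (D <= r).sum(axis=1) - 1
    return np.where(neighbours <= alpha)[0]
```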

Figures 5 and 4(b) show the scatterplots of the labelled (data vectors are pre-marked as normal or anomalous) real data of each dataset used in our comparison evaluation. Table 3 provides the detection performance



[Figure 14 image: panel (a) AUC (0.5–1) vs. cluster width (0–0.25) for centralised detection; panel (b) AUC vs. cluster width for distributed detection; panel (c) savings in communication overhead (%) vs. cluster width.]

Figure 14: Graphs for the area under the ROC curve (AUC) with cluster width (w), and the percentage savings in communication overhead for the centralised and distributed detection algorithm on the Gaussmix data.

of each of the detection schemes in terms of the detection rate (DR %) and the false positive rate (FPR %) for the different data sets, along with the parameter settings (Params) used for each of the schemes. The parameters for each of the algorithms are selected after performing a systematic search for the parameter values that yield the highest detection rate with a low false positive rate.
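The two reported metrics can be computed directly from the ground-truth and predicted labels. A minimal sketch (the function name is ours):

```python
def detection_metrics(true_anomaly, predicted_anomaly):
    """Detection rate (DR %): percentage of true anomalies that are flagged.
    False positive rate (FPR %): percentage of normal vectors wrongly flagged."""
    tp = sum(1 for t, p in zip(true_anomaly, predicted_anomaly) if t and p)
    fp = sum(1 for t, p in zip(true_anomaly, predicted_anomaly) if not t and p)
    n_anom = sum(1 for t in true_anomaly if t)
    n_norm = len(true_anomaly) - n_anom
    dr = 100.0 * tp / n_anom if n_anom else 0.0
    fpr = 100.0 * fp / n_norm if n_norm else 0.0
    return dr, fpr
```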

For the IBRL data set, we see that the HSCBS and DBO algorithms successfully detect the anomalies (both first and second order anomalies), but DBO incurs many false positives. This suggests that the proposed method (HSCBS) is more successful than the other methods in detecting these anomalies in the IBRL data set.

In the GDI-ISO data set, the anomalies are far from the greater mass of the data. Hence, all schemes successfully separate these anomalies, except the NN scheme, which also incurs higher false positives. This suggests that the NN scheme struggles to detect even easily identifiable anomalies that are quite isolated.

In the GDI-UNI data set, the anomalies are uniformly distributed over the greater mass of the data. All schemes successfully separate these anomalies with a high detection rate. However, the HSCBS method incurs the lowest false positive rate.

The major advantage of the HSCBS algorithm over DBO comes from the reduced communication overhead when used in the distributed model. The DBO scheme is used in a distributed manner in [42]. In their scheme, each sensor performs local anomaly detection using DBO on its local data. Then, all the local anomalies are communicated to the parent node to verify whether they are globally anomalous. This scheme involves communication of raw data vectors between nodes. Further, if the number of anomalies is high, it causes substantial energy consumption in the network due to increased communication activity. Our distributed scheme transmits only summary information about the hyperspheres (the normal model), which is communication efficient. Hence, the distributed HSCBS scheme achieves accuracy comparable to the DBO scheme, but with reduced communication overhead. Moreover, our scheme is capable of identifying both local and global anomalies at the individual node level.
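The communication saving follows from the cluster representation: as described in Section 7, each hypersphere is summarised by a linear sum of its data vectors and a count, so summaries can be combined at a parent node without exchanging any raw vectors. A hedged sketch of such a merge; the centroid-distance merge test below is our illustrative assumption (the implementation uses a merging threshold τ, whose exact rule may differ):

```python
import numpy as np

def merge_clusters(c1, c2):
    """Merge two hyperspherical cluster summaries, each a
    (linear_sum, count) pair; the merged summary is exact."""
    (s1, n1), (s2, n2) = c1, c2
    return (s1 + s2, n1 + n2)

def centroid(c):
    """Cluster centre: linear sum divided by the number of vectors."""
    s, n = c
    return s / n

def should_merge(c1, c2, tau):
    """Illustrative merge test: centroids closer than threshold tau."""
    return np.linalg.norm(centroid(c1) - centroid(c2)) < tau
```

Because only (linear_sum, count) pairs travel up the hierarchy, the per-cluster message size is fixed by the data dimensionality, independent of how many vectors the cluster contains.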

7. Implementation on a Real Wireless Sensor Network Testbed

In this section we investigate whether our proposed distributed anomaly detection algorithm can be implemented on a real wireless sensor network.

We implemented our distributed scheme on a real wireless sensor network testbed consisting of SunSPOT [44, 45] sensor nodes. SunSPOT (Sun Small Programmable Object Technology) nodes consist of a 180MHz, 32-bit ARM920T core processor, 512KB RAM and 4MB Flash. Each node uses a TI CC2420 (formerly ChipCon) radio and communicates wirelessly at 2.4GHz. It runs on a rechargeable 3.7V Lithium-ion battery. The code is written in Java and executed by the on-board Squawk Virtual Machine. Each node is equipped with on-board sensing elements measuring temperature, light and acceleration (3 axes). We used temperature (in degrees Celsius) and light (in lux) measurements for our experiment.

A multi-hop hierarchical network is created using 10 nodes as shown in Figure 16(a). The base station node is connected to a computer via Universal Serial Bus (USB). The base station node only provides processing and communication between the computer and the rest of the nodes in the network, and does not participate in collecting any data measurements on-board.

There are two modules, both implemented in Java. The first module runs on the computer and communicates with the other nodes via the base station node. A graphical user interface (GUI) is implemented as shown in Figure



Figure 15: Graphical user interface of the distributed anomaly detection scheme

15, which provides an interface to communicate with the sensor nodes using messages sent via the wireless medium from the base station node. The GUI provides functionality to set parameters in the nodes, such as the cluster width w, the KNN parameter K and the merging threshold τ, and to execute commands such as collect data into flash, run local clustering, run global clustering and collect labelled data. It also provides graphs to visualise the labelled data vectors and the labelled clusters (local and global) collected from each node in the network.

The second module runs on each node and performs all the processing required for anomaly detection. Each SunSPOT node in the network has been provided with a unique address, which is a 64-bit IEEE extended MAC address expressed as four sets of four hexadecimal digits. We used the last four digits of the IEEE extended MAC address as the node ID. Each node is assigned a parent node ID and its immediate children's node IDs, along with the port for communication.

The following functionalities are provided in the implementation. Refer to Figure 16 for a summary of these functionalities.

• Parameter Settings: Parameters required for distributed detection can be set from the GUI, such as the cluster width, KNN parameter, sample time, number of records to collect, merge threshold and connection timeout. When an appropriate value for a parameter is entered in the GUI, a message is communicated to all the nodes via the hierarchy and the corresponding parameter value is set at each node. Furthermore, the current parameter values set at each node can also be requested and viewed on the GUI.

• Collect Data: When a collect data command is issued from the GUI, each node samples the measurements at the pre-specified interval (set using parameter settings from the GUI) and saves them in its flash memory along with the date and time.

• Local Clusters: When the local clusters command is issued from the GUI, fixed width clustering is performed at each node on the local data saved in



(a) Hierarchy (b) Functionalities

Figure 16: Node hierarchy and the functionalities implemented for the distributed anomaly detection scheme using SunSPOT sensor nodes.

its flash memory using the predefined parameters. Then the anomaly detection algorithm is run on the local clusters to identify and label them as locally normal or anomalous clusters. The locally anomalous data vectors are identified using the anomalous local clusters, and the local labels (locally normal or anomalous) are recorded in the flash memory against each data vector together with the Cluster ID. Note that the Cluster IDs are unique numbers. The labelled local clusters from each node are communicated to the base station for viewing purposes. When communicating the cluster information, each cluster carries its ID in the concatenated form <Node ID:Cluster ID>, so that the node to which a cluster belongs can be uniquely identified.

• Global Clusters: When the global clusters command is issued from the GUI, each node first performs local clustering on the data it has collected and saved in its flash memory. Then the summaries of the local clusters are communicated along the hierarchy to the top-most parent node (in this case to node ID 3550; see Figure 15). Cluster merging is performed at the intermediate parent nodes and at the top-most parent node. The anomaly detection algorithm is run on the global clusters at the top-most parent node. The anomalous global clusters are then communicated back to all the nodes. Each node then uses these anomalous global clusters to classify and label its data vectors as globally normal or anomalous. The global labels are

recorded in the flash memory against each datavector. The labelled global clusters are communi-cated to the base station from the top-most parentnode to the GUI for viewing purposes.

Each data record saved in the flash memory ateach node is in the format of<Date/Time, Tem-perature, Light, ClusterID, Local label, Globallabel>. The data packet format used for commu-nicating each cluster information is of the form<ClusterID, Node ID, Merged Cluster IDs, Noof Features, Label, Sum of Data Vectors, No ofData Vectors>. Note that when clusters aremerged recursively at a node, a new merged clus-ter is produced with a new cluster ID (i.e., itsNode ID and Cluster ID). In addition a new fieldis also added to the merged cluster informationcalled “Merged Cluster IDs”, which is of theform < Node ID1:ClusterID1; Node ID2:ClusterID2;Node ID3:Cluster ID3;,....>. This new fieldenables identification of the participating clustersthat have been involved in the formation of thisnew merged cluster. This information will be usedby each node to backtrack and identify the corre-sponding data vectors, so that the global labels ofthe merged clusters can be assigned to the corre-sponding data vectors in the flash memory.

• Labelled Data: When the labelled data command is issued from the GUI, the labelled data vectors saved in the flash memory of each sensor node are communicated to the base station and viewed



Figure 17: Screenshots of the distributed anomaly detection results. Green and blue circles represent normal clusters (not to scale) and normal data vectors respectively, and red circles represent the anomalies.

on the GUI.

• GC from Node: This command from the GUI performs the "Global Clusters" functionality described above at a specified node level. Note that the distributed anomaly detection can be performed at any intermediate parent level; this command enables the user to specify the node ID at which the "Global Clusters" operation should be performed.

• Periodic: This command iteratively performs the "Collect Data", "Global Clusters" and "Labelled Data" commands in sequence for a specified number of times. The results from each iteration are saved in a text file for later visualisation.
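The local clustering step used by the "Local Clusters" and "Global Clusters" commands can be sketched as a single-pass fixed width procedure over a node's stored data. This is a hedged Python sketch using the (linear sum, count) summaries described above; the on-node Java implementation may differ in details such as the exact assignment rule.

```python
import numpy as np

def fixed_width_clustering(X, w):
    """Single-pass fixed width clustering: each data vector joins the
    first existing cluster whose centroid lies within width w of it;
    otherwise it seeds a new cluster. Returns [linear_sum, count]
    summaries, from which centroids are linear_sum / count."""
    clusters = []
    for x in X:
        for c in clusters:
            if np.linalg.norm(x - c[0] / c[1]) < w:
                c[0] = c[0] + x  # update the linear sum
                c[1] += 1        # and the membership count
                break
        else:
            clusters.append([x.astype(float), 1])
    return clusters
```

A single pass with constant memory per cluster is what makes this step feasible on a node with 512KB of RAM.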

Figure 17 provides a graph of the local and global clusters and the corresponding labelled data vectors obtained from the experiment. All 10 nodes were deployed inside a lab (an indoor environment). The anomalies were

introduced by controlling the light falling on some of the nodes as follows. One node was covered with paper to simulate a low lighting condition, and two nodes were kept outside the window to simulate high lighting conditions. Note that the clusters shown in the graphs are not to scale. The parameters used for the experiment are as follows: cluster width: 15.0, KNN value: 3, cluster merge threshold: 15.0, data sample period (ms): 1000, number of data records: 100. The code footprint is approximately 50 kbytes. The global detection is performed at the top-most parent node (node ID 3550). This implementation demonstrates that our proposed anomaly detection scheme can be used on real wireless sensor network hardware with limited resources.

8. Conclusion

In this article, a distributed anomaly detection algorithm based on data clustering using hyperspherical



clusters is presented to identify anomalies in wireless sensor network data. The scheme is capable of identifying global anomalies at an individual node level. The evaluation on several real and synthetic data sets shows that the distributed approach achieves detection performance comparable to a centralised approach, while achieving a significant reduction in communication overhead. Furthermore, our distributed scheme has been implemented on a real wireless sensor network testbed, demonstrating its functionality. In future work, more sophisticated clustering methods (which would incur more computational overhead in each node) will be considered to capture the pattern of normal data at a node.

References

1. Raghunathan, V., Schurgers, C., Park, S., Srivastava, M.. Energy-aware wireless microsensor networks. IEEE Signal Processing Magazine 2002;19(2):40–50.

2. Doherty, L., Warneke, B., Boser, B., Pister, K.. Energy and performance considerations for smart dust. International Journal of Parallel Distributed Systems and Networks 2001;4(3):121–133.

3. Zhao, F., Liu, J., Liu, J., Guibas, L., Reich, J.. Collaborative signal and information processing: an information-directed approach. Proceedings of the IEEE 2003;91(8):1199–1209.

4. Mathur, G., Desnoyers, P., Ganesan, D., Shenoy, P.. Ultra low power data storage for sensor networks. In: Proceedings of the Fifth International Conference on Information Processing in Sensor Networks (IPSN). USA; 2006, p. 374–381.

5. Perrig, A., Stankovic, J., Wagner, D.. Security in wireless sensor networks. Communications of the ACM 2004;47(6):53–57.

6. Barnett, V., Lewis, T.. Outliers in Statistical Data. John Wiley and Sons; 3rd ed.; 1994.

7. Rajasegarar, S., Bezdek, J.C., Leckie, C., Palaniswami, M.. Elliptical anomalies in wireless sensor networks. ACM Transactions on Sensor Networks (ACM TOSN) 2009;6(1):28.

8. Rajasegarar, S., Leckie, C., Palaniswami, M.. Detecting data anomalies in sensor networks. In: Beyah, R., McNair, J., Corbett, C., editors. Security in Ad-hoc and Sensor Networks. World Scientific Publishing, Inc; 2009, p. 231–260.

9. Han, J., Kamber, M.. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers; 2001.

10. Bandyopadhyay, S., Gianella, C., Maulik, U., Kargupta, H., Liu, K., Datta, S.. Clustering distributed data streams in peer-to-peer environments. Information Sciences 2006;176(14):1952–1985.

11. Loo, C., Ng, M., Leckie, C., Palaniswami, M.. Intrusion detection for routing attacks in sensor networks. International Journal of Distributed Sensor Networks 2006;2(4):313–332.

12. Chandola, V., Banerjee, A., Kumar, V.. Anomaly detection: A survey. ACM Computing Surveys 2009;41(3):1–58.

13. Hodge, V., Austin, J.. A survey of outlier detection methodologies. Artificial Intelligence Review 2004:85–126.

14. Markos, M., Sameer, S.. Novelty detection: A review part 1: statistical approaches. Signal Processing 2003;83(12):2481–2497.

15. Petrovskiy, M.I.. Outlier detection algorithms in data mining systems. Programming and Computing Software 2003;29(4):228–237.

16. Rajagopalan, R., Varshney, P.. Data-aggregation techniques in sensor networks: a survey. IEEE Communications Surveys & Tutorials 2006;8(4):48–63.

17. Ranjani, S., Krishnan, S., Thangaraj, C.. Energy-efficient cluster based data aggregation for wireless sensor networks. In: International Conference on Recent Advances in Computing and Software Systems (RACSS). 2012, p. 174–179.

18. Roy, S., Conti, M., Setia, S., Jajodia, S.. Secure data aggregation in wireless sensor networks. IEEE Transactions on Information Forensics and Security 2012;7(3):1040–1052.

19. Rajasegarar, S., Leckie, C., Palaniswami, M.. Anomaly detection in wireless sensor networks. IEEE Wireless Communications 2008;15(4):34–40.

20. Branch, J., Szymanski, B., Giannella, C., Wolff, R., Kargupta, H.. In-network outlier detection in wireless sensor networks. In: Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS). 2006, p. 51.

21. Zhang, K., Shi, S., Gao, H., Li, J.. Unsupervised outlier detection in sensor networks using aggregation tree. In: Proceedings of the Advanced Data Mining and Applications (ADMA). 2007, p. 158–169.

22. Rajasegarar, S., Leckie, C., Palaniswami, M., Bezdek, J.C.. Distributed anomaly detection in wireless sensor networks. In: Proceedings of the IEEE International Conference on Communications Systems (ICCS). Singapore; 2006.

23. REDUCE. http://info.ee.surrey.ac.uk/CCSR/REDUCE/; 2013.

24. REDUCE network architecture. http://www.smartsantander.eu/wiki/index.php/Testbeds/Guilford; 2013.

25. Great Barrier Reef, Australia. http://www.issnip.unimelb.edu.au/research_program/sensor_networks/environmental_monitoring/gbroos; 2013.

26. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.. Internet of Things (IoT): A vision, architectural elements, and future directions. Accepted for publication in Future Generation Computer Systems, Jan 2013.

27. Internet of Things. http://issnip.unimelb.edu.au/research_program/sensor_networks/Internet_of_Things; 2013.

28. Rajasegarar, S., Havens, T.C., Karunasekera, S., Leckie, C., Bezdek, J.C., Jamriska, M., et al. High resolution monitoring of atmospheric pollutants using a system of low-cost sensors. Accepted for publication in IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), June 2013.

29. Jin, J., Gubbi, J., Luo, T., Palaniswami, M.. Network architecture and QoS issues in the Internet of Things for a Smart City. In: International Symposium on Communications and Information Technologies (ISCIT). 2012, p. 974–979.

30. Smart Santander. http://www.smartsantander.eu/; 2013.

31. Sundararaman, B., Buy, U., Kshemkalyani, A.D.. Clock synchronization for wireless sensor networks: A survey. Ad Hoc Networks (Elsevier) 2005;3(3):281–323.

32. Eskin, E., Arnold, A., Prerau, M., Portnoy, L., Stolfo, S.. A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data. In: Data Mining for Security Applications. 2002.

33. Zhang, T., Ramakrishnan, R., Livny, M.. BIRCH: A new data clustering algorithm and its applications. Data Mining and Knowledge Discovery 1997;1(2):141–182.

34. Paek, J., Greenstein, B., Gnawali, O., Jang, K.Y., Joki, A., Vieira, M., et al. The Tenet architecture for tiered sensor networks. ACM Transactions on Sensor Networks 2010;6:34:1–34:44.

35. Ramaswamy, S., Rastogi, R., Shim, K.. Efficient algorithms for mining outliers from large data sets. In: Proceedings of the ACM Special Interest Group on Management Of Data (ACM SIGMOD). 2000, p. 427–438.

36. Chan, P.K., Mahoney, M.V., Arshad, M.H.. Learning rules and clusters for anomaly detection in network traffic. In: Managing Cyber Threats: Issues, Approaches and Challenges; vol. 5. Springer; 2005, p. 81–99.

37. Buonadonna, P., Gay, D., Hellerstein, J., Hong, W., Madden, S.. TASK: Sensor network in a box. In: Proceedings of the Second European Workshop on Wireless Sensor Networks. 2005, p. 133–144.

38. IBRL web. http://db.lcs.mit.edu/labdata/labdata.html; 2011.

39. Szewczyk, R., Mainwaring, A., Polastre, J., Anderson, J., Culler, D.. An analysis of a large scale habitat monitoring application. In: Proceedings of the International Conference on Embedded Networked Sensor Systems (SenSys). 2004, p. 214–226.

40. Szewczyk, R., Osterweil, E., Polastre, J., Hamilton, M., Mainwaring, A., Estrin, D.. Habitat monitoring with sensor networks. Communications of the ACM 2004;47(6):34–40.

41. Tax, D.M.J.. DDtools, the data description toolbox for Matlab, version 1.5.4. 2006. http://ict.ewi.tudelft.nl/~davidt/dd_tools.html.

42. Subramaniam, S., Palpanas, T., Papadopoulos, D., Kalogeraki, V., Gunopulos, D.. Online outlier detection in sensor data using non-parametric models. In: Proceedings of the International Conference on Very Large Data Bases (VLDB). 2006, p. 187–198.

43. Knorr, E.M., Ng, R.T.. Algorithms for mining distance based outliers in large datasets. In: Proceedings of the 24th International Conference on Very Large Data Bases (VLDB). 1998, p. 392–403.

44. BigNet. http://issnip.unimelb.edu.au/research_program/sensor_networks/infrastructure/bignet_testbed; 2013.

45. SunSPOT. https://www.sunspotworld.com/; 2013.



Dr Sutharshan Rajasegarar received his B.Sc. Engineering degree in Electronic and Telecommunication Engineering (with first class honours) in 2002 from the University of Moratuwa, Sri Lanka, and the Ph.D. degree in 2009 from the University of Melbourne, Australia. He is currently a Research Fellow with the Department of Electrical and Electronic Engineering, the University of Melbourne, Australia. His research interests include the Internet of Things (IoT), wireless sensor networks, distributed anomaly/outlier detection, machine learning, pattern recognition, signal processing and wireless communication.

Prof. Christopher Leckie is a professor at the Department of Computing and Information Systems, the University of Melbourne, Australia. He received the B.Sc. degree in 1985, the B.E. degree in electrical and computer systems engineering (with first class honours) in 1987, and the Ph.D. degree in computer science in 1992, all from Monash University, Australia. He is currently the deputy director of the NICTA Victoria Research Laboratory. His research interests include scalable data mining, network intrusion detection, wireless sensor networks, artificial intelligence (AI), telecommunications, machine learning, fault diagnosis, distributed systems and design automation.

Prof. Marimuthu Palaniswami is a professor at the Department of Electrical and Electronic Engineering, the University of Melbourne, Australia. He received his B.E. (Hons) from the University of Madras, India in 1977, his M.E. from the Indian Institute of Science, India in 1979, his M.Eng.Sc. from the University of Melbourne in 1983 and his Ph.D. from the University of Newcastle, Australia in 1987. He currently leads the ARC Research Network on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP) programme. His research interests include the Internet of Things (IoT), SVMs, sensors and sensor networks, machine learning, neural networks, pattern recognition, signal processing and control.
