ForChaos: Real Time Application DDoS Detection Using...

Research ArticleForChaos Real Time Application DDoS Detection UsingForecasting and Chaos Theory in Smart Home IoT Network

Andria Procopiou 1 Nikos Komninos 1 and Christos Douligeris2

1Centre for Soware Reliability Department of Computer Science City University of London UK2Department of Informatics University of Piraeus Greece

Correspondence should be addressed to Andria Procopiou andriaprocopiou1cityacuk

Received 19 September 2018 Revised 1 January 2019 Accepted 10 January 2019 Published 3 February 2019

Academic Editor Yu Chen

Copyright copy 2019 Andria Procopiou et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Recently DDoS attacks have been launched by zombie IoT devices in smart home networks They pose a great threat to networksystems with Application Layer DDoS attacks being especially hard to detect due to their stealth and seemingly legitimacy In thispaper we propose ForChaos a lightweight detection algorithm for IoT devices which is based on forecasting and chaos theoryto identify flooding and DDoS attacks For every time-series behaviour collected a forecasting-technique prediction is generatedbased on a number of features and the error between the two values is calculated In order to assess the error of the forecastingfrom the actual value the Lyapunov exponent is used to detect potential malicious behaviour In NS-3 we evaluate our detectionalgorithm through a series of experiments in flooding and slow-rate DDoS attacksThe results are presented and discussed in detailand compared with related studies demonstrating its effectiveness and robustness

1 Introduction

Smart Homes consist of a great number of different devicesall deployed in a single network monitoring the environ-ment collecting and sharing important data and informationwith the owners and other smart IoT devices and externalservices through internal and external networks The noderesponsible for this communication is the Energy ServicesInterface (ESI) It acts as a bidirectional interface whereinformation can be exchanged between the Smart Home andexternal domains Furthermore it protects internal energyresources from security failures and ensures secure internalcommunication between the devices deployed in the SmartHome ESIrsquos importance to the Smart Home and in theoutside domains makes it an excellent target for cyberattacks

DDoS attacks can be conducted across all the layers ofthe TCPIP model Application layer DDoS attacks are muchharder to be detected efficiently and accurately than theirperspective ones in lower layers as they do not violate anyprotocol rules or make usage of malicious behaviour TheTCP connections are established successfully and normalrequests are sent to the target in contrast to DDoS attacks

in lower layer such as the TCP Flooding which sends aburst amount of SYN packets without acknowledging theSYNACK packets sent from the server The ApplicationLayer Flooding instead sends a burst amount of legitimaterequests to the server which the server cannot refuse butto reply As a result it becomes unresponsive due to greatamount of incoming requests

On the contrary Slow-Rate Application Layer DDoSattack exploits a serverrsquos ability to wait for connections tobe completed in a range of time if the incoming connectionis legitimately slow As long as the client manages to send asubsequent packet in an attempt to complete the request theserver is obliged to keep the connection open Based on thatthe slow-rate attack opens a great number of connections andinitiates requests that never complete themAs time is passingby more and more connections are open and that results inthe target becoming once again unresponsive

Such attacks can be changed in their form of conduc-tion as there is no fixed behaviour As a result traditionalsignature-based Detection Techniques should fail Anomalydetection is the obvious solution for detecting Applica-tion Layer DDoS attacks both flooding and slow-rate in

HindawiWireless Communications and Mobile ComputingVolume 2019 Article ID 8469410 14 pageshttpsdoiorg10115520198469410

2 Wireless Communications and Mobile Computing

heterogeneous networks such as the Smart Home networkThe potential security system must be able to identify largedeviations in the traffic behaviour from the normal behaviourthat it is expected but also being robust against temporarynormal spikes

One way of detecting such attacks is to have a holisticbehaviour of the attack in terms of time ldquoTimerdquo is whatdifferentiates this malicious behaviour from normal and canclassify it as an attack The anomaly-detection solution mustbe able to detect changes in behaviour of the network bymonitoring its behaviour over a set of time-series

Due to the nature of the attacks described above it isevident that the detection algorithm must monitor closelythe present short-term traffic and not base its knowledge onlong-term previous behaviour of the system The reason forthis is that due to time being such a volatile entity and thetraffic being heterogeneous the network traffic generated candifferentiate from to its past history but appear legitimatenevertheless

Therefore we must predict what the future short-time-interval traffic is based on present or short-time-interval pre-vious behaviour of the system and not just classify the presentnetwork trafficmonitoring based on long-termprevious fixedbehaviour of the system It is important to avoid incorrectlyclassifying a behaviour as malicious simply because it is notsimilar to the history of the network Our detection algorithmmakes use of forecasting techniques to make short-termpredictions and identify the attacks through the constructionof Lyapunov exponents for every time-series interval

The remainder of this paper is organized as followsSection 2 presents the related studies Section 3 describes theproposed detection algorithm with the appropriate explana-tion of it Section 4 explains the simulation testbed imple-mented for conducting the experiments Section 5 presentsand discusses the experimental results Finally a conclusionand future directions are provided in Section 6

2 Related Studies

Various forecasting techniques have been proposed and usedin the past for the effective detection of DDoS attacks Themost popular are Moving Average (MA) Weighted MovingAverage (WMA) Simple Exponential Smoothing (SES) alsocalled Exponential Weighted Moving Average (EWMA)Double Exponential Smoothing (DES) and Triple Exponen-tial Smoothing (TES) also called Holtrsquos Winter SmoothingMA and WMA make a forecast solely based on previousobservations only while the rest of the techniques considerboth past observations and past forecasts

The authors in [6] used the Box-J Cox Jekings ARIMAmodel to detect DDoS attacks by monitoring the ip addressesand the packetsThey use the Lyapunov exponents tomeasurethe error generatedThey have constructed their algorithm tomake predictions every 5 minutes with a training time of 100minutes They report their algorithm to have generated a TPrate of 944 a FP of 01 and FN of 56

The authors in [1] used simple exponential smoothing andwavelet analysis in monitoring incoming bytes number of

packets and the proportion between incoming and outgoingpackets to detect UDP Flooding DDoS attacks They choosewindow sizes 600 and seconds with an overlap of 10

Their algorithm after a series of experiments and adjust-ments (window size of 100 seconds with an overlap of either50 of 80) generates no false positives and manages todetect 10 different types of attack scenarios However theydo not discuss whether their algorithm generates any FNs inthe scenarios which is quite important especially since inprevious experiments their algorithm was not able to detectthe attack at its start time but 200 seconds after initiated

In [3] the authors proposed linear exponential smooth-ing to detect Flooding and Scanning DDoS attacks usingbytes flows and packets per minutes They report a 24-hour training time and a window size of 5 minutes toaccurately detect deviations in trafficThe 119886 value is set to 005and the deviation is measured using entropy Through theirexperiments their algorithmcandetect the attackwhich lastsfor 11 minutes and 13 hours after the algorithm has startedbeing trained However it misclassified instances both beforeand after the actual attack In total 11 predictions misclassifynormal behaviour as attack out of 288 This gives a FP 382

In [4] the authors used the triple exponential smoothingto detect TCP SYN Flooding DDoS and Slammer Worm byanalysing each packets source ip destination ip source portand destination portTheyused awindow size of 900 secondsTo assess the error they used of the relative entropy Theydetect the attacks inserted in real-traffic from the BrazilianNational Research and Education Network which consistedof 5 days of traffic With a 24-hour training time and a 5-minute window size the algorithm is able to detect a 15 hourTCPSYN-Flooding attack but no further information is givenon the time the algorithm detected the deviation

In [2] the authors used a number of forecasting algo-rithms including Moving Average Weighted Moving Aver-age Exponential Smoothing and Linear Regression to pre-dict the intensity and size of DDoS attacks in the TCPprotocol such as SYN Flooding DNS Flooding and ICMPFlooding They used a window of 60 seconds betweenpredictions tomonitor the number of packetsThe two-thirdsof the attack from a total of 53minutes were used for trainingand the remaining one-third was used for testing To asseswhich forecasting algorithm has the less error rate they madeuse of the Absolute Prediction Error metric For the TCPSYN Flooding 563 packets were generated at each second forDNS Flooding 8 packets and for ICMP 1 packet per secondThrough the three types of attacks Exponential Smoothingwas the best in detecting the intensity of the attack and sizeof DNS TCP and ICMP Flooding while Linear Regressionwas the most successful in detecting the size of the TCP SYNFlooding Overall the algorithm reports to generate an errorrate of 1

Besides entropy and error to assess the error of forecastsanother way to measure error is chaos theory and theLyapunov Exponents Chaos theory is an area ofmathematicsthat aims to study nonlinear phenomena that are hard ornearly impossible to predict More specifically chaos theorystudies dynamic complex systems that are sensitive to initialconditions A small change in the initial conditions can cause

Wireless Communications and Mobile Computing 3

Table 1 List of features

Feature Name DescriptionRequests No Total number of requests in a time-series intervalPackets No Total number of packets in a time-series intervalData Rate Average data rate in Megabits in a time-series intervalAvg Packet Size Average packet size in a time-series intervalAvg Time Betw Requests Average time between two requests in a time-series intervalAvg Time Betw Response amp Request Average time between the response and the first requests encountered in a time-series intervalAvg Time Betw Responses Average time between two responses in a time-series intervalParallel Requests Total number of parallel requests in a time-series interval

crucial changes in the outcome of the dynamic systemThis isalso known as the butterfly effectThis highlights that even indeterministic systems which are entirely dependent on theirinitial conditions without any random elements involvedtheir future cannot be predicted This is described as chaoticbehaviour or simply chaos

In [5] the authors have designed a DDoS detectionsystem which uses self-similarity theory to monitor thetraffic and local Lyapunov exponents to distinguish betweennormal behaviour and DDoS attack In addition they use theLyapunov exponents to train neural networks they achievea detection rate of 88-94 with a false positive of 045-005 Similarly in [8] the authors have once again used chaostheory and Lyapunov exponents in a combinationwith neuralnetworks to detect DDoS attacks inspired by [5] achieving abetter result with a 984 detection rate

The authors in [7] conduct preprocessing on the networktraffic by calculating the simple cumulative average of timeseries values at every time interval The local Lyapunovexponent is used to detect a DDoS attack from the DARPADataset by using packets flow information Then they makeuse of a neural network to improve the DDoS detectionaccuracy They report a detection rate of 9375

As discussed in the studies mentioned above there aremultiple ways of assessing the error forecasting algorithmsproduced on each forecast with themost popular beingmeansquare error entropy and Lyapunov exponents Howeverthere is a major difference between the Lyapunov exponentsand themean square error and entropy By using either squareerror or entropy measures a threshold is assumed whilepositive Lyapunov exponents indicate chaos In forecastinga positive Lyapunov exponent indicates that the distancebetween the actual value and the forecasting one is high Thenetwork traffic orbit is both chaotic and unstableThis meansthat the nearby points diverge to any arbitrary separation sothe change of traffic is due to an attack A negative Lyapunovexponent shows that the error is not chaotic because thedifference between the forecast and the actual observation issmall

A nonchaotic error indicates normal behaviour Thenetwork traffic orbits are attracted to a stable fixed point fromwhen they diverge due to new legitimate traffic entering thesystem Hence the change of traffic is not due an attack Onthe contrary in case of an attack the network traffic orbits arenot attracted to a stable fixed point

3 ForChaos Detection Algorithm

Below we explain the basic mathematical concepts that havebeen used in the structure of ForChaos Algorithm

31 Feature Selection Most of the studies using forecastingtechniques against DDoS attacks mentioned in the previoussection usuallymake use of a single feature for prediction anddetect the possible deviations in the number of packets thenumber of packets per IP the packet flags and so onHoweverthese studies focus on detecting DDoS attacks on lowerlayers and not on the application layer The adoption of onesingle feature on the application layer is likely not to producesatisfactory results since Application DDoS attacks do notviolate any protocol rules and do not produce malformedpackets Therefore we have designed a new set of featuresto detect Application Layer Flooding and slow-rate DDoSattacks All of the features (Table 1) are heavily dependent ontime and form ideal features for forecasting-based algorithmsand presented in Table 1

(1) Requests Number In a flooding scenario the number ofrequests over a period of time will be much greater comparedto normal traffic request rates In a slow-rate attack scenariothe number of requests over two consecutive periods of timewill have large difference Since slow-rate opens connectionsto send requests after an amount of minutes there will bea pattern of low number of requests then high number ofrequests then low and so on

(2) Packets Number In a flooding attack the number ofpackets on application layer is rapidly increased over a shortperiod of time In a slow-rate attack the number of packetsis lower due to the packets being sent slowly just before arequest can be rejected Hence the request stays alive but veryfew packets are sent

(3) Data Rate In a flooding attack the data rate on applicationlayer is rapidly increased over a short period of time In aslow-rate attack the data rate is lower due to the slow datarate of the malicious requests

(4) Average Packet Size In a flooding attack the average packetsize is decreased due to the requests being simply a get requestthat features no payload In a slow-rate attack the averagesize packet is decreased since a great series of connections


sends small incomplete packets over the attack period oftime

(5) Average Time between Requests In a flooding attack theaverage time between each request is vastly decreased dueto an outburst of requests being sent over a short period oftime simultaneously In a slow-rate attack the average timebetween each request is decreased compared to the samevalue in a normal scenario Instead there are spikes of thenumber of requests being sentWhen the attack is at the stageof opening new requests the rate of them is increased Whenthe attack is at the stage of maintaining the connections opennormal subsequent packets are sent therefore the rate of therequests is less than the previous stage

(6) Average Time between Response and Request In a floodingattack the average time between a response and the nextrequest is vastly decreased In a slow-rate attack the averagetime between a response and the next request the slow-rateattack malicious requests cause

(7) Average Time between Responses In a flooding attackthe average time between two consecutive responses is vastlydecreased since a burst of requests is being sent to the servertherefore a great number of responses have to be generatedIn a slow-rate attack the same feature is increased since theslow-rate attackrsquos requests are initiated but never completed

(8) Parallel Requests In a flooding attack the average concur-rent requests are decreased since the attackers send multiplerequests that are very quickly finished In a slow-rate attackmany requests are initiated and stay open over a period oftime and the average number of parallel open requests is large

32 Our Approach to Forecasting We choose to exclude MAand WMA because we believe that in order for a forecastingmodel to make effective and accurate forecasts it must have abalance between past forecasting and observationsOur argu-ment is supported by [9] which makes an evaluation studybetween MA WMA and SES It concludes that SES is themost effective forecasting algorithm as opposed to the othertwo against DDoS attacks Between the three ARIMA-basedforecasting models there is one basic difference between eachother that assists us into making our final decision SES hasnoway of considering trend if there is present in observationswhile DES considers the potential trend in the formulaTES also considers trend in data and seasonality Althoughsuch assumptions and additional elements are useful in otherapplications such as making stock market predictions thereis no need to adopt any of these techniques due to the networktraffic being unaffected by any trend or seasonality as theauthors in [3] claim The formula of SES is shown in

119865119905+1 = 119886119860 119905 + (1 minus 119886) 119865119905 (1)

In an observed time series consisting of observations1198601 1198602 1198603 for an event the SES formula 119865119905+1 is definedas the next forecast at time 119905 119860 119905 is the previous actualobservation 119865119905 is the previous forecast and 119886 is defined asthe smoothing constant The smoothing constant a can take

the values 0 ⩽ 119886 ⩽ 1 As it can be observed by (1) thechoice of 119886 value plays an important role in having a balancebetween the forecast and the actual observation By havingthe 119886 constant value towards 1 more weight is given to theactual observation If the 119886 constant value is going towards 0more weight is given to the past observations Since we havea set of 119865 features at every time interval 119905 a forecast for eachof the features is made based on history

33 Chaos Approach for Analysing Error To measure asystemrsquos sensitivity to initial conditions Lyapunov exponentsare used A Lyapunov exponent of a dynamical system is ametric that describes incredibly small trajectories that areclose Lyapunov exponent is calculated using (4)

It is inevitable that every forecasting system will pro-duce a series of errors between their prediction and theobserved values The error at every time interval 119905119894 is definedby

119890119894 = 1003816100381610038161003816119865119894 minus 119860 1198941003816100381610038161003816 (2)

At every time interval we will have one forecast for eachof the features and therefore an equivalent amount of errorsWe define the total error from all the features as

119864119905119900119905119886119897 =1119899119899

sum119894=1

119890119894 (3)

119890119894 is the error value for every of the 119899 features In ForChaos119899 = 8 as 8 is the total number of features used

It is important for these errors to be correctly analysed somalicious behaviour can be accurately and quickly identifiedWhen an error occurs between the actual observer value andthe perspective forecast it means that either there is an attackor there is just a temporary unexpected value that is stilllegitimate We choose the Lyapunov exponent as the mean toanalyse the errors encountered and classify them as chaoticand nonchaotic The local Lyapunov exponent is calculatedas shown

120582119894 =1119905119894ln

10038161003816100381610038161003816100381610038161003816Δ119883119894Δ1198830

10038161003816100381610038161003816100381610038161003816(4)

A positive Lyapunov exponent indicates a chaoticbehaviour at time instant 119905119894 which means the forecastprediction 119901119894 has large distance to the normal observed value119909119894 The network traffic orbit is both chaotic and unstableThis means that the nearby points diverge to any arbitraryseparation so the change of traffic is due to an attackOn the contrary a negative Lyapunov exponent shows thatthe error is not chaotic because the difference between theforecast and the actual observation is small A nonchaoticerror indicates normal behaviour since the network trafficorbits are attracted to a stable fixed point from when theydiverge due to new legitimate traffic or bursty legitimatetraffic entering the system Hence the rate of traffic changeis not due to attack

120582119894 =

gt 0 DDoS attackle 0 Legitimate Traffic

(5)


1 Input Set of 8 Features f Time Series t 120572 Window Size 119908 Traffic State 119860 119905minus12Output alert not-alert3 for 119905 = 1 to n do4 for 119891 = 1 to 8 do5 119865

119905 = 119886119860 119905minus1 + (1 minus 119886)119865119905minus16 119890119905 = |119865119894 minus 119860 119894|7 119890119905119900119905119886119897119905 = 119890119905119900119905119886119897119905 +

18119899

sum119894=1

119890119894

8 120582119905= 1

119905 ln100381610038161003816100381610038161003816100381610038161003816Δ119890119905119900119905119886119897119905Δ1198900

1003816100381610038161003816100381610038161003816100381610038169 119868119891 120582119905 gt 010 return alert11 119890119897119904119890 119903119890119905119906119903119899 119899119900119905 minus 119886119897119890119903119905

Algorithm 1 ForChaos detection algorithm

Table 2 NS-3 protocols used

TCPIP Layer ProtocolsPhysicalMAC Layer SH 80211Network Layer SH IPv4Transport Layer SH TCPUDPApplication Layer SH HTTP CoAP MQTT XMPP AMQP

34 ForChaos Algorithm Pseudocode Based on the previousmathematical concepts described above we have constructedthe ForChaos Algorithm 1 for detecting Application LayerFlooding and Slow-Rate DDoS attacks in the IoT SmartHome Networks A specific amount of time is defined as thetraining procedure During that time the forecasts are nottaken into consideration Instead they are used to build thedetection model and to make correct predictions A windowsize N is chosen at the start ForChaos Algorithm accepts acontinuous 119905 time Series where 119905 belongs to 119879 as an inputEvery 119905 is a 119909 of N window size where 119909rsquos value ranges from 1to infinity Specifically if the window size is set to 30 then theT range of values is 30 60 90 120 and so on At every endof the window size the values for the eight features discussedin Section 2 are calculated Based on previous observationsa forecast is made for the value for each feature accordingto smooth exponential smoothing This forecast is comparedto the actual value for every feature and an error value isgenerated The error is the absolute value of subtracting theactual value from the forecast The total error is calculatedas the summation of errors for all features divided by theerrors standard deviation For that Lyapunov exponent errorat that specific Time Series 119905 is calculated based on (3) If itis positive then we have a DDoS attack and if it is negativethen the behaviour is legitimate

4 NS3 Simulation Testbed

41 Smart Home IoT Network Simulation For the simulationof the Smart Home network Network Simulator 3 (NS3) wasused Network Simulator 3 (NS3) is an open-source discrete-event simulator developed in C++ A summary of thenetworksrsquo parameters and protocols used is given in Table 2

ATTACKER

NORMAL

ESI

Figure 1 Smart Home architecture in NS3

A graphical representation of a typical Smart Home isillustrated in Figure 1 Various IoT devices are deployed in theHome Area Network that connect to the gateway called ESIto exchange data with the Smart City infrastructures

Figure 1 forms the representation of the IoT Smart Homesimulation in NS-3 10 nodes were simulated 9 of them rep-resent IoT devices and Smart Appliances and the final nodeforms the Smart Home gateway called ESI All simulatednodes were static with no mobility In the physical layerall the connections were wireless and the 80211 protocolwas used The structure was set to ad hoc using AODVrouting protocol In the network layer IPv4 was used andin the transport layer both UDP and TCP were used In theapplication layer a wide variety of application layer protocolswere simulated including CoAP MQTT XMPP and AMQPand HTTP

Our simulation has been constructed to represent real-istic Smart Home traffic as much as possible This has beenachieved by using the Smart Home traffic dataset generatedby [10] In [10] they have deployed 30 different Smart Homedevices including smart camers blood pressure meters


NEST protect smoke alarm smart sleep sensors smart bulbandroidiphone devices laptops and so on The maximumamount of packets generated per second was 587 Oursimulation which consisted of 7 nodes generated an averageof 137 packets per second

42 Simulation of Application DDoS Attacks Our simulationwas designed and implemented based on the specificationsfound in current smart homes Smart Homes have currentlymaximum of 100 Mbps bandwidth To flood such a line only10000 compromised machines will be needed each capableof sending 1Mbps of upstreamThese compromisedmachinesare most likely part of a botnetThe target when attacked willimmediately start losing its packets and with such upstreamspeed data it will be unavailable after a minute

NS-3 simulations are not conducted in real-time Fromour experiments we have estimated that one minute insimulation time is about eight to ten minutes of real-timeTherefore based on these findings for the 100 Mbps line tobe saturated in simulation it will need only 6-75 seconds fora flooding attack This is evident in our simulation trafficgenerated Packets start being dropped after about the 6thsecond For a slow-rate attack which is not volumetricallyhigh it needs about 10-125 of simulation time for packetsto start being dropped Hence ForChaos is able to detectthe malicious behaviour fast Furthermore the attacks areconsidered inside attacks Hence the attackersrsquo capabilitiesare going to be lighter The devices compromised are con-strained in resources such as memory processing power andbandwidth As a result an inside attack on the Smart Homewill not be able to produce a 15 Tbps overhead to the SmartHome network

Our proposed algorithmrsquos accuracy is evaluated throughexperiments Related DDoS detection studies make usage ofpopular datasets such as the KDD-99 DARPA or NSL-KDDDatasets However we cannot use these datasets as they donot have IoT traffic nor they contain application layer DDoSattacks In IoT networks the traffic is highly heterogeneousand therefore harder to detect any malicious behaviourHence we had to create our own synthetic traffic using NS-3 A series of traffic containing normal and attack trafficfiles was generated Both flooding and slow-rate attacks weresimulated In every scenario seven nodes from the SmartHome were generating normal traffic and the remaining twowere generating malicious traffic

In order for both normal and attack traffic to be simu-lated degree of randomness was added in the network packetgeneration from the nodes Randomness was introducedthrough Poisson Distribution in the following metrics at thetime a packet was created and sent from the client to theserver at the size of the packet generated from both theclient and the server and at the time the server needed torespond Different speeds were also applied in IoT nodes toservice application layer requests Hence we can simulateslow connections that are legitimate

Most of the DDoS attacks Flooding types in particularcan be classified as constant rate In constant rate attacks theattackers generate a high steady rate of traffic towards thetarget [11]The impact of such an attack is fast but it can easily

be detected due to its obvious intensity Hence attackers havemoved towardsmore sophisticatedways of conductingDDoSattacks One way to evade any security measures installed isto slowly but steadily flood the target in an increasing-rateattack where the maximum impact of the attack is reachedgradually over the attack period To simulate increased-rateattacks in NS-3 we used open random connections afterrandom time-intervals

To assess the results on ForChaos algorithmrsquos effective-ness we have calculated the most popular metrics usedwhen assessing a detection algorithmrsquos accuracy These areDetection Rate(DR) Error Rate (ER) True Positives (TP)False Positives (FP) False Negatives (FN) and PrecisionIn intrusion detection positives instances are attacks andnegative instances are normal DR measures the algorithmrsquosproportion in correctly classifying incoming instances andER measures the algorithmrsquos proportion errors in incorrectlyclassifying incoming instances DR and ERrsquos sum equals oneTP measures the proportion of positive instances that arecorrectly identified as such FP measures the proportion ofnegative instances to have been misclassified as positiveFN measures the proportion of positive instances that havebeen misclassified as negative Lastly precision measuresthe proportion of relevant instances among the retrievedinstances In other words precision measures the rate oftrue positives divided by all the positives both correctly andincorrectly classified

5 Results Analysis

Experiments are divided according to the type of the attackand the training time For every scenario we have examinedout different window sizes and alpha 119886 constant values from(1) Our main objective was to find the most optimal 119886 andwindow size with the least training time

The window size parameter is the time interval at whichthe detection algorithm generates a new prediction So if thewindow size is 20 seconds then the algorithm will calculate anew prediction after every 20 seconds For alpha we testedvalues from 01 to 09 with an interval of 01 and multiplewindow sizes between 10 and 60 seconds at an interval of 10seconds

After a series of experiments it was observed that theoptimal window size is 30 seconds and 119886 is 01 as it hasgenerated the least false alarms Most of the studies usingforecasting algorithms take 119886 the value of 05 which meansthat equal weight is given between recent and past obser-vations However our integration of chaos with forecastingresulted in a different optimal value of 119886 so that Lyapunovexponents could be calculated correctly

The training time in all of the attack scenarios wasselected to 1000 seconds to accurately detect malicious activ-ities In total eight types of attacks have been constructedand the scenarios were divided into flooding and slow-rateattacks

In the flooding attacks the parameters differentiatedwerethe number of applications used and the time the applicationswas initiated Firstly 5 applications and then 10 applicationswere used on each attacker If the time the applications


Table 3 DDoS flooding experiments results

Parameters Attack Start DR TP FP FN Prec(sec) () () () () ()

5 apps Const 1000 100 100 0 0 1005 apps Const 2000 100 100 0 0 1005 apps Const 4000 100 100 0 0 1005 apps Incr 1000 100 100 0 0 1005 apps Incr 2000 986 8750 0 125 1005 apps Incr 4000 100 100 0 0 10010 apps Const 1000 100 100 0 0 10010 apps Const 2000 100 100 0 0 10010 apps Const 4000 100 100 0 0 10010 apps Incr 1000 100 100 0 0 10010 apps Incr 2000 986 8750 0 125 10010 apps Incr 4000 100 100 0 0 100

Table 4 Slow-rate attacks results


20 apps(NoSlow) 1000 100 100 0 0 10020 apps(NoSlow) 2000 986 875 0 0 10020 apps(NoSlow) 4000 100 100 0 0 10020 apps(Slow) 1000 100 100 0 0 10020 apps(Slow) 2000 986 875 0 125 10020 apps(Slow) 4000 986 100 153 0 818240 apps(NoSlow) 1000 100 100 0 0 10040 apps(NoSlow) 2000 972 100 313 0 833340 apps(NoSlow) 4000 949 100 534 0 666740 apps(Slow) 1000 100 100 0 0 10040 apps(Slow) 2000 100 100 0 0 10040 apps(Slow) 4000 100 100 0 0 100

were initiated was set to ldquoconstantrdquo then all of the attackersrsquoapplications were initiated at the start If the parameter wasset to ldquoincreasingrdquo then the attackersrsquo applications weregradually initiated Through this differentiation we aimedto test if our detection mechanism was able to identify theattack even when the peak time of the attack was not visible

In the slow-rate attacks the numbers of applications andslow-legitimate connections were integrated in the networktraffic as part of its normal behaviour The number ofapplications was differentiated from 20 to 40 Through slowlegitimate connections we could evaluate if the algorithm isable to identify malicious or normal activity In real-trafficnot all connections are fast and completed at once especiallyin a Smart Home IoT environment where there are variousphysical obstacles (eg walls) According to [9] there is26 probability that a connection is slow in a typical homeFurthermore the study indicates that the more connecteddevices a home has the greater the probability there is forthe consumers to experience a slow connection In detailthey report a 475 probability of a slow-connection if sevenor more nodes are connected to the Home Area Network

Therefore according to these metrics we have simulatedthe same percentage as slow-connections being initiatedrandomly before the actual attack is initiated

In all scenarios the attack duration lasted for 200 secondsFor every scenario the attack is initiated either at the 1000thsecond 2000th second or 4000th second Hence in totaltwenty-four experiments were conducted A summary of theflooding experiments with their parameters and the resultsare found in Table 3 and for the slow-rate experiments inTable 4

In our attack scenarios the target of the attack was theESI as it is considered the most important node in a SmartHome

51 Flash Crowd Scenarios Application Layer FloodingDDoS attacks are very similar to Flash Crowd traffic FlashCrowd denotes the entrance of sudden burst of legitimatetraffic in the network that is legitimate It is hard to distinguishflash crowd events from Application Layer DDoS attacksbecause both of them generate a high amount of requestsin a relatively short amount of time We wanted to evaluate


Table 5 Flash crowd false positive results

ScenarioFalse

Positives()

Start of Window Size 119905 Interval

FlashCrowd40 apps1200sec

1429 1110


125 13801860 19501980 2160


4112

990 1020 11101170 12601380 1440 1470 1500 15601590 1620 1770 1800 1830 18601950 1980 2040 207021002160 2190 2250 2460 2490 2550 2640 27302760 3090 3210 3240 3330 3390 3420 348036903750 3810 3930 40204110 4140


8571 9901020 1050108011101140


825

990 10201050 1080 1110 1140 1170 1200 123012901320 1380 1410 144014701500 1560 1590 1620 16501680 1710 1740 1770 1800 1830 18601890 192019802040 2070 2100


2430990 1020 1080 1170 13801410 1500 1650 1740 17701830 1860 1890 2100 2400 2460 2550 2760 32403420 3510 3720 38403900 4080 4110

ForChaos algorithm through a series of Flash Crowd sce-narios To simulate random FlashCrowd behaviour we usedPoissonDistributionHowever the rate of sending packet wasincreased from the normal behaviour but it is not equal to orexceeding the metrics of the Flooding attack mentioned inSection 51 Specifically we have increased the apps createdfrom 35 to 40 and 45 in the two series of experiments In thefirst series of experiments 40 apps have been used in total inall three experiments which lasted for 1200 2200 and 4200seconds prospectively In the second series of experimentsthe same durations have been used but with 45 apps in total inall of them The results are presented below and summarisedin Table 5

From the results of Table 5 we can observe that ForChaoshas difficulty in identifying FlashCrowd events as it generatesa high number of missed alarms For the experiments thatlasted for 1200 and 2200 seconds ForChaos generates morefalse alarms in the 45 apps than the 40 appsThiswas expectedas a larger burst of traffic is likely to confuse the algorithminto misclassifying the behaviour as malicious Interestinglyin the 4200 seconds scenarios ForChaos performed betterin the 45 apps case than in the 40 apps case This may haveoccurred due to ForChaos having calculated more forecaststherefore becoming more robust in its future predictions

52 ForChaos Feature Reduction As described in Section 3a total of eight features were used for the detection ofApplication Layer DDoS attacks Although the complexity ofForChaos is not high we have experimented with reducingthe features to check whether we can achieve the samedetection rate Instead of randomly reducing the features we

have used feature reduction algorithms in our experimentsNature-inspired algorithms form the most popular featurereduction techniques Such algorithms followedmodels fromnature biology social systems and life sciences Some exam-ples include genetic algorithms swarm intelligence artificialimmune systems evolutionary algorithms artificial neuralnetworks fractal geometry and chaos theory

Nature inspired algorithms have an advantage against tra-ditional machine learning algorithms they focus on optimi-sation In detail nature acts as amethod ofmaking somethingas perfect as possible or choosing the most fitted samplesfrom a population In practice this family of algorithmsapplies these principles in the form of optimisation andfinding the best solution to the problem assigned In anomalydetection the main objective is to identify the maliciousbehaviour so these algorithms use their best-fit mechanismsto detect malicious abnormalities Another beneficial usageof naturebioinspired algorithms is to optimise the potentialfeatures used in attacks detection In that way an optimal setof features will be selected for efficient malware detection butalso for reducing the complexity and computational burdenAdditionally naturebioinspired algorithms are highly flexi-ble as they can accept a mixture of variables in terms of typeand continuityThis gives us the opportunity to give a varietyof different features to the algorithms

For finding an optimal set of features we have used theavailable nature-inspired algorithms provided from Wekatoolkit Specifically we used evolutionary search ant andbee search genetic search and particle swarm optimisationsearch algorithms All of the algorithms used have identifiedParallel Requests Average Data Rate Average Packet Sizeand Packet Number as the optimal features needed The


Table 6 ForChaos reduced flooding and slow-rate attacks results

Parameters Attack Start DR TP FP FN Prec(sec) () () () ()

20 apps NoSlow 1000 100 100 0 0 10020 apps NoSlow 2000 986 875 0 0 10020 apps NoSlow 4000 100 0 100 0 10020 apps Slow 1000 100 100 0 0 10020 apps Slow 2000 986 875 0 125 10020 apps Slow 4000 986 100 153 0 818240 apps NoSlow 1000 100 100 0 0 10040 apps NoSlow 2000 972 100 313 0 833340 apps NoSlow 4000 949 100 534 0 666740 apps Slow 1000 100 100 0 0 10040 apps Slow 2000 100 100 0 0 10040 apps Slow 4000 100 100 0 0 10020 apps NoSlow 1000 100 100 0 0 10020 apps NoSlow 2000 986 875 0 0 10020 apps NoSlow 4000 100 0 100 0 10020 apps Slow 1000 100 100 0 0 10020 apps Slow 2000 986 875 0 125 10020 apps Slow 4000 986 100 153 0 818240 apps NoSlow 1000 100 100 0 0 10040 apps NoSlow 2000 972 100 313 0 833340 apps NoSlow 4000 90 100 112 0 333340 apps Slow 1000 100 100 0 0 10040 apps Slow 2000 100 100 0 0 10040 apps Slow 4000 993 100 075 0 875

results of the reduced ForChaos algorithm across all thescenarios are presented in Table 6

As it can be seen from Table 6 the reduced ForChaosis able to successfully detect both the Flooding and slow-rate attacks across all different time intervals in all theirduration of attack conduction As it was expected thereduced ForChaos algorithm was bound to produce a higherlevel of FP in some scenarios This occurs in the slow-rateattack scenarios without slow legitimate connections in the2000 and 4000 seconds This could occur for the followingreason the reduction of features has made the ForChaosalgorithmmore sensitive towards changes in the network thatare not necessarily malicious

6 Discussion of Results

Throughout the experiments we proved that our proposedForChaos algorithm is able to detect malicious activity withsmall training time using eight features However in someexperiments false negatives were identified from our algo-rithm Also certain false alarms were raised under certainexperiments

Detection Rate ForChaos algorithm had the best overalldetection rate in the Flooding experiments as opposed tothe Slow-Rate experiments This is due to the nature of the

attack itself Flooding attacks make more ldquonoiserdquo and theirbehaviour is more obvious than the slow-rate attacks andthat is reflected in the features For example in a floodingscenario the request number average time between eachrequest and average time between responses and requests arevastly changed in a very short period of time In Floodingexperiments the lowest detection rate is 9861while in Slow-Rate experiments the lowest detection rate is 9493 as shownin Tables 3 and 4

False Positives Interestingly the ForChaos algorithm gener-ated the most false alarms in the Slow-Rate 40 apps withNo-Slow legitimate connections as the overall simulationtime was increased This possibly highlights the weakness offorecasting algorithms in general of incorrectly predictingvalues if too much weight is put into the recent observationsAs our 119886 value is set to 01 then more weight is put intorecent observations rather than the history As explainedat the beginning of Section 5 through our experiments thebest 119886 value for the Lyapunov exponents correct behaviourwas 01 though as it generated the least false alarms For theSlow-Rate 20 apps with No-Slow Legitimate connections nofalse alarms were raised at all In general for the slow-rateattacks the highest number of false positives rate was 534as it is shown in Table 4 The ForChaos algorithm did notgenerate any false positives after the training time in any of


Table 7 Related studies results

Study Train (sec) Wnd Size (sec) Results (DR) Attack Features

[1] 3200 600 100 UDP Flood

Inc bytespckts Noin amp outpckts

[2] 3194 60 99SYN FloodDNS FloodICMP Flood

pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood

packetsrc ipdst ipsrc portdst port

[5] - - 94 KDD99 pckts Noip adr

[6] 6000 60 995 KDD99 pckts Noip adr

[7] - - 9375 DARPA packets


For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)

App FloodSlow-Rate Table 1

the Flooding experiments regardless of the attack time Thisis due to the nature of the attack itself Flooding attacks makeldquonoiserdquo and their behaviour is more obvious than the Slow-Rate attacks and that is reflected in the features

False Negatives In the 5 apps and 10 apps increasing experi-ments when the attack occurs at the 2000th second it is notdetected in the first 119905119894 where 119894 is the first time series 119905when theattack is present However as the attack progresses the attackis detected in the next 119905119894 + 1 Also in the Slow-Rate 20 appsexperimentswith orwithout slow-legitimate connections theattack could not be initially detected The reason is that thistime only a small number of apps are initiated Once again asthe attack progresses ForChaos is able to detect that anomalyThe false negative rate was the same across all experiments at125

61 Discussion of ForChaos Algorithm Results with RelatedStudies Related studies have been briefly discussed inSection 2 that attempt to perform DDoS detection usingforecasting andor chaos theory and lyapunov exponentswithother techniques is summarised in Table 7 We observe thatnone of them study application layer DDoS attacks but theyinstead aim to detect their transport layer equivalents

ForChaos algorithm combines multiple mathematicalconcepts together to perform detection To the best ofour knowledge no other study has considered the simple

exponential smoothing algorithm and lyapunov exponents todetect DDoS attacks Therefore we compare our results withnotable studies that make usage of forecasting algorithms orLyapunov Exponents

Detection Rate Comparing ForChaos algorithm to the relatedforecasting algorithms studies we have drawn some inter-esting results Regarding training time and window sizesForChaos algorithm managed to get a DR of at least 943with just 1000 seconds of Training Time and a 30-secondwindow size as opposed to all the forecasting related studies[1ndash4 6] that reported needing a training time varyingbetween 3200 seconds ans 86400 seconds and a window size60 to 600 seconds to make correct predictions Thereforewith larger training time and larger window size the relatedstudies nearly achieve perfect detection rate but ForChaosalgorithm follows close enough with 9493-100

In chaos theory lyapunov exponents are used in combi-nation with neural networks as a replacement to forecastingThe detection rates from the related studies were between9405 and 995 against DDoS attacks fromDARPA andorKDD-99 dataset Our various experiments proved our algo-rithm to have better detection rates across both applicationlayer attack scenarios Our algorithm has a 9493-100 detection rate

We strongly believe that the main reason for our algo-rithmrsquos high detection rate with less training time and smaller


window-size is the higher number of features used Thestudies presented use between one and four features whilewe use a total of eight features This of course increases thecomplexity of our solution but we have considered two typesof DDoS attacks that are not by any means similar to eachother On the contrary other studies achieve the same resultmaking a server unavailable to legitimate requests by havinga vastly differentiated behaviour All the proposals presentedin Section 2 are against mainly flooding attacks and not slow-rate attacks on any network layer Additionally the trafficgenerated from the Smart Home environment is consideredheterogeneous both in its normal and in its attacking state

Therefore we need to monitor more metrics in order tomake accurate classifications about the state of the networkwhether it is under attack or not Application Layer DDoSattacks can have vastly differentiated behaviour always withrespect to time Therefore even a large dataset that is goingto be used to train the neural network might not be effectivewhen detecting a new type of attack in terms of time Itis essential for the IDS system to be fast and lightweightso it will not constrain the network Also it is importantfor the IDS System to be trained with a small dataset fastso it can construct a detection model fast based on thecurrent behaviour of the system and not just its history asthe application layer attacks exploit the variable of time andnot any protocol rules Additionally our algorithm does notneed any attack-based dataset to make correct predictions itjust needs a small amount of normal behaviour to be able todetect malicious behaviour as it was illustrated in our resultssection

False Positives ForChaos algorithm generates FPs only inslow-rate attacks with the greatest percentage being 534The other studies which reported FP rates test their algorithmagainst Flooding types of DDoS only in which our algorithmdoes not generate any FPs

False Negatives ForChaos algorithm generates a FN rate 125 while the only study to report any FN results is [6] withonly a 56

Complexity All of the studies except [1 3] make use ofARIMA models or the triple exponential smoothing whichare heavier in memory and process resources than our ownsimple exponential smoothing It is important to take intoconsideration that most of the devices deployed in a SmartHome IoT environment are resource-constrained so ourdetection algorithm is very efficient in comparison to otherproposals Furthermore neural networks are known to becomputationally heavy and require a large dataset to betrained in order to make correct classifications ArtificialNeural Networks need a large dataset for training so theycan construct their model consisting of inputs hidden layersand the outputs and their connections with each other Inintrusion detection the most popular dataset used which isKDD99 consists of 4000000 instances In the simplest ver-sion of an Artificial Neural Network which is the MultilayerPerceptron the complexity of classification is roughly O(1198992)with only one hidden layer constructed However nearly all

constructed models of neural networks have more than onehidden layers so the complexity is increased

Our algorithm is less complex as it can be seen from thealgorithm pseudocode in Section 2 with a O(119888119905119891) where 119905is the time series and 119888 is a constant because the numberof features is pre-determined eight Finally 119891 denotes thefunction in which the forecasting errors calculation andlyapunox exponenet takes place It also does not requireeither a large dataset or a long time to be trained in orderto distinguish correctly between legitimate and maliciousbehaviour The training time for it is limited to 1000 secondsGiven that it receives a new instance for training at every 30seconds it needs roughly 33 instances to start making correctpredictions

To make accurate classification Artificial Neural Net-works need to be trained with both normal and abnormalbehaviour in order to distinguish between the two Inaddition our system as it has been illustrated throughoutthe diverse experiments is able to detect various intensitiesand sizes Application Layer Flooding and Slow-Rate DDoSattacks On the other hand Forecasting algorithms do notneed both normal and abnormal behaviour to detect mali-cious behaviour However the related studies need muchmore time than ForChaos It can be observed from Table 7that the minimum training time is 3194 seconds and themaximum training time is 86400 seconds (24 hours)

62 Discussion of ForChaos Algorithm Results with MachineLearning Algorithms In Intrusion Detection machine learn-ing algorithms form a popular set of techniques sincethey can discover patterns in data without any sort ofpredefined monitored behaviour Hence they can performanomaly detection of unseen attacks without any type ofsignature However no studies have been conducted againstApplication Layer DDoS attacks in IoT using any MachineLearning Algorithms Therefore we have created a dataset ofApplication Layer DDoS attacks in IoT to evaluate a set ofMachine Learning algorithms provided byWekaThe datasetwas consisted of the scenarios we have used for the ForChaosalgorithmrsquos evaluation Each raw data scenario was split in 30-second instances and was labelled according to its behaviour(either as malicious or benign) All instances were processedthrough a feature extraction algorithm to create the datasetThe final dataset file was fed into Weka with the followingmachine learning algorithms used Bayesian Networks (BN)Naive Bayesian (NB) Support Vector Machine (SVM) Deci-sion Tree (DT) Random Forest (RF) and Artificial NeuralNetworkswith theMultilayer Perceptron architecture (MLP)Two series of experiments were conducted In the first seriesthe dataset was randomised using the ldquorandomisationrdquo filterprovided by Weka before being split into training and testset with 138 attack instances and 365 normal instances Inthe second series of experiments duplicated instances wereremoved through removing ldquoduplicates filterrdquo and then it wasrandomised through the ldquorandomisationrdquo filter The datasetconsisted of 125 attack instances and 182 normal instancesafter the duplicates removal For the training process in bothof the series of experiments half of the data were used fortraining the algorithms to construct the models The results


Table 8 Machine learning algorithms results against application layer DDoS attacks IoT dataset

Alg (Dataset) DR() TP() FP() FN() Prec()BN(1) 956 867 16 133 945NB(1) 924 85 52 15 836SVM(1) 952 767 0 133 100MLP(1) 984 967 1 32 967DT(1) 988 967 05 32 983RF(1) 988 967 06 32 983BN(2) 928 807 0 193 100NB(2) 876 767 54 133 902SVM(2) 876 683 0 237 100MLP(2) 967 95 22 5 966DT(2) 902 867 75 133 881RF(2) 941 883 22 117 964ForChaos 943 875 534 125 8182

are presented in Table 8 and discussed below The datasetconstructed can be made available upon researchers request

As expected machine learning algorithmsrsquo accuracy wasdecreased againstApplication LayerDDoS attacks as opposedto DDoS attacks in lower layers This is due to the attacksrsquogreat similarity to legitimate behaviour and the exploitationof time factor

Detection Rate MLP has performed best in the second seriesof experiments while RF performed best in the first series ofexperiments Hence MLP is more effective against unseenbehaviour than RF is Since the duplicates were removedsome instances are assessed for the first timeTherefore MLPis effective in classifying them correctly with a DR 967 asopposed to RFrsquos 941This was expected as neural networksin general are a very accurate and robust classificationalgorithm Hence it was preferred in the related studiesThe algorithm that performed worse in terms of detectionrate was Naive Bayesian which was expected NB assumesindependence of features which especially in our dataset isnot valid

True Positives For the TP rates in the first series of experi-ments RF DT and MLP did best as they had the same rate967 They were followed by BN 867 and NB 85 andSVM 767

In the second series of experiments MLP performed bestwith 95 followed by RF 883 and DT 867 MLP isproved to be robust with its TP rate being dropped by only17 while DT and RF had a higher drop rate RF performedbetter than DT as expected due to RFrsquos being an ldquoimprovedrdquoDT version Also this difference in the TP rate across the twodatasets highlights that the tree or forest being constructed isnot as effective as understanding the various types of attackbehaviour and their versatility Hence they fail in correctlyclassifying them

For the second series of experiments the BN and NBmethods follow with 807 and 767 prospectively Fromboth series of experiments it is evident that probabilisticapproaches fail to identify the attack instances This occurs

because probabilisticmodels need a large dataset to constructaccurate and robust probabilities Also the removal of dupli-cates greatly affects their performance as well

The worst algorithm for TP was SVM with 683 SVMperforms worse in both of the experiments This is due to thesmall dataset being not enough for the SVM to construct aneffective hyperplane

False Positives In both series of experiments the FP rate wasnot very high with the maximum being 75 Specificallyfor the first series of experiments SVM did not generate anyFPs followed by DT and RF with only 05 Then MLPgenerated a 1 of FP rates followed closely from BN with16 Lastly NB generated a 52 FP rate In the secondseries of experiments BN and SVM generated no FP ratesfollowed by MLP and RF with a 22 NB with 54 andDTwith75 In general all of the algorithms are quite robustagainst generating FP instances In fact they were effectivelyin understanding when a behaviour is ldquooddrdquo but still normalThis is evident through the Flash Crowd Scenarios and theslow-rate scenarios where slow legitimate connections arepresent in the network

False Negatives In both series of experiments the FN ratewas relatively high compared to other studies that used MLalgorithms for DDoS attacks in lower layers Once again thishighlights the nature of the DDoS attacks in the applicationlayer which is an exploitation of time makes them hard toactually detect them

In the first series of experiments MLP DT and RFperformed best with 33 FN rate They were followed byBN with 133 NB with 15 and SVM 233 In the secondseries of experiments MLP performed best with 5 FNfollowed by RF with 117 DT with 133 BN with 193NB with 133 and SVM with 217

A high FN rate is amajor disadvantage for any IDS systemas it means it is unable to identify when an actual attackoccurs The only algorithm that has an ldquoacceptablerdquo FN ratein both series of experiments was MLP In the first series ofexperiments RF andDTdidwell in the first series but their FN


rate was vastly increased in the second series This indicatesthat their tree and forest is unable to identify the versatilityof the attacks when duplicates are removed Also due to theversatility of the attacks and possibly the dataset not beinglarge enough probabilistic approaches (BN and NB) fail todetect the attacks Lastly SVM performs worse as it is unableto construct an optimal hyperplane between the attack andthe normal class

Comparison with ForChaos Through the multiple experi-ments conducted using ForChaos the detection rate wasvaried at 9493-100 depending on the attack scenario Atits lowest DR rate ForChaos manages to pass all of theML algorithms across the two series of experiments exceptMLP Regarding FP and FN rates ForChaos lacks in terms ofaccuracy as it has a highest 534 FP rate while the highest FPrate in ML algorithms was generated by DT with 75 andNB with 54 Regarding FN rates although ForChaos FNrate (125) is not considered low in general it still is muchlower compared to the BN NB and SVM in both series ofexperiments We also have to point out that the ForChaosalgorithm was ldquotrainedrdquo using only 33 instances of trainingwhile th ML algorithms needed 252 and 154 instances for thetwo experiments Also the ML algorithms to be trained needboth attack and normal data while ForChaos only requiresnormal data to be trained and effectively distinguish betweenlegitimate and malicious traffic In Table 8 we present theForChaos minimum values

7 Conclusion

In this paper we have presented a novel Application LayerDDoS attacks detection algorithm using simple exponen-tial smoothing and chaos theory Our approach is able todetect both Flooding and Slow-Rate Application Layer DDoSattacks in the Smart Home IoT network Our proposal is fastand accurate in detecting the attack (10 to 40 seconds afterthe attack has started) generating a very low number of falsepositives and it does not require a large dataset to constructthe model

Although our algorithm proved to have good resultsthere is always room for improvement Future directionsinclude on attempting to reduce the complexity of ourdetection method As already stated a Smart Home IoTnetwork or any IoTnetwork for thatmatter is low inmemoryand power so the more lightweight the solution the betterFurthermore the most false positives generated from ourdetection enginewere under the slow-rate attack scenarios Inparticular a possible future direction is to add more featuresthat have to do more with detecting the slow-rate attack

Smart Home communicates with many networks andcritical infrastructures such as the Smart Grid and VANETSEach network produces its own heterogeneous traffic somalicious behaviour is going to be different depending onthe type size and intensity of the attack and what thetarget is Hence cross-network attacks are a likely scenariosince so many networks are interconnected together andcommunicate with each other on a constant and continuousway Therefore the Smart Home can be a target of attacks

from other networks but it can also participate in large-scaleattacks against a Smart Cityrsquos critical infrastructure the SmartCity itself or even another countryrsquos important assets

In the future we aim to protect the communicationbetween the Smart Home and the Smart Grid It is essentialfor the Smart Home to be protected from external threatsbut also from internal threats that aim to abuse its normalfunctioning and force it to participate in large-scale DDoSattacks that aim to threaten the target critical infrastructureand the Smart City in general

Data Availability

The data for constructing the Application Layer DDoSattacks dataset have been generated through simulation toolstechniques specifically NS3 More details on the actualimplementation of our simulated environment have beengiven in Section 4 ldquoNS3 SIMULATION TESTBEDrdquo Thusthe data generated are purely synthetic and do not put anyindividualsrsquo privacy at risk Relevant parameters regardingthe data generation as well as the actual parameters used insimulation have been provided in Section 4 The dataset canbe available upon request from the corresponding author

Conflicts of Interest

There is no conflict of interest

References

[1] P Shinde and S Guntupalli ldquoEarly DoS attack detection usingsmoothened time-series and wavelet analysisrdquo in Proceedings ofthe 3rd Internationl Symposium on Information Assurance andSecurity IAS 2007 pp 215ndash220 UK 2007

[2] C Fachkha E Bou-Harb and M Debbabi ldquoTowards a Fore-casting Model for Distributed Denial of Service Activitiesrdquoin Proceedings of the 2013 IEEE 12th International Symposiumon Network Computing and Applications (NCA) pp 110ndash117Cambridge MA USA August 2013

[3] P Winter H Lampesberger M Zeilinger and E HermannldquoOn detecting abrupt changes in network entropy time seriesrdquoin Communication and Multimedia Security lecture Notes inComputer Science vol 7025 of Lecture Notes in Comput Sci pp194ndash205 Springer Heidelberg Germany 2011

[4] A S De Moura ldquoAnomaly detection using Holt-Winters fore-cast modelrdquo in Proceedings of the IADIS International Confer-ence WWWInternet 2011 ICWI 2011 pp 349ndash356 Brazil 2011

[5] A Chonka J Singh and W Zhou ldquoChaos theory baseddetection against network mimicking DDoS attacksrdquo IEEECommunications Letters vol 13 no 9 pp 717ndash719 2009

[6] S M T Nezhad M Nazari and E A Gharavol ldquoA Novel DoSand DDoS Attacks Detection Algorithm Using ARIMA TimeSeriesModel andChaotic System inComputerNetworksrdquo IEEECommunications Letters vol 20 no 4 pp 700ndash703 2016

[7] Y Chen X Ma and X Wu ldquoDDoS detection algorithm basedon preprocessing network traffic predicted method and chaostheoryrdquo IEEE Communications Letters vol 17 no 5 pp 1052ndash1054 2013


[8] X Wu and Y Chen ldquoValidation of chaos hypothesis in NADAand improved DDoS detection algorithmrdquo IEEE Communica-tions Letters vol 17 no 12 pp 2396ndash2399 2013

[9] Cisco Bandwidth Consumption and Broadband ReliabilityStudying Speed Performance and Bandwidth Use in the Con-nected Home White Paper 2012

[10] A Sivanathan D Sherratt H H Gharakheili et al ldquoCharacter-izing and classifying IoT traffic in smart cities and campusesrdquo inProceedings of the 2017 IEEE Conference on Computer Commu-nications Workshops INFOCOM WKSHPS 2017 pp 559ndash564Atlanta GA USA 2017

[11] J Mirkovic and P Reiher ldquoA taxonomy of DDoS attack andDDoSdefensemechanismsrdquoComputer CommunicationReviewvol 34 no 2 pp 39ndash53 2004

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018


Active and Passive Electronic Components

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of



Journal ofEngineeringVolume 2018

SensorsJournal of



RotatingMachinery


Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation


Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom


heterogeneous networks such as the Smart Home networkThe potential security system must be able to identify largedeviations in the traffic behaviour from the normal behaviourthat it is expected but also being robust against temporarynormal spikes

One way of detecting such attacks is to have a holisticbehaviour of the attack in terms of time ldquoTimerdquo is whatdifferentiates this malicious behaviour from normal and canclassify it as an attack The anomaly-detection solution mustbe able to detect changes in behaviour of the network bymonitoring its behaviour over a set of time-series

Due to the nature of the attacks described above it isevident that the detection algorithm must monitor closelythe present short-term traffic and not base its knowledge onlong-term previous behaviour of the system The reason forthis is that due to time being such a volatile entity and thetraffic being heterogeneous the network traffic generated candifferentiate from to its past history but appear legitimatenevertheless

Therefore we must predict what the future short-time-interval traffic is based on present or short-time-interval pre-vious behaviour of the system and not just classify the presentnetwork trafficmonitoring based on long-termprevious fixedbehaviour of the system It is important to avoid incorrectlyclassifying a behaviour as malicious simply because it is notsimilar to the history of the network Our detection algorithmmakes use of forecasting techniques to make short-termpredictions and identify the attacks through the constructionof Lyapunov exponents for every time-series interval

The remainder of this paper is organized as followsSection 2 presents the related studies Section 3 describes theproposed detection algorithm with the appropriate explana-tion of it Section 4 explains the simulation testbed imple-mented for conducting the experiments Section 5 presentsand discusses the experimental results Finally a conclusionand future directions are provided in Section 6

2 Related Studies

Various forecasting techniques have been proposed and usedin the past for the effective detection of DDoS attacks Themost popular are Moving Average (MA) Weighted MovingAverage (WMA) Simple Exponential Smoothing (SES) alsocalled Exponential Weighted Moving Average (EWMA)Double Exponential Smoothing (DES) and Triple Exponen-tial Smoothing (TES) also called Holtrsquos Winter SmoothingMA and WMA make a forecast solely based on previousobservations only while the rest of the techniques considerboth past observations and past forecasts

The authors in [6] used the Box-J Cox Jekings ARIMAmodel to detect DDoS attacks by monitoring the ip addressesand the packetsThey use the Lyapunov exponents tomeasurethe error generatedThey have constructed their algorithm tomake predictions every 5 minutes with a training time of 100minutes They report their algorithm to have generated a TPrate of 944 a FP of 01 and FN of 56

The authors in [1] used simple exponential smoothing andwavelet analysis in monitoring incoming bytes number of

packets and the proportion between incoming and outgoingpackets to detect UDP Flooding DDoS attacks They choosewindow sizes 600 and seconds with an overlap of 10

Their algorithm after a series of experiments and adjust-ments (window size of 100 seconds with an overlap of either50 of 80) generates no false positives and manages todetect 10 different types of attack scenarios However theydo not discuss whether their algorithm generates any FNs inthe scenarios which is quite important especially since inprevious experiments their algorithm was not able to detectthe attack at its start time but 200 seconds after initiated

In [3] the authors proposed linear exponential smooth-ing to detect Flooding and Scanning DDoS attacks usingbytes flows and packets per minutes They report a 24-hour training time and a window size of 5 minutes toaccurately detect deviations in trafficThe 119886 value is set to 005and the deviation is measured using entropy Through theirexperiments their algorithmcandetect the attackwhich lastsfor 11 minutes and 13 hours after the algorithm has startedbeing trained However it misclassified instances both beforeand after the actual attack In total 11 predictions misclassifynormal behaviour as attack out of 288 This gives a FP 382

In [4] the authors used the triple exponential smoothingto detect TCP SYN Flooding DDoS and Slammer Worm byanalysing each packets source ip destination ip source portand destination portTheyused awindow size of 900 secondsTo assess the error they used of the relative entropy Theydetect the attacks inserted in real-traffic from the BrazilianNational Research and Education Network which consistedof 5 days of traffic With a 24-hour training time and a 5-minute window size the algorithm is able to detect a 15 hourTCPSYN-Flooding attack but no further information is givenon the time the algorithm detected the deviation

In [2] the authors used a number of forecasting algo-rithms including Moving Average Weighted Moving Aver-age Exponential Smoothing and Linear Regression to pre-dict the intensity and size of DDoS attacks in the TCPprotocol such as SYN Flooding DNS Flooding and ICMPFlooding They used a window of 60 seconds betweenpredictions tomonitor the number of packetsThe two-thirdsof the attack from a total of 53minutes were used for trainingand the remaining one-third was used for testing To asseswhich forecasting algorithm has the less error rate they madeuse of the Absolute Prediction Error metric For the TCPSYN Flooding 563 packets were generated at each second forDNS Flooding 8 packets and for ICMP 1 packet per secondThrough the three types of attacks Exponential Smoothingwas the best in detecting the intensity of the attack and sizeof DNS TCP and ICMP Flooding while Linear Regressionwas the most successful in detecting the size of the TCP SYNFlooding Overall the algorithm reports to generate an errorrate of 1

Besides entropy and error to assess the error of forecastsanother way to measure error is chaos theory and theLyapunov Exponents Chaos theory is an area ofmathematicsthat aims to study nonlinear phenomena that are hard ornearly impossible to predict More specifically chaos theorystudies dynamic complex systems that are sensitive to initialconditions A small change in the initial conditions can cause























119865119905+1 = 119886119860 119905 + (1 minus 119886) 119865119905 (1)





119890119894 = 1003816100381610038161003816119865119894 minus 119860 1198941003816100381610038161003816 (2)


119864119905119900119905119886119897 =1119899119899

sum119894=1

119890119894 (3)



120582119894 =1119905119894ln

10038161003816100381610038161003816100381610038161003816Δ119883119894Δ1198830

10038161003816100381610038161003816100381610038161003816(4)


120582119894 =


(5)




18119899

sum119894=1

119890119894

8 120582119905= 1

119905 ln100381610038161003816100381610038161003816100381610038161003816Δ119890119905119900119905119886119897119905Δ1198900








ATTACKER

NORMAL

ESI














5 Results Analysis





















ScenarioFalse

Positives()



1429 1110


125 13801860 19501980 2160


4112

990 1020 11101170 12601380 1440 1470 1500 15601590 1620 1770 1800 1830 18601950 1980 2040 207021002160 2190 2250 2460 2490 2550 2640 27302760 3090 3210 3240 3330 3390 3420 348036903750 3810 3930 40204110 4140


8571 9901020 1050108011101140


825

990 10201050 1080 1110 1140 1170 1200 123012901320 1380 1410 144014701500 1560 1590 1620 16501680 1710 1740 1770 1800 1830 18601890 192019802040 2070 2100


2430990 1020 1080 1170 13801410 1500 1650 1740 17701830 1860 1890 2100 2400 2460 2550 2760 32403420 3510 3720 38403900 4080 4110





















[1] 3200 600 100 UDP Flood



pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood






For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)






































7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia
























119865119905+1 = 119886119860 119905 + (1 minus 119886) 119865119905 (1)





119890119894 = 1003816100381610038161003816119865119894 minus 119860 1198941003816100381610038161003816 (2)


119864119905119900119905119886119897 =1119899119899

sum119894=1

119890119894 (3)



120582119894 =1119905119894ln

10038161003816100381610038161003816100381610038161003816Δ119883119894Δ1198830

10038161003816100381610038161003816100381610038161003816(4)


120582119894 =


(5)




18119899

sum119894=1

119890119894

8 120582119905= 1

119905 ln100381610038161003816100381610038161003816100381610038161003816Δ119890119905119900119905119886119897119905Δ1198900








ATTACKER

NORMAL

ESI














5 Results Analysis





















ScenarioFalse

Positives()



1429 1110


125 13801860 19501980 2160


4112

990 1020 11101170 12601380 1440 1470 1500 15601590 1620 1770 1800 1830 18601950 1980 2040 207021002160 2190 2250 2460 2490 2550 2640 27302760 3090 3210 3240 3330 3390 3420 348036903750 3810 3930 40204110 4140


8571 9901020 1050108011101140


825

990 10201050 1080 1110 1140 1170 1200 123012901320 1380 1410 144014701500 1560 1590 1620 16501680 1710 1740 1770 1800 1830 18601890 192019802040 2070 2100


2430990 1020 1080 1170 13801410 1500 1650 1740 17701830 1860 1890 2100 2400 2460 2550 2760 32403420 3510 3720 38403900 4080 4110





















[1] 3200 600 100 UDP Flood



pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood






For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)






































7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia





18119899

sum119894=1

119890119894

8 120582119905= 1

119905 ln100381610038161003816100381610038161003816100381610038161003816Δ119890119905119900119905119886119897119905Δ1198900








ATTACKER

NORMAL

ESI














5 Results Analysis





















ScenarioFalse

Positives()



1429 1110


125 13801860 19501980 2160


4112

990 1020 11101170 12601380 1440 1470 1500 15601590 1620 1770 1800 1830 18601950 1980 2040 207021002160 2190 2250 2460 2490 2550 2640 27302760 3090 3210 3240 3330 3390 3420 348036903750 3810 3930 40204110 4140


8571 9901020 1050108011101140


825

990 10201050 1080 1110 1140 1170 1200 123012901320 1380 1410 144014701500 1560 1590 1620 16501680 1710 1740 1770 1800 1830 18601890 192019802040 2070 2100


2430990 1020 1080 1170 13801410 1500 1650 1740 17701830 1860 1890 2100 2400 2460 2550 2760 32403420 3510 3720 38403900 4080 4110





















[1] 3200 600 100 UDP Flood



pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood






For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)






































7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia











5 Results Analysis





















ScenarioFalse

Positives()



1429 1110


125 13801860 19501980 2160


4112

990 1020 11101170 12601380 1440 1470 1500 15601590 1620 1770 1800 1830 18601950 1980 2040 207021002160 2190 2250 2460 2490 2550 2640 27302760 3090 3210 3240 3330 3390 3420 348036903750 3810 3930 40204110 4140


8571 9901020 1050108011101140


825

990 10201050 1080 1110 1140 1170 1200 123012901320 1380 1410 144014701500 1560 1590 1620 16501680 1710 1740 1770 1800 1830 18601890 192019802040 2070 2100


2430990 1020 1080 1170 13801410 1500 1650 1740 17701830 1860 1890 2100 2400 2460 2550 2760 32403420 3510 3720 38403900 4080 4110





















[1] 3200 600 100 UDP Flood



pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood






For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)






































7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia

















ScenarioFalse

Positives()



1429 1110


125 13801860 19501980 2160


4112

990 1020 11101170 12601380 1440 1470 1500 15601590 1620 1770 1800 1830 18601950 1980 2040 207021002160 2190 2250 2460 2490 2550 2640 27302760 3090 3210 3240 3330 3390 3420 348036903750 3810 3930 40204110 4140


8571 9901020 1050108011101140


825

990 10201050 1080 1110 1140 1170 1200 123012901320 1380 1410 144014701500 1560 1590 1620 16501680 1710 1740 1770 1800 1830 18601890 192019802040 2070 2100


2430990 1020 1080 1170 13801410 1500 1650 1740 17701830 1860 1890 2100 2400 2460 2550 2760 32403420 3510 3720 38403900 4080 4110





















[1] 3200 600 100 UDP Flood



pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood






For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)






































7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia
















[1] 3200 600 100 UDP Flood



pckts No

[3] 86400 300 100 KDD99

Bytesflowspckts

per min

[4] 86400 900 100 SYN Flood






For Chaos 1000 30

9861-100(Flood)9493-100(Slow-Rt)






































7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia






























7 Conclusion






Data Availability




References















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia









RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia


ForChaos: Real Time Application DDoS Detection Using...

Documents

Transcript of ForChaos: Real Time Application DDoS Detection Using...