Energy-Aware Smart Connectivity for IoT Networks: Enabling Smart Ports

Research Article

Metin Ozturk (1), Mona Jaber (2), and Muhammad A. Imran (1)

(1) School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
(2) Fujitsu Laboratories of Europe, Hayes UB4 8FE, UK

Correspondence should be addressed to Muhammad A. Imran; muhammad.imran@glasgow.ac.uk

Received 7 March 2018; Revised 5 June 2018; Accepted 10 June 2018; Published 28 June 2018

Academic Editor: Manuel Fernandez-Veiga

Hindawi, Wireless Communications and Mobile Computing, Volume 2018, Article ID 5379326, 11 pages, https://doi.org/10.1155/2018/5379326

Copyright © 2018 Metin Ozturk et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The Internet of Things (IoT) is spreading much faster than the speed at which the supporting technology is maturing. Today, there are tens of wireless technologies competing for IoT and a myriad of IoT devices with disparate capabilities and constraints. Moreover, each of the many verticals employing IoT networks dictates distinctive and differential network qualities. In this work, we present a context-aware framework that jointly optimises the connectivity and computational speed of the IoT network to deliver the qualities required by each vertical. Based on a smart port application, we identify energy efficiency, security, and response time as essential quality features and consider a wireless realisation of IoT connectivity using short-range and long-range technologies. We propose a reinforcement learning technique and demonstrate a significant reduction in energy consumption while meeting the quality requirements of all related applications.

1. Introduction

The Internet of Things (IoT) is today's buzzword, often coupled with Big Data and Artificial Intelligence (AI). However, there is considerable ambiguity about what the term means and scepticism about the actual value generated by the IoT. IoT devices have become pervasive but cover a broad range of technologies and standards. Wireless technology is key to connecting these devices through gateways or aggregation points; but, similarly, a wide range of wireless protocols and standards are available and competing [1]. Once these devices are connected, they start reporting the sensed or measured data to the platform. Again, multiple choices are possible in this respect, with different strengths and weaknesses. Reporting raw data to the cloud is very costly, as every bit gets charged, and may also exhaust the battery of the device; this results in massive data. On the other hand, running scripts locally in the device and reporting the resulting events to the cloud reduces the cloud service cost but limits the visibility of the actual data; this still results in big data. Moreover, local scripts result in real-time actions and do not expose the privacy of the data, whereas cloud computing incurs latency due to the transmission network and requires stringent security measures to protect the data.

An environment that is rich in IoT devices connected to a platform qualifies as digitised, and often as intelligent. Analytics, which uses AI, is the added layer that transforms such an environment into a smart one. The default application of AI is to draw actionable insights from the data in order to generate value for the given vertical. In this work, we argue that IoT solutions should not be addressed through a layered perspective; instead, a holistic optimisation approach is needed to generate the desired added value efficiently. In such a holistic approach, AI, among other machine learning tools, is employed in every stage of the solution, including connectivity, storage, computing, and analytics.

Since there are many use-cases of the IoT paradigm [2], it should be approached from a given vertical perspective, e.g., smart health, smart cities, smart manufacturing (Industry 4.0), smart transport, etc. Each of these verticals comprises multiple IoT-based applications with various requirements. In [3], for example, signalling measurements and modelling are performed for both static and vehicular machine-to-machine (M2M) applications, as


both have different signalling overhead characteristics. As another example, remote monitoring in smart cities requires full compliance with privacy regulations, whereas security-related applications rank response time highest among all key performance indicators (KPIs).

In this article, we adopt the smart port use-case to demonstrate context-aware smart connectivity, since it includes various types of applications and has a determined need for monetisation (as opposed to smart cities, which are primarily developed for the well-being and productivity of society). According to figures from the World Trade Organization, 80% of worldwide freight is transported through ports (https://www.wto.org). The smart port concept entails the use of technologies to transform the different public services at ports into interactive systems, with the purpose of meeting the needs of port users with a greater level of efficiency, transparency, and value. European smart port initiatives include the following, among many others:

(i) The port of Rotterdam, where IoT sensors are used to generate a digital twin and enable augmented intelligence

(ii) The port of Hamburg, which exploits 5G networks to enable virtual reality for vital infrastructure monitoring

(iii) The port of Antwerp, which employs blockchain technology to enable a secure transfer of rights between often competing parties

(iv) The port of Seville, which, through the Tecnoport 2025 project, uses mobile network technology for tracking traffic and goods in the port and their logistical transfer on land

Smart ports present a particular challenge due to the necessity of information exchange among competing stakeholders, including port authorities, port operators, terminal operators, logistics companies, shipping companies, etc. It is then likely that multiple IoT networks would coexist and would consist of partly private and partly public or shared infrastructure. As described in [4], there are various communication standards with different strengths and weaknesses which may be used for connecting IoT networks in the context of smart ports. Mobile IoT, i.e., connectivity over licensed mobile wireless networks, is often the preferred solution for handling private data since it is reliable, end-to-end secure (owing to the eSIM card), scalable, ubiquitous, and mature. Two main technologies have been introduced by mobile networks to connect IoT devices: eMTC and NB-IoT [5]. Both of these technologies are compatible with LTE (state-of-the-art commercial mobile network technology), which means that a software update suffices to deploy the IoT options. The former is geared towards higher rates (> 1 Mbps) and supports VoIP (Voice over IP, based on the ITU H.323 protocol (https://www.itu.int/rec/T-REC-H.323/e)) and flexible mobility. The latter is designed for low data rates (20 kbps) and long range (100 km) but with limited mobility. The NB-IoT technology consists of restricting the energy of a normal LTE carrier to a narrow band, hence allowing a maximum coupling loss that is 20 dB higher (164 dB) than that of LTE [6]. Mobile IoT is a public service enabled by telecom carriers and may be used by any party who subscribes to it. Other long-range and low-power solutions, such as LoRa (https://www.lora-alliance.org) and Sigfox (https://www.sigfox.com/en), are unlicensed and can reach similar coverage and data rates as NB-IoT and eMTC. These may be privately owned but require the usage of a gateway to connect to the Internet and are often considered less secure. Many short-range unlicensed wireless connectivity solutions are available, such as WiFi (IEEE 802.11g), Bluetooth, ZigBee, etc., as described in [7], and may be shared, public, or private.

In the presence of multiple wireless technologies, disparate IoT applications, competing parties, and a broad range of static and moving IoT devices with multiple connectivity options, it is of key importance to identify the best way to collect, store, cache, and process the IoT data. What qualifies as the best way depends on the device capabilities (e.g., connectivity options available, battery), the wireless conditions, the security requirements, the processing complexity and availability, the cost of storage/caching/uploading, etc.

2. Related Work

As energy consumption is one of the challenges for IoT networks [8], recent works such as [9, 10] study the trade-off between local and cloud computing in terms of device energy consumption. The former proposes an analytical framework that minimises the energy consumption by optimising the offloading decision of multiple user devices. The latter elaborates a theoretical framework for establishing trade-offs between the energy consumption and the IoT infrastructure billing, including cloud computing. Mobile wireless networks are a prime contender in the race to connect IoT networks owing to their well-established and ubiquitous coverage and secure communication based on the subscriber identity module (eSIM card). In [11], the authors investigate the connectivity of NB-IoT and LoRa in terms of both area and population coverage in order to highlight the importance of the network deployments. In [12], the case for user-centric smart connectivity based on big data analytics is argued, and the corresponding research challenges are presented.

Although data aggregation seems a promising solution to ease the signalling overhead, it is one of the causes of the transmission delay. In [13], the authors discuss the trade-off between delay and signalling overhead in order to demonstrate the impacts of data aggregation. The authors in [14] analyse the joint optimisation of caching and task offloading in such networks with mobile edge computing. They present an efficient online algorithm based on Lyapunov optimisation and Gibbs sampling that succeeds in reducing computation latency while keeping the energy consumption low. In [15], a recommendation system is proposed to address the challenge of link selection in a cloud radio access network. A data-driven scheme is introduced that results in optimised classification of link strengths between remote radio heads and IoT devices.

A deep learning algorithm for edge computing is introduced in [16] to boost the learning performance in IoT networks. The authors also attempt to increase the amount of edge tasks by considering the edge capacity constraints. An open-source database is designed in [17] for the edge computation of Industrial IoT (IIoT) networks. The authors use a time-series analysis for predicting the conditions of IIoT machines in order to decrease the amount of condition reports to be sent to the cloud. A holistic view of communication, computation, and caching is presented in [18], using graph-based representations as learning methods for innovative resource allocation techniques. The performance of edge caching, as well as the energy efficiency and delivery time, is investigated in [19] with quality of service (QoS) constraints.

In this work, we employ machine learning techniques based on reinforcement learning in order to manage multiple optimisation objectives jointly and to dynamically identify the best connection and route for each device. We identify four key quality features that dominate IoT applications in general and smart ports in particular: security, energy, latency, and cost. This work is the first to address these multiple IoT optimisation objectives jointly using reinforcement learning. We compare our novel approach to the state-of-the-art connectivity solutions and demonstrate significant gains in all aspects (ranging from 959 to 28354). Moreover, our approach is the only one that is able to meet the context-aware requirements fully while minimising the cost and the energy consumption. The advantage of the machine learning scheme adopted is primarily its low complexity and its ability to optimise in a dynamic environment such as a smart port.

The rest of the paper is organised as follows. In Section 3, we define the system model of our research. In Section 4, we present our novel machine-learning-based solution for solving the multiobjective problem. Section 5 elaborates the results and analysis, and in Section 6 we conclude the article.

3. System Model

The novel energy-aware smart connectivity approach proposed in this work applies to any IoT network with diverse options for connectivity and processing. For the sake of clarity in the presentation, we build the system model around a smart port scenario, such as the one shown in Figure 1. All IoT devices are battery operated and have different battery lives. They all have some processing power to perform basic tasks, and can offload a task either to the gateway (or fog), i.e., the WiFi access point, or to the evolved node B (eNB, or cloud).

Differently from the state-of-the-art research, we propose to decide simultaneously on the best connectivity and the best location for processing the tasks by jointly optimising energy, response time, security, and cost. A two-stage approach, which describes the decision and optimisation processes, is presented in Figure 2. It is assumed that every IoT device is controlled by a given application, and they jointly determine the context-aware constraints. Each combination of connectivity option and processing location offers specific characteristics and limitations. Stage 1 consists of optimising these decisions based on the context-aware constraints, while Stage 2 refines the trade-off between energy consumption and cost. In the following paragraphs, we describe the models adopted to capture the propagation loss, energy consumption, and response time for the proposed system. Table 1 lists all the parameters that are pertinent to our simulations.

Figure 1: Smart port diagram with two overlapping networks: NB-IoT and WiFi. WiFi access points use LTE for backhauling. All IoT devices are capable of both wireless technologies.

3.1. Propagation Model

There are three wireless connections that require modelling: (a) Device-to-Gateway (WiFi), (b) Device-to-eNB (NB-IoT), and (c) Gateway-to-eNB (LTE). Connections (a) and (c) are often interference limited, as the employed spectrum is likely to be shared by other neighbouring connections. Connections of type (b) are, however, considered to be noise limited, as we assume that there are no other eNBs in the surroundings employing NB-IoT technology. The objective of the propagation modelling is to determine the transmission power required to cater for each of the wireless connection types; the energy consumption is then calculated accordingly. We start with the propagation loss $L$, which is modelled as a function of two technology-specific parameters, the propagation constant $K$ and the propagation exponent $\alpha$, and the distance of the wireless hop $\delta$, measured in km, as shown below:

$$L = K \cdot \delta^{\alpha} \tag{1}$$

Moreover, the probability of having line of sight between the device and the gateway is much higher than for the other types of wireless connections; hence, the propagation loss per decade is lower [20]. On the other hand, NB-IoT connections suffer the same propagation loss per decade as LTE links but are successfully received with 20 dB less power (the threshold receiver sensitivity is $-141$ dBm). For all types of links, the received power at a distance $d_x$ from the transmitting device can be expressed as $P_r = P_t / L$ in mWatt. Next, we calculate the required received power $P_r$ (in mWatt) in order to achieve the target data transmission $D$ in bits:

$$D = T \cdot B \cdot \log_2\left(1 + \frac{P_r}{P_I + N_0 \cdot B}\right) \tag{2}$$

where $T$ is the time period, $B$ is the channel bandwidth, and $P_I$ is the cumulative interference power on the given channel during time period $T$. Please note that $P_I$ is null for wireless connections of type (b). Using (2) and solving for $P_r$, we get

$$P_r = \left(2^{D/(T \cdot B)} - 1\right) \times \left(P_I + N_0 \cdot B\right) \tag{3}$$
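As an illustration of how (1) and (3) chain together, the following minimal Python sketch computes the transmit power needed on a noise-limited NB-IoT hop. It is only a sketch under stated assumptions: the function names are hypothetical, all powers are kept in watts rather than mWatt, the propagation constant is assumed to be a dB figure that is converted to a linear ratio before applying (1), and the example target of 30 kbit in 1 s at a distance of 0.2 km is chosen purely for demonstration.

```python
import math


def db_to_linear(value_db: float) -> float:
    """Convert a decibel quantity to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def propagation_loss(k_db: float, alpha: float, distance_km: float) -> float:
    """Equation (1): L = K * delta**alpha, with K supplied in dB (assumption)."""
    return db_to_linear(k_db) * distance_km ** alpha


def required_rx_power(d_bits: float, t_s: float, b_hz: float,
                      n0_w_per_hz: float, p_i_w: float = 0.0) -> float:
    """Equation (3): P_r = (2**(D / (T * B)) - 1) * (P_I + N0 * B), in watts."""
    return (2.0 ** (d_bits / (t_s * b_hz)) - 1.0) * (p_i_w + n0_w_per_hz * b_hz)


def required_tx_power(p_r_w: float, loss: float) -> float:
    """Invert P_r = P_t / L to obtain the transmit power."""
    return p_r_w * loss


# Example values: K, alpha, B, N0 and T follow Table 1; D and the distance are assumed.
n0 = db_to_linear(-204.0)                     # noise density, W/Hz
loss = propagation_loss(128.1, 3.76, 0.2)     # NB-IoT hop of 0.2 km
p_r = required_rx_power(30_000, 1.0, 180e3, n0)  # P_I = 0 for type (b)
p_t = required_tx_power(p_r, loss)
print(f"required P_r = {p_r:.3e} W, required P_t = {p_t:.3e} W")
```

For interference-limited hops of types (a) and (c), the same functions apply with a non-zero `p_i_w`.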


Figure 2: Decision and optimisation processes in a two-stage approach to optimise four performance criteria: energy, response time, security, and cost. [Diagram: a sensor (wireless options, battery) connects through the gateway (WiFi) or the eNB (NB-IoT); processing is local, at the gateway, or in the cloud, followed by actuation. Stage 1 is a context-aware joint decision under constraints from the application (security, response time), connectivity (eSIM/gateway, energy consumption), and processing (availability, response time, energy consumption). Stage 2 is a cost versus energy trade-off between local and cloud processing.]

Figure 3: Uplink delay model capturing the factors affecting both processing and transmission delays over any hop in our system. [Diagram: a source and a recipient connected by a wireless channel (LTE/NB-IoT or WiFi); either the raw data ($D_r$) is transmitted and processed at the recipient, or the data is processed at the source and the processed data ($D_p$) is transmitted; each processing step incurs a delay $t_p$.]

3.2. Energy Consumption Model

There are two major processes that consume energy in an IoT network: wireless transmission and task computation. The energy consumption of the former is $E_t$ and that of the latter is $E_p$; thus, the total energy consumption is the sum of both. Depending on the communication route taken by the device, the energy consumed due to transmission power results from either one hop using NB-IoT ($E_{tb}$) or two hops using WiFi for the first link and LTE for the second ($E_{ta} + E_{tc}$). The energy consumed for processing the task is a function of the data rate requirement of device $d$, $\theta_d$, and the computational power of the processor, $E_{pi}$, $\forall i \in \{d, f, c\}$ (see Table 1), and is expressed as $E_p = \theta \cdot E_{pi}$.

3.3. Response Time Model

The response time perceived by the IoT device is the combination of the uplink and downlink delays between the IoT device and the server. In this work, the uplink delay is modelled, while the downlink delay is assumed to be the same for all devices.

The uplink delay is caused by two phenomena: task processing (processing delay, $t_p$) and data transmission (transmission delay, $t_t$). The processing delay depends on the processor's computational power, which is measured in the number of computational cycles per data element ($\eta$); i.e., the higher $\eta$, the lower the computational power. Naturally, a server has higher computational power than a small gateway, and much higher than a simple IoT device ($\eta_c < \eta_f < \eta_d$). Thus, in this work, $t_p$ is modelled based on the computational powers of the processing locations: $t_{pd} = 10 \times t_{pf} = 100 \times t_{pc}$. In addition, while the input to the task processing stage is large raw data, the output is compressed data with comparably less volume. To that end, the compression rate between the input and output data volumes is given as $C$, with $D_r = C \cdot D_p$, where $D_r$ and $D_p$ are the volumes of raw and processed (compressed) data, respectively.

The transmission delay is affected by the type of radio access technology and the volume of data to be transmitted. Since WiFi access employs the unlicensed frequency bands, it often suffers from higher retransmission rates due to frequent collisions, which results in increased transmission delays. Therefore, in this work, this effect is captured by the factor $F > 1$, whereby the delay incurred for transmitting the same volume of data over WiFi is $F$ times higher than that over LTE or NB-IoT: $t_{ta} = F \cdot t_{tb} = F \cdot t_{tc}$. This model is represented in Figure 3, in which the source could be either the IoT device or the gateway and the recipient could be either the gateway or the cloud.

Consequently, the overall response time for each action is calculated, for $C = 200$ and $F = 2$, as follows:

$$R = t_p + \sum_{i=1}^{N_h} t_{ti} \cdot D_i \tag{4}$$

where $N_h = \{1, 2\}$ is the number of hops and $D = \{D_r, D_p\}$. Besides, $t_{ti}$ and $D_i$ represent the values of $t_t$ and $D$ for the $i$th hop, respectively. The calculated values then populate Table 2 after feature scaling into the range $[0, 1]$ using the function

$$f(x) = \frac{x - \min(X)}{\max(X) - \min(X)} \tag{5}$$

where $X$ is the set of $x$. Note that connections of types (a) and (b) constitute the first hop, while connection type (c) is the second hop.

Table 1: System model parameters and simulation values.

| Parameter | Value | Description |
|---|---|---|
| $r_n$ | 200 m | eNB cell radius |
| $r_w$ | 30 m | WiFi cell radius |
| $N_G$ | 10 | Number of IoT devices per gateway |
| $\chi_d$ | 30 Kbps | Computational capacity (device) |
| $\chi_f$ | $10^2$ Kbps | Computational capacity (fog) |
| $\chi_c$ | $10^3$ Kbps | Computational capacity (cloud) |
| $\epsilon$ | $5 \times 10^{-9}$ Joule | Energy consumption per computational cycle |
| $\eta_d$ | $10^2$ | Required amount of computational cycles per data element (device) |
| $\eta_f$ | 10 | Required amount of computational cycles per data element (fog) |
| $\eta_c$ | 1 | Required amount of computational cycles per data element (cloud) |
| $N_0$ | -204 dBW/Hz | Noise density |
| $B$ | 180 kHz | Bandwidth |
| $P_{td}$ | $10^{-8}$ W | Average transmit power of the IoT devices in the gateways 2, 3, 4, and 5 |
| $T$ | 1 s | Time period |
| $\lambda$ | 0.5 | $Q$-table update parameter |
| $\phi$ | 0.9 | $Q$-table update parameter |
| $\varepsilon_1$ | 0.8 | Action selection parameter for Stage 1 |
| $\varepsilon_2$ | 104 | Action selection parameter for Stage 2 |
| $\rho$ | 0.8 | Decaying rate for $\varepsilon_1$ and $\varepsilon_2$ |
| $S$ | 8 | Number of bits in each data element |
| $\vartheta$ | $10^3 / S$ | Conversion of kbps data rates to number of data elements |
| $E_{pd}$ | $\epsilon \cdot \eta_d \cdot \lambda$ | Data processing energy consumption per data rate in kbps (device) |
| $E_{pf}$ | $\epsilon \cdot \eta_f \cdot \lambda$ | Data processing energy consumption per data rate in kbps (fog) |
| $E_{pc}$ | $\epsilon \cdot \eta_c \cdot \lambda$ | Data processing energy consumption per data rate in kbps (cloud) |
| $\Gamma_d$ | $10^{-4}$ | Cost of processing per kbps (device) |
| $\Gamma_f$ | $10^{-1}$ | Cost of processing per kbps (fog) |
| $\Gamma_c$ | 1 | Cost of processing per kbps (cloud) |
| $b$ | 20 | Budget |
| $\beta_1$ | $10^2$ | Constant coefficient for penalty comparison |
| $\beta_2$ | $10^{12}$ | Constant coefficient for penalty comparison |
| $K_w = K_l = K_n$ | 128.1 dB | Propagation loss constant for all wireless connection types (a, b, and c) |
| $\alpha_l = \alpha_n$ | 3.76 | Propagation loss exponent for NB-IoT and LTE wireless connection types (b and c) |
| $\alpha_w$ | 3 | Propagation loss exponent for Wi-Fi (802.11g) wireless connection type (a) |
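The response-time model of (4) and the feature scaling of (5) can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: the unit values `T_T`, `T_PC`, and `D_P` are assumptions that do not appear in the text, as is the mapping of raw and compressed data volumes onto the hops of each route (raw data before the processing location, compressed data after it).

```python
C = 200  # compression rate, D_r = C * D_p (value from the text)
F = 2    # WiFi delay factor, t_ta = F * t_tb = F * t_tc (value from the text)

# Assumed unit values (not given in the text).
T_T = 1.0   # per-element transmission delay over LTE or NB-IoT
T_PC = 1.0  # processing delay in the cloud
D_P = 1.0   # volume of processed (compressed) data

# t_pd = 10 * t_pf = 100 * t_pc
T_P = {"device": 100 * T_PC, "fog": 10 * T_PC, "cloud": T_PC}
D_R = C * D_P  # volume of raw data


def response_time(t_p, hops):
    """Equation (4): R = t_p + sum over hops of t_ti * D_i."""
    return t_p + sum(t_t * d for t_t, d in hops)


def min_max_scale(values):
    """Equation (5): f(x) = (x - min X) / (max X - min X)."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


# Each route: processing delay, then (per-element delay, data volume) per hop.
routes = {
    "A1 Wi-Fi/device":  (T_P["device"], [(F * T_T, D_P), (T_T, D_P)]),
    "A2 Wi-Fi/fog":     (T_P["fog"],    [(F * T_T, D_R), (T_T, D_P)]),
    "A3 Wi-Fi/cloud":   (T_P["cloud"],  [(F * T_T, D_R), (T_T, D_R)]),
    "A4 NB-IoT/device": (T_P["device"], [(T_T, D_P)]),
    "A5 NB-IoT/cloud":  (T_P["cloud"],  [(T_T, D_R)]),
}

raw = [response_time(t_p, hops) for t_p, hops in routes.values()]
scaled = min_max_scale(raw)
for name, value in zip(routes, scaled):
    print(f"{name}: {value:.3f}")
```

Under these assumed unit values, the scaled response times come out as 0.004, 0.62, 1, 0, and 0.2 for actions 1 to 5, which agrees with the first entries of the tuples listed in Table 2.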

4. Machine Learning-Based Solution

In this work, we propose to employ reinforcement learning (RL), a machine learning technique based on a goal-seeking approach. It is a trial-and-error approach in which the agent (or learning device) learns to take the correct action by interacting with its surroundings and being rewarded or penalised in each iteration. RL is selected in this work due to its strong applicability to the presented problem. For example, IoT devices need to interact with their environment in order to assess the circumstances and to take subsequent actions, namely, determining the connection type and the data processing location. RL maps to this requirement very well, since it allows optimisation through environmental interactions.

Being one of the most prominent reinforcement learning techniques, $Q$-learning aims to find the optimum policy for a given problem, that is, the best action to take at any given state. To do this, the agent takes an action and evaluates the subsequent reward/cost of taking that action, given that it was in a certain state. This reward/cost is then used to update a look-up table, known as the $Q$-table, which is later utilised by the agent to select the best action. Further, the agent calculates the $Q$-value for every possible state/action pair. Therefore, a simple implementation allows the agent to learn the best actions online, regardless of the policy being followed.

Table 2: Stage one action list.

| Action | Connection | Processor | Tuple |
|---|---|---|---|
| $A_1$ | Wi-Fi | Device | $A_1 = [0.004, 1, \chi_d, (E_{ta} + E_{tc} + E_{pd} \cdot \theta), \Gamma_d]$ |
| $A_2$ | Wi-Fi | Fog | $A_2 = [0.62, 1, \chi_f, (E_{ta} + E_{tc} + E_{pf} \cdot \theta), \Gamma_f]$ |
| $A_3$ | Wi-Fi | Cloud | $A_3 = [1, 1, \chi_c, (E_{ta} + E_{tc} + E_{pc} \cdot \theta), \Gamma_c]$ |
| $A_4$ | NB-IoT | Device | $A_4 = [0, 0, \chi_d, (E_{tb} + E_{pd} \cdot \theta), \Gamma_d]$ |
| $A_5$ | NB-IoT | Cloud | $A_5 = [0.2, 0, \chi_c, (E_{tb} + E_{pc} \cdot \theta), \Gamma_c]$ |

Moreover, $Q$-learning offers two key features which enable an efficient solution to our problem. First, as it is a model-free learning approach [21, 22], it is (1) capable of operating in dynamically changing environments and (2) a low-complexity algorithm that does not require a lot of power, thus reducing the energy consumption of the IoT network. Second, $Q$-learning is known to converge in most cases [23], which has also been demonstrated in multiagent noncooperative environments [24], such as IoT networks.

We propose a two-stage approach to solve the energy-aware smart IoT connectivity problem, where each of the stages employs $Q$-learning.

4.1. First Stage Learning

Stage 1 consists of learning the best combination of connectivity and processing location in view of the device and application requirements and the limitations offered by each of these options. Thus, there are five possible actions that may be taken by each device, as described in Table 2. As a side note, all the variables in Table 2 are feature-scaled values (into the range $[0, 1]$) calculated through (5). The tuples shown represent the limitations of each action, e.g., $A_i = [R, \Sigma, \chi_l, E_t + E_p, \Gamma_l]$, where $R$ and $E_t + E_p$ are described in Sections 3.3 and 3.2, respectively, $\chi_l$ is the available processing capacity, and $\Gamma_l$ is the processing cost, where $l = \{d, f, c\}$ as defined in Table 1. The parameter $\Sigma = \{1, 2\}$ refers to the level of data security offered by the wireless technology, whereby the value 1 indicates eSIM protection (only provided by NB-IoT) and 2 the absence of it. Moreover, each device may be in one of four different states, as shown in Table 3, depending on the context-aware constraints defined jointly by the device and application. These constraints are $R'$, $\Sigma'$, and $\chi'$, which represent the response time, security level, and computational power requirements, respectively.

4.1.1. Penalty Function Determination

Each device estimates the penalty function associated with each possible action it is able to take, following the system shown in Table 3, where $\varphi_p = \{R - R', \Sigma - \Sigma', \chi - \chi' \mid p = 1, 2, 3\}$ is the difference between the available and required characteristics. The fourth penalty is $\varphi_4 = \chi' \cdot A_i^{(5)} - b$, where $A_i^{(5)}$ is the fifth element of the $i$th action and the parameter $b$ is the available budget.
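The difference terms defined above can be illustrated with a short Python sketch. The class and function names are hypothetical, and the rule used to decide whether a constraint counts as satisfied (response time and security level no higher than required, capacity no lower than required) is an assumption made for illustration; the text only defines the differences themselves.

```python
from typing import NamedTuple


class Action(NamedTuple):
    """Action tuple A_i = [R, Sigma, chi_l, E_t + E_p, Gamma_l]."""
    response_time: float  # R
    security: float       # Sigma
    capacity: float       # chi_l
    energy: float         # E_t + E_p
    cost: float           # Gamma_l, the fifth element A_i^(5)


class Requirements(NamedTuple):
    """Context-aware constraints R', Sigma', chi'."""
    response_time: float
    security: float
    capacity: float


def constraint_gaps(action: Action, req: Requirements):
    """phi_1..phi_3: available minus required characteristics."""
    return (action.response_time - req.response_time,
            action.security - req.security,
            action.capacity - req.capacity)


def budget_gap(action: Action, req: Requirements, budget: float) -> float:
    """phi_4 = chi' * A_i^(5) - b."""
    return req.capacity * action.cost - budget


def satisfied_count(action: Action, req: Requirements) -> int:
    """Number of satisfied constraints, under the assumed sign conventions."""
    phi1, phi2, phi3 = constraint_gaps(action, req)
    return int(phi1 <= 0) + int(phi2 <= 0) + int(phi3 >= 0)


# Illustrative, already feature-scaled values (assumed for the example).
action = Action(response_time=0.2, security=0.0, capacity=1.0, energy=0.3, cost=1.0)
req = Requirements(response_time=0.5, security=0.0, capacity=0.4)
print(constraint_gaps(action, req), budget_gap(action, req, budget=0.2),
      satisfied_count(action, req))
```

A count of zero would correspond to the state in which none of the constraints are satisfied, and a count of one to the state in which one constraint is satisfied, as listed in Table 3.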

The penalty function determination policy aims to satisfy the optimisation objective by including the elements that are desired to be minimised. As seen from Table 3, the penalty functions consist of three main elements: a constant term, a dissatisfaction level, and the energy consumption. The constant value is the cost of being in a state, and it decreases as the level of the state increases. This element compels the agent to try to reach the highest possible state level, as this is one of the objectives of the optimisation problem. The dissatisfaction-level element, in support of the constant value, incurs a cost for not satisfying the device requirements, in order to improve the satisfaction levels. Lastly, the energy consumption element minimises the end-to-end energy consumption (connection and data processing). The parameter $0 \le \nu \le 1$ is the battery level, where 0 represents an empty battery and 1 represents a full charge. In the expressions in Table 3, the parameter $\varsigma$ specifies the priority level of the energy consumption. For instance, low values of $\varsigma$ prioritise the energy consumption only once the battery level $\nu$ is very low (e.g., 5%), while high values prioritise the energy consumption even when the battery level is high (e.g., 50%).

In addition, the algorithm normally tends to select an option with cloud processing, as it is the most energy-efficient one. However, some amount of data will not be offloaded due to budget constraints and will then be processed locally, which is the most energy-consuming option. Note that this amount is evaluated by the second-stage learning. Thus, the option selected by the first stage could turn out to be more energy consuming than the option with fog processing, as the processing will be a combination of the cloud and the device. Therefore, the last parts of the penalty functions (inside the square brackets) prevent the algorithm from making blind decisions that ignore the budget availability, by including the average energy consumption of the actions with device processing. The average is taken because the final action is yet to be determined during the learning process. The coefficients of these three elements are determined empirically; however, they can be used to prioritise any element that is desired to be minimised more.

The $Q$-table entries are then updated according to the following expression, where $s$, $s'$, $P$, and $a$ are the current state, next state, penalty function, and action under evaluation, respectively:

$$Q(s, a) \leftarrow Q(s, a) + \lambda \left( P(s) + \phi \min\left(Q(s', a)\right) - Q(s, a) \right) \quad (6)$$
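The update in (6) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the list-of-lists $Q$-table, and the default values of `lam` and `phi` are assumptions, and $\lambda$ and $\phi$ are read here as the learning rate and discount factor, which is their usual role in $Q$-learning. Because the table stores penalties rather than rewards, both the bootstrap term and the greedy choice take the minimum.

```python
def q_update(q, s, a, penalty, s_next, lam=0.5, phi=0.9):
    """Apply one update of (6) in place and return the new Q(s, a).

    q        : list of rows, q[state][action]
    penalty  : P(s), the penalty observed for the evaluated action
    lam, phi : assumed learning rate and discount factor (illustrative values)
    """
    target = penalty + phi * min(q[s_next])   # lowest penalty reachable from s'
    q[s][a] += lam * (target - q[s][a])
    return q[s][a]


def greedy_action(q, s):
    """Pick the action with the lowest accumulated penalty in state s."""
    row = q[s]
    return min(range(len(row)), key=row.__getitem__)


# Four states (Table 3) and five actions (Table 2), initialised to zero.
q_table = [[0.0] * 5 for _ in range(4)]
q_update(q_table, s=0, a=2, penalty=10.0, s_next=1)
```

After the single update above, action 2 in state 0 carries a positive penalty estimate, so a greedy selection in that state falls back to one of the still-unpenalised actions.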

Wireless Communications and Mobile Computing 7

Table 3: List of possible states of each device in Stage 1 and the corresponding penalty calculation.

State | Description | Penalty function ($P$)
$\sigma_1$ | None of the constraints are satisfied | $10^4 + \sum_{p=1}^{3} \varphi_p + 10^{\zeta/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}} \right]$
$\sigma_2$ | One constraint is satisfied | $5 \times 10^3 + 0.8 \sum_{p=1}^{3} \varphi_p + 10^{\zeta/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}} \right]$
$\sigma_3$ | Two constraints are satisfied | $2 \times 10^3 + 0.6 \sum_{p=1}^{3} \varphi_p + 10^{\zeta/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}} \right]$
$\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + 10^{\zeta/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}} \right]$

Table 4: List of possible states of each device in Stage 2 and the corresponding penalty calculation.

State | Description | Penalty function
1 | No availability in cloud or fog for $\chi'$ | $10^3 + \beta_2 \left( A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)} \right)$
2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^3 + \beta_2 \left( A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)} \right)$
3 | Enough availability and budget for $\chi'$ | $\beta_2 \left( A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)} \right)$

4.2 Second Stage Learning

The second stage aims to find the best policy for task offloading by considering the budget and the availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., $A_1$ and $A_4$). In Stage 2, $Q$-learning is also employed, with 21 possible actions, $[0, 0.05, \ldots, 1]$, and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.

4.2.1 Penalty Function Determination

The penalty function of this stage is determined with a procedure similar to that of the first stage; hence, there are three cost elements: a constant term, the energy consumption, and the monetary cost. As in the first stage, the constant term steers the agent towards the highest possible state level. Including the energy consumption and monetary cost elements simultaneously allows the best trade-off between the two to be found. However, unlike in the first stage, these elements are calculated for the piece of data that is planned to be transferred, as specifying the best amount is the objective of this learning stage. As before, the coefficients are obtained empirically.
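As a rough illustration of the second stage, the sketch below enumerates the 21 candidate shares and evaluates the penalty expressions of Table 4. Everything beyond those expressions is an assumption made for the example: the function names, the way availability and budget are checked in `stage2_state`, the default `beta2 = 1.0`, and the reading of the variable written $i$ in Table 4 as the candidate share (`share`).

```python
# The 21 candidate shares 0, 0.05, ..., 1.
ACTIONS = [round(0.05 * k, 2) for k in range(21)]


def stage2_state(share, demand, capacity_left, unit_cost, budget):
    """Return state 1, 2 or 3 of Table 4 (assumed checks, for illustration).

    demand        : required computational power, chi'
    capacity_left : spare capacity in the fog or cloud
    unit_cost     : processing cost of the selected action, A_i^(5)
    """
    offloaded = share * demand
    if offloaded > capacity_left:
        return 1                      # no availability
    if offloaded * unit_cost > budget:
        return 2                      # availability, but no budget
    return 3                          # availability and budget


def stage2_penalty(state, share, demand, energy, unit_cost, beta2=1.0):
    """Penalty of Table 4: a constant for states 1-2 plus the
    weighted energy/cost term beta2 * (A^(4) (1 - share) + chi' share A^(5))."""
    constant = 0.0 if state == 3 else 1e3
    return constant + beta2 * (energy * (1 - share) + demand * share * unit_cost)


def best_share(demand, energy, unit_cost, capacity_left, budget):
    """Exhaustive one-step choice of the lowest-penalty share."""
    def penalty(share):
        state = stage2_state(share, demand, capacity_left, unit_cost, budget)
        return stage2_penalty(state, share, demand, energy, unit_cost)
    return min(ACTIONS, key=penalty)
```

The exhaustive `best_share` helper is only a stand-in to show how the penalty ranks the actions; in the scheme described above, the choice is learned through the $Q$-table rather than enumerated in one step.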

The interaction between Stage 1 and Stage 2 in the learning process is depicted in Algorithms 1 and 2, respectively.

5 Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted, and in each iteration random battery levels are allocated to the devices.

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out-of-budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of the data processing locations, such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, the number of out-of-budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out-of-budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100, \quad (7)$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A-F and $Q$-learning, respectively.
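As a worked example of (7), the short Python sketch below evaluates the gain of $Q$-learning over each fixed scenario for the joint-cost column of Table 5. The function and variable names are illustrative, and the inputs are the rounded table entries, so the results can differ in the second decimal from gains computed with unrounded data.

```python
def gain(p_scenario, p_qlearning):
    """Equation (7): relative gain (positive) or loss (negative), in percent."""
    return (p_scenario - p_qlearning) / p_qlearning * 100.0


# Joint cost reported in Table 5 (rounded values).
JOINT_COST_Q = 0.7822
JOINT_COST_SCENARIOS = {
    "A": 1.5323, "B": 2.0679, "C": 1.2399,
    "D": 1.7756, "E": 2.4643, "F": 3.0000,
}

joint_gains = {name: gain(value, JOINT_COST_Q)
               for name, value in JOINT_COST_SCENARIOS.items()}

for name, value in joint_gains.items():
    print(f"Scenario {name}: {value:.2f}% joint gain")
```

Evaluated this way, Scenario A gives roughly 95.9% and Scenario F roughly 283.5%, in line with the range of holistic gains quoted in the discussion of the results.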

On the other hand, the gain/loss values for the number of out-of-budget devices in Figure 5 are calculated using the function

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100. \quad (8)$$

Data: Context-aware constraints, available computational capacity in gateway and eNB, budget
Result: Combination of connectivity route and processing venue
initialization
for all IoT devices do
    Determine the current state using Table 3
    Evaluate all the actions
    Calculate the penalty using Table 3
    Select the best action
    Jump to the next state
    Update the $Q$-table
    if the selected action includes fog (gateway) or cloud (eNB) processing then
        go to Algorithm 2
    end
end

Algorithm 1: First stage learning.

Data: Action selected by the first stage, available computational capacity in gateway and eNB, budget
Result: Share of data to be offloaded
initialization
for all IoT devices do
    Determine the current state using Table 4
    Evaluate all the actions
    Calculate the penalty using Table 4
    Select the best action
    Jump to the next state
    Update the $Q$-table
end

Algorithm 2: Second stage learning.

Figure 4: Sample snapshot of the simulation environment (both axes span -200 to 200; the legend shows the LTE eNB, WiFi GW 1 to 5, and the IoT devices). IoT devices are located randomly, while the positions of the gateways are fixed.

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters. Before the summation, however, these four parameters (energy consumption, cost, dissatisfaction, and number of out-of-budget devices) are feature scaled into the range [0, 1] using the function in (5), in order to keep their impacts on the same scale.
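The construction of the joint cost can be reproduced with a small Python sketch. It assumes that the feature scaling in (5), which is not restated in this section, is ordinary min-max scaling over the seven compared methods; the function names and the dictionary layout are illustrative. The inputs are the mean values of Table 5.

```python
# Mean values from Table 5:
# (energy in mJ, cost, dissatisfaction, out-of-budget devices)
TABLE5_MEANS = {
    "Q-learning": (5.69, 96.77, 1.8, 0.0),
    "Scenario A": (14.88, 0.24, 5.1, 0.0),
    "Scenario B": (7.55, 118.57, 5.29, 3.03),
    "Scenario C": (8.16, 12.07, 5.81, 0.0),
    "Scenario D": (0.83, 130.41, 6.0, 3.03),
    "Scenario E": (7.48, 119.68, 7.81, 2.97),
    "Scenario F": (0.15, 238.02, 8.0, 6.0),
}


def min_max(values):
    """Scale a list of numbers into [0, 1] (assumed form of (5))."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def joint_costs(table):
    """Sum the four min-max-scaled metrics for every method."""
    names = list(table)
    columns = [list(col) for col in zip(*table.values())]
    scaled = [min_max(col) for col in columns]
    return {name: sum(col[idx] for col in scaled)
            for idx, name in enumerate(names)}


for name, value in joint_costs(TABLE5_MEANS).items():
    print(f"{name}: {value:.4f}")
```

Under this assumption the sketch recovers the joint-cost column of Table 5 to within rounding of the inputs; for example, Scenario F scores the minimum on energy and the maximum on the other three metrics, which gives exactly 3.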

Our method outperforms any fixed combination when examining the joint or holistic gain, with values ranging from 95.9% to 283.54%. Similarly, the reinforcement learning technique results in better matching between the context-aware constraints and the availability of the IoT network compared to any other scenario, with gains varying from 183.33% to 344.44%. Although the processing cost of our proposed method is higher than that of Scenario A, the resulting gain in energy saving is even more important, as is the compliance with the context-aware constraints. The closest contender to reinforcement learning with respect to the generated results is Scenario C, in which the processing of Group A IoT devices is conducted locally while that of Group B occurs in the gateway. Nonetheless, reinforcement learning allows for a device-driven context-aware connectivity that improves the compliance criteria by more than two times while saving 43.22% of energy, resulting in a holistic gain of 58.52%. Scenario D manages to reduce the energy consumption more than our proposed approach at the same total cost; however, 30.3% of the devices are out of budget, resulting in incomplete or interrupted computational tasks.


Table 5: Results on various metrics for $Q$-learning and the scenarios.

Method | Energy Consumption (mJ) | Cost | Dissatisfaction | Out of Budget Devices | Joint Cost
Q-Learning | 5.69 ± 0.322 | 96.77 ± 4.01 | 1.8 ± 0.291 | 0 ± 0 | 0.7822
Scenario A | 14.88 ± 0.385 | 0.24 ± 6.15e-3 | 5.1 ± 0.28 | 0 ± 0 | 1.5323
Scenario B | 7.55 ± 0.24 | 118.57 ± 4.49 | 5.29 ± 0.181 | 3.03 ± 0.217 | 2.0679
Scenario C | 8.16 ± 0.284 | 12.07 ± 0.383 | 5.81 ± 0.208 | 0 ± 0 | 1.2399
Scenario D | 0.83 ± 0.025 | 130.41 ± 4.54 | 6 ± 0 | 3.03 ± 0.217 | 1.7756
Scenario E | 7.48 ± 0.281 | 119.68 ± 3.83 | 7.81 ± 0.208 | 2.97 ± 0.213 | 2.4643
Scenario F | 0.15 ± 4.59e-3 | 238.02 ± 6.16 | 8 ± 0 | 6 ± 0.339 | 3.0000

Table 6: List of fixed scenarios with connection types and locations of data processing.

Scenario | Group A | Group B
A | Device | Device
B | Cloud | Device
C | Device | Fog
D | Cloud | Fog
E | Device | Cloud
F | Cloud | Cloud

Figure 5: Summary of results for $\zeta = 0.1$: gain (%) of $Q$-learning over Scenarios A-F in terms of total energy (mJ), total cost, total dissatisfaction, out-of-budget devices, and joint penalty. Positive and negative values reflect gain and loss, respectively. A gain occurs when $Q$-learning performs better than the scenario, and a loss occurs when the scenario performs better than $Q$-learning.

Moreover, in this scenario, connected devices are more than two times more likely to be dissatisfied with one or more of the context-aware requirements.

Next, we examine the impact of the battery priority factor $\zeta$ on the energy efficiency. As shown in Figure 6, low values of $\zeta$ result in almost neglecting the battery life of the device in the optimisation process until it drops below 10%. Very high values of $\zeta$ prioritise the reduction of energy consumption for all devices except those that have more than 70% battery life. It is therefore possible to tune this parameter depending on the scenario at hand and in a device-specific manner. For instance, some devices may be part of a moving vehicle, with the possibility of agile and low-cost battery replenishment. Such devices may benefit from low settings of $\zeta$ to allow more flexibility in meeting the remaining constraints. Other devices may be in hard-to-reach places and would require a skilled workforce and special equipment, and hence a high cost, to replace a dead battery. In this case, higher settings of $\zeta$ are more suitable and would result in a better cost-to-quality ratio.

Figure 6: Impact of the energy prioritisation factor $\zeta$: energy consumption ($\times 10^{-3}$ J) versus battery level (%) for $\zeta$ = 0.1, 0.3, 0.5, 0.9, and 1.2.

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not possible in fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence, which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, the $Q$-learning-based method selects the best action (the connection type/processing location pair in the first stage and the amount of data to be transmitted in the second stage) after convergence. We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, this work is undoubtedly the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power, as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement obtained by employing the novel scheme, and highlight its effects on the application.

6 Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund, the DARE Project, EP/P028764/1. The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., "Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap," IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: a survey," Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, "Measurement-based signaling management strategies for cellular IoT," IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, "Internet of things for smart ports: technologies and challenges," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, "3GPP low power wide area technologies," GSMA White paper, Oct. 2016.

[6] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements," 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, "The menu at the IoT cafe: a guide to IoT wireless technologies," Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, "A survey on the challenges and opportunities of the Internet of Things (IoT)," in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, "Device-centric energy optimization for edge cloud offloading," in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, "Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing," in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, "NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications," in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, "Smart connectivity for internet of things (IoT) applications," in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, "M2M data aggregation over cellular networks: signaling-delay trade-offs," in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, "Joint service caching and task offloading for mobile edge computing in dense networks," ArXiv e-prints, 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, "Efficient C-RAN random access for IoT devices: learning links via recommendation systems," ArXiv e-prints, 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, "Learning IoT in edge: deep learning for the internet of things with edge computing," IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, "Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data," in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, "The edge cloud: a holistic view of communication, computation and caching," ArXiv e-prints, 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, "Edge-caching wireless networks: performance analysis and optimization," IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, "Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz," International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.

[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, "Convergence of optimistic and incremental Q-learning," in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, "A distributed SON-based user-centric backhaul provisioning scheme," IEEE Access, vol. 4, pp. 2314–2330, 2016.


both have different signalling overhead characteristics. As another example, remote monitoring in smart cities requires full compliance with privacy regulations, whereas security-related applications rank response time highest among all key performance indicators (KPIs).

In this article, we adopt the smart port use-case to demonstrate the context-aware smart connectivity, since it includes various types of applications and has a determined need for monetisation (as opposed to smart cities, which are primarily developed for the well-being and productivity of the society). According to figures from the World Trade Organization, 80% of worldwide freight is transported through ports (https://www.wto.org). The smart port concept entails the use of technologies to transform the different public services at ports into interactive systems, with the purpose of meeting the needs of port users with a greater level of efficiency, transparency, and value. European smart port initiatives include the following, among many others:

(i) The port of Rotterdam, where IoT sensors are used to generate a digital twin and enable augmented intelligence.

(ii) The port of Hamburg, which exploits 5G networks to enable virtual reality for vital infrastructure monitoring.

(iii) The port of Antwerp, which employs blockchain technology to enable a secure transfer of rights between often competing parties.

(iv) The port of Seville, which, through the Tecnoport 2025 project, uses mobile network technology for tracking traffic and goods in the port and their logistical transfer on land.

Smart ports present a particular challenge due to the necessity of information exchange among competing stakeholders, including port authorities, port operators, terminal operators, logistics companies, shipping companies, etc. It is then likely that multiple IoT networks would coexist and would consist of partly private and partly public or shared infrastructure. As described in [4], there are various communication standards with different strengths and weaknesses which may be used for connecting IoT networks in the context of smart ports. Mobile IoT, i.e., connectivity over licensed mobile wireless networks, is often the preferred solution for handling private data, since it is reliable, end-to-end secure (owing to the eSIM card), scalable, ubiquitous, and mature. Two main technologies have been introduced by mobile networks to connect IoT devices: eMTC and NB-IoT [5]. Both of these technologies are compatible with LTE (the state-of-the-art commercial mobile network technology), which means that a software update suffices to deploy the IoT options. The former is geared towards higher rates (> 1 Mbps) and supports VoIP (Voice over IP, based on the ITU H.323 protocol (https://www.itu.int/rec/T-REC-H.323/e)) and flexible mobility. The latter is designed for low data rates (20 kbps) and long range (100 km) but with limited mobility. The NB-IoT technology consists of restricting the energy of a normal LTE carrier to a narrow band, hence allowing a maximum coupling loss that is 20 dB higher (164 dB) than that of LTE [6]. Mobile IoT is a public service enabled by telecom carriers and may be used by any party who subscribes to it. Other long-range and low-power solutions, such as LoRa (https://www.lora-alliance.org) and Sigfox (https://www.sigfox.com/en), are unlicensed and can reach similar coverage and data rates as NB-IoT and eMTC. These may be privately owned but require the usage of a gateway to connect to the Internet and are often considered less secure. Many short-range unlicensed wireless connectivity solutions are available, such as WiFi (IEEE 802.11g), Bluetooth, ZigBee, etc., as described in [7], and may be shared, public, or private.

In the presence of multiple wireless technologies, disparate IoT applications, competing parties, and a broad range of static and moving IoT devices with multiple connectivity options, it is of key importance to identify the best way to collect, store, cache, and process the IoT data. What qualifies as the best way depends on the device capabilities (e.g., connectivity options available, battery), the wireless conditions, the security requirements, the processing complexity and availability, the cost of storage/caching/uploading, etc.

2 Related Work

As energy consumption is one of the challenges for IoT networks [8], recent works such as [9, 10] study the trade-off between local and cloud computing in terms of device energy consumption. The former proposes an analytical framework that minimises the energy consumption by optimising the offloading decision of multiple user devices. The latter elaborates a theoretical framework for establishing trade-offs between the energy consumption and the IoT infrastructure billing, comprising cloud computing. Mobile wireless networks are a prime contender in the race to connect IoT networks, owing to their well-established and ubiquitous coverage and secure communication based on the subscriber identity module (eSIM card). In [11], the authors investigate the connectivity of NB-IoT and LoRa in terms of both area and population coverage in order to highlight the importance of the network deployments. In [12], user-centric smart connectivity based on big data analytics is discussed, together with the corresponding research challenges.

Although data aggregation seems a promising solution to ease the signalling overhead, it is one of the causes of transmission delay. In [13], the authors discuss the trade-off between delay and signalling overhead in order to demonstrate the impacts of data aggregation. The authors in [14] analyse the joint optimisation of caching and task offloading in such networks with mobile edge computing. They present an efficient online algorithm based on Lyapunov optimisation and Gibbs sampling that succeeds in reducing computation latency while keeping the energy consumption low. In [15], a recommendation system is proposed to address the challenge of link selection in a cloud radio access network. A data-driven scheme is introduced that results in optimised classification of link strengths between remote radio heads and IoT devices.

A deep learning algorithm for edge computing is introduced in [16] to boost the learning performance in IoT networks. The authors also attempt to increase the amount of edge tasks by considering the edge capacity constraints. An open-source database is designed in [17] for the edge computation of Industrial IoT (IIoT) networks. The authors use a time-series analysis for predicting the conditions of IIoT machines in order to decrease the amount of condition reports to be sent to the cloud. A holistic view of communication, computation, and caching is presented in [18], using graph-based representations as learning methods for innovative resource allocation techniques. The performance of edge caching, as well as the energy efficiency and delivery time, is investigated in [19] under quality of service (QoS) constraints.

In this work, we employ machine learning techniques based on reinforcement learning in order to manage multiple optimisation objectives jointly and to dynamically identify the best connection and route for each device. We identify four key quality features that dominate IoT applications in general and smart ports in particular: security, energy, latency, and cost. This work is the first to address these multiple IoT optimisation objectives jointly using reinforcement learning. We compare our novel approach to the state-of-the-art connectivity solutions and demonstrate significant gains in all aspects (ranging from 95.9% to 283.54%). Moreover, our approach is the only one that is able to meet the context-aware requirements fully while minimising the cost and the energy consumption. The advantage of the machine learning scheme adopted is primarily its low complexity and its ability to optimise in a dynamic environment such as a smart port.

The rest of the paper is organised as follows. In Section 3, we define the system model of our research. In Section 4, we present our novel machine-learning-based solution for solving the multiobjective problem. Section 5 elaborates the results and analysis, and in Section 6 we conclude the article.

3 System Model

The novel energy-aware smart connectivity approach proposed in this work applies to any IoT network with diverse options of connectivity and processing. For the sake of clarity in the presentation, we build the system model around a smart port scenario such as the one shown in Figure 1. All IoT devices are battery operated and have different battery lives. They all have some processing power to perform basic tasks and can offload the task either to the gateway (or fog), i.e., the WiFi access point, or to the evolved node B (eNB, or cloud).

Differently from the state-of-the-art research, we propose to decide simultaneously on the best connectivity and the best location for processing the tasks by jointly optimising energy, response time, security, and cost. A two-stage approach, which describes the decision and optimisation processes, is presented in Figure 2. It is assumed that every IoT device is controlled by a given application and that they jointly determine the context-aware constraints. Each combination of connectivity option and processing location offers specific characteristics and limitations. Stage 1 consists of optimising these decisions based on the context-aware constraints, while Stage 2 refines the trade-off between energy consumption and cost. In the following paragraphs, we describe the models adopted to capture the propagation loss, energy consumption, and response time for the proposed system. Table 1 lists all the parameters that are pertinent to our simulations.

Figure 1: Smart port diagram with two overlapping networks: NB-IoT and WiFi. WiFi access points use LTE for backhauling. All IoT devices are capable of both wireless technologies.

31 Propagation Model There are three wireless connectionsthat require modelling (a) Device-to-Gateway (WiFi) (b)Device-to-eNB (NB-IoT) and (c) Gateway-to-eNB (LTE)Connections (a) and (c) are often interference limited asthe employed spectrum is likely to be shared by otherneighbouring connections Connections of type (b) arehowever considered to be noise limited as we assume thatthere are no other eNB in the surrounding employing NB-IoT technology The objective of the propagation modellingis to determine the transmission power required to caterfor each of the wireless connection types Accordingly theenergy consumption will be calculated We start with thepropagation loss 119871 which is modelled as a function of twotechnology-specific parameters the propagation constant 119870and the propagation exponent 120572 and the distance of thewireless hop 120575measured in 119896119898 as shown below

119871 = 119870 sdot 120575120572 (1)

Moreover, the probability of having line of sight between the device and the gateway is much higher than in the case of the other types of wireless connections; hence, the propagation loss per decade is lower [20]. On the other hand, NB-IoT connections suffer the same propagation loss per decade as LTE links but are successfully received with 20 dB less power (the threshold receiver sensitivity is $-141$ dBm). For all types of links, the received power at a distance $d_x$ from the transmitting device can be expressed as $P_r = P_t / L$ in mW. Next, we calculate the received power $P_r$ (in mW) required to achieve the target data transmission $D$ in bits:

$$D = T \cdot B \cdot \log_2\left(1 + \frac{P_r}{P_I + N_0 \cdot B}\right) \tag{2}$$

where $T$ is the time period, $B$ is the channel bandwidth, $N_0$ is the noise density (see Table 1), and $P_I$ is the cumulative interference power on the given channel during time period $T$. Please note that $P_I$ is null for wireless connections of type (b). Using (2) and solving for $P_r$, we get

$$P_r = \left(2^{D/(T \cdot B)} - 1\right) \times \left(P_I + N_0 \cdot B\right) \tag{3}$$
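The chain from (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation code used in this work: the function names, the use of watts instead of mW, and the example payload and hop distance are all assumptions, while the values of $K$, $\alpha$, $N_0$, $B$, and $T$ are taken from Table 1.

```python
import math


def db_to_linear(value_db: float) -> float:
    """Convert a decibel quantity (dB or dBW) to linear scale."""
    return 10.0 ** (value_db / 10.0)


def propagation_loss(k_db: float, alpha: float, distance_km: float) -> float:
    """Eq. (1): L = K * delta^alpha, returned in linear scale."""
    return db_to_linear(k_db) * distance_km ** alpha


def required_received_power(bits, period_s, bandwidth_hz,
                            noise_density_w_hz, interference_w=0.0):
    """Eq. (3): P_r = (2^(D / (T * B)) - 1) * (P_I + N_0 * B), in watts."""
    noise_plus_interference = interference_w + noise_density_w_hz * bandwidth_hz
    return (2.0 ** (bits / (period_s * bandwidth_hz)) - 1.0) * noise_plus_interference


def achievable_bits(received_w, period_s, bandwidth_hz,
                    noise_density_w_hz, interference_w=0.0):
    """Eq. (2): D = T * B * log2(1 + P_r / (P_I + N_0 * B))."""
    noise_plus_interference = interference_w + noise_density_w_hz * bandwidth_hz
    return period_s * bandwidth_hz * math.log2(1.0 + received_w / noise_plus_interference)


def required_transmit_power(received_w: float, loss_linear: float) -> float:
    """Invert P_r = P_t / L to obtain P_t = P_r * L."""
    return received_w * loss_linear


if __name__ == "__main__":
    n0 = db_to_linear(-204.0)       # noise density, -204 dBW/Hz (Table 1)
    bandwidth = 180e3               # 180 kHz (Table 1)
    period = 1.0                    # 1 s (Table 1)
    payload_bits = 30_000           # hypothetical payload, not from the paper
    distance_km = 0.1               # hypothetical NB-IoT hop length

    p_r = required_received_power(payload_bits, period, bandwidth, n0)
    loss = propagation_loss(128.1, 3.76, distance_km)  # NB-IoT: K and alpha from Table 1
    print("required P_r [W]:", p_r)
    print("required P_t [W]:", required_transmit_power(p_r, loss))
```

Because the connection of type (b) is noise limited, the interference argument is left at its default of zero in the usage example; for types (a) and (c) a nonzero `interference_w` would be supplied.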


Figure 2: Decision and optimisation processes in a two-stage approach to optimise four performance criteria: energy, response time, security, and cost.

Figure 3: Uplink delay model capturing the factors affecting both processing and transmission delays over any hop in our system.

3.2. Energy Consumption Model. There are two major processes that consume energy in an IoT network: wireless transmission and task computation. The energy consumption of the former is $E_t$ and that of the latter is $E_p$; thus, the total energy consumption is the sum of both. Depending on the route of communication taken by the device, the energy consumed due to transmission can be the result of either one hop using NB-IoT ($E_{tb}$) or two hops using WiFi for the first link and LTE for the second ($E_{ta} + E_{tc}$). The energy consumed for processing the task is a function of the data rate requirement of device $d$, $\theta_d$, and the processing energy per unit data rate of the processor, $E_{pi}$, $\forall i \in \{d, f, c\}$ (see Table 1), and is expressed as $E_p = \theta \cdot E_{pi}$.

3.3. Response Time Model. The response time perceived by the IoT device is the combination of the uplink and downlink delays between the IoT device and the server. In this work, the uplink delay is modelled, while the downlink delay is assumed to be the same for all devices.

The uplink delay is caused by two phenomena: task processing (processing delay, $t_p$) and data transmission (transmission delay, $t_t$). The processing delay depends on the processor's computational power, which is measured in the number of computational cycles per data element ($\eta$); i.e., the higher $\eta$, the lower the computational power. Naturally, a server has higher computational power than a small gateway and much higher than a simple IoT device ($\eta_c < \eta_f < \eta_d$). Thus, in this work, $t_p$ is modelled based on the computational powers of the processing locations: $t_{pd} = 10 \times t_{pf} = 100 \times t_{pc}$. In addition, while the input to the task processing stage is large raw data, the output is compressed data with comparably less volume. To that end, the compression rate between the input and output data volumes is given as $C$: $D_r = C \cdot D_p$, where $D_r$ and $D_p$ are the volumes of raw and processed (compressed) data, respectively.

The transmission delay is affected by the type of radio access technology and the volume of data to be transmitted. Since WiFi access employs the unlicensed frequency bands, it often suffers from frequent collisions and hence higher retransmission rates, which result in increased transmission delays. Therefore, in this work, this effect is captured by the factor $F > 1$, whereby the delay incurred for transmitting the same volume of data over WiFi is $F$ times higher than that over LTE or NB-IoT: $t_{ta} = F \cdot t_{tb} = F \cdot t_{tc}$. This model is represented in Figure 3, in which the source could be either the IoT device or the gateway, and the recipient could be either the gateway or the cloud.

Consequently, the overall response time for each action is calculated, for $C = 200$ and $F = 2$, as follows:

$$R = t_p + \sum_{i=1}^{N_h} t_{ti} \cdot D_i \tag{4}$$


Table 1: System model parameters and simulation values.

| Parameter | Value | Description |
|---|---|---|
| $r_n$ | 200 m | eNB cell radius |
| $r_w$ | 30 m | WiFi cell radius |
| $N_G$ | 10 | Number of IoT devices per gateway |
| $\chi_d$ | 30 Kbps | Computational capacity (device) |
| $\chi_f$ | $10^2$ Kbps | Computational capacity (fog) |
| $\chi_c$ | $10^3$ Kbps | Computational capacity (cloud) |
| $\epsilon$ | $5 \times 10^{-9}$ Joule | Energy consumption per computational cycle |
| $\eta_d$ | $10^2$ | Required amount of computational cycles per data element (device) |
| $\eta_f$ | 10 | Required amount of computational cycles per data element (fog) |
| $\eta_c$ | 1 | Required amount of computational cycles per data element (cloud) |
| $N_0$ | $-204$ dBW/Hz | Noise density |
| $B$ | 180 kHz | Bandwidth |
| $P_{td}$ | $10^{-8}$ W | Average transmit power of the IoT devices in the gateways 2, 3, 4, and 5 |
| $T$ | 1 s | Time period |
| $\lambda$ | 0.5 | $Q$-table update parameter |
| $\phi$ | 0.9 | $Q$-table update parameter |
| $\varepsilon_1$ | 0.8 | Action selection parameter for Stage 1 |
| $\varepsilon_2$ | $10^4$ | Action selection parameter for Stage 2 |
| $\rho$ | 0.8 | Decaying rate for $\varepsilon_1$ and $\varepsilon_2$ |
| $S$ | 8 | Number of bits in each data element |
| $\vartheta$ | $10^3/S$ | Conversion of kbps data rates to number of data elements |
| $E_{pd}$ | $\epsilon \cdot \eta_d \cdot \lambda$ | Data processing energy consumption per data rate in kbps (device) |
| $E_{pf}$ | $\epsilon \cdot \eta_f \cdot \lambda$ | Data processing energy consumption per data rate in kbps (fog) |
| $E_{pc}$ | $\epsilon \cdot \eta_c \cdot \lambda$ | Data processing energy consumption per data rate in kbps (cloud) |
| $\Gamma_d$ | $10^{-4}$ | Cost of processing per kbps (device) |
| $\Gamma_f$ | $10^{-1}$ | Cost of processing per kbps (fog) |
| $\Gamma_c$ | 1 | Cost of processing per kbps (cloud) |
| $b$ | 20 | Budget |
| $\beta_1$ | $10^2$ | Constant coefficient for penalty comparison |
| $\beta_2$ | $10^{12}$ | Constant coefficient for penalty comparison |
| $K_w = K_l = K_n$ | 128.1 dB | Propagation loss constant for all wireless connection types (a, b, and c) |
| $\alpha_l = \alpha_n$ | 3.76 | Propagation loss exponent for NB-IoT and LTE wireless connection types (b and c) |
| $\alpha_w$ | 3 | Propagation loss exponent for Wi-Fi (802.11g) wireless connection type (a) |

where $N_h \in \{1, 2\}$ is the number of hops and $D \in \{D_r, D_p\}$. In addition, $t_{ti}$ and $D_i$ represent the values of $t_t$ and $D$ for the $i$th hop, respectively. The calculated values then populate Table 2 after feature scaling into the range $[0, 1]$ using the function

$$f(x) = \frac{x - \min(X)}{\max(X) - \min(X)} \tag{5}$$

where $X$ is the set of all values $x$. Note that both type (a) and type (b) connections constitute the first hop, while connection type (c) is the second hop.
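As a worked illustration of (4) and (5), the sketch below evaluates the response time of the five connectivity/processing combinations and scales the results. Several inputs are assumptions made only for this example: the base delays $t_{pc}$ and $t_{tb} = t_{tc}$ are both set to one unit, the processed volume $D_p$ is set to one unit, and each hop is assumed to carry raw data before the processing location and processed data after it. With these assumed values, the scaled response times come out as 0.004, 0.62, 1, 0, and 0.2 for actions $A_1$ to $A_5$, which agrees with the first entries of the tuples in Table 2.

```python
C = 200  # compression rate, D_r = C * D_p
F = 2    # WiFi delay factor, t_ta = F * t_tb = F * t_tc

# Assumed base units (not specified in the text): one unit of processing
# delay in the cloud, one unit of delay per unit volume over LTE/NB-IoT,
# and one unit of processed data.
T_PC = 1.0
T_T = 1.0
D_P = 1.0
D_R = C * D_P

# Processing delays: t_pd = 10 * t_pf = 100 * t_pc
T_P = {"device": 100 * T_PC, "fog": 10 * T_PC, "cloud": T_PC}
# Per-unit-volume transmission delay for hop types (a) WiFi, (b) NB-IoT, (c) LTE
T_HOP = {"a": F * T_T, "b": T_T, "c": T_T}

# For each action: processing location and the (hop type, data volume) pairs.
# The volume carried on each hop is an assumption: raw data before the
# processing location, processed data after it.
ACTIONS = {
    "A1": ("device", [("a", D_P), ("c", D_P)]),
    "A2": ("fog", [("a", D_R), ("c", D_P)]),
    "A3": ("cloud", [("a", D_R), ("c", D_R)]),
    "A4": ("device", [("b", D_P)]),
    "A5": ("cloud", [("b", D_R)]),
}


def response_time(processor, hops):
    """Eq. (4): R = t_p + sum over hops of t_t,i * D_i."""
    return T_P[processor] + sum(T_HOP[hop] * volume for hop, volume in hops)


def min_max_scale(values):
    """Eq. (5): f(x) = (x - min(X)) / (max(X) - min(X))."""
    lowest, highest = min(values), max(values)
    return [(value - lowest) / (highest - lowest) for value in values]


RAW = {name: response_time(*spec) for name, spec in ACTIONS.items()}
SCALED = dict(zip(RAW, min_max_scale(list(RAW.values()))))

if __name__ == "__main__":
    for name in ACTIONS:
        print(name, RAW[name], SCALED[name])
```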

4. Machine Learning-Based Solution

In this work, we propose to employ reinforcement learning (RL), a machine learning technique based on a goal-seeking approach. It is a trial-and-error approach in which the agent (or learning device) learns to take the correct action by interacting with its surroundings and being rewarded or penalised in each iteration. RL is selected in this work because it maps well onto the presented problem: IoT devices need to interact with their environment in order to assess the circumstances and to take subsequent actions, namely, determining the connection type and the data processing location, and RL allows optimisation through exactly such environmental interactions.

Being one of the most prominent reinforcement learning techniques, $Q$-learning aims to find the optimum policy for a given problem, that is, the best action to take at any given state. To do this, the agent takes an action and evaluates the subsequent reward/cost of taking that action, given that it was in a certain state. This reward/cost is then used to update a look-up table, known as the $Q$-table, which is later utilised by the agent to select the best action. Further, the agent calculates the $Q$-value for every possible state/action pair. Therefore, a simple implementation can result in the agent learning online the best actions regardless of the policy.

Table 2: Stage one action list.

| Action | Connection | Processor | Tuple |
|---|---|---|---|
| $A_1$ | Wi-Fi | Device | $A_1 = [0.004,\ 1,\ \chi_d,\ (E_{ta} + E_{tc} + E_{pd} \cdot \theta),\ \Gamma_d]$ |
| $A_2$ | Wi-Fi | Fog | $A_2 = [0.62,\ 1,\ \chi_f,\ (E_{ta} + E_{tc} + E_{pf} \cdot \theta),\ \Gamma_f]$ |
| $A_3$ | Wi-Fi | Cloud | $A_3 = [1,\ 1,\ \chi_c,\ (E_{ta} + E_{tc} + E_{pc} \cdot \theta),\ \Gamma_c]$ |
| $A_4$ | NB-IoT | Device | $A_4 = [0,\ 0,\ \chi_d,\ (E_{tb} + E_{pd} \cdot \theta),\ \Gamma_d]$ |
| $A_5$ | NB-IoT | Cloud | $A_5 = [0.2,\ 0,\ \chi_c,\ (E_{tb} + E_{pc} \cdot \theta),\ \Gamma_c]$ |

Moreover, $Q$-learning offers two key features which enable an efficient solution to our problem. First, as it is a model-free learning approach [21, 22], it is (1) capable of operating in dynamically changing environments and (2) a low-complexity algorithm which does not require a lot of power, thus reducing the energy consumption of the IoT network. Second, $Q$-learning is known to converge in most cases [23], which has also been demonstrated in multiagent noncooperative environments [24], such as IoT networks.

We propose a two-stage approach to solve the energy-aware smart IoT connectivity problem, where each of the stages employs $Q$-learning.

4.1. First Stage Learning. Stage 1 consists of learning the best combination of connectivity and processing location in view of the device and application requirements and the limitations offered by each of these options. Thus, there are five possible actions that may be taken by each device, as described in Table 2. As a side note, all the variables in Table 2 are feature-scaled values (into the range $[0, 1]$) calculated through (5). The tuples shown represent the limitations of each action, e.g., $A_i = [R, \Sigma, \chi_l, E_t + E_p, \Gamma_l]$, where $R$ and $E_t + E_p$ are described in Sections 3.3 and 3.2, respectively, $\chi_l$ is the available processing capacity, and $\Gamma_l$ is the processing cost, with $l \in \{d, f, c\}$ as defined in Table 1. The parameter $\Sigma \in \{1, 2\}$ refers to the level of data security offered by the wireless technology, whereby the value 1 indicates eSIM protection (only provided by NB-IoT) and 2 the absence of it. Moreover, each device may be in one of four different states, as shown in Table 3, depending on the context-aware constraints defined jointly by the device and application. These constraints are $R'$, $\Sigma'$, and $\chi'$, which represent the response time, security level, and computational power requirements, respectively.
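One possible way to represent the action tuples and to derive the Stage 1 state from the three context-aware constraints is sketched below. The class and function names are hypothetical, and the direction of each comparison is an assumption: a constraint is taken to be satisfied when the offered response time does not exceed $R'$, when the offered security value does not exceed $\Sigma'$ (lower values meaning better protection), and when the available capacity is at least $\chi'$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionTuple:
    """A_i = [R, Sigma, chi_l, E_t + E_p, Gamma_l], all feature scaled."""
    response_time: float  # R
    security: float       # Sigma (assumed: lower value = better protected)
    capacity: float       # chi_l
    energy: float         # E_t + E_p
    cost: float           # Gamma_l


@dataclass(frozen=True)
class Constraints:
    """Context-aware requirements R', Sigma', chi'."""
    max_response_time: float
    required_security: float
    required_capacity: float


def satisfied_constraints(action: ActionTuple, required: Constraints) -> int:
    """Count how many of the three constraints the action satisfies (assumed tests)."""
    checks = (
        action.response_time <= required.max_response_time,
        action.security <= required.required_security,
        action.capacity >= required.required_capacity,
    )
    return sum(checks)


def stage_one_state(action: ActionTuple, required: Constraints) -> int:
    """Map 0..3 satisfied constraints to the state index 1..4 (sigma_1..sigma_4)."""
    return satisfied_constraints(action, required) + 1


if __name__ == "__main__":
    # Hypothetical scaled values, for illustration only.
    nb_iot_cloud = ActionTuple(0.2, 0.0, 1.0, 0.3, 1.0)
    wifi_device = ActionTuple(0.004, 1.0, 0.03, 0.8, 0.0001)
    needs = Constraints(max_response_time=0.5, required_security=0.0, required_capacity=0.5)
    print(stage_one_state(nb_iot_cloud, needs), stage_one_state(wifi_device, needs))
```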

4.1.1. Penalty Function Determination. Each device estimates the penalty function associated with each possible action it is able to take, following the system shown in Table 3, where $\varphi_p \in \{R - R', \Sigma - \Sigma', \chi - \chi'\}$, $p = 1, 2, 3$, is the difference between the available and required characteristics. The fourth penalty is $\varphi_4 = \chi' \cdot A_i^{(5)} - b$, where $A_i^{(5)}$ is the fifth element of the $i$th action and the parameter $b$ is the available budget.

The penalty function determination policy aims to satisfy the optimisation objective by including the elements that are to be minimised. As seen from Table 3, the penalty functions consist of three main elements: a constant term, the dissatisfaction level, and the energy consumption. The constant value is the cost of being in a given state, and it decreases as the level of the state increases. This element compels the agent to try to reach the highest possible state, as this is one of the objectives of the optimisation problem. The dissatisfaction level, which supports the constant value, incurs a cost for not satisfying the device requirements in order to improve the satisfaction levels. Lastly, the energy consumption element drives the minimisation of the end-to-end energy consumption (connection and data processing). The parameter $0 \le \nu \le 1$ is the battery level, where 0 represents an empty battery and 1 represents a full charge. In the expressions in Table 3, the parameter $\zeta$ specifies the priority level of the energy consumption. For instance, low values of $\zeta$ prioritise the energy consumption only once the battery level $\nu$ is very low (e.g., 5%), while high values prioritise the energy consumption even when the battery level is high (e.g., 50%).
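The interplay between $\zeta$ and $\nu$ can be illustrated with a short sketch. It assumes that the energy element of the penalty is weighted by a factor of the form $10^{\zeta/\nu}$; this functional form and the function name are assumptions made for illustration, chosen because they reproduce the qualitative behaviour described above (the weight grows as the battery drains and as $\zeta$ increases).

```python
def battery_weight(zeta: float, battery_level: float) -> float:
    """Assumed battery-dependent weight 10^(zeta / nu) on the energy element.

    battery_level (nu) must lie in (0, 1]; an empty battery is excluded
    because the assumed form is undefined at nu = 0.
    """
    if not 0.0 < battery_level <= 1.0:
        raise ValueError("battery_level must be in (0, 1]")
    return 10.0 ** (zeta / battery_level)


if __name__ == "__main__":
    for zeta in (0.1, 0.5, 1.2):
        weights = [battery_weight(zeta, nu) for nu in (0.05, 0.1, 0.5, 1.0)]
        print(zeta, weights)
```

Under this assumed form, a small $\zeta$ yields a large weight only at very low battery levels, whereas a large $\zeta$ keeps the weight large across most of the battery range.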

In addition, the algorithm normally tends to select an option with cloud processing, as it is the most energy-efficient one. However, some amount of data will not be offloaded due to budget constraints and will then be processed locally, which is the most energy-consuming option. Note that this amount is evaluated by the second stage of learning. Thus, the option selected by the first stage could be more energy consuming than the option that includes fog processing, as the processing would be a combination of cloud and device processing. Therefore, the last parts of the penalty functions (inside the square brackets) prevent the algorithm from making blind decisions that ignore the budget availability by including the average energy consumption of the actions with device processing. The reason for taking the average value is that the final action is yet to be taken during the learning process. The coefficients of these three elements are determined empirically; however, they can be used to prioritise any element that is desired to be minimised more.

The $Q$-table entries are then updated according to the following expression, where $s$, $s'$, $P$, and $a$ are the current state, next state, penalty function, and action under evaluation, respectively:

$$Q(s, a) \leftarrow Q(s, a) + \lambda \left(P(s) + \phi \min\left(Q(s', a)\right) - Q(s, a)\right) \tag{6}$$
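The update in (6) can be written as a small function over a dictionary-backed $Q$-table. This is a minimal sketch: the function name, the zero initialisation of unseen entries, and the reading of the minimum as being taken over all actions available in the next state are assumptions, while $\lambda = 0.5$ and $\phi = 0.9$ are the values listed in Table 1.

```python
from collections import defaultdict

LAMBDA = 0.5  # Q-table update parameter lambda (Table 1)
PHI = 0.9     # Q-table update parameter phi (Table 1)


def q_update(q_table, state, action, penalty, next_state, actions,
             lam=LAMBDA, phi=PHI):
    """Apply Eq. (6) in place and return the new Q(s, a).

    Penalties are minimised, so the look-ahead term uses the smallest
    Q-value over the actions available in the next state.
    """
    best_next = min(q_table[(next_state, a)] for a in actions)
    old = q_table[(state, action)]
    q_table[(state, action)] = old + lam * (penalty + phi * best_next - old)
    return q_table[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, action) pairs start at zero (assumption)
    actions = ["A1", "A2", "A3", "A4", "A5"]
    print(q_update(q, state=1, action="A5", penalty=10.0, next_state=2, actions=actions))
```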


Table 3: List of possible states of each device in Stage one and corresponding penalty calculation. The energy term $\Omega_i$, which is common to all four penalty functions, is

$$\Omega_i = 10^{\zeta/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$$

| State | Description | Penalty function ($P$) |
|---|---|---|
| $\sigma_1$ | None of the constraints are satisfied | $10^4 + \sum_{p=1}^{3} \varphi_p + \Omega_i$ |
| $\sigma_2$ | One constraint is satisfied | $5 \times 10^3 + 0.8 \sum_{p=1}^{3} \varphi_p + \Omega_i$ |
| $\sigma_3$ | Two constraints are satisfied | $2 \times 10^3 + 0.6 \sum_{p=1}^{3} \varphi_p + \Omega_i$ |
| $\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + \Omega_i$ |

Table 4: List of possible states of each device in Stage two and corresponding penalty calculation.

| State | Description | Penalty function |
|---|---|---|
| 1 | No availability in cloud or fog for $\chi'$ | $10^3 + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^3 + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 3 | Enough availability and budget for $\chi'$ | $\beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |

4.2. Second Stage Learning. The second stage aims to find the best policy for task offloading by considering the budget and the availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., when it is neither $A_1$ nor $A_4$). In Stage 2, $Q$-learning is also employed, with 21 possible actions, $[0, 0.05, \ldots, 1]$, and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.

4.2.1. Penalty Function Determination. The penalty function of this stage is determined with a procedure similar to that of the first stage; here, there are three cost elements: a constant term, the energy consumption, and the monetary cost. As in the first stage, the constant value steers the agent towards the highest possible state. Including the energy consumption and monetary cost elements simultaneously makes it possible to find the best trade-off between the two. However, unlike in the first stage, these elements are calculated for the portion of data that is planned to be transferred, as determining the best amount is the objective of this stage of learning. Similarly, the coefficients are obtained empirically.
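The Stage 2 action set and the penalty structure of Table 4 can be sketched as follows. The function and variable names are hypothetical, and the action value $i$ is interpreted here as the share of the data that is offloaded, which is an assumption; the constants $10^3$ and $\beta_2 = 10^{12}$ follow Tables 4 and 1.

```python
BETA_2 = 1e12  # constant coefficient beta_2 (Table 1)

# Constant term per Stage 2 state (Table 4): states 1 and 2 carry 10^3, state 3 none.
STATE_CONSTANT = {1: 1e3, 2: 1e3, 3: 0.0}

# The 21 Stage 2 actions [0, 0.05, ..., 1].
OFFLOAD_SHARES = [round(k * 0.05, 2) for k in range(21)]


def stage_two_penalty(state, share, energy, unit_cost, required_capacity,
                      beta_2=BETA_2):
    """Penalty of Table 4: const + beta_2 * (A4 * (1 - i) + chi' * i * A5).

    energy is the fourth tuple element A_i^(4), unit_cost the fifth element
    A_i^(5), and share the action value i (assumed to be the offloaded share).
    """
    trade_off = energy * (1.0 - share) + required_capacity * share * unit_cost
    return STATE_CONSTANT[state] + beta_2 * trade_off


if __name__ == "__main__":
    # Hypothetical inputs: evaluate every share for a device in state 3.
    penalties = {s: stage_two_penalty(3, s, energy=4e-12, unit_cost=1e-12,
                                      required_capacity=3.0)
                 for s in OFFLOAD_SHARES}
    print(min(penalties, key=penalties.get))
```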

The learning processes of Stage 1 and Stage 2, together with the interaction between them, are depicted in Algorithms 1 and 2, respectively.

5. Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted and, in each, random battery levels are allocated to each of the devices.

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out of budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of the data processing locations, such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, the number of out of budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out of budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100 \tag{7}$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A–F and $Q$-learning, respectively.

On the other hand, the gain/loss values for the number of out of budget devices in Figure 5 are calculated using the function

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100 \tag{8}$$
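A short sketch of (7) and (8) is given below, applied to the joint cost column of Table 5. The function and dictionary names are hypothetical; the numerical inputs are the Table 5 averages. Because those averages are rounded, the computed gains match the percentages quoted in the text only to within rounding.

```python
# Joint cost column of Table 5 (average values).
JOINT_COST = {
    "Q-Learning": 0.7822,
    "Scenario A": 1.5323,
    "Scenario B": 2.0679,
    "Scenario C": 1.2399,
    "Scenario D": 1.7756,
    "Scenario E": 2.4643,
    "Scenario F": 3.0000,
}

N_G = 10  # number of IoT devices per gateway (Table 1)


def gain_percent(scenario_value: float, q_learning_value: float) -> float:
    """Eq. (7): g = (p_s - p_q) / p_q * 100."""
    return (scenario_value - q_learning_value) / q_learning_value * 100.0


def out_of_budget_percent(out_of_budget_devices: float, n_g: int = N_G) -> float:
    """Eq. (8): o = (out of budget devices) / N_G * 100."""
    return out_of_budget_devices / n_g * 100.0


if __name__ == "__main__":
    p_q = JOINT_COST["Q-Learning"]
    for name, p_s in JOINT_COST.items():
        if name != "Q-Learning":
            print(f"{name}: {gain_percent(p_s, p_q):.2f}%")
    print(f"Out of budget (3.03 devices): {out_of_budget_percent(3.03):.1f}%")
```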

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, which are reported along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters (energy consumption, cost, dissatisfaction, and number of out of budget devices). However, before the summation, these four parameters are feature scaled into the range $[0, 1]$ using the function in (5) in order to keep their impacts on the same scale.

Figure 4: Sample snapshot of the simulation environment (LTE eNB, WiFi gateways 1 to 5, and IoT devices). IoT devices are located randomly, while the positions of the gateways are fixed.

Algorithm 1: First stage learning.
Data: Context-aware constraints, available computational capacity in gateway and eNB, budget
Result: Combination of connectivity route and processing venue
initialization
for all IoT devices do
    Determine the current state using Table 3
    Evaluate all the actions
    Calculate the penalty using Table 3
    Select the best action
    Jump to the next state
    Update the $Q$-table
    if the selected action includes fog (gateway) or cloud (eNB) processing then
        go to Algorithm 2
    end
end

Algorithm 2: Second stage learning.
Data: Action selected by the first stage, available computational capacity in gateway and eNB, budget
Result: Share of data to be offloaded
initialization
for all IoT devices do
    Determine the current state using Table 4
    Evaluate all the actions
    Calculate the penalty using Table 4
    Select the best action
    Jump to the next state
    Update the $Q$-table
end
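The flow of Algorithms 1 and 2 can be sketched for a single device as follows. This is a simplified illustration, not the simulation code behind the reported results: the class and function names, the epsilon-greedy action selection with multiplicative decay $\rho$, the toy single-state environments, and the choice to run Stage 1 to completion before starting Stage 2 are all assumptions. Only $\lambda$, $\phi$, $\rho$, $\varepsilon_1$, the five Stage 1 actions, and the 21 Stage 2 actions are taken from the text.

```python
import random
from collections import defaultdict

LAMBDA, PHI, RHO = 0.5, 0.9, 0.8  # Table 1: update parameters and decay rate
STAGE1_ACTIONS = ["A1", "A2", "A3", "A4", "A5"]
LOCAL_ACTIONS = {"A1", "A4"}      # device processing, so Stage 2 is skipped
STAGE2_ACTIONS = [round(k * 0.05, 2) for k in range(21)]


class QAgent:
    """Tabular Q-learning agent that minimises a penalty."""

    def __init__(self, actions, epsilon, rng):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = rng
        self.q = defaultdict(float)

    def select(self, state):
        # Assumed epsilon-greedy rule: explore with probability epsilon,
        # otherwise pick the action with the lowest Q-value.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return min(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, penalty, next_state):
        # Eq. (6), followed by the decay of the exploration parameter.
        best_next = min(self.q[(next_state, a)] for a in self.actions)
        old = self.q[(state, action)]
        self.q[(state, action)] = old + LAMBDA * (penalty + PHI * best_next - old)
        self.epsilon *= RHO


def learn(agent, state_fn, penalty_fn, iterations=100):
    """Loop shared by both stages: select, observe, update; return the greedy action."""
    state = state_fn(None)
    for _ in range(iterations):
        action = agent.select(state)
        next_state = state_fn(action)
        agent.update(state, action, penalty_fn(state, action), next_state)
        state = next_state
    return min(agent.actions, key=lambda a: agent.q[(state, a)])


def two_stage_decision(stage1_env, stage2_env, rng, eps1=0.8, eps2=0.8):
    """Stage 1 picks connectivity and processor; Stage 2 runs only for fog or cloud."""
    action = learn(QAgent(STAGE1_ACTIONS, eps1, rng), *stage1_env)
    if action in LOCAL_ACTIONS:
        return action, None
    share = learn(QAgent(STAGE2_ACTIONS, eps2, rng), *stage2_env)
    return action, share


# Toy single-state environments with made-up penalties (illustration only).
TOY_PENALTY = {"A1": 5.0, "A2": 3.0, "A3": 4.0, "A4": 6.0, "A5": 1.0}
TOY_STAGE1 = (lambda action: 0, lambda state, action: TOY_PENALTY[action])
TOY_STAGE2 = (lambda action: 0, lambda state, share: abs(share - 0.6))

if __name__ == "__main__":
    print(two_stage_decision(TOY_STAGE1, TOY_STAGE2, random.Random(0)))
```

In a fuller implementation the two environments would be replaced by the state and penalty definitions of Tables 3 and 4, and the Stage 2 exploration parameter would be set to the value of $\varepsilon_2$ listed in Table 1.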

Our method outperforms any fixed combination when examining the joint or holistic gain, with values ranging from 95.9% to 283.54%. Similarly, the reinforcement learning technique results in better matching between the context-aware constraints and the availability of the IoT network compared to any other scenario, with gains varying from 183.33% to 344.44%. Although the processing cost of our proposed method is higher than that of Scenario A, the resulting gain in energy saving is even more important, as is the context-aware constraint compliance. The closest contender to reinforcement learning with respect to the generated results is Scenario C, in which the processing of Group A IoT devices is conducted locally, while that of Group B occurs in the gateway. Nonetheless, reinforcement learning allows for a device-driven context-aware connectivity that improves the compliance criteria by more than two times while saving 43.22% of energy, resulting in a holistic gain of 58.52%. Scenario D manages to reduce the energy consumption more than our proposed approach at the same total cost; however, 30.3% of the devices are out of budget, resulting in incomplete or interrupted computational tasks.


Table 5: Results on various metrics for $Q$-learning and the scenarios.

| | Energy Consumption (mJ) | Cost | Dissatisfaction | Out of Budget Devices | Joint Cost |
|---|---|---|---|---|---|
| Q-Learning | 5.69 ± 0.322 | 96.77 ± 4.01 | 1.8 ± 0.291 | 0 ± 0 | 0.7822 |
| Scenario A | 14.88 ± 0.385 | 0.24 ± 6.15e−3 | 5.1 ± 0.28 | 0 ± 0 | 1.5323 |
| Scenario B | 7.55 ± 0.24 | 118.57 ± 4.49 | 5.29 ± 0.181 | 3.03 ± 0.217 | 2.0679 |
| Scenario C | 8.16 ± 0.284 | 12.07 ± 0.383 | 5.81 ± 0.208 | 0 ± 0 | 1.2399 |
| Scenario D | 0.83 ± 0.025 | 130.41 ± 4.54 | 6 ± 0 | 3.03 ± 0.217 | 1.7756 |
| Scenario E | 7.48 ± 0.281 | 119.68 ± 3.83 | 7.81 ± 0.208 | 2.97 ± 0.213 | 2.4643 |
| Scenario F | 0.15 ± 4.59e−3 | 238.02 ± 6.16 | 8 ± 0 | 6 ± 0.339 | 3.0000 |

Table 6: List of fixed scenarios with connection types and locations of data processing.

| Scenario | Group A (NB-IoT) | Group B (WiFi) |
|---|---|---|
| A | Device | Device |
| B | Cloud | Device |
| C | Device | Fog |
| D | Cloud | Fog |
| E | Device | Cloud |
| F | Cloud | Cloud |

Figure 5: Summary of results for $\zeta = 0.1$: gain (%) of $Q$-learning over Scenarios A to F in terms of total energy (mJ), total cost, total dissatisfaction, out of budget devices, and joint penalty. Positive and negative values reflect gain and loss, respectively. A gain occurs when $Q$-learning is better than the scenario, and a loss occurs when the scenario is better than $Q$-learning.

Moreover, in Scenario D, connected devices are more than two times more likely to be dissatisfied with one or more of the context-aware requirements.

Next, we examine the impact of the battery priority factor $\zeta$ on the energy efficiency. As shown in Figure 6, low values of $\zeta$ result in almost neglecting the battery life of the device in the optimisation process until it drops below 10%. Very high values of $\zeta$ prioritise the reduction of energy consumption for all devices except those that have more than 70% battery life. To this end, it is possible to tune this parameter depending on the scenario at hand and in a device-specific manner. For instance, some devices may be part of a moving vehicle with the possibility of agile and low-cost battery replenishment. Such devices may benefit from low settings of $\zeta$ to allow more flexibility in meeting the remaining constraints. Other devices may be in hard-to-reach places and would require skilled labour and special equipment, and hence a high cost, to replace a dead battery. In this case, higher settings of $\zeta$ are more suitable and would result in a better cost-to-quality ratio.

Figure 6: Impact of the energy prioritisation factor $\zeta$: energy consumption ($\times 10^{-3}$ J) versus battery level (%) for $\zeta$ = 0.1, 0.3, 0.5, 0.9, and 1.2.

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not attainable with fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence, which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, the $Q$-learning-based method selects the best action (connection type/processing location pair in the first stage and amount of data to be transmitted in the second stage) after convergence. We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, this work is undoubtedly the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power, as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement obtained by employing the novel scheme, and highlight its effects on the application.

6. Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund (the DARE Project, EP/P028764/1). The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., “Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap,” IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, “The internet of things: a survey,” Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, “Measurement-based signaling management strategies for cellular IoT,” IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, “Internet of things for smart ports: technologies and challenges,” IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, “3GPP low power wide area technologies,” GSMA White Paper, Oct. 2016.

[6] 3GPP, “Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements,” 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, “The menu at the IoT cafe: a guide to IoT wireless technologies,” Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, “A survey on the challenges and opportunities of the Internet of Things (IoT),” in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, “Device-centric energy optimization for edge cloud offloading,” in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, “Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing,” in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, “NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications,” in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, “Smart connectivity for internet of things (IoT) applications,” in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, “M2M data aggregation over cellular networks: signaling-delay trade-offs,” in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, “Joint service caching and task offloading for mobile edge computing in dense networks,” ArXiv e-prints 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, “Efficient C-RAN random access for IoT devices: learning links via recommendation systems,” ArXiv e-prints 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: deep learning for the internet of things with edge computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, “Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data,” in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, “The edge cloud: a holistic view of communication, computation and caching,” ArXiv e-prints 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, “Edge-caching wireless networks: performance analysis and optimization,” IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, “Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz,” International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.


[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: a survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, “Predictive representations can link model-based reinforcement learning to model-free mechanisms,” PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, “Convergence of optimistic and incremental Q-learning,” in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, “A distributed SON-based user-centric backhaul provisioning scheme,” IEEE Access, vol. 4, pp. 2314–2330, 2016.



networks. They also attempt to increase the amount of edge tasks by considering the edge capacity constraints. An open-source database is designed in [17] for the edge computation of Industrial IoT (IIoT) networks. The authors use a time-series analysis to predict the condition of IIoT machines in order to decrease the number of condition reports sent to the cloud. A holistic view of communication, computation, and caching is presented in [18], using graph-based representations as learning methods for innovative resource allocation techniques. The performance of edge caching, together with its energy efficiency and delivery time, is investigated in [19] under quality of service (QoS) constraints.

In this work, we employ machine learning techniques based on reinforcement learning in order to manage multiple optimisation objectives jointly and to dynamically identify the best connection and route for each device. We identify four key quality features that dominate IoT applications in general and smart ports in particular: security, energy, latency, and cost. This work is the first to address these multiple IoT optimisation objectives jointly using reinforcement learning. We compare our novel approach to the state-of-the-art connectivity solutions and demonstrate significant gains in all aspects (ranging from 95.9% to 283.54%). Moreover, our approach is the only one that is able to meet the context-aware requirements fully while minimising the cost and the energy consumption. The advantage of the adopted machine learning scheme is primarily its low complexity and its ability to optimise in a dynamic environment such as a smart port.

The rest of the paper is organised as follows. In Section 3, we define the system model of our research. In Section 4, we present our novel machine-learning-based solution for solving the multiobjective problem. Section 5 elaborates the results and analysis, and in Section 6 we conclude the article.

3. System Model

The energy-aware smart connectivity approach proposed in this work applies to any IoT network with diverse options of connectivity and processing. For the sake of clarity in the presentation, we build the system model around a smart port scenario, such as the one shown in Figure 1. All IoT devices are battery operated and have different battery lives. They all have some processing power to perform basic tasks and can offload a task either to the gateway (or fog), i.e., the WiFi access point, or to the evolved node B (eNB, or cloud).

Differently from the state-of-the-art research, we propose to decide simultaneously on the best connectivity and the best location for processing the tasks by jointly optimising energy, response time, security, and cost. A two-stage approach, which describes the decision and optimisation processes, is presented in Figure 2. It is assumed that every IoT device is controlled by a given application and that they jointly determine the context-aware constraints. Each combination of connectivity option and processing location offers specific characteristics and limitations. Stage 1 consists of optimising these decisions based on the context-aware constraints, while Stage 2 refines the trade-off between energy consumption and cost. In the following paragraphs, we describe the models adopted to capture the propagation loss, energy consumption, and response time for the proposed system. Table 1 lists all the parameters that are pertinent to our simulations.

Figure 1: Smart port diagram with two overlapping networks: NB-IoT and WiFi. WiFi access points use LTE for backhauling. All IoT devices are capable of both wireless technologies.

3.1. Propagation Model. There are three wireless connections that require modelling: (a) Device-to-Gateway (WiFi), (b) Device-to-eNB (NB-IoT), and (c) Gateway-to-eNB (LTE). Connections (a) and (c) are often interference limited, as the employed spectrum is likely to be shared by other neighbouring connections. Connections of type (b) are, however, considered to be noise limited, as we assume that there are no other eNBs in the surroundings employing NB-IoT technology. The objective of the propagation modelling is to determine the transmission power required to cater for each of the wireless connection types; the energy consumption is then calculated accordingly. We start with the propagation loss $L$, which is modelled as a function of two technology-specific parameters, the propagation constant $K$ and the propagation exponent $\alpha$, and the distance of the wireless hop $\delta$, measured in km, as shown below:

$$L = K \cdot \delta^{\alpha}. \tag{1}$$

Moreover, the probability of having line of sight between the device and the gateway is much higher than in the case of the other types of wireless connections; hence, the propagation loss per decade is less [20]. On the other hand, NB-IoT connections suffer the same propagation loss per decade as LTE links but are successfully received with 20 dB less power (threshold receiver sensitivity is −141 dBm). For all types of links, the received power at a distance $d_x$ from the transmitting device can be expressed as $P_r = P_t / L$ in mWatt. Next, we calculate the required received power $P_r$ (in mWatt) in order to achieve the target data transmission $D$ in bits:

$$D = T \cdot B \cdot \log_2\left(1 + \frac{P_r}{P_I + N_0 \cdot B}\right), \tag{2}$$

where $T$ is the time period, $B$ is the channel bandwidth, and $P_I$ is the cumulative interference power on the given channel during time period $T$. Please note that $P_I$ is null for wireless connections of type (b). Using (2) and solving for $P_r$, we get

$$P_r = \left(2^{D/(T \cdot B)} - 1\right) \times \left(P_I + N_0 \cdot B\right). \tag{3}$$
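As a numerical illustration of (1)–(3), the sketch below computes the received power needed to deliver a payload and the transmit power this implies. It is a minimal sketch rather than the simulator used in this work: the function names, the 1 kbit payload, and the 100 m hop are assumptions made for illustration, powers are expressed in watts rather than mWatt, and the loss is evaluated in dB as $K_{\mathrm{dB}} + 10\alpha\log_{10}\delta$, i.e., the logarithmic form of (1) with $K$ taken in dB as listed in Table 1.

```python
import math

def db_to_linear(value_db):
    """Convert a dB quantity to a linear ratio."""
    return 10 ** (value_db / 10)

def propagation_loss_db(distance_km, k_db=128.1, alpha=3.76):
    """Eq. (1) in logarithmic form: L_dB = K_dB + 10 * alpha * log10(delta)."""
    return k_db + 10 * alpha * math.log10(distance_km)

def required_rx_power(bits, period_s, bandwidth_hz, noise_w_per_hz, interference_w=0.0):
    """Eq. (3): received power (W) needed to deliver `bits` within `period_s`."""
    noise_plus_interference = interference_w + noise_w_per_hz * bandwidth_hz
    return (2 ** (bits / (period_s * bandwidth_hz)) - 1) * noise_plus_interference

def achievable_bits(rx_power_w, period_s, bandwidth_hz, noise_w_per_hz, interference_w=0.0):
    """Eq. (2): bits deliverable in `period_s` at the given received power."""
    noise_plus_interference = interference_w + noise_w_per_hz * bandwidth_hz
    return period_s * bandwidth_hz * math.log2(1 + rx_power_w / noise_plus_interference)

# Table 1 values: N0 = -204 dBW/Hz, B = 180 kHz, T = 1 s.
N0 = db_to_linear(-204)   # W/Hz
B, T = 180e3, 1.0

# Hypothetical NB-IoT hop (type (b), hence P_I = 0): 1 kbit over 100 m.
p_r = required_rx_power(bits=1000, period_s=T, bandwidth_hz=B, noise_w_per_hz=N0)
loss = db_to_linear(propagation_loss_db(0.1))
p_t = p_r * loss          # rearranged from P_r = P_t / L
print(f"required P_r = {p_r:.3e} W, required P_t = {p_t:.3e} W")
```

Feeding the result of `required_rx_power` back into `achievable_bits` returns the original payload, which is a convenient sanity check that (3) is the inverse of (2).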


[Figure 2 diagram: a sensor (wireless options, battery) connects through the gateway (WiFi) or the eNB (NB-IoT); tasks are processed locally, at the gateway, or in the cloud before actuation. Stage 1 (context-aware constraints, joint decision) draws on the application (security, response time), connectivity (eSIM/gateway, energy consumption), and processing (availability, response time, energy consumption). Stage 2 handles the trade-off between cost and energy.]

Figure 2: Decision and optimisation processes in a two-stage approach to optimise four performance criteria: energy, response time, security, and cost.

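[Figure 3 diagram: a source and a recipient linked by a wireless channel (WiFi, LTE, or NB-IoT). Either the raw data ($D_r$) is transmitted and processed at the recipient, or the data is processed at the source and the processed data ($D_p$) is transmitted; each path incurs a processing delay $t_p$.]

Figure 3: Uplink delay model capturing the factors affecting both processing and transmission delays over any hop in our system.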

3.2. Energy Consumption Model. There are two major processes that consume energy in an IoT network: wireless transmission and task computation. The energy consumption of the former is $E_t$ and that of the latter is $E_p$; thus, the total energy consumption is the sum of both. Depending on the route of communication taken by the device, the energy consumed for transmission results from either one hop using NB-IoT ($E_{tb}$) or two hops using WiFi for the first link and LTE for the second ($E_{ta} + E_{tc}$). The energy consumed for processing the task is a function of the data rate requirement of device $d$, $\theta_d$, and the computational power of the processor, $E_{pi}$, $\forall i = \{d, f, c\}$ (see Table 1), and is expressed as $E_p = \theta \cdot E_{pi}$.

3.3. Response Time Model. The response time perceived by the IoT device is the combination of the uplink and downlink delays between the IoT device and the server. In this work, the uplink delay is modelled, while the downlink delay is assumed to be the same for all devices.

The uplink delay is caused by two phenomena: task processing (processing delay, $t_p$) and data transmission (transmission delay, $t_t$). The processing delay depends on the processor's computational power, which is measured in the number of computational cycles per data element ($\eta$); i.e., the higher $\eta$, the lower the computational power. Naturally, a server has higher computational power than a small gateway and much higher than a simple IoT device ($\eta_c < \eta_f < \eta_d$). Thus, in this work, $t_p$ is modelled based on the computational powers of the processing locations: $t_{pd} = 10 \times t_{pf} = 100 \times t_{pc}$. In addition, while the input to the task processing stage is large raw data, the output is compressed data with comparably less volume. To that end, the compression rate between the input and output data volumes is given as $C$: $D_r = C \cdot D_p$, where $D_r$ and $D_p$ are the volumes of raw and processed (compressed) data, respectively.

The transmission delay is affected by the type of radio access technology and the volume of data to be transmitted. Since WiFi access employs the unlicensed frequency bands, it often suffers from higher retransmission rates due to frequent collisions, which results in increased transmission delays. Therefore, in this work, this effect is captured by the factor $F > 1$, whereby the delay incurred for transmitting the same volume of data over WiFi is $F$ times higher than that over LTE or NB-IoT: $t_{ta} = F \cdot t_{tb} = F \cdot t_{tc}$. This model is represented in Figure 3, in which the source could be either the IoT device or the gateway and the recipient could be either the gateway or the cloud.

Consequently, the overall response time for each action is calculated for $C = 200$ and $F = 2$ as follows:

$$R = t_p + \sum_{i=1}^{N_h} t_{ti} \cdot D_i, \tag{4}$$


Table 1: System model parameters and simulation values.

| Parameter | Value | Description |
|---|---|---|
| $r_n$ | 200 m | eNB cell radius |
| $r_w$ | 30 m | WiFi cell radius |
| $N_G$ | 10 | Number of IoT devices per gateway |
| $\chi_d$ | 30 Kbps | Computational capacity (device) |
| $\chi_f$ | $10^{2}$ Kbps | Computational capacity (fog) |
| $\chi_c$ | $10^{3}$ Kbps | Computational capacity (cloud) |
| $\epsilon$ | $5 \times 10^{-9}$ Joule | Energy consumption per computational cycle |
| $\eta_d$ | $10^{2}$ | Required amount of computational cycles per data element (device) |
| $\eta_f$ | 10 | Required amount of computational cycles per data element (fog) |
| $\eta_c$ | 1 | Required amount of computational cycles per data element (cloud) |
| $N_0$ | −204 dBW/Hz | Noise density |
| $B$ | 180 kHz | Bandwidth |
| $P_{td}$ | $10^{-8}$ W | Average transmit power of the IoT devices in the gateways 2, 3, 4, and 5 |
| $T$ | 1 s | Time period |
| $\lambda$ | 0.5 | $Q$-table update parameter |
| $\phi$ | 0.9 | $Q$-table update parameter |
| $\varepsilon_1$ | 0.8 | Action selection parameter for Stage 1 |
| $\varepsilon_2$ | $10^{4}$ | Action selection parameter for Stage 2 |
| $\rho$ | 0.8 | Decaying rate for $\varepsilon_1$ and $\varepsilon_2$ |
| $S$ | 8 | Number of bits in each data element |
| $\vartheta$ | $10^{3}/S$ | Conversion of kbps data rates to number of data elements |
| $E_{pd}$ | $\epsilon \cdot \eta_d \cdot \lambda$ | Data processing energy consumption per data rate in kbps (device) |
| $E_{pf}$ | $\epsilon \cdot \eta_f \cdot \lambda$ | Data processing energy consumption per data rate in kbps (fog) |
| $E_{pc}$ | $\epsilon \cdot \eta_c \cdot \lambda$ | Data processing energy consumption per data rate in kbps (cloud) |
| $\Gamma_d$ | $10^{-4}$ | Cost of processing per kbps (device) |
| $\Gamma_f$ | $10^{-1}$ | Cost of processing per kbps (fog) |
| $\Gamma_c$ | 1 | Cost of processing per kbps (cloud) |
| $b$ | 20 | Budget |
| $\beta_1$ | $10^{2}$ | Constant coefficient for penalty comparison |
| $\beta_2$ | $10^{12}$ | Constant coefficient for penalty comparison |
| $K_w = K_l = K_n$ | 128.1 dB | Propagation loss constant for all wireless connection types (a, b, and c) |
| $\alpha_l = \alpha_n$ | 3.76 | Propagation loss exponent for NB-IoT and LTE wireless connection types (b and c) |
| $\alpha_w$ | 3 | Propagation loss exponent for Wi-Fi (802.11g) wireless connection type (a) |

where $N_h = \{1, 2\}$ is the number of hops and $D = \{D_r, D_p\}$. Besides, $t_{ti}$ and $D_i$ represent the values of $t_t$ and $D$ for the $i$th hop, respectively. Then, the calculated values populate Table 2 after the application of feature scaling into the range of [0, 1] using the function given as

$$f(x) = \frac{x - \min(X)}{\max(X) - \min(X)}, \tag{5}$$

where $X$ is the set of $x$. Note that both (a) and (b) type connections constitute the first hop, while the connection type (c) is the second hop.
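The response time model in (4) and the scaling in (5) can be traced with a short sketch. The values $C = 200$ and $F = 2$ are those stated above; the unit values chosen for $t_{pc}$, the per-unit transmission delay, and $D_p$ are assumptions made purely for illustration, as are the function and variable names.

```python
def min_max_scale(values):
    """Eq. (5): scale a list of values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # guard against division by zero
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def response_time(t_processing, hops):
    """Eq. (4): R = t_p + sum over hops of t_t,i * D_i.
    `hops` is a list of (delay per data unit, data volume) pairs."""
    return t_processing + sum(t_t * volume for t_t, volume in hops)

C, F = 200, 2                          # values stated in the text
T_PC, T_T, D_P = 1, 1, 1               # ASSUMED unit values (not given in the paper)
T_PF, T_PD = 10 * T_PC, 100 * T_PC     # t_pd = 10 * t_pf = 100 * t_pc
D_R = C * D_P                          # raw data volume, D_r = C * D_p
WIFI, CELL = F * T_T, T_T              # t_ta = F * t_tb = F * t_tc

# Data is raw (D_R) until it has been processed, compressed (D_P) afterwards.
raw_times = {
    "A1 (Wi-Fi, device)":  response_time(T_PD, [(WIFI, D_P), (CELL, D_P)]),
    "A2 (Wi-Fi, fog)":     response_time(T_PF, [(WIFI, D_R), (CELL, D_P)]),
    "A3 (Wi-Fi, cloud)":   response_time(T_PC, [(WIFI, D_R), (CELL, D_R)]),
    "A4 (NB-IoT, device)": response_time(T_PD, [(CELL, D_P)]),
    "A5 (NB-IoT, cloud)":  response_time(T_PC, [(CELL, D_R)]),
}
scaled = dict(zip(raw_times, min_max_scale(list(raw_times.values()))))
for action, value in scaled.items():
    print(f"{action}: {value:.3f}")
```

Under these unit assumptions, the scaled response times come out as 0.004, 0.62, 1, 0, and 0.2 for $A_1$ to $A_5$, which coincide with the first entries of the tuples in Table 2.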

4. Machine Learning-Based Solution

In this work, we propose to employ reinforcement learning (RL), a machine learning technique based on a goal-seeking approach. It is a trial-and-error approach in which the agent (or learning device) learns to take the correct action by interacting with its surroundings and being rewarded or penalised in each iteration. RL is selected in this work due to its great applicability to the presented problem. For example, IoT devices need to interact with their environment in order to assess the circumstances and to take subsequent actions, namely, the determination of the connection type and the data processing location. RL maps to this requirement very well, since it allows optimisation through environmental interactions.

Being one of the most prominent reinforcement learning techniques, $Q$-learning aims to find the optimum policy for a given problem, that is, the best action to take at any given state. To do this, the agent takes an action and evaluates the subsequent reward/cost of taking that action given that it was in a certain state. This reward/cost is then used to update a look-up table, known as the $Q$-table, which is later utilised by the agent to select the best action. Further, the agent calculates the $Q$-value for every possible state/action pair. Therefore, a simple implementation can result in the agent learning online the best actions regardless of the policy.

Table 2: Stage one action list.

| Action | Connection | Processor | Tuple |
|---|---|---|---|
| $A_1$ | Wi-Fi | Device | $A_1 = [0.004, 1, \chi_d, (E_{ta} + E_{tc} + E_{pd} \cdot \theta), \Gamma_d]$ |
| $A_2$ | Wi-Fi | Fog | $A_2 = [0.62, 1, \chi_f, (E_{ta} + E_{tc} + E_{pf} \cdot \theta), \Gamma_f]$ |
| $A_3$ | Wi-Fi | Cloud | $A_3 = [1, 1, \chi_c, (E_{ta} + E_{tc} + E_{pc} \cdot \theta), \Gamma_c]$ |
| $A_4$ | NB-IoT | Device | $A_4 = [0, 0, \chi_d, (E_{tb} + E_{pd} \cdot \theta), \Gamma_d]$ |
| $A_5$ | NB-IoT | Cloud | $A_5 = [0.2, 0, \chi_c, (E_{tb} + E_{pc} \cdot \theta), \Gamma_c]$ |

Moreover, $Q$-learning offers two key features which enable an efficient solution to our problem. First, as it is a model-free learning approach [21, 22], it is (1) capable of operating in dynamically changing environments and (2) a low-complexity algorithm which does not require a lot of power, thus reducing the energy consumption of the IoT network. Second, $Q$-learning is known to converge in most cases [23], which has also been demonstrated in multiagent noncooperative environments [24], as are IoT networks.

We propose a two-stage approach to solve the energy-aware smart IoT connectivity, where each of the stages employs $Q$-learning.

4.1. First Stage Learning. Stage 1 consists of learning the best combination of connectivity and processing location in view of the device and application requirements and the limitations offered by each of these options. Thus, there are five possible actions that may be taken by each device, as described in Table 2. As a side note, all the variables in Table 2 are the feature scaled values (into the range of [0, 1]) calculated through (5). The tuples shown represent the limitations of each action, e.g., $A_i = [R, \Sigma, \chi_l, E_t + E_p, \Gamma_l]$, where $R$ and $E_t + E_p$ are described in Sections 3.3 and 3.2, respectively, $\chi_l$ is the available processing capacity, and $\Gamma_l$ is the processing cost, where $l = \{d, f, c\}$ as defined in Table 1. The parameter $\Sigma = \{1, 2\}$ refers to the level of data security offered by the wireless technology, whereby the value 1 indicates eSIM protection (only provided by NB-IoT) and 2 the absence of that. Moreover, each device may be in four different states, as shown in Table 3, depending on the context-aware constraints defined jointly by the device and application. These constraints are $R'$, $\Sigma'$, and $\chi'$, which represent the response time, security level, and computational power requirements, respectively.

4.1.1. Penalty Function Determination. Each device will estimate the penalty function associated with each possible action it is able to take, following the system shown in Table 3, where $\varphi_p = \{R - R', \Sigma - \Sigma', \chi - \chi' \mid p = 1, 2, 3\}$ is the difference between the available and required characteristics. The fourth penalty is $\varphi_4 = \chi' \cdot A_i^{(5)} - b$, where $A_i^{(5)}$ is the fifth element of the $i$th action and the parameter $b$ is the available budget.

The penalty function determination policy aims to satisfy the optimisation objective by including the elements that are desired to be minimised. As seen from Table 3, the penalty functions consist of three main elements: constant term, dissatisfaction level, and energy consumption. The constant value is the cost of being in a state, and it decreases as the level of the state increases. This element compels the agent to try to achieve the highest possible state level, as this is one of the objectives of the optimisation problem. The dissatisfaction level element, which supports the constant value, incurs a cost for not satisfying the device requirements in order to improve the satisfaction levels. Lastly, the energy consumption element minimises the end-to-end energy consumption (connection and data processing). The parameter $0 \le \nu \le 1$ is the battery level, where 0 represents an empty battery and 1 represents a full charge. In the expressions in Table 3, the parameter $\varsigma$ specifies the priority level of the energy consumption. For instance, low values of $\varsigma$ prioritise the energy consumption only once the battery level $\nu$ is very low (e.g., 5%), while high values prioritise the energy consumption even when the battery level is high (e.g., 50%).

In addition, the algorithm normally tends to select an option with cloud processing, as it is the most energy-efficient one. However, some amount of data will not be offloaded due to budget constraints and will then be processed locally, which is the most energy-consuming option. Note that this amount is evaluated by the second stage learning. Thus, the option selected by the first stage could be more energy consuming than the option that includes fog processing, as the processing will be a combination of cloud and device. Therefore, the last parts of the penalty functions (inside the square brackets) prevent the algorithm from making blind decisions that ignore the budget availability, by including the average energy consumption of the actions with device processing. The reason for taking the average value is that the final action is yet to be taken during the learning process. The coefficients of these three elements are determined empirically. However, they can be used to prioritise any element that is desired to be minimised more.

The $Q$-table entries are then updated according to the following expression, where $s$, $s'$, $P$, and $a$ are the current state, next state, penalty function, and action under evaluation, respectively:

$$Q(s, a) \longleftarrow Q(s, a) + \lambda\left(P(s) + \phi \min\left(Q(s', a)\right) - Q(s, a)\right). \tag{6}$$
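The update in (6) can be written in a few lines. The sketch below is illustrative only: it stores the $Q$-table as a list of rows (one per state), assumes zero initialisation, and reads $\min(Q(s', a))$ as the minimum over all actions available in the next state, which is the usual interpretation when penalties are minimised; the function name and the example penalty value are hypothetical.

```python
def q_update(q_table, state, action, next_state, penalty, lam=0.5, phi=0.9):
    """Apply Eq. (6) in place and return the new Q(s, a).
    Penalties are minimised, so the look-ahead term uses min rather than max."""
    best_next = min(q_table[next_state])
    q_table[state][action] += lam * (penalty + phi * best_next - q_table[state][action])
    return q_table[state][action]

# Stage 1 has 4 states (Table 3) and 5 actions (Table 2).
# Zero initialisation is an assumption made for this example.
q_table = [[0.0] * 5 for _ in range(4)]

# Two consecutive updates of the same entry with a hypothetical penalty of 1e4.
first = q_update(q_table, state=0, action=2, next_state=1, penalty=1e4)
second = q_update(q_table, state=0, action=2, next_state=1, penalty=1e4)
print(first, second)
```

With $\lambda = 0.5$, each update moves the entry halfway towards the target $P(s) + \phi \min Q(s', \cdot)$, so repeated visits with a stable penalty converge geometrically.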


Table 3: List of possible states of each device in Stage one and corresponding penalty calculation.

| State | Description | Penalty function ($P$) |
|---|---|---|
| $\sigma_1$ | None of the constraints are satisfied | $10^{4} + \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_2$ | One constraint is satisfied | $5 \times 10^{3} + 0.8 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_3$ | Two constraints are satisfied | $2 \times 10^{3} + 0.6 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |

Table 4: List of possible states of each device in Stage two and corresponding penalty calculation.

| State | Description | Penalty function |
|---|---|---|
| 1 | No availability in cloud or fog for $\chi'$ | $10^{3} + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^{3} + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 3 | Enough availability and budget for $\chi'$ | $\beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |

4.2. Second Stage Learning. The second stage aims to find the best policy for task offloading by considering the budget and the availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., when it is neither $A_1$ nor $A_4$). In Stage 2, $Q$-learning is also employed, with 21 possible actions, $[0, 0.05, \ldots, 1]$, and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.

4.2.1. Penalty Function Determination. The penalty function of this stage is determined with a procedure similar to that of the first stage; hence, there are three cost elements: constant term, energy consumption, and monetary cost. As in the first stage, the constant value steers the agent towards the highest possible state level. Including the energy consumption and monetary cost elements simultaneously allows the best trade-off between the two to be found. However, unlike in the first stage, these elements are calculated for the portion of data that is planned to be transferred, as specifying the best amount is the objective of this stage of learning. Similarly, the coefficients are obtained empirically.
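To make the shape of the Stage 2 penalty concrete, the sketch below enumerates the 21 offloading shares and evaluates the expression of Table 4 for each. It is a hedged reading rather than the implementation used in this work: the symbol $i$ in Table 4 is interpreted as the share of data to be offloaded, the input values are hypothetical, $\beta_2$ is passed explicitly instead of being fixed, and the greedy minimum at the end only illustrates the trade-off (the actual selection is learned through the $Q$-table).

```python
def offload_shares(step=0.05):
    """The 21 Stage 2 actions: 0, 0.05, ..., 1."""
    count = round(1 / step)
    return [round(k * step, 2) for k in range(count + 1)]

def stage2_penalty(share, energy, unit_cost, demand, state, beta2):
    """Table 4: constant + beta2 * (A4 * (1 - i) + chi' * i * A5).
    `share` plays the role of i; states 1 and 2 carry the constant 1e3."""
    constant = 1e3 if state in (1, 2) else 0.0
    return constant + beta2 * (energy * (1 - share) + demand * share * unit_cost)

shares = offload_shares()

# Hypothetical, already-scaled inputs, for illustration only.
penalties = [
    stage2_penalty(s, energy=0.5, unit_cost=0.1, demand=2.0, state=3, beta2=1.0)
    for s in shares
]
best_share = shares[penalties.index(min(penalties))]
print(best_share)
```

In this example the energy term dominates the monetary term, so the penalty decreases with the share and the largest share is preferred; with a higher unit cost the balance tips the other way.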

The interaction between Stage 1 and Stage 2 in the learning process is depicted in Algorithms 1 and 2, respectively.

5. Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted, and in each, random battery levels are allocated to each of the devices.

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out of budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of data processing locations such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, number of out of budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out of budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100, \tag{7}$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A–F and $Q$-learning, respectively.

On the other hand, the gain/loss values for the number of out of budget devices in Figure 5 are calculated using the function given as

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100. \tag{8}$$
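Both measures are simple enough to verify by hand. The following sketch applies (7) and (8) to a few of the average values reported in Table 5; the function names are illustrative, and the inputs are the rounded table entries, so the results agree with the figures quoted in the text only up to rounding.

```python
def gain_percent(scenario_value, q_learning_value):
    """Eq. (7): relative gain (+) or loss (-) of Q-learning, in percent."""
    return (scenario_value - q_learning_value) / q_learning_value * 100

def out_of_budget_percent(out_of_budget_devices, devices_per_gateway=10):
    """Eq. (8): share of out of budget devices, in percent (N_G = 10 in Table 1)."""
    return out_of_budget_devices / devices_per_gateway * 100

# Average values taken from Table 5 (Q = Q-learning, letters = scenarios).
dissatisfaction = {"Q": 1.8, "A": 5.1, "F": 8.0}
joint_cost = {"Q": 0.7822, "A": 1.5323, "C": 1.2399, "F": 3.0000}

for name in ("A", "F"):
    value = gain_percent(dissatisfaction[name], dissatisfaction["Q"])
    print(f"dissatisfaction gain over Scenario {name}: {value:.2f}%")
for name in ("A", "C", "F"):
    value = gain_percent(joint_cost[name], joint_cost["Q"])
    print(f"joint gain over Scenario {name}: {value:.2f}%")
print(f"out of budget devices, Scenario D: {out_of_budget_percent(3.03):.1f}%")
```

The dissatisfaction gains evaluate to roughly 183.33% and 344.44%, and the joint gains over Scenarios A, C, and F to roughly 95.9%, 58.5%, and 283.5%.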

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, which are reported along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters (energy consumption, cost, dissatisfaction, and number of out of budget devices). However, before the summation, these parameters are feature scaled into the range of [0, 1] using the function in (5) in order to keep their impacts on the same scale.

Data: Context-aware constraints, available computational capacity in gateway and eNB, budget
Result: Combination of connectivity route and processing venue
initialization
for all IoT devices do
    Determine the current state using Table 3
    Evaluate all the actions
    Calculate the penalty using Table 3
    Select the best action
    Jump to the next state
    Update the $Q$-table
    if the selected action includes fog (gateway) or cloud (eNB) processing then
        go to Algorithm 2
    end
end

Algorithm 1: First stage learning.

Data: Action selected by the first stage, available computational capacity in gateway and eNB, budget
Result: Share of data to be offloaded
initialization
for all IoT devices do
    Determine the current state using Table 4
    Evaluate all the actions
    Calculate the penalty using Table 4
    Select the best action
    Jump to the next state
    Update the $Q$-table
end

Algorithm 2: Second stage learning.

[Figure 4 plot: device and gateway positions on a grid spanning −200 to 200 on both axes; legend: LTE eNB, WiFi GW 1 to WiFi GW 5, IoT devices.]

Figure 4: Sample snapshot of the simulation environment. IoT devices are located randomly, while positions of the gateways are fixed.

Our method outperforms any fixed combination when examining the joint or holistic gain, with values ranging from 95.9% to 283.54%. Similarly, the reinforcement learning technique results in better matching between the context-aware constraints and the availability of the IoT network compared to any other scenario, with gains varying from 183.33% to 344.44%. Although the processing cost of our proposed method is higher than that of Scenario A, the resulting gain in energy saving is even more important, as is the context-aware constraint compliance. The closest contender to reinforcement learning with respect to the generated results is Scenario C, in which the processing of Group A IoT devices is conducted locally while that of Group B occurs in the gateway. Nonetheless, reinforcement learning allows for a device-driven context-aware connectivity that improves the compliance criteria by more than two times while saving 43.22% of energy, resulting in a holistic gain of 58.52%. Scenario D manages to reduce the energy consumption more than our proposed approach at the same total cost; however, 30.3% of the devices are out of budget, resulting in incomplete or interrupted computational tasks.


Table 5: Results on various metrics for $Q$-learning and the scenarios.

| Method | Energy Consumption (mJ) | Cost | Dissatisfaction | Out of Budget Devices | Joint Cost |
|---|---|---|---|---|---|
| Q-Learning | 5.69 ± 0.322 | 96.77 ± 4.01 | 1.8 ± 0.291 | 0 ± 0 | 0.7822 |
| Scenario A | 14.88 ± 0.385 | 0.24 ± 6.15e−3 | 5.1 ± 0.28 | 0 ± 0 | 1.5323 |
| Scenario B | 7.55 ± 0.24 | 118.57 ± 4.49 | 5.29 ± 0.181 | 3.03 ± 0.217 | 2.0679 |
| Scenario C | 8.16 ± 0.284 | 12.07 ± 0.383 | 5.81 ± 0.208 | 0 ± 0 | 1.2399 |
| Scenario D | 0.83 ± 0.025 | 130.41 ± 4.54 | 6 ± 0 | 3.03 ± 0.217 | 1.7756 |
| Scenario E | 7.48 ± 0.281 | 119.68 ± 3.83 | 7.81 ± 0.208 | 2.97 ± 0.213 | 2.4643 |
| Scenario F | 0.15 ± 4.59e−3 | 238.02 ± 6.16 | 8 ± 0 | 6 ± 0.339 | 3.0000 |

Table 6: List of fixed scenarios with connection types and locations of data processing.

| Scenario | Group A | Group B |
|---|---|---|
| A | Device | Device |
| B | Cloud | Device |
| C | Device | Fog |
| D | Cloud | Fog |
| E | Device | Cloud |
| F | Cloud | Cloud |

[Figure 5 bar chart: gain of $Q$-learning over Scenarios A to F; vertical axis: gain (%), from −100 to 350; series: total energy (mJ), total cost, total dissatisfaction, out of budget devices, joint penalty.]

Figure 5: Summary of results for $\varsigma = 0.1$. Positive and negative values reflect gain and loss, respectively. A gain occurs when $Q$-learning is better than the scenario, and a loss occurs when the scenario is better than $Q$-learning.

Moreover, in this scenario, connected devices are more than two times more likely to be dissatisfied with one or more of the context-aware requirements.

Next, we examine the impact of the battery priority factor $\varsigma$ on the energy efficiency. As shown in Figure 6, low values of $\varsigma$ result in almost neglecting the battery life of the device in the optimisation process until it drops below 10%. Very high values of $\varsigma$ prioritise the reduction of energy consumption for all devices except those that have higher than 70% battery life. To this end, it is possible to tune this parameter depending on the scenario at hand and in a device-specific manner. For instance, some devices may be part of a moving vehicle with the possibility of agile and low-cost battery replenishment. Such devices may benefit from low settings of $\varsigma$ to allow more flexibility in meeting the remaining constraints. Other devices may be in hard-to-reach places and would require skilled labour and special equipment, and hence high cost, to replace a dead battery. In this case, higher settings of $\varsigma$ are more suitable and would result in a better cost-to-quality ratio.

[Figure 6 plot: energy consumption (J, scale $\times 10^{-3}$) versus battery level (%), for $\varsigma$ = 0.1, 0.3, 0.5, 0.9, and 1.2.]

Figure 6: Impact of energy prioritisation factor $\varsigma$.

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not possible in fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence, which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, the $Q$-learning-based method selects the best action (a connection type/processing location pair in the first stage and the amount of data to be transmitted in the second stage) after convergence. We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, this work is undoubtedly the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power, as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement offered by the novel scheme, and highlight its effects on the application.

6. Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund (the DARE Project, EP/P028764/1). The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., "Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap," IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: a survey," Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, "Measurement-based signaling management strategies for cellular IoT," IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, "Internet of things for smart ports: technologies and challenges," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, "3GPP low power wide area technologies," GSMA White Paper, Oct. 2016.

[6] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements," 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, "The menu at the IoT cafe: a guide to IoT wireless technologies," Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, "A survey on the challenges and opportunities of the Internet of Things (IoT)," in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, "Device-centric energy optimization for edge cloud offloading," in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, "Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing," in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, "NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications," in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, "Smart connectivity for internet of things (IoT) applications," in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, "M2M data aggregation over cellular networks: signaling-delay trade-offs," in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, "Joint service caching and task offloading for mobile edge computing in dense networks," ArXiv e-prints, 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, "Efficient C-RAN random access for IoT devices: learning links via recommendation systems," ArXiv e-prints, 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, "Learning IoT in edge: deep learning for the internet of things with edge computing," IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, "Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data," in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, "The edge cloud: a holistic view of communication, computation and caching," ArXiv e-prints, 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, "Edge-caching wireless networks: performance analysis and optimization," IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, "Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz," International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.

[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, "Convergence of optimistic and incremental Q-learning," in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, "A distributed SON-based user-centric backhaul provisioning scheme," IEEE Access, vol. 4, pp. 2314–2330, 2016.


[Figure 2 diagram: a sensor (wireless options, battery) reaches the cloud through a WiFi gateway or an NB-IoT eNB, with local, gateway, or cloud processing followed by actuation. Stage 1 (constraints) is a context-aware joint decision over the application (security, response time), connectivity (eSIM/gateway, energy consumption), and processing (availability, response time, energy consumption). Stage 2 (trade-off) balances cost and energy between local and cloud processing.]

Figure 2: Decision and optimisation processes in a two-stage approach to optimise four performance criteria: energy, response time, security, and cost.

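[Figure 3 diagram: a source and a recipient connected over an LTE/NB-IoT or WiFi wireless channel. Either the raw data ($D_r$) is transmitted and processed at the recipient, or the data is processed at the source and the processed data ($D_p$) is transmitted; $t_p$ marks the processing delay in both cases.]

Figure 3: Uplink delay model capturing the factors affecting both processing and transmission delays over any hop in our system.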

3.2. Energy Consumption Model. There are two major processes that consume energy in an IoT network: wireless transmission and task computation. The energy consumption of the former is $E_t$ and that of the latter is $E_p$; thus, the total energy consumption is the sum of both. Depending on the route of communication taken by the device, the energy consumed due to transmission power can be a result of either one hop using NB-IoT ($E_{t_b}$) or two hops using WiFi for the first link and LTE for the second ($E_{t_a} + E_{t_c}$). The energy consumed for processing the task is a function of the data rate requirement of device $d$, $\theta_d$, and the computational power of the processor, $E_{p_i}, \forall i = \{d, f, c\}$ (see Table 1), and is expressed as $E_p = \theta \cdot E_{p_i}$.

3.3. Response Time Model. The response time perceived by the IoT device is the combination of the uplink and downlink delays between the IoT device and the server. In this work, the uplink delay is modelled, while the downlink delay is assumed to be the same for all devices.

The uplink delay is caused by two phenomena: task processing (processing delay, $t_p$) and data transmission (transmission delay, $t_t$). The processing delay depends on the processor's computational power, which is measured in the number of computational cycles per data element ($\eta$); i.e., the higher $\eta$, the lower the computational power. Naturally, a server has higher computational power than a small gateway and much higher than a simple IoT device ($\eta_c < \eta_f < \eta_d$). Thus, in this work, $t_p$ is modelled based on the computational powers of the processing locations: $t_{p_d} = 10 \times t_{p_f} = 100 \times t_{p_c}$. In addition, while the input to the task processing stage is large raw data, the output is compressed data with comparably less volume. To that end, the compression rate between the input and output data volumes is given as $C$: $D_r = C \cdot D_p$, where $D_r$ and $D_p$ are the volumes of raw and processed (compressed) data, respectively.

The transmission delay is affected by the type of radio access technology and the volume of data to be transmitted. Since WiFi access employs the unlicensed frequency bands, it often suffers from frequent collisions and hence higher retransmission rates, which result in increased transmission delays. Therefore, in this work, this effect is captured by the factor $F > 1$, whereby the delay incurred for transmitting the same volume of data over WiFi is $F$ times higher than that over LTE or NB-IoT: $t_{t_a} = F \cdot t_{t_b} = F \cdot t_{t_c}$. This model is represented in Figure 3, in which the source could be either the IoT device or the gateway and the recipient could be either the gateway or the cloud.

Consequently, the overall response time for each action is calculated for $C = 200$ and $F = 2$ as follows:

$$R = t_p + \sum_{i=1}^{N_h} t_{t_i} \cdot D_i, \tag{4}$$

Table 1: System model parameters and simulation values.

| Parameter | Value | Description |
|---|---|---|
| $r_n$ | 200 m | eNB cell radius |
| $r_w$ | 30 m | WiFi cell radius |
| $N_G$ | 10 | Number of IoT devices per gateway |
| $\chi_d$ | 30 Kbps | Computational capacity (device) |
| $\chi_f$ | $10^2$ Kbps | Computational capacity (fog) |
| $\chi_c$ | $10^3$ Kbps | Computational capacity (cloud) |
| $\epsilon$ | $5 \times 10^{-9}$ Joule | Energy consumption per computational cycle |
| $\eta_d$ | $10^2$ | Required amount of computational cycles per data element (device) |
| $\eta_f$ | 10 | Required amount of computational cycles per data element (fog) |
| $\eta_c$ | 1 | Required amount of computational cycles per data element (cloud) |
| $N_0$ | $-204$ dBW/Hz | Noise density |
| $B$ | 180 kHz | Bandwidth |
| $P_{t_d}$ | $10^{-8}$ W | Average transmit power of the IoT devices in the gateways 2, 3, 4, and 5 |
| $T$ | 1 s | Time period |
| $\lambda$ | 0.5 | $Q$-table update parameter |
| $\phi$ | 0.9 | $Q$-table update parameter |
| $\varepsilon_1$ | 0.8 | Action selection parameter for Stage 1 |
| $\varepsilon_2$ | $10^4$ | Action selection parameter for Stage 2 |
| $\rho$ | 0.8 | Decaying rate for $\varepsilon_1$ and $\varepsilon_2$ |
| $S$ | 8 | Number of bits in each data element |
| $\vartheta$ | $10^3/S$ | Conversion of kbps data rates to number of data elements |
| $E_{p_d}$ | $\epsilon \cdot \eta_d \cdot \lambda$ | Data processing energy consumption per data rate in kbps (device) |
| $E_{p_f}$ | $\epsilon \cdot \eta_f \cdot \lambda$ | Data processing energy consumption per data rate in kbps (fog) |
| $E_{p_c}$ | $\epsilon \cdot \eta_c \cdot \lambda$ | Data processing energy consumption per data rate in kbps (cloud) |
| $\Gamma_d$ | $10^{-4}$ | Cost of processing per kbps (device) |
| $\Gamma_f$ | $10^{-1}$ | Cost of processing per kbps (fog) |
| $\Gamma_c$ | 1 | Cost of processing per kbps (cloud) |
| $b$ | 20 | Budget |
| $\beta_1$ | $10^2$ | Constant coefficient for penalty comparison |
| $\beta_2$ | $10^{12}$ | Constant coefficient for penalty comparison |
| $K_w = K_l = K_n$ | 128.1 dB | Propagation loss constant for all wireless connection types (a, b, and c) |
| $\alpha_l = \alpha_n$ | 3.76 | Propagation loss exponent for NB-IoT and LTE wireless connection types (b and c) |
| $\alpha_w$ | 3 | Propagation loss exponent for Wi-Fi (802.11g) wireless connection type (a) |

where $N_h = \{1, 2\}$ is the number of hops and $D = \{D_r, D_p\}$. Besides, $t_{t_i}$ and $D_i$ represent the values of $t_t$ and $D$ for the $i$th hop, respectively. Then, the calculated values populate Table 2 after the application of feature scaling into the range of [0, 1] using the function given as

$$f(x) = \frac{x - \min(X)}{\max(X) - \min(X)}, \tag{5}$$

where $X$ is the set of $x$. Note that both (a) and (b) type connections constitute the first hop, while the connection type (c) is the second hop.
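As an illustration of (4) and (5), the following minimal Python sketch evaluates the response time of the five connection/processing combinations and rescales the results into [0, 1]. Only $C = 200$, $F = 2$, and the ratios $t_{p_d} = 10 \times t_{p_f} = 100 \times t_{p_c}$ come from the text; the unit values for the cloud processing delay, the per-unit NB-IoT/LTE transmission delay, and the processed data volume are assumptions made here, as are the function names and the reading that raw data travels on every hop before the processing location and compressed data on every hop after it.

```python
C = 200.0  # compression factor from the text: D_r = C * D_p
F = 2.0    # Wi-Fi delay factor from the text: t_ta = F * t_tb = F * t_tc

# Assumed unit values (not stated in the text), chosen for illustration only.
T_P_CLOUD = 1.0                  # cloud processing delay
T_P_FOG = 10.0 * T_P_CLOUD       # from t_pd = 10 * t_pf = 100 * t_pc
T_P_DEVICE = 100.0 * T_P_CLOUD
T_T_CELL = 1.0                   # per-unit delay over NB-IoT or LTE
T_T_WIFI = F * T_T_CELL          # per-unit delay over Wi-Fi
D_P = 1.0                        # processed (compressed) data volume
D_R = C * D_P                    # raw data volume


def response_time(t_p, hops):
    """Eq. (4): R = t_p + sum over hops of t_t_i * D_i."""
    return t_p + sum(t_t * volume for t_t, volume in hops)


def min_max_scale(values):
    """Eq. (5): f(x) = (x - min(X)) / (max(X) - min(X))."""
    lo, hi = min(values), max(values)
    if hi == lo:  # degenerate set: every element maps to 0
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Each route lists (per-unit delay, data volume) for every hop.
ROUTES = {
    "A1 Wi-Fi/device": (T_P_DEVICE, [(T_T_WIFI, D_P), (T_T_CELL, D_P)]),
    "A2 Wi-Fi/fog": (T_P_FOG, [(T_T_WIFI, D_R), (T_T_CELL, D_P)]),
    "A3 Wi-Fi/cloud": (T_P_CLOUD, [(T_T_WIFI, D_R), (T_T_CELL, D_R)]),
    "A4 NB-IoT/device": (T_P_DEVICE, [(T_T_CELL, D_P)]),
    "A5 NB-IoT/cloud": (T_P_CLOUD, [(T_T_CELL, D_R)]),
}

raw = {name: response_time(t_p, hops) for name, (t_p, hops) in ROUTES.items()}
scaled = dict(zip(raw, min_max_scale(list(raw.values()))))

for name, value in scaled.items():
    print(f"{name}: R = {raw[name]:.0f}, scaled = {value:.3f}")
```

Under these assumed unit values, the scaled results appear to coincide with the response-time entries of the tuples in Table 2, which suggests that the relative magnitudes, rather than the absolute delays, drive the first-stage decision.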

4. Machine Learning-Based Solution

In this work, we propose to employ reinforcement learning (RL), a machine learning technique based on a goal-seeking approach. It is a trial-and-error approach in which the agent (or learning device) learns to take the correct action by interacting with its surroundings and being rewarded or penalised in each iteration. RL is selected in this work due to its applicability to the presented problem. For example, IoT devices need to interact with their environment in order to assess the circumstances and to take subsequent actions, namely, determining the connection type and the data processing location. RL maps to this requirement very well, since it allows optimisation through environmental interactions.

Being one of the most prominent reinforcement learning techniques, $Q$-learning aims to find the optimum policy for a given problem, that is, the best action to take at any given state. To do this, the agent takes an action and evaluates the subsequent reward/cost of taking that action, given that it was in a certain state. This reward/cost is then used to update a look-up table, known as the $Q$-table, which is later utilised by the agent to select the best action. Further, the agent calculates the $Q$-value for every possible state/action pair. Therefore, a simple implementation can result in the agent learning online the best actions regardless of the policy.

Table 2: Stage one action list.

| Action | Connection | Processor | Tuple |
|---|---|---|---|
| $A_1$ | Wi-Fi | Device | $A_1 = [0.004, 1, \chi_d, (E_{t_a} + E_{t_c} + E_{p_d} \cdot \theta), \Gamma_d]$ |
| $A_2$ | Wi-Fi | Fog | $A_2 = [0.62, 1, \chi_f, (E_{t_a} + E_{t_c} + E_{p_f} \cdot \theta), \Gamma_f]$ |
| $A_3$ | Wi-Fi | Cloud | $A_3 = [1, 1, \chi_c, (E_{t_a} + E_{t_c} + E_{p_c} \cdot \theta), \Gamma_c]$ |
| $A_4$ | NB-IoT | Device | $A_4 = [0, 0, \chi_d, (E_{t_b} + E_{p_d} \cdot \theta), \Gamma_d]$ |
| $A_5$ | NB-IoT | Cloud | $A_5 = [0.2, 0, \chi_c, (E_{t_b} + E_{p_c} \cdot \theta), \Gamma_c]$ |

Moreover, $Q$-learning offers two key features which enable an efficient solution to our problem. First, as it is a model-free learning approach [21, 22], it is (1) capable of operating in dynamically changing environments and (2) a low-complexity algorithm which does not require a lot of power, thus reducing the energy consumption of the IoT network. Second, $Q$-learning is known to converge in most cases [23], which has also been demonstrated in multiagent noncooperative environments [24], as are IoT networks.

We propose a two-stage approach to solve the energy-aware smart IoT connectivity problem, where each of the stages employs $Q$-learning.

4.1. First Stage Learning. Stage 1 consists of learning the best combination of connectivity and processing location in view of the device and application requirements and the limitations offered by each of these options. Thus, there are five possible actions that may be taken by each device, as described in Table 2. As a side note, all the variables in Table 2 are the feature-scaled values (into the range of [0, 1]) calculated through (5). The tuples shown represent the limitations of each action, e.g., $A_i = [R, \Sigma, \chi_l, E_t + E_p, \Gamma_l]$, where $R$ and $E_t + E_p$ are described in Sections 3.3 and 3.2, respectively, $\chi_l$ is the available processing capacity, and $\Gamma_l$ is the processing cost, with $l = \{d, f, c\}$ as defined in Table 1. The parameter $\Sigma = \{1, 2\}$ refers to the level of data security offered by the wireless technology, whereby the value 1 indicates eSIM protection (only provided by NB-IoT) and 2 the absence of that. Moreover, each device may be in four different states, as shown in Table 3, depending on the context-aware constraints defined jointly by the device and application. These constraints are $R'$, $\Sigma'$, and $\chi'$, which represent the response time, security level, and computational power requirements, respectively.

4.1.1. Penalty Function Determination. Each device will estimate the penalty function associated with each possible action it is able to take, following the system shown in Table 3, where $\varphi_p = \{R - R', \Sigma - \Sigma', \chi - \chi' \mid p = 1, 2, 3\}$ is the difference between the available and required characteristics. The fourth penalty is $\varphi_4 = \chi' \cdot A_i^{(5)} - b$, where $A_i^{(5)}$ is the fifth element of the $i$th action and the parameter $b$ is the available budget.

The penalty function determination policy aims to satisfy the optimisation objective by including the elements that are to be minimised. As seen from Table 3, the penalty functions consist of three main elements: a constant term, the dissatisfaction level, and the energy consumption. The constant value is the cost of being in a given state, and it decreases as the level of the state increases. This element compels the agent to try to achieve the highest possible state level, as this is one of the objectives of the optimisation problem. The dissatisfaction level element, in support of the constant value, incurs a cost for not satisfying the device requirements, in order to improve the satisfaction levels. Lastly, the energy consumption element minimises the end-to-end energy consumption (connection and data processing). The parameter $0 \le \nu \le 1$ is the battery level, where 0 represents an empty battery and 1 represents full charge. In the expressions in Table 3, the parameter $\varsigma$ specifies the priority level of the energy consumption. For instance, low values of $\varsigma$ prioritise the energy consumption only once the battery level $\nu$ is very low (e.g., 5%), while high values prioritise the energy consumption even when the battery level is high (e.g., 50%).

In addition, the algorithm normally tends to select an option with cloud processing, as it is the most energy-efficient one. However, some amount of data will not be offloaded due to budget constraints and will then be processed locally, which is the most energy-consuming option. Note that this amount is evaluated by the second-stage learning. Thus, the option selected by the first stage would be more energy consuming than the option that includes fog processing, as the processing would be a combination of cloud and device. Therefore, the last parts of the penalty functions (inside the square brackets) prevent the algorithm from making blind decisions that ignore the budget availability, by including an average energy consumption of the actions with device processing. The reason for taking the average value is that the final action is yet to be taken during the learning process. The coefficients of these three elements are determined empirically. However, they can be used to prioritise any element that is desired to be minimised more.

The $Q$-table entries are then updated according to the following expression, where $s$, $s'$, $P$, and $a$ are the current state, next state, penalty function, and action under evaluation, respectively:

$$Q(s, a) \longleftarrow Q(s, a) + \lambda \left( P(s) + \phi \min\left(Q(s', a)\right) - Q(s, a) \right). \tag{6}$$
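The update in (6) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the table layout (a list of rows, one per state), the zero initialisation, the toy penalty, and the function name `q_update` are assumptions, while $\lambda = 0.5$ and $\phi = 0.9$ are the values listed in Table 1. Because penalties are minimised, the look-ahead term takes the smallest $Q$-value of the next state.

```python
LAMBDA = 0.5  # Q-table update parameter lambda (Table 1)
PHI = 0.9     # Q-table update parameter phi (Table 1)


def q_update(q_table, state, action, penalty, next_state, lam=LAMBDA, phi=PHI):
    """Apply the update in (6) in place and return the new Q(s, a).

    Penalties are minimised, so the look-ahead term is the smallest
    Q-value available in the next state.
    """
    best_next = min(q_table[next_state])
    q_table[state][action] += lam * (penalty + phi * best_next - q_table[state][action])
    return q_table[state][action]


# Hypothetical toy table: 4 states (Table 3) x 5 actions (Table 2), zero-initialised.
q_table = [[0.0] * 5 for _ in range(4)]
q_table[3] = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values for the next state

first = q_update(q_table, state=0, action=2, penalty=10.0, next_state=3)
second = q_update(q_table, state=0, action=2, penalty=10.0, next_state=3)
print(first, second)
```

Repeating the same transition moves the entry halfway towards the target $P(s) + \phi \min Q(s', \cdot)$ each time, which is the effect of choosing $\lambda = 0.5$.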

Table 3: List of possible states of each device in Stage one and corresponding penalty calculation.

| State | Description | Penalty function ($P$) |
|---|---|---|
| $\sigma_1$ | None of the constraints are satisfied | $10^4 + \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left( (A_1^{(4)} + A_4^{(4)})/2 \right)}{\chi' \cdot A_i^{(5)}} \right]$ |
| $\sigma_2$ | One constraint is satisfied | $5 \times 10^3 + 0.8 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left( (A_1^{(4)} + A_4^{(4)})/2 \right)}{\chi' \cdot A_i^{(5)}} \right]$ |
| $\sigma_3$ | Two constraints are satisfied | $2 \times 10^3 + 0.6 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left( (A_1^{(4)} + A_4^{(4)})/2 \right)}{\chi' \cdot A_i^{(5)}} \right]$ |
| $\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[ \left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left( (A_1^{(4)} + A_4^{(4)})/2 \right)}{\chi' \cdot A_i^{(5)}} \right]$ |

Table 4: List of possible states of each device in Stage two and corresponding penalty calculation.

| State | Description | Penalty function |
|---|---|---|
| 1 | No availability in cloud or fog for $\chi'$ | $10^3 + \beta_2 \left( A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)} \right)$ |
| 2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^3 + \beta_2 \left( A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)} \right)$ |
| 3 | Enough availability and budget for $\chi'$ | $\beta_2 \left( A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)} \right)$ |

4.2. Second Stage Learning. The second stage aims to find the best policy for task offloading by considering the budget and the availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., when it is neither $A_1$ nor $A_4$). In Stage 2, $Q$-learning is also employed, with 21 possible actions, $[0, 0.05, \ldots, 1]$, and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.
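The Stage 2 action set and the three states of Table 4 can be sketched as follows. This is a hedged illustration only: the helper name `stage_two_state`, its arguments, and in particular the choice to test availability and budget against the offloaded portion of the demand (rather than against the full requirement $\chi'$) are assumptions made here, not details confirmed by the text.

```python
# The 21 offloading shares 0, 0.05, ..., 1 used as Stage 2 actions.
OFFLOAD_SHARES = [round(0.05 * k, 2) for k in range(21)]


def stage_two_state(demand_kbps, share, capacity_kbps, cost_per_kbps, budget):
    """Return the Table 4 state (1, 2, or 3) for a candidate offloading share.

    Assumption: availability and budget are checked for the offloaded part
    of the demand, i.e. demand_kbps * share.
    """
    offloaded = demand_kbps * share
    if offloaded > capacity_kbps:
        return 1  # no availability in cloud or fog
    if offloaded * cost_per_kbps > budget:
        return 2  # enough availability but no budget
    return 3      # enough availability and budget


# Hypothetical device: 30 kbps demand, cloud cost of 1 per kbps, budget of 20.
for share in (0.5, 1.0):
    print(share, stage_two_state(30.0, share, 100.0, 1.0, 20.0))
```

With these made-up numbers, offloading everything exceeds the budget, whereas offloading half of the data does not, which is the kind of trade-off the second stage is meant to learn.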

4.2.1. Penalty Function Determination. The penalty function of this stage is determined with a procedure similar to that of the first stage; hence, there are three cost elements: a constant term, energy consumption, and monetary cost. Similar to the first stage, the constant value ensures ending up at the highest possible state level. Having the energy consumption and monetary cost elements simultaneously makes it possible to find the best trade-off between the two. However, unlike the first stage, these elements are calculated for the piece of data that is planned to be transferred, as specifying the best amount is the objective of this learning stage. Similarly, the coefficients are obtained empirically.

The interaction between Stage 1 and Stage 2 in the learning process is depicted in Algorithms 1 and 2, respectively.
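The hand-over between the two stages can be sketched in Python as below. The text lists action selection parameters and a decaying rate in Table 1 but does not spell out the selection rule, so the epsilon-greedy rule and the multiplicative decay used here are assumptions, as are the function names and the toy $Q$-values. The only parts taken from the text are that $A_1$ and $A_4$ denote local processing and that Stage 2 is triggered otherwise.

```python
import random

LOCAL_ACTIONS = {0, 3}  # zero-based indices of A1 and A4 (device processing)
RHO = 0.8               # decaying rate for the selection parameters (Table 1)


def select_action(q_row, epsilon, rng):
    """Assumed epsilon-greedy rule: explore with probability epsilon,
    otherwise pick the action with the lowest Q-value (penalty)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return min(range(len(q_row)), key=lambda a: q_row[a])


def decay(epsilon, rho=RHO):
    """Assumed multiplicative decay of the selection parameter."""
    return epsilon * rho


def needs_stage_two(action):
    """Stage 2 runs only when Stage 1 chose fog or cloud processing."""
    return action not in LOCAL_ACTIONS


rng = random.Random(0)
epsilon_1 = 0.8                      # initial Stage 1 value from Table 1
q_row = [4.0, 2.5, 3.0, 1.0, 2.0]    # hypothetical Q-values for A1..A5
for _ in range(3):
    action = select_action(q_row, epsilon_1, rng)
    print(action, needs_stage_two(action), round(epsilon_1, 4))
    epsilon_1 = decay(epsilon_1)
```

As the selection parameter decays, the agent explores less and increasingly exploits the lowest-penalty entry of the $Q$-table.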

5. Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted, and in each, random battery levels are allocated to each of the devices.

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out of budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of the data processing locations, such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, the number of out of budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out of budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100, \tag{7}$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A-F and $Q$-learning, respectively.

On the other hand, the gain/loss values for the number of out of budget devices in Figure 5 are calculated using the function given as

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100. \tag{8}$$

Algorithm 1: First stage learning.
Data: Context-aware constraints, available computational capacity in gateway and eNB, budget
Result: Combination of connectivity route and processing venue
initialization
for all IoT devices do
    Determine the current state using Table 3
    Evaluate all the actions
    Calculate the penalty using Table 3
    Select the best action
    Jump to the next state
    Update the $Q$-table
    if the selected action includes fog (gateway) or cloud (eNB) processing then
        go to Algorithm 2
    end
end

Algorithm 2: Second stage learning.
Data: Action selected by the first stage, available computational capacity in gateway and eNB, budget
Result: Share of data to be offloaded
initialization
for all IoT devices do
    Determine the current state using Table 4
    Evaluate all the actions
    Calculate the penalty using Table 4
    Select the best action
    Jump to the next state
    Update the $Q$-table
end

[Figure 4 plot: simulation area spanning -200 to 200 on both axes, showing the LTE eNB, WiFi gateways 1 to 5, and the IoT devices.]

Figure 4: Sample snapshot of the simulation environment. IoT devices are located randomly, while the positions of the gateways are fixed.

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, which are reported along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters (energy consumption, cost, dissatisfaction, and number of out of budget devices). However, before the summation, these four parameters are feature scaled into the range of [0, 1] using the function in (5) in order to keep their impacts on the same scale.
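The joint cost and the gains can be recomputed from the mean values of Table 5 with a short Python sketch. The code below is an illustrative check under the assumption that the joint cost is the plain sum of the four column-wise min-max scaled means; the dictionary layout and function names are hypothetical. Because it starts from the rounded means, the recomputed joint costs agree with the Joint Cost column only to within about 0.001.

```python
# Mean values from Table 5: (energy in mJ, cost, dissatisfaction, out of budget devices).
TABLE_5 = {
    "Q-learning": (5.69, 96.77, 1.8, 0.0),
    "Scenario A": (14.88, 0.24, 5.1, 0.0),
    "Scenario B": (7.55, 118.57, 5.29, 3.03),
    "Scenario C": (8.16, 12.07, 5.81, 0.0),
    "Scenario D": (0.83, 130.41, 6.0, 3.03),
    "Scenario E": (7.48, 119.68, 7.81, 2.97),
    "Scenario F": (0.15, 238.02, 8.0, 6.0),
}
N_G = 10  # number of IoT devices per gateway (Table 1)


def min_max_scale(values):
    """Eq. (5) applied to one metric across all methods."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


def gain(p_s, p_q):
    """Eq. (7): relative gain of Q-learning (p_q) over a scenario (p_s), in percent."""
    return (p_s - p_q) / p_q * 100.0


def out_of_budget_share(count, n_devices=N_G):
    """Eq. (8): share of out of budget devices, in percent."""
    return count / n_devices * 100.0


names = list(TABLE_5)
columns = list(zip(*TABLE_5.values()))            # one tuple per metric
scaled_columns = [min_max_scale(col) for col in columns]
joint = {name: sum(col[i] for col in scaled_columns) for i, name in enumerate(names)}

for name in names[1:]:
    print(f"{name}: joint = {joint[name]:.4f}, "
          f"joint gain = {gain(joint[name], joint['Q-learning']):.1f}%")
```

The same `gain` function applied to the dissatisfaction column gives the range of compliance gains quoted in the discussion of the results, and `out_of_budget_share` converts the mean out of budget counts into percentages of the $N_G$ devices.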

Our method outperforms any fixed combination whenexamining the joint or holistic gain with values rangingfrom 959 to 28354 Similarly the reinforcement learningtechnique results in better matching between the context-aware constraint and the availability of the IoT networkcompare to any other scenario with gains varying from18333 to 34444 Although the processing cost of ourproposed method is higher than that of Scenario A theresulting gain in energy saving is even more important aswell as the context-aware constraint compliance The closestcontender to reinforcement learning with respect to thegenerated results is Scenario C in which the processing ofGroup A IoT devices is locally conducted while that of GroupB occurs in the gateway Nonetheless the reinforcementlearning allows for a device-driven context-aware connectiv-ity that improves the compliance criteria by more than twotimes while saving 4322 of energy resulting in a holisticgain of 5852 Scenario D manages to reduce the energyconsumption more than our proposed approach at the sametotal cost however 303 of the devices are out of budgetresulting in incomplete or interrupted computational tasks

Table 5: Results on various metrics for Q-learning and the scenarios.

| Method | Energy Consumption (mJ) | Cost | Dissatisfaction | Out of Budget Devices | Joint Cost |
|---|---|---|---|---|---|
| Q-Learning | 5.69 ± 0.322 | 96.77 ± 4.01 | 1.8 ± 0.291 | 0 ± 0 | 0.7822 |
| Scenario A | 14.88 ± 0.385 | 0.24 ± 6.15e-3 | 5.1 ± 0.28 | 0 ± 0 | 1.5323 |
| Scenario B | 7.55 ± 0.24 | 118.57 ± 4.49 | 5.29 ± 0.181 | 3.03 ± 0.217 | 2.0679 |
| Scenario C | 8.16 ± 0.284 | 12.07 ± 0.383 | 5.81 ± 0.208 | 0 ± 0 | 1.2399 |
| Scenario D | 0.83 ± 0.025 | 130.41 ± 4.54 | 6 ± 0 | 3.03 ± 0.217 | 1.7756 |
| Scenario E | 7.48 ± 0.281 | 119.68 ± 3.83 | 7.81 ± 0.208 | 2.97 ± 0.213 | 2.4643 |
| Scenario F | 0.15 ± 4.59e-3 | 238.02 ± 6.16 | 8 ± 0 | 6 ± 0.339 | 3.0000 |

Table 6: List of fixed scenarios with connection types and locations of data processing.

| Scenario | Group A | Group B |
|---|---|---|
| A | Device | Device |
| B | Cloud | Device |
| C | Device | Fog |
| D | Cloud | Fog |
| E | Device | Cloud |
| F | Cloud | Cloud |

Figure 5: Summary of results for $\varsigma = 0.1$. Positive and negative values reflect gain and loss, respectively. A gain occurs when Q-learning is better than the scenario, and a loss occurs when the scenario is better than Q-learning. (Plot not reproduced: "Gain of Q-learning over the scenarios"; gain (%) from -100 to 350 for Scenarios A to F; series: total energy (mJ), total cost, total dissatisfaction, out of budget devices, joint penalty.)

Moreover, in this scenario, connected devices are more than two times more likely to be dissatisfied with one or more of the context-aware requirements.

Next, we examine the impact of the battery priority factor $\varsigma$ on the energy efficiency. As shown in Figure 6, low values of $\varsigma$ result in almost neglecting the battery life of the device in the optimisation process until it drops below 10%. Very high values of $\varsigma$ prioritise the reduction of energy consumption for all devices except those that have more than 70% battery life. It is therefore possible to tune this parameter depending on the scenario at hand and in a device-specific manner. For instance, some devices may be part of a moving vehicle with the possibility of agile and low-cost battery replenishment. Such devices may benefit from low settings of $\varsigma$ to allow more flexibility in meeting the remaining constraints. Other devices may be in hard-to-reach places, where replacing a dead battery would require a skilled workforce and special equipment and hence incur a high cost. In this case, higher settings of $\varsigma$ are more suitable and would result in a better cost-to-quality ratio.

Figure 6: Impact of the energy prioritisation factor $\varsigma$. (Plot not reproduced: energy consumption (J, $\times 10^{-3}$) versus battery level (%) for $\varsigma$ = 0.1, 0.3, 0.5, 0.9, and 1.2.)

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not possible in fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence, which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, the Q-learning-based method selects the best action (connection type/processing location pair in the first stage and amount of data to be transmitted in the second stage) after convergence. We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, to the best of our knowledge, this work is the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement by employing the novel scheme, and highlight its effects on the application.

6. Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund (the DARE Project, EP/P028764/1). The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., "Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap," IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: a survey," Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, "Measurement-based signaling management strategies for cellular IoT," IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, "Internet of things for smart ports: technologies and challenges," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, "3GPP low power wide area technologies," GSMA White paper, Oct. 2016.

[6] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements," 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, "The menu at the IoT cafe: a guide to IoT wireless technologies," Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, "A survey on the challenges and opportunities of the Internet of Things (IoT)," in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, "Device-centric energy optimization for edge cloud offloading," in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, "Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing," in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, "NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications," in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, "Smart connectivity for internet of things (IoT) applications," in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, "M2M data aggregation over cellular networks: signaling-delay trade-offs," in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, "Joint service caching and task offloading for mobile edge computing in dense networks," ArXiv e-prints, 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, "Efficient C-RAN random access for IoT devices: learning links via recommendation systems," ArXiv e-prints, 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, "Learning IoT in edge: deep learning for the internet of things with edge computing," IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, "Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data," in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, "The edge cloud: a holistic view of communication, computation and caching," ArXiv e-prints, 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, "Edge-caching wireless networks: performance analysis and optimization," IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, "Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz," International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.

[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, "Convergence of optimistic and incremental q-learning," in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, "A distributed SON-based user-centric backhaul provisioning scheme," IEEE Access, vol. 4, pp. 2314–2330, 2016.


Table 1: System model parameters and simulation values.

| Parameter | Value | Description |
|---|---|---|
| $r_n$ | 200 m | eNB cell radius |
| $r_w$ | 30 m | WiFi cell radius |
| $N_G$ | 10 | Number of IoT devices per gateway |
| $\chi_d$ | 30 Kbps | Computational capacity (device) |
| $\chi_f$ | $10^2$ Kbps | Computational capacity (fog) |
| $\chi_c$ | $10^3$ Kbps | Computational capacity (cloud) |
| $\epsilon$ | $5 \times 10^{-9}$ Joule | Energy consumption per computational cycle |
| $\eta_d$ | $10^2$ | Required amount of computational cycle per data element (device) |
| $\eta_f$ | 10 | Required amount of computational cycle per data element (fog) |
| $\eta_c$ | 1 | Required amount of computational cycle per data element (cloud) |
| $N_0$ | -204 dBW/Hz | Noise density |
| $B$ | 180 kHz | Bandwidth |
| $P_{td}$ | $10^{-8}$ W | Average transmit power of the IoT devices in the gateways 2, 3, 4, and 5 |
| $T$ | 1 s | Time period |
| $\lambda$ | 0.5 | Q-table update parameter |
| $\phi$ | 0.9 | Q-table update parameter |
| $\varepsilon_1$ | 0.8 | Action selection parameter for Stage 1 |
| $\varepsilon_2$ | $10^4$ | Action selection parameter for Stage 2 |
| $\rho$ | 0.8 | Decaying rate for $\varepsilon_1$ and $\varepsilon_2$ |
| $S$ | 8 | Number of bits in each data element |
| $\vartheta$ | $10^3/S$ | Conversion of kbps data rates to number of data elements |
| $E_{pd}$ | $\epsilon \cdot \eta_d \cdot \lambda$ | Data processing energy consumption per data rate in kbps (device) |
| $E_{pf}$ | $\epsilon \cdot \eta_f \cdot \lambda$ | Data processing energy consumption per data rate in kbps (fog) |
| $E_{pc}$ | $\epsilon \cdot \eta_c \cdot \lambda$ | Data processing energy consumption per data rate in kbps (cloud) |
| $\Gamma_d$ | $10^{-4}$ | Cost of processing per kbps (device) |
| $\Gamma_f$ | $10^{-1}$ | Cost of processing per kbps (fog) |
| $\Gamma_c$ | 1 | Cost of processing per kbps (cloud) |
| $b$ | 20 | Budget |
| $\beta_1$ | $10^2$ | Constant coefficient for penalty comparison |
| $\beta_2$ | $10^{12}$ | Constant coefficient for penalty comparison |
| $K_w = K_l = K_n$ | 128.1 dB | Propagation loss constant for all wireless connection types (a, b, and c) |
| $\alpha_l = \alpha_n$ | 3.76 | Propagation loss exponent for NB-IoT and LTE wireless connection types (b and c) |
| $\alpha_w$ | 3 | Propagation loss exponent for Wi-Fi (802.11g) wireless connection type (a) |

where $N_h = \{1, 2\}$ is the number of hops and $D = \{D_r, D_p\}$. Besides, $t_{t_i}$ and $D_i$ represent the values of $t_t$ and $D$ for the $i$th hop, respectively. Then, the calculated values populate Table 2 after the application of feature scaling into the range of [0, 1] using the function given as

$$f(x) = \frac{x - \min(X)}{\max(X) - \min(X)}, \tag{5}$$

where $X$ is the set of $x$. Note that both (a) and (b) type connections constitute the first hop, while the connection type (c) is the second hop.
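As a minimal sketch of the feature scaling in (5), the snippet below maps a set of values into [0, 1]. The function name `min_max_scale`, the sample response times, and the treatment of a constant set (every value mapped to 0) are illustrative assumptions rather than part of the original model.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values into [0, 1] via f(x) = (x - min X) / (max X - min X)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: when all entries are equal the ratio is undefined,
        # so every entry is mapped to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical response times (seconds) of three candidate connections.
response_times = [0.2, 0.5, 1.0]
scaled = min_max_scale(response_times)
```

The smallest entry maps to 0 and the largest to 1, so quantities with very different units can be compared or summed on a common scale.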

4. Machine Learning-Based Solution

In this work, we propose to employ reinforcement learning (RL), a machine learning technique based on a goal-seeking approach. It is a trial-and-error approach in which the agent (or learning device) learns to take the correct action by interacting with its surroundings and being rewarded or penalised in each iteration. RL is selected in this work due to its applicability to the presented problem. For example, IoT devices need to interact with their environment in order to assess the circumstances and to take subsequent actions, namely, determining the connection type and the data processing location. RL therefore maps to this requirement very well, since it allows optimisation through environmental interactions.

Table 2: Stage one action list.

| Action | Connection | Processor | Tuple |
|---|---|---|---|
| $A_1$ | Wi-Fi | Device | $A_1 = [0.004, 1, \chi_d, (E_{ta} + E_{tc} + E_{pd} \cdot \theta), \Gamma_d]$ |
| $A_2$ | Wi-Fi | Fog | $A_2 = [0.62, 1, \chi_f, (E_{ta} + E_{tc} + E_{pf} \cdot \theta), \Gamma_f]$ |
| $A_3$ | Wi-Fi | Cloud | $A_3 = [1, 1, \chi_c, (E_{ta} + E_{tc} + E_{pc} \cdot \theta), \Gamma_c]$ |
| $A_4$ | NB-IoT | Device | $A_4 = [0, 0, \chi_d, (E_{tb} + E_{pd} \cdot \theta), \Gamma_d]$ |
| $A_5$ | NB-IoT | Cloud | $A_5 = [0.2, 0, \chi_c, (E_{tb} + E_{pc} \cdot \theta), \Gamma_c]$ |

Being one of the most prominent reinforcement learning techniques, Q-learning aims to find the optimum policy for a given problem, that is, the best action to take at any given state. To do this, the agent takes an action and evaluates the subsequent reward/cost of taking that action given that it was in a certain state. This reward/cost is then used to update a look-up table, known as the Q-table, which is later utilised by the agent to select the best action. Further, the agent calculates the Q-value for every possible state/action pair. Therefore, a simple implementation can result in the agent learning online the best actions regardless of the policy.

Moreover, Q-learning offers two key features which enable an efficient solution to our problem. First, as it is a model-free learning approach [21, 22], it is (1) capable of operating in dynamically changing environments and (2) a low-complexity algorithm which does not require a lot of power, thus reducing the energy consumption of the IoT network. Second, Q-learning is known to converge in most cases [23], which has also been demonstrated in multiagent noncooperative environments [24], as are IoT networks.

We propose a two-stage approach to solve the energy-aware smart IoT connectivity, where each of the stages employs Q-learning.

4.1. First Stage Learning. Stage 1 consists of learning the best combination of connectivity and processing location in view of the device and application requirements and the limitations offered by each of these options. Thus, there are five possible actions that may be taken by each device, as described in Table 2. As a side note, all the variables in Table 2 are the feature-scaled values (into the range of [0, 1]) calculated through (5). The tuples shown represent the limitations of each action, e.g., $A_i = [R, \Sigma, \chi_l, E_t + E_p, \Gamma_l]$, where $R$ and $E_t + E_p$ are described in Sections 3.3 and 3.2, respectively, $\chi_l$ is the available processing capacity, and $\Gamma_l$ is the processing cost, where $l = \{d, f, c\}$ as defined in Table 1. The parameter $\Sigma = \{1, 2\}$ refers to the level of data security offered by the wireless technology, whereby the value 1 indicates eSIM protection (only provided by NB-IoT) and 2 the absence of that. Moreover, each device may be in four different states, as shown in Table 3, depending on the context-aware constraints defined jointly by the device and application. These constraints are $R'$, $\Sigma'$, and $\chi'$, which represent the response time, security level, and computational power requirements, respectively.
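The mapping from satisfied constraints to the four Stage 1 states of Table 3 can be sketched as follows. The comparison directions (a lower response time, a lower $\Sigma$ value, and a higher capacity are taken to satisfy a requirement) and all names are assumptions made for illustration; the text does not spell them out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristics:
    response_time: float  # R: response time, lower is assumed better
    security_level: int   # Sigma: 1 = eSIM protection, 2 = no eSIM protection
    capacity: float       # chi: processing capacity in kbps


def stage_one_state(available: Characteristics, required: Characteristics) -> int:
    """Return k in 1..4 for state sigma_k: one plus the number of satisfied constraints."""
    satisfied = (
        available.response_time <= required.response_time,
        available.security_level <= required.security_level,
        available.capacity >= required.capacity,
    )
    return 1 + sum(satisfied)


# Hypothetical NB-IoT option with on-device processing and a demanding application.
offered = Characteristics(response_time=0.0, security_level=1, capacity=30.0)
needed = Characteristics(response_time=0.5, security_level=1, capacity=100.0)
state = stage_one_state(offered, needed)  # response time and security hold, capacity does not
```

With two of the three constraints met, the device lands in the third state, for which Table 3 assigns a smaller constant penalty than for the first two states.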

4.1.1. Penalty Function Determination. Each device will estimate the penalty function associated with each possible action it is able to take, following the system shown in Table 3, where $\varphi_p = \{R - R', \Sigma - \Sigma', \chi - \chi' \mid p = 1, 2, 3\}$ is the difference between the available and required characteristics. The fourth penalty is $\varphi_4 = \chi' \cdot A_i^{(5)} - b$, where $A_i^{(5)}$ is the fifth index of the $i$th action and the parameter $b$ is the available budget.

The penalty function determination policy aims to satisfy the optimisation objective by including the elements that are to be minimised. As seen from Table 3, the penalty functions consist of three main elements: a constant term, the dissatisfaction level, and the energy consumption. The constant value is the cost of being in a given state, and it decreases as the level of the state increases. This element compels the agent to try to achieve the highest possible state level, as this is one of the objectives of the optimisation problem. The dissatisfaction level element, which supports the constant value, incurs a cost for not satisfying the device requirements in order to improve the satisfaction levels. Lastly, the energy consumption element minimises the end-to-end energy consumption (connection and data processing). The parameter $0 \le \nu \le 1$ is the battery level, where 0 represents an empty battery and 1 represents a full charge. In the expressions in Table 3, the parameter $\varsigma$ specifies the priority level of the energy consumption. For instance, low values of $\varsigma$ prioritise the energy consumption only once the battery level $\nu$ is very low (e.g., 5%), while high values prioritise the energy consumption even when the battery level is high (e.g., 50%).
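To illustrate how the priority factor interacts with the battery level, the sketch below assumes that the energy term of the Stage 1 penalty is weighted by $10^{\varsigma/\nu}$. This form is one reading of the typeset penalty expressions and should be treated as an assumption, as should the function name and the sample values; an empty battery ($\nu = 0$) is excluded because the ratio is then undefined.

```python
def energy_weight(priority: float, battery: float) -> float:
    """Assumed weight 10 ** (priority / battery) for a battery level in (0, 1]."""
    if not 0.0 < battery <= 1.0:
        raise ValueError("battery level must lie in (0, 1]")
    return 10.0 ** (priority / battery)


# Compare a low and a high priority factor at three hypothetical battery levels.
for priority in (0.1, 1.2):
    for battery in (0.05, 0.5, 1.0):
        weight = energy_weight(priority, battery)
        print(f"priority={priority} battery={battery:.2f} weight={weight:.3g}")
```

Under this assumption, a low priority factor leaves the weight close to 1 until the battery is nearly empty, whereas a high priority factor already yields a large weight at a half-charged battery, which matches the qualitative behaviour described above.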

In addition, the algorithm normally tends to select an option with cloud processing, as it is the most energy-efficient one. However, some amount of data will not be offloaded due to budget constraints and will then be processed locally, which is the most energy-consuming option. Note that this amount is evaluated by the second stage learning. Thus, the option selected by the first stage could be more energy consuming than the option that includes fog processing, as the processing will be a combination of cloud and device. Therefore, the last parts of the penalty functions (inside the square brackets) prevent the algorithm from making blind decisions that ignore the budget availability by including the average energy consumption of the actions with device processing. The average value is taken because the final action is yet to be taken during the learning process. The coefficients of these three elements are determined empirically. However, they can be used to prioritise any element that is desired to be minimised more.

The Q-table entries are then updated according to the following expression, where $s$, $s'$, $P$, and $a$ are the current state, next state, penalty function, and action under evaluation, respectively:

$$Q(s, a) \longleftarrow Q(s, a) + \lambda \left( P(s) + \phi \min\left(Q(s', a)\right) - Q(s, a) \right). \tag{6}$$
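A minimal sketch of the update in (6) is given below, using the Table 1 values $\lambda = 0.5$ and $\phi = 0.9$. The minimum is read here as the lowest Q-value over all actions in the next state, which is an interpretation of the notation; the table contents, the penalty value, and the function name are hypothetical.

```python
LAMBDA = 0.5  # Q-table update parameter lambda (Table 1)
PHI = 0.9     # Q-table update parameter phi (Table 1)


def update_q(q, state, action, penalty, next_state):
    """Apply one penalty-minimising Q-learning update in place and return the new entry."""
    best_next = min(q[next_state])  # lowest expected penalty reachable from the next state
    q[state][action] += LAMBDA * (penalty + PHI * best_next - q[state][action])
    return q[state][action]


# Four Stage 1 states and five Stage 1 actions, initialised to zero.
q_table = [[0.0] * 5 for _ in range(4)]
q_table[2] = [3.0, 1.0, 4.0, 2.0, 5.0]  # hypothetical values already learned for the next state
new_value = update_q(q_table, state=0, action=1, penalty=10.0, next_state=2)
```

Because the table stores penalties rather than rewards, the best action in a state is the one with the smallest entry, which is why the update uses a minimum instead of the maximum found in reward-based formulations.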

Table 3: List of possible states of each device in Stage one and corresponding penalty calculation.

| State | Description | Penalty function ($P$) |
|---|---|---|
| $\sigma_1$ | None of the constraints are satisfied | $10^4 + \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_2$ | One constraint is satisfied | $5 \times 10^3 + 0.8 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_3$ | Two constraints are satisfied | $2 \times 10^3 + 0.6 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |

Table 4: List of possible states of each device in Stage two and corresponding penalty calculation.

| State | Description | Penalty function |
|---|---|---|
| 1 | No availability in cloud or fog for $\chi'$ | $10^3 + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^3 + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 3 | Enough availability and budget for $\chi'$ | $\beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |

4.2. Second Stage Learning. The second stage aims to find the best policy for task offloading by considering the budget and the availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., when neither $A_1$ nor $A_4$ is selected). In Stage 2, Q-learning is also employed, with 21 possible actions, $[0, 0.05, \ldots, 1]$, and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.
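The Stage 2 action set and the penalty expressions of Table 4 can be sketched as follows. The sketch reads the symbol $i$ in Table 4 as the Stage 2 action, that is, a share between 0 and 1; this reading, the function name, and the sample inputs are assumptions, while $\beta_2 = 10^{12}$ and the constant $10^3$ are taken from Tables 1 and 4.

```python
BETA_2 = 1e12  # constant coefficient beta_2 (Table 1)
OFFLOAD_SHARES = [k / 20 for k in range(21)]  # the 21 actions 0, 0.05, ..., 1


def stage_two_penalty(state, share, energy, demand, unit_cost):
    """Table 4 penalty: states 1 and 2 carry the constant 10^3, state 3 does not."""
    constant = 1e3 if state in (1, 2) else 0.0
    return constant + BETA_2 * (energy * (1.0 - share) + demand * share * unit_cost)


# Hypothetical inputs for a device in state 3: energy term A^(4), demand chi', unit cost A^(5).
penalties = [
    stage_two_penalty(3, share, energy=0.4, demand=0.3, unit_cost=1.0)
    for share in OFFLOAD_SHARES
]
best_share = OFFLOAD_SHARES[penalties.index(min(penalties))]
```

With these illustrative numbers the monetary term grows more slowly than the energy term shrinks, so the lowest penalty is reached at a share of 1; other inputs shift the trade-off towards smaller shares.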

4.2.1. Penalty Function Determination. The penalty function of this stage is determined with a procedure similar to that of the first stage; hence, there are three cost elements: a constant term, energy consumption, and monetary cost. Similar to the first stage, the constant value ensures ending up in the highest possible state level. Having the energy consumption and monetary cost elements simultaneously makes it possible to find the best trade-off between the two. However, unlike the first stage, these elements are calculated for the piece of data that is planned to be transferred, as specifying the best amount is the objective of this stage of learning. Similarly, the coefficients are obtained empirically.

The learning processes of Stage 1 and Stage 2, together with the interaction between them, are depicted in Algorithms 1 and 2, respectively.
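The hand-over between the two stages can be sketched as below: Stage 2 is consulted only when the Stage 1 action involves fog or cloud processing. The epsilon-greedy selection rule is an assumption (Table 1 only lists action selection parameters and a decaying rate), and the Q-values and function names are hypothetical.

```python
import random
from typing import Optional, Sequence, Tuple

STAGE_ONE_ACTIONS = ("A1", "A2", "A3", "A4", "A5")
LOCAL_ACTIONS = {"A1", "A4"}  # device processing, so Stage 2 is skipped
OFFLOAD_SHARES = [k / 20 for k in range(21)]  # Stage 2 actions 0, 0.05, ..., 1


def pick_index(q_row: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Explore with probability epsilon, otherwise take the lowest-penalty entry."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return min(range(len(q_row)), key=lambda i: q_row[i])


def decide(q1_row, q2_row, epsilon, rng) -> Tuple[str, Optional[float]]:
    """Return the Stage 1 action and, if Stage 2 runs, the selected share."""
    action = STAGE_ONE_ACTIONS[pick_index(q1_row, epsilon, rng)]
    if action in LOCAL_ACTIONS:
        return action, None
    return action, OFFLOAD_SHARES[pick_index(q2_row, epsilon, rng)]


rng = random.Random(0)
q2 = [abs(k - 10) for k in range(21)]  # hypothetical penalties, lowest at share 0.5
offloaded = decide([5.0, 2.0, 9.0, 4.0, 7.0], q2, epsilon=0.0, rng=rng)
local = decide([5.0, 6.0, 9.0, 4.0, 7.0], q2, epsilon=0.0, rng=rng)
```

Setting `epsilon` to zero, as in the two calls above, corresponds to purely greedy selection after convergence; a decaying positive value would add exploration during learning.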

5. Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted, and in each, random battery levels are allocated to each of the devices.

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out-of-budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of the data processing locations, such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, the number of out-of-budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out-of-budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100, \tag{7}$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A-F and Q-learning, respectively.
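As a worked instance of (7), the sketch below evaluates the gain on the joint cost column of Table 5 (0.7822 for Q-learning, 1.5323 for Scenario A, 1.2399 for Scenario C, and 3.0000 for Scenario F). The function and variable names are illustrative, and because the tabulated values are rounded, the results agree with the gains quoted in the text only up to rounding.

```python
def gain_percent(scenario_value: float, q_learning_value: float) -> float:
    """Relative gain of Q-learning over a scenario, in percent, following (7)."""
    return (scenario_value - q_learning_value) / q_learning_value * 100.0


JOINT_COST = {
    "Q-learning": 0.7822,
    "Scenario A": 1.5323,
    "Scenario C": 1.2399,
    "Scenario F": 3.0000,
}

gains = {
    name: gain_percent(value, JOINT_COST["Q-learning"])
    for name, value in JOINT_COST.items()
    if name != "Q-learning"
}
```

A positive value means that the scenario incurs a higher joint cost than Q-learning, so Q-learning gains; a negative value would indicate a loss.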

On the other hand, the gain/loss values for the number of out-of-budget devices in Figure 5 are calculated using the function given as

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100. \tag{8}$$

    Data: Context-aware constraints; available computational capacity in gateway and eNB; budget
    Result: Combination of connectivity route and processing venue
    initialization
    for all IoT devices do
        Determine the current state using Table 3
        Evaluate all the actions
        Calculate the penalty using Table 3
        Select the best action
        Jump to the next state
        Update the Q-table
        if the selected action includes fog (gateway) or cloud (eNB) processing then
            go to Algorithm 2
        end
    end

Algorithm 1: First stage learning.

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, which are reported along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters. However, before the summation, these four parameters (energy consumption, cost, dissatisfaction, and number of out-of-budget devices) are feature scaled into the range of [0, 1] using the function in (5) in order to keep their impacts on the same scale.
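The joint cost computation described above can be sketched by scaling each metric column of Table 5 with (5) and summing the scaled values per method. The averages below are those of Table 5 (energy in mJ, cost, dissatisfaction, out-of-budget devices); the function and variable names are illustrative assumptions.

```python
METRICS = {
    "Q-learning": (5.69, 96.77, 1.8, 0.0),
    "Scenario A": (14.88, 0.24, 5.1, 0.0),
    "Scenario B": (7.55, 118.57, 5.29, 3.03),
    "Scenario C": (8.16, 12.07, 5.81, 0.0),
    "Scenario D": (0.83, 130.41, 6.0, 3.03),
    "Scenario E": (7.48, 119.68, 7.81, 2.97),
    "Scenario F": (0.15, 238.02, 8.0, 6.0),
}


def joint_costs(metrics):
    """Min-max scale every metric across methods, then sum the scaled values per method."""
    names = list(metrics)
    totals = dict.fromkeys(names, 0.0)
    for column in zip(*metrics.values()):  # one column per metric
        lo, hi = min(column), max(column)
        for name, value in zip(names, column):
            totals[name] += (value - lo) / (hi - lo)
    return totals


totals = joint_costs(METRICS)
```

Up to the rounding of the tabulated averages, the sums reproduce the joint cost column of Table 5, for example about 0.78 for Q-learning and exactly 3 for Scenario F, which is the worst method on three of the four metrics.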

Our method outperforms any fixed combination whenexamining the joint or holistic gain with values rangingfrom 959 to 28354 Similarly the reinforcement learningtechnique results in better matching between the context-aware constraint and the availability of the IoT networkcompare to any other scenario with gains varying from18333 to 34444 Although the processing cost of ourproposed method is higher than that of Scenario A theresulting gain in energy saving is even more important aswell as the context-aware constraint compliance The closestcontender to reinforcement learning with respect to thegenerated results is Scenario C in which the processing ofGroup A IoT devices is locally conducted while that of GroupB occurs in the gateway Nonetheless the reinforcementlearning allows for a device-driven context-aware connectiv-ity that improves the compliance criteria by more than twotimes while saving 4322 of energy resulting in a holisticgain of 5852 Scenario D manages to reduce the energyconsumption more than our proposed approach at the sametotal cost however 303 of the devices are out of budgetresulting in incomplete or interrupted computational tasks

Wireless Communications and Mobile Computing 9

Table 5 Results on various metrics for 119876-learning and the scenarios

Energy Consumption (mJ) Cost Dissatisfaction Out of Budget Devices Joint CostQ-Learning 569 plusmn 0322 9677 plusmn 401 18 plusmn 0291 0 plusmn 0 07822Scenario A 1488 plusmn 0385 024 plusmn 615119890minus3 51 plusmn 028 0 plusmn 0 15323Scenario B 755 plusmn 024 11857 plusmn 449 529 plusmn 0181 303 plusmn 0217 20679Scenario C 816 plusmn 0284 1207 plusmn 0383 581 plusmn 0208 0 plusmn 0 12399Scenario D 083 plusmn 0025 13041 plusmn 454 6 plusmn 0 303 plusmn 0217 17756Scenario E 748 plusmn 0281 11968 plusmn 383 781 plusmn 0208 297 plusmn 0213 24643Scenario F 015 plusmn 459119890minus3 23802 plusmn 616 8 plusmn 0 6 plusmn 0339 30000

Table 6 List of fixed scenarios with connection types and locationsof data processing

Scenario Group A Group BA Device DeviceB Cloud DeviceC Device FogD Cloud FogE Device CloudF Cloud Cloud

Gain of Q-learning over the scenarios

Scenario A

Scenario B

Scenario C

Scenario D

Scenario E

Scenario Fminus100

minus50

0

50

100

150

200

250

300

350

Gai

n (

)

Total energy (mJ)Total costTotal dissatisfaction

Out of budget devicesJoint Penalty

Figure 5 Summary of results for 120589 = 01 Positive and negativevalues reflect gain and loss respectively Gainloss occurs when the119876-learningscenarios is better than the scenarios119876-learning

Moreover in this scenario connected devices are more thantwo times more likely to be dissatisfied with one or more ofthe context-aware requirements

Next we examine the impact of the battery priority factor120589 on the energy efficiency As shown in Figure 6 low valuesof 120589 result in almost neglecting the battery life of the device inthe optimisation process until it drops below 10 Very high

0 10 20 30 40 50 60 70 80 90 100Battery Level ()

1

2

3

4

5

6

7

Ener

gy C

onsu

mpt

ion

(J)

Impact of the Energy Prioritization Factor ()

= 01 = 03 = 05

= 09 = 12

times10minus3

Figure 6 Impact of energy prioritisation factor 120589

values of 120589 prioritise the reduction of energy consumption forall devices except those that have higher than 70battery lifeTo this end it is possible to tune this parameter dependingon the scenario at hand and in a device-specific manner Forinstance some devices may be part of a moving vehicle withthe possibility of agile and low cost battery replenishmentSuch devices may benefit from low settings of 120589 to allowmore flexibility in meeting the remaining constraints Otherdevices may be in hard-to-reach places and would requireskilled force special equipment and hence high cost toreplace the dead battery In this case higher settings of 120589 aremore suitable and would result in better cost to quality ratio

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not possible in fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, the $Q$-learning-based method selects the best action (connection type/processing location pair in the first stage and amount of data to be transmitted in the second stage) after convergence. We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, this work is undoubtedly the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power, as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement by employing the novel scheme, and highlight its effects on the application.

6. Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund (the DARE Project, EP/P028764/1). The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., "Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap," IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: a survey," Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, "Measurement-based signaling management strategies for cellular IoT," IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, "Internet of things for smart ports: technologies and challenges," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, "3GPP low power wide area technologies," GSMA White paper, Oct. 2016.

[6] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements," 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, "The menu at the IoT cafe: a guide to IoT wireless technologies," Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, "A survey on the challenges and opportunities of the Internet of Things (IoT)," in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, "Device-centric energy optimization for edge cloud offloading," in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, "Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing," in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, "NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications," in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, "Smart connectivity for internet of things (IoT) applications," in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, "M2M data aggregation over cellular networks: signaling-delay trade-offs," in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, "Joint service caching and task offloading for mobile edge computing in dense networks," ArXiv e-prints, 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, "Efficient C-RAN random access for IoT devices: learning links via recommendation systems," ArXiv e-prints, 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, "Learning IoT in edge: deep learning for the internet of things with edge computing," IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, "Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data," in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, "The edge cloud: a holistic view of communication, computation and caching," ArXiv e-prints, 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, "Edge-caching wireless networks: performance analysis and optimization," IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, "Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz," International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.


[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, "Convergence of optimistic and incremental q-learning," in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, "A distributed SON-based user-centric backhaul provisioning scheme," IEEE Access, vol. 4, pp. 2314–2330, 2016.


Table 2: Stage one action list.

| Action | Connection | Processor | Tuple |
|---|---|---|---|
| $A_1$ | Wi-Fi | Device | $A_1 = [0.004,\ 1,\ \chi_d,\ (E_{ta} + E_{tc} + E_{pd} \cdot \theta),\ \Gamma_d]$ |
| $A_2$ | Wi-Fi | Fog | $A_2 = [0.62,\ 1,\ \chi_f,\ (E_{ta} + E_{tc} + E_{pf} \cdot \theta),\ \Gamma_f]$ |
| $A_3$ | Wi-Fi | Cloud | $A_3 = [1,\ 1,\ \chi_c,\ (E_{ta} + E_{tc} + E_{pc} \cdot \theta),\ \Gamma_c]$ |
| $A_4$ | NB-IoT | Device | $A_4 = [0,\ 0,\ \chi_d,\ (E_{tb} + E_{pd} \cdot \theta),\ \Gamma_d]$ |
| $A_5$ | NB-IoT | Cloud | $A_5 = [0.2,\ 0,\ \chi_c,\ (E_{tb} + E_{pc} \cdot \theta),\ \Gamma_c]$ |

in a certain state. This reward/cost is then used to update a look-up table, known as the $Q$-table, which is later utilised by the agent to select the best action. Further, the agent calculates the $Q$-value for every possible state/action pair. Therefore, a simple implementation can result in the agent learning online the best actions regardless of the policy.

Moreover, $Q$-learning offers two key features which enable an efficient solution to our problem. First, as it is a model-free learning approach [21, 22], it is (1) capable of operating in dynamically changing environments and (2) a low-complexity algorithm which does not require a lot of power, thus reducing the energy consumption of the IoT network. Second, $Q$-learning is known to converge in most cases [23], which has also been demonstrated in multiagent noncooperative environments [24], such as IoT networks.

We propose a two-stage approach to solve the energy-aware smart IoT connectivity problem, where each of the stages employs $Q$-learning.

4.1. First Stage Learning. Stage 1 consists of learning the best combination of connectivity and processing location in view of the device and application requirements and the limitations offered by each of these options. Thus, there are five possible actions that may be taken by each device, as described in Table 2. As a side note, all the variables in Table 2 are feature-scaled values (into the range of [0, 1]) calculated through (5). The tuples shown represent the limitations of each action, e.g., $A_i = [R, \Sigma, \chi_l, E_t + E_p, \Gamma_l]$, where $R$ and $E_t + E_p$ are described in Sections 3.3 and 3.2, respectively, $\chi_l$ is the available processing capacity, and $\Gamma_l$ is the processing cost, with $l \in \{d, f, c\}$ as defined in Table 1. The parameter $\Sigma \in \{1, 2\}$ refers to the level of data security offered by the wireless technology, whereby the value 1 indicates eSIM protection (only provided by NB-IoT) and 2 the absence of it. Moreover, each device may be in four different states, as shown in Table 3, depending on the context-aware constraints defined jointly by the device and application. These constraints are $R'$, $\Sigma'$, and $\chi'$, which represent the response time, security level, and computational power requirements, respectively.

4.1.1. Penalty Function Determination. Each device will estimate the penalty function associated with each possible action it is able to take, following the system shown in Table 3, where $\varphi_p = \{R - R',\ \Sigma - \Sigma',\ \chi - \chi' \mid p = 1, 2, 3\}$ is the difference between the available and required characteristics. The fourth penalty is $\varphi_4 = \chi' \cdot A_i^{(5)} - b$, where $A_i^{(5)}$ is the fifth element of the $i$th action and the parameter $b$ is the available budget.
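The definitions of $\varphi_1$ to $\varphi_4$ above can be sketched in a few lines of Python. This is only an illustration: the class names, field names, and numeric inputs are hypothetical, and only the four differences themselves come from the text.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """What an action makes available (feature-scaled, as in Table 2)."""
    response_time: float  # R
    security: float       # Sigma
    capacity: float       # chi
    unit_cost: float      # fifth tuple element, A_i^(5)


@dataclass
class Requirement:
    """What the device and application jointly require."""
    response_time: float  # R'
    security: float       # Sigma'
    capacity: float       # chi'


def penalty_differences(offer, req, budget):
    """Return ([phi_1, phi_2, phi_3], phi_4) as defined in Section 4.1.1."""
    phi = [
        offer.response_time - req.response_time,  # phi_1 = R - R'
        offer.security - req.security,            # phi_2 = Sigma - Sigma'
        offer.capacity - req.capacity,            # phi_3 = chi - chi'
    ]
    phi4 = req.capacity * offer.unit_cost - budget  # phi_4 = chi' * A_i^(5) - b
    return phi, phi4


# Hypothetical inputs, chosen only to exercise the function.
phi, phi4 = penalty_differences(
    Offer(response_time=0.62, security=1.0, capacity=0.8, unit_cost=0.5),
    Requirement(response_time=0.5, security=0.0, capacity=0.4),
    budget=0.1,
)
```

A positive $\varphi_4$ here means that processing the required load $\chi'$ at this location would cost more than the budget $b$ allows.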

The penalty function determination policy aims to satisfy the optimisation objective by including the elements that are to be minimised. As seen from Table 3, the penalty functions consist of three main elements: a constant term, the dissatisfaction level, and the energy consumption. The constant term is the cost of being in a given state, and it decreases as the level of the state increases. This element compels the agent to try to reach the highest possible state level, as this is one of the objectives of the optimisation problem. The dissatisfaction element, which supports the constant term, incurs a cost for not satisfying the device requirements in order to improve the satisfaction levels. Lastly, the energy consumption element minimises the end-to-end energy consumption (connection and data processing). The parameter $0 \le \nu \le 1$ is the battery level, where 0 represents an empty battery and 1 represents a full charge. In the expressions in Table 3, the parameter $\varsigma$ specifies the priority level of the energy consumption. For instance, low values of $\varsigma$ prioritise the energy consumption only once the battery level $\nu$ is very low (e.g., 5%), while high values prioritise the energy consumption even when the battery level is high (e.g., 50%).

In addition, the algorithm normally tends to select an option with cloud processing, as it is the most energy-efficient one. However, some amount of data will not be offloaded due to budget constraints and will then be processed locally, which is the most energy-consuming option. Note that this amount is evaluated by the second stage learning. Thus, the option selected by the first stage would be more energy consuming than the option that includes fog processing, as the processing will be a combination of the cloud and the device. Therefore, the last parts of the penalty functions (inside the square brackets) prevent the algorithm from making blind decisions that ignore the budget availability, by including an average energy consumption of the actions with device processing. The reason for taking the average value is that the final action is yet to be taken during the learning process. The coefficients of these three elements are determined empirically; however, they can be used to prioritise any element that is desired to be minimised more.

The $Q$-table entries are then updated according to the following expression, where $s$, $s'$, $P$, and $a$ are the current state, next state, penalty function, and action under evaluation, respectively:

$$Q(s, a) \longleftarrow Q(s, a) + \lambda \left( P(s) + \phi \min\left(Q(s', a)\right) - Q(s, a) \right). \tag{6}$$
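As a minimal sketch of update (6), the snippet below keeps the $Q$-table in a dictionary keyed by (state, action). Several points are assumptions rather than details from the paper: the minimum is read as running over all actions available in the next state, and the function names, the default values of $\lambda$ (`lr`) and $\phi$ (`discount`), and the state and action labels are purely illustrative. Because the table stores penalties, the best action is the one with the lowest entry.

```python
from collections import defaultdict


def q_update(q, state, action, next_state, penalty, actions, lr=0.1, discount=0.9):
    """Apply (6): Q(s,a) <- Q(s,a) + lr * (P + discount * min Q(s', .) - Q(s,a))."""
    best_next = min(q[(next_state, a)] for a in actions)
    q[(state, action)] += lr * (penalty + discount * best_next - q[(state, action)])
    return q[(state, action)]


def best_action(q, state, actions):
    """Penalties are minimised, so the best action has the lowest Q-value."""
    return min(actions, key=lambda a: q[(state, a)])


# Illustrative use with two hypothetical actions and states.
actions = ["A1", "A2"]
q = defaultdict(float)          # unseen (state, action) pairs start at 0
q[("s2", "A1")] = 4.0
q[("s2", "A2")] = 2.0
updated = q_update(q, "s1", "A1", "s2", penalty=10.0, actions=actions,
                   lr=0.5, discount=0.5)
```

With these inputs the updated entry is 0 + 0.5 × (10 + 0.5 × 2 − 0) = 5.5.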


Table 3: List of possible states of each device in Stage one and corresponding penalty calculation.

| State | Description | Penalty function ($P$) |
|---|---|---|
| $\sigma_1$ | None of the constraints are satisfied | $10^4 + \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_2$ | One constraint is satisfied | $5 \times 10^3 + 0.8 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_3$ | Two constraints are satisfied | $2 \times 10^3 + 0.6 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + 10^{\varsigma/\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |

Table 4: List of possible states of each device in Stage two and corresponding penalty calculation.

| State | Description | Penalty function |
|---|---|---|
| 1 | No availability in cloud or fog for $\chi'$ | $10^3 + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^3 + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 3 | Enough availability and budget for $\chi'$ | $\beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |

4.2. Second Stage Learning. The second stage aims to find the best policy for task offloading by considering the budget and availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., when it is neither $A_1$ nor $A_4$). In Stage 2, $Q$-learning is also employed, with 21 possible actions, $[0, 0.05, \ldots, 1]$, and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.
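The activation rule and the Stage 2 action grid described above can be written down directly. In this sketch the names `LOCAL_ACTIONS`, `stage_two_needed`, and `OFFLOAD_SHARES` are hypothetical; the grid values and the local-processing actions follow the text and Table 2.

```python
# Stage 1 actions whose processor is the device itself (Table 2).
LOCAL_ACTIONS = {"A1", "A4"}


def stage_two_needed(stage_one_action):
    """Stage 2 runs only if Stage 1 chose fog or cloud processing."""
    return stage_one_action not in LOCAL_ACTIONS


# 21 candidate shares of data to offload: 0, 0.05, ..., 1.
# Rounding avoids floating-point drift such as 0.15000000000000002.
OFFLOAD_SHARES = [round(0.05 * k, 2) for k in range(21)]
```

Each entry of `OFFLOAD_SHARES` is one Stage 2 action, so the Stage 2 $Q$-table has 21 columns and one row per state in Table 4.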

4.2.1. Penalty Function Determination. The penalty function of this stage is determined with a procedure similar to that of the first stage; hence, there are three cost elements: a constant term, the energy consumption, and the monetary cost. Similar to the first stage, the constant term ensures ending up in the highest possible state level. Including the energy consumption and monetary cost elements simultaneously makes it possible to find the best trade-off between the two. However, unlike the first stage, these elements are calculated for the portion of data that is planned to be transferred, as specifying the best amount is the objective of this learning stage. Similarly, the coefficients are obtained empirically.

The interaction between Stage 1 and Stage 2 in the learning process is depicted in Algorithms 1 and 2, respectively.

5. Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted, and in each, random battery levels are allocated to each of the devices.
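The count of six fixed scenarios follows from the options in Table 2: NB-IoT devices (Group A) can process on the device or in the cloud, while WiFi devices (Group B) can also use the fog. A short enumeration, with hypothetical variable names, reproduces the labelling of Table 6:

```python
from itertools import product

GROUP_A_OPTIONS = ["Device", "Cloud"]         # NB-IoT: no gateway, hence no fog
GROUP_B_OPTIONS = ["Device", "Fog", "Cloud"]  # WiFi via gateway

# Iterate Group B in the outer loop so the labels match Table 6 (A to F).
scenarios = {
    label: (group_a, group_b)
    for label, (group_b, group_a) in zip(
        "ABCDEF", product(GROUP_B_OPTIONS, GROUP_A_OPTIONS)
    )
}

for label, (group_a, group_b) in scenarios.items():
    print(f"Scenario {label}: Group A -> {group_a}, Group B -> {group_b}")
```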

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out of budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of the data processing locations, such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, number of out of budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out of budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100, \tag{7}$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A-F and $Q$-learning, respectively.
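As a quick check of (7), the following sketch computes the gain for the dissatisfaction metric, using the mean values 1.8 ($Q$-learning), 5.1 (Scenario A), and 8 (Scenario F) read from Table 5. The function name is hypothetical.

```python
def gain_percent(p_scenario, p_qlearning):
    """Equation (7): relative gain of Q-learning over a fixed scenario, in %."""
    return (p_scenario - p_qlearning) / p_qlearning * 100.0


# Dissatisfaction means from Table 5.
gain_vs_a = gain_percent(5.1, 1.8)  # about 183.33
gain_vs_f = gain_percent(8.0, 1.8)  # about 344.44
```

These two values bracket the dissatisfaction gains of 183.33% to 344.44% quoted in the discussion of the results.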

On the other hand, the gain/loss values for the number of out of budget devices in Figure 5 are calculated using the function

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100. \tag{8}$$

Algorithm 1: First stage learning.

    Data: context-aware constraints, available computational capacity in gateway and eNB, budget
    Result: combination of connectivity route and processing venue
    initialization
    for all IoT devices do
        Determine the current state using Table 3
        Evaluate all the actions
        Calculate the penalty using Table 3
        Select the best action
        Jump to the next state
        Update the Q-table
        if the selected action includes fog (gateway) or cloud (eNB) processing then
            go to Algorithm 2
        end
    end

Algorithm 2: Second stage learning.

    Data: action selected by the first stage, available computational capacity in gateway and eNB, budget
    Result: share of data to be offloaded
    initialization
    for all IoT devices do
        Determine the current state using Table 4
        Evaluate all the actions
        Calculate the penalty using Table 4
        Select the best action
        Jump to the next state
        Update the Q-table
    end

Figure 4: Sample snapshot of the simulation environment (both axes span −200 to 200; legend: LTE eNB, WiFi GW 1 to 5, IoT devices). IoT devices are located randomly, while positions of the gateways are fixed.

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, which are reported along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters (energy consumption, cost, dissatisfaction, and number of out of budget devices). Before the summation, however, these four parameters are feature scaled into the range of [0, 1] using the function in (5) in order to keep their impacts on the same scale.
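The joint cost computation described above can be sketched as follows. The sketch assumes that the function in (5) is ordinary min-max scaling across the seven methods, which is an assumption here since (5) is defined elsewhere in the paper. The inputs are the mean values of Table 5, and the variable names are illustrative.

```python
def min_max_scale(values):
    """Scale a list of values linearly into [0, 1] (assumed form of (5))."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Mean values from Table 5, in the order: Q-learning, Scenarios A to F.
energy = [5.69, 14.88, 7.55, 8.16, 0.83, 7.48, 0.15]
cost = [96.77, 0.24, 118.57, 12.07, 130.41, 119.68, 238.02]
dissatisfaction = [1.8, 5.1, 5.29, 5.81, 6.0, 7.81, 8.0]
out_of_budget = [0.0, 0.0, 3.03, 0.0, 3.03, 2.97, 6.0]

scaled = [min_max_scale(col)
          for col in (energy, cost, dissatisfaction, out_of_budget)]

# Joint cost per method = sum of its four scaled metrics.
joint_cost = [sum(row) for row in zip(*scaled)]
```

Under this assumption the sums agree with the Joint Cost column of Table 5 to within rounding of the published means; for example, Scenario F attains the maximum on three metrics and the minimum on energy, giving exactly 3.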

Our method outperforms any fixed combination when examining the joint or holistic gain, with values ranging from 95.9% to 283.54%. Similarly, the reinforcement learning technique results in better matching between the context-aware constraints and the availability of the IoT network compared to any other scenario, with gains varying from 183.33% to 344.44%. Although the processing cost of our proposed method is higher than that of Scenario A, the resulting gain in energy saving is even more important, as is the context-aware constraint compliance. The closest contender to reinforcement learning with respect to the generated results is Scenario C, in which the processing of Group A IoT devices is conducted locally while that of Group B occurs in the gateway. Nonetheless, reinforcement learning allows for a device-driven context-aware connectivity that improves the compliance criteria by more than two times while saving 43.22% of energy, resulting in a holistic gain of 58.52%. Scenario D manages to reduce the energy consumption more than our proposed approach at the same total cost; however, 30.3% of the devices are out of budget, resulting in incomplete or interrupted computational tasks.


Table 5: Results on various metrics for $Q$-learning and the scenarios.

| Method | Energy Consumption (mJ) | Cost | Dissatisfaction | Out of Budget Devices | Joint Cost |
|---|---|---|---|---|---|
| Q-Learning | 5.69 ± 0.322 | 96.77 ± 4.01 | 1.8 ± 0.291 | 0 ± 0 | 0.7822 |
| Scenario A | 14.88 ± 0.385 | 0.24 ± 6.15e-3 | 5.1 ± 0.28 | 0 ± 0 | 1.5323 |
| Scenario B | 7.55 ± 0.24 | 118.57 ± 4.49 | 5.29 ± 0.181 | 3.03 ± 0.217 | 2.0679 |
| Scenario C | 8.16 ± 0.284 | 12.07 ± 0.383 | 5.81 ± 0.208 | 0 ± 0 | 1.2399 |
| Scenario D | 0.83 ± 0.025 | 130.41 ± 4.54 | 6 ± 0 | 3.03 ± 0.217 | 1.7756 |
| Scenario E | 7.48 ± 0.281 | 119.68 ± 3.83 | 7.81 ± 0.208 | 2.97 ± 0.213 | 2.4643 |
| Scenario F | 0.15 ± 4.59e-3 | 238.02 ± 6.16 | 8 ± 0 | 6 ± 0.339 | 3.0000 |

Table 6 List of fixed scenarios with connection types and locationsof data processing

Scenario Group A Group BA Device DeviceB Cloud DeviceC Device FogD Cloud FogE Device CloudF Cloud Cloud

Gain of Q-learning over the scenarios

Scenario A

Scenario B

Scenario C

Scenario D

Scenario E

Scenario Fminus100

minus50

0

50

100

150

200

250

300

350

Gai

n (

)

Total energy (mJ)Total costTotal dissatisfaction

Out of budget devicesJoint Penalty

Figure 5 Summary of results for 120589 = 01 Positive and negativevalues reflect gain and loss respectively Gainloss occurs when the119876-learningscenarios is better than the scenarios119876-learning

Moreover in this scenario connected devices are more thantwo times more likely to be dissatisfied with one or more ofthe context-aware requirements

Next we examine the impact of the battery priority factor120589 on the energy efficiency As shown in Figure 6 low valuesof 120589 result in almost neglecting the battery life of the device inthe optimisation process until it drops below 10 Very high

0 10 20 30 40 50 60 70 80 90 100Battery Level ()

1

2

3

4

5

6

7

Ener

gy C

onsu

mpt

ion

(J)

Impact of the Energy Prioritization Factor ()

= 01 = 03 = 05

= 09 = 12

times10minus3

Figure 6 Impact of energy prioritisation factor 120589

values of 120589 prioritise the reduction of energy consumption forall devices except those that have higher than 70battery lifeTo this end it is possible to tune this parameter dependingon the scenario at hand and in a device-specific manner Forinstance some devices may be part of a moving vehicle withthe possibility of agile and low cost battery replenishmentSuch devices may benefit from low settings of 120589 to allowmore flexibility in meeting the remaining constraints Otherdevices may be in hard-to-reach places and would requireskilled force special equipment and hence high cost toreplace the dead battery In this case higher settings of 120589 aremore suitable and would result in better cost to quality ratio

The simulation results achieved in this work are verypromising as they indicate a large margin for improvementthat is not possible in fixed connection schemes The pro-posed reinforcement learning method relies on centralisedintelligence which has access to all the constraints andrequirements of all devices gateways and connectionsHence the 119876-learning-based method selects the best action(connection typeprocessing location pair in the first stageand amount of data to be transmitted in the second stage)after the convergence We appreciate that such a deployment

10 Wireless Communications and Mobile Computing

is not realistic and propose to explore the feasibility andcorresponding gains of multiagent and distributed reinforce-ment learning as adopted in [24] in our future workNonetheless this work is undoubtedly the first to highlightthe importance of context-aware connectivity in the IoT con-text that addresses jointly security energy and computationalpower as well as cost We present a new application SmartPorts and quantify the potential margin for improvement byemploying the novel scheme and highlight its effects on theapplication

6 Conclusion

In this work we have presented novel approach for energy-aware and context-aware IoT connectivity that jointlyoptimises the energy security computational power andresponse time of the connection The proposed schemeemploys reinforcement learning and manages to achieve aholistic gain of up to 28354 compared to deterministicroutes Although some deterministic scenarios may resultin lower computational cost or lower energy consumptionnone is able to meet the holistic context-aware performancetarget In addition we presented an analysis of the impactof the energy prioritisation factor in which we demonstratedthe importance of tuning this parameter in a device-centricmanner in order to achieve better optimisation of the wholesystem

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This research was partly funded by EPSRCGlobal ChallengesResearch Fundmdashthe DARE ProjectmdashEPP0287641The firstauthor was supported by the Republic of Turkey Ministry ofNational Education (MoNE-1416YLSY)

References

[1] S Andreev O Galinina A Pyattaev et al ldquoUnderstandingthe IoT connectivity landscape a contemporary M2M radiotechnology roadmaprdquo IEEE Communications Magazine vol 53no 9 pp 32ndash40 2015

[2] L Atzori A Iera and G Morabito ldquoThe internet of things asurveyrdquoComputer Networks vol 54 no 15 pp 2787ndash2805 2010

[3] N Kouzayha M Jaber and Z Dawy ldquoMeasurement-basedsignaling management strategies for cellular IoTrdquo IEEE Internetof Things Journal vol 4 no 5 pp 1434ndash1444 2017

[4] Y Yang M Zhong H Yao F Yu X Fu and O PostolacheldquoInternet of things for smart ports technologies and chal-lengesrdquo IEEE Instrumentation Measurement Magazine vol 21no 1 pp 34ndash43 2018

[5] GSMA ldquo3GPP low power wide area technologiesrdquo GSMAWhite paper Oct 2016

[6] 3GPP ldquoEvolved Universal Terrestrial Radio Access (E-UTRA)LTE coverage enhancementsrdquo 3GPPThechnical Report 36 Jun2012

[7] Technologies Keysight ldquoThe menu at the IoT cafe a guide toIoT wireless technologiesrdquo Application Note 2017

[8] L Farhan S T Shukur A E Alissa M Alrweg U Raza andR Kharel ldquoA survey on the challenges and opportunities of theInternet of Things (IoT)rdquo in Proceedings of the 2017 EleventhInternational Conference on Sensing Technology (ICST) pp 1ndash5December 2017

[9] S Tayade P Rost A Maeder and H D Schotten ldquoDevice-centric energy optimization for edge cloud offloadingrdquo inProceedings of the 2017 IEEEGlobal Communications Conference(GLOBECOM 2017) pp 1ndash7 Singapore December 2017



Page 7: Energy-Aware Smart Connectivity for IoT Networks: Enabling ...

Wireless Communications and Mobile Computing 7

Table 3: List of possible states of each device in Stage one and corresponding penalty calculation.

| State | Description | Penalty function ($P$) |
|---|---|---|
| $\sigma_1$ | None of the constraints are satisfied | $10^{4} + \sum_{p=1}^{3} \varphi_p + 10^{\varsigma\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_2$ | One constraint is satisfied | $5 \times 10^{3} + 0.8 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_3$ | Two constraints are satisfied | $2 \times 10^{3} + 0.6 \sum_{p=1}^{3} \varphi_p + 10^{\varsigma\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |
| $\sigma_4$ | Three constraints are satisfied | $0.8 \sum_{p=1}^{2} \varphi_p + \varphi_3 + 10^{\varsigma\nu} \cdot \beta_1 \cdot A_i^{(4)} \cdot \left[\left(1 - \frac{\varphi_4}{\chi' \cdot A_i^{(5)}}\right) + \frac{\varphi_4 \cdot \left(\left(A_1^{(4)} + A_4^{(4)}\right)/2\right)}{\chi' \cdot A_i^{(5)}}\right]$ |

Table 4: List of possible states of each device in Stage two and corresponding penalty calculation.

| State | Description | Penalty function |
|---|---|---|
| 1 | No availability in cloud or fog for $\chi'$ | $10^{3} + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 2 | Enough availability in cloud or fog but no budget for $\chi'$ | $10^{3} + \beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |
| 3 | Enough availability and budget for $\chi'$ | $\beta_2 \left(A_i^{(4)} \cdot (1 - i) + \chi' \cdot i \cdot A_i^{(5)}\right)$ |

4.2. Second Stage Learning. The second stage aims to find the best policy for task offloading by considering the budget and the availability of the fog or cloud. To this end, the second stage is activated only when the action taken in Stage 1 does not result in local processing (i.e., $A_1$ and $A_4$). In Stage 2, $Q$-learning is also employed, with 21 possible actions, [0, 0.05, ..., 1], and the constraints are the available budget $b$ and the availability of the fog and/or cloud. The resulting states and penalty functions for this stage are listed in Table 4.
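The Stage 2 action set and the state and penalty logic of Table 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `a4` and `a5` simply mirror the symbols $A_i^{(4)}$ and $A_i^{(5)}$, and reading the variable in $(1 - i)$ as the offloaded share selected by the Stage 2 action is an assumption.

```python
# Stage 2 actions: 21 candidate shares of the data, 0, 0.05, ..., 1.
SHARES = [k / 20 for k in range(21)]


def stage2_state(has_availability: bool, has_budget: bool) -> int:
    """Map the two Stage 2 constraints to the state index used in Table 4."""
    if not has_availability:
        return 1  # no availability in cloud or fog
    if not has_budget:
        return 2  # availability, but no budget
    return 3      # availability and budget


def stage2_penalty(state: int, share: float, a4: float, a5: float,
                   chi_prime: float = 1.0, beta2: float = 1.0) -> float:
    """Penalty of Table 4: constant term plus a weighted variable term.

    Assumption: `share` plays the role of the variable in (1 - i).
    The default values of chi_prime and beta2 are placeholders; the paper
    obtains its coefficients empirically.
    """
    constant = 0.0 if state == 3 else 1e3
    return constant + beta2 * (a4 * (1.0 - share) + chi_prime * share * a5)
```

The constant term only separates states 1 and 2 from state 3, so within a given state the preferred share is decided entirely by the variable term.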

4.2.1. Penalty Function Determination. The penalty function of this stage is determined with a procedure similar to that of the first stage; hence, there are three cost elements: a constant term, energy consumption, and monetary cost. As in the first stage, the constant value ensures that the device ends up in the highest possible state. Including the energy consumption and monetary cost elements simultaneously allows the best trade-off between the two to be found. However, unlike the first stage, these elements are calculated for the piece of data that is planned to be transferred, as specifying the best amount is the objective of this stage of learning. Similarly, the coefficients are obtained empirically.

The first and second stages of the learning process, including the hand-off from Stage 1 to Stage 2, are depicted in Algorithms 1 and 2, respectively.

5. Results and Analysis

In this section, we implement the proposed reinforcement learning approach in a simulation environment, as shown in Figure 4, using the parameter values defined in Table 1. We consider that half of the IoT devices connect with NB-IoT in view of the data privacy and related security requirements; these represent Group A. The remaining devices connect to the eNB through the WiFi gateway, hence over two wireless hops, and represent Group B. Consequently, there are six possible fixed scenarios that may be formed by selecting the processing location of each group of devices; these are listed in Table 6. A total of 100 iterations is conducted, and in each iteration random battery levels are allocated to each of the devices.

We compare the results obtained with our method to the six listed scenarios in terms of five different parameters: energy, cost, dissatisfaction, number of out of budget devices, and joint penalty. First, energy represents the end-to-end energy consumption caused by both connection and data processing. Second, cost is the overall monetary cost incurred by the use of data processing locations such as fog and cloud. Third, dissatisfaction is a measure of the total number of device requirements that are not satisfied. Fourth, number of out of budget devices reflects the count of devices that exceed their available monetary budgets while performing their tasks. Finally, the joint penalty indicates the cumulative combination of the previous four parameters (energy, cost, dissatisfaction, and number of out of budget devices).

The results in terms of gain (positive values) and loss (negative values) are shown in Figure 5. Note that the values for the parameters energy, cost, dissatisfaction, and joint penalty are obtained as follows:

$$g(x) = \frac{p_s - p_q}{p_q} \times 100, \tag{7}$$

where $p_s$ and $p_q$ are the values from Table 5 for Scenarios A-F and $Q$-learning, respectively.
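As a worked instance of (7), the short sketch below evaluates the gain for the joint cost and dissatisfaction averages of Table 5. The function and variable names are illustrative only.

```python
def gain(p_s: float, p_q: float) -> float:
    """Percentage gain of Q-learning (p_q) over a fixed scenario (p_s), as in (7)."""
    return (p_s - p_q) / p_q * 100.0


# Table 5 averages ("Q" is Q-learning, "A".."F" are the fixed scenarios).
JOINT = {"Q": 0.7822, "A": 1.5323, "B": 2.0679, "C": 1.2399,
         "D": 1.7756, "E": 2.4643, "F": 3.0000}
DISSATISFACTION = {"Q": 1.8, "A": 5.1, "B": 5.29, "C": 5.81,
                   "D": 6.0, "E": 7.81, "F": 8.0}


def gains_over_scenarios(metric: dict) -> dict:
    """Gain of Q-learning over each of Scenarios A-F for one metric."""
    p_q = metric["Q"]
    return {name: gain(p_s, p_q) for name, p_s in metric.items() if name != "Q"}


joint_gain = gains_over_scenarios(JOINT)
dissatisfaction_gain = gains_over_scenarios(DISSATISFACTION)
```

With these inputs, the joint gain over Scenario F comes out at roughly 283.5% and the dissatisfaction gains span roughly 183.3% to 344.4%, in line with the figures quoted in the text; the small residual differences stem from the rounding of the tabulated averages.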

On the other hand, the gain/loss values for the number of out of budget devices in Figure 5 are calculated using the function

$$o(x) = \frac{\text{Out of Budget Devices}}{N_G} \times 100. \tag{8}$$

It is worth noting that the results provided in Figure 5 are evaluated using the average values given in Table 5, which are reported along with 95% confidence intervals. Moreover, the joint cost parameter in Table 5 is calculated by summing the other four parameters (energy consumption, cost, dissatisfaction, and number of out of budget devices). Before the summation, however, these four parameters are feature scaled into the range [0, 1] using the function in (5) in order to keep their impacts on the same scale.

Data: Context-aware constraints, available computational capacity in gateway and eNB, budget
Result: Combination of connectivity route and processing venue
initialization
for all IoT devices do
    Determine the current state using Table 3
    Evaluate all the actions
    Calculate the penalty using Table 3
    Select the best action
    Jump to the next state
    Update the $Q$-table
    if the selected action includes fog (gateway) or cloud (eNB) processing then
        go to Algorithm 2
    end
end

Algorithm 1: First stage learning.

Data: Action selected by the first stage, available computational capacity in gateway and eNB, budget
Result: Share of data to be offloaded
initialization
for all IoT devices do
    Determine the current state using Table 4
    Evaluate all the actions
    Calculate the penalty using Table 4
    Select the best action
    Jump to the next state
    Update the $Q$-table
end

Algorithm 2: Second stage learning.

Figure 4: Sample snapshot of the simulation environment. IoT devices are located randomly, while positions of the gateways are fixed. (Both axes span -200 to 200; legend: LTE eNB, WiFi GW 1 to WiFi GW 5, IoT devices.)
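The two-stage procedure of Algorithms 1 and 2 can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' code: the number of Stage 1 actions, the indices standing in for the local-processing actions $A_1$ and $A_4$, the epsilon-greedy exploration, the learning rate, the discount factor, and the toy penalty functions are all hypothetical, and the update rule minimises penalties instead of maximising rewards.

```python
import random

# 4 Stage-1 states (Table 3) and 3 Stage-2 states (Table 4) follow the paper.
# The 6 Stage-1 actions and the local-processing indices are assumptions.
N_STATES_1, N_ACTIONS_1 = 4, 6
N_STATES_2 = 3
LOCAL_ACTIONS = {0, 3}                 # stand-ins for A1 and A4
SHARES = [k / 20 for k in range(21)]   # Stage-2 actions: 0, 0.05, ..., 1


def new_q_table(n_states, n_actions):
    """Q-table of expected penalties, initialised to zero."""
    return [[0.0] * n_actions for _ in range(n_states)]


def greedy(q_row):
    """Index of the action with the lowest learned penalty."""
    return min(range(len(q_row)), key=q_row.__getitem__)


def select_action(q_row, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return greedy(q_row)


def q_update(q, state, action, penalty, next_state, alpha=0.1, gamma=0.9):
    """Penalty-minimising Q-learning update (min replaces the usual max)."""
    target = penalty + gamma * min(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def learn_device(q1, q2, env1, env2, steps, rng, epsilon=0.1):
    """Run both stages for one device.

    env1(state, action) and env2(state, share) are caller-supplied callables
    returning (penalty, next_state); they stand in for Tables 3 and 4.
    """
    s1, s2 = 0, 0
    for _ in range(steps):
        a1 = select_action(q1[s1], epsilon, rng)       # Stage 1: route and venue
        penalty1, s1_next = env1(s1, a1)
        q_update(q1, s1, a1, penalty1, s1_next)
        s1 = s1_next
        if a1 not in LOCAL_ACTIONS:                    # fog or cloud was chosen
            a2 = select_action(q2[s2], epsilon, rng)   # Stage 2: share to offload
            penalty2, s2_next = env2(s2, SHARES[a2])
            q_update(q2, s2, a2, penalty2, s2_next)
            s2 = s2_next


# Toy environments with made-up penalties, used only to exercise the loop.
def toy_env1(state, action):
    return [5.0, 4.0, 1.0, 6.0, 3.0, 2.0][action], 3


def toy_env2(state, share):
    return 1.0 + 20.0 * abs(share - 0.5), 2


q1 = new_q_table(N_STATES_1, N_ACTIONS_1)
q2 = new_q_table(N_STATES_2, len(SHARES))
learn_device(q1, q2, toy_env1, toy_env2, steps=20000, rng=random.Random(0))
best_action = greedy(q1[3])
best_share = SHARES[greedy(q2[2])]
```

Keeping one Q-table per stage mirrors the split in the algorithms: the first table only decides the connection and processing venue, and the second is consulted solely when that decision involves the fog or the cloud.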

Our method outperforms any fixed combination when examining the joint or holistic gain, with values ranging from 58.52% (Scenario C) to 283.54% (Scenario F). Similarly, the reinforcement learning technique results in better matching between the context-aware constraints and the availability of the IoT network compared to any other scenario, with gains varying from 183.33% to 344.44%. Although the processing cost of our proposed method is higher than that of Scenario A, the resulting gain in energy saving is even more important, as is the context-aware constraint compliance. The closest contender to reinforcement learning with respect to the generated results is Scenario C, in which the processing of Group A IoT devices is conducted locally while that of Group B occurs in the gateway. Nonetheless, the reinforcement learning allows for a device-driven context-aware connectivity that improves the compliance criteria by more than two times while saving 43.22% of energy, resulting in a holistic gain of 58.52%. Scenario D manages to reduce the energy consumption more than our proposed approach at the same total cost; however, an average of 3.03 devices are out of budget (Table 5), resulting in incomplete or interrupted computational tasks.


Table 5: Results on various metrics for $Q$-learning and the scenarios.

| | Energy Consumption (mJ) | Cost | Dissatisfaction | Out of Budget Devices | Joint Cost |
|---|---|---|---|---|---|
| Q-Learning | 5.69 ± 0.322 | 96.77 ± 4.01 | 1.8 ± 0.291 | 0 ± 0 | 0.7822 |
| Scenario A | 14.88 ± 0.385 | 0.24 ± 6.15e-3 | 5.1 ± 0.28 | 0 ± 0 | 1.5323 |
| Scenario B | 7.55 ± 0.24 | 118.57 ± 4.49 | 5.29 ± 0.181 | 3.03 ± 0.217 | 2.0679 |
| Scenario C | 8.16 ± 0.284 | 12.07 ± 0.383 | 5.81 ± 0.208 | 0 ± 0 | 1.2399 |
| Scenario D | 0.83 ± 0.025 | 130.41 ± 4.54 | 6 ± 0 | 3.03 ± 0.217 | 1.7756 |
| Scenario E | 7.48 ± 0.281 | 119.68 ± 3.83 | 7.81 ± 0.208 | 2.97 ± 0.213 | 2.4643 |
| Scenario F | 0.15 ± 4.59e-3 | 238.02 ± 6.16 | 8 ± 0 | 6 ± 0.339 | 3.0000 |
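The Joint Cost column of Table 5 can be approximately reproduced from the other four columns. The sketch below assumes that the feature scaling in (5) is ordinary min-max scaling, $x' = (x - \min)/(\max - \min)$, taken over $Q$-learning and Scenarios A-F; this is an assumption, adopted here because it matches the tabulated joint costs to within rounding. All names are illustrative.

```python
# Averages from Table 5: (energy in mJ, cost, dissatisfaction, out-of-budget devices).
TABLE5 = {
    "Q-Learning": (5.69, 96.77, 1.80, 0.00),
    "Scenario A": (14.88, 0.24, 5.10, 0.00),
    "Scenario B": (7.55, 118.57, 5.29, 3.03),
    "Scenario C": (8.16, 12.07, 5.81, 0.00),
    "Scenario D": (0.83, 130.41, 6.00, 3.03),
    "Scenario E": (7.48, 119.68, 7.81, 2.97),
    "Scenario F": (0.15, 238.02, 8.00, 6.00),
}


def min_max(values):
    """Scale a sequence of numbers into [0, 1] (assumed form of the scaling in (5))."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def joint_cost(table):
    """Sum of the four min-max scaled metrics for every row of the table."""
    names = list(table)
    columns = list(zip(*table.values()))          # one tuple per metric
    scaled = [min_max(column) for column in columns]
    return {name: sum(column[i] for column in scaled)
            for i, name in enumerate(names)}


joint = joint_cost(TABLE5)
```

Under this assumption Scenario F scores exactly 3, since it has the highest cost, dissatisfaction, and out-of-budget count but the lowest energy, and the remaining rows agree with the table to within about 0.001.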

Table 6: List of fixed scenarios with connection types and locations of data processing.

| Scenario | Group A | Group B |
|---|---|---|
| A | Device | Device |
| B | Cloud | Device |
| C | Device | Fog |
| D | Cloud | Fog |
| E | Device | Cloud |
| F | Cloud | Cloud |

Figure 5: Summary of results for $\varsigma = 0.1$. Positive and negative values reflect gain and loss, respectively. A gain occurs when $Q$-learning is better than the scenario, and a loss occurs when the scenario is better than $Q$-learning. (Bar chart, "Gain of Q-learning over the scenarios": gain (%) for Scenarios A to F; series: total energy (mJ), total cost, total dissatisfaction, out of budget devices, joint penalty.)

Moreover, in this scenario, connected devices are more than two times more likely to be dissatisfied with one or more of the context-aware requirements.

Next, we examine the impact of the battery priority factor $\varsigma$ on the energy efficiency. As shown in Figure 6, low values of $\varsigma$ result in almost neglecting the battery life of the device in the optimisation process until it drops below 10%. Very high values of $\varsigma$ prioritise the reduction of energy consumption for all devices except those that have more than 70% battery life. It is therefore possible to tune this parameter depending on the scenario at hand and in a device-specific manner. For instance, some devices may be part of a moving vehicle with the possibility of agile and low-cost battery replenishment. Such devices may benefit from low settings of $\varsigma$ to allow more flexibility in meeting the remaining constraints. Other devices may be in hard-to-reach places, where replacing a dead battery would require a skilled workforce and special equipment and would hence be costly. In this case, higher settings of $\varsigma$ are more suitable and would result in a better cost-to-quality ratio.

Figure 6: Impact of energy prioritisation factor $\varsigma$. (Plot, "Impact of the Energy Prioritization Factor ($\varsigma$)": energy consumption (J, $\times 10^{-3}$) versus battery level (%), for $\varsigma$ = 0.1, 0.3, 0.5, 0.9, and 1.2.)

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not possible in fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence, which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, the $Q$-learning-based method selects the best action (connection type/processing location pair in the first stage and amount of data to be transmitted in the second stage) after convergence. We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, to the best of our knowledge, this work is the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power, as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement obtained by employing the novel scheme, and highlight its effects on the application.

6. Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund, the DARE Project, EP/P028764/1. The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., “Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap,” IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, “The internet of things: a survey,” Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, “Measurement-based signaling management strategies for cellular IoT,” IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, “Internet of things for smart ports: technologies and challenges,” IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, “3GPP low power wide area technologies,” GSMA White Paper, Oct. 2016.

[6] 3GPP, “Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements,” 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, “The menu at the IoT cafe: a guide to IoT wireless technologies,” Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, “A survey on the challenges and opportunities of the Internet of Things (IoT),” in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, “Device-centric energy optimization for edge cloud offloading,” in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, “Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing,” in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, “NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications,” in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, “Smart connectivity for internet of things (IoT) applications,” in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, “M2M data aggregation over cellular networks: signaling-delay trade-offs,” in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, “Joint service caching and task offloading for mobile edge computing in dense networks,” ArXiv e-prints, 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, “Efficient C-RAN random access for IoT devices: learning links via recommendation systems,” ArXiv e-prints, 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: deep learning for the internet of things with edge computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, “Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data,” in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, “The edge cloud: a holistic view of communication, computation and caching,” ArXiv e-prints, 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, “Edge-caching wireless networks: performance analysis and optimization,” IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, “Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz,” International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.

[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: a survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, “Predictive representations can link model-based reinforcement learning to model-free mechanisms,” PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, “Convergence of optimistic and incremental q-learning,” in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, “A distributed SON-based user-centric backhaul provisioning scheme,” IEEE Access, vol. 4, pp. 2314–2330, 2016.



[22] E M Russek I Momennejad MM Botvinick S J Gershmanand N D Daw ldquoPredictive representations can link model-based reinforcement learning tomodel-freemechanismsrdquo PLoSComputational Biology vol 13 no 9 Article ID e1005768 2017

[23] E Even-Dar and Y Mansour ldquoConvergence of optimistic andincremental q-learningrdquo in Advances in Neural InformationProcessing Systems pp 1499ndash1506 2002

[24] M Jaber M A Imran R Tafazolli and A TukmanovldquoA distributed SON-based user-centric backhaul provisioningschemerdquo IEEE Access vol 4 pp 2314ndash2330 2016

Wireless Communications and Mobile Computing 9

Table 5: Results on various metrics for Q-learning and the scenarios.

Method       Energy Consumption (mJ)   Cost             Dissatisfaction   Out of Budget Devices   Joint Cost
Q-Learning   5.69 ± 0.322              96.77 ± 4.01     1.8 ± 0.291       0 ± 0                   0.7822
Scenario A   14.88 ± 0.385             0.24 ± 6.15e-3   5.1 ± 0.28        0 ± 0                   1.5323
Scenario B   7.55 ± 0.24               118.57 ± 4.49    5.29 ± 0.181      3.03 ± 0.217            2.0679
Scenario C   8.16 ± 0.284              12.07 ± 0.383    5.81 ± 0.208      0 ± 0                   1.2399
Scenario D   0.83 ± 0.025              130.41 ± 4.54    6 ± 0             3.03 ± 0.217            1.7756
Scenario E   7.48 ± 0.281              119.68 ± 3.83    7.81 ± 0.208      2.97 ± 0.213            2.4643
Scenario F   0.15 ± 4.59e-3            238.02 ± 6.16    8 ± 0             6 ± 0.339               3.0000

Table 6: List of fixed scenarios with connection types and locations of data processing.

Scenario   Group A   Group B
A          Device    Device
B          Cloud     Device
C          Device    Fog
D          Cloud     Fog
E          Device    Cloud
F          Cloud     Cloud

[Figure 5 plot: gain (%) of Q-learning over Scenarios A to F, shown for total energy (mJ), total cost, total dissatisfaction, out-of-budget devices, and joint penalty.]

Figure 5: Summary of results for ς = 0.1. Positive and negative values reflect gain and loss, respectively. A gain occurs when Q-learning is better than the scenario, and a loss occurs when the scenario is better than Q-learning.

Moreover, in this scenario, connected devices are more than twice as likely to be dissatisfied with one or more of the context-aware requirements.
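The gains summarised in Figure 5 can be recomputed from the mean values in Table 5. The sketch below does this for the joint cost column. It assumes that the gain is the relative difference with respect to the Q-learning value; this definition is an assumption, chosen because it approximately reproduces the maximum holistic gain quoted in the Conclusion. The function and variable names are illustrative only.

```python
# Mean joint cost per method, read from the "Joint Cost" column of Table 5.
JOINT_COST = {
    "Q-Learning": 0.7822,
    "Scenario A": 1.5323,
    "Scenario B": 2.0679,
    "Scenario C": 1.2399,
    "Scenario D": 1.7756,
    "Scenario E": 2.4643,
    "Scenario F": 3.0000,
}


def gain_percent(baseline: float, proposed: float) -> float:
    """Relative gain of the proposed method over a baseline, in percent.

    Assumed definition (not stated explicitly in the text):
    (baseline - proposed) / proposed * 100. For cost-type metrics a
    positive value means the proposed method is better (lower cost).
    """
    return (baseline - proposed) / proposed * 100.0


def gains_over_scenarios(costs, proposed="Q-Learning"):
    """Gain of the proposed method over every other entry in `costs`."""
    reference = costs[proposed]
    return {
        name: gain_percent(value, reference)
        for name, value in costs.items()
        if name != proposed
    }


if __name__ == "__main__":
    for name, gain in gains_over_scenarios(JOINT_COST).items():
        print(f"{name}: {gain:+.2f}%")
```

Under this assumed definition, every fixed scenario has a higher joint cost than Q-learning, and the largest gain, roughly 283.5%, is obtained against Scenario F. The small difference from the quoted figure is within the rounding of the tabulated means.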

[Figure 6 plot: energy consumption (×10⁻³ J) versus battery level (%), with curves for ς = 0.1, 0.3, 0.5, 0.9, and 1.2.]

Figure 6: Impact of energy prioritisation factor ς.

Next, we examine the impact of the battery priority factor ς on the energy efficiency. As shown in Figure 6, low values of ς result in almost neglecting the battery life of the device in the optimisation process until it drops below 10%. Very high values of ς prioritise the reduction of energy consumption for all devices except those that have more than 70% battery life. It is therefore possible to tune this parameter depending on the scenario at hand and in a device-specific manner. For instance, some devices may be part of a moving vehicle with the possibility of agile and low-cost battery replenishment. Such devices may benefit from low settings of ς to allow more flexibility in meeting the remaining constraints. Other devices may be in hard-to-reach places and would require a skilled workforce and special equipment, and hence a high cost, to replace the dead battery. In this case, higher settings of ς are more suitable and would result in a better cost-to-quality ratio.
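The device-specific tuning described above could be expressed as a simple per-device configuration rule. The following is a minimal sketch, assuming a single hypothetical device trait (whether the battery is easy to replace) and using the lowest and highest settings shown in Figure 6 as the two candidate values. The class, field, and function names are illustrative and are not part of the proposed scheme.

```python
from dataclasses import dataclass

# Lowest and highest settings of the energy prioritisation factor explored in
# Figure 6. Mapping device traits onto these two values is an assumption made
# purely for illustration.
LOW_PRIORITY = 0.1
HIGH_PRIORITY = 1.2


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical description of a deployed IoT device."""
    name: str
    easy_battery_swap: bool  # e.g. mounted on a vehicle that is serviced often


def energy_priority(profile: DeviceProfile) -> float:
    """Pick a per-device energy prioritisation factor.

    Devices with cheap battery replenishment get a low factor, leaving more
    freedom to meet the remaining constraints; hard-to-reach devices get a
    high factor so that energy saving is prioritised.
    """
    return LOW_PRIORITY if profile.easy_battery_swap else HIGH_PRIORITY


if __name__ == "__main__":
    fleet = [
        DeviceProfile("vehicle-tracker", easy_battery_swap=True),
        DeviceProfile("crane-top-sensor", easy_battery_swap=False),
    ]
    for device in fleet:
        print(device.name, energy_priority(device))
```

In practice the factor could take intermediate values and depend on more than one trait; the two-level rule is only meant to show where a device-centric setting would enter the configuration.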

The simulation results achieved in this work are very promising, as they indicate a large margin for improvement that is not possible in fixed connection schemes. The proposed reinforcement learning method relies on centralised intelligence, which has access to all the constraints and requirements of all devices, gateways, and connections. Hence, after convergence, the Q-learning-based method selects the best action (a connection type/processing location pair in the first stage and the amount of data to be transmitted in the second stage). We appreciate that such a deployment is not realistic and propose to explore the feasibility and corresponding gains of multiagent and distributed reinforcement learning, as adopted in [24], in our future work. Nonetheless, this work is undoubtedly the first to highlight the importance of context-aware connectivity in the IoT context that jointly addresses security, energy, and computational power, as well as cost. We present a new application, Smart Ports, quantify the potential margin for improvement by employing the novel scheme, and highlight its effects on the application.
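As a rough illustration of the first-stage decision described above, the following sketch runs tabular Q-learning over (connection type, processing location) pairs. It is not the simulator used in this work: the two battery states, the penalty tables, the random state transitions, and all hyperparameters are hypothetical placeholders, and the second stage (the amount of data to transmit) is omitted.

```python
import random
from itertools import product

# Hypothetical action space: every (connection type, processing location) pair.
CONNECTIONS = ("short_range", "long_range")
LOCATIONS = ("device", "fog", "cloud")
ACTIONS = list(product(CONNECTIONS, LOCATIONS))

# Hypothetical coarse context seen by the centralised agent.
STATES = ("battery_low", "battery_ok")

# Toy penalty components (placeholders, NOT the cost model of this work).
ENERGY = {"device": 1.0, "fog": 0.4, "cloud": 0.2}
DELAY = {"device": 0.2, "fog": 0.6, "cloud": 1.4}
LINK_COST = {"short_range": 0.0, "long_range": 0.5}
ENERGY_WEIGHT = {"battery_low": 1.5, "battery_ok": 0.2}


def toy_penalty(state, action, rng):
    """Noisy joint penalty of taking `action` in `state` (lower is better)."""
    connection, location = action
    mean = (ENERGY_WEIGHT[state] * ENERGY[location]
            + DELAY[location]
            + LINK_COST[connection])
    return mean + rng.gauss(0.0, 0.02)


def train(steps=30000, alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = -toy_penalty(state, action, rng)
        # Toy dynamics: the next context is drawn independently of the action.
        next_state = rng.choice(STATES)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update.
        q[(state, action)] += alpha * (
            reward + gamma * best_next - q[(state, action)]
        )
        state = next_state
    return q


def greedy_policy(q):
    """Best learned action for each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    for state, action in greedy_policy(train()).items():
        print(state, "->", action)
```

With these toy penalties, the learned greedy policy is expected to favour short-range connectivity with fog processing when the battery is low, and short-range connectivity with on-device processing otherwise. Changing the penalty tables changes the outcome, which is the point of a context-aware scheme.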

6. Conclusion

In this work, we have presented a novel approach for energy-aware and context-aware IoT connectivity that jointly optimises the energy, security, computational power, and response time of the connection. The proposed scheme employs reinforcement learning and manages to achieve a holistic gain of up to 283.54% compared to deterministic routes. Although some deterministic scenarios may result in lower computational cost or lower energy consumption, none is able to meet the holistic context-aware performance target. In addition, we presented an analysis of the impact of the energy prioritisation factor, in which we demonstrated the importance of tuning this parameter in a device-centric manner in order to achieve better optimisation of the whole system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was partly funded by the EPSRC Global Challenges Research Fund (the DARE Project, EP/P028764/1). The first author was supported by the Republic of Turkey Ministry of National Education (MoNE-1416/YLSY).

References

[1] S. Andreev, O. Galinina, A. Pyattaev et al., "Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap," IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, 2015.

[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: a survey," Computer Networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[3] N. Kouzayha, M. Jaber, and Z. Dawy, "Measurement-based signaling management strategies for cellular IoT," IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1434–1444, 2017.

[4] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu, and O. Postolache, "Internet of things for smart ports: technologies and challenges," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp. 34–43, 2018.

[5] GSMA, "3GPP low power wide area technologies," GSMA White paper, Oct. 2016.

[6] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA); LTE coverage enhancements," 3GPP Technical Report 36, Jun. 2012.

[7] Keysight Technologies, "The menu at the IoT cafe: a guide to IoT wireless technologies," Application Note, 2017.

[8] L. Farhan, S. T. Shukur, A. E. Alissa, M. Alrweg, U. Raza, and R. Kharel, "A survey on the challenges and opportunities of the Internet of Things (IoT)," in Proceedings of the 2017 Eleventh International Conference on Sensing Technology (ICST), pp. 1–5, December 2017.

[9] S. Tayade, P. Rost, A. Maeder, and H. D. Schotten, "Device-centric energy optimization for edge cloud offloading," in Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), pp. 1–7, Singapore, December 2017.

[10] F. Renna, J. Doyle, V. Giotsas, and Y. Andreopoulos, "Query processing for the internet-of-things: coupling of device energy consumption and cloud infrastructure billing," in Proceedings of the 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 83–94, Berlin, Germany, April 2016.

[11] S. Persia, C. Carciofi, and M. Faccioli, "NB-IoT and LoRA connectivity analysis for M2M/IoT smart grids applications," in Proceedings of the 2017 AEIT International Annual Conference, pp. 1–6, Cagliari, September 2017.

[12] A. Mihovska and M. Sarkar, "Smart connectivity for internet of things (IoT) applications," in New Advances in the Internet of Things, vol. 715 of Studies in Computational Intelligence, pp. 105–118, Springer International Publishing, Cham, 2018.

[13] N. Kouzayha, M. Jaber, and Z. Dawy, "M2M data aggregation over cellular networks: signaling-delay trade-offs," in Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), pp. 1155–1160, December 2014.

[14] J. Xu, L. Chen, and P. Zhou, "Joint service caching and task offloading for mobile edge computing in dense networks," ArXiv e-prints 1801.05868, Jan. 2018.

[15] O. Y. Bursalioglu, Z. Li, C. Wang, and H. Papadopoulos, "Efficient C-RAN random access for IoT devices: learning links via recommendation systems," ArXiv e-prints 1801.04001, Jan. 2018.

[16] H. Li, K. Ota, and M. Dong, "Learning IoT in edge: deep learning for the internet of things with edge computing," IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.

[17] E. Oyekanlu, "Predictive edge computing for time series of industrial IoT and large scale critical infrastructure based on open-source software analytic of big data," in Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), pp. 1663–1669, Boston, MA, USA, December 2017.

[18] S. Barbarossa, S. Sardellitti, E. Ceci, and M. Merluzzi, "The edge cloud: a holistic view of communication, computation and caching," ArXiv e-prints 1802.00700, Feb. 2018.

[19] T. X. Vu, S. Chatzinotas, and B. Ottersten, "Edge-caching wireless networks: performance analysis and optimization," IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2827–2839, 2018.

[20] ITU-R, "Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz," International Telecommunication Union, Radiocommunication Sector, Geneva, 2017, Recommendation ITU-R P.1411-9.

[21] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[22] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLoS Computational Biology, vol. 13, no. 9, Article ID e1005768, 2017.

[23] E. Even-Dar and Y. Mansour, "Convergence of optimistic and incremental Q-learning," in Advances in Neural Information Processing Systems, pp. 1499–1506, 2002.

[24] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, "A distributed SON-based user-centric backhaul provisioning scheme," IEEE Access, vol. 4, pp. 2314–2330, 2016.
