Machine learning techniques in wireless sensor network exploitation
LE BORGNE Yann-Aël, BONTEMPI Gianluca
http://www.ulb.ac.be/di/mlg
ULB Machine Learning Group
1050 Brussels – Belgium
Work supported by the COMP2SYS project, sponsored by the Human Resources and Mobility program of the European Community (MEST-CT-2004-505079)
Agenda
Technology, applications and challenges
Data modelling and predictions: Application to exhaustive monitoring applications
Experimental results
Wireless sensors
Wireless sensor modules, aka ‘motes’: microcontroller, memory, radio, sensors, battery
‘Self contained’
Current models
Tmote sky
TI MSP430 8MHz
48KB prog flash, 10KB RAM, 512KB data flash
250kbps radio
Light, temp, hum sensors
µchip (Particle computing)
PIC12F675 4MHz
1.4KB prog flash, 64B SRAM, 128B data flash
19.2kbps radio
Light, temp, movement
Eyes node (EU project)
TI MSP430F149 5MHz
60KB prog flash, 2KB RAM, 4KB data flash
115kbps radio
Mica2Dot (Crossbow)
Atmel AVR 8MHz
128KB prog flash, 4KB RAM, 512KB data flash
38.4kbps radio
Taste of the future
Smart dust project:
Golem Deputy Dust (Berkeley)
Features acceleration and light sensors
Only 5mm³
Speckled computing (EU consortium led by the University of Edinburgh)
Using sensors in spray?
Programming (Event based)
TinyOS: Initiated at Berkeley university
Component-based programming (object-like): interfaces and implementations
Event-based operating system. Events can be triggered by:
•System notifications (e.g. packet received, data ready)
•Pre-scheduled timers
Limitation: only one task at a time
Programming (Multi-threading)
•Mantis: University of Colorado
•C-Like programming
•Multi-threaded operating system
•Despite a small footprint (500B RAM, 14KB Flash), multi-threading adds an overhead in code execution and memory storage
Main advantages of wireless sensors
Ease of deployment: no wiring infrastructure
Cheap (Euros -> Cents?)
Provide access to new kinds of data: everyday environments, remote and hostile environments
Potential domains of applications
Wireless sensors could be used in a variety of domains:
Environmental monitoring, agriculture, civil engineering, industry, health care, defense…
Examples…
•Habitat monitoring - Great Duck Island - 2003
•Initial deployment: Intel Berkeley and College of Bar Harbor
•Goal: Study Leach’s Storm Petrel nesting habits
•Over 150 Mica nodes deployed for 4 months
•Monitoring volcanic eruption – Tungurahua Volcano - 2005
•Coordinator: Harvard University
•Goal: Real time survey of volcano activity
•16 Tmote deployed for 2 months
WSN: A gateway to the physical environment
•Ideally:
•Sensors continuously sample the environment
•Report their measurements to a central server
•Data is then monitored by an end user or processed by the computer to achieve the desired task
Main challenges
Energy: Whether relying on batteries or renewable energy, the energy budget must be used carefully. E.g. with Tmotes, lifetime with 2 AA batteries is a few days if operated continuously
Bandwidth: Only a limited amount of data can be transmitted by a sensor. This amount is reduced further when data routing is required to reach the central server
Energy consumption and duty cycle
Tmote Sky power consumption
Component        Power consumption
CPU              3mW
Radio receive    38mW
Radio transmit   35mW
Flash read       7mW
Flash write      27mW
Sleep            15µW
•With 2 AA batteries, operation time is about a few days if radio and CPU are on
•Possibility to switch components on and off (CPU, memory, radio)
•Use of duty cycle: Motes are in a sleeping state most of the time. They regularly wake up, take measurements, and forward or send packets
•Example: A duty cycle of 20% would increase mote lifetime by a factor of 5
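The factor-of-5 figure can be checked with a small calculation. The sketch below uses the CPU, radio-receive and sleep figures from the Tmote Sky table. The battery capacity (two 1.5 V cells of 2500 mAh) is an assumption, not a value from the slides, and the function names are illustrative.

```python
# Power figures (mW) taken from the Tmote Sky table above.
ACTIVE_MW = 3.0 + 38.0          # CPU on + radio listening
SLEEP_MW = 0.015                # 15 µW in sleep state

# Assumption (not from the slides): two 1.5 V AA cells of 2500 mAh each.
BATTERY_MWH = 2 * 1.5 * 2500


def average_power_mw(duty_cycle):
    """Average power draw when the mote is active a fraction duty_cycle of the time."""
    return duty_cycle * ACTIVE_MW + (1.0 - duty_cycle) * SLEEP_MW


def lifetime_days(duty_cycle):
    """Estimated lifetime in days under the assumed battery capacity."""
    return BATTERY_MWH / average_power_mw(duty_cycle) / 24.0


def lifetime_factor(duty_cycle):
    """Lifetime gain relative to a mote that never sleeps."""
    return average_power_mw(1.0) / average_power_mw(duty_cycle)


if __name__ == "__main__":
    print(f"always on : {lifetime_days(1.0):.1f} days")
    print(f"20% duty  : {lifetime_days(0.2):.1f} days "
          f"(factor {lifetime_factor(0.2):.2f})")
```

Under these assumptions the always-on lifetime comes out at about a week, and the gain at 20% duty cycle is slightly below 5 because the sleep state still draws a little power.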
Beyond duty cycle
Sensor activity and bandwidth can be further reduced by modelling data:
•Compression
•Use of prediction models
Efficiency of these techniques depends on the application
Two main types of data gathering schemes
Exhaustive monitoring: All data collected by sensors are required at the base station
•Example: Environmental monitoring, climatic studies, pilot studies
•Information can be compressed using spatial and temporal dependencies
High level information extraction: The whole network is used to achieve a ‘high level’ function
•Example: Chemical product or vehicle recognition, event detection, tracking…
•Information can be further processed within the network to extract only relevant features for the final task
Part 2
Data modelling and predictions:
Application to exhaustive monitoring applications
Exhaustive monitoring
For example (TinyDB syntax):
SELECT temp FROM {s1,s2,s3,s4,s5} SAMPLE PERIOD 1000
Collects temperature readings from a set of five sensors every 1000ms.
Observation database DT obtained over time:
t   s1     s2     s3     s4     s5
1   s1(1)  s2(1)  s3(1)  s4(1)  s5(1)
2   s1(2)  s2(2)  s3(2)  s4(2)  s5(2)
3   s1(3)  s2(3)  s3(3)  s4(3)  s5(3)
…   …      …      …      …      …
Using prediction models for reducing both mote activity and bandwidth
General idea:
•Find prediction models that can predict the measurements of a sensor given a subset of others
•Use these prediction models to reduce the number of sensors that need to report their measurements
Example:
If prediction models can be found for sensors in blue, only sensors in black need to be queried, allowing sensors in blue to remain in their sleeping state.
Extension to multiple subsets
Use of different subsets of queried sensors so that:
•Energy consumption is better distributed
•All sensors keep on being queried
If predicted subsets such as above can be found, they can be used alternately, and network lifetime would be extended by a factor of 2.
Predicted subset #1 (blue) Predicted subset #2 (blue)
Predicted/Queried subsets
The goal is therefore to find partitions of the set of sensors in different pairs of subsets <Sp,Sq>:
•Sq: Subset of sensors whose readings are queried
•Sp: Subset of sensors whose readings are predicted from Sq
•Sq ∪ Sp = S
Let S* be the set of pairs {<Sp,Sq>} such that:
•Each sensor sp in Sp is ε-predictable (more on next slides), i.e. is predictable within a user-defined error bound ε
•Sq is minimal: No sensor in Sq is ε-predictable from a subset of Sq
Predictability for a sensor sp given a subset of sensors sq
•Observation dataset DT: Set of T observations. DT is a T*S matrix
•Choice for a class of prediction models. Ex: Linear regression, neural nets, K-nearest neighbours,…
•Learning procedure ŝp(sq(t)). Ex: Least mean square, backpropagation, lazy learning
  ŝp: Model for predicted sensor
  Sq: Set of predictor sensors
•Error estimation Ĝp by cross validation. Ex: ε-approximation: P(|ŝp(t)-sp(t)|>ε)
  Mean square error: Et[(ŝp(t)-sp(t))²]< ε
sp is predictable from sq if Ĝp matches the user-defined error bound ε
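The chain above (dataset, model class, learning procedure, cross-validated error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the model class is linear regression fitted by least squares, the error estimate Ĝp is the cross-validated mean square error, and the data are synthetic. The names `cv_error`, `is_predictable` and `toy_data` are hypothetical.

```python
import numpy as np


def cv_error(D, p, q, n_folds=5):
    """Cross-validated mean square error when column p of the T*S matrix D
    is predicted from the columns listed in q by a least-squares linear model."""
    T = D.shape[0]
    X = np.column_stack([np.ones(T), D[:, q]])   # intercept + predictor sensors
    y = D[:, p]
    squared_errors = []
    for test_idx in np.array_split(np.arange(T), n_folds):
        train = np.ones(T, dtype=bool)
        train[test_idx] = False                  # hold this fold out
        coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        squared_errors.append((X[test_idx] @ coef - y[test_idx]) ** 2)
    return float(np.mean(np.concatenate(squared_errors)))


def is_predictable(D, p, q, eps):
    """sp is declared predictable from sq if the estimated error is below eps."""
    return cv_error(D, p, q) < eps


# Synthetic readings (illustrative only): sensor 1 follows sensor 0 closely,
# sensor 2 is unrelated to both.
rng = np.random.default_rng(0)
s0 = 22.0 + rng.normal(0.0, 0.5, 500)
s1 = s0 + 1.0 + rng.normal(0.0, 0.05, 500)
s2 = 22.0 + rng.normal(0.0, 0.5, 500)
toy_data = np.column_stack([s0, s1, s2])

if __name__ == "__main__":
    print("error for sensor 1 given sensor 0:", cv_error(toy_data, 1, [0]))
    print("error for sensor 2 given sensor 0:", cv_error(toy_data, 2, [0]))
```

Any other model class from the list (neural nets, K-nearest neighbours) could replace the least-squares fit without changing the cross-validation loop.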
Prediction model choice: Closer look at the data
Ant nest experiment. Goal: Control the degree of humidity of the air
Cluster room monitoring. Goal: Control the temperature in the cluster room
Humidity percentage of two nearby sensors
Temperature readings of sensor under airco system and sensor next to the door
Correlations
A powerful way to capture these correlations is to model the data with multi dimensional gaussians:
Data collected during the first 2 hours of the cluster room experiment
Spatial correlations are very strong
Predictions using multi dimensional gaussians (MDG)
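From a set of observed data DT, an MDG can capture all linear dependencies in the probabilistic model
Interestingly, an MDG conditioned on observed readings is also an MDG. This makes it possible to extract the expected value and variance for a sensor sp given a set of observed sensors {sq}:
μŝp|sq = μp + Σpq Σqq⁻¹ (sq − μq)
Σŝp|sq = Σpp − Σpq Σqq⁻¹ Σqp
where μ and Σ are the mean and covariance estimated from DT, split into the blocks for sp and {sq}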
• Using the conditioned variance, a sensor sp is ε-predictable if
1.96*sqrt(Σŝp|sq) < ε
(95% confidence level in error bound)
Example
Global statistics: μ(s)=(21.6,22.8), 1.96*σs1=1.07, 1.96*σs2=0.9
•Is sensor 2 predictable from sensor 1?
1.96*σŝ2|s1=0.23, so if ε>0.23, sensor 2 is predictable
•What is the prediction for sensor 2 when sensor 1 reads 23°C?
μŝ2|s1=23.8
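Conditioning a multi dimensional gaussian on observed sensors uses the standard formulas μp + Σpq Σqq⁻¹ (sq − μq) for the mean and Σpp − Σpq Σqq⁻¹ Σqp for the variance. The sketch below applies them with NumPy and reads the ε-predictability test as "the 95% half-width 1.96·σ of the conditioned distribution is below ε". The function names and the numbers in the usage part are illustrative assumptions, not values from the cluster room experiment.

```python
import numpy as np


def condition(mu, cov, p, q, sq_values):
    """Mean and variance of sensor p given the readings sq_values of sensors q."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    q = list(q)
    cov_qq = cov[np.ix_(q, q)]                 # covariance among observed sensors
    cov_pq = cov[p, q]                         # covariance between p and observed sensors
    weights = np.linalg.solve(cov_qq, cov_pq)  # Σqq⁻¹ Σqp
    mean = mu[p] + weights @ (np.asarray(sq_values, dtype=float) - mu[q])
    var = cov[p, p] - weights @ cov_pq
    return float(mean), float(var)


def is_eps_predictable(cov, p, q, eps):
    """True if the 95% half-width of the conditioned distribution is below eps.
    The conditional variance does not depend on the observed values."""
    n = len(cov)
    _, var = condition(np.zeros(n), cov, p, q, np.zeros(len(list(q))))
    return bool(1.96 * np.sqrt(var) < eps)


if __name__ == "__main__":
    # Made-up two-sensor model (index 0 and 1), for illustration only.
    mu = [20.0, 22.0]
    cov = [[1.0, 0.8],
           [0.8, 1.0]]
    print(condition(mu, cov, p=1, q=[0], sq_values=[21.0]))
    print(is_eps_predictable(cov, p=1, q=[0], eps=1.5))
```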
How to construct a pair <Sp,Sq>?
Ranking criterion Tj for ordering sensors sj according to their time to live
Remaining time to live Tj of each mote: Time by which a mote is expected to die
Tj=Qremaining/Qactivity
•Qactivity: Average energy spent during active state
•Qremaining: Energy remaining
Run a ‘backward search’:
•Two subsets {Sq}<-{S} and {Sp}<-{}
•Remove sensors sj (sorted by Tj) from {Sq} and add them to {Sp}, if sj is ε-predictable from {Sq}
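The backward search can be sketched as a greedy loop. Two points are assumptions of this sketch rather than statements from the slides: sensors are visited by increasing Tj, and a sensor is moved to Sp only if every sensor already in Sp stays predictable from the reduced Sq. The predictability test is passed in as a hypothetical callback `is_predictable(j, sq)`.

```python
def backward_search(ttl, is_predictable):
    """Greedy construction of a pair <Sp, Sq>.

    ttl: dict mapping each sensor to its remaining time to live Tj.
    is_predictable(j, sq): hypothetical oracle returning True if sensor j
    is eps-predictable from the set of sensors sq.
    """
    sq = set(ttl)          # {Sq} <- {S}
    sp = set()             # {Sp} <- {}
    for j in sorted(ttl, key=ttl.get):        # smallest Tj first
        remaining = sq - {j}
        # Assumption: sensors already predicted must remain predictable
        # once j no longer reports its readings.
        if remaining and all(is_predictable(s, remaining) for s in sp | {j}):
            sq = remaining
            sp.add(j)
    return sp, sq


if __name__ == "__main__":
    # Toy oracle: sensors come in correlated pairs (1,2) and (3,4); a sensor
    # is predictable whenever its partner is still queried.
    partner = {1: 2, 2: 1, 3: 4, 4: 3}
    oracle = lambda j, sq: partner[j] in sq
    print(backward_search({1: 1000, 2: 1000, 3: 1000, 4: 1000}, oracle))
```

With the toy oracle, one sensor of each correlated pair ends up in Sp and the other stays in Sq.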
Exploitation of the system
Let a cycle of length K be a set of K pairs of subsets ki=<Spi,Sqi> from S* (construction of the cycle on next slide)
This cycle can be represented by an activity matrix Mij:
       s1  s2  …  sS-1  sS
k1      0   1  …   1    0
k2      1   1  …   0    0
…       …   …  …   …    …
kK-1    0   0  …   1    1
kK      1   0  …   1    0
Each column of the activity matrix represents an activity schedule for a sensor (0 stands for sleeping and 1 for active)
Once the matrix is designed, column vectors (K bits) are sent to the corresponding motes, which apply them cyclically and only send readings when required
Choice for K: Discussion later…
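On the mote side, applying a schedule cyclically amounts to indexing the K-bit column with the step number modulo K. The sketch below shows this with hypothetical helper names. The matrix used is the final matrix of the K=3, 4-sensor example from these slides.

```python
def column_schedule(matrix, j):
    """K-bit activity schedule of sensor j (column j of the activity matrix)."""
    return [row[j] for row in matrix]


def is_active(schedule, step):
    """A mote walks through its schedule cyclically: steps 0..K-1, then 0 again."""
    return schedule[step % len(schedule)] == 1


if __name__ == "__main__":
    activity_matrix = [[0, 0, 1, 1],
                       [1, 0, 0, 1],
                       [0, 1, 1, 0]]
    first_sensor = column_schedule(activity_matrix, 0)
    print(first_sensor)
    print([is_active(first_sensor, step) for step in range(6)])
```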
Matrix construction
Start with a matrix filled with 1 (Sqi=S for all i)
For each step i:
•Sort sensors by Tj
•Backward elimination: Remove sensors, sorted by increasing Tj, from Sqi and add them to Spi if predictable
•Update Tj
For sensors not queried, add them to a step of the matrix
Example: K=3, 4 sensors (rows K1 to K3, columns S1 to S4)
Initial matrix: T1=T2=T3=T4=1000
1 1 1 1
1 1 1 1
1 1 1 1
After step 1: T1=T2=1000*3/2=1500, T3=T4=1000
0 0 1 1
1 1 1 1
1 1 1 1
After step 2: T1=T3=1000*3/2=1500, T2=1000*3/1=3000, T4=1000
0 0 1 1
1 0 0 1
1 1 1 1
After step 3: T1=T4=1000*3/2=1500, T2=T3=1000*3/1=3000
0 0 1 1
1 0 0 1
0 1 1 0
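The Tj updates in the example read as "initial Tj multiplied by K and divided by the number of steps in which the sensor is active". The sketch below implements that reading, which is an interpretation of the 1000*3/2 and 1000*3/1 values rather than a formula stated on the slides, and checks it against the matrices obtained after steps 1 and 2. The function name is hypothetical.

```python
def time_to_live(matrix, initial_ttl):
    """Updated time to live of each sensor for a K-step activity matrix.

    Assumed rule: a sensor active in a steps out of K lives K/a times longer
    than a sensor that is active all the time.
    """
    K = len(matrix)
    ttl = []
    for j, t0 in enumerate(initial_ttl):
        active_steps = sum(row[j] for row in matrix)
        ttl.append(t0 * K / active_steps if active_steps else float("inf"))
    return ttl


if __name__ == "__main__":
    after_step_1 = [[0, 0, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
    after_step_2 = [[0, 0, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
    print(time_to_live(after_step_1, [1000] * 4))
    print(time_to_live(after_step_2, [1000] * 4))
```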
K parameter
Tradeoff between:
•Energy distribution: As K grows, more chances are given to distribute energy
•Cost of sending activity schedule to motes
No theoretical evaluation but:
•Only K bits are needed to send a schedule to a mote
•This cost is negligible if the cycle is not changed too often
Synthetic view of the system
Collect data from all sensors for a period T
Compute the multi dimensional gaussian
Create activity matrix MKS row by row:
•Find a pair <Sp,Sq> according to user-defined ε and time to live Tj
•Update time to live Tj
Send mote schedules (MKS columns) to corresponding motes
Part 3
Case study: Monitoring temperature in the cluster room
Experimental results
Dataset: 7 temperature sensors in the cluster room, sampling temperature every 30 seconds for 2 days (5760 readings)
Example of collected data: Sensor 6, under HVAC; Sensor 2, close to the door
Activity matrix and lifetime extension factor
•Gaussian is computed over the first 2 hours
•Cycle length is K=7
•ε-threshold is varied from 0.1 to 1 degree
•Initial Tj are 1000
ε=1°C, lifetime *7:
0 0 0 0 0 0 1
0 0 0 0 0 1 0
0 0 0 0 1 0 0
0 0 0 1 0 0 0
0 0 1 0 0 0 0
0 1 0 0 0 0 0
1 0 0 0 0 0 0
ε=0.5°C, lifetime *3.5:
1 0 0 1 0 0 0
0 0 1 0 0 0 1
0 1 0 0 1 0 0
0 0 0 1 0 0 0
0 0 1 0 0 1 0
0 1 0 0 0 0 0
0 0 0 1 0 0 0
ε=0.1°C, lifetime *1:
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
Increasing K
ε=0.5°C
•As K grows, energy consumption is better distributed among motes
•Lifetime factor tends asymptotically to its optimum
•Given a maximum bound on K (user defined), the system returns the activity matrix MKS such that the lifetime factor is maximized, here K=3.
Error bound violations: First day
ε=0.5°C, K=3
Absolute error is the absolute difference between the model prediction and the actual readings
Sensor 1, P(|ŝ1-s1|>0.5)=0.007 Sensor 2, P(|ŝ2-s2|>0.5)=0.008
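The violation probabilities above are empirical frequencies: the share of time steps at which the absolute error exceeds ε. A minimal sketch of that computation follows. The function name and the four readings are illustrative, not experimental data.

```python
import numpy as np


def violation_rate(predicted, actual, eps):
    """Fraction of time steps where |prediction - reading| exceeds eps."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual) > eps))


if __name__ == "__main__":
    # Made-up predictions and readings, for illustration only.
    predicted = [22.0, 22.4, 23.0, 21.0]
    actual = [22.1, 23.0, 23.2, 21.9]
    print(violation_rate(predicted, actual, eps=0.5))
```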
Error bound violations: Second day
Sensor 1 Sensor 2
•Error grows abnormally large
•Change in the data distribution!
Closer look at data
Sensor 1: At the end of the second day, batteries are exhausted, and readings are erroneous
Sensor 2: Someone entered the room during the second day. Sharp departure from the model established during the first two hours
Events cause the model to be wrong
The system is based on predictions
When events change the original distribution used for identifying the model, the model becomes wrong
How to detect such events?
Use of conditional probabilities over time
From the activity matrix K=3
1 0 0 1 0 1 0
0 0 1 0 0 0 1
0 1 0 0 1 0 0
• It is possible to find prediction models for the distribution of a sensor at a given step of the cycle given sensors queried at previous steps
• This would allow the system to check, using temporal and spatial dependencies, whether changes in the distribution occur
•Compute the MDG for all entries equal to 1
•Extract conditional probabilities from the MDG
Ongoing work
Detecting change in the distribution
Online updating of the model: Once the model is defined, it is possible to update the covariance matrix with new values
Other models (Linear/Non linear) and ways to estimate the error (PRESS)
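For the online updating item, one possible way to refresh the mean and covariance as new readings arrive is a Welford-style running update, sketched below. This is a suggestion under stated assumptions, not the method used by the authors. The class name `RunningGaussian` is hypothetical, and the update assumes a complete reading vector from all sensors at each step.

```python
import numpy as np


class RunningGaussian:
    """Running mean and covariance of S-dimensional readings."""

    def __init__(self, n_sensors):
        self.n = 0
        self.mean = np.zeros(n_sensors)
        self.m2 = np.zeros((n_sensors, n_sensors))  # accumulated deviation products

    def update(self, reading):
        """Fold one new vector of readings into the statistics."""
        x = np.asarray(reading, dtype=float)
        self.n += 1
        delta = x - self.mean              # deviation from the old mean
        self.mean += delta / self.n
        self.m2 += np.outer(delta, x - self.mean)

    def covariance(self):
        """Unbiased sample covariance (requires at least two readings)."""
        return self.m2 / (self.n - 1)


if __name__ == "__main__":
    # Made-up readings from three sensors, for illustration only.
    model = RunningGaussian(3)
    for row in [[22.0, 23.1, 21.5], [22.4, 23.6, 21.4], [21.9, 23.0, 21.8]]:
        model.update(row)
    print(model.mean)
    print(model.covariance())
```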
Other ongoing work
Temporal prediction
•Sensors build models for temporal predictions, based on past values
•Central server applies exactly the same prediction methods
•Sensors only send readings when they differ from the model for a given accuracy specified by the end user
Limits
•It requires the motes to wake up and check whether the new reading matches the predicted one. The system is good at detecting changes, but has lower energy savings
Goal
•Fuse both strategies (cycle based on spatial predictions and temporal predictions)
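The temporal prediction scheme can be illustrated with the simplest possible shared model: mote and server both predict "same value as the last transmitted reading". The choice of this constant model is an assumption of the sketch, as are the function names and the readings. The slides do not specify which temporal model is used.

```python
def readings_to_send(readings, eps):
    """Indices of the readings a mote transmits: a reading is sent only when
    it differs from the shared prediction by more than eps."""
    sent = []
    last_sent = None
    for t, value in enumerate(readings):
        if last_sent is None or abs(value - last_sent) > eps:
            sent.append(t)
            last_sent = value
    return sent


def server_view(readings, sent):
    """Values seen by the server: the last received reading, held until
    the next transmission."""
    sent = set(sent)
    view, current = [], None
    for t, value in enumerate(readings):
        if t in sent:
            current = value
        view.append(current)
    return view


if __name__ == "__main__":
    readings = [22.0, 22.1, 22.3, 22.8, 22.9, 22.2]   # made-up temperatures
    sent = readings_to_send(readings, eps=0.5)
    print(sent)
    print(server_view(readings, sent))
```

The mote still has to wake up and sample at every step to run the comparison, which is the limit noted above.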
Other ongoing work
‘High level’ data extraction
•Applications where the network is used to classify events (chemical product recognition, vehicle classification, wave intensity prediction…)
•Output is a function of the whole set of data provided by the network
Identification of subsets of sensors that can achieve the task as well as the whole set of sensors
How to fuse data within the network to extract good predictors at the central server?
Thank you for your attention
References
Y. Le Borgne, G. Bontempi. « Round robin cycle for predictions in wireless sensor networks ». Proc. of 2nd international conference on Intelligent Sensors, Sensor Networks and Information Processing, 2005.
A. Deshpande, C. Guestrin, S. Madden, J. Hellerstein, W. Hong. « Model driven data acquisition in sensor networks ». Proc. of the 30th VLDB conference, 2004.
ULB MLG wireless sensor web site: http://www.ulb.ac.be/di/mlg/sensorNet/index.html
F. Zhao and L. Guibas. Wireless sensor networks: An information processing approach. 2005.