Mobile CrowdSensing and Context-aware Real-time Data Fusion in MCS Applications
© 2011 IBM Corporation
Mobile CrowdSensing and Context-aware Real-time Data Fusion in MCS Applications
Hui Lei, Fan Ye
IBM T. J. Watson Research
© 2010 IBM Corporation
Outline
Mobile crowdsensing and its applications
A general MCS platform and brief research agenda
Context-aware real-time data fusion in MCS applications
Relevance to Army scenarios
Summary
Contributions from Han Chen, Raghu Ganti as well
What is Mobile Crowdsensing?
• Mobile sensing devices are pervasively available and are a rich and inexpensive source of sensor data
• One data point: 59.6 million iPhone users
• Sensors embedded in the iPhone: GPS, accelerometer, gyroscope, ambient light, proximity, microphone, and camera sensors
• Mobile crowdsensing (MCS) refers to applications that leverage consumer mobile devices (GPS, smart phones, and car sensors) to collect and share data about the user or the physical world, either interactively or autonomously, towards a common goal
• Mobile crowdsensing enables a new category of applications, both participatory and opportunistic, including smarter city, smarter transportation, and smarter energy applications without requiring major investments in the sensing infrastructure
Mobile sensor data collection, analysis, and consumption
Example MCS application: Public safety
Sample phones to obtain human crowd density and movements in large areas to maintain public safety
– Selectively sample a set of devices for location, density, temperature, noise levels
– Aggregate to obtain a global picture of crowd density and movements
Handle emergencies such as disasters (e.g., hurricanes, high-rise building fires) and terror attacks
– Authorities (police, fire) can gain an overall picture of the situation at different places to prioritize and coordinate the response
– Individuals can report their locations (e.g., room/floor when trapped in a high-rise) and send video/picture/voice/text reports of the detailed situation
Public event order maintenance
– Stampedes cause numerous deaths/injuries
  • Cambodia bridge '10, 350 deaths; Chicago station nightclub '03, 100 deaths/230 injured; German Love Parade '10, 19/342; Mecca, Saudi Arabia, more than 2000 deaths since '90
  • Tens of incidents around the globe even in the 21st century, with thousands of deaths/injuries
– Authorities can estimate the number of people participating in a public event (e.g., protests, gatherings, open-air concerts) and ensure the orderly movement of people
Help tracking and modeling of disease dissemination (e.g., SARS)
– Based on the movement pattern and density of crowd flow, CDC may build better models to predict the scope and speed of disease spread and take appropriate control measures
More Example Applications for the Enterprise
Smarter Air Travelers
– Category: Airlines and Airports
– Sensors used: GPS, accelerometer, Bluetooth, NFC, people
– Events and conditions detected: passenger whereabouts and itinerary; estimated wait times at security checkpoints, customs, and terminal restaurants; estimated travel times between gates
– Benefits to organizations: fewer passengers who miss a flight; better handling of overbooked flights; targeted and context-sensitive promotions; improved handling of service lines; improved customer satisfaction and loyalty
– Benefits to individuals: better travel planning; notifications on alternative flights and easy rebooking; better navigation through the terminal with options for finding the least crowded restaurants and security checkpoints, and turn-by-turn walking directions; personalized recommendations on terminal restaurants and shops
Fleet Management
– Category: Logistics
– Sensors used: GPS, OBD (speed, acceleration), people
– Events and conditions detected: location of load, demand, and capacity; traffic conditions; driving patterns
– Benefits to organizations: enhanced scheduling and dispatching; improved asset utilization; improved safety records
– Benefits to individuals: service responsiveness; improved visibility
More Example Applications for the public or individuals
Highway Monitoring
– Category: Smarter City
– Sensors used: GPS, OBD (speed, acceleration, fuel consumption), camera
– Events and conditions detected: traffic hotspots, potholes, accidents, road work, road closures, dysfunctional traffic lights, "green-ness" of road segments
– Benefits to organizations: highway infrastructure maintenance and planning; carbon footprint reduction
– Benefits to individuals: better planning of routes and travel times; savings in fuel consumption
Public Safety
– Category: Smarter City
– Sensors used: GPS, microphone, temperature
– Events and conditions detected: public events and location, crowd density
– Benefits to organizations: more accurate event reporting; effective evacuation; stampede prevention; improved resource allocation
– Benefits to individuals: personal safety and comfort; awareness of event hotspots
"Sense Me"
– Category: Social Networking
– Sensors used: GPS, accelerometer, microphone
– Events and conditions detected: user activities, social context, significant places, behavior
– Benefits to organizations: targeted marketing enabled by recognition of user behavior patterns and preferences
– Benefits to individuals: autonomous sharing within social circles; enhanced social interactions
Biker Net
– Category: Social Networking
– Sensors used: GPS, accelerometer, microphone, camera
– Events and conditions detected: conditions of bike paths (slopes, air quality, noise level, cars passing by); health and performance of bikers
– Benefits to organizations: monitoring and improvement of bike path conditions
– Benefits to individuals: assessment of biking performance; improved biking experience; formation of new social networks; peer comparison
Mobile Crowdsensing Presents Unique Challenges (compared to conventional sensor-based applications)
• The population of mobile sensing devices is highly dynamic. There may be excess or gaps in sensing capabilities at times.
• Depending on resource availability on the device, the sensing function is not always available for external use.
• Crowdsensing data may contribute to many diverse use cases, while a conventional sensor network typically supports a single use case
• Human participants are an important part of crowdsensing. A social architecture with incentive mechanisms is required to recruit, engage, and retain the human participants.
• The privacy of the human participants must be preserved.
Current State of the Art: Application Silos
Current state of the art, as verified by nearly 30 existing crowdsensing applications studied
– Each application requires two application-specific pieces, one on the device and one in the backend
– Data is pushed from device to backend, with optional primitive processing on the device, upon triggering conditions (e.g., entering a store, on a bus)
Limitations of the current paradigm
– Inability to scale
  • Phones have a cap on the number of concurrent applications
  • Data gathered from societal-scale sensing may overwhelm the network and backend servers
– Low efficiency: applications perform sensing and processing activities independently without understanding the consequences on each other
  • Likely duplicate sensing and processing across applications
  • No collaboration or coordination across devices. Devices may not all be needed when the device population is dense
– Hard to program
  • Applications have to address challenges in energy, privacy and data quality in an ad hoc manner, reinventing the wheel all the time
  • Applications have to deal with heterogeneous devices, limiting the number of device platforms an application can run on
[Figure: application silos; labels App 1, App 2, ..., App k]
Our Vision: A general MCS platform
Develop a general platform reusable for different mobile crowdsensing applications
– On devices: provide middleware components that run on mobile sensing devices to enable crowdsensing in a coordinated, privacy-preserving, and energy-adaptive manner
  • MCS Device Agent: supports interaction with the backend infrastructure
  • Social Architecture: supports interaction with human participants in crowdsensing
  • MCS Data Collector: supports interaction with embedded sensors
– At the cloud backend:
  • Provide MCS Gateways at the edge of the network to connect with local mobile devices and present an aggregated view of vicinity sensor data to backend applications
  • Provide an MCS Data Broker as a backend application service that consolidates data needs from multiple applications and discovers and orchestrates the data needed by applications
  • Build a rich Domain Analytics Library for processing temporal-spatial crowdsensing data to derive domain-specific insight
[Figure: MCS platform architecture. Labels: Mobile Sensing Devices, Access Appliances, Wide Area Network, Application Gateway, Data Center; Smart supply chain, Smart grid, Smart healthcare, Smart building; app1, app2, app3; MCS Data Broker, MCS Gateway, Domain Analytics Library, Social Architecture, MCS Device Agent, MCS Data Collector]
Research Agenda in Brief
Understand data needs specifications from applications and negotiate with gateways to select those that can provide the requested data
Maintain metadata about device and aggregate data availability
Select devices that can produce the desired data and generate sensing directives to configure their sensing plug-ins
Monitor changes in data availability and quality, and make adaptations to ensure data needs are satisfied continuously
A set of commonly used sensing analytics running on devices
A device agent that coordinates the sensing and local processing activities for efficiency
Application-specific problems in both local sensing and backend mining
Real-time Data Fusion in the Monitoring of Human Crowd Distribution and Movements
In airports, railway stations, and public gatherings, how can we gain an accurate overview of the human crowd distribution and movements?
Conventional infrastructure-based methods (e.g., camera-based) have drawbacks
– Constrained by angle, light conditions, moving speed, density of population
– Not easy to track movements across cameras
– Complete coverage requires careful planning
– Cost of installing and maintaining the infrastructure
Using their sensors (radios, microphones), phones can detect devices in the vicinity and estimate the size of the local neighborhood
– Obtain the overall distribution and movements by fusing information from individual devices
Interesting tradeoff between efficiency and real-time response
– Latency is lowest when phones sense and report continuously, but at the cost of energy and bandwidth efficiency
Key to efficiency: the scanning frequency of each device should depend on 1) how much change has happened since the last scan, and 2) how much other devices have already reported
– The more the neighborhood changes, the more frequently it should be scanned
– Devices can piggyback how much they have reported during the scanning exchange, or a device that is waking up can first obtain hints from the backend
Generalize to an optimization problem: given a tolerable latency, how should devices adapt their scanning such that energy consumption is minimized?
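A minimal sketch of the adaptation idea above, in Python. It is not the authors' algorithm: the function names, the doubling/halving rule, and the default bounds (min_s, max_latency_s) are assumptions chosen only to show how the two inputs named on the slide (neighborhood change and what peers have already reported) could drive the scan interval while a latency cap is respected.

```python
def neighbor_change_ratio(previous, current):
    """Fraction of the neighbor set that changed between two scans (0..1)."""
    union = previous | current
    if not union:
        return 0.0
    # Symmetric difference = devices that appeared or disappeared.
    return len(previous ^ current) / len(union)


def next_scan_interval(current_s, change_ratio, peer_coverage,
                       min_s=5.0, max_latency_s=120.0):
    """Return the next scan interval in seconds (hypothetical policy).

    change_ratio:  how much the neighborhood changed since the last scan (0..1)
    peer_coverage: how much of it other devices already reported (0..1)
    """
    if not (0.0 <= change_ratio <= 1.0 and 0.0 <= peer_coverage <= 1.0):
        raise ValueError("ratios must lie in [0, 1]")
    # Scanning is urgent only when much changed AND peers reported little.
    urgency = change_ratio * (1.0 - peer_coverage)
    # urgency 0 doubles the interval, urgency 1 halves it.
    factor = 2.0 ** (1.0 - 2.0 * urgency)
    # Never scan faster than min_s, never exceed the tolerable latency.
    return max(min_s, min(max_latency_s, current_s * factor))
```

A device would call neighbor_change_ratio on its last two scan results, obtain peer_coverage from the scanning exchange or from backend hints, and schedule its next scan accordingly.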
Joint work with University of Minnesota, Tian He
Context-aware Data Fusion in Queuing Time Estimation
Queuing time is an important piece of information in many application scenarios
– Queues at check-in counters, security, and restaurants in airports; check-out lines in supermarkets
– Important for airport operators to foresee potential bottlenecks and for passengers to plan their journey
– Infrastructure-based methods require significant human and monetary costs in installation and maintenance (e.g., BLIP systems)
Use accelerometer data to infer changes in human activity and estimate queuing time
– Sample data is promising: queuing shows different patterns from walking
Use context input to improve accuracy and reduce false alarms
– A person wandering around may be falsely interpreted as being in a queue
– Leverage the context: similar temporal patterns from different passengers in the same spatial scope indicate they are more likely in a queue
  • Those in the same queue move one after another, and move about the same distance/time
Collecting more data to measure how reliably the context differentiates between wandering and queuing
Generalize: how to build a framework that exploits the spatial and temporal correlation in context to infer the information more reliably?
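The two-step idea above (an accelerometer pattern first, then confirmation from nearby passengers) can be sketched as follows. This is an illustration under stated assumptions, not the method used in the joint work: the window size, thresholds, lag, and all function names are hypothetical, and a real classifier would need calibration on collected data.

```python
from statistics import pstdev


def movement_bursts(samples, window, threshold):
    """Per-window flags: True when acceleration varies enough to be movement."""
    return [pstdev(samples[i:i + window]) > threshold
            for i in range(0, len(samples) - window + 1, window)]


def looks_like_queue(bursts, max_move_fraction=0.4):
    """Queue candidate: mostly standing still, with occasional short moves."""
    if not bursts:
        return False
    moves = sum(bursts)
    return 0 < moves <= max_move_fraction * len(bursts)


def context_support(own, neighbors, max_lag=1):
    """Fraction of nearby passengers whose moves track ours within max_lag windows."""
    own_moves = [i for i, moved in enumerate(own) if moved]
    supporters = 0
    for other in neighbors:
        if not looks_like_queue(other):
            continue  # a continuous walker is not evidence of a queue
        other_moves = [j for j, moved in enumerate(other) if moved]
        if all(any(abs(i - j) <= max_lag for j in other_moves) for i in own_moves):
            supporters += 1
    return supporters / len(neighbors) if neighbors else 0.0


def in_queue(own, neighbors, min_support=0.5):
    """Combine the individual pattern with the spatial/temporal context."""
    return looks_like_queue(own) and context_support(own, neighbors) >= min_support
```

A wandering person can still match the stop-and-go pattern alone, which is why in_queue also requires that enough neighbors in the same spatial scope move in step.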
[Figure: X acceleration (vertical axis, -15 to 15) versus time, with segments labeled Walking, Food Queue, and Check-out Queue]
Joint work with UIUC, Jiawei Han
Penetration Threshold for Reliable Data Fusion
Samples may be sparse: not everyone carries a phone; not every phone has the MCS agent installed
What is the minimum penetration threshold needed for reliable fusion of results?
– Examples: how many samples are needed for 90% confidence in queuing time estimation, or fuel consumption prediction?
Model the fusion from a theoretical perspective
– Human arrivals follow a Poisson distribution; service times follow an exponential distribution
– Derive statistical bounds on the threshold needed to achieve a certain confidence level of estimation
Compare against empirical data for validation
– Collect fuel consumption data from a fleet of ~100 trucks
– Use data from different fractions of the trucks to predict the overall fuel consumption
  • Find the minimum fraction needed for predictions to fall within certain margins of the actual numbers
Generalize
– If only a certain percentage of devices can be sampled, which ones should be chosen to maximize the confidence?
– How to make the selection progressively, each time using the previously sampled subset as the context to help choose the next batch?
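As a rough illustration of the kind of bound meant above, the sketch below estimates how many independent samples are needed to pin down a mean service time. It is not the bound derived in the joint work. It assumes exponential service times (as on the slide), independent samples, and a normal approximation to the sample mean. Because the standard deviation of an exponential distribution equals its mean, the relative standard error of the mean of n samples is 1/sqrt(n), which gives n >= (z / rel_error)^2. The function names and the population parameter are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def samples_needed(rel_error, confidence):
    """Samples needed so the estimated mean of exponential service times is
    within rel_error of the true mean at the given two-sided confidence
    (normal approximation, independent samples assumed)."""
    if rel_error <= 0 or not (0 < confidence < 1):
        raise ValueError("rel_error must be > 0 and confidence in (0, 1)")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided quantile
    return ceil((z / rel_error) ** 2)


def penetration_threshold(rel_error, confidence, population):
    """Fraction of a population that must report to reach that sample count."""
    return min(1.0, samples_needed(rel_error, confidence) / population)
```

Under these assumptions, 90% confidence of being within 10% of the true mean needs 271 samples, so a queue area seeing 1000 people would need roughly 27% of them to carry the agent.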
Joint work with UIUC, Tarek Abdelzaher
Relevance to Army problems
In non-conventional combat scenarios that require crowd control, e.g., civilian order maintenance, peacekeeping
– Monitor the distribution and movements of crowds of different mixtures
  • Gain a broad and real-time view of the overall situation
  • Complementary to existing technologies relying on special hardware or infrastructure (cameras, UAVs)
– Support both overview of large crowds and zoom-in on finer spots
  • Instruct devices around interesting phenomena to sample on more modalities, higher frequencies, or finer granularities
In disaster response where ground conditions and relief resources change quickly
– Monitor the evolving situation for victims that need care and attention
– Keep updated about the location of response personnel and the amount of relief supplies
– Prioritize and coordinate the response efforts such that personnel and supplies are directed to the most urgent cases
The MCS platform has broader impact on intelligence collection in the battlefield
– Each GI / vehicle can be equipped with various sensors and mobile devices
– Multiple applications, each serving a different purpose (e.g., one for road conditions and one for suspicious activities), can run in parallel on the same MCS platform, collecting data from the same underlying set of devices and sensors
– The MCS platform will handle the dynamic changes in device population, mobility, and resource levels to ensure quality in application data
Summary of the status
MCS is a new paradigm for building large-scale sensing applications, but the current approach has major drawbacks
Defined the architecture of the MCS platform and the functions of its components
Identified a number of key application scenarios and a research agenda to drive the development of the MCS platform
– Public safety
– Airport pax flow monitoring
Initial results on context-aware real-time data fusion issues in a few driving application problems, with collaborators from universities
– Real-time crowd detection and movement tracking
– Context-aware queuing time estimation in public transport
– Penetration threshold analysis for reliable data fusion
The general MCS platform will greatly ease the development of applications, including those relevant to the Army
Submissions and ongoing research efforts
– Mobile Crowdsensing: Current State and Future Challenges, Raghu Ganti, Fan Ye, Hui Lei, in submission to IEEE Comm. Magazine
– Ongoing research work with Jiawei Han, Tarek Abdelzaher
Backup
Existing Crowdsensing Applications
Traffic
– CarTel - Traffic using mobile phones
– Nericell - Monitoring road and traffic conditions using mobile phones
– Cooperative transit tracking
– GreenGPS - Fuel consumption
Individual health, entertainment, finance
– BikeNet - Bike route monitoring
– CenceMe - Sensing presence in social networks
– CenWits - Hiker tracking using sensor networks (can be easily extended to mobile phones)
– DietSense - Diet monitoring using pictures of what you eat
– Clean cooking in India
– Market price dispersion - mobile phones + bill scanning
Public works maintenance
– Garbage watch
– Pothole portal
Environment
– Suelo - Human assisted soil monitoring
– Common Sense - Air quality monitoring using handheld devices
– Ear phone assessment - Noise pollution monitoring
– Harbor monitoring - Monitor quality of harbor using mobile phones
– PEIR - Personal environment monitoring
– Hab watch - Monitor habitats
– SoundSense - Noise pollution monitoring
– CreekWatch - Creek monitoring
Life in a city
– PetrolWatch - Monitor petrol prices using mobile phone cameras
– Neighborhood culture and identities inferred from mobile phones
– ParkNet - Parking space estimation
– Video highlights of events using mobile phones
– Walkability - Safety of walking on streets
– YellowButton - Emergency reporting
Crowdsensing: From Autonomous to Participatory
• A continuum of effort: Mobile crowdsensing varies along a continuum from autonomous to participatory depending on how much effort individuals must expend to gather data
• A range of incentives: The incentives (and incentive systems) needed depend on how much effort is being asked of the user; but even autonomous crowdsensing requires some incentive to attract users and encourage them to opt in
[Figure: the mobile crowdsensing continuum from autonomous to participatory. In all cases, users are made aware of the crowdsensing application and must install the app and opt in to the sensing activity. Four levels of user effort are shown, from least to most:
– User may turn sensing on/off
– User may turn sensing on/off; user goes to places / takes routes where data is needed
– User may turn sensing on/off; user goes to places / takes routes where data is needed; user may take measures to improve the quality of a sample
– User may turn sensing on/off; user goes to places / takes routes where data is needed; user may take measures to improve the quality of a sample; user may need to do things to collect a sample]
Example Applications - Enterprise
Mobile Workforce Management
– Category: Multi-sector
– Sensors used: GPS, accelerometer, people
– Events and conditions detected: mobile worker activities, mobility patterns, time spent on jobs
– Benefits to organizations: optimization of scheduling and workforce utilization; performance management; recommendation of best practices; improved safety
– Benefits to individuals: improved productivity; better job satisfaction; self assessment; better collaboration
Cellular coverage
– Category: Telco
– Sensors used: GPS, cell signal quality
– Events and conditions detected: signal quality map of cell phones
– Benefits to organizations: enhanced cell tower placement and improved call services
– Benefits to individuals: reduced number of dropped calls; improved call quality
Credit card authorization
– Category: Finance
– Sensors used: GPS, accelerometer
– Events and conditions detected: card holder location, proximity to where a credit card is used
– Benefits to organizations: real-time detection of potential credit card frauds
– Benefits to individuals: fraud protection; quality of experience
Functions at the broker
Receive data needs specs from applications
– What elements should we have in the data needs spec language?
Consolidate data needs from multiple applications
– Identify and avoid duplicates in different data needs. Two cases:
  • 1) the same data is requested by more than one application;
  • 2) higher quality data (e.g., higher resolution) is requested by one application but can also be consumed by other apps
Negotiate with gateways and select those that can provide the requested data
– Maybe the same language could be used by gateways to describe data availability at the aggregate level?
Reuse/adapt existing data streams for a new data need
– Two cases:
  • 1) the new need is covered by an existing subset of streams: same event types, at the same spatial/temporal scope, with the same or higher resolution/quality
  • 2) the new app requires data semantically at a lower level than existing apps (pothole and spikes)
– For the second case, one plausible solution is to make local analytics modular and migrate some of them so that they can run either at devices or at gateways.
– Alternative solutions include: 1) allow the broker/gateways to generate and host a small "flow graph" that turns low-level events into high-level ones. The risk is the lack of a limit on the complexity of such auto-generated flow graphs. 2) require app developers to make apps flexible enough to use possibly different types of events, and let the broker and apps negotiate to switch to different event types. The drawback is that it places an extra burden on app developers.
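The consolidation step can be sketched as below. The data needs spec language is an open question on the slide, so the DataNeed fields here (app, event_type, region, resolution) are purely hypothetical, and "higher number means higher quality" is an assumption. The sketch covers both duplicate cases: identical requests share one stream, and a higher-resolution request serves lower-resolution consumers too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataNeed:
    """Hypothetical data needs spec; field names are illustrative only."""
    app: str
    event_type: str
    region: str
    resolution: float  # assumed: larger value means higher quality


def consolidate(needs):
    """Merge needs that differ only in requester or resolution.

    Returns {(event_type, region): (resolution_to_collect, [apps served])}.
    """
    streams = {}
    for need in needs:
        key = (need.event_type, need.region)
        resolution, apps = streams.get(key, (0.0, []))
        # Collect once at the highest resolution any app asked for.
        streams[key] = (max(resolution, need.resolution), apps + [need.app])
    return streams
```

For example, two apps asking for speed events on the same road segment at different resolutions would end up sharing one stream at the higher resolution.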
Functions at gateways (1)
Divide devices into groups and assign each to a gateway
– Mostly for a data center, the gateway is probably a process on a VM.
– The division may not have to follow geographical constraints. What are the possible division schemes to facilitate the search of device groups given a data needs spec?
Maintain metadata about device and aggregate data availability
– What state to maintain about the data availability of individual devices, and in what form?
  • For gateways to track the kind/quality of data each device can produce, so gateways can select devices and produce sensing directives for devices.
– What information to maintain about the aggregate data availability, and in what form?
  • For negotiation between broker and gateways, and for the broker to choose which gateways can satisfy a data need.
– Some elements in the metadata include: individual devices' event types, energy, quality in different metrics; at the aggregate level, event types and quality metrics. It's likely the data needs spec language or its elements could be reused here.
Select devices that can produce the desired data and generate sensing directives to configure their sensing plug-ins
– Given data needs from the broker, how does a gateway decide which devices to choose?
  • Sensing events are straightforward, but quality may not be.
  • Define quality as multi-dimensional scalar values,
  • Or supply a comparator to compute the quality and make the decision on the fly
– How to generate parameters to configure the corresponding sensing plug-ins?
  • Which sensor to sample, at what granularity, what local analytics to run, etc.
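A small sketch of the device selection step, covering both quality options mentioned above: multi-dimensional scalar quality values checked against minimums, and an optional caller-supplied scoring function standing in for the comparator. The metadata fields, the default ranking by remaining energy, and the directive format are all assumptions made for illustration; the slides leave these open.

```python
from dataclasses import dataclass


@dataclass
class DeviceMeta:
    """Hypothetical per-device metadata kept at a gateway."""
    device_id: str
    event_types: set
    quality: dict   # metric name -> scalar value, larger is better (assumed)
    energy: float   # remaining battery fraction, 0..1


def by_energy(device):
    """Default ranking: prefer devices with more remaining energy."""
    return device.energy


def select_devices(devices, event_type, min_quality, count, score=by_energy):
    """Pick up to `count` devices that can produce event_type at the
    required quality, ranked by the supplied scoring function."""
    eligible = [d for d in devices
                if event_type in d.event_types
                and all(d.quality.get(metric, 0.0) >= floor
                        for metric, floor in min_quality.items())]
    return sorted(eligible, key=score, reverse=True)[:count]


def sensing_directive(device, event_type, sample_hz):
    """Parameters a gateway might send to configure a sensing plug-in."""
    return {"device_id": device.device_id,
            "event_type": event_type,
            "sample_hz": sample_hz}
```

Passing a different score function lets an application rank on its own notion of quality without changing the gateway code.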
Functions at gateways (2)
Device selection
– May need a universal language and negotiation protocol that runs between lower/upper layers, both between devices and gateways, and between gateways and the broker
Monitor changes in data availability and quality, and make adaptations to ensure data needs are satisfied continuously
– Selecting different subsets of devices, or adapting their sensing directives.
– If gateways can no longer shield the changes, they may need to notify the broker to select other gateways.
– May need to define common quality metrics and some controlling mechanisms, possibly in the form of a library of common quality control modules reusable by different apps.
– Some utility-based approach could be used for the adaptation, e.g., selecting different devices or adapting sensing directives, based on the utility for apps (i.e., benefits) and devices/owners (i.e., costs).
"De-perturb" the noise added to the data by the privacy mechanism on devices
– Aggregate data and remove the aggregate of noises
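The de-perturbation idea can be illustrated with a simple additive-noise scheme. This is only one possible privacy mechanism and is assumed here for illustration: each device adds independent Gaussian noise whose mean is known to the gateway, so no single report is exact, yet the average over many reports recovers the true aggregate once the known noise mean is subtracted. The function names and parameters are hypothetical.

```python
import random


def perturb(value, noise_mean, noise_std, rng):
    """Device side: hide the true reading behind additive Gaussian noise."""
    return value + rng.gauss(noise_mean, noise_std)


def deperturbed_mean(reports, noise_mean):
    """Gateway side: average the noisy reports, then remove the aggregate
    (expected) noise. Individual readings stay hidden."""
    if not reports:
        raise ValueError("no reports to aggregate")
    return sum(reports) / len(reports) - noise_mean
```

The residual error shrinks roughly with the square root of the number of reports, so this only works at the aggregate level, which matches the gateway's role.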
Functions at devices
A set of common local analytic plug-ins, and the common plug-in development spec
– Local analytics written to the spec can be installed and managed in a plug-in management platform (i.e., agent platform)
– There are probably two kinds of plug-ins:
  • Those that call the physical sensor access API on the device OS and provide a device-independent API for processing analytics;
  • Those that take the data produced by the first kind and do some local processing to produce sensing events.
– A common plug-in library shipped as part of the MCS middleware will make it easy to develop new apps.
– Some plug-ins that produce semantically more abstract event types (e.g., potholes) could have a modular composition for efficient placement (a more advanced feature)
The platform needs to maintain data availability of the hosting device and update such information at the associated gateway
– Need to decide what elements exist in the data availability metadata. Besides event types, quality metrics, resource availability and user policy probably should be included as well
– Need to design an efficient protocol for devices to synchronize the state maintained by gateways.
Privacy protection of the owner of the devices
– We need to understand what aspects of privacy (e.g., location, activity pattern) the owner wants to protect and design mechanisms to provide such privacy.
Incentive mechanism to recruit, engage and retain owners to participate
– This function is likely to run across three layers. Need to define how it interacts with the rest of MCS.
Hand-off protocols for devices to move across association boundaries of gateways
– Debate on implication between architecture and business/operating models
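The two kinds of plug-ins described above could be expressed along these lines. The plug-in development spec is not defined in the slides, so the interfaces below (SensorPlugin, AnalyticPlugin, run_pipeline) are hypothetical, and the accelerometer is replaced by a replay of recorded samples so the sketch runs anywhere.

```python
from abc import ABC, abstractmethod


class SensorPlugin(ABC):
    """First kind: wraps the OS sensor API behind a device-independent call."""

    @abstractmethod
    def read(self):
        """Return a list of raw samples."""


class AnalyticPlugin(ABC):
    """Second kind: turns raw samples into sensing events."""

    @abstractmethod
    def process(self, samples):
        """Return a list of event dictionaries."""


class ReplayAccelerometer(SensorPlugin):
    """Stand-in sensor that replays recorded samples (no real hardware)."""

    def __init__(self, samples):
        self._samples = list(samples)

    def read(self):
        return list(self._samples)


class SpikeDetector(AnalyticPlugin):
    """Emits a 'spike' event for each sample whose magnitude exceeds a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def process(self, samples):
        return [{"type": "spike", "index": i, "value": v}
                for i, v in enumerate(samples) if abs(v) > self.threshold]


def run_pipeline(sensor, analytics):
    """What a device agent might do: read once, feed every analytic plug-in."""
    samples = sensor.read()
    events = []
    for plugin in analytics:
        events.extend(plugin.process(samples))
    return events
```

Reading once and sharing the samples across analytics is one way to avoid the duplicate sensing that the application-silo approach suffers from.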