Listening to the pulse of our cities with Stream Reasoning (and few more technologies)
Emanuele Della Valle - @manudellavalle - http://emanueledellavalle.org
Share, Remix, Reuse: Legally
This work is licensed under the Creative Commons Attribution 3.0 Unported License.
You are free:
• to Share: to copy, distribute and transmit the work
• to Remix: to adapt the work
Under the following conditions
• Attribution: You must attribute the work by inserting
  – "[source http://emanueledellavalle.org]" at the end of each reused slide
  – a credits slide stating
    - These slides are partially based on "Listening to the pulse of our cities fusing Social Media Streams and Call Data Records" by Emanuele Della Valle
To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/
Me
Assistant Professor at DEIB, Politecnico di Milano
Expert in semantic technologies and stream computing
Brander of stream reasoning: an approach to master the velocity and variety dimensions of Big Data
15 years of experience in research and innovation projects
Startupper: fluxedo.com
R&D advisor: socialometers.com
@manudellavalle
http://emanueledellavalle.org
http://streamreasoning.org
http://fluxedo.com
Acknowledgements
Politecnico di Milano
• DEIB
  – What: scientific direction, semantic technologies, stream processing, data science
  – Who: Emanuele Della Valle, Marco Balduini
• Density Design Lab
  – What: visual analytics
  – Who: Paolo Ciuccarelli, Matteo Azzi
Telecom Italia
• SKIL Lab
  – What: Big Data technology, data science
  – Who: Fabrizio Antonelli, Roberto Larcher
Funding agency
Agenda
• Context
• Problem
• Experimental setting
• Solution
• Evaluation
• Conclusions
The digital reflection of our cities is sharpening
[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]
because the urban environment is captured in open datasets
and streams of information flow through our cities thanks to
• the pervasive deployment of sensors
• the wide adoption of smart phones
• the usage of (location-based) social networks
and it tracks changes with a decreasing delay
Data source         By when          Frequency       Delay
Census data         100s of years    years           months
Newspaper           100s of years    days            1 day
Weather sensors     10s of years     hours/minutes   hours/minutes
TV news             10s of years     hours           minutes
Traffic sensors     years            15 minutes      minutes
Call Data Records   years            15 minutes      hours
Social media        years            seconds         seconds
IoT                 recently         milliseconds    milliseconds
Data piles up without easing decision making
Mayor: "I have to decide: A or B? Why not C? What if D?"
But smarter Big Data can …
…advance our ability to feel the pulse of our cities
• fusing all those data sources
• making sense of the fused information
to improve decision making and deliver innovative services
Mayor: "Definitely E!"
Let's focus on a concrete research question
Can we collect, analyse and repurpose
• social media and
• Call Data Records
to allow
• perceiving emerging patterns and
• observing their dynamics?
[photo: https://www.flickr.com/photos/debord/4932655275]
More precisely, the research question is
Can we collect, analyse and repurpose
• social media captured at places and events and
• privacy-preserving aggregates of Call Data Records
to allow visually
• perceiving emerging patterns and
• observing their dynamics?
How to set up an experiment?
[photo: https://www.flickr.com/photos/myfuturedotcom/6053042920]
Question                 Answer
Which city?              Milan
Comparing what?          Milan Design Week vs. Milan in general
Experimental subjects?   Event managers & casual audience
What's Milan Design Week?
[map: http://www.fuorisalone.it]
The Milan Design Week (MDW) is a city-scale event
• held yearly in Milan,
• featuring around 1,200 events
• in 500+ places spread across the city and
• attracting about half a million people from all over the world.
CitySensing for event managers (2013)
F. Antonelli, M. Azzi, M. Balduini, P. Ciuccarelli, E. Della Valle, R. Larcher: City sensing: visualising mobile and social data about a city scale event. AVI 2014: 337-338
http://jol.telecomitalia.com/jolskil/citysensing/
CitySensing for casual audience (2014)
M.Balduini, E.Della Valle, M.Azzi, R.Larcher, F.Antonelli, and P.Ciuccarelli: CitySensing: Fusing City Data for Visual Storytelling. IEEE MultiMedia.
http://jol.telecomitalia.com/jolskil/citysensing/
http://citysensing.fuorisalone.it/
Ingredients of the proposed solution
Big Data technologies
- Address "volume" of data that do not fit in memory
- Address "velocity" of data streams in memory
Semantic technologies
- Address "variety" using Ontology Based Data Access
- Named Entity Recognition and Linking
Data science
- Statistical modelling
- Detecting anomalies
Visual analytics
- Allow non-expert access to data
- Tell stories out of data
[callout: Stream Reasoning]
What's Stream Reasoning?
Tame Variety and Velocity simultaneously
[figure: Traditional vs. Stream Reasoning]
E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel: It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)
E. Della Valle, D. Dell'Aglio, A. Margara: Taming velocity and variety simultaneously in big data with stream reasoning: tutorial. DEBS 2016: 394-401
How CitySensing works – step 0
Set up a conceptual model (FraPPE) to master the variety in the data sources
[figure labels: Reality, Capture, Frame, Digital Reflex, Time, Grid, Cell, Pixel, Place A, Event A, Frame 1, Place B, Event B, Frame 2]
M. Balduini, E. Della Valle: FraPPE: a vocabulary to represent heterogeneous spatio-temporal data to support visual analytics. ISWC 2015
FraPPE offers a homogeneous view to the visual analytics interface built on heterogeneous data
[figure legend: Geo-spatial fragment, Provenance fragment, Time Varying fragment, FraPPE specifics]
How CitySensing works – step 1
For every pixel, continuously compute the volume of Call Data Records (using privacy-preserving aggregation)
[map: real data recorded on 13 April 2013 between 13:00 and 00:00]
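Step 1 can be pictured as counting records per pixel, where a pixel is one cell of the grid during one 15-minute frame. The sketch below is a batch stand-in for the continuous computation, not the CitySensing implementation: the input shape (pairs of cell id and timestamp), the function names and the sample records are assumptions made for illustration, and the privacy-preserving aggregation is not modelled.

```python
from collections import Counter
from datetime import datetime

FRAME_MINUTES = 15  # frame duration stated in the deck


def frame_index(ts: datetime) -> int:
    """Index of the 15-minute frame within the day (0..95)."""
    return (ts.hour * 60 + ts.minute) // FRAME_MINUTES


def volume_per_pixel(records):
    """Count records per pixel, keyed by (cell_id, date, frame index).

    `records` is an iterable of (cell_id, timestamp) pairs; this shape is
    an assumption of the example, not the format of real Call Data Records.
    """
    counts = Counter()
    for cell_id, ts in records:
        counts[(cell_id, ts.date(), frame_index(ts))] += 1
    return counts


# Hypothetical records: three in cell 4217, one in cell 5120.
records = [
    (4217, datetime(2013, 4, 13, 13, 2)),
    (4217, datetime(2013, 4, 13, 13, 14)),
    (4217, datetime(2013, 4, 13, 13, 15)),
    (5120, datetime(2013, 4, 13, 23, 59)),
]
volumes = volume_per_pixel(records)
```

A streaming engine would emit these counts as each frame closes instead of scanning a finished list, but the grouping key stays the same.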
How CitySensing works – step 2
Continuously find the anomalous pixels by comparing the current volumes with a model of the volumes in this time period
[map: real data recorded on 13 April 2013 between 13:00 and 00:00]
How CitySensing works – step 3
Continuously map anomalies to the districts of Milano Design Week
[map: real data recorded on 13 April 2013 between 13:00 and 00:00; labels: Brera, Tortona, "What's this?"]
How CitySensing works – step 4
For every anomalous pixel, continuously capture the hashtags and semantic entities named in the social media streams
[map: real data recorded on 13 April 2013 between 13:00 and 00:00; labels: Brera, Tortona, "What's this?"]
How CitySensing works – step 5
Continuously discard the hashtags and semantic entities that are systematically used
[map: real data recorded on 13 April 2013 between 13:00 and 00:00; labels: Brera, Tortona]
Logical architecture of CitySensing – setup time
[diagram components: Capture Data Stream, Capture Static Data, Analyse Data Stream, Build Models, MDW]
Logical architecture of CitySensing – run time
[diagram components: Capture Data Stream, Capture Static Data, Analyse Data Stream, Build Models, Detect Anomalies, Store Analysis, Visualize Analysis, MDW; annotations: Stream Reasoning, Inductive, Deductive]
A few more details on Stream Reasoning
[diagram: Stream Reasoner for data "in-motion" (in-memory); Store for data "at-rest" (distributed); MDW]
• Uses logical windows
• Connects to a variety of data streams
• Real-time query answering
• Complex event processing analysis
• Optimizes joins
Capturing static data via FraPPE
The frame duration was fixed to 15 minutes
The Milano area was covered with
• 1 grid (100x100)
• 10,000 cells
• 250x250 meters in each cell (the size of the mobile network cells in the centre of Milan)
During the Milano Design Week a total of 5.76 million pixels were captured
1,000+ events in 600+ places were collected using the crowd-sourced databases of fuorisalone.it, breradesigndistrict.it and tortonaroundesign.com thanks to a partnership with studiolabo
[map: cells in which there are places hosting Milan Design Week 2013 events]
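The figures on this slide can be checked with a little arithmetic. The sketch below is one reading that reproduces them, under two labelled assumptions: six event days (the charts later in the deck run from day 09 to day 14) and one model per cell, per frame of the day, per day type (which would match the 1.92 million Gaussian models mentioned on the next slide). The `cell_id` helper and its row-major, south-west-origin convention are hypothetical.

```python
GRID_SIDE = 100      # cells per side, from the slide
CELL_METERS = 250    # cell edge in metres, from the slide
FRAME_MINUTES = 15   # frame duration, from the slide

CELLS = GRID_SIDE * GRID_SIDE                     # 10,000 cells
FRAMES_PER_DAY = 24 * 60 // FRAME_MINUTES         # 96 frames per day
GRID_EXTENT_KM = GRID_SIDE * CELL_METERS / 1000   # 25.0 km per side

# Assumption: six event days.
EVENT_DAYS = 6
PIXELS_DURING_EVENT = CELLS * FRAMES_PER_DAY * EVENT_DAYS

# Assumption: two day types (working days and week-end days).
DAY_TYPES = 2
MODELS = CELLS * FRAMES_PER_DAY * DAY_TYPES


def cell_id(x_m: float, y_m: float) -> int:
    """Row-major cell id for a point given in metres from the grid's
    south-west corner (a hypothetical convention for this example)."""
    col = int(x_m // CELL_METERS)
    row = int(y_m // CELL_METERS)
    if not (0 <= col < GRID_SIDE and 0 <= row < GRID_SIDE):
        raise ValueError("point outside the grid")
    return row * GRID_SIDE + col
```

Under these assumptions `PIXELS_DURING_EVENT` is 5,760,000 and `MODELS` is 1,920,000, which agrees with the numbers quoted in the deck.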
Processing Telecom Italia Call Data Records
1.92 million Gaussian models were built
• one for each pixel (i.e., for each frame and cell)
• grouping the frames by working and week-end days
• using two months of Call Data Records, and
• verifying that the volume of CDR has a Gaussian distribution with an Anderson-Darling test at a significance level of 0.05
Built on Pig, R and Cascalog
The processing on 7 m1.large EC2 machines took 24 hours
[figure: histogram and Q-Q plot for a bad case and for a good case]
Processing Telecom Italia Call Data Records
Volume of CDR captured in Milan during the Design Week
What                     2013          2014
Calls                    16,743,875    19,719,629
SMSs                     19,454,497    20,240,485
Internet data accesses   137,381,761   197,767,245
Calls, SMS and Internet accesses were aggregated (with privacy-preserving methods) and an anomaly index was computed for each of the 1.92 million pixels/day
The processing of 1 day on 7 m1.large EC2 machines took 20 minutes
[image: https://cerijayne.files.wordpress.com/2011/10/outliersss.png]
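The deck does not spell out the formula of the anomaly index. A minimal sketch, assuming it is the distance of the observed volume from the mean of the pixel's Gaussian model, measured in standard deviations, could look as follows. The function names and the sample history are hypothetical; the 2σ threshold is the one the deck uses to define CDR-anomalous pixels.

```python
from statistics import mean, stdev


def fit_gaussian(history):
    """Estimate (mu, sigma) from the historical volumes of one pixel."""
    return mean(history), stdev(history)


def anomaly_index(volume, mu, sigma):
    """Signed number of standard deviations between volume and mu."""
    if sigma == 0:
        raise ValueError("constant history: the index is undefined")
    return (volume - mu) / sigma


def is_anomalous(index, threshold=2.0):
    """True when the index is above +threshold or below -threshold."""
    return abs(index) > threshold


# Hypothetical volumes for one cell in the same frame on past days.
history = [100, 110, 90, 105, 95, 100, 108, 92]
mu, sigma = fit_gaussian(history)

busy = anomaly_index(130, mu, sigma)    # well above the usual volume
usual = anomaly_index(104, mu, sigma)   # close to the usual volume
quiet = anomaly_index(80, mu, sigma)    # well below the usual volume
```

With this history the busy and quiet observations are flagged and the usual one is not, which illustrates why a two-sided threshold catches both crowds and unusually empty areas.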
Do CDR-anomalous pixels relate to events?
CDR-anomalous pixels = pixels in which the anomaly index is high in absolute value (above +2σ or below -2σ)
To test whether the anomalous pixels were related to the events of the Milan Design Week
• We used three ground truths
  – the pixels of Milan
  – the pixels of the Brera district
  – the pixels of the Tortona district
  where there was at least one event of Milan Design Week 2013
• We computed
  – Precision
  – Recall
  of the anomalous pixels in finding the pixels in those three ground truths
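Precision and recall here compare two sets of pixels: the ones flagged as anomalous and the ones known to host an event. A small sketch with hypothetical pixel ids shows the computation; the function name and the convention of returning 0.0 for an empty set are choices made for this example.

```python
def precision_recall(detected: set, ground_truth: set):
    """Precision and recall of `detected` with respect to `ground_truth`."""
    true_positives = len(detected & ground_truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical pixel ids.
anomalous = {1, 2, 3, 4}
event_pixels = {3, 4, 5, 6, 7, 8, 9, 10}
p, r = precision_recall(anomalous, event_pixels)
```

In this toy case half of the flagged pixels host an event (precision 0.5) while only a quarter of the event pixels are flagged (recall 0.25).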
Do CDR-anomalous pixels relate to events?
[chart: precision and recall (0 to 1) of the CDR-anomalous pixels, from 09 04:00 to 14 22:00 (Tuesday to Sunday), in three panels: Milan, Brera, Tortona]
Lesson learnt
• High precision
• Low recall at city scale
• High recall in Brera and Tortona
Processing Social Streams
The machinery: the Streaming Linked Data framework
M. Balduini, E. Della Valle, D. Dell'Aglio, M. Tsytsarau, T. Palpanas, and C. Confalonieri: Social Listening of City Scale Events Using the Streaming Linked Data Framework. International Semantic Web Conference (2) 2013: 1-16
[diagram components: Data Source, Streaming Linked Data Server (Adapter, Decorator, Analyser, Publisher, Stream, Stream Bus), HTML5 Browser (Visualizer), HTTP connections]
Processing Social Streams
M. Balduini, A. Bozzon, E. Della Valle, Y. Huang, G-J Houben: Recommending Venues Using Continuous Predictive Social Media Analytics. IEEE Internet Computing 18(5): 28-35 (2014)
Example micro-post: "Happily inside a bottle of Heineken beer @ the Heineken Magazzini #heinekendesignweek"
[figure: Knowledge Graph; entities: Event "Milan Design Week", Event "Heineken Design Week", Location "The Magazzini", Company "Heineken", Drink "beer"; relations: hosts, has location, produces, organized by]
Wide as Wikipedia, as deep as you like
Processing Social Streams
Predictive models were built
• for hashtags and semantic entities systematically present
• using a Holt-Winters method
• grouping the frames by
  – working and week-end days and
  – early morning, morning, afternoon, evening, and late night
• analysing 300,000 geo-located micro-posts collected over 6 months in the Milano area (November 2013 to April 2014)
• it takes a few seconds per hashtag/semantic entity on a 60€/month VM in an IaaS
[chart legend: Data, Fitted, Forecast, Lower 2.5%, Upper 97.5%]
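To make the idea concrete, here is a minimal sketch of additive Holt-Winters smoothing in plain Python, followed by the subtraction of the forecast from the observed usage. It is not the code behind the deck: the additive variant, the simple initialisation from the first two seasons, the smoothing parameters and the toy series are all assumptions chosen for illustration.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters.

    Level, trend and seasonal terms are initialised from the first two
    seasons, then updated once per observation.
    """
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons")
    first = series[:season_len]
    second = series[season_len:2 * season_len]
    level = sum(first) / season_len
    trend = (sum(second) - sum(first)) / season_len ** 2
    seasonals = [value - level for value in first]

    for t, value in enumerate(series):
        season = seasonals[t % season_len]
        previous_level = level
        level = alpha * (value - season) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonals[t % season_len] = gamma * (value - level) + (1 - gamma) * season

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % season_len]
            for h in range(1, horizon + 1)]


# Toy usage counts of one hashtag with a repeating four-slot pattern.
history = [10, 20, 30, 20] * 6
forecast = holt_winters_additive(history, season_len=4,
                                 alpha=0.3, beta=0.1, gamma=0.2, horizon=4)

# Hypothetical observation with a burst in the third slot.
observed = [10, 20, 55, 20]
residual = [o - f for o, f in zip(observed, forecast)]
```

On this perfectly periodic toy series the forecast repeats the pattern, so the residual is close to zero everywhere except in the slot with the burst. That residual is the part of the usage the systematic model cannot explain.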
Processing Social Streams
Usage of #milan in the weeks around Milan Design Week
Subtracting the predicted usage of #milan
[charts: two heat maps, before and after subtracting the prediction; rows are the time slots 2:00–7:00, 7:00–11:00, 11:00–14:00, 14:00–19:00 and 19:00–2:00; columns alternate working days (WD) and week-ends (WE); Milan Design Week is marked]
Processing Social Streams
The difference between the observed and the predicted usage of #milan perfectly fits the usage of #mdw (the official hashtag of Milan Design Week)
[charts: anomalous usage of #milan and usage of #mdw; rows are the time slots 2:00–7:00, 7:00–11:00, 11:00–14:00, 14:00–19:00 and 19:00–2:00; columns alternate working days (WD) and week-ends (WE); Milan Design Week is marked]
Processing Social Streams
Geo-referenced micro-posts were captured, semantically annotated, cleansed using the predictive models and analyzed in the Milan area
What                                  2013     2014
Geo-located micro-posts               57,154   21,782
Linked to Milano Design Week          3,569    3,499
Linked to a specific location/event   761      547
For each pixel with at least 1 micro-post we computed
• the volume related to Milano Design Week
• the top-10 hashtags
• the top-3 locations/events
Real-time processing was possible with our in-memory C-SPARQL engine and the Streaming Linked Data framework on a 20€/month VM in an IaaS
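The per-pixel ranking can be illustrated with a few lines of Python. This is only a sketch of the aggregation, not the C-SPARQL queries used in CitySensing: the input shape (pairs of pixel key and hashtag list), the lower-casing of hashtags and the sample posts are assumptions of the example.

```python
from collections import Counter, defaultdict


def top_hashtags_per_pixel(posts, k=10):
    """Return {pixel: [(hashtag, count), ...]} with the k most used hashtags.

    `posts` is an iterable of (pixel, hashtags) pairs, where `pixel` is any
    hashable key such as (cell_id, frame).
    """
    counters = defaultdict(Counter)
    for pixel, hashtags in posts:
        counters[pixel].update(tag.lower() for tag in hashtags)
    return {pixel: counter.most_common(k) for pixel, counter in counters.items()}


# Hypothetical micro-posts already mapped to pixels.
posts = [
    ((4217, 52), ["#mdw", "#Brera"]),
    ((4217, 52), ["#MDW"]),
    ((5120, 95), ["#tortona"]),
]
ranking = top_hashtags_per_pixel(posts, k=1)
```

The same pattern, with a counter of linked locations or events instead of hashtags and `k=3`, would give the top-3 locations/events.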
Do socially active pixels relate to events?
Socially active pixels = pixels in which we captured social media that talk about Milan Design Week
We computed
• precision
• recall
of the socially active pixels in finding the pixels in the three ground truths about Milan, the Brera district and the Tortona district
[chart: precision and recall (0 to 1) of the socially active pixels, from 09 04:00 to 14 22:00 (Tuesday to Sunday), in three panels: Milan, Brera, Tortona]
Lesson learnt
• High precision
• Acceptable recall in the districts
Are CDR-anomalous and socially active pixels similar?
Which of the following four scenarios?
[figure: four scenarios, each showing the anomalous pixels, the socially active pixels, their intersection, and whether the two sets are similar]
Are CDR-anomalous and socially active pixels similar?
More formally
• Jaccard: J(A,B) = |A ∩ B| / |A ∪ B|
• E.g., J(A,B) = 8/11 for two largely overlapping sets and J(A,B) = 3/11 for two sets with little overlap
[figure: two pairs of sets A and B]
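The Jaccard index is straightforward to compute on sets of pixel ids. The sketch below reproduces the two values quoted on the slide with hypothetical sets built to have the same intersection and union sizes; returning 1.0 for two empty sets is a convention chosen for this example.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 1.0  # convention for two empty sets, chosen for this sketch
    return len(a & b) / len(union)


# Hypothetical pixel ids: 8 shared pixels out of 11 in total.
high = jaccard(set(range(0, 10)), set(range(2, 11)))

# Hypothetical pixel ids: 3 shared pixels out of 11 in total.
low = jaccard(set(range(0, 7)), set(range(4, 11)))
```

Here `high` equals 8/11 and `low` equals 3/11, matching the two examples on the slide.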
[chart: recall of the CDR-anomalous pixels, recall of the socially active pixels, and their Jaccard index (0 to 1), from 09 04:00 to 14 22:00 (Tuesday to Sunday), in two panels: Brera, Tortona]
Lesson learnt
At district level, in the large majority of cases the socially active pixels are also CDR-anomalous pixels
Visualizing for a casual audience
See it in action!
http://youtu.be/MOBie09NHxM
Evaluation methodology for casual audience
Guessability study
• Can you guess what I mean without any explanation?
E.g.
• Dinosaur extinction
• "The Shining" by Stephen King
Evaluation of interface guessability
The patterns you should have got
The CDR-anomaly and the social activity are
• Correlated
• Partially correlated
• Not correlated
Evaluation of interface guessability
Q: In Brera District the volume of social media signal is partially correlated with the value of mobile anomaly signal
A: [bar chart: share of FALSE, UNCERTAIN and TRUE answers, scale 0 to 1]
Q: In Porta Romana the volume of social media signal is strongly correlated with the value of mobile anomaly signal
A: [bar chart: share of FALSE, UNCERTAIN and TRUE answers, scale 0 to 1]
Q: In Tortona District the volume of social media signal is strongly correlated with the value of mobile anomaly signal
A: [bar chart: share of FALSE, UNCERTAIN and TRUE answers, scale 0 to 1]
Back to the research question
[photo: https://www.flickr.com/photos/debord/4932655275]
Can we collect, analyse and repurpose
• social media captured at places and events and
• privacy-preserving aggregates of Call Data Records
to allow visually
• perceiving emerging patterns and
• observing their dynamics?
Yes! At least, in Milano Design Week 2013 and 2014
[photo: https://flic.kr/p/beuDaX]
… and I was crazy enough to start up a company …
http://www.socialometers.com
Lesson Learnt for Stream Reasoning
• The technical barriers are high
• The theoretical foundations are incomplete
• The veracity problem is sort of forgotten
High Technical Barriers for Stream Reasoning
We are getting close to a shared understanding on RDF Stream Processing (RDF streams and a continuous extension of SPARQL)
• See http://www.w3.org/community/rsp/
Missing infrastructure
• Only one proposal for RDF stream publishing
  – http://streamreasoning.github.io/TripleWave/
• Only one proposal for RDF Stream Processing APIs
  – http://streamreasoning.org/resources/rsp-services
Only prototypes, some unmaintained
Need for scalable systems built on Big Data technologies (e.g., Spark/Flink)
Lack of systematic and comparative evaluation
• too many benchmarks, all focusing on RDF stream processing with little emphasis on reasoning
Incomplete Stream Reasoning theory
Two reference models exist
• RSP-QL: built on SPARQL semantics
  – D. Dell'Aglio, E. Della Valle, J-P Calbimonte, Ó. Corcho: RSP-QL Semantics: A Unifying Query Model to Explain Heterogeneity of RDF Stream Processing Systems. Int. J. Semantic Web Inf. Syst. 10(4): 17-44 (2014)
• LARS: built on Datalog-style rules
  – H. Beck, M. Dao-Tran, T. Eiter, M. Fink: LARS: A Logic-Based Framework for Analyzing Reasoning over Streams. AAAI 2015: 1431-1438
However
• What's the complexity of Q/A in RSP-QL/LARS?
• How to deal with inconsistency appearing over time?
• How do stream reasoning and event calculus relate?
OBDA on static data ≠ OBDA for continuous querying
• ans = data + query
• Ans(t) = sys(t) + data(t) + query
What about inductive stream reasoning?
The veracity problem is sort of forgotten
Some initial works
• D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems 25(6): 32-41 (2010)
• M. Nickles, A. Mileo: Web Stream Reasoning Using Probabilistic Answer Set Programming. RR 2014: 197-205
• A-Y Turhan, E. Zenker: Towards Temporal Fuzzy Query Answering on Stream-based Data. HiDeSt@KI 2015: 56-69
Missing Theory?
Take home message … guess it :-)
Thank you!
Any questions?