SC7 Hangout 2: Integrating Social Sensing for Security
INTEGRATING SOCIAL SENSING FOR SECURITY
2nd BDE Hangout “Big Data in Secure Societies”, 21 April 2016
George Giannakopoulos, George Kiomourtzis
NCSR “Demokritos”
BigDataEurope Pilot
Big Data Platform: Remote Sensing and Social Sensing
Remote Sensing: Change Detection Workflow
• query • download • pre-processing • change detection
Social Sensing: Event Detection Workflow
• monitor news • cluster into events • filter relevant events • extract AoI
Social media
◎ We know that user-generated content (social media, blogs, etc.) has big data characteristics:
o Volume: Amounts to TBytes per day
o Variety: Structure and formats; level of editing (curated/free text); languages; length
o Velocity: Real-time streams and requirements
o Veracity: Credibility and verification
o Value: Usable in risk management, brand monitoring, event detection
What can media say?
◎ News reporting and social media:
o Report events
o Share
o Discuss events
◎ ...but
o People use free text
o There exists minimal annotation
o Reports are difficult to confirm
Event detection workflow
◎ Listen and monitor news and social media sources
◎ Identify events
i. Compare documents
ii. Form clusters
iii. Determine importance
◎ Enrich and store
i. Extract meta-data and geo-location information
ii. Update semantic infrastructures
◎ Combine with satellite data to inform user
Listen to media
● Define sources
a. News feeds (RSS)
b. Selected social media accounts (trust)
c. Generic streams (e.g. Twitter)
d. Keyword-based search
● Periodically check, or...
● ... consume a stream
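The "periodically check" option for RSS sources can be sketched as follows. This is a minimal illustration, not the pilot's actual code: the function names, the sample feed and the example.org links are assumptions made for the example, and only the Python standard library is used. Each poll parses the feed and keeps the items that were not seen in an earlier poll.

```python
import xml.etree.ElementTree as ET

def parse_rss_items(rss_xml):
    """Return title, link and publication date for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items

def new_items(items, seen_links):
    """Keep only items whose link was not seen in an earlier poll; remember them."""
    fresh = [it for it in items if it["link"] not in seen_links]
    seen_links.update(it["link"] for it in fresh)
    return fresh

# Hypothetical feed content, standing in for the body of an HTTP response.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example feed</title>
<item><title>Shares follow oil down</title><link>http://example.org/a</link></item>
<item><title>Baby rescued after quake</title><link>http://example.org/b</link></item>
</channel></rss>"""

seen = set()
first_poll = new_items(parse_rss_items(SAMPLE_FEED), seen)   # both items are new
second_poll = new_items(parse_rss_items(SAMPLE_FEED), seen)  # nothing new
```

In a running monitor the two polls would be separated by a sleep interval and the feed text would be fetched from the source URL on every round.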
Identify events
● Form pairs of news texts
● For each pair
○ Compare texts
○ If similarity above threshold
■ Consider related
● Form clusters, based on related pairs
○ Each cluster identifies an event
○ If cluster has a specific support
■ Keep as important
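The steps above can be sketched in a few lines. The similarity measure here is a plain Jaccard overlap of character trigrams, used only as a stand-in for whatever comparison the pilot applies; the threshold, the minimum support and all function names are assumptions for illustration. Related pairs are merged with a union-find structure, so a cluster is the transitive closure of the "related" relation.

```python
from itertools import combinations

def char_ngrams(text, n=3):
    """Set of character n-grams of the lower-cased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def similarity(a, b, n=3):
    """Jaccard overlap of character n-gram sets (a stand-in measure)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def cluster_events(texts, threshold=0.3, min_support=2):
    """Group texts whose pairwise similarity clears the threshold.

    Returns lists of text indices; clusters with fewer than
    min_support members are dropped as unimportant.
    """
    parent = list(range(len(texts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(texts)), 2):
        if similarity(texts[i], texts[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two groups

    clusters = {}
    for i in range(len(texts)):
        clusters.setdefault(find(i), []).append(i)
    return [members for members in clusters.values() if len(members) >= min_support]
```

Comparing every pair is quadratic in the number of texts, which is one reason a move from a multi-threaded to a distributed setup is a natural next step for this part of the workflow.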
Enrich and store events
● For every social media item
○ Compare to cluster
○ If above threshold
■ Attach item to event
● For every cluster
○ Get metadata (date, location)
○ Extract location names
○ Request geo-location data
● Store meta-data in semantic infrastructure
○ Keep the links to sources
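A rough sketch of the attach and extract steps is given below. It assumes a simple token-overlap score and a small hand-made gazetteer of place names; neither is confirmed as the pilot's method, and the event and post dictionaries are hypothetical structures chosen for the example. Requesting geo-location data and writing to the semantic store are left out.

```python
import re

def tokens(text):
    """Lower-cased word tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def overlap(a, b):
    """Share of the shorter text's tokens that also occur in the other text."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))

def attach_posts(events, posts, threshold=0.5):
    """Attach each post to its best-matching event if the score clears the threshold."""
    for post in posts:
        best = max(events, key=lambda ev: overlap(ev["text"], post["text"]), default=None)
        if best is not None and overlap(best["text"], post["text"]) >= threshold:
            best.setdefault("posts", []).append(post["id"])
    return events

def extract_locations(text, gazetteer):
    """Return the gazetteer names that appear as whole words in the text, sorted."""
    return sorted(
        name for name in gazetteer
        if re.search(r"\b" + re.escape(name) + r"\b", text, re.IGNORECASE)
    )

# Hypothetical event and posts for illustration.
events = [{"id": "e1",
           "text": "Shares follow oil down after Doha disappointment in Saudi Arabia and Iran"}]
posts = [{"id": "t1", "text": "Oil down after Doha"},
         {"id": "t2", "text": "Lovely weather today"}]
attach_posts(events, posts)
places = extract_locations(events[0]["text"], {"Iran", "Saudi Arabia", "Japan"})
```

A gazetteer lookup of this kind cannot tell a place from a same-named person or company, which is the sort of issue that fine-tuning location extraction would need to address.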
Many sources, many articles (example)
Baby rescued after 6 hours under quake rubble (CNN)
Oil falls on failed output freeze; Dow above 18,000 (CNBC, Reuters)
GLOBAL MARKETS - Shares follow oil down after Doha disappointment (Reuters)
U.S.-Philippines enhance military alliance, China isn't happy (CNN)
U.S. warily eyes New Year's threats in cities abroad (IBT, Reuters)
Clustering (example)
Baby rescued after 6 hours under quake rubble (CNN)
Oil falls on failed output freeze; Dow above 18,000 (CNBC, Reuters)
GLOBAL MARKETS - Shares follow oil down after Doha disappointment (Reuters)
Similarity: 0.2
Similarity: 0.8 (in same cluster)
● N-gram graphs
● Markov Clustering
● Transitive closure
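To give a feel for the n-gram graph representation named above, here is a simplified sketch. It builds a graph whose nodes are character n-grams and whose weighted edges link n-grams that occur within a small window of each other, then scores two graphs by how well their shared edge weights agree. The window size, the undirected edges and the exact normalisation are assumptions made for this example and may differ from the implementation used in the pilot.

```python
from collections import Counter

def ngram_graph(text, n=3, window=3):
    """Map each undirected (n-gram, neighbouring n-gram) edge to its co-occurrence count."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    edges = Counter()
    for i, gram in enumerate(grams):
        # Link the n-gram to the next `window` n-grams in the text.
        for j in range(i + 1, min(i + 1 + window, len(grams))):
            edges[tuple(sorted((gram, grams[j])))] += 1
    return edges

def value_similarity(g1, g2):
    """Sum of weight ratios over shared edges, normalised by the larger graph."""
    if not g1 or not g2:
        return 0.0
    shared = g1.keys() & g2.keys()
    score = sum(min(g1[e], g2[e]) / max(g1[e], g2[e]) for e in shared)
    return score / max(len(g1), len(g2))

oil = ngram_graph("Oil falls on failed output freeze")
markets = ngram_graph("Shares follow oil down after Doha disappointment")
score = value_similarity(oil, markets)  # a value between 0 and 1
```

Because the graph works on character n-grams rather than words, it needs no tokeniser or language-specific resources, which suits the variety of languages and editing levels found in news and social media text.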
Identify Events (example)
Match
No match
Enrich and store events (example)
News feed items in cluster: 5
Title: Shares follow oil down after Doha disappointment
ID: 5-88affec1f2d6a28ea9e332087a0978bc-14685
Description:
The plunge in crude oil prices took a large slice out of commodity currencies, pushing the dollar
almost 1 percent higher against its Canadian counterpart to C$1.2926 CAD=D4.
Locations extracted:
[Brazil, Britain, Europe, Hong Kong, Iran, Japan, London, Saudi Arabia, Washington]
Related Tweets (IDs):
[722035074401705984]
Locations extracted:
[Brazil : [[[35.31,25.3],[35.31,19.25],[41.09,19.25],[41.09,25.3],[35.31,25.3]]]>,
Britain: <polygon>...]
Strabon
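Strabon works with geometries written as WKT literals, so a coordinate ring like the one shown above has to be serialised before it can be stored. The helper below is a hypothetical sketch of that conversion only; it keeps each coordinate pair in the order given, since the example does not say which axis comes first, and it does not build the surrounding RDF.

```python
def polygon_to_wkt(rings):
    """Serialise a list of rings, each a list of [x, y] pairs, as a WKT POLYGON string."""
    parts = []
    for ring in rings:
        if ring[0] != ring[-1]:
            ring = ring + [ring[0]]  # WKT rings must end where they start
        parts.append("(" + ", ".join(f"{x} {y}" for x, y in ring) + ")")
    return "POLYGON (" + ", ".join(parts) + ")"

# Coordinates copied from the example above.
ring = [[35.31, 25.3], [35.31, 19.25], [41.09, 19.25], [41.09, 25.3], [35.31, 25.3]]
wkt = polygon_to_wkt([ring])
# wkt == "POLYGON ((35.31 25.3, 35.31 19.25, 41.09 19.25, 41.09 25.3, 35.31 25.3))"
```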
Summary
● We listen to what news and social media say
● We analyze and enrich
● We update the semantic infrastructure in real time
● We support
○ Discovery of interesting events
○ Location-based focus and, thus...
○ ...Validation with satellite data
Conclusion and future work
● (Social) Media data as a security resource
● Big data infrastructure
● Semantic enrichment in real time
Next steps:
● Multi-threaded to distributed
● Fine-tune location extraction
● Fine-tune clustering
● Fine-tune post (tweet) assignment
Thank you
George Giannakopoulos, George Kiomourtzis
E-mail: [email protected]
Icons from flaticon.com