1
Steve Kearns
Sr. Director, Product Management
@skearns64
Statistical, Geospatial, Temporal, Graph
Multi-Modal Analysis with the Elastic Stack
2
Mo’ Data, Mo’ Problems
Notorious B.I.G.*
3
Today’s Missions Have Complex Requirements
Critical Mission Requirements
Many users / many needs
Security / Integrity
Cross-Source Insights
Speed
Scale
Data Enrichment / Quality
Data: Complex/Diverse
Location
Machine/Log Files
User-Activity
Documents
Social
Operational Requirements
Real-Time Availability
Rapid Query Execution
Flexible Data Model
High Availability
Horizontal Scale
Simple APIs, Powerful UI
4
Elastic Cloud
Security
Monitoring
Alerting
Graph
X-Pack
Kibana: User Interface
Elasticsearch: Store, Index, & Analyze
Ingest: Logstash, Beats
+
Elastic Stack
How Do We Help?
5
Multi-Modal Analysis with the Elastic Stack
• Statistical
  ‒ Count, summarize, maybe do some math
• Temporal
  ‒ How does your data change over time?
• Geospatial
  ‒ Where are things happening? Combine with statistical, soon temporal!
• Graph
  ‒ Entirely new way of exploring your data.
Many ways to explore, navigate and discover
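As a rough illustration of how the statistical, temporal and geospatial modes can be expressed together, here is a minimal Python sketch of an Elasticsearch aggregation body. The field names `bytes` and `location` and the aggregation names are hypothetical, and `interval` assumes the date histogram syntax current when this talk was given; newer releases may expect a different parameter name. The snippet only builds and prints the body and does not contact a cluster.

```python
import json

# Hypothetical field names ("bytes", "location"); "@timestamp" is taken from
# the sample document later in the deck.
search_body = {
    "size": 0,  # return only aggregation results, no hits
    "aggs": {
        # Temporal + statistical: average of a numeric field per day.
        "over_time": {
            "date_histogram": {"field": "@timestamp", "interval": "day"},
            "aggs": {"avg_bytes": {"avg": {"field": "bytes"}}},
        },
        # Geospatial + statistical: the same average per geohash cell.
        "by_cell": {
            "geohash_grid": {"field": "location", "precision": 5},
            "aggs": {"avg_bytes": {"avg": {"field": "bytes"}}},
        },
    },
}

print(json.dumps(search_body, indent=2))
```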
6
7
8
9
10
11
New: Timelion
• Kibana Plugin
• New Expression Syntax
  ‒ Describe query, transformation(s), and visualization in one line
• Highly configurable charts
Advanced date math and visualization, without the fuss
12
13
Graph: The Origin Story
14
Data is not Flat
Much like the world
"_source": {
  "created_at": "Tuesday Mar 28 12:10:52 +0000 2016",
  "text": "Can’t wait for #HLTCON!",
  "user": {
    "name": "Steve Kearns",
    "screen_name": "skearns64",
    "location": "Boston, MA"
  },
  "hashtags": [{"text": "HLTCON"}],
  "lang": "en",
  "@timestamp": "2016-03-24T12:09:52.000Z"
}
15
Relationships live in our data
• Direct: one document references multiple entities
"user": {
  "screen_name": "skearns64",
  "location": "Boston, MA"
}
• Indirect: two or more documents share a reference
"user": {
  "screen_name": "skearns64",
  "location": "Boston, MA"
}
"user": {
  "screen_name": "imotov",
  "location": "Boston, MA"
}
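The indirect case can be sketched in a few lines of Python: group documents by a shared reference value and keep the values that more than one entity points at. This is only an illustration on the two fragments above; the helper name `shared_references` is hypothetical and is not how Elasticsearch computes relationships internally.

```python
from collections import defaultdict

# Toy documents modelled on the two "user" fragments from the slide.
docs = [
    {"user": {"screen_name": "skearns64", "location": "Boston, MA"}},
    {"user": {"screen_name": "imotov", "location": "Boston, MA"}},
]


def shared_references(documents, entity_key, reference_key):
    """Map each reference value to the entities whose documents mention it.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(set)
    for doc in documents:
        user = doc["user"]
        groups[user[reference_key]].add(user[entity_key])
    # Keep only values shared by two or more entities: the indirect links.
    return {ref: names for ref, names in groups.items() if len(names) > 1}


links = shared_references(docs, "screen_name", "location")
print(links)
```

Here the two accounts never mention each other, yet both are linked through the shared location value.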
16
What is Graph Technology Good for?
17
Fraud Detection
• Given credit card purchase histories…
  ‒ Where did people with fraudulent purchases shop most often?
  ‒ What purchasing patterns are unique to this group of suspects? New persons involved?
• Given car emissions data…
  ‒ Which car manufacturer fails emissions tests most often?
  ‒ At which shops?
18
Identifying Relationships
• Given a set of documents with extracted entities…
  ‒ What topics / entities / locations are meaningfully related?
  ‒ If I know one bad actor, can I find others?
• Given network traffic data…
  ‒ What external IPs do machines on my network talk to?
  ‒ If I know one bad actor/IP, can I find others?
19
Recommendations
• Given my purchase history…
  ‒ What am I most likely to buy next?
• Given Last.FM music preferences…
  ‒ What music do people who like Mozart also like?
  ‒ Can I use this to identify new hate groups?
• Given search and click data…
  ‒ What results do people who searched for “Belgium” tend to click on?
20
…There’s no limit to how complicated things can get, on account of one thing always leading to another…
E.B. White
American essayist, columnist, poet and editor
21
Theoretical Challenges with Graph Technology
• Zipf’s Law results in super-connected entities
• Super-connected entities make graph exploration difficult
• Graph exploration is typically done by “most frequent” connections
Dataset        Doc type  Shared reference points (concepts)  “Super-connected” values
Twitter        tweet     account ids, hashtags               #YOLO
Movielens      user      liked movie IDs                     Shawshank Redemption
LastFM         user      listened-to bands                   Coldplay, Radiohead, Beatles
Wikipedia      article   linked article id                   United States, Living People
Phone records  call      phone number                        Taxi firms
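To make the problem with "most frequent" connections concrete, the following Python sketch ranks co-occurring artists two ways on a small invented dataset: by raw count, and by a simple lift ratio (share among seed listeners divided by share among all users). The data and the `related` helper are hypothetical, and lift is used here only as a stand-in for a relevance measure; the deck does not say which scoring the product applies.

```python
from collections import Counter

# Toy listening profiles (invented data): one set of artists per user.
profiles = [
    {"Mozart", "Bach", "Coldplay"},
    {"Mozart", "Bach", "Coldplay"},
    {"Mozart", "Coldplay"},
    {"Coldplay", "Radiohead"},
    {"Coldplay", "Radiohead"},
    {"Coldplay", "Beatles"},
    {"Radiohead", "Beatles"},
    {"Coldplay"},
]


def related(profiles, seed):
    """Rank artists co-occurring with `seed` by raw count and by lift."""
    foreground = [p for p in profiles if seed in p]
    fg_counts = Counter(a for p in foreground for a in p if a != seed)
    bg_counts = Counter(a for p in profiles for a in p if a != seed)
    by_count = [artist for artist, _ in fg_counts.most_common()]
    # Lift: share among seed listeners divided by share among all users.
    lift = {
        a: (fg_counts[a] / len(foreground)) / (bg_counts[a] / len(profiles))
        for a in fg_counts
    }
    by_lift = sorted(lift, key=lift.get, reverse=True)
    return by_count, by_lift


by_count, by_lift = related(profiles, "Mozart")
print("most frequent:", by_count[0])
print("most distinctive:", by_lift[0])
```

In this toy data the super-connected artist wins on count simply because nearly everyone listens to it, while the lift ranking surfaces the artist that is specific to the seed group.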
22
Simple API that combines Search and Graph Techniques
• Simple graph-walking API
• Leverages full Elasticsearch query language
• Relevance or count-based
• Explore your existing indexes
• Distributed query execution
• Near-real-time data availability
What have we built?
23
24
Simple API that combines Search and Graph Techniques
GET /lastfm_raw/_graph/explore
{
  "query": {
    "query_string": { "query": "Mozart" }
  },
  "vertices": [
    { "field": "artists.raw" }
  ],
  "connections": {
    "vertices": [
      { "field": "artists.raw" }
    ]
  }
}
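The request above can also be assembled programmatically. The following is a minimal Python sketch that reuses the index name `lastfm_raw` and the field `artists.raw` from the slide; the helper name `build_explore_request` is hypothetical and not part of any client library. It only builds and prints the body, so sending it to the `_graph/explore` endpoint is left to whichever HTTP client you prefer.

```python
import json


def build_explore_request(seed_query, field):
    """Build a graph-explore body shaped like the one on the slide.

    Hypothetical helper for illustration; not part of any client library.
    """
    return {
        # Seed query: selects the documents the exploration starts from.
        "query": {"query_string": {"query": seed_query}},
        # First hop: terms from `field` found in the matching documents.
        "vertices": [{"field": field}],
        # Second hop: terms from `field` connected to the first-hop vertices.
        "connections": {"vertices": [{"field": field}]},
    }


body = build_explore_request("Mozart", "artists.raw")
print(json.dumps(body, indent=2))
```

Because the seed is an ordinary query, swapping `query_string` for any other query type changes where the walk starts without touching the vertex definitions.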
25
Simple UI to Explore Your Data in New Ways