Visualizing Data in Elasticsearch DevFest DC 2016

Transcript of Visualizing Data in Elasticsearch DevFest DC 2016

Page 1: Visualizing Data in Elasticsearch DevFest DC 2016

Search, Time Series, and Graph Analysis in the Cloud

Dave Erickson

[email protected]

Visualizing Data in Elasticsearch

Page 2: Visualizing Data in Elasticsearch DevFest DC 2016

Dave Erickson – Developer

• Biotech

• Electronic Archives & Libraries

• Geospatial

• Healthcare

• Air Traffic Control

• Financial Services

Page 3: Visualizing Data in Elasticsearch DevFest DC 2016

Page 4: Visualizing Data in Elasticsearch DevFest DC 2016

Elastic Stack: Real Time Search & Analytics at Scale

Elastic Cloud

Security

X-Pack

Kibana: User Interface

Elasticsearch: Store, Index, & Analyze

Ingest: Logstash, Beats

+

Alerting

Monitoring

Reporting

Graph

Page 5: Visualizing Data in Elasticsearch DevFest DC 2016

Page 6: Visualizing Data in Elasticsearch DevFest DC 2016

Visualization is Important

https://www.reddit.com/r/dataisugly/

Page 7: Visualizing Data in Elasticsearch DevFest DC 2016

Visualization in the Cloud

• Qualities We Want:
  ‒ Parallel
  ‒ Highly Available
  ‒ Platform Independent
  ‒ Multi-tenancy
  ‒ Extensible

• Use Cases:
  ‒ Search, Discovery, & Analytics
  ‒ Metrics & Time Series Data
  ‒ Structured & Unstructured
  ‒ Security Analytics

Page 8: Visualizing Data in Elasticsearch DevFest DC 2016

Wait …

Why would you use a search engine for analytics?

Page 9: Visualizing Data in Elasticsearch DevFest DC 2016

Search indexes have been around for a long time

Page 10: Visualizing Data in Elasticsearch DevFest DC 2016

Scaled, distributed search indexes have been around for a long time

Page 11: Visualizing Data in Elasticsearch DevFest DC 2016

Electronic search engines have been around for a long time

1928 – patent application by Emanuel Goldberg for a “Statistical Machine”

http://www.google.com/patents/US1838389

Basically an optical version of grep that predates almost everything

Page 12: Visualizing Data in Elasticsearch DevFest DC 2016

Timeline, in no way complete

• 7th Century B.C.E. ? – library catalogs
• 1928 – Goldberg “Statistical Machine”
  – Optical search on microfilm
• 1945 – Vannevar Bush “microfilm rapid selector”; “Memex”
• 1960s – SMART Information Retrieval System (Cornell U.)
• 1974 – grep first appears in Unix v4
• 1990s – WWW search engines
• 1999 – Doug Cutting Lucene search indexer

Page 13: Visualizing Data in Elasticsearch DevFest DC 2016

Inverted Indexes

• Pay the cost at indexing time (insertion time)

• Reap the benefits at retrieval time

Document (1): “the quick brown fox”
Document (2): “brown fox in the forest”
Document (3): “brown bear”

Term     Postings List   Statistics (count)
quick    1               1
brown    1, 2, 3         3
fox      1, 2            2
forest   2               1
bear     3               1
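
The table above can be rebuilt with a few lines of code. This is only a minimal sketch: the stop-word list (`the`, `in`) is an assumption inferred from the words the slide's table leaves out, the name `build_inverted_index` is hypothetical, and none of this reflects how Lucene actually stores postings.

```python
from collections import defaultdict

# Documents from the slide, keyed by document id.
DOCS = {
    1: "the quick brown fox",
    2: "brown fox in the forest",
    3: "brown bear",
}

# Assumed stop words: the slide's table omits "the" and "in".
STOP_WORDS = {"the", "in"}


def build_inverted_index(docs):
    """Map each term to the sorted ids of the documents that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            if term not in STOP_WORDS:
                postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


index = build_inverted_index(DOCS)
for term, ids in index.items():
    # The count column is the length of the postings list.
    print(term, ids, len(ids))
```

All of the tokenizing and sorting happens once, when documents are inserted, which is the "pay the cost at indexing time" half of the trade-off.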

Page 14: Visualizing Data in Elasticsearch DevFest DC 2016

Pretty Good At Retrieval

Find documents mentioning “foxes” ?

Document (1): “the quick brown fox”
Document (2): “brown fox in the forest”
Document (3): “brown bear”

Term     Postings List   Statistics (count)
quick    1               1
brown    1, 2, 3         3
fox      1, 2            2
forest   2               1
bear     3               1
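
The query says “foxes” while the index holds “fox”. One plausible reading of the slide is that the query term gets normalized the same way the indexed text was. A minimal sketch of that idea, assuming a toy plural stripper (`naive_stem` is hypothetical and stands in for a real stemming algorithm):

```python
# Postings lists copied from the slide's table.
INDEX = {
    "quick": [1],
    "brown": [1, 2, 3],
    "fox": [1, 2],
    "forest": [2],
    "bear": [3],
}


def naive_stem(term):
    """Toy plural stripper; an assumption, not a real stemmer."""
    term = term.lower()
    # "foxes" -> "fox", "matches" -> "match", and similar "-es" plurals.
    for suffix in ("xes", "ches", "shes", "sses"):
        if term.endswith(suffix):
            return term[:-2]
    # "bears" -> "bear", but leave words such as "glass" alone.
    if term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term


def search(term):
    """Return the ids of documents whose index entry matches the stemmed term."""
    return INDEX.get(naive_stem(term), [])


print(search("foxes"))  # [1, 2]
```

The lookup itself is a single dictionary access; the documents are never rescanned.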

Page 15: Visualizing Data in Elasticsearch DevFest DC 2016

Excellent at Search

Find documents mentioning “quick” AND “fox” ?

Document (1): “the quick brown fox”
Document (2): “brown fox in the forest”
Document (3): “brown bear”

Term     Postings List   Statistics (count)
quick    1               1
brown    1, 2, 3         3
fox      1, 2            2
forest   2               1
bear     3               1

intersection
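
An AND query comes down to intersecting two postings lists. The sketch below uses the classic two-pointer merge over sorted lists; the function name `intersect` is hypothetical, and real engines add refinements this omits.

```python
# Postings lists copied from the slide's table (sorted by document id).
INDEX = {
    "quick": [1],
    "brown": [1, 2, 3],
    "fox": [1, 2],
    "forest": [2],
    "bear": [3],
}


def intersect(a, b):
    """Return the ids present in both sorted postings lists."""
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # advance whichever list is behind
        else:
            j += 1
    return result


print(intersect(INDEX["quick"], INDEX["fox"]))  # [1]
```

Because both lists are already sorted, the merge touches each entry at most once.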

Page 16: Visualizing Data in Elasticsearch DevFest DC 2016

Excellent at Real Time Analytics

What was the most commonly mentioned term?

Document (1): “the quick brown fox”
Document (2): “brown fox in the forest”
Document (3): “brown bear”

Term     Postings List   Statistics (count)
quick    1               1
brown    1, 2, 3         3
fox      1, 2            2
forest   2               1
bear     3               1
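
The answer can be read straight off the statistics column without touching the documents. A minimal sketch, assuming the counts are held in a plain dictionary (`top_terms` is a hypothetical helper, not an Elasticsearch API):

```python
# Statistics (count) column copied from the slide's table.
STATS = {"quick": 1, "brown": 3, "fox": 2, "forest": 1, "bear": 1}


def top_terms(stats, n=1):
    """Return the n terms with the highest counts, ties broken alphabetically."""
    return sorted(stats.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


print(top_terms(STATS))     # [('brown', 3)]
print(top_terms(STATS, 2))  # [('brown', 3), ('fox', 2)]
```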

Page 17: Visualizing Data in Elasticsearch DevFest DC 2016

Histogram about the mention of foxes over time:

Document (1): “the quick brown fox”
Document (2): “brown fox in the forest”
Document (3): “brown bear”

Term     Postings List   Statistics (count)
quick    1               1
brown    1, 2, 3         3
fox      1, 2            2
forest   2               1
bear     3               1

?

Page 18: Visualizing Data in Elasticsearch DevFest DC 2016

Columnar Indexes

Document (1)
  text: “the quick brown fox”
  date: Monday
Document (2)
  text: “brown fox in the forest”
  date: Tuesday
Document (3)
  text: “brown bear”
  date: Monday

Doc id   Date
1        Monday
2        Tuesday
3        Monday

Term     Postings List   Statistics (count)
quick    1               1
brown    1, 2, 3         3
fox      1, 2            2
forest   2               1
bear     3               1
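
With both structures in hand, the histogram becomes a two-step lookup: the inverted index finds the matching documents, and the column maps each document id to its date. A minimal sketch under that reading of the slide (the names `POSTINGS`, `DATE_COLUMN`, and `histogram` are hypothetical):

```python
from collections import Counter

# Inverted index: term -> ids of the documents that contain it.
POSTINGS = {
    "quick": [1],
    "brown": [1, 2, 3],
    "fox": [1, 2],
    "forest": [2],
    "bear": [3],
}

# Columnar index: document id -> value of the date field.
DATE_COLUMN = {1: "Monday", 2: "Tuesday", 3: "Monday"}


def histogram(term):
    """Count matching documents per date for a single term."""
    return Counter(DATE_COLUMN[doc_id] for doc_id in POSTINGS.get(term, []))


print(histogram("fox"))
print(histogram("brown"))
```

For “fox” each day gets one document; for “brown” Monday gets two and Tuesday one.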

Page 19: Visualizing Data in Elasticsearch DevFest DC 2016

Now do it in parallel

• Distributed

• Non-blocking

• Read / Write

• Commodity hardware

• Fault-tolerance

• High Availability
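
One way to picture "in parallel" is scatter-gather: each shard counts terms over its own documents, and a coordinator merges the partial results. The sketch below is an illustration only. The split of documents across two shards, the stop-word list, and the names `count_terms` and `scatter_gather` are all assumptions, and threads in one process stand in for separate nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard layout: the slide's three documents split across two shards.
SHARDS = [
    {1: "the quick brown fox", 3: "brown bear"},
    {2: "brown fox in the forest"},
]

# Assumed stop words, matching the words the earlier tables leave out.
STOP_WORDS = {"the", "in"}


def count_terms(shard):
    """Per-shard step: count how many local documents mention each term."""
    counts = Counter()
    for text in shard.values():
        counts.update(set(text.lower().split()) - STOP_WORDS)
    return counts


def scatter_gather(shards):
    """Coordinator step: run every shard concurrently, then merge the counts."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(count_terms, shards))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


totals = scatter_gather(SHARDS)
print(totals.most_common(1))  # [('brown', 3)]
```

The merged counts match the single-index table, which is what lets the work be spread across commodity machines.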

Page 20: Visualizing Data in Elasticsearch DevFest DC 2016

Use Cases

Page 21: Visualizing Data in Elasticsearch DevFest DC 2016

Quick Live Demo

Page 22: Visualizing Data in Elasticsearch DevFest DC 2016

Page 23: Visualizing Data in Elasticsearch DevFest DC 2016

Page 24: Visualizing Data in Elasticsearch DevFest DC 2016

Page 25: Visualizing Data in Elasticsearch DevFest DC 2016

Page 26: Visualizing Data in Elasticsearch DevFest DC 2016

Thank You

[email protected]