Search and Analytics (using Elasticsearch)

Costin Leau

Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited

Slides from Costin Leau's talk on Search and Analytics (using Elasticsearch) at the 18th Big Data London meetup

Transcript of Search and Analytics (using Elasticsearch)

Page 1: Search and Analytics (using Elasticsearch)

Search and Analytics

(using Elasticsearch)

Costin Leau

Page 2: Search and Analytics (using Elasticsearch)

Why search?

Page 3: Search and Analytics (using Elasticsearch)

Search – what’s the big deal?

Basic/Metadata retrieval

“Find banks with more than (x) accounts”
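
A metadata lookup like the one quoted above maps naturally onto a range query in Elasticsearch's JSON query DSL. The following is a minimal sketch, not taken from the talk: the field name `account_count` and the threshold are hypothetical, and the function only builds the request body without contacting a cluster.

```python
import json


def banks_with_more_than(min_accounts, field="account_count"):
    """Build a search body matching documents whose `field` exceeds min_accounts.

    `account_count` is a hypothetical field name used only for illustration.
    """
    # "gt" is strictly greater than; "gte" would also include the threshold.
    return {"query": {"range": {field: {"gt": min_accounts}}}}


body = banks_with_more_than(1000)
print(json.dumps(body, indent=2))
```

The body would typically be sent to an index's `_search` endpoint; the index name depends on how the bank documents were stored.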

Page 4: Search and Analytics (using Elasticsearch)

Search – what’s the big deal?

Basic/Metadata retrieval

“Find banks near my location”
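
A "near my location" search can be sketched as a `geo_distance` filter. This is an illustration under assumptions, not the speaker's example: the field name `location` is hypothetical and is assumed to be mapped as a geo point, the coordinates and radius are arbitrary, and the `bool`/`filter` wrapper follows recent Elasticsearch releases (versions contemporary with the talk wrapped filters differently).

```python
import json


def banks_near(lat, lon, distance="5km", field="location"):
    """Build a search body keeping only documents within `distance` of a point.

    `location` is a hypothetical geo point field used only for illustration.
    """
    return {
        "query": {
            "bool": {
                # Filter context: a document is either inside the radius or not;
                # no relevance score is computed for this clause.
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


body = banks_near(51.5074, -0.1278)
print(json.dumps(body, indent=2))
```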

Page 5: Search and Analytics (using Elasticsearch)

Search – What we’re all about

Page 7: Search and Analytics (using Elasticsearch)

Search categories

Basic/Metadata retrieval

Full-text search

Highlighting

Geolocation

Fuzzy search (“did-you-mean”)

Natural Language

data stores

search engines
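
Three of the categories listed above (full-text search, highlighting, and typo-tolerant matching) can be combined in a single request body. The sketch below is an assumption-laden illustration: the field name `description` and the misspelled query text are hypothetical. It uses a `match` query with `fuzziness`, which tolerates typos in the query; it is not the separate suggestion feature that a literal "did-you-mean" prompt would use.

```python
import json


def fuzzy_highlighted_search(text, field="description"):
    """Build a full-text search body with typo tolerance and highlighted hits.

    `description` is a hypothetical text field used only for illustration.
    """
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    # Let the engine pick an edit distance based on term length.
                    "fuzziness": "AUTO",
                }
            }
        },
        # Ask for matched fragments of the same field to be returned marked up.
        "highlight": {"fields": {field: {}}},
    }


body = fuzzy_highlighted_search("savngs account")
print(json.dumps(body, indent=2))
```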

Page 8: Search and Analytics (using Elasticsearch)

‘Players’ in the search market

Search engines

- Google/Bing/Yahoo!/Ask.com/Yandex/Baidu

Open-Source

- Sphinx

- Apache Lucene

- Elasticsearch

- Solr

- Sensei

Enterprise Search

- Oracle Endeca / MDEX

- HP Autonomy

- Exalead

- IBM Enterprise Search

Page 9: Search and Analytics (using Elasticsearch)

Elasticsearch

Page 12: Search and Analytics (using Elasticsearch)

Elasticsearch

Open-Source Search & Analytics engine

- Structured & Unstructured Data

- Real Time

- Analytics capabilities (facets)

- REST based

Distributed

- Designed for the Cloud

- Designed for Big Data

Lightweight

Popular: >200K downloads/month
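
The "REST based" and "Analytics capabilities (facets)" points above can be sketched together as an HTTP request carrying a terms facet. This is an illustration under assumptions: the host, the index name `banks`, and the field `city` are hypothetical, and the request is only constructed, never sent. The `facets` body reflects the API of the talk's era; later Elasticsearch releases replaced facets with aggregations.

```python
import json
from urllib.request import Request


def terms_facet_request(base_url, index, field, size=10):
    """Build (but do not send) a search request that counts the top terms of a field."""
    body = {
        "query": {"match_all": {}},
        # Facet name is arbitrary; it labels the counts in the response.
        "facets": {
            f"top_{field}": {"terms": {"field": field, "size": size}}
        },
    }
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, index, and field names.
req = terms_facet_request("http://localhost:9200", "banks", "city")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would execute it against a running node.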

Page 13: Search and Analytics (using Elasticsearch)

Users

Page 15: Search and Analytics (using Elasticsearch)

Platform Adoption

http://www.thoughtworks.com/radar#platforms 2013

Page 18: Search and Analytics (using Elasticsearch)

Use Case - Geolocation

Searches 50,000,000 venues every day using Elasticsearch

Page 19: Search and Analytics (using Elasticsearch)

Use Case – Support/Reporting

Page 20: Search and Analytics (using Elasticsearch)

Use Case - Centralized Logging
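
For a centralized logging setup, log events are usually shipped in batches through the bulk API, which takes newline-delimited JSON: one action line followed by one document line per event. The sketch below is illustrative only. The index name `logs` and the event fields are hypothetical, the slide does not say which shipping tool was used, and releases contemporary with the talk also expected a `_type` entry in the action line.

```python
import json


def to_bulk_payload(log_events, index="logs"):
    """Serialize log events into a bulk API payload (newline-delimited JSON)."""
    lines = []
    for event in log_events:
        # Action line: index the document that follows into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the log event itself.
        lines.append(json.dumps(event))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical log events.
events = [
    {"level": "ERROR", "message": "connection refused"},
    {"level": "INFO", "message": "retry succeeded"},
]
print(to_bulk_payload(events), end="")
```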

Page 21: Search and Analytics (using Elasticsearch)

Use Case - Pure Analytics

Page 22: Search and Analytics (using Elasticsearch)

Search and Big Data

Page 23: Search and Analytics (using Elasticsearch)

A Holistic View of a Big Data System

[Diagram components: ETL; Real Time Streams; Unstructured Data (HDFS); RT Semi-structured Database (HBase, Cassandra, Mongo); Big SQL (Greenplum, AsterData, etc.); Batch Processing; Real-Time Processing (S4, Storm); Analytics]

Page 25: Search and Analytics (using Elasticsearch)

Hadoop eco-system

Hadoop Distributed File System (HDFS)

Map Reduce Framework (MapRed)

Page 27: Search and Analytics (using Elasticsearch)

Elasticsearch + Hadoop

[Two bar charts, "Writing" and "Reading / Querying", each comparing Raw vs. w/ ES for M/R, Pig, and Hive; axis 0 to 60]

Page 28: Search and Analytics (using Elasticsearch)

Explore data through

(Elastic)Search

Page 29: Search and Analytics (using Elasticsearch)

Thank you! @costinl

http://www.elasticsearch.org/