Elasticsearch 5 in Amazon Elasticsearch Service
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Elasticsearch 5 in Amazon Elasticsearch Service
Darin Briskman, Amazon Web Services Technical Evangelist
[email protected] or @briskmad
15 Feb 2017
Jon Handler, AWS Principal Solutions Architect
[email protected] or @_searchgeek
Get started at https://aws.amazon.com/elasticsearch-service/
Amazon Search Services
Amazon CloudSearch
Amazon Elasticsearch Service
Amazon Elasticsearch Service
• Open source distributed index
• Managed service using Elasticsearch and Kibana
• Fully managed; zero admin
• Highly available and reliable
• RESTful API for easy integration
Amazon Elasticsearch Service Leading Use Cases
Log Analytics & Operational Monitoring
• Monitor the performance of applications, web servers, and hardware
• Easy to use, powerful data visualization tools to detect issues quickly
• Dig into logs in an intuitive, fine-grained way
• Kibana provides fast, easy visualization
Search
• Application or website provides search capabilities over diverse documents
• Tasked with making this knowledge base searchable and accessible
• Text matching, faceting, filtering, fuzzy search, auto complete, highlighting, and other search features
• Query API to support application search
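The search features listed above can be combined in a single request body. The following is a minimal sketch, not the presenters' code: the field names "title" and "category" and the helper name are hypothetical, and the body is meant for POST <index>/_search.

```python
def build_search_body(text, category=None, size=10):
    """Build a query body for POST <index>/_search.

    The field names "title" and "category" are hypothetical.
    """
    bool_query = {
        # Fuzzy full-text match tolerates small typos in the user's input.
        "must": [{"match": {"title": {"query": text, "fuzziness": "AUTO"}}}]
    }
    if category is not None:
        # Filters narrow the result set without affecting relevance scores.
        bool_query["filter"] = [{"term": {"category": category}}]
    return {
        "size": size,
        "query": {"bool": bool_query},
        # A terms aggregation returns per-category counts (facets).
        "aggs": {"by_category": {"terms": {"field": "category"}}},
        # Highlighting returns the matching fragments of the title field.
        "highlight": {"fields": {"title": {}}},
    }
```

Calling build_search_body("elasticsarch", category="docs") yields a body that matches despite the typo, restricts results to one category, and still reports counts per category.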
Leading enterprises trust Amazon Elasticsearch Service for their search and analytics applications
Media & Entertainment, Online Services, Technology, Other
Adobe Developer Platform (Adobe I/O)
PROBLEM
• Cost-effective monitoring for a very large volume of log data
• Over 200,000 API calls per second at peak: destinations, response times, bandwidth
• Integrate seamlessly with other components of the AWS ecosystem
SOLUTION
• Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using Kibana on Amazon Elasticsearch Service
• Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges
BENEFITS
• Management and operational simplicity
• Flexibility to try out different cluster configurations during dev and test
[Diagram: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service]
McGraw Hill Education
PROBLEM
• Supporting a wide catalog across multiple services in multiple jurisdictions
• Over 100 million learning events each month
• Tests, quizzes, learning modules begun / completed / abandoned
SOLUTION
• Search and analyze test results, student/teacher interaction, teacher effectiveness, student progress
• Analytics of applications and infrastructure are now integrated to understand operations in real time
BENEFITS
• Confidence to scale throughout the school year: from 0 to 32 TB in 9 months
• Focus on their business, not their infrastructure
Amazon Elasticsearch Service Benefits
Easy to Use
• Deploy a production-ready Elasticsearch cluster in minutes
• Simplifies time-consuming management tasks such as software patching, failure recovery, backups, and monitoring
Open
• Get direct access to the Elasticsearch open-source API
• Fully compatible with the open source Elasticsearch API, for all code and applications
Secure
• Secure Elasticsearch clusters with AWS Identity and Access Management (IAM) policies with fine-grained access control for users and endpoints
• Automatically applies security patches without disruption, keeping Elasticsearch environments secure
Available
• Provides high availability using Zone Awareness, which replicates data between two Availability Zones
• Monitors the health of clusters and automatically replaces failed nodes, without service disruption
AWS Integrated
• Integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion
• AWS CloudTrail for auditing, AWS Identity and Access Management (IAM) for security, and AWS CloudFormation for cloud orchestration
Scalable
• Scale clusters from a single node up to 20 nodes
• Configure clusters to meet performance requirements by selecting from a range of instance types and storage options including SSD-powered EBS volumes
Easy to use and scalable
AWS SDK
AWS CLI
AWS CloudFormation
Elastic Load Balancing
AWS IAM
Amazon CloudWatch
AWS CloudTrail
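As an illustration of the AWS SDK route, the sketch below builds the keyword arguments for boto3's create_elasticsearch_domain call. The helper name, the default instance type, and the volume size are assumptions chosen for the example; the 20-node ceiling is the limit quoted in this deck, and the even-instance-count rule for zone awareness is assumed to apply as it did when the talk was given.

```python
def build_domain_request(domain_name, instance_type="m4.large.elasticsearch",
                         instance_count=2, volume_gib=20):
    """Build keyword arguments for boto3's es.create_elasticsearch_domain."""
    if not 1 <= instance_count <= 20:
        # The deck states clusters scale from a single node up to 20 nodes.
        raise ValueError("instance_count must be between 1 and 20")
    return {
        "DomainName": domain_name,
        "ElasticsearchVersion": "5.1",
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            # Assumption: zone awareness needs an even number of instances.
            "ZoneAwarenessEnabled": instance_count % 2 == 0,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp2",
            "VolumeSize": volume_gib,
        },
    }


# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   boto3.client("es").create_elasticsearch_domain(**build_domain_request("logs"))
```

Keeping the request as plain data makes it easy to review or reuse for the AWS CLI or CloudFormation equivalents.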
Open
• Drop-in replacement
• Zero-change, no-risk migration to or from open source Elasticsearch
Secure
• Control access based on originating IP or Principal
• Mix policies to provide application access and Kibana access
• Use IAM roles to provide access for other services
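A mixed policy of the kind described above might look like the following sketch. It pairs one statement keyed on an IAM principal (for an application that signs its requests) with one keyed on source IP (for browser access to Kibana). The function name and all ARNs and CIDR ranges passed to it are hypothetical.

```python
def build_access_policy(domain_arn, role_arn, allowed_cidrs):
    """Combine a principal-based and an IP-based statement in one policy."""
    resource = domain_arn + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Application access: signed requests from one IAM role.
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "es:ESHttp*",
                "Resource": resource,
            },
            {
                # Kibana access: unsigned requests from known IP ranges.
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                "Resource": resource,
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            },
        ],
    }
```

Serialising the result with json.dumps gives a document that can be supplied as the domain's access policy.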
Available
[Diagram: an Amazon Elasticsearch Service cluster of four instances (Instance 1 to Instance 4) spread across Availability Zone 1 and Availability Zone 2, with copies of shards 1, 2, and 3 distributed among the instances]
AWS integrated
[Diagram: sources and services that feed Amazon Elasticsearch Service: Logstash, REST, CWL Agent, EC2 instances, Amazon ECS, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS queue, Logstash cluster, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, access logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose]
Dedicated master nodes improve stability
[Diagram: an Amazon ES cluster with three data instances (Instance 1 to Instance 3) holding copies of shards 1, 2, and 3, plus separate dedicated master nodes. Data nodes handle queries and updates.]
Firehose delivery architecture with transformations
[Diagram: a data source sends source records to a Firehose delivery stream; transformed records are delivered to Amazon Elasticsearch Service; source records go to a backup S3 bucket; an intermediate Amazon S3 bucket receives transformation failures and delivery failures]
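The transformation step in this architecture is an AWS Lambda function that receives a batch of base64-encoded records and returns each one with a result of "Ok", "Dropped", or "ProcessingFailed". The sketch below follows that record contract; the transformation itself (casting a "status" field to an integer) is a made-up example.

```python
import base64
import json


def handler(event, context):
    """Transform each Firehose record and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            doc = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: normalise one field before indexing.
            doc["status"] = int(doc["status"])
            data = base64.b64encode(json.dumps(doc).encode("utf-8")).decode("ascii")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except (ValueError, KeyError, TypeError):
            # Failed records are returned unchanged and flagged, so Firehose
            # can route them to the failure location instead of the domain.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming recordId must appear in the response, which is why failures are flagged rather than skipped.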
Repository Search
• File metadata and possibly file contents for traditional search
• Lambda to keep the repository current
• Good for up to ~60TB of metadata/source data (current limits)
See also: Indexing S3 Metadata blog post by Amit Sharma
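One way a Lambda function could keep the repository current is to translate S3 event notifications into index and delete actions. This is a sketch under assumptions, not the approach from the referenced blog post: the index name, the document fields, and the action format are all hypothetical, and sending the actions to the domain is left out.

```python
from urllib.parse import unquote_plus


def s3_event_to_actions(event, index="s3-metadata"):
    """Turn S3 event notification records into index/delete actions."""
    actions = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        doc_id = bucket + "/" + key
        if record["eventName"].startswith("ObjectRemoved"):
            actions.append({"op": "delete", "index": index, "id": doc_id})
        else:
            doc = {
                "bucket": bucket,
                "key": key,
                "size": record["s3"]["object"].get("size"),
                "event_time": record.get("eventTime"),
            }
            actions.append({"op": "index", "index": index, "id": doc_id, "doc": doc})
    return actions
```

Using bucket plus key as the document ID means a re-uploaded object overwrites its earlier metadata instead of creating a duplicate.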
Amazon Elasticsearch Service support for Elasticsearch 5
What to do with a terabyte of logs?
Visualize it with Kibana 5!
Scripting with Amazon Elasticsearch Service
Scripting is fully supported using the Painless language. With scripts you can
• Change the precedence of search results
• Delete index fields by query
• Modify search results to return specific fields
• Alter elements in a field
Painless is explicitly designed for Elasticsearch and is both performant and secure.
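As one example of the uses listed above, deleting a field by query can be expressed as an _update_by_query request carrying a Painless script. The helper below is an illustrative sketch; its name is hypothetical, and it only builds the request rather than sending it.

```python
def build_remove_field_request(index, field):
    """Return (method, path, body) that strips `field` from matching documents."""
    body = {
        # Only touch documents that actually contain the field.
        "query": {"exists": {"field": field}},
        "script": {
            "lang": "painless",
            # Elasticsearch 5.1 names the script text "inline"; later
            # releases renamed this key to "source".
            "inline": "ctx._source.remove(params.field)",
            "params": {"field": field},
        },
    }
    return "POST", "/" + index + "/_update_by_query", body
```

Passing the field name through params, rather than building it into the script text, lets Elasticsearch reuse the compiled script across calls.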
Ingest Pipelines and Processors
When you index documents, you can specify a pipeline. The pipeline can have a series of processors that pre-process the data before indexing. Twenty processors are available; some are simple:
{ "append": { "field": "field1", "value": ["item2", "item3", "item4"] } }
Others are more complex, like the Grok processor for regex with aliased expressions.
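To show how processors fit together, this sketch builds a pipeline definition that chains a Grok processor with the append processor from the slide, plus the path used to index a document through it. The pipeline contents, the "message" field, and the helper names are assumptions for illustration.

```python
def build_pipeline(description="parse web logs"):
    """Body for PUT _ingest/pipeline/<pipeline_id>."""
    return {
        "description": description,
        "processors": [
            # Grok extracts named fields from the raw Apache log line.
            {"grok": {"field": "message", "patterns": ["%{COMBINEDAPACHELOG}"]}},
            # Append adds values to an array field, as in the slide's example.
            {"append": {"field": "field1", "value": ["item2", "item3", "item4"]}},
        ],
    }


def index_path(index, doc_type, pipeline_id):
    """Path for indexing one document through the pipeline."""
    return "/{}/{}?pipeline={}".format(index, doc_type, pipeline_id)
```

Processors run in the order listed, so the Grok step sees the raw document before the append step modifies it.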
Lots of New Elasticsearch APIs
/_alias
/_aliases
/_all
/_analyze
/_bulk
/_cache/clear (Index only)
/_cat
/_cluster/allocation/explain
/_cluster/health
/_cluster/pending_tasks
/_cluster/settings (PUT only): indices.breaker.fielddata.limit, indices.breaker.request.limit, indices.breaker.total.limit
/_cluster/state
/_cluster/stats
/_count
/_delete_by_query*
/_explain
/_field_stats
/_flush
/_forcemerge (Index only)
/_mapping
/_mget
/_msearch
/_mtermvectors
/_nodes
/_plugin/kibana
/_recovery (Index only)
/_refresh
/_reindex*
/_rollover
/_search
/_search profile
/_segments (Index only)
/_shard_stores
/_shrink
/_snapshot
/_stats
/_status
/_tasks
/_template
/_termvectors
/_update_by_query*
/_validate
Shrink and Rollover
Shrink an index to a single shard:
POST source_index/_shrink/target_index
Very useful for time-series indexes once ingestion is done!
Roll over an index based on number of documents:
POST logs_index/_rollover
{ "conditions": { "max_docs": 100000 } }
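The two calls above can be sketched as request builders. Shrinking has prerequisites in Elasticsearch 5: the source index must block writes and have a copy of every shard on one node, so the sketch includes that preparation step. The helper names and the node name argument are hypothetical, and whether a managed domain lets you pin allocation to a node name is an assumption to verify.

```python
def shrink_requests(source, target, node_name):
    """Requests to prepare `source` and shrink it into a one-shard `target`."""
    prepare = {
        "settings": {
            # Shrinking requires a copy of every shard on a single node...
            "index.routing.allocation.require._name": node_name,
            # ...and the source index must block writes.
            "index.blocks.write": True,
        }
    }
    shrink = {"settings": {"index.number_of_shards": 1}}
    return [
        ("PUT", "/{}/_settings".format(source), prepare),
        ("POST", "/{}/_shrink/{}".format(source, target), shrink),
    ]


def rollover_request(alias, max_docs=None, max_age=None):
    """Request that rolls `alias` over to a new index when a condition is met."""
    conditions = {}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    if max_age is not None:
        conditions["max_age"] = max_age
    if not conditions:
        raise ValueError("at least one rollover condition is required")
    return ("POST", "/{}/_rollover".format(alias), {"conditions": conditions})
```

Rollover acts on an alias that points at the current write index, so logs_index in the slide is assumed to be such an alias.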
Supported Elasticsearch 5 Plugins
• Smart Chinese Analysis plugin
• Stempel Polish Analysis plugin
• Ingest Processor Attachment plugin
• Ingest Geoip Processor plugin
• Ingest User Agent Processor plugin
• Mapper Murmur3 plugin
中文 Polskie
Testing Ingest Performance
• Load generator: m4.large, single process, single thread
• Amazon Elasticsearch Service: 1 instance, 1 primary, no replicas, EBS gp2 storage
• Data: 1.8 million Apache web log lines, comprising 196 MB
• _bulk API calls with 10K lines per call
• Monitoring data gathered from the load generator process and from the Amazon Elasticsearch Service domain
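A load generator along these lines could batch log lines into _bulk request bodies as sketched below. This is not the presenters' test harness: the function name, the "message" field, and the default type name are assumptions, and sending each body to the domain is omitted.

```python
import json


def bulk_bodies(lines, index, doc_type="log", batch_size=10000):
    """Yield newline-delimited _bulk request bodies of up to batch_size docs."""
    batch = []
    for line in lines:
        # Each document is preceded by an action line naming its destination.
        batch.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        batch.append(json.dumps({"message": line.rstrip("\n")}))
        if len(batch) >= 2 * batch_size:
            yield "\n".join(batch) + "\n"
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield "\n".join(batch) + "\n"
```

Each yielded string is one _bulk payload; the trailing newline is required by the API.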
Ingest Performance Test Results

Amazon Elasticsearch Service with v2.3 Engine
Instance      Avg Index   Docs/sec
m3.medium     3.93 ms     2811
m3.2xlarge    11.83 ms    3966
r3.large      8.87 ms     3932
r3.8xlarge    10.58 ms    4404
i2.2xlarge    11.2 ms     5305

Amazon Elasticsearch Service with v5.1 Engine
Instance      Avg Index   Docs/sec
m3.medium     3.12 ms     3629
m3.2xlarge    11.1 ms     5816
r3.large      8.76 ms     7221
r3.8xlarge    9.59 ms     7726
i2.2xlarge    10.3 ms     9676

Up to 82% more documents per second!
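The throughput gain per instance type can be recomputed from the Docs/sec columns. The sketch below simply applies (after - before) / before to the figures quoted in the results; the variable and function names are illustrative.

```python
# Docs/sec figures copied from the v2.3 and v5.1 results.
V23 = {"m3.medium": 2811, "m3.2xlarge": 3966, "r3.large": 3932,
       "r3.8xlarge": 4404, "i2.2xlarge": 5305}
V51 = {"m3.medium": 3629, "m3.2xlarge": 5816, "r3.large": 7221,
       "r3.8xlarge": 7726, "i2.2xlarge": 9676}


def percent_gain(before, after):
    """Relative increase in throughput, as a percentage."""
    return (after - before) / before * 100.0


GAINS = {name: percent_gain(V23[name], V51[name]) for name in V23}

for name in sorted(GAINS, key=GAINS.get):
    print("{:<12} {:5.1f}% more docs/sec".format(name, GAINS[name]))
```

By this arithmetic the i2.2xlarge row works out to roughly 82%, in line with the headline figure, the r3.large row comes out slightly higher, and the smallest gain is on m3.medium.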
Migrating from v2.3 to v5.1
The easy way:
1. Create a new Amazon Elasticsearch Service v5.1 cluster
2. Snapshot your v2.3 indexes
3. Restore the indexes to the v5.1 cluster
… but this won’t get most of the benefits of v5.1
There are many breaking changes in v5, documented at https://www.elastic.co/guide/en/elasticsearch/reference/5.1/breaking-changes.html
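The snapshot and restore steps of the easy way could be sketched as the following sequence of requests, using a manual snapshot repository in Amazon S3. The repository name, bucket, region, and role ARN are placeholders, the helper is hypothetical, and requests to the domain would need to be sent with whatever authentication its access policy requires.

```python
def migration_requests(repo, snapshot, bucket, region, role_arn, indices):
    """Requests to run against the v2.3 (source) and v5.1 (target) domains."""
    register = {
        "type": "s3",
        "settings": {"bucket": bucket, "region": region, "role_arn": role_arn},
    }
    repo_path = "/_snapshot/{}".format(repo)
    snap_path = "{}/{}".format(repo_path, snapshot)
    return {
        "source": [
            # Register the S3 repository, then snapshot the v2.3 indexes.
            ("PUT", repo_path, register),
            ("PUT", snap_path, {"indices": indices}),
        ],
        "target": [
            # Register the same repository on the v5.1 domain, then restore.
            ("PUT", repo_path, register),
            ("POST", snap_path + "/_restore", {"indices": indices}),
        ],
    }
```

Naming the indexes explicitly, rather than restoring everything, avoids clashes with indexes that already exist on the new domain.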
Three Things to Remember
• Amazon Elasticsearch Service is a drop-in replacement for new and existing Elasticsearch workloads
• Deploy, manage, and scale Elasticsearch more easily in the AWS cloud
• Support for Elasticsearch 5.1 brings scripting, additional plugins and additional performance to Amazon Elasticsearch Service
Find out more: https://aws.amazon.com/elasticsearch-service/
AWS Centralized Logging: https://aws.amazon.com/answers/logging/centralized-logging/
Elasticsearch at the AWS Database Blog: https://aws.amazon.com/blogs/database/category/elasticsearch/
Or ask your Solutions Architect!
Amazon Elasticsearch Service