Fluentd and AWS at classmethod

Post on 08-May-2015

Presented at http://connpass.com/event/5222/

Mar 21, 2014

www.treasuredata.com/

Fluentd & AWS!

Masahiro Nakagawa

Treasure Data, Inc

Who are you?

• Masahiro Nakagawa

• @repeatedly

• Treasure Data, Inc

• Senior Software Engineer

• Fluentd, td-agent, etc...

• Dlang, MessagePack, ...

Treasure Data on AWS

Components: Frontend, Queue, Worker, Hadoop, Fluentd

Applications push metrics to Fluentd (via local Fluentd)

Fluentd sums up data per minute (partial aggregation)

Librato Metrics for realtime analysis

Treasure Data for historical analysis
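As a rough illustration of the partial aggregation mentioned above, the sketch below sums metric values into one-minute buckets before anything is sent on. It assumes the aggregation window is one minute. The event shape (timestamp, name, value) and the function name aggregate_per_minute are hypothetical, chosen for this example, and are not Fluentd's API.

```python
from collections import defaultdict


def aggregate_per_minute(events):
    """Sum metric values per (minute, metric name).

    `events` is an iterable of (unix_timestamp, name, value) tuples.
    This shape is an assumption made for the example only.
    """
    totals = defaultdict(float)
    for timestamp, name, value in events:
        minute = int(timestamp) - int(timestamp) % 60  # floor to the minute
        totals[(minute, name)] += value
    return dict(totals)


# Three hypothetical request metrics: two in one minute, one in the next.
events = [
    (1395360000, "api.requests", 1),
    (1395360030, "api.requests", 1),
    (1395360061, "api.requests", 1),
]
summary = aggregate_per_minute(events)
```

Only the per-minute sums in summary would need to travel to the realtime and historical backends, which is the point of aggregating close to the source.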

Backend overview

Impala, Presto, Hadoop

Used AWS products

• RDS

• Store service data

• Queue / Scheduler

• S3

• Columnar storage

• EC2

• Clusters: Hadoop, Workers, APIs, etc…

Separate Storage and Processor!

Classmethod use case!

Fluentd (Treasure Agent)

Structured logging

Reliable forwarding

Pluggable architecture

http://fluentd.org/

Data Processing

Collect → Store → Process → Visualize

Data source → Reporting, Monitoring

Related Products

• Store / Process: Cloudera, Hortonworks, Treasure Data

• Visualize: Tableau, Excel, R

• Collect: ???

easier & shorter time

Before…

(Diagram: the Applications on Server1, Server2, and Server3 send their logs to a Log Server.)

High latency! Must wait for a day...

Divide & Conquer & Retry

error retry

error retry retry

retry
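One way to picture "Divide & Conquer & Retry" is the sketch below: records are cut into small chunks, each chunk is delivered independently, and a failed chunk is retried with a doubling wait. This is a minimal sketch under assumptions, not Fluentd's implementation. All names (split_into_chunks, send_with_retry, forward, flaky_send) and the retry limits are hypothetical.

```python
import time


def split_into_chunks(records, chunk_size):
    """Divide: cut a list of records into fixed-size chunks."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def send_with_retry(chunk, send, max_retries=3, base_wait=0.01):
    """Retry: resend one chunk, doubling the wait after each failure."""
    for attempt in range(max_retries + 1):
        try:
            send(chunk)
            return True
        except IOError:
            if attempt == max_retries:
                return False
            time.sleep(base_wait * (2 ** attempt))
    return False


def forward(records, send, chunk_size=2):
    """Conquer: deliver each chunk on its own; return the chunks that failed."""
    failed = []
    for chunk in split_into_chunks(records, chunk_size):
        if not send_with_retry(chunk, send):
            failed.append(chunk)
    return failed


# Demo with a hypothetical destination that fails on its first call only.
delivered = []
calls = {"count": 0}


def flaky_send(chunk):
    calls["count"] += 1
    if calls["count"] == 1:
        raise IOError("temporary network error")
    delivered.extend(chunk)


failed_chunks = forward(["a", "b", "c", "d", "e"], flaky_send)
```

Because each chunk is small, a transient error costs one short retry instead of a full re-transfer of the day's logs.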

After!

(Diagram: the Applications on Server1, Server2, and Server3 send their logs through Fluentd nodes, which forward to further Fluentd nodes.)

In streaming!

Lambda Architecture

http://www.drdobbs.com/database/applying-the-big-data-lambda-architectur/240162604

In short

• Open sourced log collector written in Ruby

• Customization is essential: small core + many plugins

Fluentd is a robust log collector designed for processing data streams

Core

• Divide & Conquer

• Buffering & Retrying

• Error handling

• Message routing

• Parallelize

Plugins

• read / receive data

• write / send data

M x N → M + N

Inputs

• Access logs: Apache

• App logs: Frontend, Backend

• System logs: syslogd

• Databases

Fluentd: buffer / buffer / routing

Outputs

• Alerting: Nagios

• Analysis: MongoDB, MySQL, Hadoop

• Archiving: Amazon S3
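The M x N → M + N claim is plain counting, and the short sketch below spells it out using names from the diagram. The grouping into four sources and five destinations is an assumption made for the arithmetic, and "collector" is a placeholder for the hub in the middle.

```python
sources = ["Access logs", "App logs", "System logs", "Databases"]
destinations = ["Nagios", "MongoDB", "MySQL", "Hadoop", "Amazon S3"]

# Point-to-point: every source needs its own connector to every destination.
direct = [(s, d) for s in sources for d in destinations]

# Hub: each source and each destination connects once to the collector.
via_hub = [(s, "collector") for s in sources] + [
    ("collector", d) for d in destinations
]

print(len(direct), len(via_hub))  # prints "20 9"
```

Adding a sixth destination costs four new connectors in the point-to-point layout, but only one in the hub layout.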

Pluggable Architecture

Engine

• Input: Forward, HTTP, File tail, dstat, ...

• Buffer: File, Memory

• Output: Forward, File, MongoDB, ...

• Output: rewrite, ...

Input, Buffer, and Output are pluggable
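The Input, Buffer, Engine, and Output roles listed above could be sketched roughly as follows. This is an illustrative model only: real Fluentd plugins are Ruby classes with a different interface, and every class and method name here (MemoryBuffer, ListOutput, Engine, add_route) is hypothetical. The routing uses Python's shell-style fnmatch patterns, which are not the same as Fluentd's own match syntax, and sends each event to the first matching route.

```python
import fnmatch


class MemoryBuffer:
    """Buffer stand-in: holds events until they are drained."""

    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

    def drain(self):
        drained, self.events = self.events, []
        return drained


class ListOutput:
    """Output stand-in: buffers events, then 'writes' them to a list."""

    def __init__(self):
        self.buffer = MemoryBuffer()
        self.written = []

    def emit(self, event):
        self.buffer.append(event)

    def flush(self):
        self.written.extend(self.buffer.drain())


class Engine:
    """Routes each event to the first output whose tag pattern matches."""

    def __init__(self):
        self.routes = []  # list of (pattern, output) in priority order

    def add_route(self, pattern, output):
        self.routes.append((pattern, output))

    def emit(self, tag, time, record):
        for pattern, output in self.routes:
            if fnmatch.fnmatchcase(tag, pattern):
                output.emit((tag, time, record))
                return True
        return False  # no route matched; the event is dropped in this sketch


engine = Engine()
access_out = ListOutput()
fallback_out = ListOutput()
engine.add_route("apache.*", access_out)
engine.add_route("*", fallback_out)

# An input plugin would call emit() for each record it reads or receives.
engine.emit("apache.access", 1395360000, {"path": "/", "code": 200})
engine.emit("syslog.auth", 1395360001, {"message": "login"})
access_out.flush()
fallback_out.flush()
```

Swapping ListOutput for a class that writes elsewhere, or MemoryBuffer for a file-backed one, would leave the Engine untouched, which is the idea behind keeping the core small.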

Next release

• Fluentd v0.10.45

• in_tail supports multiline logs and wildcard (*) watch paths

• in_exec supports json / msgpack

• several fixes

• td-agent 1.1.19

AWS use cases

Collecting instance logs

• A sign of Immutable Infrastructure

• Hard to manage stateful instances

• Almost all instances should be disposable

• Excluding DB, Master, etc...

• How to manage such instance logs?

• Common problem in cloud environments

• Start Fluentd at launch phase

• It is also useful for Docker / other containers

• Include metadata or the hostname to identify the instance
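To make the last point concrete, the sketch below attaches the hostname and some instance metadata to each record before it leaves a disposable instance, so the logs stay attributable after the instance is gone. The function names (enrich, make_event) and the metadata keys and values are hypothetical. The (tag, time, record) tuple mirrors the shape of a Fluentd event.

```python
import socket
import time


def enrich(record, metadata=None):
    """Return a copy of `record` with the hostname and extra metadata added."""
    enriched = dict(record)
    enriched["host"] = socket.gethostname()
    if metadata:
        enriched.update(metadata)
    return enriched


def make_event(tag, record, metadata=None):
    """Build a (tag, time, record) tuple for the enriched record."""
    return (tag, int(time.time()), enrich(record, metadata))


event = make_event(
    "app.access",
    {"path": "/login", "code": 200},
    metadata={"instance_id": "i-0example", "role": "web"},  # made-up values
)
```

In practice the metadata would be gathered once when the collector starts at launch, then reused for every record.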

Collecting using Fluentd

Collector Aggregator

AWS Plugins

http://fluentd.org/plugin/

• s3

• dynamodb

• redshift

• rds

• elb

• cloudwatch

• sns

• sqs

• ses

• kinesis (soon!)