Collect, summarize and notify of OpenStack's log

Post on 23-Jan-2017

Copyright © NTT Communications Corporation. All rights reserved.

Mahito Ogura <m.ogura@ntt.com>, NTT Communications

Collect, summarize and notify of OpenStack’s log

Mahito Ogura <m.ogura@ntt.com>

Technology Development

DevOps Engineer

R&D for IaaS, NoSQL, DB as a Service, Hadoop as a Service

Contributing to Devstack

About me

Our team’s mission

● To evaluate new components and functions for production
● To accumulate knowledge of OpenStack’s operation for production

Challenge

● Continuously deploy OpenStack while maintaining service functionality and performance

○ We presented this work at the Tokyo Summit: “Automated Deployment & Benchmarking with Chef, Cobbler and Rally for OpenStack”

Background

Benchmark is helpful, but ...

Our challenge succeeded!

● Automated deployment and benchmarking let us constantly maintain service functionality and performance

OpenStack outputs a lot of logs

● Normal: 40 ~ 50 lines/min x 3 clusters (16 hosts)
● Benchmark: 220 ~ 260 lines/min x 3 clusters (16 hosts)

(Debug: False, Nova only)
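
To get a feel for these rates, the arithmetic below converts them to lines per day. It is only a rough sketch: it assumes the quoted figures are per cluster and sustained for 24 hours, neither of which the slide states, and the function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def lines_per_day(lines_per_minute, clusters=3):
    """Daily log volume, ASSUMING the rate is per cluster and sustained all day."""
    return lines_per_minute * clusters * MINUTES_PER_DAY


if __name__ == "__main__":
    rates = {"Normal": (40, 50), "Benchmark": (220, 260)}
    for label, (low, high) in rates.items():
        print(f"{label}: {lines_per_day(low):,} to {lines_per_day(high):,} lines/day")
```

If the figures are per host rather than per cluster, pass the host count as `clusters` instead.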

It is hard to manually find issues in logs...

barockschloss - https://www.flickr.com/photos/8663137@N04/4943743444/

Overview of System Architecture

[Architecture diagram. Inputs: OpenStack logs, host resource data. Components: data collector, stream processing server, time series database, metrics dashboard & graph editor, team collaboration tool, storage. Edge labels: forward, pull, store, visualize, notify.]

Focus of today’s talk

[Diagram. Inputs: OpenStack logs, host resource data. Components: data collector, stream processing server. Edge labels: forward log data, pull summarized data, notify alert.]

Fluentd

Fluentd is an open source data collector

[Figure: log collection before Fluentd and after Fluentd. Image from https://github.com/fluent/fluentd-docs]

Collect OpenStack’s logs

OpenStack logs (fluent-plugin-tain)

nova-api.log:

2015-10-19 17:29:37.832 25350 INFO nova.osapi_compute.wsgi.server [-] 10.1.16.22,127.0.0.1 "GET / HTTP/1.1" status: 200 len: 280 time: 0.0009720

Parsed record:

{ "asctime":"2015-10-19 17:29:37.832", "code":"200", "loglevel":"INFO", "objname":"nova.osapi_compute.wsgi.server", … }

Forwarded with tag: openstack.nova.api.log
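
The parse step can be sketched in Python as follows. This is not the plugin's actual implementation: the regular expressions are assumptions fitted to the single sample line shown here, and the field names `process`, `context` and `message` are hypothetical (only `asctime`, `loglevel`, `objname` and `code` appear on the slide).

```python
import re

# ASSUMED layout, inferred from the sample line only:
# "<asctime> <pid> <LEVEL> <logger name> [<request context>] <message>"
LOG_LINE = re.compile(
    r"^(?P<asctime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<process>\d+) "
    r"(?P<loglevel>[A-Z]+) "
    r"(?P<objname>\S+) "
    r"\[(?P<context>[^\]]*)\] "
    r"(?P<message>.*)$"
)
# API access lines carry the HTTP status inside the message text.
STATUS = re.compile(r"status: (?P<code>\d{3})")


def parse_nova_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    status = STATUS.search(record["message"])
    if status:
        record["code"] = status.group("code")
    return record


if __name__ == "__main__":
    sample = (
        "2015-10-19 17:29:37.832 25350 INFO nova.osapi_compute.wsgi.server [-] "
        '10.1.16.22,127.0.0.1 "GET / HTTP/1.1" status: 200 len: 280 time: 0.0009720'
    )
    print(parse_nova_log_line(sample))
```

Lines that do not fit the assumed layout, such as multi-line tracebacks, return `None` and would need separate handling.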

Forward and Output

[Routing diagram. Inputs: OpenStack logs, host resource data. Tag patterns shown: **, norikra.**, influxdb.**, slack.**, storage.**. Output shown: Storage.]

Fluentd server configuration:

<match openstack.**>
  type copy
  <store>
    type map
    tag ("norikra." + tag)
  </store>
  <store>
    type map
    tag ("influxdb." + tag)
  </store>
  <store>
    type map
    tag ("storage." + tag)
  </store>
</match>
<match dstat.**>
  …
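
The effect of this configuration, one incoming event copied into three re-tagged events, can be illustrated in Python. This is a simulation for explanation only, not Fluentd code: `matches` and `fan_out` are hypothetical names, and the matcher handles only the `prefix.**` and `**` patterns used in this deck.

```python
def matches(pattern, tag):
    """Simplified tag matching: supports only '**', 'prefix.**' and exact tags."""
    if pattern == "**":
        return True
    if pattern.endswith(".**"):
        prefix = pattern[:-3]
        # 'a.**' matches 'a' itself as well as 'a.b', 'a.b.c', ...
        return tag == prefix or tag.startswith(prefix + ".")
    return pattern == tag


def fan_out(tag, record, prefixes=("norikra.", "influxdb.", "storage.")):
    """Copy one openstack.** event into one re-tagged event per destination."""
    if not matches("openstack.**", tag):
        return []
    # Each destination gets its own copy so later edits do not leak across outputs.
    return [(prefix + tag, dict(record)) for prefix in prefixes]


if __name__ == "__main__":
    for new_tag, rec in fan_out("openstack.nova.api.log", {"loglevel": "INFO"}):
        print(new_tag, rec)
```

Prefixing the tag is what lets later match sections such as norikra.** or storage.** pick up only the copy meant for them.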

Norikra

Norikra is open source server software that provides stream processing with SQL

● Select, count, group, etc. over incoming data using SQL
● Time windows are available (.win:time_batch(60 s))

Query Example:

SELECT hostname, loglevel, NULLABLE(message), COUNT(*) AS count
FROM norikra_nova_compute_log.win:time_batch(1 min)
WHERE loglevel = 'WARNING'
GROUP BY hostname, loglevel, message

http://norikra.github.io/
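
What the example query computes can be approximated in plain Python. This is a sketch of the idea, not how Norikra works internally: `summarize_warnings` is a hypothetical helper, and it buckets events into fixed one-minute intervals aligned to the Unix epoch, which may differ from how the engine aligns its batches.

```python
from collections import Counter, defaultdict


def summarize_warnings(events, window_seconds=60):
    """Count WARNING records per (hostname, loglevel, message) in each window.

    events: iterable of (unix_timestamp, record_dict) pairs.
    Returns {window_start: [summary_dict, ...]}.
    """
    windows = defaultdict(Counter)
    for timestamp, record in events:
        if record.get("loglevel") != "WARNING":  # WHERE loglevel = 'WARNING'
            continue
        window_start = int(timestamp) - int(timestamp) % window_seconds
        # GROUP BY hostname, loglevel, message (message may be missing)
        key = (record.get("hostname"), record["loglevel"], record.get("message"))
        windows[window_start][key] += 1

    summary = {}
    for window_start in sorted(windows):
        summary[window_start] = [
            {"hostname": host, "loglevel": level, "message": message, "count": count}
            for (host, level, message), count in windows[window_start].most_common()
        ]
    return summary


if __name__ == "__main__":
    warning = {"hostname": "compute-c3", "loglevel": "WARNING",
               "message": "No network configured!"}
    info = {"hostname": "compute-c3", "loglevel": "INFO", "message": "ok"}
    print(summarize_warnings([(1445837941, warning), (1445837950, warning),
                              (1445837955, info)]))
```

Unlike this batch function, a stream processor emits each window's result as soon as the window closes, without holding the whole log in memory.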

Norikra vs. other stream processing systems

Pros

● Easy to install
● No code to write: queries are written in SQL

Cons

● Cannot distribute data
● Cannot do complex processing
● No high availability

Summarize ‘WARNING’ messages

[Diagram. Inputs: OpenStack logs, host resource data. Edge labels: forward log data, pull summarized data.]

SELECT hostname, loglevel, NULLABLE(message), COUNT(*) AS count
FROM norikra_nova_compute_log.win:time_batch(1 min)
WHERE loglevel = 'WARNING'
GROUP BY hostname, loglevel, message

Summarized data is output every 1 min:

[
  1445837968,
  {"message":"No network configured!",
   "loglevel":"WARNING",
   "count":38,
   "hostname":"compute-c3"}
]...

Case: Notify a warning alert

[Diagram. Inputs: OpenStack logs, host resource data. Tags shown: norikra.**, slack.**, slack.nova.compute.warning.log.]

SELECT hostname, loglevel, NULLABLE(message), COUNT(*) AS count
FROM norikra_nova_api_log.win:time_batch(1 min)
WHERE loglevel = 'WARNING'
GROUP BY hostname, loglevel, message

[
  1445837968,
  {"message":"No network configured!",
   "loglevel":"WARNING",
   "count":38,
   "hostname":"compute-c3"}
]...
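
One way to turn such a summarized event into a chat notification is sketched below. The deck does not show how the message is built, so the text layout and the name `build_slack_payload` are assumptions; the only external fact relied on is that Slack incoming webhooks accept a JSON body with a `text` field. Posting the payload over HTTP is deliberately left out.

```python
import json
from datetime import datetime, timezone


def build_slack_payload(event):
    """Build a webhook-style payload from one [timestamp, record] event.

    The message layout is an illustrative choice, not the talk's actual format.
    """
    timestamp, record = event
    when = datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
    text = "[{loglevel}] {hostname}: {message} (x{count} at {when})".format(
        loglevel=record["loglevel"],
        hostname=record["hostname"],
        message=record.get("message"),
        count=record["count"],
        when=when,
    )
    return {"text": text}


if __name__ == "__main__":
    event = [
        1445837968,
        {"message": "No network configured!", "loglevel": "WARNING",
         "count": 38, "hostname": "compute-c3"},
    ]
    print(json.dumps(build_slack_payload(event)))
```

Including the count in the text means one message per minute can stand in for dozens of identical log lines.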

Conclusion

OpenStack’s logs are a large, fast-moving stream

● It is hard to find every issue in the log stream

Fluentd & Norikra are powerful software

● Easy to collect and forward log data
● Can summarize large volumes of log data using SQL
● Can notify on WARNING/ERROR alerts
