4 Stages of Rolling Your Own Logging Solution – When will you jump off the Complexity Elevator?



3/25/2014 4 Stages of Rolling Your Own Logging Solution – When will you jump off the Complexity Elevator? | Logentries Blog

https://blog.logentries.com/2014/03/rolling-your-own-logging-solution-when-will-you-jump-off-the-complexity-elevator/


Over the past few years we have brought on board many companies that started out on the path of rolling their own logging solution. Companies very often start down this path largely because they can (there are plenty of open source technologies) and because it’s free, so you can get started with zero down.

But as we all know, there’s no such thing as a free lunch, and rolling your own solution carries a number of hidden costs: beefy servers for when your log volumes grow, your VALUABLE time wiring together the initial solution and, most costly of all, maintaining and managing the solution as it begins to span a set of clustered servers. In some cases this works well for organizations, and they manage all their logs in-house with a custom-built solution, often combining a number of open source components.

However, one trend we continue to see is that as companies grow, their logs grow, and so do their in-house custom solutions, along with all their complexities. As these systems become more complex, the engineers who built them have to spend more time maintaining and managing them. The more of a hassle these systems become, the more likely organizations are to jump off the “roll your own” complexity elevator.

We’ve also noticed a trend in the types of systems that get developed in-house over time. They generally start off fairly simple and can grow into much more sophisticated solutions. Below we outline the different “roll your own logging” stages we have come to see over the past few years:

Stage 1 – The insurance policy

The most basic logging solution is generally used as an insurance policy, “just in case” you need access to your logs at some point in the future. This usually involves a combination of Syslog and Logrotate to manage logs on each individual server, plus a mechanism to archive the logs periodically (e.g. daily). The most common approach we see is companies simply using S3 for such archiving, which does the job nicely.

This solution is pretty basic: it doesn’t really give you a good way to interrogate your logs when you need to, and simply acts as a mechanism for storing logs in case of an emergency. That being said, this is often the first step people take as they enter the “roll your own logging” complexity elevator.
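
As a rough illustration of the Stage 1 archiving step, here is a minimal Python sketch that compresses a rotated log file and builds a date-partitioned key for it. The key layout, the function names and the `upload` callable are all assumptions made for this example; in a real setup `upload` would wrap whatever S3 client you use, and Logrotate itself would typically handle the rotation.

```python
import gzip
import shutil
from datetime import date
from pathlib import Path


def archive_key(hostname, log_path, day):
    """Build a date-partitioned object key (hypothetical layout)."""
    return f"logs/{hostname}/{day:%Y/%m/%d}/{log_path.name}.gz"


def compress_log(log_path, out_dir):
    """Gzip a rotated log file into out_dir and return the compressed path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    gz_path = out_dir / (log_path.name + ".gz")
    with open(log_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


def archive_log(log_path, out_dir, hostname, day, upload):
    """Compress one rotated log and hand it to upload(local_path, key)."""
    key = archive_key(hostname, log_path, day)
    upload(compress_log(log_path, out_dir), key)
    return key
```

Run once a day per server (e.g. from cron), this is about all the “insurance policy” needs: the logs are stored, but nothing here helps you query them.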

Stage 2 – Searching for the needle

Once logs are being kept around, the question often arises, usually in the face of some operational outage or customer support query: “hey, can we search across this data?” The log archiving solution above doesn’t really help here, as it doesn’t provide any simple way to search your logs.


Enter Logstash. Logstash is an open source tool that collects and processes log data and, together with an indexing back-end such as Elasticsearch, allows you to search across it. It’s often the first place people turn to start digging into their logs, given there’s no requirement for money down up front. The main issue we hear from users of open source solutions is that, as log volumes increase, so does the amount of time spent managing them, in particular if your back-end requires clustering. Note that you’ll also have to host Logstash (e.g. on some AWS instances), and this cost will also grow over time.
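
To make “indexing log data” a little more concrete, here is a toy inverted index in Python. It is not how Logstash or its back-end is implemented; it is only a sketch of the general idea, with made-up function names, that maps each token to the log lines containing it so that a query does not have to scan every line.

```python
import re
from collections import defaultdict


def tokenize(line):
    """Lower-case a line and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9_]+", line.lower()))


def build_index(lines):
    """Map each token to the set of line numbers it appears on."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in tokenize(line):
            index[token].add(line_no)
    return index


def search(index, lines, query):
    """Return the lines that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return [lines[i] for i in sorted(hits)]
```

Everything here lives in memory on one machine. Keeping such an index on disk, up to date and spread across several servers is where the management effort described above comes from.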

Stage 3 – Show me the Metrics

Logs are addictive! Once you start digging into them you’ll find more and more valuable info that you can use for a range of different use cases. IMHO the real power of logs is when you can begin to use your logs as data! At Logentries, we regularly see relatively small systems that produce in the order of 10s or 100s of log events per day. These events can contain vital pieces of data for understanding your systems (e.g. response time, memory usage, CPU usage). Where logs become really powerful is when you can identify important field values in your data, and then roll up these values into metrics dashboards to visualize and understand key trends. You can thus use your logs to dynamically build reports that give you different views into your system for a range of different use cases (e.g. performance monitoring, product usage, web analytics, etc.).
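
A minimal sketch of “rolling up” field values might look like the following. It assumes, purely for illustration, that each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and carries `key=value` pairs with numeric values; the field names are invented, and real log formats will need their own parsing.

```python
import re
from collections import defaultdict
from statistics import mean

FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_fields(line):
    """Return the key=value pairs found in one log line."""
    return dict(FIELD_RE.findall(line))


def rollup_by_minute(lines, field):
    """Average a numeric field per minute, keyed by 'YYYY-MM-DD HH:MM'."""
    buckets = defaultdict(list)
    for line in lines:
        fields = parse_fields(line)
        if field in fields:
            minute = line[:16]  # assumes a leading 'YYYY-MM-DD HH:MM:SS' timestamp
            buckets[minute].append(float(fields[field]))
    return {minute: mean(values) for minute, values in sorted(buckets.items())}
```

The resulting per-minute averages are the kind of series a dashboard would plot, for example response time over the course of a deploy.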

Using ‘logs as data’ is becoming more and more common, and when rolling your own solution this can again be achieved with something like Logstash by combining it with StatsD and Graphite. Again, this does not require any hard cash investment, but you will spend time managing and configuring it, which can become more challenging as your data volumes grow.
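
For a sense of what feeding StatsD involves, here is a small Python sketch that formats a value in the StatsD plain-text protocol (`<name>:<value>|<type>`) and sends it over UDP. The metric names are hypothetical, and the host and port are assumptions (8125 is the usual StatsD default); a Logstash setup would normally emit these through its own configuration rather than hand-written code.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one metric as <name>:<value>|<type>, e.g. 'ms' timer or 'c' counter."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name, value, metric_type, host="127.0.0.1", port=8125):
    """Send one metric over UDP; UDP is connectionless, so delivery is not confirmed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(statsd_line(name, value, metric_type), (host, port))
```

A call such as `send_metric("web.response_time", 120, "ms")` would record one timing. StatsD then aggregates the values and flushes them to Graphite for graphing, and those are two more services to run and keep healthy.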

Stage 4 – The deep dive

The final type of roll your own logging solution we see is where companies also write their log data to something like HDFS, so that they can run more complex queries against their data (e.g. to identify correlations or associations between error events that lead to serious issues, or to build reports such as funnel or cohort reports for web analytics).
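
As an example of the kind of query meant here, the sketch below computes a simple funnel report in Python: how many users reached each step, in order. It runs in memory over a small list of `(timestamp, user_id, action)` tuples, which is an assumed event shape; against data in HDFS the same logic would be expressed as a Hadoop job or similar, not as this function.

```python
from collections import defaultdict


def funnel_counts(events, steps):
    """Count users who completed each funnel step in order.

    events: iterable of (timestamp, user_id, action) tuples.
    steps: ordered list of action names that make up the funnel.
    """
    progress = defaultdict(int)  # user_id -> number of funnel steps completed so far
    for _, user, action in sorted(events):
        reached = progress[user]
        if reached < len(steps) and action == steps[reached]:
            progress[user] = reached + 1
    return [sum(1 for done in progress.values() if done > i) for i in range(len(steps))]
```

A user only counts for a step after completing all earlier steps, so someone who purchases without ever signing up is not counted at the purchase step.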

This type of analysis can be super powerful and is really only limited by the quality of your data (i.e. what data you decide to include in your log events) and the intelligence of your data scientist. However, at this stage you really require deep expert skills and someone who can play the data scientist role at your organization. So again, while your solution might not require a cash investment upfront, you are going to require some serious tech skills and someone with the time to invest in building out your Hadoop cluster and queries.
