
Transcript
Page 1: See through software

See-through software

Using logs, metrics and visualization to see your app at runtime and share your vision with others

Page 2: See through software

The best technical solutions are ones that solve for human relationships

Page 3: See through software

Opaque software suffers from a lack of focus on the operations user experience

Page 4: See through software

Software is opaque by default

Page 5: See through software

Software is opaque by default

What's it doing?

Page 6: See through software

Software is opaque by default

Is it going well?

Page 7: See through software

Software is opaque by default

When's it going to be done?

Page 8: See through software

Software is opaque by default

Does it need me to do anything?

Page 9: See through software

When you write opaque software

Page 10: See through software

This is the user experience of operations

Page 11: See through software

This is the user experience of support

Page 12: See through software

This is the user experience of your boss

Page 13: See through software

Opaque software leads to

• Misaligned priorities

• Loss of productivity

• A generally "unprofessional" experience for customers

• The "us vs them" attitude that is the antithesis of DevOps culture

Page 14: See through software

If you don’t provide facts, you encourage mythology

Page 15: See through software

See-through software acknowledges the operations user experience of the entire organization

Page 16: See through software

“A good user experience should make a user feel smart, powerful and safe.”

–Jason Nemec

Page 17: See through software

We feel smart when

• We understand what's going on

• We understand how to change things

• Others share our understanding

Page 18: See through software

We feel powerful when

• We are able to change things

• We see the results of our changes

• We can find an answer to our questions

Page 19: See through software

We feel safe when

• We know there isn't a problem

• We know if there is a problem, we'll be able to understand it

• We can trust others to react on our behalf

Page 20: See through software

These principles help to develop a roadmap to improving transparency

Page 21: See through software

Transparent software gets your attention

• Dashboarding

• Alerting

Page 22: See through software

Transparent software takes on a shape

• Graphing

• Modeling

Page 23: See through software

Transparent software tells stories

• Logging

• Auditing

• Reporting

Page 24: See through software

Transparent software responds interactively to questions

• Ad Hoc Queries

• Post Hoc Analytics

Page 25: See through software

Transparent software is democratic

• Wikis

• Shared visibility

• Persistent chat rooms

Page 26: See through software

Software does not become transparent as the result of any single project

• Software evolves; its UX needs to evolve with it

• Insight is rarely easy to produce, and easy to produce information is rarely insightful

• Insight is frequently driven from the bottom up or from the outside in

Page 27: See through software

Democratization means everybody benefits

And everybody has a role to play

Page 28: See through software

"Clearinghouse" services• Store data for people who

can’t get it themselves

• Collect and persist data from many different sources

• Provide a single engine for serving information

• Reduce pressure on critical infrastructure from interested users

Page 29: See through software

"Visualization" services

• Provide studio-like tools that let "non-technical" users feel safe to experiment

• Allow for rapid, real-time development of new insights on old data

• Allow for sharing and repurposing of insight

Page 30: See through software

The see-through system at runtime

Logging

Page 31: See through software

Good logs tell a story

• Each statement is a sentence: it needs verbs and nouns

• Each statement has a setting -- where, when and who

• It should be simple to reconstruct the story told by independent sentences
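
As a rough sketch of the three bullets above: each statement carries a verb and a noun plus its setting (where, when, who), and the story is rebuilt by ordering independent statements in time. The `sentence` and `reconstruct` helpers and all field values are illustrative assumptions, not something from the talk.

```python
from datetime import datetime, timezone

def sentence(who, verb, noun, where, when):
    """Format one log statement as a sentence with a setting.

    who/where/when give the setting; verb and noun say what happened.
    """
    return f"{when.isoformat()} {where} {who} {verb} {noun}"

def reconstruct(statements):
    """Rebuild the story by ordering independent sentences by timestamp."""
    # ISO 8601 timestamps in the same timezone sort correctly as text.
    return sorted(statements)

t0 = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
t1 = datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)

# Two statements written by different components, arriving out of order.
story = reconstruct([
    sentence("billing", "charged", "order-42", "host-b", t1),
    sentence("checkout", "accepted", "order-42", "host-a", t0),
])
for line in story:
    print(line)
```

Putting the timestamp first is a design choice made here so that plain text sorting is enough to recover the order of events.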

Page 32: See through software

Aggregate your logs to create an epic

• Discover systems that are acting aberrantly

• Correlate errors between coordinating systems

• Graph meaningful patterns in your stories

Page 33: See through software

Index your logs to find interesting stories quickly

• Audit individual chains of processing from start to finish

• Slice up your reports so they interest a specific group or team

• Build new reports quickly to solve unpredicted needs
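
One way to picture indexing, as a minimal in-memory sketch: map each value of a field to the records that carry it, then answer an audit question or a per-team slice with a lookup. The record fields (`request_id`, `team`, `ts`, `msg`) and the `build_index` helper are hypothetical; a real deployment would rely on a search engine's index instead.

```python
from collections import defaultdict

def build_index(records, field):
    """Map each value of `field` to the list of records that carry it."""
    index = defaultdict(list)
    for record in records:
        if field in record:
            index[record[field]].append(record)
    return index

records = [
    {"ts": 1, "request_id": "r1", "team": "checkout", "msg": "order received"},
    {"ts": 2, "request_id": "r2", "team": "checkout", "msg": "order received"},
    {"ts": 3, "request_id": "r1", "team": "billing", "msg": "card charged"},
    {"ts": 4, "request_id": "r1", "team": "shipping", "msg": "label printed"},
]

by_request = build_index(records, "request_id")
by_team = build_index(records, "team")

# Audit one chain of processing from start to finish.
chain = [r["msg"] for r in sorted(by_request["r1"], key=lambda r: r["ts"])]

# Slice the same data so it interests a single team.
billing_view = by_team["billing"]

print(chain)
print(billing_view)
```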

Page 34: See through software

Aggregation architecture

Components: log to the fastest, most convenient, least failure-prone store available (e.g. local disk)

Page 35: See through software

Aggregation architecture

Log shippers: asynchronously publish logs to an aggregator

Page 36: See through software

Aggregation architecture

Aggregator: parse, clean, enrich and store logs
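
The aggregator's four verbs can be sketched in a few lines. This assumes a hypothetical raw line format of `timestamp level message` and uses a plain list as the store; the `process` function and field names are illustrative only.

```python
import re

# Assumed raw format: "<timestamp> <level> <free-text message>"
LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Za-z]+)\s+(?P<message>.*)$")

def process(raw_line, source_host):
    """Parse, clean and enrich one shipped log line."""
    text = raw_line.strip()
    match = LINE.match(text)
    if match is None:
        # Keep unparseable lines rather than dropping part of the story.
        return {"message": text, "host": source_host, "parse_error": True}
    event = match.groupdict()                    # parse
    event["level"] = event["level"].upper()      # clean: normalize level names
    event["message"] = event["message"].strip()  # clean: trim stray whitespace
    event["host"] = source_host                  # enrich: add where it came from
    return event

store = []  # stand-in for a real datastore
store.append(process("2024-01-01T12:00:00Z warn  disk at 91% \n", "host-a"))
store.append(process("garbage", "host-b"))
print(store)
```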

Page 37: See through software

Aggregation architecture

Clearinghouse: hold data and standardize access

Page 38: See through software

Aggregation architecture

Visualization: allows data to inform and be manipulated by end users

Page 39: See through software

Log aggregation on private networks

The ELK stack (Elasticsearch + Logstash + Kibana)

Page 40: See through software

Log aggregation in the Cloud

Page 41: See through software

Developing apps with log aggregation in mind

Page 42: See through software

• Use Correlation IDs throughout your system

• Don't log secrets

• Build log strategies with shipping and rolling in mind

• Have a way to capture crashes

• Log using techniques that preserve context, such as JSON
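
A minimal sketch of three of these points together (correlation IDs, no secrets, JSON context), using Python's standard `logging` module. The `JsonFormatter` class, the `SECRET_KEYS` list, the `context` field and the logger name are assumptions made for illustration.

```python
import io
import json
import logging
import uuid

# Assumed list of keys that must never reach the log.
SECRET_KEYS = {"password", "token", "api_key"}

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so context survives shipping."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": getattr(record, "correlation_id", None),
        }
        # Copy caller-supplied context, masking anything that looks secret.
        for key, value in getattr(record, "context", {}).items():
            payload[key] = "[REDACTED]" if key in SECRET_KEYS else value
        return json.dumps(payload)

stream = io.StringIO()  # stand-in for a local log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("see_through_example")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

# One ID per unit of work, passed along to every component that touches it.
correlation_id = str(uuid.uuid4())
log.info(
    "login attempted",
    extra={
        "correlation_id": correlation_id,
        "context": {"user": "alice", "password": "hunter2"},
    },
)
print(stream.getvalue())
```

Redacting by key name is only one possible policy; it will not catch a secret embedded in the message text itself.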

Page 43: See through software

The see-through system at runtime

Dashboarding

Page 44: See through software

Focus on UX

Page 45: See through software

Make users feel smart

• Dashboards should inform without a lot of explanation or prior knowledge

• Dashboards should direct the user to the next step

Page 46: See through software

Make users feel powerful

• Dashboards should update frequently (aim for <10s)

• Dashboards should help users perform their job

• Dashboards should respond to the user's needs

Page 47: See through software

Make users feel safe

• Dashboards should not overwhelm

• Dashboard accuracy should be known

• Thresholds should be meaningful

• Using a dashboard should not endanger the running software

Page 48: See through software

How to build a dashboard item

Page 49: See through software

Are you concerned with a technical or a business issue?

• Technical: Machine 123 is slow, West Coast users are slow, we’re moving 80 GB/s

• Business: Client ABC is slow, logins are slow, we’re moving 1000k orders/s

Page 50: See through software

How does a stressed system look?

How can you tell it from an unstressed system?

Page 51: See through software

What kind of comparisons do you want to provide?

• Time series vs flat

• Machine vs Machine

• Current vs Previous

• Current vs Threshold
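
Three of these comparisons reduce to very small calculations. The sketch below is illustrative only: the `compare` and `slowest` helpers, the metric names and the numbers are all made up for the example.

```python
def compare(current, previous, threshold):
    """Compare a current reading against its previous value and a threshold."""
    change = None if previous == 0 else (current - previous) / previous
    return {"change": change, "breached": current > threshold}

def slowest(latency_by_machine):
    """Machine vs machine: name the machine with the highest latency."""
    return max(latency_by_machine, key=latency_by_machine.get)

# Current vs previous and current vs threshold for one metric.
login_ms = compare(current=450.0, previous=300.0, threshold=400.0)

# Machine vs machine for the same metric.
outlier = slowest({"machine-121": 95.0, "machine-122": 102.0, "machine-123": 870.0})

print(login_ms, outlier)
```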

Page 52: See through software

Dashboarding architecture

Metric source: a process within an app that can produce a numeric value

Page 53: See through software

Dashboarding architecture

Metrics collection API: decouples the collection of metrics from their publishing; generally still part of the app
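
A minimal sketch of that decoupling, assuming a hypothetical in-process `Metrics` class: application code only calls `increment`, and a separate `flush` step hands the accumulated values to whatever publisher is plugged in. Here the publisher just formats StatsD-style counter lines (`name:value|c`) into a list; sending them over the network is left out.

```python
import threading
from collections import Counter

class Metrics:
    """In-app collection API: cheap to call, unaware of where data goes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = Counter()

    def increment(self, name, amount=1):
        """Called from application code wherever something countable happens."""
        with self._lock:
            self._counts[name] += amount

    def flush(self, publish):
        """Hand everything collected so far to a publisher, then start over."""
        with self._lock:
            snapshot, self._counts = self._counts, Counter()
        for name, value in snapshot.items():
            publish(name, value)

metrics = Metrics()
metrics.increment("logins")
metrics.increment("logins")
metrics.increment("orders", 5)

# Stand-in publisher: collect StatsD-style counter lines instead of sending them.
published = []
metrics.flush(lambda name, value: published.append(f"{name}:{value}|c"))
print(sorted(published))
```

Because the publisher is passed in, the same collection calls could feed a different backend without touching application code.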

Page 54: See through software

Dashboarding architecture

Stats Aggregator: an out-of-process component that creates aggregate data points from a stream of metrics
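
To make "aggregate data points from a stream" concrete, here is a small sketch that buckets raw samples into fixed intervals and emits one summary per metric per interval. The 10-second interval, the `(timestamp, name, value)` sample shape and the chosen statistics are assumptions for the example.

```python
from collections import defaultdict

def aggregate(samples, interval=10):
    """Collapse (timestamp, name, value) samples into per-interval summaries."""
    buckets = defaultdict(list)
    for ts, name, value in samples:
        bucket_start = int(ts // interval) * interval
        buckets[(bucket_start, name)].append(value)
    return {
        key: {"count": len(values), "mean": sum(values) / len(values), "max": max(values)}
        for key, values in sorted(buckets.items())
    }

# Three raw timings arrive; two fall in the first interval, one in the second.
samples = [(0.5, "login_ms", 120), (3.2, "login_ms", 80), (12.0, "login_ms", 400)]
points = aggregate(samples)
for key, summary in points.items():
    print(key, summary)
```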

Page 55: See through software

Dashboarding architecture

Metrics clearinghouse: hold data and standardize access

Page 56: See through software

Dashboarding architecture

Visualization: allows a user to build and correlate graphs

Page 57: See through software

Dashboarding architecture

Dashboarding: allows a user to share a distilled vision of data

Page 58: See through software

Dashboarding on private networks

StatsD + Graphite

Page 59: See through software

Dashboarding in the cloud

Page 60: See through software

Developing apps with dashboarding in mind

Page 61: See through software

• Collect and report everything that’s “free”

• Collect and report deep, valuable application metrics at runtime

• Understand aggregation and know when to apply it

• Be aware of multiplicative effects of metrics collection on bandwidth, storage and billing
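
The multiplicative effect in the last bullet is easy to underestimate, so a back-of-the-envelope calculation may help. Every number below (host count, metrics per host, reporting interval, bytes per stored point) is an assumption chosen for illustration; substitute your own.

```python
SECONDS_PER_DAY = 86_400

def series_count(hosts, metrics_per_host):
    """Every metric on every host is its own time series."""
    return hosts * metrics_per_host

def points_per_day(series, interval_seconds):
    """Each series produces one data point per reporting interval."""
    return series * (SECONDS_PER_DAY // interval_seconds)

series = series_count(hosts=200, metrics_per_host=50)
daily_points = points_per_day(series, interval_seconds=10)
daily_bytes = daily_points * 12  # assumed 12 bytes per stored point

print(f"{series:,} series")
print(f"{daily_points:,} points per day")
print(f"{daily_bytes / 1e9:.2f} GB per day before any retention policy")
```

Adding one more dimension, such as a per-client tag, multiplies the series count again, which is why it helps to decide deliberately what to aggregate before it is stored.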

Page 62: See through software

ScoreKeeper

Gather metrics from existing data sources into StatsD/Graphite

Page 63: See through software

See-through software

• Lets the people whose jobs depend on software understand what it's doing and how well it's doing it

• Empowers people to ask their own questions and share their insights

Page 64: See through software

Help teams become more successful

• By understanding when there's a problem

• By focusing energy where it's needed most

• By talking to customers in a competent and informed way

Page 65: See through software

@DataMiller