Tick

TICK THE TIME-SERIES STACK

Transcript of Tick

Page 1: Tick

TICK

THE TIME-SERIES STACK

Page 2: Tick

WELCOME

FIRST THINGS FIRST

Page 3: Tick

WHO ARE WE?

GOLANG MEETUP MILANO

▸ we are gophers

▸ we produce golang-related events

▸ we’re looking for more gophers (I mean, speaking gophers!)

Page 4: Tick

WHO AM I?

VINCENZO {WILK} FERRARI

▸ early adopter

▸ fullstack developer

▸ software engineer @ ibuildings

▸ open source producer/consumer

Page 5: Tick
Page 6: Tick

IBUILDINGS

WEB & MOBILE APP DEVELOPMENT

Page 7: Tick

TAG

TALENT GARDEN

Page 8: Tick

GDG

GOOGLE DEVELOPERS GROUP

Page 9: Tick

TIME SERIES

Page 10: Tick

WHAT ARE TIME SERIES?

TIME SERIES EXPLAINED

▸ A time series is a series of data points indexed (or listed or graphed) in time order

▸ Time series are used in statistics, signal processing, pattern recognition, weather forecasting and largely in any domain of applied science and engineering

▸ Time series are used for analysis and forecasting

Page 11: Tick

SOME QUESTIONS

TIME SERIES IN PRACTICE

▸ What offer is best to present to a visitor based on their behaviour, in real-time?

▸ What patterns can I detect in financial markets that I can use to execute faster, more intelligent transactions?

▸ Can I predict how long visitors will stay and why they’ll drop off?

▸ Can I track sensors on my fleet vehicles over time to optimise delivery schedules and fuel economy?

▸ Can I predict if my elastic infrastructure will scale on events like Black Friday?

▸ Can I weave through petabytes of machine to machine data over time to detect malicious patterns in my network?

▸ Can I increase the yield of my crops and lower my costs by adjusting water and fertiliser in real-time, based on environmental conditions?

Page 12: Tick
Page 13: Tick

USE CASES

A CONCRETE APPLICATION

Page 14: Tick

USE CASES

IN REAL LIFE

▸ eBay’s Experimentation environment enables users to answer important analytics and business questions.

▸ Mozilla uses InfluxDB to store important metrics about the performance of applications and devices running Firefox OS

▸ Facile.it collects a huge amount of data from its customers; that data triggers a number of actions an order of magnitude larger than the data itself

Page 15: Tick

INFLUXDATA

INTRODUCING

Page 16: Tick

WHY INFLUXDATA?

INFLUXDATA FEATURES

▸ Ease of Use – up and running in minutes, not days or weeks

▸ Scalable – write thousands of points per second, store billions of points for analysis

▸ Open Source – MIT licensed, extensible by design

▸ Integrated – Data collection, storage, visualisation and alerting designed to work together seamlessly

▸ Highly Available – Platform components can be distributed and clustered

▸ Real-Time Downsampling – Continuous queries precompute aggregates over large amounts of data on the fly and write the results back

▸ Efficient Storage – High compression and retention policies lower storage footprints and costs

▸ Purpose Built – InfluxData is designed from the ground up to do one thing: manage time-series data at scale

▸ Choice of Deployment – Run InfluxData in your datacenter, a public cloud or on our managed hosting service

Page 17: Tick

TICK STACK

ONE STACK TO RULE THEM ALL

Page 18: Tick

TICK: TELEGRAF

TELEGRAF

▸ Telegraf is an agent written in Go for collecting metrics and writing them into InfluxDB or other supported outputs.

▸ It’s a standalone module

▸ It can be customised with a configuration file

Page 19: Tick

TICK: TELEGRAF

TELEGRAF INPUTS

▸ Apache

▸ CouchDB

▸ Graylog

▸ httpjson

▸ phpfpm

▸ Varnish

▸ system

Page 20: Tick

TICK: TELEGRAF

TELEGRAF SERVICE INPUTS

▸ Kafka consumer

▸ MQTT consumer

▸ TCP listener

▸ UDP listener

▸ GitHub WebHooks

Page 21: Tick

TICK: TELEGRAF

TELEGRAF OUTPUTS

▸ InfluxDB

▸ AMQP

▸ AWS CloudWatch

▸ File

▸ Kafka

▸ MQTT

▸ Graylog

Page 22: Tick

TICK: TELEGRAF

TELEGRAF CONFIGURATION SAMPLE

[agent]
  interval = "10s"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  flush_interval = "10s"

[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]
  database = "kapacitor_example"
  timeout = "5s"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  fielddrop = ["time_*"]

Page 23: Tick

TICK: INFLUXDB

INFLUXDB

▸ InfluxDB is a time series database built from the ground up to handle high write and query loads.

▸ InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.

Page 24: Tick

TICK: INFLUXDB

FEATURES

▸ Dedicated query language (InfluxQL, the Influx Query Language)

▸ NoSQL and schemaless database

▸ Retention policies defined per database

▸ Replication factor set by each retention policy

▸ HTTP REST API
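To give an idea of what the HTTP API looks like from Go, here is a minimal sketch that posts one line-protocol record to a `/write` endpoint in the style of InfluxDB 1.x, where a successful write is answered with `204 No Content`. So that the example runs without a database, it talks to a local stand-in server; `writeLine`, `fakeInflux`, the database name and the sample point are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// writeLine POSTs one line-protocol record to baseURL/write?db=<db>.
// Any status other than 204 No Content is reported as an error.
func writeLine(baseURL, db, line string) error {
	endpoint := baseURL + "/write?db=" + url.QueryEscape(db)
	resp, err := http.Post(endpoint, "text/plain; charset=utf-8", strings.NewReader(line))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusNoContent {
		body, _ := io.ReadAll(resp.Body)
		return fmt.Errorf("write failed: %s: %s", resp.Status, strings.TrimSpace(string(body)))
	}
	return nil
}

// fakeInflux is a stand-in server for this demo only (not a real InfluxDB).
// It accepts POST /write?db=<name>, remembers the body and replies 204.
func fakeInflux(received *string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/write" || r.URL.Query().Get("db") == "" {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		body, _ := io.ReadAll(r.Body)
		*received = string(body)
		w.WriteHeader(http.StatusNoContent)
	}))
}

func main() {
	var got string
	srv := fakeInflux(&got)
	defer srv.Close()

	// Line protocol: measurement,tag_set field_set timestamp(ns). Sample values are made up.
	line := "cpu,host=server01 usage_idle=97.5 1465839830100400200"
	if err := writeLine(srv.URL, "mydb", line); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stored:", got)
}
```

Pointing `writeLine` at a real instance should only require swapping `srv.URL` for the server address, for example the `http://influxdb:8086` URL used in the Telegraf configuration sample.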

Page 25: Tick

TICK: INFLUXDB

QUERY LANGUAGE

▸ InfluxDB has its own DSL (Domain Specific Language) to query and explore data

▸ It has a set of functions to use in conjunction with the common SELECT statement

▸ Mathematical operations can be done directly in the query

▸ Continuous Queries are InfluxQL queries that run automatically and periodically on real-time data and store query results in a specified measurement.

Page 26: Tick

TICK: INFLUXDB

TOOLS

▸ REST API

▸ CLI/Shell

▸ Web Admin Interface

Page 27: Tick

TICK: CHRONOGRAF

CHRONOGRAF

▸ Chronograf is a graphing and visualization application that you use to perform ad hoc exploration of your InfluxDB data.

▸ It’s a standalone module used in conjunction with InfluxDB

Page 28: Tick

TICK: CHRONOGRAF

KEY FEATURES

▸ Simple installation

▸ Smart query builder designed to work with large datasets

▸ Collect multiple graphs into dashboards

▸ Support for templates

Page 29: Tick
Page 30: Tick

TICK: KAPACITOR

KAPACITOR

▸ Kapacitor is an open source data processing framework that makes it easy to create alerts, run ETL jobs and detect anomalies.

▸ It’s a standalone module

▸ It can be configured with a Domain Specific Language (DSL)

Page 31: Tick

TICK: KAPACITOR

KEY FEATURES

▸ Process both streaming data and batch data.

▸ Query data from InfluxDB on a schedule, and receive data via the line protocol and any other method InfluxDB supports.

▸ Perform any transformation currently possible in InfluxQL.

▸ Store transformed data back in InfluxDB.

▸ Add custom user defined functions to detect anomalies.

▸ Integrate with HipChat, OpsGenie, Alerta, Sensu, PagerDuty, Slack, and more.

Page 32: Tick

TICK: KAPACITOR

TICKSCRIPT

▸ The TICKscript language is an invocation chaining language. Each script has a flat scope and each variable in the scope defines methods that can be called on it.

▸ Kapacitor uses TICKscripts to define data processing pipelines. A pipeline is a set of nodes that process data and edges that connect the nodes. Pipelines in Kapacitor are directed acyclic graphs (DAGs), meaning each edge has a direction in which data flows and there cannot be any cycles in the pipeline.
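The "no cycles" rule can be made concrete with a short Go sketch. It stores a pipeline as a map from each node to its downstream nodes and uses Kahn's topological sort to either produce a valid processing order or report a cycle. The `Pipeline` type and the node names are hypothetical; this is not how Kapacitor itself is implemented.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Pipeline maps a node name to the names of the nodes it feeds.
// (Hypothetical representation, for illustration only.)
type Pipeline map[string][]string

// Order returns the nodes so that every node appears after all of its
// upstream nodes. It returns an error if the graph contains a cycle.
func (p Pipeline) Order() ([]string, error) {
	// Count incoming edges for every node that appears in the graph.
	indegree := map[string]int{}
	for node, children := range p {
		indegree[node] += 0 // make sure source nodes are registered
		for _, c := range children {
			indegree[c]++
		}
	}
	// Start from nodes nothing points to; sort them for a stable result.
	var ready []string
	for node, d := range indegree {
		if d == 0 {
			ready = append(ready, node)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		node := ready[0]
		ready = ready[1:]
		order = append(order, node)
		for _, c := range p[node] {
			indegree[c]--
			if indegree[c] == 0 {
				ready = append(ready, c)
			}
		}
	}
	// Nodes on a cycle never reach indegree zero, so they are never emitted.
	if len(order) != len(indegree) {
		return nil, errors.New("pipeline contains a cycle")
	}
	return order, nil
}

func main() {
	ok := Pipeline{"stream": {"from"}, "from": {"eval"}, "eval": {"influxDBOut"}}
	fmt.Println(ok.Order())

	bad := Pipeline{"from": {"eval"}, "eval": {"from"}}
	fmt.Println(bad.Order())
}
```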

Page 33: Tick

TICK: KAPACITOR

EXAMPLE

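// Read points from the 'app' measurement as they arrive
stream
    |from()
        .measurement('app')
    // Compute the ratio of the "errors" field to the "total" field
    |eval(lambda: "errors" / "total")
        .as('error_percent')
    // Write the result back to InfluxDB with two extra tags
    |influxDBOut()
        .database('mydb')
        .retentionPolicy('myrp')
        .measurement('errors')
        .tag('kapacitor', 'true')
        .tag('version', '0.2')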
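For gophers who prefer to read Go, the same three steps (select a measurement, evaluate a ratio, write out with extra tags) can be mimicked with chained method calls. This is only a sketch of the idea of invocation chaining: the `Point` and `Stream` types and their methods are made up for this example, they are not Kapacitor's API, and nothing is actually written to InfluxDB.

```go
package main

import "fmt"

// Point is a simplified data point: measurement name, tags and numeric fields.
type Point struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]float64
}

// Stream wraps a slice of points so that processing steps can be chained.
type Stream struct{ points []Point }

// From keeps only the points of the given measurement.
func (s Stream) From(measurement string) Stream {
	var out []Point
	for _, p := range s.points {
		if p.Measurement == measurement {
			out = append(out, p)
		}
	}
	return Stream{out}
}

// EvalRatio computes Fields[num] / Fields[den] and stores it under the name as.
// Points missing either field, or with a zero denominator, are skipped.
// In this sketch only the computed field is kept on the resulting point.
func (s Stream) EvalRatio(num, den, as string) Stream {
	var out []Point
	for _, p := range s.points {
		n, okN := p.Fields[num]
		d, okD := p.Fields[den]
		if !okN || !okD || d == 0 {
			continue
		}
		out = append(out, Point{p.Measurement, p.Tags, map[string]float64{as: n / d}})
	}
	return Stream{out}
}

// Out stands in for the output step: it renames the measurement and merges
// extra tags into a copy of each point's tags, then returns the points.
func (s Stream) Out(measurement string, extra map[string]string) []Point {
	out := make([]Point, 0, len(s.points))
	for _, p := range s.points {
		tags := map[string]string{}
		for k, v := range p.Tags {
			tags[k] = v
		}
		for k, v := range extra {
			tags[k] = v
		}
		out = append(out, Point{measurement, tags, p.Fields})
	}
	return out
}

func main() {
	// Made-up input: one 'app' point and one point from another measurement.
	in := Stream{[]Point{
		{"app", map[string]string{"host": "a"}, map[string]float64{"errors": 5, "total": 100}},
		{"db", nil, map[string]float64{"errors": 1, "total": 2}},
	}}
	res := in.From("app").
		EvalRatio("errors", "total", "error_percent").
		Out("errors", map[string]string{"kapacitor": "true", "version": "0.2"})
	for _, p := range res {
		fmt.Println(p.Measurement, p.Tags, p.Fields)
	}
}
```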

Page 34: Tick

TICK WORKFLOW

LET’S COMBINE ALL THE PIECES

Page 35: Tick

DEMO TIME

LET’S BREAK SOMETHING