Tick
Uploaded by vincenzo-ferrari
Transcript of Tick
TICK
THE TIME-SERIES STACK
WELCOME
FIRST THINGS FIRST
WHO ARE WE?
GOLANG MEETUP MILANO
▸ we are gophers
▸ we produce golang-related events
▸ we’re looking for more gophers (I mean, speaking gophers!)
WHO AM I?
VINCENZO {WILK} FERRARI
▸ early adopter
▸ fullstack developer
▸ software engineer @ ibuildings
▸ open source producer/consumer
IBUILDINGS
WEB & MOBILE APP DEVELOPMENT
TAG
TALENT GARDEN
GDG
GOOGLE DEVELOPERS GROUP
TIME SERIES
WHAT ARE TIME SERIES?
TIME SERIES EXPLAINED
▸ A time series is a series of data points indexed (or listed or graphed) in time order
▸ Time series are used in statistics, signal processing, pattern recognition, weather forecasting and in nearly any domain of applied science and engineering
▸ Time series are used for analysis and forecasting
SOME QUESTIONS
TIME SERIES IN PRACTICE
▸ What offer is best to present to a visitor based on their behaviour, in real-time?
▸ What patterns can I detect in financial markets that I can use to execute faster, more intelligent transactions?
▸ Can I predict how long visitors will stay and why they’ll drop off?
▸ Can I track sensors on my fleet vehicles over time to optimise delivery schedules and fuel economy?
▸ Can I predict if my elastic infrastructure will scale on events like Black Friday?
▸ Can I weave through petabytes of machine to machine data over time to detect malicious patterns in my network?
▸ Can I increase the yield of my crops and lower my costs by adjusting water and fertiliser in real-time, based on environmental conditions?
USE CASES
A CONCRETE APPLICATION
USE CASES
IN REAL LIFE
▸ eBay’s Experimentation environment enables users to answer important analytics and business questions.
▸ Mozilla uses InfluxDB to store important metrics about the performance of applications and devices running Firefox OS
▸ Facile.it collects a huge amount of customer data, which triggers a number of actions an order of magnitude larger than the data itself
INFLUXDATA
INTRODUCING
WHY INFLUXDATA?
INFLUXDATA FEATURES
▸ Ease of Use – up and running in minutes, not days or weeks
▸ Scalable – write thousands of points per second, store billions of points for analysis
▸ Open Source – MIT licensed, extensible by design
▸ Integrated – Data collection, storage, visualisation and alerting designed to work together seamlessly
▸ Highly Available – Platform components can be distributed and clustered
▸ Real-Time Downsampling – Continuous queries downsample large amounts of data on the fly and store the precomputed results
▸ Efficient Storage – High compression and retention policies lower storage footprints and costs
▸ Purpose Built – InfluxData is designed from the ground up to do one thing: manage time-series data at scale
▸ Choice of Deployment – Run InfluxData in your datacenter, a public cloud or on our managed hosting service
TICK STACK
ONE STACK TO RULE THEM ALL
TICK: TELEGRAF
TELEGRAF
▸ Telegraf is an agent written in Go for collecting metrics and writing them into InfluxDB or other supported outputs.
▸ It’s a standalone module
▸ It can be customised with a configuration file
TICK: TELEGRAF
TELEGRAF INPUTS
▸ Apache
▸ CouchDB
▸ Graylog
▸ httpjson
▸ phpfpm
▸ Varnish
▸ system
TICK: TELEGRAF
TELEGRAF SERVICE INPUTS
▸ Kafka consumer
▸ MQTT consumer
▸ TCP listener
▸ UDP listener
▸ GitHub WebHooks
TICK: TELEGRAF
TELEGRAF OUTPUTS
▸ InfluxDB
▸ AMQP
▸ AWS CloudWatch
▸ File
▸ Kafka
▸ MQTT
▸ Graylog
TICK: TELEGRAF
TELEGRAF CONFIGURATION SAMPLE
[agent]
  interval = "10s"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  flush_interval = "10s"

[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]
  database = "kapacitor_example"
  timeout = "5s"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  fielddrop = ["time_*"]
TICK: INFLUXDB
INFLUXDB
▸ InfluxDB is a time series database built from the ground up to handle high write and query loads.
▸ InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.
TICK: INFLUXDB
FEATURES
▸ Dedicated query language (InfluxQL, the Influx Query Language)
▸ NoSQL, schemaless database
▸ Retention policies defined per database
▸ Replication factor set by each retention policy
▸ HTTP REST API
TICK: INFLUXDB
QUERY LANGUAGE
▸ InfluxDB has its own DSL (Domain Specific Language) to query and explore data
▸ It has a set of functions to use in conjunction with the common SELECT statement
▸ Mathematical operations can be performed directly in the query
▸ Continuous Queries are InfluxQL queries that run automatically and periodically on realtime data and store query results in a specified measurement.
TICK: INFLUXDB
TOOLS
▸ REST API
▸ CLI/Shell
▸ Web Admin Interface
TICK: CHRONOGRAF
CHRONOGRAF
▸ Chronograf is a graphing and visualization application that you use to perform ad hoc exploration of your InfluxDB data.
▸ It’s a standalone module used in conjunction with InfluxDB
TICK: CHRONOGRAF
KEY FEATURES
▸ Simple installation
▸ Smart query builder designed to work with large datasets
▸ Collect multiple graphs into dashboards
▸ Support for templates
TICK: KAPACITOR
KAPACITOR
▸ Kapacitor is an open source data processing framework that makes it easy to create alerts, run ETL jobs and detect anomalies.
▸ It’s a standalone module
▸ It can be configured with a Domain Specific Language (DSL)
TICK: KAPACITOR
KEY FEATURES
▸ Process both streaming data and batch data.
▸ Query data from InfluxDB on a schedule, and receive data via the line protocol and any other method InfluxDB supports.
▸ Perform any transformation currently possible in InfluxQL.
▸ Store transformed data back in InfluxDB.
▸ Add custom user defined functions to detect anomalies.
▸ Integrate with HipChat, OpsGenie, Alerta, Sensu, PagerDuty, Slack, and more.
TICK: KAPACITOR
TICKSCRIPT
▸ The TICKscript language is an invocation chaining language. Each script has a flat scope and each variable in the scope defines methods that can be called on it.
▸ Kapacitor uses TICKscripts to define data processing pipelines. A pipeline is a set of nodes that process data and edges that connect the nodes. Pipelines in Kapacitor are directed acyclic graphs (DAGs), meaning each edge has a direction in which data flows and there cannot be any cycles in the pipeline.
TICK: KAPACITOR
EXAMPLE
stream
    |from()
        .measurement('app')
    |eval(lambda: "errors" / "total")
        .as('error_percent')
    |influxDBOut()
        .database('mydb')
        .retentionPolicy('myrp')
        .measurement('errors')
        .tag('kapacitor', 'true')
        .tag('version', '0.2')
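For readers more at home in Go than in TICKscript, the sketch below mimics what the example does to a single point: divide the `errors` field by the `total` field, store the result as `error_percent`, and tag the output. The `Point` type and `errorPercent` function are hypothetical, skipping points with a missing or zero `total` is an assumption of this sketch, and keeping only the computed field is a simplification rather than a statement about Kapacitor's behaviour.

```go
package main

import "fmt"

// Point is a simplified stand-in for a point flowing through the pipeline.
type Point struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]float64
}

// errorPercent mirrors the eval and influxDBOut steps of the TICKscript example.
// It returns false when the input cannot be evaluated (assumption of this sketch).
func errorPercent(in Point) (Point, bool) {
	errs, okErr := in.Fields["errors"]
	total, okTotal := in.Fields["total"]
	if !okErr || !okTotal || total == 0 {
		return Point{}, false
	}

	// Carry the incoming tags over and add the static ones from the script.
	tags := make(map[string]string, len(in.Tags)+2)
	for k, v := range in.Tags {
		tags[k] = v
	}
	tags["kapacitor"] = "true"
	tags["version"] = "0.2"

	return Point{
		Measurement: "errors",
		Tags:        tags,
		Fields:      map[string]float64{"error_percent": errs / total},
	}, true
}

func main() {
	in := Point{
		Measurement: "app",
		Tags:        map[string]string{"host": "web01"},
		Fields:      map[string]float64{"errors": 1, "total": 4},
	}
	if out, ok := errorPercent(in); ok {
		fmt.Println(out.Measurement, out.Fields["error_percent"], out.Tags["kapacitor"])
		// prints: errors 0.25 true
	}
}
```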
TICK WORKFLOW
LET’S COMBINE ALL THE PIECES
DEMO TIME
LET’S BREAK SOMETHING