PNDA - Platform for Network Data Analytics


Transcript of PNDA - Platform for Network Data Analytics

Page 1: PNDA - Platform for Network Data Analytics

PNDA

Page 2: PNDA - Platform for Network Data Analytics

Network data is becoming a big data problem

• Volume of network data runs into terabytes

• Siloed data limits ability to perform correlation and causal analysis

• Relational databases limit the ability to

• Application of big data analytics to the network dataset is key to providing both real-time and historical insights

• Data science is driving the bifurcation of the OSS stack

3-fold increase in total IP traffic

>60% increase in devices and connections

Telemetry data streamed in near real-time

Source: Cisco VNI/GCI Global IP Traffic Forecast

Page 3: PNDA - Platform for Network Data Analytics

What changes?

                               PMO                      FMO
Orientation                    Single domain            Cross domain
Realisation                    Small data, tool driven  Big data, data driven
Data collection                Polled                   Streamed
Data aggregation and analysis  Coupled                  Decoupled
Domain data schema             Schema-on-write          Schema-on-read
Analysis                       Prescriptive             Prescriptive + Stochastic + ML
Customisation                  Design time              Run time
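
The schema-on-write versus schema-on-read row can be illustrated with a small sketch. The record contents, field names and helper functions below are hypothetical and exist only for this illustration; they are not taken from the slides. With schema-on-write, a record that does not match the schema is rejected at ingest; with schema-on-read, everything is stored raw and each query projects the fields it needs.

```python
import json

# Hypothetical raw telemetry records, stored exactly as received.
RAW_STORE = [
    '{"host": "r1", "if": "ge-0/0/0", "in_octets": 1200}',
    '{"host": "r2", "cpu": 37.5}',
]


def write_with_schema(record, required=("host", "if", "in_octets")):
    """Schema-on-write: validate at ingest and reject what does not fit."""
    missing = [key for key in required if key not in record]
    if missing:
        raise ValueError(f"rejected at write time, missing {missing}")
    return record


def read_with_schema(raw_lines, fields):
    """Schema-on-read: keep everything, project fields at query time."""
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in fields):
            yield {field: record[field] for field in fields}


# Two different "schemas" applied to the same raw store at read time.
cpu_view = list(read_with_schema(RAW_STORE, ("host", "cpu")))
octet_view = list(read_with_schema(RAW_STORE, ("host", "in_octets")))
```

In this sketch the CPU record would have been lost under the fixed write-time schema, whereas the read-time approach lets a later analysis use it without re-ingesting anything.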

Page 4: PNDA - Platform for Network Data Analytics

Today’s siloed analytics pipelines

• Tight coupling of data aggregation/store/analysis

• Multiple analytics pipelines implemented from open source components

• Common design patterns: ~75% of effort wasted / duplicated

• Silos limit the potential of big data analytics and lead to industry divergence

[Diagram labels: Data sources (Telemetry, Metrics, Logs); Data aggregation; Data store; Data analysis; Query; Outputs (Dashboard & Reporting). Components shown: Kafka, Streamsets, NiFi, HDFS, MapR, HBase, Spark Streaming, Storm, Impala]

Page 5: PNDA - Platform for Network Data Analytics

What is PNDA?

PNDA brings together a number of open source technologies to provide a simple, scalable open big data analytics Platform for Network Data Analytics

Linux Foundation Collaborative Project based on the Apache ecosystem

Page 6: PNDA - Platform for Network Data Analytics

• Simple, scalable open data platform

• Provides a common set of services for developing analytics applications

• Accelerates the process of developing big data analytics applications whilst significantly reducing the TCO

• PNDA provides a platform for convergence of network data analytics

[Diagram: PNDA architecture. Labels: PNDA Plugins (ODL, Logstash, OpenBMP, pmacct, XR Telemetry); PNDA Producer API; Real-time Data Distribution; File Store; Stream and Batch Processing; Query, Visualisation and Exploration (SQL Query, OLAP Cube, Search/Lucene, NoSQL, Time Series, Data Exploration, Metric Visualisation, Event Visualisation); PNDA Consumer API; PNDA Applications (PNDA Managed Apps, Unmanaged Apps); Platform Services: Installation, Mgmt, Security, Data Privacy; App Packaging and Mgmt]

Page 7: PNDA - Platform for Network Data Analytics

• Horizontally scalable platform for analytics and data processing applications

• Support for near-real-time stream processing and in-depth batch analysis on massive datasets

• PNDA decouples data aggregation from data analysis

• Consuming applications can be either platform apps developed for PNDA or client apps integrated with PNDA

• Client apps can use one of several structured query interfaces or consume streams directly

• Leverages best current practice in big data analytics
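
The decoupling of data aggregation from data analysis can be pictured with a toy append-only log. This is a minimal sketch, not PNDA code: the `TopicLog` class, the consumer names and the records are all hypothetical. Producers only append, and each consuming application keeps its own read position, so a stream app and a batch app can read the same data at different times without knowing about each other or about the producers.

```python
from collections import defaultdict


class TopicLog:
    """Append-only log with one independent read offset per consumer."""

    def __init__(self):
        self._records = []
        self._offsets = defaultdict(int)

    def publish(self, record):
        # Producers append and never need to know who will read.
        self._records.append(record)

    def poll(self, consumer):
        # Each consumer resumes from its own offset.
        start = self._offsets[consumer]
        batch = self._records[start:]
        self._offsets[consumer] = len(self._records)
        return batch


log = TopicLog()
log.publish({"src": "syslog", "msg": "link down"})
log.publish({"src": "netflow", "bytes": 4096})

stream_batch = log.poll("stream-app")   # sees the first two records
log.publish({"src": "syslog", "msg": "link up"})
batch_batch = log.poll("batch-app")     # sees all three records
stream_next = log.poll("stream-app")    # sees only the new record
```

A distributed log such as Apache Kafka provides the same property at scale, with durable storage and partitioning that this sketch leaves out.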

[Diagram: PNDA architecture, repeated from page 6]

Page 8: PNDA - Platform for Network Data Analytics

Why PNDA?

There is a bewildering number of big data technologies out there, so how do you decide what to use?

We've evaluated and chosen the best tools, based on technical capability and community support.

PNDA combines them to streamline the process of developing data processing applications.

Page 9: PNDA - Platform for Network Data Analytics

PNDA Technologies

Page 10: PNDA - Platform for Network Data Analytics

Why PNDA?

Innovation in the big data space is extremely rapid, but combining multiple technologies into an end-to-end solution can be extremely complex and time-consuming

PNDA removes this complexity and allows you to focus on developing the analytics applications, not on developing the pipeline – significantly reducing the effort required and time-to-insight

Page 11: PNDA - Platform for Network Data Analytics

PNDA Software Components

Page 12: PNDA - Platform for Network Data Analytics

PNDA 3.4 Capabilities

• Platform for data aggregation, distribution, processing and storage

• Automated installation, creation, and configuration
  • OpenStack, AWS and bare metal
  • Typical install ~1hr
  • Modular install

• Open producer and consumer APIs
  • Avro platform schema

• Plugins for Logstash, pmacct, OpenBMP, OpenDaylight, Cisco XR-telemetry, bulk ingest …

• Data distribution – Apache Kafka

• Data store:
  • Automated data partitioning and storage (HDFS)
  • OpenTSDB – time series analysis
  • HBase – NoSQL

• Support for batch and stream processing:
  • Apache Spark and Spark Streaming

• Jupyter notebook server for app prototyping and data exploration

• Impala-based SQL query support

• Grafana for time series visualisation

• PNDA application packaging

• PNDA management and dashboard
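
As an illustration of what automated data partitioning can look like, the sketch below derives a storage directory from an event's source and timestamp. The directory layout, the `/datasets` root and the `partition_path` function are assumptions made for this example and are not taken from the slides; the layout PNDA actually writes to HDFS may differ.

```python
from datetime import datetime, timezone


def partition_path(src, timestamp_ms, root="/datasets"):
    """Build a source- and time-based partition directory (hypothetical layout)."""
    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return (
        f"{root}/source={src}"
        f"/year={ts.year:04d}/month={ts.month:02d}"
        f"/day={ts.day:02d}/hour={ts.hour:02d}"
    )


# 1500000000000 ms since the epoch is 2017-07-14 02:40:00 UTC.
example = partition_path("netflow", 1500000000000)
# example == "/datasets/source=netflow/year=2017/month=07/day=14/hour=02"
```

Partitioning by source and hour lets a batch job read only the directories that match the source and time window it cares about, instead of scanning the whole store.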

Page 13: PNDA - Platform for Network Data Analytics

PNDA Console

• The PNDA console provides a dashboard across all components in a cluster

• Inbuilt platform test agents verify the operation of all components

• Active platform testing verifies the end-to-end data pipeline

Page 14: PNDA - Platform for Network Data Analytics

Publishing Data to PNDA

• Ingested data should be encapsulated in PNDA Avro schema and published on a pre-defined Kafka topic or set of topics
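
To make the encapsulation step concrete, here is a minimal sketch of Avro binary encoding using only the Python standard library. The slides do not list the fields of the PNDA Avro schema, so the four-field record used here (`timestamp`, `src`, `host_ip`, `rawdata`) is an assumption for illustration; consult the PNDA guide for the actual schema. The `publish` helper, the topic name and the `send` callable are also hypothetical: in a real producer `send` would be provided by a Kafka client library.

```python
import time


def encode_long(n):
    """Avro long: zigzag encoding followed by a variable-length integer."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b):
    """Avro bytes: length as a long, then the raw bytes."""
    return encode_long(len(b)) + b


def encode_string(s):
    """Avro string: length as a long, then UTF-8 bytes."""
    return encode_bytes(s.encode("utf-8"))


def encode_event(timestamp_ms, src, host_ip, rawdata):
    """Encode one record; Avro writes fields in schema order, no field names.

    Assumed field order: timestamp (long), src (string),
    host_ip (string), rawdata (bytes).
    """
    return (
        encode_long(timestamp_ms)
        + encode_string(src)
        + encode_string(host_ip)
        + encode_bytes(rawdata)
    )


def publish(send, topic, payload, src, host_ip):
    """Wrap a raw payload and hand it to a caller-supplied send(topic, bytes)."""
    message = encode_event(int(time.time() * 1000), src, host_ip, payload)
    send(topic, message)
    return message


# Stand-in for a Kafka producer: collect messages in a list.
sent = []
publish(lambda topic, msg: sent.append((topic, msg)),
        "example.topic", b'{"cpu": 37.5}', "collectd", "192.0.2.1")
```

Keeping the original payload as opaque bytes inside a small envelope means the platform can route and store any data type without understanding it, which fits the schema-on-read approach described earlier in the deck.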

Page 15: PNDA - Platform for Network Data Analytics

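PNDA Plugins

Data type: BGP (inc. BGP LS)
Data aggregator: OpenBMP
Data aggregator reference: http://www.openbmp.org/#!index.md#Using_Kafka_for_Collector_Integration
PNDA producer plugin reference: http://pnda.io/pnda-guide/producer/openbmp.html

Data type: BGP
Data aggregator: PMACCT (BGP listener)
Data aggregator reference: http://www.pmacct.net/
PNDA producer plugin reference: http://pnda.io/pnda-guide/producer/pmacct.html

Data type: Bulk Ingest
Data aggregator: PNDA Bulk Ingest Tool
Reference: http://pnda.io/pnda-guide/bulkingest/

Data type: ISIS
Data aggregator: PMACCT (ISIS listener)
Data aggregator reference: http://www.pmacct.net/
PNDA producer plugin reference: http://pnda.io/pnda-guide/producer/pmacct.html

Data type: Cisco XR streaming telemetry
Data aggregator: Pipeline
Data aggregator reference: https://github.com/cisco/bigmuddy-network-telemetry-collector

Data type: CollectD (CollectD supports multiple plugins as listed here https://collectd.org/wiki/index.php/Table_of_Plugins)
Data aggregator: Logstash
Data aggregator reference: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-collectd.html
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/

Data type: IoT sensor via HTTP
Data aggregator: Node-RED
Data aggregator reference: https://nodered.org

Data type: Logstash (Logstash supports multiple plugins as listed here https://www.elastic.co/guide/en/logstash/current/input-plugins.html)
Data aggregator: Logstash
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/

Data type: NETCONF Notifications
Data aggregator: ODL
Data aggregator reference: http://www.opendaylight.org/
PNDA producer plugin reference: http://pnda.io/pnda-guide/producer/opendl.html

Data type: Netflow / IPFIX
Data aggregator: Logstash
Data aggregator reference: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-netflow.html
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/

Data type: Netflow / IPFIX / sFlow
Data aggregator: pmacct
Data aggregator reference: http://www.pmacct.net/
PNDA producer plugin reference: http://pnda.io/pnda-guide/producer/pmacct.html

Data type: OpenStack
Status: Work in progress

Data type: sFlow
Data aggregator: Logstash
Data aggregator reference: https://github.com/ashangit/logstash-codec-sflow
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/

Data type: SNMP Metrics and Traps
Data aggregator: ODL
Data aggregator reference: https://wiki.opendaylight.org/view/SNMP_Plugin:Getting_Started
PNDA producer plugin reference: http://pnda.io/pnda-guide/producer/opendl.html

Data type: SNMP Traps
Data aggregator: Logstash
Data aggregator reference: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-snmptrap.html
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/

Data type: Syslog
Data aggregator: Logstash
Data aggregator reference: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-syslog.html
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/

Data type: Syslog (RFC3164 or RFC5424 - needed for newer IOS / IOS XR / NX OS etc.)
Data aggregator: Logstash
Data aggregator reference: https://gist.github.com/donaldh/89b7304981f96497c94fe4d98bb03d71
PNDA producer plugin reference: http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/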

Page 16: PNDA - Platform for Network Data Analytics

Design time vs. runtime

pico

standard

Page 17: PNDA - Platform for Network Data Analytics

BGP Analytics Pipeline

[Diagram labels: BGP, BMP, OpenBMP Collector, Logstash, PNDA Cluster, Kafka, Gobblin, HDFS, Spark, OpenTSDB, Impala, BGP Data Service, UI]

Page 18: PNDA - Platform for Network Data Analytics

PNDA Applied to NFV

[Diagram labels: Infrastructure; Analytics; Data Sources (Alerts, Metrics, Telemetry, Logs); Data Aggregators; Open Data Platform (PNDA); Analytics Applications (Open Source, Custom, Licensed); Inventory; Orchestration; NFVO; VNFM; VIM; NFVI; VNF; Data Center; Core; User; Access; Aggregation; State Data; Context; Network Control; "Related as loosely coupled systems"]

Page 19: PNDA - Platform for Network Data Analytics

Convergence of network data analytics

Operational Intelligence

Planning Intelligence

Security Intelligence

Page 20: PNDA - Platform for Network Data Analytics

What’s coming?

• PNDA 3.5
  • ElasticSearch integration
  • CentOS / RHEL
  • Offline install

• Future
  • Apache Kylin
  • Apache Ambari
  • Containerisation
  • Deep-learning framework
  • Red PNDA – the smallest PNDA yet!

Page 21: PNDA - Platform for Network Data Analytics

Come and join us!

Page 22: PNDA - Platform for Network Data Analytics