Data Science for Effective Network Operations - cisco.com
Cisco Knowledge Series
Sept 6th, 2017
Data Science for Effective Network Operations
© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
Fully Aware • Get more insight into your network and more data out of it
Open Intelligence • Constantly learn, develop and deploy solutions faster and more efficiently
Predictive Control • Make things happen rather than waiting for things to happen
Cisco Automation
Your Competitiveness
Speakers / Panelists
Kiran Inampudi, Solutions Leader - Cisco, GSP Services
Shelly Cadora, Principal Engineer - Cisco, GSP Engineering
Patrick Lin, VP Product & Partnerships, SignalFx
Agenda
Relevance of Data Science for Network Ops
Cloud Scale Networking / Model Driven Telemetry
Modern Monitoring Approaches / SignalFx
Demo & Customer Use case
Q&A
What is Data Science?
Data science involves using automated methods to analyze massive amounts
of data and to extract knowledge from them.
Data science is a broad discipline that includes
• Statistics
• Computer science
• Applied mathematics
• Machine learning/AI
• Visualization
Data Science is a methodology, not a feature!
Data Science for SP Network Operations
Advanced Security
Optimization
Predictive Maintenance
Forecasting
Reduce Alerts
Cloud Scale Networking
NFV/SDN
Real-time
Telemetry Cloud
Figure: analytics maturity curve (Source: SAP)
Vertical axis: Competitive Advantage
Progression: Sense & Respond to Predict & Act
Business Intelligence (Statistics & Data Visualization): Raw Data, Cleaned Data, Standard Reports, Data Exploration & Metrics Reporting
Advanced Analytics (Big Data & Advanced Analytics): Predictive Analytics, Prescriptive Analytics, Cognitive Analytics
Questions answered along the curve: What happened; Why did it happen; What will happen; What is the best way to do this
Cognitive Analytics: Machine tells humans what to do!
What skills are required to succeed?
SP Operations Transformation
SP Domain Expertise
Software Automation
Data Science Expertise
Math & Software Background
SP Ops Workflow
Algorithms & Models
Software Dev Expertise
Open Source Tools
Identify use cases
Poll Question
Current State of Network Operations
Pain points: Troubleshooting complexity, Network outages, Network planning
Caused by: Increasingly dynamic and diverse traffic patterns, Manual operations, Coarse visibility into networks
Leading to: Over-provisioned networks, Poor customer experience, OpEx increase
Network Operations Must Evolve to Cloud Scale
Traditional Network Operations: Inflexible SW, Manual Provisioning, Fragmented Topology View and Complex Routing
Cloud Scale Network Operations: SW Modularity & Extensibility, Automation, Visibility & Control
Day 0: INSTALL
Day 1: CONFIGURE
Day 2: MANAGE & OPTIMIZE
DevOps
• Automated services
• Agile, Simple to scale
• Data Driven, Real-time
Network data is bottlenecked
Where Data Is Created (sensing & measurement): SNMP, CLI, Syslog
Where Data Is Useful (storage & analysis): SNMP Server, Syslog Collector, Scripts
Non real time
Strong burden on back-end
Must normalize different encodings, transports, data models, timestamps
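To make the back-end burden concrete, here is a minimal Python sketch of the normalization work a collector ends up doing when SNMP polls, syslog lines, and CLI scrapes all arrive in different shapes. The record layouts, field names, and the functions `normalize_snmp`, `normalize_syslog`, and `normalize_cli` are hypothetical, invented for illustration; they are not taken from the talk or from any Cisco tool. The sketch also assumes device clocks report UTC.

```python
from datetime import datetime, timezone

def normalize_snmp(sample):
    # Hypothetical SNMP poll result carrying epoch seconds.
    return {
        "device": sample["host"],
        "metric": sample["name"],
        "value": float(sample["value"]),
        "ts": datetime.fromtimestamp(sample["epoch"], tz=timezone.utc),
        "source": "snmp",
    }

def normalize_syslog(line, year):
    # Hypothetical syslog line; the timestamp has no year and no zone,
    # so the year is supplied by the caller and UTC is assumed.
    month, day, clock, host, tag, value = line.split()
    ts = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return {
        "device": host,
        "metric": tag.rstrip(":").lower(),
        "value": float(value),
        "ts": ts.replace(tzinfo=timezone.utc),
        "source": "syslog",
    }

def normalize_cli(host, output, collected_at):
    # Hypothetical scraped CLI text, stamped by the script in local time.
    _, raw = output.split(":")
    return {
        "device": host,
        "metric": "cpu_util",
        "value": float(raw.strip().rstrip("%")),
        "ts": datetime.fromisoformat(collected_at).astimezone(timezone.utc),
        "source": "cli",
    }

records = sorted(
    [
        normalize_cli("router1", "CPU utilization: 71%", "2017-09-06T07:03:10-07:00"),
        normalize_syslog("Sep  6 14:03:07 router1 CPU_UTIL: 73", 2017),
        normalize_snmp({"host": "router1", "name": "cpu_util",
                        "value": "72", "epoch": 1504706585}),
    ],
    key=lambda r: r["ts"],
)

for r in records:
    print(r["ts"].isoformat(), r["source"], r["metric"], r["value"])
```

Each source needs its own parser, its own timestamp handling, and its own unit cleanup before the three readings can be compared on one timeline.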
The New Paradigm – Free the Data
Where Data Is Created (sensing & measurement)
Where Data Is Useful (storage & analysis)
Real time
As much data, as fast, as useful, as easy as possible
Different Ways to Store and Analyze
Approaches: Custom, Open Source, BYO, Black Box
Tools: Kafka, Logstash, ElasticSearch, Kibana, Panda, Prometheus, Grafana
SignalFx (SaaS)
Poll Question
@signalfx | www.signalfx.com
CLOUD NATIVE MONITORING REQUIREMENTS (SUMMARY)
TREND | SOLUTION CRITERIA | TECH REQUIREMENTS
1. Scale-out open source software and microservices architectures | Monitor service-wide aggregate metrics vs. element-specific metrics | Streaming aggregation, composite metrics
2. Dynamic or ephemeral infrastructure, e.g. containers and serverless | Instant discovery with real-time monitoring | Streaming high-resolution metrics with zero-lag discovery
3. Developer choice, decentralized teams | Shared context across distributed teams and instant visibility into any changes in environment | Centralized, self-service, real-time operational intelligence solution
4. High-velocity release cycles | Predictive analytics and proactive alerting to identify emerging trends | Real-time streaming analytics, alerts, anomaly detection, outlier detection
5. Troubleshooting across distributed teams | Advanced real-time correlation and dimensional analysis | Interactive, high-cardinality queries and alerts
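The first row contrasts service-wide aggregates with element-specific metrics. As a rough illustration of streaming aggregation, the Python sketch below rolls per-element datapoints up into per-group totals as each time window closes. The function `streaming_sum`, the `(timestamp, dimensions, value)` tuple layout, and the sample data are assumptions made for this example; this is not how SignalFx implements its analytics engine. The sketch also assumes datapoints arrive in timestamp order.

```python
from collections import defaultdict

def streaming_sum(datapoints, window_s, group_by):
    """Yield (window_start, group, total) each time a window closes.

    Assumes datapoints arrive in timestamp order.
    """
    current = None
    totals = defaultdict(float)
    for ts, dims, value in datapoints:
        window = ts - ts % window_s
        if current is not None and window != current:
            # The previous window is complete: emit its aggregates.
            for group, total in sorted(totals.items()):
                yield current, group, total
            totals.clear()
        current = window
        totals[dims[group_by]] += value
    if current is not None:
        # Flush the final, still-open window.
        for group, total in sorted(totals.items()):
            yield current, group, total

# Hypothetical per-element samples: (seconds, dimensions, value).
stream = [
    (0, {"service": "edge", "host": "r1"}, 5),
    (3, {"service": "edge", "host": "r2"}, 7),
    (4, {"service": "core", "host": "r3"}, 2),
    (12, {"service": "edge", "host": "r1"}, 6),
    (15, {"service": "edge", "host": "r2"}, 4),
]

aggregates = list(streaming_sum(stream, window_s=10, group_by="service"))
for row in aggregates:
    print(row)
```

Grouping by `"host"` instead of `"service"` gives the element-specific view from the same stream, which is the point of carrying dimensions on every datapoint.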
DISCOVER & COLLECT
Open-source based collection and instrumentation that’s future proof for microservice environments
No agent lock-in, ever
100+ supported integrations and counting
New instances visible in SignalFx with minimal lag
MASSIVELY SCALABLE REAL-TIME ARCHITECTURE
Built from the ground up for containers, functions and frequent releases
Architected for real-time monitoring of dynamic and ephemeral environments
Powerful streaming analytics engine enables predictive monitoring and preventative operations
Highly secure, SOC 2 Type II compliance
Highly available (99.9%), 24/7 on-call service operations
“For the vast amount of metrics we produced, SignalFx was the only solution that didn’t fall over. You guys clearly know what you are doing.”
Sam Eaton, Director of Engineering Operations
EXPLORE AND VISUALIZE
Built-in dashboards and Navigator to get started
Explore and gain insights from metrics data using interactive, streaming UI
Customize and annotate metrics dashboard to ensure shared understanding and interpretation
DETECT AND ALERT
Built-in detectors and alert conditions to get started
Configure or customize to detect exactly the pattern, trend or outlier you need to know
Confirm selected alert condition works as intended with preview
Notify via email, Slack, PagerDuty, etc.
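The detect-and-preview idea can be sketched in a few lines of Python: define an outlier condition, then replay it over past data to see which points would have fired before wiring it to any notification channel. The function `detect_outliers`, its trailing-window rule, and the sample CPU readings are assumptions chosen for illustration; SignalFx's built-in detectors are not described in this deck and are not reproduced here.

```python
from collections import deque
from statistics import mean, pstdev

def detect_outliers(values, window=5, threshold=3.0):
    """Return indices of values that deviate from the trailing-window
    mean by more than `threshold` standard deviations."""
    history = deque(maxlen=window)
    alerts = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # Skip flat windows, where sigma is zero and any change
            # would otherwise count as an outlier.
            if sigma > 0 and abs(v - mu) > threshold * sigma:
                alerts.append(i)
        history.append(v)
    return alerts

# Hypothetical historical CPU readings (percent) used as the preview set.
cpu_history = [40, 42, 41, 39, 40, 41, 95, 42, 40]

fired = detect_outliers(cpu_history, window=5, threshold=3.0)
print("alerts would have fired at indices:", fired)
```

Replaying the condition this way shows whether the window and threshold are too noisy or too quiet for the metric before anyone gets paged.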
CORRELATE AND TRIAGE
Shared visibility into apps and infra (and how they are related)
Use SignalFx's interactive analytics capabilities to further correlate and troubleshoot
Collaborate with team, share work-in-progress
Create new detectors and dashboards (or edit existing ones) as part of post-mortem
ENTERPRISE ADMIN
Centralize, simplify subscription management
Enforce change control over custom content
Organize dashboards and alerts by team
Solution Overview - Cisco Telemetry + SignalFx
SignalFx™: Streaming & Historical Analytics
Visibility
Timely Alerts
Analytics
Advanced Troubleshooting
Real time streaming
Platform Stability - Metrics: Temperature, Power, Fan Speed, CPU, Process, Uptime, Memory
Routing Stability - Metrics: Route Flap, Neighbor Counts, Convergence Time, Route Memory, VRF State, Errors
Demo
Topology/Setup
Cisco Pipeline
NSO
SignalFx (SaaS)
Visibility
Analytics
Timely Alerts
Real time / Intelligent Alerting
Real time detection of problem
Summary
Data Science is a methodology, not a feature!
SDN/NFV combined with Data Science is a powerful new approach for making networks more reliable and secure
Data science capabilities should synergize with and empower network operations teams
Recommended Approach - Crawl, Walk & Run
Q/A