Post on 31-Mar-2015
Copyright © 2004 Micromuse Inc. All rights reserved.
From Consolidated Operations to Service Management with the Netcool Suite
General Session
Doug McClure
Sr. Manager, Service and Technology Monitoring, EarthLink
October 14, 2004
Agenda
> EarthLink Overview
> Innovation, Technology, and Change
> And the need for open, flexible, adaptable monitoring solutions
> IT Operations and Business Maturity
> Challenges facing EarthLink and Roadmap to Improvement
> EarthLink Service and Technology Monitoring
> Improving Service, Customer, and Business Performance and Availability
> Enabling ITIL Best Practices with the Micromuse Suite
> Service Management Database, Change Management Dashboard
> Linking IT Operations with the Business
> Business Process/Activity Monitoring and Dashboards
> Continuous Improvement
EarthLink Overview
> One of the Nation’s Largest ISPs
> Headquarters in Atlanta, GA
> Key facilities in Dallas, Pasadena, San Jose, Knoxville, and Seattle
> Profitable, strong balance sheet
> Largest DSL footprint
> First-to-market with products that provide the best possible Internet experience
> Customer Advocacy: Fighting SPAM, Abuse, and Fraud (Phishers)
> Technical solutions
> Litigation
> Legislative support
> Industry collaboration
> Consumer education
> 10th Anniversary (1994-2004)
> http://www.redefineyourworld.com
EarthLink Overview
5.25M Customers
> ~4M Dialup (Premium ~3.5M, Value ~500K)
> ~1.2M Broadband (Cable, xDSL)
> ~160K Web Hosting (Unix, Windows)
> ~50K Wireless (Blackberry, PDA, Laptops, Wi-Fi)
> Dial Access Coverage: more than 90% of US Population
> ~16K Local Dial Access Numbers
> ~500K Active Modem Ports (~50% ELNK, ~50% Outsourced)
> ~250 PoPs (18 Core Backbone PoPs, four data centers)
> Broadband Coverage
> ~200 Markets with Broadband Offerings
Large and Diverse Infrastructure
> ~2300 Network Elements
> ~1600 Server Elements
> Thousands of Access Circuits, Hundreds of WAN Circuits
EarthLink Overview
Access Technology Innovation
> Premium and Value Dial-up
> Broadband (Cable, xDSL, Satellite)
> Voice (Converged Devices, VoIP, SIP)
> Wireless (WiFi, CDMA, Blackberry, PDA)
> Broadband over Power Lines (BPL)
> IP Services (Triple Play)
Value Added Service and Product Innovation
> Blocker Family: spamBlocker, POP-UP Blocker, ScamBlocker, Virus Blocker, Spyware Blocker (www.blockoftheday.com)
> Parental Controls
> Webmail, Web Accelerator
EarthLink Overview
Exceptional Customer Service
> 2004 J.D. Power and Associates Customer Satisfaction Award for High-Speed and Dial-Up Internet Service
> 2003 PC Magazine Readers' Choice Awards for both high-speed and dial-up services
> 2003 highest ranking in customer satisfaction for the second year in a row for high-speed Internet service by J.D. Power and Associates in its Internet Service Provider Residential Customer Satisfaction Study℠
> 2003 CNET Editors' Choice award
Innovation, Technology, and Change
“A company can't outgrow its competitors unless it can out-innovate them.”
Source: Gary Hamel and Gary Getz, in ‘Funding Growth in an Age of Austerity’
Innovation = Constant Change
Drivers
> Customer Retention – Decrease Churn
> Speed to Market, Competition – Do more, faster
> Quality, Performance, Support Costs
> Compliance - Sarbanes-Oxley, Visa CISP
Operational Challenges
> Release Management
> Change Management
> Service Level Management
> Enterprise Security
Leading Edge Technology = Constant Change
Drivers
> Voice – SIP
> Broadband
> Wireless (WiFi, Regulated, Unregulated)
> Content, Rich Internet Applications
> End-to-End Services
> Custom Applications
Operational Challenges
> Fault, Performance, Availability, Utilization Monitoring
> Vendor Lag in Support
> Lack of a Standard Fault, Performance, Availability, Utilization API
IT Operations and Business Maturity
“It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change.”
Source: Charles Darwin
Operations Maturity: Growing Up, Focused on Four Areas
Service Level Management
> All Tier 1, 2, 3 Support Groups in Operations
> Set and manage expectations internal/external to Operations related to responsiveness and resolution of production issues
Change Management
> Provide oversight and control of the production environment
> Minimize risk and impact from change activities
Release Management
> Development Operations
> Minimize poor quality production releases
Enterprise Security
> Compliance, control, audit
Operations Maturity: Common Language and Best Practices
Production Improvement Program (PIP)
> Foundation in IT Service Management, ITIL, COBIT
> Focusing on four main areas: Service Level Mgmt, Change Mgmt, Release Mgmt, and Production Security
> Over the past four months, 20% of Operations staff have attended ITIL Training
> 1 Master Level Certified (two more pending results)
> 12 Practitioner Level Trained in CCR Quadrant
> 8 Change Management Practitioner Certified (more pending results)
> 4 Configuration Management Practitioner Certified
> Over 130 Foundation Level Trained and Certified
Production Improvement Program
Release Mgt
> Release Planning
> Dev / Procurement
> Release Design, Build
> Release Acceptance
> Roll-out Planning
> Comm, Prep, Training
> Distribution / Installation
Prod Sec
> Policy, Procedures, Standards & Guidelines
> Security Consulting
> Security Assessment
> Security Monitoring
> Security Test & Sign off
Change Mgt
> Corp Project / Ops Project / Non-Project
> REQUEST FOR CHANGE (RFC)
> STATUS CHANGE (1): Prioritization, Risk Assessment and Forward Schedule of Change
> STATUS CHANGE (2): Change Approval and Proj. Service Availability
> STATUS CHANGE (3): Final Change Approval and Implementation
> STATUS CHANGE (4): Review Changes
> CLOSED RFC
Metrics & Reporting
Mutual Benefit from EarthLink’s Innovation and Advanced Use of Micromuse Products
> Micromuse OMNIbus, Impact, Webtop, RAD
Source: EarthLink SLM Group
EarthLink Service and Technology Monitoring
“Creativity involves breaking out of established patterns in order to look at things in a different way.”
Source: Edward de Bono
EarthLink and Micromuse Facts
Very Early Netcool Adopter
> EarthLink (Mindspring) was Micromuse’s first US customer
> Began evaluating Micromuse Netcool in 1996, official customer April 1997
Early Innovation
> Early joint innovation and development helped build foundation for many of Micromuse’s key products
Driving 3rd Party Vendor Integration & Partnerships
> Much more than just “sending SNMP traps”: EarthLink requires in-depth integration with the Micromuse suite
Current Deployment
> Netcool OMNIbus, Internet Service Monitors, SM Reporter, Desktop Clients, Webtop, Impact, numerous Gateways, Probes, Data Source Adaptors
> Preparing for OMNIbus v7 migration, RAD 2.0
> Plan to evaluate Precision
Moving Beyond “MoM” and Apple Pie
EarthLink’s Early Micromuse Netcool Deployment
> Focused on Netcool as the “Manager of Managers” or “MoM”
> Needed during EarthLink’s rapid growth and expansion
> Enabled event management and eliminated the “swivel chair NOC”
“Apple Pie” is Event Correlation and Deduplication
> The Netcool sweet spot was providing EarthLink with event correlation and deduplication
> Enables Tier 1 and Tier 2 break/fix support groups to operate efficiently
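As a rough illustration of the deduplication idea mentioned above (not Netcool's actual implementation), the sketch below folds repeated events that share an identifying key into a single row with a running tally. The field names (`node`, `summary`, `severity`), the key choice, and the sample events are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    node: str       # hypothetical field: device that raised the event
    summary: str    # hypothetical field: short problem description
    severity: int   # hypothetical scale: higher is worse
    tally: int = 1  # number of raw events folded into this row


def deduplicate(raw_events):
    """Collapse events with the same (node, summary) key into one row."""
    table = {}
    for node, summary, severity in raw_events:
        key = (node, summary)
        if key in table:
            row = table[key]
            row.tally += 1
            # Keep the worst severity seen for this key.
            row.severity = max(row.severity, severity)
        else:
            table[key] = Event(node, summary, severity)
    return list(table.values())


raw = [
    ("router1", "link down", 4),
    ("router1", "link down", 5),
    ("server7", "disk full", 3),
    ("router1", "link down", 4),
]
rows = deduplicate(raw)
for row in rows:
    print(row.node, row.summary, row.severity, row.tally)
```

Four raw events become two rows here, which is the effect that lets break/fix groups work from a short list instead of a raw stream.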
Focus now on End-to-End Service Management
> Netcool Suite allows EarthLink to manage entire service
> We can understand service relationships, service levels, and service impact; perform service modeling and service discovery
> Enables impact assessment, prioritization, understanding full service delivery chain
> Eliminate “needle in the haystack” approach of event management
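Impact assessment of the kind described above depends on knowing which services sit on top of a failed component. A minimal sketch, assuming a hand-written dependency map with made-up service names (a real deployment would populate this from discovery and topology data), walks the dependencies in reverse to list everything affected.

```python
from collections import deque

# Hypothetical dependency map: each item lists what it depends on.
DEPENDS_ON = {
    "webmail": ["web-frontend", "imap-service"],
    "imap-service": ["mail-storage"],
    "web-frontend": ["auth-service"],
    "dialup-access": ["auth-service"],
    "mail-storage": [],
    "auth-service": [],
}


def impacted_by(failed, depends_on):
    """Return every item that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk from a component up to its consumers.
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(item)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return sorted(impacted)


auth_impact = impacted_by("auth-service", DEPENDS_ON)
storage_impact = impacted_by("mail-storage", DEPENDS_ON)
print(auth_impact)
print(storage_impact)
```

With a map like this, a single infrastructure event can be prioritized by the services it touches rather than hunted for in the event list.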
The Service IS Important
End-to-End Service Management and Monitoring
> End-to-End service monitoring is my team’s #1 goal!
> Provided that all layers (L1-L7) of the infrastructure are thoroughly instrumented, real-time monitoring of the true end-to-end service is possible
> Service discovery, topology, dependency mapping, and change control ARE REQUIRED for highly accurate service monitoring
> “Intimate Service and Infrastructure Knowledge” can be instrumented
> Developers and support staff have deep understanding of how our services operate and their unique operational characteristics and dependencies
> This knowledge can be programmatically instrumented and monitored, correlated, analyzed, and presented in real-time
> Immediate notification to support groups when service infrastructure capabilities or performance degrades
Service Management Complexity
[Diagram: service architecture in five layers (Client Applications, Presentation Layer, Application Services Layer, Core Services Layer, Infrastructure Layer). Clients (any web browser, Palm client) reach servers labeled S80 to S112 over HTML, IMAP, POP3, SMTP, and APIs 1 to 7, with Storage, Tickets, and links to other systems.]
Good Customer Experience? Performance?
Infrastructure Events to Netcool
Source: EarthLink Product Group
Service Management Complexity
[Chart labels: Number of Components, Time (24x7x365), Infrastructure Changes, Infrastructure Events]
• Event volume grows exponentially with the number of components, time (growth), and infrastructure changes
• Over 1500 Servers, 2300 Network Elements, and 20K Interfaces/Circuits
• Netcool/ObjectServer is a must-have for effectively managing the service event stream end-to-end
• Impact 3.0’s cluster capability will greatly improve our ability to analyze, enrich, suppress, and manage the event stream regardless of our growth
Source: EarthLink Product Group
The Customer IS Important
Customer Experience Management and Monitoring
> The Micromuse Netcool Suite enables consolidation and understanding of proactive, real-time monitoring of the customer’s experience for core EarthLink services
> Proactive, real-time monitoring of the customer’s experience
> Traditional Infrastructure Monitoring (SNMP, System Agents, Service Port Monitoring)
> Synthetic transaction monitoring
> Customer agent-based monitoring
> Agentless application, transaction, and customer performance monitoring (Emerging)
> Becomes the “glue” that ties infrastructure monitoring together
> Powerful information when customer experience and infrastructure monitoring data is correlated, analyzed, and presented in real-time
> Immediate notification to support groups when customer’s experience degrades
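Synthetic transaction monitoring, one of the techniques listed above, can be sketched as a timed, scripted step that is classified as ok, degraded, or failed. This is an illustration only: the check names, thresholds, and the `fake_login` and `broken_login` stand-ins are hypothetical, and a real probe would drive an actual service instead of sleeping.

```python
import time


def run_synthetic_check(name, transaction, max_seconds):
    """Run one scripted transaction and classify the result."""
    start = time.perf_counter()
    try:
        transaction()
        ok = True
    except Exception:
        ok = False
    elapsed = time.perf_counter() - start

    if not ok:
        status = "failed"
    elif elapsed > max_seconds:
        status = "degraded"
    else:
        status = "ok"
    return {"check": name, "status": status, "seconds": elapsed}


def fake_login():
    """Stand-in for a real scripted step such as a webmail login."""
    time.sleep(0.01)


def broken_login():
    """Stand-in for a step that cannot complete."""
    raise ConnectionError("simulated outage")


results = [
    run_synthetic_check("webmail-login", fake_login, max_seconds=1.0),
    run_synthetic_check("webmail-login-slow", fake_login, max_seconds=0.001),
    run_synthetic_check("webmail-login-down", broken_login, max_seconds=1.0),
]
for result in results:
    print(result["check"], result["status"])
```

The "degraded" state is the useful one for customer experience monitoring: the service still answers, but slower than the customer should see.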
The Business IS Important
Business Activity Monitoring and Management
> Expands IT Operations visibility vertically and horizontally
> Ties IT Operations data and Business data together
> System Downtime vs. Contact Center Call Volume
> Real-Time Customer Subscriptions vs. Sales Forecasts
> Almost any process can be instrumented and monitored in real-time, have policies applied to it, and be presented in a dashboard or portal
> Enables Real Time Monitoring and Management of Business and IT processes
> Change and Downtime Management
> Customer Registration Management
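To make "have policies applied to it" concrete, here is a small sketch of one possible policy for the subscriptions-versus-forecast comparison above. The 10% tolerance band, the function name, and the sample figures are assumptions for illustration, not EarthLink's actual rules or data.

```python
def evaluate_forecast_policy(actual, forecast, tolerance=0.10):
    """Classify actual sign-ups against forecast within a tolerance band."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    deviation = (actual - forecast) / forecast
    if deviation < -tolerance:
        status = "below forecast"
    elif deviation > tolerance:
        status = "above forecast"
    else:
        status = "on track"
    return status, round(deviation, 3)


# Hypothetical hourly registration counts compared with a forecast of 1000.
for actual in (850, 1020, 1200):
    print(actual, evaluate_forecast_policy(actual, 1000))
```

A dashboard could color each business process by the returned status, and the same result could raise an event when registrations fall below forecast during a system downtime.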
Enabling ITIL Best Practices with the Micromuse Suite
“If you always do what you've always done, you'll always get what you always got.”
Source: From a speech, unattributed
Enabling ITIL Best Practices
Incident and Problem Management
> IM: Low level event classification, service dependencies, full integration with Remedy, Service Management DB (SMDB)
> PM: Long-term historical event database for trend research, Service Management DB (SMDB)
Change and Release Management
> CM: Change Management System (CMS/RFC), Service Management DB (SMDB), service dependencies, impact on infrastructure from changes or downtimes
> RM: Monitoring can greatly help in the development, test, and staging environments PRIOR to release to production
Performance and Availability Management
> PM/AM: Continuous low-level element and system level testing and data collection, trending, reporting, and alerting
Capacity Management
> CM: Continuous low-level element and system data collection, trending, reporting, and alerting
Service Management Database: ITIL/PIP & Service Management
SMDB
• Information about end-to-end service, service dependencies, relationships, topology, elements, production status, etc.
• Self-serve customer interfaces into the service management and monitoring process
• Auto-provisioning of monitoring on all applications reduces administrative overhead
• Not a low-level configuration management database (CMDB), but could be the virtual high-level CMDB
SMDB Modules
• Change Management System (CMS) / Downtime Request (DTR)
• All RFCs/DTRs managed from within the SMDB complex: full lifecycle management, full risk and approval matrices, service management policies, interested parties
• Impact of changes/downtimes immediately known within infrastructure through Impact 3.0 integration, policy creation, and event management
• Element Management (network, server, application), ISM Creation, Agent Configuration, etc.
Service Management Policies
• Information about customer and business defined service management policies, SLA/OLAs, etc.
Service Management Database: ITIL/PIP & Service Management
Source: EarthLink Service and Technology Monitoring
Business Process Monitoring – ITIL Change Management
“What gets measured, gets done!”
Source: Tom Peters
Overview – Controlling Change and Benefits
Drivers
> Adoption of ITIL/COBIT Best Practices for Change Management
> Significant change for many groups – Fear, Uncertainty, Doubt (FUD)
> No Real-Time Visibility into Change/Downtime Management Activities
> Business Process
> Who, What, When, Where, Why, and How, Cost, Risk, and Impact
> Workflow – Monitor Lifecycle, SLAs, Bottlenecks – Is the process enabling Operations or is it a bottleneck?
> Impact on Infrastructure – False Positives, Contact Center Call Volume (COGS)
> Drive out False Positives from Production Monitoring Systems
> Huge burden on NOC and other support staff
> Desire to have Automated Remedy Trouble Ticket Creation
> Reduces time to address problems and lowers MTTR
Enabling Change Management with Netcool Suite
Solution
> Provide Real-Time Visibility into Change/Downtime Process
> Create Actionable Information
> Ensure Business Rules are Guiding/Enabling the Process – Not Hindering It
> Eliminate FUD
> Report (dashboards, reports) on Process and Impact
> NOC and other support groups know what’s happening during change and downtime windows
> Management has oversight and visibility
> Business understands impact of change and downtime activity
Source: EarthLink Service and Technology Monitoring
Business Activity Monitoring
Source: EarthLink Service and Technology Monitoring
RAD 2.0 Presentation
Source: EarthLink Service and Technology Monitoring
Netcool Event Management
Change/Downtime Request Events
Suppressed Change/Downtime Activity Events
Change / Downtime Status
Event Suppressed by Change / Downtime
Change / Downtime ID
Source: EarthLink Service and Technology Monitoring
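The labels on this slide (change/downtime ID, status, suppressed events) suggest events being matched against approved change windows. The sketch below shows one way that matching could work; it is not the Netcool/Impact policy itself, and the window records, field names, and IDs such as `DTR-1001` are invented for the example.

```python
from datetime import datetime

# Hypothetical approved change/downtime requests.
CHANGE_WINDOWS = [
    {
        "change_id": "DTR-1001",
        "node": "mailstore3",
        "start": datetime(2004, 10, 14, 1, 0),
        "end": datetime(2004, 10, 14, 3, 0),
    },
]


def tag_event(event, windows):
    """Mark an event as suppressed if it falls inside a matching window."""
    tagged = dict(event, suppressed=False, change_id=None)
    for window in windows:
        same_node = window["node"] == event["node"]
        in_window = window["start"] <= event["time"] <= window["end"]
        if same_node and in_window:
            tagged["suppressed"] = True
            tagged["change_id"] = window["change_id"]
            break
    return tagged


events = [
    {"node": "mailstore3", "summary": "service down",
     "time": datetime(2004, 10, 14, 1, 30)},
    {"node": "mailstore3", "summary": "service down",
     "time": datetime(2004, 10, 14, 5, 0)},
    {"node": "router9", "summary": "link down",
     "time": datetime(2004, 10, 14, 1, 30)},
]
tagged_events = [tag_event(e, CHANGE_WINDOWS) for e in events]
for e in tagged_events:
    print(e["node"], e["suppressed"], e["change_id"])
```

Only the first event is suppressed: the second falls outside the window and the third is on a different node, so both still reach the NOC as real alarms.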
Future Enhancements
Planned Netcool/Impact Policies
> Impact on EarthLink
> COGS: Assess support cost impact due to change and downtime activities within Operations and Customer Support in Real-Time
> Tier 1, 2, 3 Support Cycles
> Better Change and Release Management Planning
> Data Gap Management
> A common question: Why does my chart or graph have gaps?
> The solution: Annotate graphs, charts, portals, etc. with the reason for data gaps caused by planned change/downtime activities
> How: Integrate change and downtime event information with all performance, utilization, and capacity monitoring solutions via Impact 3.0
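The data gap idea above can be sketched in two steps: find holes in a polled time series, then label each hole with the planned change that explains it. The polling interval, sample times, and the `RFC-2002` window below are made up for illustration; the actual integration is described as going through Impact 3.0.

```python
def find_gaps(sample_times, interval):
    """Return (start, end) pairs where consecutive samples are too far apart."""
    gaps = []
    for earlier, later in zip(sample_times, sample_times[1:]):
        if later - earlier > interval:
            gaps.append((earlier, later))
    return gaps


def annotate_gaps(gaps, windows):
    """Attach a reason to each gap that overlaps a planned window."""
    annotated = []
    for gap_start, gap_end in gaps:
        reason = "unexplained"
        for window in windows:
            overlaps = window["start"] < gap_end and gap_start < window["end"]
            if overlaps:
                reason = "planned: " + window["change_id"]
                break
        annotated.append((gap_start, gap_end, reason))
    return annotated


# Sample times in minutes since midnight, polled every 5 minutes.
samples = [0, 5, 10, 40, 45, 50, 90, 95]
# Hypothetical planned downtime from minute 12 to minute 38.
windows = [{"change_id": "RFC-2002", "start": 12, "end": 38}]

gap_notes = annotate_gaps(find_gaps(samples, interval=5), windows)
for note in gap_notes:
    print(note)
```

The first gap is explained by the planned window, while the second is flagged as unexplained and is the one worth investigating.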
Business Activity Monitoring: Real-Time Customer Registration Dashboard
RAD 2.0 Joint Development
Source: EarthLink Service and Technology Monitoring
Continuous Improvement
“We have a ‘strategic plan’. It’s called doing things.”
Source: Herb Kelleher
Continuous Improvement
> Making Applications “Monitoring Aware and Netcool Ready”
> Work with developers on getting a monitoring API embedded into applications
> Every application and tier linked into Netcool directly (not through server agent)
> Discovery, Topology, Dependency Modeling
> Monitoring accuracy and root cause depend on this!
> Need a solution for Layers 1-7, likely two solutions (L1-3 & L4-7)
> Application, Transaction and Customer Performance Monitoring
> Synthetic transactions only get us so far…but will continue to evolve
> Don’t forget about client-server: not everything is web-enabled!
> Agentless technologies are emerging to accurately map out application and transaction flows, relationships, and topology
> Next (2nd/3rd) Generation Quality, Performance, Capacity, Utilization solution needed
> Services, Applications, Servers, Storage, Network
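One way to picture a "monitoring aware" application, as called for in the first bullet above, is an application that emits structured events itself instead of relying on a server agent. The sketch below uses Python's standard `logging` module; the `EventCollector` class is a hypothetical stand-in for whatever would forward events to an event manager, and the application name and messages are invented.

```python
import logging


class EventCollector(logging.Handler):
    """Hypothetical stand-in for a forwarder to an event manager."""

    def __init__(self):
        super().__init__()
        self.events = []

    def emit(self, record):
        # Convert each log record into a simple structured event.
        self.events.append({
            "source": record.name,
            "severity": record.levelname,
            "summary": record.getMessage(),
        })


collector = EventCollector()
app_log = logging.getLogger("registration-service")  # hypothetical app name
app_log.setLevel(logging.INFO)
app_log.addHandler(collector)
app_log.propagate = False  # keep the example's output self-contained

app_log.info("signup completed in %d ms", 120)
app_log.error("payment gateway timeout")

for event in collector.events:
    print(event)
```

Because the application names its own tier and reports its own conditions, the resulting events carry the service knowledge that developers have, which a generic server agent cannot infer.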
Continuous Improvement
Building better Network and Systems Management
> Founded Atlanta Network and Systems Management Technical User Group (ANSMTUG) in January 2004
> http://www.ansmtug.org
> Metro-Atlanta Fortune 100, Service Providers, Enterprise, Media, and Emerging Technology Companies
> BellSouth, The Home Depot, EarthLink, Southern Company, N2 Broadband, eDeltacom, Delta, CNN, Cingular, E*Trade, Knology Broadband, Cox Communications
> Customers helping Customers
> Use Micromuse and other NSM products better
> Collectively drive product requirements and features into Micromuse and other NSM vendors
Closing and Questions and Answers
> EarthLink is a happy Micromuse customer
> EarthLink depends on the Netcool suite’s openness, flexibility and adaptability to keep up with innovation, technology, and constant change
> EarthLink will continue to push the Netcool suite beyond the sales and marketing slick
> EarthLink’s infrastructure, service, customer, and business performance and availability continue to improve because of our advanced use of the Netcool suite
> Q&A