Introduction to ITOA/media/HDIFusion/Files/speaker... · 2017-10-16 · Supply Chain Analytics...
Introduction to ITOA EDSM-ITOA: IT Operational Analytics for Next Gen. Service Management
(Enterprise Digital Service Management or EDSM)
By: Rajesh Radhakrishnan
Chief Architect, EDSM
CSRA LLC
Objective of this meeting
When we have completed this meeting / WebEx session you should be able to:
• Understand the basics (what, why and how) of IT Operations Analytics
• Understand the different types of analytics at a basic level
• Identify IT & IT Operations problems and opportunities (Use Cases) where analytics can be applied
Content
What is ITOA? Context
Why do we need ITOA?
ITOA: How do we do it?
Backup Slides – More Details
ITOA Framework: Introduction
What is ITOA? Definitions
Data (Small & Big) and Data Science: Applications
Business Analytics
Marketing Analytics
Operations Analytics
Supply Chain Analytics
Financial Analytics
IT Analytics (incl. ITOA)
HR Analytics
By leveraging Data (Small & Big Data), Data Science and Operational Analytics based on Operational Data, Businesses and Agencies can achieve Operational Excellence & Operations Led Business Transformation.
Data Science
Health Care Analytics
Securities Analytics
Environment Analytics
Intelligence Analytics
Travel & Tourism Analytics
Energy Analytics
Data Science Applied to Business Functions (Sample)
Data Science Applied to Specific Industries (Sample)
Business & IT Operations Strategy, Architecture, Analytics & Management
Operations Strategy & Planning
Operations Architecture & Design
Operations Analytics
Operations Management
Operations Improvement
CEO Office
CIO Office
SMO Office
ITSM Architects
Process Architects
CTO Office
Chief Operations Architect
Chief Data / Big Data Architect
Chief Data Scientist
VMO Office
COO Office
Digital Business & Digital Governance Strategies generally fuse Business & IT operations, requiring a holistic and integrated Operations Strategy, Architecture, Analytics and Management.
Operations
Operations Strategy & Planning
Operations Architecture & Design: Organization Design, Operations Design
Operations Analytics
Operations Management
Digital Business & Business Transformation
Operations Analytics has to be viewed in the context of Operations Strategy, Operations Architecture & Design, Operations Management & Business Transformation. Embedding Operational Analytics into Operational Processes and Enabling Applications is key to achieving Operational Excellence.
Business & IT Operations Strategy, Architecture, Analytics & Management
IT Operations Analytics includes: Analytics for Service Management, Analytics for Systems Management, Cloud Analytics, Sourcing & Supply Chain Analytics, SIEM & Security Analytics, Event Analytics, Dev-Ops Analytics, Six Sigma for IT, IT Financial Analytics, amongst others.
CSRA EDSM: Overview Customer Need: To adapt ITIL thinking to new trends in Business & IT.
Provide next generation service management capabilities based on clients' unique needs and help clients manage Business & IT services with a focus on client value.
Current Capabilities:
EDSM, Hybrid Service Management, Cloud Service Management, Agile Service Management & Lean Service Management capabilities
Key Partners
Service Now
Amazon AWS & MSFT Azure
CliQr, CloudCheckr, Science Logic, Splunk, Perspica, other CMPs
Key shifts
Emerging Trends
▪ Shift to Digital Business and Digital Governance Strategies
▪ Shift to Multi-Sourcing strategy & Federated Service Management
▪ Shift to Cloud and need for Cloud Service Management
▪ Shift to Agile Agency, Operations and Agile Service Management
▪ Shift to Process Standardization and Automation
Components
People
▪ EDSM OCM or Organization Change Management.
▪ EDSM Training Content (CSRA built)
▪ Service Now, AWS and Azure and other CMP training.
▪ ITIL, SIAM, CSM, ITOA, Dev-Ops & Lean Sigma Training.
Process
▪ EDSM Core Processes (ITIL v3 based)
▪ SIAM (Service Integration & Mgmt.) & CSM (Cloud Mgmt.) Processes
▪ ASM and LSM (Agile & Lean Service Management) Processes
▪ Process & Method to Assess, Architect, Implement (Transition & Transformation) and Manage EDSM capabilities (Playbooks & WPs).
Technology
▪ Process as a Service based on Service Now platform
▪ Multiple CMP platforms from CloudCheckr to CliQr to Science Logic
▪ Lean Service Management tools such as Lean-Kit
▪ Agile Service Management tools such as Service Now and JIRA
▪ Tools for Service Management Analytics such as ServiceNow, Science Logic, Splunk, Perspica, Tableau & others (ITOA tools)
Delivery
▪ EDSM consulting: Assess, Architect, Implement and Manage.
▪ EDSM Sub-Capability Consulting such as Lean Service Management
▪ Service Now Platform Start Now Methodology
▪ Play Books & Demo for Key Platforms and Integrated Solutions
▪ Leverage Consulting Partners and CSRA ITC Services
EDSM ITOA Framework: Key Building Blocks
Building blocks: Data Sources; Data Integration; Data Warehouse & Data Stores; Analytics; Visualization, Decision & Recommendation Engines
Data Sources:
• Monitoring & Event Data
• Log Data
• Ticket Data
• Audit & Change Tracking Data
• Trace & Component Diagnostics Data
• Context Data
• Other Sources of Data
Analytics:
• Text & Semantic Analytics
• Search Analytics
• Web Analytics
• Cloud Analytics
• Descriptive Analytics
• Predictive Analytics
• Prescriptive Analytics
• Data Mining
• Machine Learning
• Combinatorial Analytics
• Other Analytic Methods
Visualization, Decision & Recommendation Engines:
• Sense Making
• Policy Making
• Decision Making
• Idea Making
• Recommendation Engines
• Decision Engines
EDSM ITOA Framework: Insights from Data
[Figure: layered ITOA architecture, moving from Data to Information to Insights]
• Data Source & Data Integration Layer: ITSM Ticketing & Related Systems; Configuration, Asset & Domain Management Systems; Service & Service Level Management; Application Monitoring; Network Monitoring; IaaS Monitoring; CMP & Resource Monitoring; Other Specialized Monitoring & Event Management (SIEM, ARM, …). Monitoring (incl. Logs) & Event Management Tooling and Systems or Resource Monitoring (incl. Logs) & Management Tooling feed Extractors / Monitors, which feed Data Integration Services.
• Integrated Data Layer (with Real Time, Near Real Time and Lagged Data): Integrated stores (SQL, SQL-Like & NoSQL); Metadata; Data Mart (Specialized Marts - Optional for 2018)
• Data Analytics Layer (Descriptive, Predictive & Prescriptive Analytics): Simple & Complex Query, Data Analysis, Data Mining, Semantic Analysis, Data Visualization (Use Case Driven) …
• Data Visualization & Reporting Layer: Standard, Custom & Self Service Dashboards & Reports; End Users / Role based Intelligence (Self Service Reporting and Self Service Analytics)
ITOA: Multi-Mode Architecture
[Figure: three architecture modes, each with Dashboards & Reports on top of Simple & Complex Query, Data Analysis, Data Mining, Semantic Analysis, Data Visualization (Use Case Driven) …]
• Simple Silo Architecture: Use Case Specific
• Complex Silo Architecture: Service or Process Specific (Multiple Use Cases)
• Complex Integrated Architecture (See Previous Slide): Multiple Service or Process (Multiple Use Cases)
4 Stream Approach to Next Gen IT
IT Operations
• Mission & Business Aligned Operations
• Operational Agility and Adaptability
• Fused Business & IT Operations
• Continuous Operational Performance Improvement & Optimization
IT Operations Analytics
• Observability of Application and Infrastructure
• End to End Visibility (Service Vertical and Domain Horizontal)
• Predictive Analytics for Proactive Management
• Tactical & Operational Insights
• IT Optimization
Infrastructure Service Development
• Software Defined Environment / Infrastructure as Code
• Pulling Forward NFR and Resilient Infrastructure
• Shift Left Infrastructure (Technology) Management
• Self Managing Infrastructure
• Administrator Focus
Application Service Development
• Agile Development
• Pulling Forward NFR
• Shift Left Application Management
• Self Managing Applications
• Application extensibility and Platform thinking
• EA and Application Rationalization
• End User Focus
EDSM ITOA Component: Data Visualization Example Component
End Users / Role based Intelligence (Self Service Reporting)
End Users / Roles based Intelligence (Self Service Analytics)
Strategic Visualizations: Dashboards, Reports and Infographics Audience: CIO & Top Business & IT Management
Purpose:
Tactical Visualizations: Dashboards, Reports and Infographics Audience: IT Management, Process and Service Owners & Managers
Purpose: Service & Process Design Decisions, Service & Process Improvement Plans & Projects.
Operational Visualizations: Dashboards, Reports and Infographics Audience: Operations Team, App. & Tech. Management teams
Purpose: Operational Decisions, Operational Improvements
EDSM ITOA Component: Log Data & Analytics Example Component
Log Management: The process for configuring and managing log sources and logs through their life cycle. Activities include identifying and configuring log sources; storing and archiving logs; consolidating and integrating log data with other data; configuring rules for rotating logs; extracting event data; cleaning log data; analyzing and reporting on logs; and configuring automated and semi-automated responses to extracted events, amongst others.
Log Sources (Operating System Logs, Application Logs, Management System Logs, etc.)
Consolidated & Integrated Storage of Logs
Log Analytics & Analytics of Logs and Related Data - Data Mining, Machine Learning (Supervised & Unsupervised), Text & Semantic Analysis, etc.
Analytics Reports and Dashboards (Role Based Intelligence)
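Illustrative sketch (not from the original slides): the log management activities above, in particular cleaning log data and extracting event data, could look roughly like the following. The log line layout, the host names and the severity levels are assumptions made for the example, not a format used by any specific tool.

```python
import re
from collections import Counter

# Hypothetical log layout: "<date> <time> <LEVEL> <host> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<host>\S+) (?P<message>.*)$"
)


def extract_events(lines, levels=("ERROR", "CRITICAL")):
    """Clean raw log lines and keep only records that count as events."""
    events = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # cleaning step: drop lines that do not fit the layout
        record = match.groupdict()
        if record["level"] in levels:
            events.append(record)
    return events


def events_per_host(events):
    """Simple log analysis: number of extracted events per host."""
    return Counter(event["host"] for event in events)


# Made-up sample data for illustration only
sample_logs = [
    "2017-10-16 09:00:01 INFO app01 Service started",
    "2017-10-16 09:05:12 ERROR app01 Connection pool exhausted",
    "garbled line without structure",
    "2017-10-16 09:06:40 CRITICAL db01 Disk usage above 95%",
    "2017-10-16 09:07:02 ERROR app01 Request timeout",
]
events = extract_events(sample_logs)
host_counts = events_per_host(events)
```

The per-host counts are the kind of aggregate that could feed a role based report or dashboard.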
What is Data Science, Analytics & ITOA?
• Data Science: Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to Knowledge Discovery in Databases (KDD). (Source: Wiki)
• One way to consider data science is as an evolutionary step in interdisciplinary fields like business analysis that incorporate computer science, modeling, statistics, analytics, and mathematics. (Source: NYU)
• Analytics is the discovery and communication of meaningful patterns in data. Especially valuable in areas rich with recorded information, analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance. Analytics often favors data visualization to communicate insight. (Source: Wiki)
• Applied Analytics where analytics and analytic models are applied to a specific functional area or domain, such as Operations Analytics, Risk Analytics, Marketing Analytics, among others or specific industry domain such as Health Care, Biotech or Energy or Intelligence.
• IT Operations Analytics: IT operations analytics (ITOA) (also known as advanced operational analytics, or IT data analytics) technologies are primarily used to discover complex patterns in high volumes of often "noisy" IT system availability and performance data. Forrester Research defined IT analytics as "The use of mathematical algorithms and other innovations to extract meaningful information from the sea of raw data collected by management and monitoring technologies." (Source: Wiki)
• ITOA solutions collect the structured and unstructured data generated by IT environments, process that data, and display the information in an actionable way so operations teams can make better informed decisions in real-time. (Source: Marcus Sarmento, AppDynamics)
Unstructured Data
• Data can be designated as structured or unstructured data for classification within an organization. The term structured data refers to data that is identifiable because it is organized in a structure. The most common form of structured data -- or structured data records (SDR) -- is a database where specific information is stored based on a methodology of columns and rows. Structured data is also searchable by data type within content. Structured data is understood by computers and is also efficiently organized for human readers. In contrast, unstructured data has no identifiable structure (webopedia)
• Unstructured Data (or unstructured information) refers to information that either does not have a pre-defined data model and/or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional computer programs as compared to data stored in fielded form in databases or annotated (semantically tagged) in documents (wiki)
• UIMA (Unstructured Information Management Architecture)
Unstructured, Semi-Structured & Structured Data
Unstructured Data
• Examples: Text in a Chapter or a White Paper; Video or Audio File
• Conversion: Some can be converted to Semi-Structured Data
• Organization: Unorganized
• Share: 80%-90%
Semi-Structured Data
• Examples: Columnar Data; Key Value Pairs; No-SQL stores
• Conversion: Some can be converted to Structured Data
• Organization: Moderately Organized
• Share: 5%-10%
Structured Data
• Examples: SQL Database; SQL-Like DB; Normalized Relational Data; Multidimensional Cube Data
• Organization: Well Organized & Normalized
• Share: 5%-10%
Potential Work Streams (Work Streams are Columns) Analytics for Service Management Sample Set (Not Exhaustive)
Columns: Event Mgmt.; Incident Mgmt.; Problem Mgmt.; Configuration / Asset Mgmt.; Knowledge Mgmt.; Availability Mgmt.; Capacity & Performance Mgmt.; Change Mgmt.; Release Mgmt.
1: Text and Semantic Analytics
• Auto population of Incident Tickets
• Automated Generation of Knowledge Articles
2: Web Analytics
• UUX Event Rules based on Click Stream and Web Traffic Data
• Automated Service and Transaction Mapping
3: Sentiment Analytics
• Auto-Incident Ticketing based on Social Business Platform Data
4: Entity Analytics
• Automated RCA
• CI Entity Analytics
• Automated Identification of Knowledge Area Authorities
• Analytics for Code Accountability
5: Risk Analytics
• Analytics for Configuration Audit
• Estimation of residual risks of SIP (Service Improvement Plans)
• Automated Service Availability Risk Analysis
• Automated Change Risk Analysis
• Automated Release Risk Analysis
6: Statistical Analysis
• Incident Ticket Clustering for Problem Ticket Logging
• Analytics for Utilization based Scaling
• Change Impact Analysis
• AB Testing Analytics
7: Modeling & Simulation
• Simian Army in Simulated Environment
• Capacity, Utilization and Performance Simulation with Synthetic Users
Potential Work Streams (Work Streams are Columns) Analytics for Systems Management Sample Set (Not Exhaustive)
Columns: Storage Mgmt.; Network Mgmt.; Compute Mgmt.; Converged Systems Mgmt.; IaaS Mgmt.; PaaS Mgmt.; SaaS Mgmt.; Dev-Ops Mgmt.; Test-Ops Mgmt.
Rows (cells left blank): 1: Text and Semantic Analytics; 2: Web Analytics; 3: Sentiment Analytics; 4: Entity Analytics; 5: Risk Analytics; 6: Statistical Analysis; 7: Modeling & Simulation
http://www.itoa-landscape.org/ Check this web site for use cases, vendor and related information.
Potential Work Streams (Work Streams are Columns) Analytics for Account Management Sample Set (Not Exhaustive)
Columns: Contract Sales (RFI, Pre-RFP and RFP work); Contract T&T; Contract Steady State; Contract Cross Account – within Industry; Contract Cross Account – across Industry
1: Text and Semantic Analytics
• Summarization of Opportunity & Opportunity Classification
• Account SB platform
2: Web Analytics
• Process Portals
3: Sentiment Analytics
• Account SB platform
4: Entity Analytics
• Entity Analytics for IAM (Identity and Access Management) Data Migration
5: Risk Analytics
• Contract Risk – from a complexity and profitability perspective
• T&T Risk Analytics
6: Statistical Analysis
• Time Series Analysis of Event & Ticketing Data
• Time Series Analysis of Event & Ticketing Data
• Account to Account / Group Comparisons
• Account to Account / Group Comparisons
7: Modeling & Simulation
• Modeling for estimating the number and duration of planned change windows based on # of managed CI, current # of changes, currency levels, etc …
• Modeling & Simulation for Event Stabilization
Why do we need IT Operations Analytics?
• Sense Making (understanding) …
  • “Why did this service incident take so long to recover from and restore service ….”
    • Event Tree Analysis, Time Order of Event Analysis, RCA or Root Cause Analysis, comparative analysis of data associated with similar incidents within account and across accounts ….
  • Does this account do a good job with regard to monitoring and event management tools capturing incidents before end users (client team members) or service desk analysts spot an incident?
    • Cross tabs by service, account. Cross tabs with benchmarks.
• Decision Making (better data driven decision making) …
  • How long should this planned change/maintenance window be, given the number of planned changes, their test results and data, the number of CIs involved, the estimated impact of changes … etc.?
    • Change Risk and Impact Analysis, Change Window Models & Estimation models, comparison with past change windows that involved n number of similar or identical changes, ….
• Policy Making (improved analytics driven policy making) …
  • What should my change window policies be for these 30 mission critical applications and the infrastructure that enables them, to give guidance to change analysts, change requestors and change implementers?
    • Change Impact Analysis, Change Window Models, comparison with past change windows that involved n number of similar or identical changes, ….
• Idea Making ….
  • Given the visualization of results from applied analytics, what can I do innovatively to optimize, improve efficiency, mistake-proof, predict events … and prevent service incident(s)?
    • Example: Analytics to automate incident diagnosis based on monitoring, event and related data and innovative application of analytics (data mining).
Why do we need IT Operations Analytics?
Columns: Better Understanding or Sense Making; Decision Making (Recommendation & Decision Engines); Policy Making (for strategic guiding); Idea Making (new services / processes …)
Basic Analysis:
• Are we having more incidents with WinTel servers as compared to Unix servers?
• Do we need to improve M&EM with regard to auto-ticketing?
Descriptive Analysis:
• Is the difference significant? Are there other factors that may explain the difference (resource pool skill level … etc.)?
Prescriptive or Normative Analysis (Optimization):
• Reliability analysis and modeling indicates that Service Configuration A is optimal for the given service resiliency objectives (compared to B, C and D).
• Analysis of historical RFC (Request for Change) and Change Window performance data: the director issued a policy that changes that have not been completed within 75% of the change window time should be backed out at that point (75% of scheduled time).
• Data & Analytics Driven Business and IT Processes. Example: Harness Business Data for Data as a Service Platform.
Predictive Analysis:
• Should the system open an Incident ticket based on event data (event escalation decision)?
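Illustrative sketch (not from the original slides): the change window back-out policy in the table above (back out a change that is not complete once 75% of the scheduled window has elapsed) is simple enough to embed as a rule in a decision engine. The function name and parameters are hypothetical.

```python
def should_back_out(elapsed_minutes, window_minutes, change_completed, threshold=0.75):
    """Return True when an incomplete change has used up the allowed share
    of its change window and should be backed out.

    threshold=0.75 mirrors the 75% policy quoted in the slide; every other
    detail here is an assumption made for illustration.
    """
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    if change_completed:
        return False  # nothing to back out
    return elapsed_minutes >= threshold * window_minutes


# A 120-minute window reaches the back-out point at 90 minutes.
decision_at_89 = should_back_out(89, 120, change_completed=False)
decision_at_90 = should_back_out(90, 120, change_completed=False)
```

Keeping the threshold as a parameter lets the policy be revisited as more RFC and change window performance data is analyzed.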
Why do we need IT Operations Analytics?
A) Improve Value of IT Operations: (ROI and Risk ROI)
B) Improve quality of Operational Decisions & Operations
C) Reduce cost of Operations
Dimension Details Examples:
Improve Quality of Operations
• Data and Analytics Driven Decision Making, replacing shooting from the hip and intentional or unintentional human biases (heuristics).
• Data and Analytics Driven Policy Making for Operational Excellence.
• Rational, high quality (accurate & speedy) determination of root cause (Automated RCA using data modeling and data mining techniques).
• Predictive scaling engine that scales contiguous and ambient resources based on service traffic data patterns.
• Analyzing SOA and Micro Services utilization patterns for change impact analysis (deployment of changes to micro services).
Reduce Cost of Operations
• Automating Process Decisions with Decision Engines reducing labor time and cost of labor
• Predicting Events & Incidents and thereby eliminating cost of service outage (cost of impact of outage and cost of recovery from outage).
• Eliminating Human Error via Policy based Process Automation, thereby reducing cost of fault and error management.
• CI and CD of Micro-Services Architecture based Applications.
• Simian Army (Chaos Monkey, etc.) and analytics of fault, error and failure data for more resilient application and infrastructure design (cloud or otherwise).
• Entity analytics for separation of environments and access control.
Why do we need IT Operations Analytics?
A) Improve Value of IT Service: (ROI and Risk ROI)
B) Improve quality of Service & Service Operations
C) Reduce cost of Service Operations
Dimension Details Examples:
Improve Quality of Service
• Service Quality Dimension: Usability & User Experience
• Service Quality Dimension: Availability & Performance
• Service Quality Dimension: Security
• Usability & User Experience Analytics
• Availability Modeling
• Security Analytics with SIEM and related Data
Reduce Cost of Service Design & Operations
• Automating Service Decisions with Decision & Recommendation Engines reducing labor time and cost of labor
• Predicting Events & Incidents and thereby eliminating cost of service outage (cost of impact of outage and cost of recovery from outage).
• Eliminating Human Error via Policy based Service Request, Approval and Fulfillment Automation, thereby reducing cost of fault and error management.
• CI and CD of Micro-Services Architecture based Applications.
• Simian Army (Chaos Monkey, etc.) and analytics of fault, error and failure data for more resilient application and infrastructure design (cloud or otherwise).
• Analytics for service orchestration and automated fulfillment.
SLM Flow: Generic to Client Specific Service Level Driven ITOA (Critical to Quality View)
EDSM: Generic Service Metrics Model
• SLO & SLM
EDSM: Base SLA & Service Class SLA (Generic)
• Cloud SLA – IaaS, PaaS and SaaS SLA
EDSM: Client / Service Specific SLA, Service Metrics & Measurement Method
EDSM: Client / Service SL-Driven
• Organization
• Design & Operations
EDSM: Client / Service Specific ITOA designed for Specific SLA/SLM
Service Definition & Service Level Agreement
Service Tiers and Service Metrics Model
Service Objectives & Metrics / SLO & Metrics
SL Driven Architecture & Design
SL-Driven Transition & Deployment Model
SL Driven Operations
Management by Objectives & Metrics (MBO & MBM)
OLA & Vendor Contracts aligned to Service Level
SL Driven CSI
Service Reviews & Improvement Plans
Service Level Driven Organization Service Level Driven ITOA
ITOA: How do we do it? (How do we approach implementation?)
Identify, Prioritize and Select Use Cases Criteria for Selecting Use Cases (Decision Problem or Opportunity)
• Importance of problem addressed (example: resiliency problems may be rated higher than optimization problems)
• Rank Order Importance for IT Operations
• Essential
• Important
• Recommended
• Optional
• Importance Rating Based on:
• Fit with Business & IT Operations Strategy
• Likelihood of usage by clients
• Frequency of usage by clients
• Impact on SLA & OLA
• Expected ROI or Risk ROI of Use Case
• Novelty in approach (not a well known and well addressed problem) i.e. potential for differentiation
• Fit with Organization Capabilities (resources & tooling)
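Importance (columns) vs. Ease of Implementation (rows):
• Difficult, Essential or Important: Start planning for it
• Difficult, Recommended or Optional: Defer for later
• Not So Difficult, Essential or Important: Implement it (Low Hanging Fruit)
• Not So Difficult, Recommended or Optional: Think about it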
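Illustrative sketch (not from the original slides): the importance versus ease of implementation matrix above can be expressed as a small lookup, which is handy when screening a long list of candidate use cases. The function and variable names are hypothetical; the ratings and actions are taken from the matrix.

```python
# Importance ratings grouped into the two matrix columns
HIGH_IMPORTANCE = {"Essential", "Important"}
LOW_IMPORTANCE = {"Recommended", "Optional"}

# (importance band, ease of implementation) -> recommended action
ACTIONS = {
    ("high", "difficult"): "Start planning for it",
    ("low", "difficult"): "Defer for later",
    ("high", "not so difficult"): "Implement it (Low Hanging Fruit)",
    ("low", "not so difficult"): "Think about it",
}


def recommend_action(importance, difficult):
    """Map a use case's importance rating and difficulty to a matrix cell."""
    if importance in HIGH_IMPORTANCE:
        band = "high"
    elif importance in LOW_IMPORTANCE:
        band = "low"
    else:
        raise ValueError(f"unknown importance rating: {importance!r}")
    ease = "difficult" if difficult else "not so difficult"
    return ACTIONS[(band, ease)]


# Example: an Essential use case that is not so difficult to implement
quick_win = recommend_action("Essential", difficult=False)
```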
Step by Step Approach to using Analytics Top Down Approach is Recommended
Top Down Approach: Process to Decision Problems to Analytics
1. What is the decision problem or opportunity (define use case)?
• Associated with which IT process / which process activity and tasks?
• A given process & process activity; example: Capacity Management &
Capacity on Demand (CoD) allocation and Auto-Scaling activities
2. What are the variables associated with this decision?
• Identify, define, prioritize and select
3. What are the sources of data for the selected variables?
4. What is the time frame involved?
5. Is the data clean? Can we improve the quality of the data?
6. What is the appropriate set of analytic tools?
7. Do we need a different combination of analyses to improve the quality of analytic output?
8. Do we need one or more specific analytic models?
9. Do we need to do advanced modeling & simulation here?
10. How do we periodically refine & improve the analytics and models involved?
11. Does the analysis support the business need? For sense making, decision making, policy making, idea making or other?
12. How should I embed the analytic use case into one or more processes and/or applications?
Teamwork between Operations Guy and Data Scientist (Core Team)
Bottom Up Approach: Analytics to Decision Problem to Process
1. Here is an analytics technique – where can I apply it?
▪ Example: Conjoint Analysis, OK can I apply it for software requirements management & prioritization of requirements?
▪ Frequent Item Set Mining & Logical Item Set Mining, OK, can I apply them for monitoring & event analytics for incident prediction using identification of leading events?
2. Which processes and process activities are relevant for this given analytic technique?
3. Prioritize and select first process activity for application – based on potential impact and availability of data for analysis (see slide on criteria for Use Case selection)
4. Can I embed this analytic into process or application?
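Illustrative sketch (not from the original slides): the frequent item set example above could be prototyped as follows. Each "transaction" is the set of event types seen in a window before an incident, and item sets with high support are candidate leading events. This is brute-force counting rather than a pruned algorithm such as Apriori, and the event names and data are made up.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for item sets up to max_size whose
    support (share of transactions containing them) meets min_support."""
    total = len(transactions)
    counts = Counter()
    for events in transactions:
        unique = sorted(set(events))  # ignore repeats within one window
        for size in range(1, max_size + 1):
            for itemset in combinations(unique, size):
                counts[itemset] += 1
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical event types observed before four past incidents
pre_incident_windows = [
    ["disk_latency_high", "queue_depth_high", "cpu_spike"],
    ["disk_latency_high", "queue_depth_high"],
    ["disk_latency_high", "queue_depth_high", "login_failure"],
    ["cpu_spike", "login_failure"],
]
leading = frequent_itemsets(pre_incident_windows, min_support=0.75)
```

A real study would also compare against windows that were not followed by an incident, since events that are frequent everywhere carry no predictive signal.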
Service Level Driven IT: EDSM & EDSM-ITOA Service Level Objectives (SLO) are SAO, SPO, SCO, etc…
Service Availability Management
- Service Availability Objectives (SAO) & Metrics
Availability Architecture & Design
Availability Engineering & Deployment
Availability Monitoring, Analytics & Improvement
Service Performance Management
- Service Performance Objectives (SPO) & Metrics
Performance Architecture & Design
Performance Engineering & Deployment
Performance Monitoring, Analytics & Improvement
Service Continuity Management
- Service Continuity Objectives (SCO) & Metrics
Continuity Architecture & Design
Continuity Engineering & Deployment
Continuity Monitoring, Analytics & Improvement
Service Utilization & Metering Management
- Service U-M Objectives (SU-MO) & Metrics
Utilization & Metering Architecture & Design
Utilization & Metering Engineering & Deployment
Utilization & Metering Monitoring, Analytics & Improvement
Service Usability & User Experience Management
- Service UUX Objectives & Metrics
Usability & User Experience Architecture & Design
Usability & User Experience Engineering & Deployment
Usability & User Experience Monitoring, Analytics & Improvement
Service Security Management
- Service Security Objectives (SSO) & Metrics
Security Architecture & Design
Security Engineering & Deployment
Security Monitoring, Analytics & Improvement
EDSM-ITOA: Example Service Design Process Capacity, Utilization & Performance (CUP) Management as an Example Process
• Sizing service and resource (resource pool) capacity decisions
• Allocating resource capacity units to resource pools decisions
• Allocating resource CoD units to CoD pools (Capacity on Demand) decisions
• Service real-time and lagged scaling decisions (horizontal and vertical scaling decisions)
• Resource real-time and lagged scaling decisions (bottle neck analysis)
• Capacity, utilization and performance related event detection rules
• Utilization optimization decisions
• Among others
Diagram elements:
• Data and Information about Service Environment
• Time Series / Control Chart analysis of service and resource utilization metrics
• Bottle Neck Analysis (Scenario Analysis)
• Time Series / Control Chart analysis of service and resource performance metrics (example: response time)
• PCM Modeling (Block Diagrams, Reliability Modeling …)
• Service Capacity Risk Analysis
• Service Capacity and Performance Plan (CAPP) or Capacity & Performance Improvement Plan (CAPIP)
Notes:
• The analytic techniques in the brown boxes are related to each other and can feed each other (inputs and outputs), i.e. the output of one analysis can be input to another.
• Analytics involves both non-statistical and statistical analysis.
• Output of multiple analytics techniques can be aggregated and rolled up as input into a more composite analytics method.
EDSM-ITOA: Example Service Design Process Capacity & Performance Management as an Example Process
EDSM-ITOA: CUP Analytics Use Cases
Use Case 1:
• Relate Application / Appl. Functionality Performance Metrics (ARM or Application Response Metrics) to performance metrics associated with enabling components
Use Case 2:
• Analytics for CAP related abnormal event detection. Control chart patterns as abnormal events.
Use Case 3:
• Analytics for detecting and presenting (visualization of) bottle necks (bottle neck resources or components) in real time.
Use Case 4:
• Analytic model to predict demand for Capacity on Demand (COD) for physical COD units by data center, machine room and by type of unit (disk drive, HBA, system board, physical-virtual server, etc.)
Use Case 5:
• Analytics with modeling and simulation for understanding utilization patterns (say in private cloud environment)
Use Case 6:
• Analytics for optimizing resource utilization (say between 90% and 98%) via work load scheduling and load balancing (in a single or multi data center environment).
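Illustrative sketch (not from the original slides): Use Case 2 treats control chart signals as abnormal events. A minimal version computes a center line and limits from a baseline period and flags later samples that fall outside them. Using the population standard deviation and 3-sigma limits is a simplifying assumption, only the "point beyond the limits" rule is shown (not run or trend patterns), and the response time data is made up.

```python
from statistics import mean, pstdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower limit, center line, upper limit) from baseline samples."""
    center = mean(baseline)
    spread = pstdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(samples, lower, upper):
    """Indexes of samples falling outside the control limits."""
    return [i for i, value in enumerate(samples) if value < lower or value > upper]


# Hypothetical response times in milliseconds
baseline_response_ms = [100, 102, 98, 101, 99, 100, 103, 97]
lower, center, upper = control_limits(baseline_response_ms)

new_response_ms = [101, 99, 140, 100, 60]
abnormal = out_of_control(new_response_ms, lower, upper)  # candidate abnormal events
```

The flagged indexes could be raised as capacity or performance events for the event management process to filter and classify.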
Diagram elements:
• Data and Information about Service Environment
• SPOF or Single Point of Failure Analysis & CFIA (Component Failure Impact Analysis)
• Fault Tree and Decision Tree Analysis
• FMECA or Failure Mode Effect and Criticality Analysis
• Availability Modeling (Reliability Block Diagrams, Reliability Modeling …)
• Service Availability Risk Analysis
• Service Availability Plan (AP) or Availability Improvement Plan (AIP)
Notes:
• The analytic techniques in the brown boxes are related to each other and can feed each other (inputs and outputs), i.e. the output of one analysis can be input to another.
• Analytics involves both non-statistical and statistical analysis.
• Output of multiple analytics techniques can be aggregated and rolled up as input into a more composite analytics method.
EDSM-ITOA: Example Service Design Process Availability Management as an Example Process
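Illustrative sketch (not from the original slides): availability modeling with reliability block diagrams rests on two standard formulas. Components in series must all be up, so their availabilities multiply; redundant components in parallel fail only if all of them fail. The sketch assumes independent component failures, and the tiers and availability figures are made up.

```python
from math import prod


def series_availability(availabilities):
    """All components are required: A = A1 * A2 * ... * An."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """Any one component suffices: A = 1 - (1 - A1) * ... * (1 - An)."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical service: two redundant web servers, then a single
# application server and a single database (both single points of failure).
web_tier = parallel_availability([0.99, 0.99])
service = series_availability([web_tier, 0.999, 0.995])
```

Comparing such figures across alternative configurations is one way to support a SPOF analysis or an availability improvement plan.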
EDSM-ITOA: Example Service Transition Process Dev-Ops CI-CD for Application based on Micro-Services Architecture
Data Source: Analytics for Dev-Ops
• Analytics for Continuous Testing
• Analytics for Continuous Deployment
• Change & Release Risk Analytics
• Change & Release Impact Analytics
• Optimization of Continuous Deployment
Process steps shown: Filter Events; Relate / Analyze Events; Detect & Log Events; Classify Events; Resolve or Escalate Event; Close Event
Analytics and rules for detecting, classifying & filtering events – by service, technology domain and resource type.
- Analytics for Event Detection & Classification
Applied analytics and modeling of event and related data for
- sense, decision and policy making
- relevant services, processes and domains
- Analytics to Determine Root Event(s)
EDSM-ITOA: Example Service Operational Process Event Management
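Illustrative sketch (not from the original slides): rules for filtering, classifying and resolving or escalating events could be prototyped as plain functions before being configured in a monitoring tool. The severity values, keyword rules and category names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str
    severity: str  # hypothetical values: "info", "warning", "critical"
    message: str


def filter_events(events):
    """Filter step: drop purely informational events."""
    return [event for event in events if event.severity != "info"]


def classify(event):
    """Classify step: keyword rules standing in for real classification analytics."""
    text = event.message.lower()
    if "disk" in text or "storage" in text:
        return "storage"
    if "latency" in text or "timeout" in text:
        return "performance"
    return "other"


def resolve_or_escalate(event):
    """Escalate critical events; leave the rest for routine resolution."""
    return "escalate" if event.severity == "critical" else "resolve"


def process(events):
    """Run filter, classify and resolve-or-escalate over a batch of events."""
    return [
        (event.source, classify(event), resolve_or_escalate(event))
        for event in filter_events(events)
    ]


# Made-up sample events
sample_events = [
    Event("app01", "info", "Heartbeat OK"),
    Event("db01", "critical", "Disk usage above 95%"),
    Event("web02", "warning", "Request timeout on /checkout"),
]
results = process(sample_events)
```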
Automating Incident Diagnosis & RCA with
Perspica & other ITOA tools (CSRA Atlas)
EDSM ITOA: Implementation View
Managed Environment: Managed Services; Managed Resources
Managing Environment:
• Application Monitoring & Event Management; Application Management (Function); SDLC Team (Dev-Ops)
• Infrastructure Monitoring & Event Management; Technology Management (Function); IELC Team (Inf.-Ops)
• Service Management Platform: Incident Management Module; Problem Management Module; Configuration & Asset Module; Change, Release & Deployment Module
• ITOA Platform
SDLC – Software Development Life Cycle & IELC – Infrastructure Engineering Life Cycle
EDSM ITOA: Maturity Levels
Each level is described by: Salient Organizational Features; Strategy & Architecture Capabilities; Design & Build Capabilities; Run Capabilities.
Level 1: Ad Hoc
• Salient Organizational Features: No dedicated centralized ITOA Organization. No lead or analysts.
• Strategy & Architecture Capabilities: None.
• Design & Build Capabilities: None. Ad Hoc and Silo Use Cases with pseudo SME.
• Run Capabilities: Silo implementation of very few use cases or none.
Level 2: Defined
• Salient Organizational Features: Informal Central Organization. Analyst or Analytics Community (Small).
• Strategy & Architecture Capabilities: ITOA Vision & Mission Defined & Documented. Some Capabilities in Silos. None at the Enterprise Level.
• Design & Build Capabilities: Some key use cases well implemented by SME (Analysts / Data Scientists). Still at the Silo level.
• Run Capabilities: A few key use cases for critical services & processes up and running in production. Process performance below industry standards.
Level 3: Capable
• Salient Organizational Features: Formal Centralized ITOA Organization. Dedicated Lead (Director Level) and Dedicated Data Scientists. Community of Interest with a few active participants.
• Strategy & Architecture Capabilities: Level 2 plus well defined ITOA Adoption, Design & Implementation Guide. Some enterprise level capabilities such as ITOA Portal & Tooling (Licenses), Self Service Analytics Guides, etc.
• Design & Build Capabilities: Some major ITOA use cases implemented for Mission / Business Critical Services. Enterprise level adoption of a few key Use Cases (across Silos).
• Run Capabilities: Key processes at sub-optimal level based on a set of ITOA use cases. Majority of processes not efficient.
Level 4: Mature
• Salient Organizational Features: Level 3 plus a larger Community of Practitioners and a few Specialists (like IT Security Analytics Lead or Predictive Modeling Lead).
• Strategy & Architecture Capabilities: Level 3 plus significant enterprise level capabilities to leverage shared resources, including technical skills, tooling and funding, for ITOA projects with a proven business case.
• Design & Build Capabilities: All major ITOA use cases implemented for Mission / Business Critical Services. Enterprise level mandated implementation of key use cases.
• Run Capabilities: Key processes at near-optimal level based on consistent use of relevant ITOA capabilities. Majority of processes are effective and relatively efficient (near or above industry standard).
Level 5: Optimal
• Salient Organizational Features: Level 4 plus a large number of ITOA specialists. Center of Excellence with significant ITOA IP.
• Strategy & Architecture Capabilities: Level 4 plus continuous investments to improve and optimize capabilities at the Enterprise Level.
• Design & Build Capabilities: Level 4 plus continuous investments in use cases that incrementally optimize operational performance, i.e. use cases that help with optimizing key Business & IT processes.
• Run Capabilities: Business & IT Operations optimized with ITOA. Most important Business & IT Processes are optimized and controlled with embedded ITOA capabilities.