The Storage Networking Company, INRANGE Technologies Corporation

Effective Strategies for SAN Performance Monitoring with PerformanceVSN
David Signori
Product Marketing Manager, Software Solutions
INRANGE Technologies Corporation
12/9/02
NTSMF User’s Group - CMG

Current Challenges in Storage Networking Administration
• Planning network requirements for Business Continuance applications.
• Planning network requirements for the ever-increasing size and complexity of the storage environment.
• Lowering management cost while increasing storage networking performance.
• Implementing a Service Provider model consisting of charge back, reporting, and service level agreements to end users.
• Eliminating finger pointing with Server, Network, and Database administration groups.
• Managing heterogeneous environments.
• Decreasing or eliminating downtime.
Ultimately, how do I increase and guarantee performance while lowering cost?

Storage Networking Performance Monitoring Solution
Requirements:
• Session Layer Traffic Flow Monitoring
• External to the Storage Networking Equipment
• Standards-based Management, Collection, and Reporting Interfaces
• Simple Plug-and-Play Configuration and Operation
• Persistence: Permanent Records of Traffic Behavior
• Flexible Reporting Capabilities
• Policy Monitoring and Alerting
• Enhance Storage Network Security
• Scalable
A Comprehensive Storage Networking Performance Monitoring Solution will increase performance and lower cost.

What is PerformanceVSN?
Product Overview
• Definition
  • INRANGE Storage Networking Performance Monitoring Solution for Capacity Planning and Service Level Management.
• Components
  • PerformanceVSN Server (Appliance)
  • PerformanceVSN Server Software
  • Optional PerformanceVSN Probe
• Base Functionality
  • PerformanceVSN Server + Server Software
  • Port-level statistics collection, both real-time and historical
  • Statistics gathered from INRANGE Directors & switches
• Advanced Functionality
  • PerformanceVSN Server + Server Software + Probe(s)
  • Session-level statistics collection, both real-time and historical
  • Statistics gathered from INRANGE Directors & switches + Probe(s)
[Figure: PerformanceVSN Server and PerformanceVSN Probe]

Performance Monitoring Requirements
Session Layer Traffic Flow Monitoring
Port vs. Session Layer Statistics
[Figure: SAN topology with Server_A through Server_G, an ISL, and RAID_A, RAID_B, and RAID_C, each with LUNs 1..n]
Port statistics: ISL at 60% utilization
Session statistics:
Total ISL utilization: 60%
Server_A to RAID_B util: 35%
  Server_A to RAID_B / LUN 3 util: 10%
  Server_A to RAID_B / LUN 9 util: 15%
  Server_A to RAID_B / LUN 5 util: 10%
Server_B to RAID_C util: 25%
  Server_B to RAID_C / LUN 2 util: 22%
  Server_B to RAID_C / LUN 7 util: 3%
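
The relationship between the two views can be illustrated with a minimal Python sketch. It uses the utilization figures from the slide; the record layout and function names are assumptions made for illustration and are not part of PerformanceVSN. The port counter sees only the sum, while the session records show which host, storage device, and LUN produced it.

```python
from collections import defaultdict

# Session-level samples from the slide: (host, storage, LUN, % of ISL bandwidth).
SESSIONS = [
    ("Server_A", "RAID_B", 3, 10.0),
    ("Server_A", "RAID_B", 9, 15.0),
    ("Server_A", "RAID_B", 5, 10.0),
    ("Server_B", "RAID_C", 2, 22.0),
    ("Server_B", "RAID_C", 7, 3.0),
]

def rollup_by_pair(sessions):
    """Sum LUN-level utilization into host-to-storage totals."""
    totals = defaultdict(float)
    for host, storage, _lun, util in sessions:
        totals[(host, storage)] += util
    return dict(totals)

def port_total(sessions):
    """The single figure a port counter would report for the link."""
    return sum(record[3] for record in sessions)

pairs = rollup_by_pair(SESSIONS)
print(pairs)                 # per host-to-storage utilization
print(port_total(SESSIONS))  # what the ISL port statistic shows
```

Going the other way is not possible: the port total alone cannot be split back into its sessions, which is why the slide treats session-layer monitoring as a separate requirement.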

Performance Monitoring Requirements
FICON Layer 2 Session Layer Traffic Flow Monitoring
[Figure: Server_A, Server_B, and Server_C with Channel_A1 through Channel_C2; FICON_Storage_A and FICON_Storage_B with CU_A1, CU_A2, CU_B1, and CU_B2]
Port statistics: CU_B2 60% utilization
Session statistics:
Total CU_B2 utilization: 60%
Channel_A1 to CU_B2 util: 35%
Channel_B2 to CU_B2 util: 20%
Channel_C1 to CU_B2 util: 5%

FICON Cascading – High Integrity Fabric
[Figure: Server_A through Server_D with Channel_A1 through Channel_D2; FICON_Storage_A, FICON_Storage_B, and FICON_Storage_C with CU_A1 through CU_C2; an ISL between them]
Port statistics: ISL 60% utilization
Session statistics:
Total ISL utilization: 60%
Channel_D1 to CU_B2 util: 35%
Channel_A2 to CU_C1 util: 20%
Channel_C1 to CU_C2 util: 5%

Performance Monitoring
FICON ULP Session Layer Traffic Flow Monitoring
[Figure: Server_A (LPAR_A1A, LPAR_A1B), Server_B, and Server_C with Channel_A1 through Channel_C2; FICON_Storage_A and FICON_Storage_B with CU_A1 through CU_B2, CUADD_B2A, CUADD_B2B, and Device_B2A1 through Device_B2B3]
Port statistics: CU_B2 60% utilization
Session statistics:
Total CU_B2 utilization: 60%
Channel_A1 to CU_B2 util: 35%
  Channel_A1 to CUADD_B2B util: 20%
    Channel_A1 to Device_B2B1 util: 15%
    Channel_A1 to Device_B2B3 util: 5%
  Channel_A1 to CUADD_B2A util: 15%
    Channel_A1 to Device_B2A1 util: 10%
    Channel_A1 to Device_B2A2 util: 5%
Channel_B2 to CU_B2 util: 20%
Channel_C1 to CU_B2 util: 5%

Session Layer Reporting
Examples
• Real-time Summary of the Selected LUNs in SCSI Read Mbytes/Sec being currently accessed by all hosts.
• Note that this is a system wide report across all servers on the network.

Session Layer Reporting
Examples
• Real-time Summary of the Top 5 LUNs in Total Mbytes/Sec being currently accessed by Host “Server_A”.
• Note LUNs 9, 5, 7, 6, and 8 on storage device “RAID_A”.

Session Layer Reporting
Examples
• Real-time Summary of the Top 5 LUNs in Read Duration for Host “Server_A”.
• Note that this is a measure of latency and is reporting on the 5 LUNs with the highest latency on the network.

Session Layer Reporting
Examples
• Trend of SCSI Exchanges/Sec between host “Server_A” and storage device “RAID_A” for the past 2 hours.

Performance Monitoring Requirements
External to Storage Networking Devices
[Figure: Servers and Storage linked to Remote Storage by Metro Disk Mirroring and WAN Disk Mirroring, with a Performance Monitoring Probe]
• Resources in network devices should be dedicated to the distribution and handling of incoming and outgoing data streams.
• Many potential problems at the framing and upper layers are not reported.
• Although external, probe should be non-intrusive.

Performance Monitoring Requirements
Standards Based
[Figure: Performance Monitoring Platform with Management, Reporting, and Collection interfaces]
Management: SAN Management, Data Management, Virtualization, SRM, Enterprise Management
Reporting: Java GUI, Spreadsheets, SAS, Home grown
Collection: 3rd Party Devices, Probes, Switches/Directors, Routers/Channel Extension
Protocols and formats shown: SNMP, Fibre Alliance MIB, TCP/IP; SNMP, CIM/XML; CSV, SQL, HTTP

Performance Monitoring Requirements
Standards Based
• Should Support Heterogeneous Environments:
  • Multi-Vendor Equipment
  • FICON, FCP, IP, and VI
  • Fibre Channel and WAN
• Should Support Standalone Deployment or as a Plug-In to Chosen SAN Management Application
• Adds value to chosen storage management applications
• Should Function as a Plug-In to Chosen Enterprise Management System.
• Should Leverage Performance Monitoring Capabilities in Existing Equipment: Metrics and Access
• Service Provider-Type Reporting

Performance Monitoring Requirements
Simple Plug-and-Play Configuration and Operation
• Should Support Topology Rollup and Automatic Discovery of ports, devices, and LUNs.
• Session and SCSI layer monitoring should be reported by human-readable logical port and device names
• Permanent Statistics Logging should start automatically and have easily configurable sampling periods
• Should Support a Dashboard for Quick Health Assessment
• Should Support Open Systems Management for Remote and Desktop Access.

Performance Monitoring Requirements
Persistence: Permanent Records of Traffic Behavior
• Should support user-configurable historic sampling intervals
• Should support user-configurable rollup periods and retention times for efficient database usage
• Should support archival and export of database for long term capacity planning
• Persistent statistical storage enables capacity planning and trouble-shooting of problems that occurred in the past
• Should support historical trend reports for capacity planning and performance tuning.
• Should support historical summaries for Service Provider-Type Reporting.
• Should support bookmarks and pre-configured time durations for frequently viewed reports and Service Provider-Type Reporting
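
As a rough illustration of rollup periods and retention times, the following Python sketch averages fine-grained samples into coarser buckets and drops samples that fall outside a retention window. The function names, the sample layout, and the choice of averaging are assumptions for illustration; the slides do not describe how any product implements this.

```python
from collections import defaultdict

def rollup(samples, period_s):
    """Average (timestamp_s, value) samples into buckets of period_s seconds.

    Returns a list of (bucket_start_s, mean_value) sorted by bucket start.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % period_s].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]

def apply_retention(samples, now_s, retention_s):
    """Keep only samples that are at most retention_s seconds old."""
    return [(ts, v) for ts, v in samples if now_s - ts <= retention_s]

# Hypothetical 60-second samples of port throughput in Mbytes/sec.
raw = [(0, 10.0), (60, 20.0), (120, 30.0), (300, 40.0), (360, 60.0)]
five_minute = rollup(raw, 300)             # coarser series kept for long-term trends
recent_raw = apply_retention(raw, 360, 120)  # fine-grained series kept only briefly
print(five_minute)
print(recent_raw)
```

Keeping the raw series for a short time and the rolled-up series for much longer is one way to get the "efficient database usage" the slide asks for while still supporting long-term capacity planning.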

Performance Monitoring Requirements
Persistence: Permanent Records of Traffic Behavior
Examples
• Trend of Total Mbytes/Sec In and Out for a selected port over the past 2 hours
• Note that report was requested at 18:30 and displayed historical data. This is not a trace that began at 16:30.

Performance Monitoring Requirements
Persistence: Permanent Records of Traffic Behavior
Examples
• Trend of Total Mbytes/Sec In and Out for a selected port over the past 8 hours
• Note that in addition to customized time periods, pre-configured time periods like Today, Yesterday, Current Week, and Last Month should be possible.

Performance Monitoring Requirements
Persistence: Permanent Records of Traffic Behavior
Examples
• Trend of SCSI Exchanges/Sec between host “Server_A” and storage device “RAID_A” for the past 2 hours.

Performance Monitoring Requirements
Persistence: Permanent Records of Traffic Behavior
Examples
• Summary of the Top 5 LUNs in Total Mbytes/Sec accessed by Host “Server_A” for the month of May 2002.
• Note LUNs 9, 5, 7, 6, and 8 on storage device “RAID_A”.

Performance Monitoring Requirements
Flexible Reporting Capabilities
• Should Support Real-Time Monitoring
• Should Support Collection of Hundreds of Metrics including Diagnostics
• Should Include Value-Added Derived Reports like TopN, Rates, and Multiple Devices and Statistics in a Single Report
• Should Support Configurable Sampling Intervals
• Should Support Bookmarks to Easily Return to Frequently Viewed Reports.
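
Two of the derived reports named above, rates and TopN, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names and the per-LUN figures are hypothetical, and the rate calculation assumes a monotonically increasing counter read at two points in time.

```python
def rate_per_sec(prev_count, curr_count, interval_s):
    """Turn two readings of an increasing counter into a per-second rate."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (curr_count - prev_count) / interval_s

def top_n(stats, n=5):
    """Return the n (name, value) pairs with the largest values, highest first."""
    return sorted(stats.items(), key=lambda item: item[1], reverse=True)[:n]

# Hypothetical Mbytes/sec per LUN for one host.
lun_mbytes_per_sec = {
    "LUN 2": 0.5, "LUN 5": 7.5, "LUN 6": 3.0,
    "LUN 7": 4.25, "LUN 8": 1.0, "LUN 9": 9.0,
}

top5 = top_n(lun_mbytes_per_sec, 5)
print(top5)
print(rate_per_sec(1000, 1600, 60))
```

The sampling interval passed to `rate_per_sec` corresponds to the configurable sampling interval in the list above: a shorter interval shows spikes that a longer one averages away.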

Performance Monitoring Requirements
Flexible Reporting Capabilities
Hundreds of Metrics, Examples …
Utilization:
•Frames (In/Out)
•FC-2 MB/Sec (In, Out)
•FC-4 MB/Sec (In, Out by ULP – SCSI, IP, VI, FICON, and others)
•Errors MB/Sec (In, Out)
•SCSI IO/Sec (Read, Write, Other)
•SCSI Read (avg, min, max, read percentage)
•SCSI Write (avg, min, max, write percentage)
•SCSI Other (other percentage)
•SCSI Read/Write Payload Size Ranges (percentage)
Throughput Errors:
•Busy Frames
•Rejected Frames
•Link Failures
•Aborts
•Primitive Seq Protocol Errors
•Invalid Tx Words
•Delimiter Errors
•Discarded Frames
•BSYs and RJTs (Port, Fabric)
•CRC Errors
Availability:
•Link Resets (In/Out)
•OLS (In/Out)
•LOGIs (Port, Fabric)
•%Available
Link Integrity:
•Sync Loss
•Sig Loss
Capacity:
•%capacity for all frames
•FC-4 %capacity (SCSI,IP,VI,FICON, other)
•% capacity link control
•% capacity link services
Latency:
•SCSI Read/Write Duration (ms)

Performance Monitoring Requirements
Flexible Reporting Capabilities
Examples
• Real-time Summary of Total Mbytes/Sec for 24 selected ports.
• Note that multiple ports across multiple switches can be added to single report.
• Note Report is accessed using a Bookmark

Performance Monitoring Requirements
Flexible Reporting Capabilities
Examples
• Real-time Summary of percent read exchange size to storage device “RAID_A” from all hosts on the network.
• Real-time sampling interval can be modified.
• Report can be toggled to trend by simply selecting tool bar button.
• Multiple metrics in a single report

Performance Monitoring Requirements
Policy Monitoring and Alerting
• Should support proactive troubleshooting to eliminate or decrease downtime
• Should support open real time alerting (e.g. SNMP, Email)
• Should support multiple levels of thresholds
• Should support pre-defined threshold definitions for quick and easy configuration
• Thresholds should be supported on all metrics collected including errors, type of traffic, size of traffic, etc … and all objects including ports, devices, and logical units
• Ideal for Service Provider Model since administrator knows about potential problems before end-user.
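
Multiple threshold levels on arbitrary metrics and objects can be sketched as follows. The severity names, threshold values, and record layout are hypothetical, chosen only to show the idea; delivery of the alert (SNMP trap, email) is left out.

```python
# Hypothetical severity levels, listed from most to least severe.
LEVELS = [("critical", 90.0), ("major", 75.0), ("warning", 60.0)]

def classify(value, levels=LEVELS):
    """Return the most severe level whose threshold the value meets, or None."""
    for name, threshold in levels:
        if value >= threshold:
            return name
    return None

def check(samples, levels=LEVELS):
    """Build an alert record for every (object, metric, value) that crosses a threshold."""
    alerts = []
    for obj, metric, value in samples:
        severity = classify(value, levels)
        if severity is not None:
            alerts.append(
                {"object": obj, "metric": metric, "value": value, "severity": severity}
            )
    return alerts

alerts = check([("ISL", "utilization_pct", 80.0), ("port 3", "utilization_pct", 20.0)])
print(alerts)
```

Because `check` takes the object and metric as plain fields, the same logic applies to ports, devices, or logical units, and to error counts as well as utilization.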

Performance Monitoring Requirements
Enhanced Security Policies
• Role-Based Security
• Event Logging
• Security Policy Monitoring: Alerting on unauthorized Host to LUN access
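
Alerting on unauthorized Host to LUN access amounts to comparing observed sessions against an access policy. The sketch below assumes a simple policy table keyed by host and storage device; the table contents and the function name are hypothetical.

```python
# Hypothetical policy: the LUNs each host may reach on each storage device.
ALLOWED = {
    ("Server_A", "RAID_A"): {5, 6, 7, 8, 9},
    ("Server_B", "RAID_C"): {2, 7},
}

def find_violations(observed, allowed=ALLOWED):
    """Return observed (host, storage, lun) sessions that the policy does not permit."""
    return [
        (host, storage, lun)
        for host, storage, lun in observed
        if lun not in allowed.get((host, storage), set())
    ]

observed_sessions = [
    ("Server_A", "RAID_A", 9),   # permitted
    ("Server_B", "RAID_A", 9),   # host has no entry for this storage device
    ("Server_B", "RAID_C", 3),   # LUN not in the permitted set
]
violations = find_violations(observed_sessions)
print(violations)
```

This check depends on session-layer visibility: port counters alone do not reveal which host touched which LUN.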

Performance Monitoring Requirements
Scalability
• Should Support a Combination of Software and Hardware to Suit your needs.
• Should Support an Inexpensive Entry Point that is easily Expandable as your Network Grows.
• Should Support a Roadmap around Future Storage Networking requirements (e.g. 10G, FC-IP, iSCSI, InfiniBand)
• Should be Data Center ready (e.g. multiple interfaces in a single enclosure, rack-mountable)

Performance Monitoring Life-Cycle
Putting it all together
Performance Profiling:
• Record and Monitor Current Network Performance Levels
Performance Thresholding:
• Set Thresholds based on profiles for real-time alerting to throughput and availability problems.
Performance Tuning:
• Adjust traffic flows based on profiles for better network performance without spending for more resources.
Capacity Planning:
• Know exactly when and how much more resources are needed without overspending.
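
The step from profiling to thresholding can be sketched in Python. The slides do not say how a threshold should be derived from a profile; the rule used here (baseline mean plus k standard deviations) is a common heuristic chosen as an assumption, and the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev

def threshold_from_profile(baseline, k=3.0):
    """Suggest an alert threshold: baseline mean plus k population standard deviations."""
    if not baseline:
        raise ValueError("baseline must contain at least one sample")
    return mean(baseline) + k * pstdev(baseline)

# Hypothetical utilization samples (percent) recorded during profiling.
baseline_util = [40.0, 50.0, 60.0, 50.0]
suggested = threshold_from_profile(baseline_util, k=2.0)
print(round(suggested, 2))
```

A larger `k` gives fewer alerts at the cost of later warning. A percentile of the baseline would be a reasonable alternative where traffic is strongly skewed.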

Case Study and ROI
Large Financial Brokerage – Metro Area Disk Mirroring
[Figure: Servers and Storage connected to Remote Storage by FICON and FCP links over DWDM, with the Performance Monitoring Platform]

Case Study and ROI
Performance Profiling
MAN extender usage across a selected week. Note spikes in traffic.

Case Study and ROI
Performance Profiling
Drilling into MAN extender usage for a specific day. Note spike in traffic between noon and 1 PM.

Case Study and ROI
Performance Tuning
Drilling into Storage port usage identifies offending Storage Device

Case Study and ROI
Large Financial Brokerage – Metro Area Disk Mirroring
Given:
• DWDM Channel costs $16k/month.
• Customer was considering going to 4 channels per fabric but justified that, for the time being, 3 per fabric was adequate.
Result:
• ROI was less than 2 months for this particular solution.
Additional Benefits:
Capacity Planning:
• Visibility into utilization trends determines exactly when additional channels will be needed.
Performance Tuning:
• Visibility into the offending storage device provides load balancing feedback to re-map devices to lower utilized links, thus optimizing channels.
Standards-Based:
• Provides seamless visibility into the FICON portion of the fabric as well.
Real-Time Monitoring:
• Reports on errors for trouble-shooting and diagnostics.
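
The payback arithmetic behind the case study can be sketched as follows. Only the $16k/month channel cost and the one deferred channel per fabric come from the slide. The number of fabrics and the solution cost are not stated, so the values below are hypothetical placeholders used to show the calculation.

```python
def monthly_savings(channel_cost_per_month, channels_avoided_per_fabric, fabrics):
    """Recurring cost avoided by not adding the extra DWDM channels."""
    return channel_cost_per_month * channels_avoided_per_fabric * fabrics

def payback_months(solution_cost, savings_per_month):
    """Months until the avoided cost equals the cost of the monitoring solution."""
    if savings_per_month <= 0:
        raise ValueError("savings_per_month must be positive")
    return solution_cost / savings_per_month

CHANNEL_COST = 16_000   # from the slide: $16k per DWDM channel per month
CHANNELS_AVOIDED = 1    # from the slide: 3 channels per fabric instead of 4
FABRICS = 2             # assumption: not stated in the slides
SOLUTION_COST = 50_000  # assumption: not stated in the slides

savings = monthly_savings(CHANNEL_COST, CHANNELS_AVOIDED, FABRICS)
print(savings)
print(payback_months(SOLUTION_COST, savings))
```

Read the other way, a payback of under 2 months means the solution cost was below two months of avoided channel charges, that is, below $32k for each fabric involved.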

Performance Monitoring Solutions to Current Challenges
• Planning network requirements for Business Continuance applications:
  • Answers question of how many MAN extender links you need.
  • Answers question of how much WAN extender bandwidth you need.
  • Traces spikes in MAN/WAN extender link back to the device and volume that caused it.
  • Enables you to know when you will need more bandwidth.
  • Reports on Latency
• Planning network requirements for the ever-increasing size and complexity of the storage environment:
  • Answers question of how many ISLs you need.
  • Answers question of what is the optimum server-to-storage ratio.
  • Enables you to know when you will need more ports.
  • Traces spikes in ISL and storage port back to the device and volume (LUNs) that caused it.

Performance Monitoring Solutions to Current Challenges
• Lowering management cost while increasing storage networking performance
• Implementing a Service Provider model consisting of charge back, reporting, and service level agreements to end users.
• Eliminating finger pointing with Server, Network, and Database administration groups.
• Since Session Layer Monitoring correlates usage and errors to the individual server, storage device, and volume (LUN), accountability can be maintained at the department level.
• Session layer response time metrics allow you to distinguish between network, server, and storage device latency.
• Reports, both real-time and historical, are only a mouse click away. No need for tedious spreadsheet crunching.
• Command line launch and open APIs for seamless integration with 3rd party storage management application.

Performance Monitoring Solutions to Current Challenges
• Managing heterogeneous environments.
• Decreasing or eliminating downtime with proactive policy-based monitoring.
• Because solution is external to networking devices and uses standard collection interfaces, it is independent of fabric vendor, ULP, and can extend to the WAN.
• Real-time and SNMP alerts on user-defined thresholds. You profile the network and define behavior. Solution provides real-time notification of policy violation.
• Combines the best of both worlds:
  • Level of visibility on par with expensive diagnostic tools
  • Ease of use and capacity planning of an Enterprise service level management application.

Advanced Performance Monitoring Solutions
•Capacity Planning/Modeling: Planning for network usage of resources not yet added. For example, a new department adds 10 clients that access application X on Server A, which already has 100 clients. To which disks will throughput from Server A increase by 10%? ROI Potential: If you under-use ISLs you are over-spending.
•Service Duplication/Modeling: Planning for WAN usage of an application not yet added. For example, the WAN will support a disk mirror. How much bandwidth is needed to adequately support write I/O to particular disks or volumes? ROI Potential: If you under-use WAN links you are over-spending.
•Performance Tuning: An Application/Server consolidation example: Applications needing access to much of the same data are candidates to run on the same server or in the same cluster. ROI Potential: If you under-use servers you are over-spending.
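
The capacity-modeling example (100 clients growing to 110) can be sketched with session statistics as input. The sketch assumes throughput scales linearly with the number of clients, which the slide implies but does not state; the function name and the per-LUN figures are hypothetical.

```python
def project_throughput(current_mbytes_per_sec, current_clients, added_clients):
    """Scale each session's throughput by the relative growth in clients.

    Assumes load grows linearly with the client count.
    """
    if current_clients <= 0:
        raise ValueError("current_clients must be positive")
    new_total = current_clients + added_clients
    return {
        target: rate * new_total / current_clients
        for target, rate in current_mbytes_per_sec.items()
    }

# Hypothetical session statistics for Server A, in Mbytes/sec per target.
server_a_sessions = {"RAID_B / LUN 3": 10.0, "RAID_B / LUN 9": 15.0}
projected = project_throughput(server_a_sessions, current_clients=100, added_clients=10)
print(projected)
```

Because the input is keyed by storage device and LUN, the projection shows which disks receive the added load, which a port-level total cannot do.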

Advanced Performance Monitoring Solutions
•Performance Tuning: Save cost by separating the types of transactions on the network. For instance, separating transaction (I/O) and data intensive operations will allow more transactions ($) and deeper data mining.
•Add value to storage management applications: Example: performance monitoring application feeds data backup/replication application so that backup time period is automatically selected and optimized.
•Performance Management: Automate actions based on conditions detected. Example: Feedback loop to switching devices for intelligent routing decisions.
•Life-Cycle Data Monitoring: Based on level of access over network, determine appropriate storage type for particular data or application. Provides feedback for HSM.

Questions
or
For a Copy of the Presentation: