8/13/2019 Data storage management and retrieval
Section Objective
Upon completion of this section, you will be able to:
  Understand the concept of information availability and its measurement
  Describe the backup/recovery purposes and considerations
  Discuss architecture and different backup/recovery topologies
  Describe local replication technologies and their operation
  Describe remote replication technologies and their operation
Introduction to Business Continuity - 2
Chapter Objective
After completing this chapter, you will be able to:
  Define Business Continuity and Information Availability
  Detail impact of information unavailability
  Define BC measurement and terminologies
  Describe BC planning process
  Detail BC technology solutions
What is Business Continuity
Business Continuity is preparing for, responding to, and recovering from an application outage that adversely affects business operations
Business Continuity solutions address unavailability and degraded application performance
BC is an integrated and enterprise-wide process and set of activities to ensure information availability
What is Information Availability (IA)
IA refers to the ability of an infrastructure to function according to business expectations during its specified time of operation
IA can be defined in terms of three parameters:
  Accessibility: Information should be accessible at the right place and to the right user
  Reliability: Information should be reliable and correct
  Timeliness: Information must be available whenever required
Causes of Information Unavailability
[Figure: causes of information unavailability. Only the label "Disaster (" survives, truncated in the source.]
Impact of Downtime
Know the downtime costs (per hour, day, two days...)
Lost Productivity: Number of employees impacted (x hours out * hourly rate)
Lost Revenue: Direct loss; Compensatory payments; Lost future revenue; Billing losses; Investment losses
Damaged Reputation: Customers; Suppliers; Financial markets; Banks; Business partners
Financial Performance: Revenue recognition; Cash flow; Lost discounts (A/P); Payment guarantees; Credit rating; Stock price
Other Expenses: Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...
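The lost-productivity item on this slide is plain arithmetic: employees impacted times hours out times hourly rate. A minimal sketch of that calculation follows; the function name and the sample figures are assumptions chosen for illustration, not values from the slides.

```python
def lost_productivity(employees_impacted: int, hours_out: float, hourly_rate: float) -> float:
    """Cost of idle staff: employees impacted x hours out x hourly rate."""
    return employees_impacted * hours_out * hourly_rate


# Illustrative figures only: 200 employees idle for 4 hours at a rate of 35 per hour.
cost = lost_productivity(200, 4, 35.0)
print(f"Lost productivity: {cost:,.2f}")
```

The same pattern can be repeated per hour, per day and per two days to build the downtime cost picture the slide asks for.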
Measuring Information Availability
MTBF (Mean Time Between Failures): Average time available for a system or component to perform its normal operations between failures
MTTR (Mean Time To Repair): Average time required to repair a failed component
IA = MTBF / (MTBF + MTTR) or IA = uptime / (uptime + downtime)
[Figure: incident timeline. Labels shown: Incident, Detection, Detection elapsed time, Diagnosis, Response time, Repair, Repair time, Recovery, Recovery time, Restoration. MTTR is the time to repair, or downtime; MTBF is the time between failures, or uptime.]
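The availability formula IA = MTBF / (MTBF + MTTR) can be checked with a few lines of code. This is a minimal sketch; the function name and the sample component (1,000 hours between failures, 2 hours to repair) are assumptions for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """IA = MTBF / (MTBF + MTTR), returned as a fraction between 0 and 1."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical component: fails on average every 1,000 hours, takes 2 hours to repair.
ia = availability(1000, 2)
print(f"IA = {ia:.4%}")
```

Because MTTR sits in the denominator, shortening detection, diagnosis or repair raises availability even if the failure rate stays the same.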
Availability Measurement: Levels of 9s Availability
% Uptime    % Downtime   Downtime per Year   Downtime per Week
98%         2%           7.3 days            3 hrs 22 min
99%         1%           3.65 days           1 hr 41 min
99.8%       0.2%         17 hrs 31 min       20 min 10 sec
99.9%       0.1%         8 hrs 45 min        10 min 5 sec
99.99%      0.01%        52.5 min            1 min
99.999%     0.001%       5.25 min            6 sec
99.9999%    0.0001%      31.5 sec            0.6 sec
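The downtime columns follow directly from the uptime percentage: permitted downtime is the downtime fraction multiplied by the length of the period. The sketch below recomputes the table under the assumption of a 365-day year and a 7-day week of continuous operation; the names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24   # 8,760 hours, ignoring leap years
HOURS_PER_WEEK = 7 * 24     # 168 hours


def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Permitted downtime in hours for a given uptime percentage over a period."""
    return (100.0 - uptime_percent) / 100.0 * period_hours


for pct in (98, 99, 99.8, 99.9, 99.99, 99.999, 99.9999):
    per_year = downtime_hours(pct, HOURS_PER_YEAR)
    per_week = downtime_hours(pct, HOURS_PER_WEEK)
    print(f"{pct}% uptime: {per_year:.3f} h/year, {per_week * 60:.3f} min/week")
```

For 99.9% uptime, for example, this gives about 8.76 hours per year and about 10.08 minutes per week, in line with the table's 8 hrs 45 min and 10 min 5 sec.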
BC Terminologies
Disaster recovery
  Coordinated process of restoring systems, data, and infrastructure required to support ongoing business operations in the event of a disaster
  Restoring previous copy of data and applying logs to that copy to bring it to a known point of consistency
  Generally implies use of backup technology
Disaster restart
  Process of restarting from disaster using mirrored consistent copies of data and applications
  Generally implies use of replication technologies
BC Terminologies (Cont.)
Recovery Point Objective (RPO)
  Point in time to which systems and data must be recovered after an outage
  Amount of data loss that a business can endure
Recovery Time Objective (RTO)
  Time within which systems, applications, or functions must be recovered after an outage
  Amount of downtime that a business can endure and survive
[Figure: recovery-point objective and recovery-time objective scales, each marked Seconds, Minutes, Hours, Days, Weeks. Technologies shown on the RPO scale: Tape Backup, Periodic Replication, Asynchronous Replication, Synchronous Replication. Technologies shown on the RTO scale: Tape Restore, Disk Restore, Manual Migration, Global Cluster.]
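One way to read the two objectives together: a protection scheme meets the RPO if the worst-case age of the newest recoverable copy does not exceed it, and meets the RTO if the time to restore service does not exceed it. The sketch below assumes periodic copies, so the worst-case data loss equals the interval between copies. The class name, the function name and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProtectionScheme:
    name: str
    copy_interval_hours: float   # time between successive copies (worst-case data loss)
    restore_hours: float         # assumed time to bring the service back


def meets_objectives(scheme: ProtectionScheme, rpo_hours: float, rto_hours: float) -> bool:
    """True if worst-case data loss <= RPO and restore time <= RTO."""
    return scheme.copy_interval_hours <= rpo_hours and scheme.restore_hours <= rto_hours


# Hypothetical figures for illustration only.
nightly_tape = ProtectionScheme("nightly tape backup", copy_interval_hours=24, restore_hours=12)
hourly_replica = ProtectionScheme("hourly periodic replication", copy_interval_hours=1, restore_hours=2)

for scheme in (nightly_tape, hourly_replica):
    ok = meets_objectives(scheme, rpo_hours=4, rto_hours=4)
    print(f"{scheme.name}: {'meets' if ok else 'misses'} RPO=4h / RTO=4h")
```

Tightening either objective generally pushes the choice from backup-based schemes towards replication-based ones.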
Business Continuity Planning (BCP) Process
Identifying the critical business functions
Collecting data on various business processes within those functions
Business Impact Analysis (BIA)
Risk Analysis
Assessing, prioritizing, mitigating, and managing risk
Designing and developing contingency plans and disaster recovery plan (DR Plan)
Testing, training and maintenance
BC Technology Solutions
Following are the solutions and supporting technologies that enable business continuity and uninterrupted data availability:
Resolving single points of failure
Multi-pathing software
Backup and replication
  Backup recovery
  Local replication
  Remote replication
Resolving Single Points of Failure
[Figure: redundant data center configuration. Labels shown: Client, IP, Redundant Network, Clustered Servers, Heartbeat Connection, Redundant Ports, Redundant Paths, FC Switches, Redundant FC Switches, Storage Array, Redundant Arrays, Remote Site.]
Multi-pathing Software
Configuration of multiple paths increases data availability
Even with multiple paths, if a path fails, I/O will not reroute unless the system recognizes that it has an alternate path
Multi-pathing software helps to recognize and utilize an alternate I/O path to data
Multi-pathing software also provides load balancing
Load balancing improves I/O performance and data path utilization
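The two roles of multi-pathing software described above, failover and load balancing, can be illustrated with a toy model. This is a sketch under stated assumptions, not how any particular product works: the class, its methods and the path names are hypothetical, and round-robin is only one possible balancing policy.

```python
class MultiPathDevice:
    """Toy model of a LUN reachable over several paths (not a real driver)."""

    def __init__(self, paths):
        self.alive = {p: True for p in paths}   # path name -> health flag
        self._order = list(paths)
        self._next = 0                          # round-robin cursor

    def mark_failed(self, path):
        self.alive[path] = False

    def mark_restored(self, path):
        self.alive[path] = True

    def pick_path(self):
        """Round-robin over healthy paths; failed paths are skipped (failover)."""
        for _ in range(len(self._order)):
            path = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self.alive[path]:
                return path
        raise IOError("no healthy path to device")


lun = MultiPathDevice(["hba0:port_a", "hba1:port_b"])
print([lun.pick_path() for _ in range(4)])   # alternates between both paths
lun.mark_failed("hba0:port_a")
print([lun.pick_path() for _ in range(3)])   # all I/O rerouted to the surviving path
```

The key point is that rerouting only happens because the software tracks path health; with a single known path, the failure of that path would surface to the application as an I/O error.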
Backup and Replication
Local Replication
  Data from the production devices is copied to replica devices within the same array
  The replicas can then be used for restore operations in the event of data corruption or other events
Remote Replication
  Data from the production devices is copied to replica devices on a remote array
  In the event of a failure, applications can continue to run from the target device
Backup/Restore
  Backup to tape has been a predominant method to ensure business continuity
  Frequency of backup depends on RPO/RTO requirements
Chapter Summary
Key points covered in this chapter:
  Importance of Business Continuity
  Types of outages and their impact to businesses
  Information availability measurements
  Definitions of disaster recovery and restart, RPO and RTO
  Business Continuity technology solutions overview
Concept in Practice: EMC PowerPath
PowerPath: host-based software
  Resides between application and SCSI device driver
  Provides intelligent I/O path management
  Transparent to the application
  Automatic detection and recovery from host-to-array path failures
[Figure: server and storage. Labels shown: Host Application(s), PowerPath, SCSI Driver (six), SCSI Controller (six), Storage Network, LUN (four).]
Check Your Knowledge
Which concerns do business continuity solutions address?
Availability is expressed in terms of 9s. Explain the relevance of the use of 9s for availability, using examples.
What is the difference between RPO and RTO?
What is the difference between Disaster Recovery and Disaster Restart?
Provide examples of planned and unplanned downtime in the context of storage infrastructure operations.
What are some of the Single Points of Failure in a typical data center environment?
System Management
Management systems in storage networks
Five basic services
Different service interface
Requirements
User related
Component related
Architecture related
Requirements
User related
  Network administrator
    Data transport functions properly
    Transmission capacity and protocols
  Storage administrator
    Allocation of LUNs to the server
    RAID configuration
  Industrial economist
    Wear and tear of devices
    Related cost
  Balanced management:
    Conception of network
    Implementation of storage network
    Easier management
Requirements
Component related
Applications: These include all software that processes data in a
storage network.
Data: Data is the term used for all information that is processed by the
applications, transported over the network and stored on storage
resources.
Resources: The resources include all the hardware that is required for
the storage and the transport of the data and the operation of
applications.
Network: The term network is used to mean the connections between
the individual resources.
Individual component requirements: monitoring, availability, performance
or scalability.
Requirements
Architecture related
Servers and storage devices are decoupled by multiple virtualization
layers
Assignment of storage capacity to servers
Applications will be impacted by maintenance.
Host bus adapters, hubs, switches and gateways can each affect the data
flow.
Solution: Central management system
Basic Services
Discovery
Monitoring
Central configuration
Analysis
Data management
The discovery component detects the applications and resources
used in the storage network automatically.
It collects information about the properties, the current
configuration and the status of resources. The status comprises
performance and error statistics.
It correlates and evaluates all gathered information and supplies
the data for the representation of the network topology.
The monitoring component continuously compares the current
state of applications and resources with their target state.
In the event of an application crash or the failure of a resource, it
must take appropriate measures to raise an alert based upon the
severity of the error that has occurred.
The monitoring component performs error isolation by trying to
find the actual cause of the fault in the event of the failure of part
of the storage network.
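The comparison of current state against target state can be sketched as a simple diff that yields alerts ordered by severity. Everything below is illustrative: the resource names, the state strings and the severity mapping are assumptions, and a real monitoring component would obtain the current state from discovery rather than from a literal dictionary.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


def compare_states(target: dict, current: dict, severity_of: dict) -> list:
    """Return (severity, resource, message) alerts for every deviation from the target state."""
    alerts = []
    for resource, wanted in target.items():
        actual = current.get(resource, "missing")
        if actual != wanted:
            sev = severity_of.get(resource, Severity.WARNING)
            alerts.append((sev, resource, f"expected {wanted}, found {actual}"))
    # Most severe alerts first, so they can be raised with priority.
    return sorted(alerts, key=lambda a: a[0], reverse=True)


target = {"switch1": "online", "array1": "online", "backup_app": "running"}
current = {"switch1": "online", "array1": "degraded", "backup_app": "crashed"}
severity_of = {"array1": Severity.CRITICAL, "backup_app": Severity.WARNING}

for sev, resource, msg in compare_states(target, current, severity_of):
    print(f"[{sev.name}] {resource}: {msg}")
```

Error isolation would build on the same data: if several alerts share a common upstream resource, that resource is the first candidate for the actual cause.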
The central configuration component significantly simplifies the
configuration of all components.
For instance, the zoning of a switch and the LUN masking of a
disk subsystem for the setup of a new server can be configured
centrally where in the past the usage of isolated tools was
required.
Only a central management system can help the administrator to
coordinate and validate the individual steps.
Furthermore, it is desirable to simulate the effects of potential
configuration changes in advance, before the real changes are executed.
The analysis component continuously collects current
performance statistics, error statistics and configuration
parameters and stores them in a data warehouse.
This historical data enables trend analysis to determine capacity
limits in advance, so that necessary expansions can be planned in time. This
supports operational as well as economic conclusions.
A further aspect is the spotting of error-prone components and
the detection of single points of failure.
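Trend analysis on historical data can be as simple as fitting a straight line to past usage and extrapolating to the capacity limit. The sketch below uses an ordinary least-squares fit; the choice of a linear model, the function names and the sample history are assumptions, and real usage curves may call for a different model.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_tb, capacity_tb):
    """Extrapolate the usage trend; None if usage is flat or shrinking."""
    slope, intercept = linear_fit(days, used_tb)
    if slope <= 0:
        return None
    full_day = (capacity_tb - intercept) / slope
    return max(0.0, full_day - days[-1])


# Hypothetical history: usage sampled every 30 days on a 100 TB array.
days = [0, 30, 60, 90]
used_tb = [40.0, 46.0, 52.0, 58.0]
print(days_until_full(days, used_tb, capacity_tb=100.0))
```

With this history the usage grows by 0.2 TB per day, so the projection lands roughly 210 days after the last sample, which is the kind of lead time needed to plan an expansion.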
The data management component covers all aspects regarding
the data such as performance, backup, archiving and migration
and controls the efficient utilization and availability of data and
resources.
The administrator can define policies to control the placement
and the flow of the data automatically.
Characteristics of management interfaces:
There are two main types of device in the storage network:
  connection devices
  end-point devices
Types of interfaces:
  In-band
  Out-band
  Standardized
  Proprietary
In-band
  Interfaces for the management of end-point devices
  Management functions for discovery, monitoring and configuration of connection
  devices and end-point devices are made available on this interface
Out-band
  Most end-point devices have one or more interfaces that are not directly connected to
  the storage network, but are available on a second, separate channel.
  In general, these are LAN connections and serial cables. This channel is not
  intended for data transport, but is provided exclusively for management purposes.
  This interface is therefore called out-band.
Standardized
Proprietary
Standardized: The standardisation and development for in-band management is found at two levels.
In-band transport levels: The management interfaces for Fibre Channel, TCP/IP and
InfiniBand are defined on the in-band transport levels.
In-band upper layer protocols (ULP): Primarily SCSI variants such as Fibre Channel FCP
and iSCSI are used as an upper layer protocol. SCSI has its own mechanisms for requesting
device and status information: the so-called SCSI Enclosure Services (SES). In addition to the
management functions on transport levels, a management system can also use these
upper layer protocol operations in order to identify an end device and request status
information.
Proprietary: Proprietary interfaces are differentiated into application programming
interfaces (APIs), Telnet and Secure Shell (SSH) based interfaces, and element managers.
APIs: A great number of devices have an API over which special management functions
can be invoked. These are usually out-band, but can also be in-band, implementations.
Element manager: An element manager is a device-specific management interface. It is
frequently found in the form of a graphical user interface (GUI) on a further device or in the
form of a web user interface (WUI) implemented over a web server integrated in the device
itself. Since the communication between element manager and device generally takes place
via a separate channel next to the data channel, element managers are classified amongst the
out-band management interfaces.
In-band Management:
In-band management runs over the same interface as the one that connects
devices to the storage network and over which normal data transfer takes place.
This interface is thus available to every end device node and every connection
node within the storage network. The management functions are implemented as
services that are provided by the protocol in question via the nodes.
Two types of services:
Operational services: Operational services serve to fulfil the actual tasks of
the storage network such as making the connection and data transfer.
Management-specific services: Management-specific services supply the
functions for discovery, monitoring and the configuration of devices.
In order to be able to use in-band services, a so-called management agent is normally
needed that is installed in the form of software upon a server connected
to the storage network. This agent communicates with the local host bus adapter
over an API in order to call up appropriate in-band management functions from an
in-band management service.
In-band Management:
In-band management runs through the same interface that connects devices to the
storage network and via which the normal data transfer takes place. A management
agent accesses the in-band management services via the HBA API.
In-band Management: Fibre Channel SAN
Services for management: Each service defines one or more so-called servers. Servers are split
into individual components and implemented in distributed form by connecting individual
components through the Fibre Channel SAN.
Directory services
Management service
Types of servers
Name server: The name server is defined by the directory service. It is an example of an
operational service. Its benefit for a management system is that it reads out connection
information and the Fibre Channel specific properties of a port (node name, port type).
Configuration Server: The configuration server belongs to the class of management-specific
services. It is provided by the management service. It allows a management system to detect
the topology of a Fibre Channel SAN.
Zone server: The zone server performs both an operational and an administrative task. It
permits the zones of a Fibre Channel SAN fabric to be configured (operational) and detected
(management-specific).
In-band Management: Fibre Channel SAN
Discovery
The configuration server is used to identify devices in the Fibre Channel SAN and to
recognise the topology. The so-called Request Node Identification Data (RNID) function is
also available to the management agent via its host bus adapter API, which it can use to
request identification information from a device in the Fibre Channel SAN. The
Request Topology Information (RTIN) function allows information to be called up about connected
devices.
Suitable chaining of these two functions finally permits a management system to discover the
entire topology of the Fibre Channel SAN and to identify all devices and properties. If, for
example, a device is also reachable out-band via a LAN connection, then its IP address can be
requested in-band in the form of a so-called management address. This can then be used by
the software for subsequent out-band management.
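The chaining described above amounts to a graph traversal: identify a node, ask which nodes are connected to it, then repeat for every neighbour not yet seen. The sketch below simulates this. The two functions `request_node_identification` and `request_topology_information` are hypothetical stand-ins for the RNID and RTIN requests an agent would issue through its HBA API, and the fabric contents, including the management addresses, are invented for illustration.

```python
from collections import deque

# Invented in-memory fabric standing in for a real Fibre Channel SAN.
FABRIC = {
    "hba0":    {"type": "host bus adapter", "mgmt_ip": None,       "links": ["switch1"]},
    "switch1": {"type": "switch",           "mgmt_ip": "10.0.0.5", "links": ["hba0", "array1", "tape1"]},
    "array1":  {"type": "disk subsystem",   "mgmt_ip": "10.0.0.9", "links": ["switch1"]},
    "tape1":   {"type": "tape library",     "mgmt_ip": None,       "links": ["switch1"]},
}


def request_node_identification(node):
    """Stand-in for RNID: returns identification data for one node."""
    info = FABRIC[node]
    return {"type": info["type"], "mgmt_ip": info["mgmt_ip"]}


def request_topology_information(node):
    """Stand-in for RTIN: returns the nodes connected to the given node."""
    return list(FABRIC[node]["links"])


def discover(start):
    """Breadth-first traversal that chains the two requests."""
    devices, links = {}, set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in devices:
            continue
        devices[node] = request_node_identification(node)
        for neighbour in request_topology_information(node):
            links.add(frozenset((node, neighbour)))
            if neighbour not in devices:
                queue.append(neighbour)
    return devices, links


devices, links = discover("hba0")
for name, ident in devices.items():
    print(name, ident)
```

Nodes that report a management address in-band (here `switch1` and `array1`) are the ones the management software could subsequently reach out-band.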
In-band Management: Fibre Channel SAN
Monitoring
Since in-band access always facilitates communication with each node in a Fibre Channel
SAN, it is simple to also request link and port state information. Performance data can also be
determined in this manner. For example, a management agent can send a request to a node in
the Fibre Channel SAN so that this transmits its counters for error, retry and traffic. With the aid
of this information, the performance and usage profile of the Fibre Channel SAN can be
derived. This type of monitoring requires no additional management entity on the nodes in
question and also requires no out-band access to them. The FC-GS-4 standard also defines
extended functions that make it possible to call up state information and error statistics of other
nodes. Two commands that realise the collection of port statistics are: Read Port Status Block
(RPS) and Read Link Status Block (RLS).
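Counters for errors, retries and traffic become useful once two readings are compared: the difference divided by the polling interval gives a rate from which a usage and error profile can be derived. The sketch below assumes the counters are cumulative and that a management agent has already fetched two snapshots; the counter names and values are invented and are not the actual fields returned by RPS or RLS.

```python
def counter_rates(previous: dict, current: dict, interval_seconds: float) -> dict:
    """Per-second rate of each counter between two polls of the same port."""
    return {
        name: (current[name] - previous[name]) / interval_seconds
        for name in current
        if name in previous
    }


# Invented counter snapshots from two polls taken 60 seconds apart.
poll_1 = {"frames_tx": 1_000_000, "frames_rx": 900_000, "link_failures": 2, "retries": 10}
poll_2 = {"frames_tx": 1_600_000, "frames_rx": 1_200_000, "link_failures": 2, "retries": 40}

rates = counter_rates(poll_1, poll_2, interval_seconds=60)
for name, rate in rates.items():
    print(f"{name}: {rate:.2f}/s")
```

A rising retry rate alongside flat traffic, as in this invented sample, is the kind of pattern that would point to a degrading link rather than to increased load.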
In-band Management: Fibre Channel SAN
Messaging
In addition to the passive management functions described above, the Fibre Channel
protocol also possesses active mechanisms such as the sending of messages, so-called
events. Events are sent via the storage network in order to notify the other nodes of status
changes of an individual node or a link. Thus, for example, in the occurrence of the failure of a
link at a switch, a so-called Registered State Change Notification (RSCN) is sent as an event
to all nodes that have registered for this service. This event can be received by a registered
management agent and then transmitted to the management system.
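The registration-and-notification pattern behind such events can be sketched as a small publish/subscribe model. This is only an illustration of the idea: the `Fabric` and `ManagementAgent` classes, their methods and the event fields are hypothetical, and real RSCN delivery is handled by the Fibre Channel fabric itself.

```python
class Fabric:
    """Toy event dispatcher; real RSCN delivery is handled by the fabric itself."""

    def __init__(self):
        self._registered = []    # callbacks of nodes that registered for notifications

    def register(self, callback):
        self._registered.append(callback)

    def state_change(self, node, new_state):
        """Notify every registered node of a status change."""
        event = {"node": node, "state": new_state}
        for callback in self._registered:
            callback(event)


class ManagementAgent:
    def __init__(self):
        self.forwarded = []      # events passed on to the management system

    def on_event(self, event):
        self.forwarded.append(event)


fabric = Fabric()
agent = ManagementAgent()
fabric.register(agent.on_event)

fabric.state_change("switch1:port3", "link down")
print(agent.forwarded)
```

Only registered nodes are notified, which is why the management agent must register before it can relay status changes to the management system.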
In-band Management: Fibre Channel SAN
Zoning Problem:
In-band Management: Fibre Channel SAN
Services for management: Two services
Directory services
Management service