EMC Proven Professional Knowledge Sharing 2010
Best Practices for designing, deploying, and administering SAN using EMC CLARiiON Storage Systems
Anuj Sharma
[email protected]
TABLE OF CONTENTS

I. Abstract
II. Executive Summary
III. Introduction
IV. Essentials
A. Designing SAN
   1. SAN Topology Considerations
   2. SAN Topologies
   3. Information Gathering
   4. Choosing a Switch Type
   5. Sample SAN Fabric
B. Implementation Phase
   1. Switch and Zoning Best Practices
   2. IP SAN Best Practices
   3. RAID Group Best Practices
   4. HBA Tuning
   5. Hot Sparing Best Practices
   6. Optimizing Cache
   7. Vault Drive Best Practices
   8. Virtual Provisioning Best Practices
   9. Drive Spin Down Technology
   10. Aligning File System
C. Post Implementation Phase
   1. Health Checkup
   2. Performance Monitoring
LIST OF FIGURES

1. Digital Universe
2. SAN Features
3. Deployment Phases
4. Single Switch Topology
5. Full Mesh Topology
6. Partial Full Mesh Topology
7. Core Edge Fabric
8. SAN Fabric
9. Execution Throttle Change Snapshot
10. Extop Screenshot
11. Extop Screenshot
12. Changing Queue Depth
13. Thin Provisioning
14. Creating Storage Pools
15. Disk Drive Spin Down
16. Disk Crossing
17. DAE Checkup
18. LCC Checkup
19. Disk Module Checkup
20. DAE Checkup
21. SPE Checkup
22. SPE Checkup
23. Navisphere Analyzer
24. Navisphere Analyzer

LIST OF TABLES

1. Single Switch Subjective Rating
2. Full Mesh Switch Subjective Rating
3. Partial Mesh Switch Subjective Rating
4. Core Edge Subjective Rating
5. IOPS Requirements
6. Switch Case Scenario
7. 64 Switch Case Scenario
8. Increased ISLs between core and edge switches
9. HBA Tuning Parameters
10. Hot Spare Provisioning Example
11. Cache Recommendations
12. Vault Drive Recommendations
13. Data Center Environment Requirements
I. ABSTRACT

If your Windows or UNIX networks are expanding to keep pace with a growing business, you
need more than simple, server-based information storage solutions. You need an enterprise-
capable, fault-tolerant, high-availability solution, and EMC® CLARiiON® products are the
answer. CLARiiON storage systems employ the industry’s most extensive set of data integrity
and data availability features, such as dual-active storage processors, mirrored write caching,
data verification, and fault-recovery algorithms, and support complex cluster configurations,
including Oracle and Microsoft software. CLARiiON data storage solutions have long been
recognized as the most robust and innovative in the industry. Features of CLARiiON systems
include:
• Flash drives
• UltraFlex™ technology
• Fibre Channel/iSCSI connectivity
• Virtual provisioning
• Tiered storage
• Virtualization-aware management
• Virtual LUN technology
• Drive Spin-Down Technology
The performance and flexibility of CLARiiON storage systems have to be backed by superior
availability. This is where CLARiiON really shines. Its design has no single point of failure, all
the way down to the fans and power cords.
This paper includes the practices that I follow and recommend to realize the benefit of
CLARiiON’s features and optimally utilize its resources to maximize performance, from initial
solution design, to implementation, to SAN administration.
When designing a SAN, there are many important things to consider before you jump right in.
For starters, you need to know how the components fit together in order to choose a SAN
design that will work for you. Like most storage managers, you'll want to design your SAN to
fit today's storage needs as well as meet tomorrow's increased storage capacity requirements.
Aside from being scalable, the ideal SAN should also be designed for resiliency and high
availability with the least amount of latency. This article will touch on the following topics:
• Initial SAN solution design
• SAN topologies to be considered for different SAN deployments
• Zoning best practices
• Practices that should be followed while implementing the SAN
• Using thin provisioning the way it should be used, and how it brings utilization and capacity benefits
• VMware ESX Server with EMC CLARiiON storage systems
• …and many more
This article will benefit anyone who implements, manages, or administers SAN using EMC
CLARiiON storage systems.
II. Executive Summary
As an enterprise grows and evolves, the organization's data grows exponentially, along with
its data storage requirements. Meeting legal and organizational compliance requirements
has become more difficult as data retention periods have been extended. To avoid future
trouble, organizations have started taking compliance more seriously.
According to an IDC survey, by 2011, the digital universe will be 10 times its size in 2006.
The diversity of the digital universe can be seen in the variability of file sizes, from six
gigabyte movies on DVD to 128-bit signals from RFID tags. Because of the growth of VoIP,
sensors, and RFID, the number of electronic information “containers” — files, images,
packets, tag contents — is growing 50% faster than the number of gigabytes. The information
created in 2011 will be contained in more than 20 quadrillion — 20 million billion — of such
containers, a tremendous management challenge for both businesses and consumers.
Meanwhile, media, entertainment, and communications industries will account for 10 times
their share of the digital universe in 2011 as their portion of worldwide gross economic output.
The picture related to the source and governance of digital information remains intact:
approximately 70% of the digital universe is created by individuals, but enterprises are
responsible for the security, privacy, reliability, and compliance of 85%. So the requirement
for storage media is increasing twofold, day by day.
Figure 1 Digital Universe
DAS and NAS implementations allow companies to store and access data effectively, but
often inefficiently. This leads to the isolation of storage to the specific devices, making it
difficult to manage and share. Storage area networks (SANs) have the advantage of
centralization, resulting in improved efficiencies. A SAN is a dedicated storage network that
solves many of the complex business data storage needs. Fibre Channel switches enable
increased connectivity and performance, allowing for interconnected SANs and, ultimately,
enterprise-level accessibility of SAN applications and data.
As SANs continue to grow, many factors need to be considered to help scale and manage
them. A SAN should be designed with present and future needs in mind, keeping in mind the
needs of the data center manager. For example:
• 24X7 Data Availability
• Flexible Architecture
• Resilient and Robust Architecture
• Cost Effectiveness
• Hassle-Free Information Management
• Scalable Infrastructure
• Optimally catering to the bandwidth requirements of different applications
Figure 2 SAN FEATURES
The storage system is the most important component of a SAN. The speed and efficiency with which
storage arrays can respond to I/O requests from the servers is critical for minimizing
transaction response times.
In addition to speed and efficiency, the storage system should meet the following critical requirements:
• Availability – Ensure that data is accessible at all times when needed. Loss of access to
data can have significant financial impact on businesses.
• Security – Prevent unauthorized access to data. Mechanisms must be in place to allow
servers to access only their allocated resources on storage arrays.
• Capacity – Ability to add storage capacity “on-demand”, without interruption to the
business. If a database runs out of physical storage space, it comes to a halt, thus impacting
the business.
• Scalability – The storage solution should be able to grow with the business. As the
business grows, more servers are deployed and new applications/databases developed.
• Performance – Service all the I/O requests at high speed. With the centralized model,
several servers connect to one storage array. The intelligence of the array, the processors,
and architecture should enable optimal performance.
• Data Integrity – Throughout the I/O chain, checks have to be in place to ensure that data is
not corrupted along the way. The storage system has to “guarantee” that the data that was
sent to it was indeed the data that was written to disk and is available for retrieval when
requested.
• Manageability – The operations and activities required to meet all of these requirements
should be performed seamlessly and with minimal disruption to business activity.
Also, a 2006 IDC study found that power and cooling costs are escalating rapidly as newer,
denser servers and storage come online. Customers building new data centers are planning
for “Green IT”, a hot topic in IT circles. Today’s storage systems should have the intelligence
to use power wisely. EMC CLARiiON addresses this concern with the new Drive Spin Down
technology. While CLARiiON, as an intelligent storage system, meets all of the above critical
requirements, it may not always meet all the requirements of an administrator unless certain
practices are followed in the pre-implementation and implementation phases. There are
practices that we should follow while implementing a SAN that maximize resource utilization
and optimize storage system performance.
III. INTRODUCTION
This paper focuses on large IP-SAN or FC-SAN deployments within a data center, and
provides best practices and design considerations when designing a reliable and efficient
SAN using EMC CLARiiON storage systems.
This paper comprises the following three sections:
• Designing SAN
• Implementing SAN
• Administering SAN
Figure 3 Phases
Each section comprises best practices that I feel should be followed at different stages of
SAN deployment to optimally utilize the available resources. Having top-of-the-line SAN
equipment does not guarantee optimal performance. We need to focus on certain parameters
while designing, implementing, and administering a SAN to get optimal performance out of
the available resources.
Apart from best practices, this paper will also focus on the features that enable EMC
CLARiiON storage systems to stand tall among the competitors.
IV. ESSENTIALS

Domain ID
A byte-wide field in the three-byte Fibre Channel address that uniquely identifies a switch in a
fabric. The three fields in a FCID are domain, area, and port. A distinct Domain ID is
requested from the principal switch. The principal switch allocates one Domain ID to each
switch in the fabric. A user may be able to set a Preferred ID which can be requested of the
Principal switch, or set an Insistent Domain ID. If two switches insist on the same DID one or
both switches will segment from the fabric.
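The three-field layout of the Fibre Channel address can be illustrated with a short sketch; the function name and sample address below are invented for illustration, not part of any EMC tool:

```python
def parse_fcid(fcid: int) -> dict:
    """Split a 24-bit Fibre Channel address into its three byte-wide fields."""
    return {
        "domain": (fcid >> 16) & 0xFF,  # Domain ID: identifies the switch
        "area":   (fcid >> 8) & 0xFF,   # area within the switch
        "port":   fcid & 0xFF,          # port on the switch
    }

# A hypothetical address 0x6A0B2C decomposes as domain 0x6A, area 0x0B, port 0x2C:
print(parse_fcid(0x6A0B2C))
```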
Director Switches
An enterprise-class Fibre Channel switch, such as the Connectrix® ED-140M, MDS 9509, or
ED-48000B. Directors deliver high availability, failure ride-through, and repair under power to
ensure maximum uptime for business-critical applications. Major assemblies, such as power
supplies, fan modules, switch controller cards, switching elements, and port modules, are all
hot-swappable. The term director may also refer to a board-level module in the Symmetrix®
that provides the interface between host channels (through an associated adapter module in
the Symmetrix) and Symmetrix disk devices.

Interswitch Link (ISL)
A physical E_Port connection between any two switches in a Fibre Channel fabric. An ISL
forms a hop in a fabric.
HBA
A bus card in a host system that allows the host system to connect to the storage system.
Typically, the HBA communicates with the host over a PCI or PCI Express bus and has a
single Fibre Channel link to the fabric. The HBA contains an embedded microprocessor with
on-board firmware, one or more ASICs, and a Small Form Factor Pluggable module (SFP) to
connect to the Fibre Channel link.
Fabric
One or more switching devices that interconnect Fibre Channel N_Ports, and route Fibre
Channel frames based on destination IDs in the frame headers. A fabric provides discovery,
path provisioning, and state change management services for a Fibre Channel environment.
LUN
In computer storage, a logical unit number (LUN) is simply the number assigned to a logical
unit. A logical unit is a SCSI protocol entity, the only one which may be addressed by the
actual input/output (I/O) operations. Each SCSI target provides one or more logical units, and
does not perform I/O as itself, but only on behalf of a specific logical unit.
NAS
Network-attached storage (NAS) is file-level computer data storage connected to a computer
network providing data access to heterogeneous network clients. A NAS unit is essentially a
self-contained computer connected to a network, with the sole purpose of supplying file-
based data storage services to other devices on the network. The operating system and other
software on the NAS unit provide the functionality of data storage, file systems, and access to
files, as well as the management of these functionalities.
Over Subscription
The ratio of bandwidth required to bandwidth available. A switch is oversubscribed when all
ports, associated pair-wise in any random fashion, cannot sustain full duplex at full line rate.
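As a rough sketch, the ratio can be computed directly; the port counts and the 4 Gb/s link speed below are assumed for illustration:

```python
def oversubscription_ratio(required_gbps: float, available_gbps: float) -> float:
    """Bandwidth required divided by bandwidth available; > 1.0 means oversubscribed."""
    return required_gbps / available_gbps

# 32 host-facing ports at 4 Gb/s contending for 8 uplink ports at 4 Gb/s:
print(oversubscription_ratio(32 * 4, 8 * 4))  # 4.0, i.e. a 4:1 oversubscription
```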
Port Fencing
Port fencing is a policy-based feature that allows you to protect your SAN from repeated
operational or security problems experienced by switch ports. Port fencing allows you to set
threshold limits on the number of specific port events permitted during a given time period. If
the port generates more events during the specified time period, the Connectrix Manager
(Port Fencing feature) blocks the port, disabling transmit and receive traffic, until you have
time to investigate, solve the problem, and manually unblock the port.
Principal Switch
The switch in a multiswitch fabric that allocates domain IDs to itself and to all
other switches in the fabric. There is always one principal switch in a fabric. If a switch is not
connected to any other switches, it acts as its own principal switch.
SAN
A storage area network (SAN) is an architecture to attach remote computer storage devices
(such as disk arrays, tape libraries, and optical jukeboxes) to servers in such a way that the
devices appear locally attached to the operating system. Although SAN cost and complexity
are dropping, SANs are still uncommon outside larger enterprises.
VSAN
An allocation of switch ports that can span multiple physical switches, forming a virtual fabric.
A single physical switch can sometimes host more than one VSAN.
World Wide Node Name
A unique identifier, even on global networks. The WWN is a 64-bit number
(XX:XX:XX:XX:XX:XX:XX:XX). The WWN contains an OUI, which uniquely identifies the
equipment manufacturer. OUIs are administered by the Institute of Electrical and Electronics
Engineers (IEEE). The Fibre Channel environment uses two types of WWNs: a World Wide
Node Name (WWNN) and a World Wide Port Name (WWPN). Typically, the WWPN is used
for zoning (path provisioning function).
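A small sketch can validate the colon-separated WWN format and pull out the OUI. It assumes an IEEE registered-format WWN in which the OUI occupies bytes 3 through 5; the sample WWPN is made up for illustration:

```python
import re

# Eight colon-separated hex byte pairs, e.g. 10:00:00:00:c9:12:34:56
WWN_RE = re.compile(r"^([0-9A-Fa-f]{2}:){7}[0-9A-Fa-f]{2}$")

def wwn_oui(wwn: str) -> str:
    """Validate the 64-bit WWN format and return the manufacturer's OUI bytes.

    Assumes an IEEE registered (NAA Type 1) format, where the OUI occupies
    bytes 3-5 of the eight-byte WWN.
    """
    if not WWN_RE.match(wwn):
        raise ValueError(f"not a valid WWN: {wwn!r}")
    return ":".join(wwn.split(":")[2:5]).upper()

print(wwn_oui("10:00:00:00:c9:12:34:56"))  # 00:00:C9
```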
Zone
An information object implemented by the distributed Name Server (dNS) of a Fibre Channel
switch. A zone contains a set of members which are permitted to discover and communicate
with one another. The members can be identified by a WWPN or port ID. EMC recommends
the use of WWPNs in zone management. Zoning allows an administrator to group several
devices by function or by location. All devices connected to a connectivity product, such as a
Connectrix switch, may be configured into one or more zones.
Zone Set
An information object implemented by the distributed Name Server (dNS) of a Fibre Channel
switch. A Zone Set contains a set of Zones. A Zone Set is activated against a fabric, and only
one Zone Set can be active in a fabric.
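The zone/zone-set relationship can be modeled in a few lines: two members may discover and communicate with each other only if they share membership in some zone of the active zone set. The zone names and WWPNs below are invented for illustration; this is not a Connectrix API:

```python
# Hypothetical active zone set: each zone maps to its set of member WWPNs.
active_zone_set = {
    "zone_oracle": {"10:00:00:00:c9:11:11:11", "50:06:01:60:00:00:00:a1"},
    "zone_backup": {"10:00:00:00:c9:22:22:22", "50:06:01:60:00:00:00:a2"},
}

def can_communicate(wwpn_a: str, wwpn_b: str) -> bool:
    """True if the two WWPNs share at least one zone in the active zone set."""
    return any(wwpn_a in members and wwpn_b in members
               for members in active_zone_set.values())

print(can_communicate("10:00:00:00:c9:11:11:11", "50:06:01:60:00:00:00:a1"))  # True
print(can_communicate("10:00:00:00:c9:11:11:11", "50:06:01:60:00:00:00:a2"))  # False
```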
A. Designing SAN
1. SAN Topology Considerations
The adoption of SANs is driven by a variety of objectives. Some examples are:
• The need for more efficient use of enterprise storage arrays
• Decreasing size of backup/restore windows
• Increasing size of data sets to be backed up
• The need for improved high availability and disaster tolerance solutions
• The need to enhance storage resource management
SAN design can appear to be a challenging task, due to the large number of variables
involved in picking an appropriate design strategy. Designing a fabric involves many variables
that require consideration. With each variable consideration comes a separate design
decision that must be made. Each design decision will help you create a fabric design that is
appropriate for your business information model. The following parameters will help in
choosing the right SAN topology according to the requirements.
a. Accessibility
Accessibility refers to the ability of your hosts to access the storage that is required to service
their applications. Accessibility can be measured by your ability to physically connect and
communicate with the individual storage arrays, as well as your ability to provide enough
bandwidth resources to meet your full-access performance requirements. A storage array that
is physically accessible, but cannot be accessed within accepted performance limits because
of oversaturated paths to the device, may be just as useless as an array that cannot be
reached physically.
A fabric, like the telephone system, is a statistical bandwidth network. The telephone system
is not constructed with enough bandwidth resources to allow every subscriber to
communicate simultaneously. You may have heard "all lines are currently busy; please try
your call again later." This message indicates that the number of subscribers has saturated
the bandwidth currently available, so no new connections are possible until resources are
freed.
Similar issues can arise in the design and implementation of a fabric. You should also
consider the internal design of the switching devices used in your fabric when considering
accessibility. While switches may be designed for high levels of connectivity and allow many
physical attachments, their internal designs may cause internal bandwidth congestion.
b. Availability
Availability is a measurement of the amount of time that your data can be accessed,
compared to the amount of time the data is not accessible because of issues in the
environment. Lack of availability might be a result of failures in the environment that cause a
total loss of paths to the device, or it might be an event that caused so much bandwidth
congestion that the access performance renders the device virtually unavailable. Availability
is impacted not only by your choice of components used to build the fabric, but also by your
ability to build redundancy into the environment.
Another concept that adds to the availability of an environment is sparing, which is the
process of dedicating resources to remain unused until they are needed to take the place of a
failed resource. The following must be considered in your redundancy and sparing plan:
◆ How much bandwidth do I need to preserve after a single event occurs?
◆ What other applications might be affected when the original storage resources move to a new
path or down to a single path?
◆ Do I need to plan for scenarios that include successive failures?
◆ Do I want redundancy built into my connectivity components (as seen with director-class
switching devices)?
◆ Do I want to build site redundancy and copy data to another site using CLARiiON
MirrorView™?
◆ Do I want to build redundancy at the host level with a load-balancing and paths failover
application (like PowerPath®)?
◆ How do I rank my business applications so that I can identify lower priority tasks, so these
resources can be used as spares during a failure event? An example of this would be if task one
had failed due to all of its fiber links being damaged and fiber links from task two were used to
bring up the resources associated with task one. When the resources were back online, both
tasks would be working at 50 percent efficiency.
c. Resource Consolidation
Resource consolidation includes the concepts of both physical and logical consolidation.
Physical consolidation involves the physical movement of resources to a centralized location.
Now that these resources are located together, you may be able to more efficiently use
facility resources, such as HVAC (heating, ventilation, and air conditioning), power protection,
personnel, and physical security. The trade-off that comes with physical consolidation is the
loss of resilience against a site failure. Flexibility is a measure of how rapidly you are able to
deploy, shift, and redeploy new storage and host assets in a dynamic fashion without
interrupting your currently running environment. An example of flexibility is the ability to
simply connect new storage into the fabric and then zone it to any host in the fabric.
d. Security
Security refers to the ability to protect your operations from external and internal malicious
intrusions, as well as the ability to protect against accidental or unintentional data access by
unauthorized parties. Security can range from restricting physical access to the servers,
storage, and switches by placing them in a locked room, to logical security associated with
zoning and volume access/masking.
e. Supportability
Supportability is the measure of how easy it is to effectively identify and troubleshoot issues,
as well as to identify and implement a viable repair solution in the environment. The ability to
troubleshoot may be enhanced through good fabric designs, purposeful placement of servers
and storage on the fabric, and a switch's ability to identify and report issues on the switch
itself or in the fabric. Fabric topologies can be designed so that data traffic patterns are
deterministic, traffic bandwidth requirements can be easily associated with individual
components, and placement policies can be documented so that troublesome components
can be identified quickly.
2. SAN Topologies
This section describes the SAN topologies that can be considered for designing a SAN
according to the relative importance of the parameters explained above. This will help us
decide which topology best meets our requirements.
2.a Simple Fibre Channel SAN topologies
A simple Fibre Channel SAN consists of fewer than four directors and switches connected by
ISLs, with no more than two hops. A single-switch fabric, consisting of only one switch, is the
simplest of the simple Fibre Channel SAN topologies.
Figure 4. Single Switch Topology
[Table 1: Single Switch Subjective Rating. The attributes Accessibility, Availability,
Consolidation, Flexibility, Scalability, Security, and Supportability are each rated on a scale
from 5 (most) to 1 (least).]
The following best practices are specific to two-switch fabrics.
ISL subscription best practice — While planning the SAN, keep track of how many
host and storage pairs utilize the ISLs between domains. As a general best practice,
if two switches are connected by ISLs, ensure that there is a minimum of two ISLs
between them and that there are no more than six initiator and target pairs per ISL.
For example, if 14 initiators access a total of 14 targets between two domains, a total
of three ISLs would be necessary. This best practice should not be applied blindly
when setting up a configuration. Consider the applications that will use the ISLs.
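The rule of thumb above (a floor of two ISLs, at most six initiator/target pairs per ISL) reduces to a one-line calculation; the function name is illustrative:

```python
import math

def min_isls(initiator_target_pairs: int, pairs_per_isl: int = 6, floor: int = 2) -> int:
    """Minimum ISLs between two domains: at least `floor` ISLs, and no more
    than `pairs_per_isl` initiator/target pairs per ISL."""
    return max(floor, math.ceil(initiator_target_pairs / pairs_per_isl))

print(min_isls(14))  # 14 pairs at 6 per ISL -> 3 ISLs, matching the example above
print(min_isls(4))   # few pairs, but never fewer than 2 ISLs
```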
2.b Complex Fibre Channel SAN topologies
2.b.a Full-Mesh fabric
A full-mesh fabric is any collection of Fibre Channel switches in which each switch
is connected to every other switch in the fabric by one or more ISLs. For best host
and storage accessibility, it is recommended that a full-mesh fabric contain no
more than four switches. A mesh may contain departmental switches, directors,
or both, depending on your connectivity needs. When designing and
implementing a full-mesh fabric, it is recommended that you lay out the storage
and servers in a single-tier logical topology design and plan your ISL
requirements based on the assumption that 50% of the traffic on any one switch
will remain local and the other 50% will originate from the remaining remote
switches.
Figure 5. Full-Mesh Topology
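The 50% local / 50% remote planning assumption above can be turned into a rough ISL-sizing sketch; the 4 Gb/s link speeds and port counts are assumed for illustration, not prescribed values:

```python
import math

def isls_per_neighbor(host_ports: int, port_gbps: float, n_switches: int,
                      remote_fraction: float = 0.5, isl_gbps: float = 4.0) -> int:
    """Rough ISL count toward each neighbor for one switch in a full mesh,
    assuming `remote_fraction` of its host traffic is spread evenly across
    the other switches."""
    remote_bw = host_ports * port_gbps * remote_fraction
    per_neighbor = remote_bw / (n_switches - 1)
    return max(1, math.ceil(per_neighbor / isl_gbps))

# 24 host ports at 4 Gb/s on each of 4 switches:
print(isls_per_neighbor(24, 4.0, 4))  # 48 Gb/s remote over 3 neighbors -> 4 ISLs each
```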
Benefits
Full-mesh configurations give you, at most, one-hop access from any server to any storage
device on the fabric. This means that when you are adding or migrating storage or server
attachments, you have the greatest possibility of placing the server attachment and matching
storage attachments anywhere in the fabric and achieving the same response time. Meshes
also ensure that you always have multiple local and remote paths to the data even after fabric
events have occurred.
Limitations
Scaling a full-mesh solution becomes complicated and costly when increasing the number of
switches and required ISLs to guarantee traffic performance.
[Table 2: Full-Mesh Switch Subjective Rating. The attributes Accessibility, Availability,
Consolidation, Flexibility, Scalability, Security, and Supportability are each rated on a scale
from 5 (most) to 1 (least).]
2.b.b Partial Mesh fabric
Figure 6. Partial-Mesh Topology
A partial-mesh fabric is different from a full mesh in that each switch does not have to be
connected to all other switches. However, to be considered a partial mesh, the fabric must be
a configuration where splitting it results in each new sub-fabric being a full mesh. For best
fabric response times, both the managed switch (where zoning is activated) and the principal
switch should be at the logical center of the fabric.
Benefits
Partial-mesh designs offer extensive access to both local switch storage and single-hop
storage. A partial mesh also extends accessibility and provides many unique paths to the
storage. Increasing accessibility while maintaining the same level of robustness is a design
goal for every topology. Partial meshes also offer a simple progression into a core/edge
design. If you look at the center of the partial mesh as the core, you can create the new
infrastructure by simply removing some of the ISLs at the outer edges of the fabric.
Limitations
Increasing the fabric size always increases the dependencies within the fabric. This does not
necessarily cause a problem, but it does increase the complexity of troubleshooting and the impact on
unrelated processes during a fabric event.
[Table 3: Partial-Mesh Switch Subjective Rating. The attributes Accessibility, Availability,
Consolidation, Flexibility, Scalability, Security, and Supportability are each rated on a scale
from 5 (most) to 1 (least).]
2.c Core Edge Fibre Channel SAN topologies
1. Two-tier: Core-edge design
Within the two-tier design, servers connect to the edge switches, and storage devices
connect to one or more core switches. This allows the core switch to provide storage services
to one or more edge switches, thus servicing more servers in the fabric. The interswitch links
(ISLs) will have to be designed so that the overall fabric maintains both the fan-out ratio of
servers to storage and the overall end-to-end oversubscription ratio.
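The end-to-end oversubscription from an edge switch can be estimated as aggregate server bandwidth divided by aggregate ISL bandwidth; the port counts and 4 Gb/s speeds below are illustrative assumptions:

```python
def edge_oversubscription(server_ports: int, port_gbps: float,
                          isls: int, isl_gbps: float) -> float:
    """Edge-to-core oversubscription: total server bandwidth entering an edge
    switch divided by total ISL bandwidth leaving it toward the core."""
    return (server_ports * port_gbps) / (isls * isl_gbps)

# 28 servers at 4 Gb/s behind 4 ISLs at 4 Gb/s:
print(edge_oversubscription(28, 4.0, 4, 4.0))  # 7.0, i.e. a 7:1 ratio
```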
2. Three-tier: Edge-core-edge design
A three-tier design may be ideal in environments where future network growth will result in
the number of storage devices exceeding the number of ports available at the core switch.
This type of topology still uses a set of edge switches for server connectivity, but adds
another set of edge switches for storage devices. Both sets of edge switches connect to a
core switch via ISLs.
Figure 7. Core Edge Fabric
Benefits
The compound core/edge model maintains a robust, highly efficient traffic model while
reducing the required ISLs, thus increasing the available ports for both storage and host
attachments. It also offers a simple method for the expansion of two or more simple
core/edge fabrics into a single environment. You can easily create a compound core topology
by connecting the core switches from simple core/edge fabrics into a full mesh. The
compound core/edge topology creates a robust back-end fabric that can extend the
opportunities for sharing of both backup and storage resources.
Limitations
Core/edge design models produce a physically larger, tiered fabric which could result in
slightly longer fabric management propagation times over smaller, more compact designs.
Neither compound nor complex core/edge fabrics provide for single-hop access to all storage.
[Table 4: Core Edge Subjective Rating. The attributes Accessibility, Availability,
Consolidation, Flexibility, Scalability, Security, and Supportability are each rated on a scale
from 5 (most) to 1 (least).]
Best Practices for Core Edge Topology
• Lay out the host and storage connectivity such that if a switch fails, not all of a
particular host's storage becomes inaccessible.
• The use of two separate management networks is more common with balanced
fabrics, but it can still be employed when only one fabric is used.
• ISL subscription best practice — While planning the SAN, keep track of the number of
host and storage pairs that would be utilizing the ISLs between domains. As a
general best practice, if two switches are connected by ISLs, ensure that there is a
minimum of two ISLs between them, and that there are no more than six initiator and
target pairs per ISL. For example, if 14 initiators access a total of 14 targets between
two domains, a total of three ISLs are necessary.
3. Information Gathering
Designing a SAN begins with gathering the information about the infrastructure which will
help in choosing the right topology. Information that should be captured includes:
a) Details of the Servers
Server details, along with the operating system and storage requirements, should be gathered.

Server Name      Operating System   Storage Requirements
anujsanwin2008   Windows 2008       200 GB
anujaix61        AIX 6.1            100 GB
b) Applications running on Servers
Application details and LUN requirements should be captured.

Application   LUN Size   Number of LUNs
Oracle 9i     1 TB       2
IBM DB2       2 TB       4
c) Desired IOPS and Read/Write Access Distributions
Application                          Bandwidth Utilization   Read/Write Mix        Typical Access   Typical I/O Size
OLTP, email, UFS, ecommerce, CIFS    Light                   80% read, 20% write   Random           8 KB
OLTP (raw)                           Light                   80% read, 20% write   Random           2 KB to 4 KB
Decision support, seismic, imaging   Medium to heavy         90% read, 10% write   Sequential       16 KB to 128 KB
Video server                         Heavy                   98% read, 2% write    Sequential       >64 KB
SAN applications: LAN-free backup,   Medium to heavy         Variable              Sequential       >64 KB
snapshots, third-party copy
Table 5: IOPS requirements
d) Future Requirements
To get an idea of the customer's future requirements, ask questions such as:
• Number of servers likely to be commissioned in the near future
• Storage growth trends
Gathering the above information will help you design the SAN for present and future needs.
Once you have an idea of what a SAN will be used for and the physical location of each piece
of equipment is settled, you need to consider how many host and storage ports will be
deployed initially, and how the environment is expected to grow, before you can decide on
the right topology. These are just guidelines; your actual implementation will vary as more or
fewer ISLs are needed between switches. Scalability information for full mesh with two and
four ISLs, and for full mesh cores with edge switches, has been included below to help you
find the right topology.
In Table 6, combinations that result in negative numbers or result in greater than 2048
available ports are grayed out and should not be considered for use.
Table 6
In Table 6:
Number of switches: the number of switches in the fabric.
Number of ISLs between switches: the number of ISLs that will connect every switch to
every other switch. Since this is a full mesh, all switches connect to each other.
Number of ports on switch (i.e. 16, 24, 32, etc.): the number of ports on each switch in the
fabric. It assumes all switches have the same port count.
# total: the total raw port count, determined by summing the port count of each switch. As
of publication, the raw port count cannot exceed 2048 in any single fabric.
# avail: the number of ports available for Nx_Ports to attach to.
% avail: the number of ports not consumed by E_Ports, expressed as a percentage of the
total ports. Generally, you should not use a topology that has 50% or less of its ports
available.
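Under these definitions, the full mesh port arithmetic can be sketched as follows (a sketch; full_mesh_ports is our helper name, and it assumes every switch dedicates (n - 1) * ISLs of its ports to E_Ports):

```python
def full_mesh_ports(switches: int, isls_between: int, ports_per_switch: int):
    """Return (total, available, percent available) for a full mesh fabric.

    Each switch connects to every other switch, so each switch spends
    (switches - 1) * isls_between of its ports on E_Ports.
    """
    total = switches * ports_per_switch
    e_ports = switches * (switches - 1) * isls_between
    avail = total - e_ports
    return total, avail, 100.0 * avail / total
```

A combination is ruled out when the available count goes negative or the total exceeds 2048, matching the grayed-out cells in Table 6.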
Let's take the case of using 64-port core switches.
Table 7
The core ports and the appropriate value under Number of ports on edge switch need to be
added to determine the total port count. When this is done, a few configurations fall outside of
the support envelope of 2048 Nx_Ports. If the number of ISLs is increased between cores
and edge switches, then some of the fabrics fall back into the supportable range. Table 8
shows the increase of ISLs between cores and edge switches.
Table 8: Number of ISLs increased between cores and edge switches
Note: Remember, if you are considering deploying a mirrored fabric (and you should,
because this is a best practice), the number of ports needed on each fabric will be roughly
half of the total ports needed.
4. Choosing a Switch Type
This section provides considerations for choosing a vendor and selecting a model.
Choosing a vendor:
If an environment has standardized on a switch vendor such as Brocade or Cisco, you
should use a switch from their product line. Although test coverage of interop environments
has improved, interop fabrics remain the least tested configurations, as switch vendors
spend much more time verifying interoperability within their own product lines than testing
against other vendors' products.
The subject of interoperability is raised because even if the fabrics are not connected when
installed, there is a chance that connecting them will be desired at some point in the future.
Training is an equally important reason for using the same vendor. A user who has
standardized on a particular vendor is less likely to need training on the product. Typically,
their expectation of the product’s performance is more realistic and any infrastructure
challenges (power, monitoring) have already been dealt with.
Sometimes it is not possible to keep the same vendor as the decision has already been
made to migrate to another vendor.
If a particular vendor has not been standardized, then determine which features will work
best for you.
Selecting a model: Once a vendor has been chosen, it is time to select a model. There are
many different aspects to consider, but this section is only in regard to port count. Switches
provide between 8 and 64 ports of connectivity. Directors provide between 8 and 528 ports of
connectivity. Keep in mind when ordering a director that they all have minimum shipping
configurations. For example, assuming 4 Gb/s FC will be used:
For Cisco: 9513, 9509, and 9506, there is a minimum of one blade per chassis (16 ports).
For Brocade Silkworm 48000, there is a minimum of two blades per chassis (64 ports).
For Brocade M Series Intrepid 6140, there is a minimum of two Universal Port Modules
(UPMs) (8 ports).
Factor in these minimums when considering which switch and how many of each to
purchase.
5. Sample SAN Fabric
Figure 8
The deployment shown in Figure 8 allows scaling to nearly 1500 devices in a single fabric.
The actual production environment has approximately 190 storage ports and roughly 1050
host ports. The environment required a minimum of 12:1 oversubscription within the network,
which required each host edge switch to have a 36-Gbps port channel, using nine physical
links. Storage ports will not grow quite as rapidly and the core switch has room to grow to add
more host edge switches. With data centers continually growing, SAN administrators must
design networks that meet their current needs and can scale for demanding growth.
Administrators deploying large SAN fabrics can use the design parameters and best practices
discussed in this paper to design optimized and scalable SANs.
B. IMPLEMENTATION PHASE
1. Switch and Zoning Best Practices
Connect the host and storage ports in such a way as to prevent a single point of failure
from affecting redundant paths. For example, if you have a dual-attached host and each HBA
accesses its storage through a different storage port, do not place both storage ports for the
same server on the same line card or ASIC.
Use two separate power sources for the host and storage layout.
To reduce the possibility of congestion, and maximize ease of management, connect hosts
and storage port pairs to the same switch where possible.
Use a port fencing policy.
Use the latest supported firmware version and ensure that the same version of firmware is
used throughout the fabric. In homogeneous switch vendor environments, all switch firmware
versions inside each fabric should be equivalent, except during the firmware upgrade
process.
Periodically (or following any changes) back up switch configurations.
Use persistent Domain IDs.
A zoneset can be managed and activated from any switch in the fabric, but it is
recommended that it be managed from a single entry switch within a fabric to avoid
complications with multiple users accessing different switches in a fabric to make concurrent
zone changes.
While it is possible to see and share tapes and disks over the same HBA in a Fibre
Channel fabric, it is not best practice to do so. The reason is simple: tape devices tend to
issue many SCSI reset commands on rewind, and this can wreak havoc on disk data
streams. Also, tape traffic, since it is usually one long continuous data stream, will tend to
monopolize the bandwidth of the link. If you are trying to run backups while production is
running, performance will suffer.
A better method is to zone tape ports to dedicated HBAs used for tape backup.
The system administrators should coordinate zoning configuration activity to avoid running
into a situation where two administrators are making changes simultaneously.
To avoid lengthy outages due to errors in Connectrix B SAN configurations, it is
recommended to back up the existing configuration before making any changes.
To avoid the high risk involved in adding a new unauthorized switch to a Connectrix B
fabric, it is advisable to limit the creation of switch-to-switch ports. This can be done by
locking the already connected switch-to-switch ports in the SAN using the portCfgEport
command. Such locking down of E_Ports is persistent across reboots. Run portCfgEport
<port number>,0 (disable) on every port that is not connected to another switch in the
fabric to block it from forming ISLs.
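The lockdown loop can be scripted; the sketch below only generates the command strings, using the syntax shown above (the helper name is ours; verify portCfgEport behavior against your Fabric OS release before running the output on a switch):

```python
def eport_lockdown_commands(total_ports, isl_ports):
    """List portCfgEport disable commands for every port that is not an
    authorized ISL, so those ports cannot form E_Ports."""
    isl = set(isl_ports)
    return ["portCfgEport %d,0" % p for p in range(total_ports) if p not in isl]
```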
The administrator configuring a Connectrix B SAN must be aware that frame-level
trunking for Connectrix B switches requires that all ports in a given ISL trunk reside within
an ASIC group on each end of the link.
On 2 Gb/s switches, port groups are built on contiguous 4-port groups, called quads. For
example, on a Connectrix DS-8B2, there are two quads: ports 0-3 and ports 4-7.
On 4 Gb/s switches like the Connectrix DS-4100B, trunking port groups are built on
contiguous 8-port groups called octets. In this product, there are four octets: ports 0-7,
8-15, 16-23, and 24-31. The administrator must use ports within a single group to form an
ISL trunk. It is also possible to configure multiple trunks.
IVR NAT port login (PLOGI) requests received from hosts are delayed for a few seconds to
perform the rewrite on the FC ID address. If the host's PLOGI timeout value is set to less
than five seconds, the PLOGI may be unnecessarily aborted and the host left unable to
access the target. EMC recommends configuring the host bus adapter for a timeout of at
least ten seconds (most HBAs default to a value of 10 or 20 seconds).
If using FC ID or Domain port zone member types, it is recommended that the Domain ID
of each switch in the fabric be locked.
When a new switch is installed in a fabric, it is recommended that it not have a configured
zoning database or an active zoneset. Run the switch's zoning reset command to clear the
zone configuration.
Host and storage layout
The best practice is to place hosts on edge switches and high-use storage ports on core
switches. This is recommended because high-use storage ports are often accessed by
many different hosts on different parts of the fabric. If this is the case in your environment,
this configuration is still the best option. However, if you have high-use storage ports that
are only accessed by a couple of hosts and it is possible to locate them all on the same
switch, this is the preferred configuration instead of forcing the use of ISLs. ISL resources
should be reserved for providing connectivity between ports that cannot be placed on the
same switch.
With this in mind, the following information provides helpful general guidelines:
Whenever practical, locate HBAs and the storage ports they will access on the same
switch. If it is not practical to do this, minimize the number of ISLs the host and storage need
to traverse.
Some of the switch class products being produced today only contain a single ASIC. If this
is the case, then the positioning of the host and storage ports is strictly a matter of personal
preference. However, if the switch being used contains multiple ASICs, try to connect host
and storage pairs to the same ASIC. This prevents using the shared internal data transfer bus
and reduces switch latency. In addition to performance concerns, consider fault tolerance as
well. For example, if a host has two HBAs, each one accessing its own storage port, do not
attach both HBAs, both storage ports, or all of the HBA and storage ports to the same ASIC.
When working with hosts that have more than one connection to more than one storage
port, always connect the HBAs and, if possible, the storage ports that they access to
different FC switches. If a completely separate fabric is available, connect each HBA and
storage port pair to different fabrics.
For homogeneous Brocade M Series fabrics: if Enterprise Fabric mode is available, enable
it. If Enterprise Fabric mode is not available, enable:
Fabric Binding
Switch Binding
Port Binding
For heterogeneous fabrics containing Brocade M Series switches, enable Switch
Binding and Port Binding.
Security
It is important to secure your fabric. General security best practices for an FC SAN include:
Implement some form of zoning
Change default password
Disable unused or infrequently used management interfaces
Use SSL or SSH if available
Limit physical access to FC switches
2. IP SAN Best Practices
Jumbo frames
When supported by the network, we recommend using jumbo frames to increase bandwidth.
Jumbo frames can carry more iSCSI commands and a larger iSCSI payload than normal
frames without fragmenting (or with less fragmenting, depending on the payload size). If
using jumbo frames, all switches and routers in the paths to the storage system must
support, and be configured for, jumbo frames.
The following general recommendations apply to iSCSI usage:
iSCSI is not recommended with applications having the highest bandwidth requirements,
including high performance remote replication.
When possible, use a dedicated LAN for storage traffic, or segregate storage traffic to its
own virtual LAN (VLAN).
Use the most recent version of the iSCSI initiator supported by EMC, and the latest version
of the NIC driver for the host.
Configure iSCSI 1 Gb/s (GigE) and 10 Gb/s (10 GigE) ports to Ethernet full duplex on all
network devices in the initiator-to-target path.
Use CAT6 cabling on the initiator-to-target path whenever possible to ensure consistent
behavior at GigE and 10 GigE Ethernet speeds.
Use jumbo frames and TCP flow control for long distance transfers or with networks
containing low-powered servers.
Use a ratio of 1:1 SP iSCSI ports to NICs on GigE SANs for workloads with high read
bandwidths. 10 GigE SANs can use higher ratios of iSCSI ports to NICs.
Ensure the Ethernet connection to the host is equal to or exceeds the bandwidth rating of
the host NIC.
Ensure the Ethernet connection to the CLARiiON is equal to or exceeds the bandwidth of
the CLARiiON's iSCSI FlexPort.
It is recommended to use a dedicated storage network for iSCSI traffic. If you do not use a
dedicated storage network, iSCSI traffic should be either separated onto a separate physical
LAN, separate LAN segments, or a virtual LAN (VLAN). With VLANs, you can create multiple
virtual LANs, as opposed to multiple physical LANs in your Ethernet infrastructure. This
allows more than one network to share the same physical network while maintaining a logical
separation of the information. FLARE release 29.0 and later support VLAN tagging (IEEE
802.1q) on 1 Gb/s and 10 Gb/s iSCSI interfaces. Ethernet switch-based VLANs are
supported by all FLARE revisions. VLAN tagging with the compatible network switch support
isolates iSCSI traffic from general LAN traffic; this improves SAN performance by reducing
the scope of the broadcast domains.
Network latency
Both bandwidth and throughput rates are subject to network conditions and latency. It is
common for network contention, routing inefficiency, and errors in VLAN configuration to
adversely affect iSCSI performance. It is important to profile and constantly monitor the
network carrying iSCSI traffic to ensure the best iSCSI connectivity and SAN performance. In
general, simple network topologies offer the best performance. Latency can detract
substantially from iSCSI system performance. As the distance from the host to the CLARiiON
increases, a latency of about 1 millisecond per 200 kilometers (125 miles) is introduced. This
latency has a noticeable effect on WANs supporting sequential I/O workloads. For example, a
40 MB/s 64 KB single stream would average 25 MB/s over a 200 km distance. EMC
recommends increasing the number of streams to maintain the highest bandwidth with these
long-distance, sequential I/O workloads.
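Assuming the one-way latency above is added serially to each I/O, the single-stream effect of distance can be estimated as follows (a sketch with a hypothetical helper name; real links also depend on TCP windowing and buffering):

```python
def long_distance_bandwidth(io_size_kb: float, local_mb_s: float, km: float) -> float:
    """Estimate single-stream bandwidth when ~1 ms of latency per
    200 km is added to each I/O's local service time."""
    io_mb = io_size_kb / 1024.0
    service_s = io_mb / local_mb_s       # per-I/O time with no distance
    latency_s = (km / 200.0) / 1000.0    # ~1 ms per 200 km
    return io_mb / (service_s + latency_s)
```

For a 64 KB stream at 40 MB/s over 200 km this yields roughly 24 to 25 MB/s, in line with the example above; adding parallel streams recovers the lost bandwidth.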
3. RAID Group Best Practices
RAID performance characteristics
Different RAID levels have different performance and availability characteristics, depending
on the type of RAID and the number of drives in the RAID group. Certain RAID types and RAID group sizes
are more suitable for particular workloads than others, so choosing the appropriate RAID
implementation is a crucial task in the implementation phase.
When to use RAID 0
We do not recommend using RAID 0 for data with any business value.
RAID 0 groups can be used for non-critical data needing high speed (particularly write
speed) and low cost capacity in situations where the time to rebuild will not affect business
processes. Information on RAID 0 groups should be already backed up or replicated in
protected storage. RAID 0 offers no level of redundancy. Proactive hot sparing is not enabled
for RAID 0 groups. A single drive failure in a RAID 0 group will result in complete data loss of
the group. An unrecoverable media failure can result in a partial data loss. A possible use of
RAID 0 groups is scratch drives or temporary storage.
When to use RAID 1
We do not recommend using RAID 1. RAID 1 groups are not expandable. Use RAID 1/0
(1+1) groups as an alternative for single mirrored RAID groups.
When to use RAID 3
For workloads characterized by large-block sequential reads, RAID 3 delivers several MB/s
more bandwidth than the alternatives. RAID 3 delivers the highest read bandwidth under the
following conditions:
Drives create the bottleneck, such as when there are a small number of drives for each
back-end loop.
Sequential streams are larger than 2 MB.
The file system is not fragmented or is using raw storage.
The block size is 64 KB or greater.
RAID 3 can be used effectively in backup-to-disk applications. In this case, configure RAID
groups as either (4+1) or (8+1). Do not use more than five backup streams per LUN.
In general, RAID 5 usage is recommended over RAID 3. RAID 3 should only be used for
highly sequential I/O workloads, because RAID 3 can bottleneck at the parity drive on random
writes. Also, when more than one RAID 3 group is actively running sequential reads on a
back-end bus, the bus can rapidly become the bottleneck and performance is no different
from RAID 5.
When to use RAID 5
RAID 5 is favored for messaging, data mining, medium-performance media serving, and
RDBMS implementations in which the DBA is effectively using read-ahead and write-behind.
If the host OS and HBA are capable of greater than 64 KB transfers, RAID 5 is a compelling
choice. The following applications are ideal for RAID 5:
Random workloads with modest IOPS-per-gigabyte requirements
High performance random I/O where writes are less than 30 percent of the workload
A DSS database in which access is sequential (performing statistical analysis on sales
records)
Any RDBMS tablespace where record size is larger than 64 KB and access is random
(personnel records with binary content, such as photographs)
RDBMS log activity
Messaging applications
Video/media
When to use RAID 6
RAID 6 offers increased protection against media failures and simultaneous double drive
failures in a parity RAID group. It has similar performance to RAID 5, but requires additional
storage for the additional parity calculated. This additional storage is equivalent to adding a
drive that is not available for data storage to the RAID group.
RAID 6 can be used as an alternative to RAID 5 when the need for increased reliability
outweighs the overhead of the additional parity drive.
RAID 6 groups can be four to 16 drives. A small group is up to six drives (4+2). A medium
group is up to 12 drives (10+2), with large groups being the remainder. Small groups stream
well. However, small random writes destage slowly and can adversely affect the efficiency of
the system write cache. Medium-sized groups perform well for both sequential and random
workloads. The optimal RAID 6 group sizes are 10 drives (8+2) and 12 drives (10+2).
When to use RAID 1/0
RAID 1/0 provides the best performance on workloads with small, random, write-intensive
I/O. A write-intensive workload’s operations consist of greater than 30 percent random writes.
Some examples of random, small I/O workloads are:
High transaction rate OLTP
Large messaging installations
Real-time data/brokerage records
RDBMS data tables containing small records, such as frequently updated account
balances
RAID 1/0 also offers performance advantages during certain degraded modes, including
when write cache is disabled or when a drive has failed in a RAID group. RAID 1/0 groups
of (3+3) and (4+4) have a good balance of capacity and performance.
4. HBA Tuning
The latest supported drivers for the HBA should be installed and the firmware should be updated.
Transaction-based and throughput-based processing are types of workload. The workload is
the total amount of work that is performed at the storage server, and is measured through the
following formula:
Workload = [transactions (number of host IOPS)] * [throughput (amount of data sent in one I/O)]
Since a storage server can sustain a given maximum workload, the above formula shows that
when the number of host transactions increases, the throughput decreases. Conversely, if the
host is sending large volumes of data with each I/O, the number of transactions decreases.
A workload characterized by a high number of transactions (IOPS) is called a
transaction-based workload.
A workload characterized by large I/Os is called throughput-based workload.
These two workload types are conflicting in nature, and consequently require different
configuration settings across all parts of the storage solution.
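The workload formula and its transaction/throughput trade-off can be written out directly (a sketch; the helper names are ours, not a CLARiiON API):

```python
def workload_mb_s(iops: float, io_size_kb: float) -> float:
    """Workload = transactions (IOPS) * amount of data sent per I/O."""
    return iops * io_size_kb / 1024.0

def max_iops(workload_limit_mb_s: float, io_size_kb: float) -> float:
    """For a fixed maximum workload, larger I/Os mean fewer transactions."""
    return workload_limit_mb_s * 1024.0 / io_size_kb
```

For example, a storage server sustaining 62.5 MB/s supports 1000 IOPS at 64 KB per I/O but only 500 IOPS at 128 KB, illustrating why the two workload types need different tuning.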
Table 9: HBA Tuning parameters.
In Microsoft Windows environments, the following three HBA parameters affect HBA
performance: Execution Throttle, Frame Size, and Fibre Channel Data Rate. Of these, the
Frame Size and Fibre Channel Data Rate default settings are pre-set to 2112 bytes (2048
bytes + headers) and auto-negotiate to provide the best possible performance in any
environment. Therefore, Execution Throttle is the only HBA parameter that you can tune to
improve HBA performance in a Windows environment.
Tuning Execution Throttle
In a SAN configuration with three or more servers accessing the same storage array, QLogic
recommends changing the default Execution Throttle value for each HBA. By default, all
QLogic EMC 4Gb FC HBAs have their Execution Throttle value set to maximum; if you
decide to change this from its default value, use the guidelines below to derive a new value.
To calculate the new execution throttle, first determine if all servers carry the same I/O load. If
all servers carry the same I/O load, calculate the value by dividing 250 by the number of
servers in the SAN. Set each HBA in the SAN to the calculated value. For example, in a four-
server configuration, divide 250 by 4 to arrive at 62.5. The Execution Throttle value for each
HBA is 62. Assign the value of 62 to all HBAs.
If some of the servers carry heavier I/O loads, first calculate the Execution Throttle value by
dividing 250 by the number of servers, and then adjust the values so that servers with higher
I/O loads have higher Execution Throttle values and servers with lower I/O loads have lower
Execution Throttle values. For example, in a four-server configuration, you can assign the
value of 72 to the HBAs in the server with the highest I/O load, the value of 52 to the HBAs in
the second server, and the value of 62 to the HBAs in the remaining two servers.
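The equal-load case of this guideline reduces to one division (a sketch; the ~250 figure is the per-storage-port budget cited above, and the helper name is ours):

```python
def execution_throttle(num_servers: int, port_budget: int = 250) -> int:
    """Per-HBA Execution Throttle when all servers carry the same I/O
    load: the storage port's command budget split evenly, rounded down."""
    return port_budget // num_servers
```

For uneven loads, shift the per-server values up or down (72/52/62/62 in the example above) while keeping the total near 250.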
How to Change the Execution Throttle
The Execution Throttle value for each port of a HBA can be easily changed with the QLogic
SANsurfer FC HBA Manager application (or the SANsurfer command line interface (CLI)) on
Windows Environments (see the figure below).
Figure 9
Tuning HBA Queue Depth for your ESX Environment
Use the VMware ESX tool esxtop to view the current HBA queue utilization while I/O is active
on the HBA ports. Navigate through esxtop to find the disk statistics. A man esxtop
command issued from the ESX console provides detailed information on its usage.
Figure 10 shows the output of the storage statistics section of esxtop while I/O is active.
LQLEN shows the current HBA queue depth set in the QLogic HBA driver. A value of LOAD >
1 indicates that the host application is placing more data in the HBA queue than its current
size can handle. A system with this issue can benefit from an increase in the HBA queue
depth.
Figure 10: esxtop screenshot
Figure 11 shows the effect of increasing the HBA queue depth. This result of esxtop has been
captured after increasing the HBA queue depth from its default value of 32 to 64. Note that
the LOAD is < 1 and there is a significant increase in the READS/s operations, which means
that performance has increased.
Figure 11: esxtop screenshot
How to Change the HBA Queue Depth
To change the queue depth of a QLogic HBA in VMware ESX, follow these steps:
• Log on to the VMware ESX Console as root.
• Create a copy of /etc/vmware/esx.conf so you have a backup copy.
• Edit the file /etc/vmware/esx.conf in your favorite editor.
• Locate the following entry
/vmkmodule[0002]/module = "qla2300_707.o"
/vmkmodule[0002]/options = ""
• Modify the entry as shown, where xx is the queue depth value:
/vmkmodule[0002]/module = "qla2300_707.o"
/vmkmodule[0002]/options = "ql2xmaxqdepth=xx"
Figure 12: Changing Queue Depth
• Save the file.
• Reboot the VMware ESX Server.
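The edit in the steps above can also be applied programmatically. This sketch rewrites an in-memory copy of the esx.conf text; it assumes the two-line module/options layout shown above and an initially empty options string, and you should still keep a backup of the file:

```python
def set_qdepth(conf_text: str, qdepth: int, module: str = "qla2300_707.o") -> str:
    """Set ql2xmaxqdepth on the options line that follows the given
    vmkernel module entry, leaving all other lines untouched."""
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        if line.endswith('/options = ""') and i > 0 and module in lines[i - 1]:
            lines[i] = line.replace('""', '"ql2xmaxqdepth=%d"' % qdepth)
    return "\n".join(lines)
```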
5. EMC CLARiiON Hot Sparing Best Practices
The following summarizes hot spare best practices:
The storage system should have at least one hot spare of every speed, maximum
needed capacity, and hard drive type.
Position hot spares on the same buses containing the drives they may be required to
replace.
Maintain a minimum 1:30 ratio (round to one per two DAEs) of hot spares to data hard
drives.
EFD storage devices can only be hot spares for, and be hot spared by, other EFD
devices.
DAE     Backend bus   Vault Drives   Data Drives   Hot Spares   Total DAE drives
DAE 0   0             5X FC          10X FC        0            15
DAE 1   1             0              14X SATA      1X SATA      15
DAE 2   0             0              15X FC        0            15
DAE 3   0             0              10X FC        1X FC        11
Total FC data drives: 49
Total FC hot spares: 1
Total SATA data drives: 14
Total SATA hot spares: 1
Total drives: 65
Table 10: Hot spare provisioning example
6. Optimizing EMC CLARiiON Cache
Generally, for storage systems with 2 GB or less of available cache memory, use about 20
percent of the memory for read cache and the remainder for write cache. For larger capacity
cache configurations, use as much memory for write cache as possible while reserving about
1 GB for read cache. Specific recommendations are as follows:
                   CX4-120   CX4-240   CX4-480   CX4-960
WRITE CACHE (MB)   498       1011      3600      9760
READ CACHE (MB)    100       250       898       1000
TOTAL CACHE (MB)   598       1261      4498      10760
Table 11: Cache Recommendations
7. EMC CLARiiON Vault Drive Best Practices
It is recommended that no LUNs with high IOPS requirements be bound to the vault drives,
as this can lead to performance degradation. The IOPS supported by the vault drives are
given in the table below and can be used to decide upon LUN provisioning on the vault
drives.
VAULT HARD DRIVE TYPE   MAX IOPS   MAX BANDWIDTH (MB/s)
FC                      100        10
SAS                     100        10
SATA                    50         5
EFD                     1500       69
Table 12: Vault Drive Performance Parameters
It is not recommended that the vault drives (0.0.0 to 0.0.4) be left unbound. If drives are
unbound, they are not regularly verified by FLARE. This means there would be no early
warning of drive faults, which could cause booting problems for the SP. Therefore, if no user
data needs to be bound on the first five drives, then a small (e.g. 1 GB) test LUN should be
bound across any unbound vault drives. This LUN should be named as a verification LUN
and should not be placed in a storage group.
8. EMC CLARiiON Virtual Provisioning Best Practices
Virtual Provisioning provides for thin provisioning of LUNs. Thin LUNs present more storage
to an application than is physically available. The presentation of storage not physically
available avoids over-provisioning the storage system and under-utilizing its capacity. When a
thin LUN eventually requires additional physical storage, capacity is non-disruptively and
automatically added from a storage pool. In addition, the storage pool’s capacity can be non-
disruptively and incrementally added to with no effect on the pool’s thin LUNs.
Figure 13: Thin Provisioning
Recommendations for creating pools are as follows:
We recommend Fibre Channel hard drives for thin storage pools due to their overall
higher performance and availability.
Create pools using storage devices that are the same type, speed, and size. In
particular, keep Fibre Channel and SATA hard drives in separate pools.
Usually, it is better to use the RAID 5 level for pools. It provides the highest user data
capacity per number of pool storage devices.
Use RAID 6 if the pool is composed of SATA drives and will eventually exceed a total
of 80 drives. Pools made up of large capacity (>500 GB) drives should use RAID 6.
Initially provision the pool with the largest number of hard drives that is practical. For
RAID 5 pools, the initial drive allocation should be at least five drives and a quantity
evenly divisible by five. RAID 6 pool initial allocations should be evenly divisible by
eight.
• If you specify 15 drives for a RAID 5 pool, Virtual Provisioning creates three 5-
drive (4+1) RAID groups. This is optimal provisioning.
In a thin LUN pool, the subscribed capacity is the amount of capacity that has been
assigned to LUNs. When designing your system, make sure that the expected
subscribed capacity does not exceed the capacity that is provided by the maximum
number of drives allowed in a storage system’s pool.
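The initial-allocation rules above are easy to check mechanically (a sketch with a hypothetical helper name):

```python
def valid_initial_pool_drives(drives: int, raid_level: int) -> bool:
    """Check an initial pool allocation against the guideline above:
    RAID 5 pools use a multiple of five drives (at least five),
    RAID 6 pools a multiple of eight."""
    if raid_level == 5:
        return drives >= 5 and drives % 5 == 0
    if raid_level == 6:
        return drives >= 8 and drives % 8 == 0
    raise ValueError("pool RAID level must be 5 or 6")
```

valid_initial_pool_drives(15, 5) is True, matching the three (4+1) groups example.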
Figure 14 shows the options to be selected to create a storage pool from Navisphere
Manager.
Figure 14: Creating Storage Pools
Expanding storage pools
For best performance, expand storage pools infrequently, maintain the original character of
the pool's storage devices, and make the largest practical expansions.
Following are recommendations for expanding pools:
Adjust the % Full Threshold parameter (default is 70%) to the pool size and the rate
applications are consuming capacity. A pool with only a few small capacity drives will
quickly consume its available capacity. For this type of pool you should have lower
alerting thresholds. For larger pools slowly consuming capacity you should use higher
thresholds. For example, for the largest pools, a good initial % Full Threshold
parameter value is 85%.
Expand the pool using the same type and same speed hard drives used in the original
pool.
Expand the pool in large increments. For RAID 5 pools, use increments of drives
evenly divisible by five, and not less than five. RAID 6 pools should be expanded in
increments evenly divisible by eight.
Creating thin LUNs
The largest capacity thin LUN that can be created is 14 TB.
The number of thin LUNs created on the storage system subtracts from the storage
system’s total LUN hosting budget.
• Avoid trespassing thin LUNs. Changing a thin LUN's SP ownership may adversely affect performance: after a trespass, the thin LUN's private information remains under the control of the original owning SP, so the trespassed LUN's I/Os continue to be handled by the original SP. This means both SPs are involved in handling the I/Os, which increases the time needed to complete each I/O.
• When planning to use a thin LUN in a bandwidth-intensive workload, pre-allocate the thin LUN's required storage. Pre-allocation results in sequential addressing within the pool's thin LUN, ensuring high bandwidth performance. Pre-allocation can be performed in several ways, including migrating from a traditional LUN, performing a full format of the file system, performing a file write from within the host file system, or creating a single Oracle table from within the host application. Perform only one pre-allocation per storage pool at any one time; concurrently pre-allocating more than one thin LUN per pool can reduce overall SP performance.
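Of the methods listed, a file write from within the host file system is the easiest to script. A hedged sketch, assuming the thin LUN's file system is already mounted; the mount point and size are hypothetical:

```shell
# Pre-allocate pool capacity for a thin LUN by writing real (non-sparse)
# blocks into a file on the LUN's file system. The mount point and size
# are hypothetical; adjust to the capacity the workload requires.
preallocate() {
  mount_point=$1; size_mb=$2
  dd if=/dev/zero of="$mount_point/prealloc.bin" \
     bs=1M count="$size_mb" conv=fsync 2>/dev/null
}

# preallocate /mnt/thinlun 10240   # hypothetical: pre-allocate 10 GB
```

Per the guidance above, run only one such pre-allocation per storage pool at a time.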
• There is a fixed capacity overhead associated with each thin LUN created in the pool. Take into account the number of LUNs anticipated to be created, particularly with small allocated-capacity pools.
A thin LUN is composed of metadata and user data, both of which come from the storage
pool. A thin LUN's metadata is a capacity overhead that subtracts from the pool's user data
capacity. A thin LUN of any size consumes about 3 GB of pool capacity: slightly more than
1 GB for metadata, an initial 1 GB for user data, and an additional 1 GB of pool capacity
prefetched before the first GB is consumed, in anticipation of further usage. This metadata
allocation remains about the same from the smallest to the largest (>2 TB, host-dependent)
LUNs. Additional metadata is allocated from the first 1 GB of user data as the LUN's user
capacity increases.
To estimate the capacity consumed, follow this rule of thumb:
Consumed capacity = (User Consumed Capacity * 0.02) + 3GB.
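This rule of thumb translates directly into a small calculation (a sketch; the function name is ours, and input and output are in GB):

```shell
# Estimate the capacity consumed by a thin LUN per the rule of thumb:
# about 2% overhead on the user-consumed capacity plus a fixed ~3 GB.
consumed_gb() {
  awk -v u="$1" 'BEGIN { printf "%.1f\n", u * 0.02 + 3 }'
}

consumed_gb 500   # prints 13.0
consumed_gb 0     # prints 3.0 (the fixed per-LUN overhead)
```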
Plan ahead for metadata capacity usage when provisioning the pool. With small capacity
pools, the percentage of capacity used by metadata may be large. Create pools with enough
initial capacity to account for metadata usage and any initial user data for the planned
number of LUNs.
9. EMC CLARiiON Drive Spin Down Technology
Disk-drive Spin Down conserves power by spinning down the drives in a RAID group when the
RAID group has not been accessed for 30 minutes, allowing the drives to enter an idle state. In
the idle state, the drives do not rotate and thus use less power. (A RAID group that is idle for
30 minutes or longer uses 60 percent less electricity.) Figure 15 shows how to enable drive
Spin Down from Navisphere Manager when creating a RAID group.
Figure 15: Drive Spin Down
When an I/O request is made to a LUN whose drives are in Spin Down (idle) mode, the drives
must spin up before the I/O request can be executed. A RAID group can remain in the idle state
for any length of time. The storage system periodically verifies that idle RAID groups are ready
for full-powered operation; RAID groups that fail the verification are rebuilt. Spin Down can be
configured at either the storage-system level or the individual RAID group level. We recommend
the storage-system level, which automatically puts unbound drives and hot spares into idle.
Spin Down is recommended for storage systems that support development, test, and training,
because these hosts tend to be idle at night. It is also recommended for storage systems that
back up hosts. A host application will see an increased response time for the first I/O request
to a LUN whose RAID group(s) are in standby; it takes less than two minutes for the drives to
spin up. The storage system administrator must weigh this delay against the application's
ability to wait when deciding whether to enable disk-drive Spin Down for a RAID group.
10. Aligning File System
File System Fragmentation
Fragmented file systems decrease the opportunity for sequential I/O, which reduces overall
throughput. File systems should therefore be defragmented at a regular interval (for example,
monthly) using host utilities. Note: if the file system is NTFS, it cannot be formatted at
anything but the default extent size.
File System Alignment affects performance in two ways:
• Misalignment causes disk crossings, i.e., an I/O broken across two disk drives.
• Misalignment makes it hard to stripe-align large uncached writes.
Figure 16: Disk Crossing
Figure 16 depicts a single 64 KB I/O split across two disk drives, so a write operation must
access two disk drives to complete. Similarly, a read operation must access two disk drives
to complete. In an aligned system, the 64 KB I/O would have been serviced by a single disk
drive.
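The crossing condition can be checked arithmetically: an I/O crosses a stripe element boundary when its first and last bytes fall in different 64 KB elements. A small sketch (byte units; the 32256-byte offset is the historical 63-sector default partition start, used here for illustration):

```shell
# Does an I/O starting at byte OFFSET with length SIZE cross a 64 KB
# stripe element boundary?
crosses_element() {
  offset=$1; size=$2; element=65536
  start=$(( offset / element ))
  end=$(( (offset + size - 1) / element ))
  if [ "$start" -ne "$end" ]; then
    echo "crossing"
  else
    echo "aligned"
  fi
}

crosses_element 0 65536       # prints aligned: fits in one element
crosses_element 32256 65536   # prints crossing: 63-sector default offset
```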
It is recommended that the operating system's disk utility be used to adjust partitions.
For Oracle and OLTP applications, the volume manager stripe element should be set to
the CLARiiON stripe size, typically 128 KB or 512 KB.
Linux file-system alignment procedure
The following procedure uses fdisk to create a single aligned partition on a second Linux
file system LUN (such as sda or sdc), utilizing all of the LUN's available capacity. In
this example, the partition will be:
/dev/nativedevicename.
The procedure is:
fdisk /dev/nativedevicename # sda and sdc
n # New partition
p # Primary
1 # Partition 1
<Enter> # 1st cylinder=1
<Enter> # Default for last cylinder
x # Expert mode
b # Starting block
1 # Partition 1
128 # Stripe element = 128
w # Write
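To confirm the result, check the partition's start sector, for example with fdisk -lu /dev/sda, and verify it is a multiple of 128 sectors (128 x 512 bytes = 64 KB). A small sketch of that check (the function name is ours; feed it the Start value fdisk reports):

```shell
# An aligned partition starts on a 128-sector (64 KB) boundary,
# matching the stripe element set in the procedure above.
aligned_start() {
  sector=$1
  if [ "$sector" -ne 0 ] && [ $(( sector % 128 )) -eq 0 ]; then
    echo "aligned"
  else
    echo "misaligned"
  fi
}

aligned_start 128   # prints aligned: the start set by the procedure
aligned_start 63    # prints misaligned: the historical default start
```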
C. Post Implementation Phase
Following the implementation phase, we need to monitor the SAN continuously to rectify any
problem before it becomes critical. The status lights on the CLARiiON storage system can
be used to monitor the array and check the health of the CLARiiON system.
1. Health Checkup
DAE Checkup
Figure 17: DAE Checkup
LCC Checkup
Figure 18: LCC Checkup
Disk Module Checkup
Figure 19: Disk Module Checkup
Power Supply Checkup
Figure 20: DAE Checkup
SPE Checkup
Figure 21: SPE Checkup
Figure 22: SPE Checkup
Monitor the following data center parameters daily so that they remain well within the
operating limits of the EMC CLARiiON.
Table 13: Data center Environment Requirements
2. Performance Monitoring
We can use Navisphere Analyzer to monitor CLARiiON performance. First, enable logging on
each Storage Processor by right-clicking the SP and choosing Enable Statistical Logging.
It takes some time for nar files to accumulate enough data to reflect actual performance;
ideally, retrieve the nar files after logging has been enabled for 2-3 days. Then retrieve
the nar files, as shown in Figure 23, by going to Tools > Analyzer > Archive Retrieve.
Figure 23: Navisphere Analyzer
Open the nar file for analysis by going to Tools > Analyzer > Open and selecting the
recently retrieved nar file. We can then see the different parameters for each SP, for
individual LUNs as well as a cumulative report for all LUNs.
Parameters that can be monitored include:
• Utilization
• Queue Length
• Total Throughput (IO/s)
• Read Bandwidth
• Read Size
• Read Throughput
• Response Time (ms)
• Write Size
• Write Throughput
Figure 24 shows the options that can be chosen and the respective graph so that we can see
whether or not the CLARiiON is optimally utilized.
Figure 24: Navisphere Analyzer
We can see that the maximum IOPS being serviced by SPA is 6267. We can zoom the graph to
see individual LUN IOPS, and can similarly choose different parameters from the left-hand
window to see their respective values. This helps us determine the actual status of our
EMC CLARiiON.
TROUBLESHOOTING
Troubleshooting a SAN is complex, but you can save yourself a lot of work if you do two
things. First, verify that you have a SAN issue and not a generic storage issue. Second, begin
the troubleshooting process at the center of the SAN so that you can quickly locate the
general area of the problem.
When you're troubleshooting a SAN, you'll find that most problems aren't actually related to
the SAN. Suppose that you're suddenly unable to read data from the SCSI disk on your
standalone PC. Several things could be causing the problem. The hard disk might have gone
out. Maybe you've got a bad cable or a bad disk controller. Maybe the data on the drive has
been accidentally erased, or the partition has been deleted or corrupted. Just because you
can't access your data does not mean that a hardware failure has occurred.
Let's look at this same situation in the context of a SAN. A SAN is basically a way of linking a
server to a logical device on a disk array or some other storage mechanism. The SAN works
by allowing the server to communicate with the storage device using SCSI commands.
Suppose the server is suddenly unable to read data off the SAN. You may have a SAN
problem, but the problem might relate not to the SAN but to the data itself. It could be that
connectivity between the server and the storage unit is functional, but the data has been
erased, corrupted, or disassociated from the server. In that case, you'd troubleshoot the
problem the same way you would if the storage mechanism were directly attached to your
server.
But what if the SAN were the problem? Your best strategy is to start the troubleshooting
process in the center of the SAN and work out toward the edges.
Step 1: Start troubleshooting at the fabric level. The switches sit at the center of your
SAN and should have connectivity to both the server and the storage device.
Verify that the switch can communicate with the server and the storage device. If you can
verify communications, you can rule out the fabric as being at fault. While examining the
fabric, look for things like unstable links, missing devices, incorrect zoning
configurations, and incorrect switch configurations.
Step 2: Use diagnostic software to test the switch connections. This will verify whether the
storage device is connected to the switch. If it is not, you know the problem has to do with
the storage device: it may be a physical connection issue between the switch and the storage
device, or the storage software configuration may be incorrect.
If the switch can communicate with the storage device, but the server can't, then you know
that the problem lies somewhere between the switch and the server. This is why you start
troubleshooting at the center of the SAN. A few simple tests and you eliminate half of the
SAN as a possible cause of the problem (either the server side or the storage side of the
network).
Step 3: If the problem lies between the server and the switch, check these possible causes.
If you determine that the problem is between the server and the switch, you've got your work
cut out for you.
Possible causes include a bad host bus adapter or a missing or incorrectly configured
driver. The problem may also be related to the way your server is configured to access the
virtual storage device. Start with your hardware manufacturer's diagnostic utility. You can
also run a protocol analyzer to verify that the network interface card (NIC) is functional
and that its driver is working. If the NIC appears functional, the problem is almost
certainly configuration related.
We often encounter the "host agent not reachable" error in Navisphere Manager, with the
host displayed as unmanaged. To troubleshoot, check the following:
• Ensure that the Agent IP address listed in Navisphere is routable from the CLARiiON array; try to ping the host from the array. If the ping succeeds but the agent is still unreachable, the Agent service is probably not running; restart the Agent service on the host.
• Verify that port 6389 is open on any hardware firewall; for a software firewall such as Windows Firewall, add port 6389 as an exception.
• Restart the Management service if the host IP address has changed.
• Set the host NIC to auto-negotiate.
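The routability and port checks above can be scripted from any host on the management network. A hedged sketch: it relies on bash's /dev/tcp virtual path, and the address shown is hypothetical.

```shell
# Probe a TCP port (e.g. 6389, the Navisphere agent port). "closed or
# filtered" means nothing is listening or a firewall is in the way.
port_open() {
  host=$1; port=$2
  if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
    echo "open"
  else
    echo "closed or filtered"
  fi
}

# port_open 192.168.1.50 6389   # hypothetical management address
```

If the host responds to ping but port 6389 reports closed, restart the Agent service or add the firewall exception as described above.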