iSCSI BEST PRACTICES: CHOOSING BETWEEN SOFTWARE AND HARDWARE INITIATORS


Storage

Reprinted from Dell Power Solutions, December 2009. Copyright © 2009 Dell Inc. All rights reserved.

The Internet SCSI (iSCSI) protocol offers tremendous advantages in enterprise data centers. By allowing SCSI commands to be sent over standard Ethernet networks, iSCSI enables the deployment of storage area networks (SANs) using existing infrastructure and familiar components, providing performance and efficiency comparable to traditional Fibre Channel SANs, but without requiring the levels of hardware investment, training, and specialized knowledge typically required of those environments (see the “Comparing iSCSI and Fibre Channel” sidebar in this article). Systems such as the Dell™ EqualLogic™ PS Series, Dell PowerVault™ MD3000i, Dell PowerVault NX1950, and Dell/EMC CX4 Series arrays offer flexible, simplified, cost-effective ways to deploy iSCSI storage while helping to meet a wide range of performance and capacity requirements.

Within an iSCSI SAN, network communication takes place between initiators (which typically reside on a host server) and targets (typically the storage resources). When implementing an iSCSI SAN in their own environments, however, administrators immediately face a key choice: whether to deploy software initiators or hardware initiators. Software initiators such as the Microsoft® iSCSI Software Initiator depend on host server resources to process iSCSI traffic through standard network interface cards (NICs), while hardware initiators such as iSCSI host bus adapters (HBAs) offload this protocol processing to the hardware itself. Administrators also have the option of deploying NICs with TCP/IP Offload Engine (TOE) technology, enabling them to use software initiators while still offloading some of the protocol processing to the NIC hardware.

Of course, the most appropriate choice depends on the specifics of the environment. Although there are certain workload and network types where dedicated hardware makes sense, software initiators alone are designed to meet the needs of most workloads in typical Gigabit Ethernet (GbE) environments, offering a cost-effective option that can support levels of throughput comparable to hardware initiators while still maintaining acceptable levels of processor utilization.

When deploying Internet SCSI (iSCSI) networks, administrators face a key choice: should they use software initiators or hardware initiators? As testing across a range of workloads demonstrates, software initiators can provide levels of throughput and efficiency comparable to hardware initiators for most types of applications, offering a cost-effective way to implement iSCSI without the need to invest in additional hardware components.

By Ujjwal Rajbhandari and Dan McConnell

UNDERSTANDING iSCSI INITIATORS IN THE DATA CENTER

iSCSI implementations typically use one of three initiator configurations (see Figure 1):

■ Software initiator and standard NIC: A software initiator in the host server OS establishes the iSCSI connection through a standard NIC.
■ Software initiator and NIC with TOE: The software initiator still handles the iSCSI connection, but TCP/IP processing is offloaded from the host processor to a TOE-equipped NIC.
■ iSCSI HBA: The HBA hardware handles both iSCSI and TCP/IP processing.
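The three configurations listed above differ only in which protocol layers stay on the host processor. A minimal sketch of that mapping follows; the names `OFFLOADED`, `layer_placement`, and `host_layers` are hypothetical, and placing the Application and SCSI layers on the host in every configuration is an assumption read from the layer stack in Figure 1 rather than something the text states outright.

```python
from enum import Enum


class Location(Enum):
    HOST_CPU = "server processor"
    ADAPTER = "NIC or HBA"


# Layers from the Figure 1 stack that the text discusses.
# Assumption: Application and SCSI always run on the host.
LAYERS = ("Application", "SCSI", "iSCSI", "TCP/IP")

# Layers each configuration hands to the adapter, per the list above.
OFFLOADED = {
    "software initiator and standard NIC": set(),
    "software initiator and NIC with TOE": {"TCP/IP"},
    "iSCSI HBA": {"iSCSI", "TCP/IP"},
}


def layer_placement(config: str) -> dict:
    """Map every layer to where it is processed for one configuration."""
    offloaded = OFFLOADED[config]
    return {
        layer: Location.ADAPTER if layer in offloaded else Location.HOST_CPU
        for layer in LAYERS
    }


def host_layers(config: str) -> list:
    """Return the layers that still consume host processor cycles."""
    placement = layer_placement(config)
    return [layer for layer in LAYERS if placement[layer] is Location.HOST_CPU]


if __name__ == "__main__":
    for name in OFFLOADED:
        print(f"{name}: host handles {', '.join(host_layers(name))}")
```

Reading the output side by side shows the trade-off the article goes on to test: each step from standard NIC to TOE to HBA removes one more layer from the host.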

Each configuration offers its own advantages and disadvantages. For example, the first approach is generally the most cost-effective of the three: software initiators are often provided by OS vendors at no additional cost, and administrators can use them with standard NICs without the need to purchase additional specialized hardware. This configuration can support similar levels of throughput to the other two approaches, but can also increase host processor utilization beyond acceptable levels in some environments. The second approach, combining a software initiator with a TOE-equipped NIC, requires going beyond standard NIC hardware, but has the advantage of mitigating some of the processor impact typical of a pure software initiator configuration.

In the third approach, a specialized iSCSI HBA removes the protocol processing from the host server to help maximize efficiency. This approach is often unnecessary in typical GbE environments because the processing overhead is not noticeable. However, it can be an important option for specific types of workloads or environments, particularly when deploying iSCSI with 10 Gigabit Ethernet (10GbE) networks (see the “Evaluating iSCSI initiators in 10 Gigabit Ethernet environments” section in this article).

EVALUATING iSCSI INITIATORS IN GIGABIT ETHERNET ENVIRONMENTS

To demonstrate how each type of initiator performs in typical GbE environments, in February 2009 Principled Technologies performed Dell-commissioned benchmark tests on a GbE-based iSCSI SAN under a variety of workloads that were created to simulate typical I/O patterns of frequently used applications or storage environments. The test environment used the Iometer benchmarking tool to measure throughput and processor utilization on a Dell PowerEdge™ 2950 server using the Microsoft iSCSI Software Initiator and a Broadcom NetXtreme II BCM5708C network adapter with TOE and iSCSI HBA functionality. The server was configured with two quad-core Intel® Xeon® E5405 processors at 2.0 GHz; 16 GB of RAM; an 80 GB, 7,200 rpm Serial ATA (SATA) hard drive; and the Microsoft Windows Server® 2008 Enterprise x64 Edition OS. It connected through a Dell PowerConnect™ 6248 switch to four Dell EqualLogic PS5000XV iSCSI SAN arrays configured with a total of sixty-four 146 GB, 15,000 rpm Serial Attached SCSI (SAS) drives.

The tests evaluated three configurations. The first configuration used the software initiator with the Broadcom adapter operating as a standard Layer 2 NIC (TOE disabled), the second used the software initiator with TOE enabled on the Broadcom adapter, and the third used the Broadcom adapter in iSCSI HBA mode with iSCSI Offload Engine (iSOE) technology enabled.

The test team utilized 10 Iometer access specifications that were created to simulate the typical I/O patterns of frequently used applications or storage environments for the tests: 2 large-block specifications that evaluated video-on-demand and decision support system (DSS) performance on 512 KB and 1 MB I/Os, and 8 small-block specifications that evaluated Web file server, media streaming, Microsoft SQL Server® log, OS paging, Web server log, database online transaction processing (OLTP), Microsoft Exchange e-mail, and OS drive performance on 4 KB, 8 KB, and 64 KB I/Os. The large-block tests measured throughput in megabytes per second along with processor utilization, while the small-block tests measured throughput in I/Os per second (IOPS) along with processor utilization. For the purposes of these tests, 8 percent processor utilization was set as the acceptable threshold, based on an estimation of the level that could significantly affect processor-intensive applications.
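The workload list and its two throughput units can be captured in a small data structure. The sketch below pairs each workload with the block size shown on the axis labels of Figures 2 and 3; those figures plot Web file server at three block sizes, so the list has more entries than there are workload names. The class name `AccessSpec`, the constant `LARGE_BLOCK_MIN_KB`, and the rule of treating 512 KB and above as large-block are assumptions made for illustration, not part of the test methodology.

```python
from dataclasses import dataclass

# Assumption: 512 KB is the smallest I/O size the article calls large-block.
LARGE_BLOCK_MIN_KB = 512


@dataclass(frozen=True)
class AccessSpec:
    name: str
    block_kb: int

    @property
    def is_large_block(self) -> bool:
        return self.block_kb >= LARGE_BLOCK_MIN_KB

    @property
    def throughput_unit(self) -> str:
        # Large-block tests report MB/sec; small-block tests report IOPS.
        return "MB/sec" if self.is_large_block else "IOPS"


# Workload names and block sizes as labeled in Figures 2 and 3.
ACCESS_SPECS = [
    AccessSpec("Video on demand", 512),
    AccessSpec("DSS", 1024),
    AccessSpec("Web file server", 4),
    AccessSpec("Web file server", 8),
    AccessSpec("Web file server", 64),
    AccessSpec("Media streaming", 64),
    AccessSpec("SQL Server log", 64),
    AccessSpec("OS paging", 64),
    AccessSpec("Web server log", 8),
    AccessSpec("Database OLTP", 8),
    AccessSpec("Exchange e-mail", 4),
    AccessSpec("OS drive", 8),
]

if __name__ == "__main__":
    for spec in ACCESS_SPECS:
        print(f"{spec.name} ({spec.block_kb} KB): {spec.throughput_unit}")
```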

Figure 2 shows the results for the large-block tests. Each test was run three times; the figure shows the throughput from the median run along with the average processor utilization during this run. In these tests, all three configurations returned comparable throughput results while maintaining processor utilization well below the 8 percent threshold, indicating that a software initiator with a standard NIC can offer an effective and economical choice for these types of workloads, without the need to invest in additional hardware.
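The reporting rule described above (run each test three times, keep the median run by throughput, and report the average processor utilization from that same run) can be sketched as follows. The `Run` type and function names are hypothetical, the sample values are placeholders rather than measured results, and treating exactly 8 percent as acceptable is an assumption.

```python
from dataclasses import dataclass

# Threshold stated in the article; the comparison direction is an assumption.
UTILIZATION_THRESHOLD_PCT = 8.0


@dataclass(frozen=True)
class Run:
    throughput: float   # MB/sec for large-block tests, IOPS for small-block
    avg_cpu_pct: float  # average processor utilization during this run


def median_run(runs):
    """Return the run whose throughput is the median of an odd number of runs."""
    if len(runs) % 2 == 0:
        raise ValueError("median run is only well defined for an odd run count")
    ordered = sorted(runs, key=lambda r: r.throughput)
    return ordered[len(ordered) // 2]


def within_threshold(run, threshold=UTILIZATION_THRESHOLD_PCT):
    """True when the run's utilization does not exceed the threshold."""
    return run.avg_cpu_pct <= threshold


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not data from the report.
    runs = [Run(110.0, 3.1), Run(112.5, 3.4), Run(108.2, 2.9)]
    chosen = median_run(runs)
    print(chosen, "acceptable" if within_threshold(chosen) else "over threshold")
```

Note that the utilization reported is the one measured during the median-throughput run, not the median of the three utilization values.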

Figure 3 shows the results for the small-block tests, which similarly report throughput from the median of three runs along with average processor utilization during that run. As with the large-block tests, the three configurations provided comparable throughput across all 10 tests, and maintained processor utilization below the 8 percent threshold for 5 of the 10 tests. The processor utilization for the two software initiator configurations, however, was higher than this threshold for the other 5 tests. As the results show, the throughput levels correlate with processor utilization; for these types of small-block workloads, an iSCSI HBA may be a more effective option than a software initiator in environments sensitive to utilization levels.¹

[Figure 1. Typical iSCSI initiator configurations. For each of the three configurations (software initiator and standard NIC, software initiator and NIC with TOE, and iSCSI HBA), the figure shows the Application, SCSI, iSCSI, TCP/IP, and Network layers above the storage array and marks each layer as either using the server processor or offloaded to the NIC or HBA.]

SIDEBAR: COMPARING iSCSI AND FIBRE CHANNEL

To demonstrate the comparable performance and efficiency of Internet SCSI (iSCSI) and Fibre Channel offerings, in May 2009 Principled Technologies performed Dell-commissioned benchmark tests evaluating the throughput and processor utilization in iSCSI and Fibre Channel storage area networks (SANs) under a variety of workloads. The test environment used 10 custom access specifications with the Iometer benchmarking tool to measure throughput and processor utilization on a Dell PowerEdge 2950 server connected to a Dell/EMC CX4-120 array with a total of thirty 146 GB, 15,000 rpm drives. The PowerEdge 2950 was configured with two quad-core Intel Xeon E5405 processors at 2.0 GHz; 16 GB of RAM; an 80 GB, 7,200 rpm Serial ATA (SATA) hard drive; and the Microsoft Windows Server 2008 Enterprise x64 Edition OS.

In the iSCSI configuration, the server connected to the storage over a 10 Gigabit Ethernet (10GbE) network using a Broadcom NetXtreme II BCM57710 converged network interface card (iSCSI HBA) and two Dell PowerConnect 6248 switches; in the Fibre Channel configuration, the server connected to the storage over a 4 Gbps Fibre Channel network using a QLogic QLE2460 HBA and a Brocade SilkWorm 200E Fibre Channel switch (see Figure A). To help ensure similar configurations, the iSCSI network was constrained to use four Gigabit Ethernet (GbE) connections between the switches and the storage.

Figures B and C show the results. Each test was run three times; the figures show the throughput from the median run of each workload along with the average processor utilization during this run, with throughput measured in megabytes per second for the large-block tests and I/Os per second (IOPS) for the small-block tests. As these results show, both configurations provided comparable levels of throughput and processor utilization across all workloads, indicating that iSCSI can be as efficient and capable as Fibre Channel across a wide range of application types.*

*For detailed information on the test environment, benchmark workloads, and results, see “10Gb iSCSI Initiators,” by Principled Technologies, June 2009, www.principledtechnologies.com/clients/reports/dell/10Gb_iSCSI_initiators_06_09.pdf.

[Figure A. iSCSI and Fibre Channel test configurations. iSCSI configuration: Dell PowerEdge 2950 server, 10GbE link, Dell PowerConnect 6248 switches joined by a stacking cable, GbE links, Dell/EMC CX4-120 array. Fibre Channel configuration: Dell PowerEdge 2950 server, 4 Gbps Fibre Channel link, Brocade SilkWorm 200E switch, 4 Gbps Fibre Channel link, Dell/EMC CX4-120 array.]

[Figure B. Throughput and processor utilization for the iSCSI and Fibre Channel test configurations under large-block workloads. Axes: throughput (MB/sec) and processor utilization (percent), with the acceptable utilization threshold marked. Workloads: Video on demand (512 KB), DSS (1 MB).]

[Figure C. Throughput and processor utilization for the iSCSI and Fibre Channel test configurations under small-block workloads. Axes: throughput (IOPS) and processor utilization (percent), with the acceptable utilization threshold marked. Workloads: Web file server (4 KB), Web file server (8 KB), Web file server (64 KB), Media streaming (64 KB), SQL Server log (64 KB), OS paging (64 KB), Web server log (8 KB), Database OLTP (8 KB), Exchange e-mail (4 KB), OS drive (8 KB).]

EVALUATING iSCSI INITIATORS IN 10 GIGABIT ETHERNET ENVIRONMENTS

As technologies such as virtualization and multi-core processing enable escalating levels of server and storage consolidation within the data center, high-bandwidth networks become increasingly important. 10GbE networks hold the promise of both meeting these bandwidth demands and enabling a truly unified network infrastructure, including taking advantage of emerging standards such as the Data Center Bridging (DCB) specification. Such standards are expected to enable deterministic latency and enhanced quality-of-service functionality for sharing single 10GbE links between multiple classes of Ethernet traffic.²

The increased bandwidth in 10GbE-based iSCSI networks will influence how software and hardware initiators affect the performance of host servers, given a higher protocol processing load. To evaluate initiator performance in this type of environment, in May 2009 Principled Technologies performed Dell-commissioned benchmark tests on a 10GbE-based iSCSI SAN. The workloads and test environment were the same as the GbE tests, but this time compared a 10GbE Intel NIC with and without Intel I/O Acceleration Technology (I/OAT) enabled for the software initiator tests against a Broadcom NetXtreme II BCM57710 network adapter functioning as an iSCSI HBA. As with the GbE tests, each workload was run three times, with throughput and average processor utilization from the median run serving as the final results.
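One way to see why the protocol processing load rises with bandwidth is to work out the upper bound on the I/Os per second that a single link could carry at a given block size. The sketch below is illustrative arithmetic only: it assumes decimal link rates (1 Gbps = 10^9 bits per second) and ignores Ethernet, TCP/IP, and iSCSI header overhead, so achievable figures are lower. The function names are hypothetical.

```python
def link_mb_per_sec(link_gbps: float) -> float:
    """Raw link rate in decimal megabytes per second (no protocol overhead)."""
    return link_gbps * 1e9 / 8 / 1e6


def max_iops(link_gbps: float, block_bytes: int) -> float:
    """Upper bound on I/Os per second for one link at a fixed block size."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    bytes_per_sec = link_gbps * 1e9 / 8
    return bytes_per_sec / block_bytes


if __name__ == "__main__":
    for gbps in (1, 10):
        for kb in (4, 8, 64):
            bound = max_iops(gbps, kb * 1024)
            print(f"{gbps:>2} Gbps, {kb:>2} KB blocks: up to {bound:,.0f} IOPS")
```

Because the bound scales linearly with link speed, a software initiator on a saturated 10GbE link may have to process roughly ten times as many I/Os per second as on a single GbE link, and every one of them costs host processor cycles.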

Figures 4 and 5 show the results. Both configurations returned comparable throughput results, but unlike the GbE tests, the software initiator had an average processor utilization above the 8 percent threshold across all workloads, including topping out at over 25 percent in the small-block media streaming test. The iSCSI HBA, in contrast, maintained a processor utilization well below this threshold for all workloads except the Web server log (8 KB) test, which came in just over the threshold at 8.88 percent.

¹ For the complete report, including detailed information on the test environment, benchmark workloads, methodology, and results, see “iSCSI 1Gb Software Initiator Performance Analysis,” by Principled Technologies, February 2009, DELL.COM/Downloads/Global/Products/Pwcnt/En/software-initiator-performance-analysis.pdf.

² For more information, see “10 Gigabit Ethernet: Unifying iSCSI and Fibre Channel in a Single Network Fabric,” by Achmad Chadran, Gaurav Chawla, and Ujjwal Rajbhandari, in Dell Power Solutions, September 2009, DELL.COM/Downloads/Global/Power/ps3q09-20090392-Chadran.pdf.

[Figure 2. Throughput and processor utilization for Gigabit Ethernet–based iSCSI configurations under large-block workloads. Series: software initiator with TOE disabled, software initiator with TOE enabled, iSCSI HBA, and processor utilization. Axes: throughput (MB/sec) and processor utilization (percent), with the acceptable utilization threshold marked. Workloads: Video on demand (512 KB), DSS (1 MB).]

[Figure 3. Throughput and processor utilization for Gigabit Ethernet–based iSCSI configurations under small-block workloads. Series: software initiator with TOE disabled, software initiator with TOE enabled, iSCSI HBA, and processor utilization. Axes: throughput (IOPS) and processor utilization (percent), with the acceptable utilization threshold marked. Workloads: Web file server (4 KB), Web file server (8 KB), Web file server (64 KB), Media streaming (64 KB), SQL Server log (64 KB), OS paging (64 KB), Web server log (8 KB), Database OLTP (8 KB), Exchange e-mail (4 KB), OS drive (8 KB).]

As these results illustrate, effective deployment of iSCSI in a 10GbE environment can require a different approach than doing so in a GbE environment. Because of the higher bandwidth offered by 10GbE network fabrics, using an HBA would typically be a better option than using a software initiator, helping maintain acceptable levels of processor utilization on the host server across a wide variety of workload types.³

When considering these results, administrators should also keep in mind that they reflect iSCSI throughput and utilization using typical current hardware components. Because successive advances in processor technology have tended to reduce utilization for a given level of IOPS, future processor generations may reduce the need for HBAs under these types of workloads, enabling the effective use of software initiators as 10GbE networking becomes standardized in enterprise data centers.
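The phrase "utilization for a given level of IOPS" suggests a simple normalized metric for comparing initiators or processor generations: percent of processor utilization per 1,000 IOPS. The sketch below is one possible way to compute it; the function name is hypothetical, the metric is not one the cited reports define, and the inputs shown are placeholders rather than measured values.

```python
def cpu_pct_per_kiops(avg_cpu_pct: float, iops: float) -> float:
    """Processor utilization (percent) spent per 1,000 I/Os per second."""
    if iops <= 0:
        raise ValueError("IOPS must be positive")
    return avg_cpu_pct / (iops / 1000.0)


if __name__ == "__main__":
    # Placeholder inputs for illustration only, not results from the tests.
    software = cpu_pct_per_kiops(avg_cpu_pct=10.0, iops=20000)
    hba = cpu_pct_per_kiops(avg_cpu_pct=4.0, iops=20000)
    print(f"software initiator: {software:.2f}% per 1,000 IOPS")
    print(f"iSCSI HBA:          {hba:.2f}% per 1,000 IOPS")
```

Tracking this ratio across processor generations would show whether the gap between software initiators and HBAs is narrowing at a given throughput level.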

CHOOSING AN OPTIMAL iSCSI CONFIGURATION

In today’s economic climate, controlling costs is an ongoing challenge for data center administrators. Although some 10GbE networks may require a different approach to handle the intensified protocol processing load, in typical GbE environments, a standard software initiator alone may provide comparable performance to a hardware initiator while still maintaining acceptable levels of processor utilization for many different application workloads, offering a simple, cost-effective way for organizations to implement iSCSI without the need to invest in additional hardware components.
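The guidance in this article can be condensed into a rough rule of thumb. The sketch below is only an illustrative encoding of the conclusions stated above, not a sizing tool: the function name and parameters are hypothetical, the TOE-equipped NIC option is not modeled, and any real decision should rest on measurements from the target environment.

```python
def suggest_initiator(link_gbps: float, small_block: bool, cpu_sensitive: bool) -> str:
    """Suggest a starting-point initiator type from the article's findings."""
    if link_gbps >= 10:
        # In the 10GbE tests the software initiator exceeded the 8 percent
        # utilization threshold on every workload.
        return "iSCSI HBA"
    if small_block and cpu_sensitive:
        # In the GbE tests the software initiator configurations exceeded the
        # threshold on half of the small-block tests.
        return "iSCSI HBA"
    # Typical GbE case: comparable throughput with no additional hardware.
    return "software initiator with standard NIC"


if __name__ == "__main__":
    print(suggest_initiator(1, small_block=False, cpu_sensitive=False))
    print(suggest_initiator(1, small_block=True, cpu_sensitive=True))
    print(suggest_initiator(10, small_block=False, cpu_sensitive=False))
```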

Ujjwal Rajbhandari is a product marketing consultant for Dell storage solutions. He has a B.E. in Electrical Engineering from the Indian Institute of Technology, Roorkee, and an M.S. in Electrical Engineering from Texas A&M University.

Dan McConnell is a product marketing strategist in the Dell Enterprise Storage Group with over 10 years of experience in server and storage planning and product development. He graduated from the Georgia Institute of Technology, specializing in Computer Engineering.

[Figure 4. Throughput and processor utilization for 10 Gigabit Ethernet–based iSCSI configurations under large-block workloads. Series: Intel software initiator with I/OAT enabled, Broadcom iSCSI HBA, and processor utilization. Axes: throughput (MB/sec) and processor utilization (percent), with the acceptable utilization threshold marked. Workloads: Video on demand (512 KB), DSS (1 MB).]

[Figure 5. Throughput and processor utilization for 10 Gigabit Ethernet–based iSCSI configurations under small-block workloads. Series: Intel software initiator with I/OAT enabled, Broadcom iSCSI HBA, and processor utilization. Axes: throughput (IOPS) and processor utilization (percent), with the acceptable utilization threshold marked. Workloads: Web file server (4 KB), Web file server (8 KB), Web file server (64 KB), Media streaming (64 KB), SQL Server log (64 KB), OS paging (64 KB), Web server log (8 KB), Database OLTP (8 KB), Exchange e-mail (4 KB), OS drive (8 KB).]

³ For detailed information on the test environment, benchmark workloads, and results, see “10Gb iSCSI Initiators,” by Principled Technologies, June 2009, www.principledtechnologies.com/clients/reports/dell/10Gb_iSCSI_initiators_06_09.pdf.

QUICK LINK

Dell iSCSI storage solutions: DELL.COM/iSCSI