Storage
Reprinted from Dell Power Solutions, December 2009. Copyright © 2009 Dell Inc. All rights reserved.

iSCSI Best Practices: Choosing Between Software and Hardware Initiators

By Ujjwal Rajbhandari and Dan McConnell

When deploying Internet SCSI (iSCSI) networks, administrators face a key choice: should they use software initiators or hardware initiators? As testing across a range of workloads demonstrates, software initiators can provide levels of throughput and efficiency comparable to hardware initiators for most types of applications, offering a cost-effective way to implement iSCSI without the need to invest in additional hardware components.

The Internet SCSI (iSCSI) protocol offers tremendous advantages in enterprise data centers. By allowing SCSI commands to be sent over standard Ethernet networks, iSCSI enables the deployment of storage area networks (SANs) using existing infrastructure and familiar components, providing performance and efficiency comparable to traditional Fibre Channel SANs, but without requiring the levels of hardware investment, training, and specialized knowledge typically required of those environments (see the “Comparing iSCSI and Fibre Channel” sidebar in this article). Systems such as the Dell™ EqualLogic™ PS Series, Dell PowerVault™ MD3000i, Dell PowerVault NX1950, and Dell/EMC CX4 Series arrays offer flexible, simplified, cost-effective ways to deploy iSCSI storage while helping to meet a wide range of performance and capacity requirements.

Within an iSCSI SAN, network communication takes place between initiators (which typically reside on a host server) and targets (typically the storage resources). When implementing an iSCSI SAN in their own environments, however, administrators immediately face a key choice: whether to deploy software initiators or hardware initiators. Software initiators such as the Microsoft® iSCSI Software Initiator depend on host server resources to process iSCSI traffic through standard network interface cards (NICs), while hardware initiators such as iSCSI host bus adapters (HBAs) offload this protocol processing to the hardware itself. Administrators also have the option of deploying NICs with TCP/IP Offload Engine (TOE) technology, enabling them to use software initiators while still offloading some of the protocol processing to the NIC hardware.

Of course, the most appropriate choice depends on the specifics of the environment. Although there are certain workload and network types where dedicated hardware makes sense, software initiators alone are designed to meet the needs of most workloads in typical Gigabit Ethernet (GbE) environments, offering a cost-effective option that can support levels of throughput comparable to hardware initiators while still maintaining acceptable levels of processor utilization.

Understanding iSCSI initiators in the data center

iSCSI implementations typically use one of three initiator configurations (see Figure 1):

■ Software initiator and standard NIC: A software initiator in the host server OS establishes the iSCSI connection through a standard NIC.
■ Software initiator and NIC with TOE: The software initiator still handles the iSCSI connection, but TCP/IP processing is offloaded from the host processor to a TOE-equipped NIC.
■ iSCSI HBA: The HBA hardware handles both iSCSI and TCP/IP processing.

Each configuration offers its own
advantages and disadvantages. For exam-
ple, the first approach is generally the most
cost-effective of the three—software initia-
tors are often provided by OS vendors at
no additional cost, and administrators can
use them with standard NICs without the
need to purchase additional specialized
hardware. This configuration can support
similar levels of throughput to the other
two approaches, but can also increase host
processor utilization beyond acceptable
levels in some environments. The second
approach—combining a software initiator
with a TOE-equipped NIC—requires going
beyond standard NIC hardware, but has
the advantage of mitigating some of the
processor impact typical of a pure soft-
ware initiator configuration.
In the third approach, a specialized
iSCSI HBA removes the protocol process-
ing from the host server to help maximize
efficiency. This approach is often unneces-
sary in typical GbE environments because
the processing overhead is not noticeable.
However, it can be an important option
for specific types of workloads or environ-
ments, particularly when deploying iSCSI
with 10 Gigabit Ethernet (10GbE) net-
works (see the “Evaluating iSCSI initiators
in 10 Gigabit Ethernet environments” sec-
tion in this article).
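
The division of labor among the three configurations can be written down as a small lookup table. The following Python sketch is only an illustration of the description above; the configuration labels, the `Where` enum, and the `host_layers` helper are names invented for this example and do not correspond to any Dell or Microsoft tool.

```python
from enum import Enum


class Where(Enum):
    """Component that processes a protocol layer (labels are assumptions)."""
    HOST = "host processor"
    ADAPTER = "NIC or HBA"


# Who handles iSCSI and TCP/IP processing in each configuration,
# following the three-item list in the article.
CONFIGURATIONS = {
    "software initiator + standard NIC": {"iSCSI": Where.HOST, "TCP/IP": Where.HOST},
    "software initiator + TOE NIC": {"iSCSI": Where.HOST, "TCP/IP": Where.ADAPTER},
    "iSCSI HBA": {"iSCSI": Where.ADAPTER, "TCP/IP": Where.ADAPTER},
}


def host_layers(config_name):
    """Return the protocol layers the host processor still has to handle."""
    layers = CONFIGURATIONS[config_name]
    return [layer for layer, where in layers.items() if where is Where.HOST]


for name in CONFIGURATIONS:
    handled = host_layers(name)
    print(f"{name}: host handles {', '.join(handled) if handled else 'neither layer'}")
```

The fewer layers left on the host, the less processor time iSCSI traffic is expected to consume, which is the trade-off the benchmark tests in the following sections set out to measure.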
Evaluating iSCSI initiators in Gigabit Ethernet environments

To demonstrate how each type of initiator
performs in typical GbE environments, in
February 2009 Principled Technologies
performed Dell-commissioned benchmark
tests on a GbE-based iSCSI SAN under a
variety of workloads that were created to
simulate typical I/O patterns of frequently
used applications or storage environ-
ments. The test environment used the
Iometer benchmarking tool to measure
throughput and processor utilization on a
Dell PowerEdge™ 2950 server using the
Microsoft iSCSI Software Initiator and a
Broadcom NetXtreme II BCM5708C net-
work adapter with TOE and iSCSI HBA
functionality. The server was configured
with two quad-core Intel® Xeon® E5405
processors at 2.0 GHz; 16 GB of RAM; an
80 GB, 7,200 rpm Serial ATA (SATA) hard
drive; and the Microsoft Windows Server®
2008 Enterprise x64 Edition OS. It con-
nected through a Dell PowerConnect™
6248 switch to four Dell EqualLogic
PS5000XV iSCSI SAN arrays configured
with a total of sixty-four 146 GB, 15,000 rpm
Serial Attached SCSI (SAS) drives.
The tests evaluated three configura-
tions. The first configuration used the soft-
ware initiator with the Broadcom adapter
operating as a standard Layer 2 NIC (TOE
disabled), the second used the software
initiator with TOE enabled on the
Broadcom adapter, and the third used the
Broadcom adapter in iSCSI HBA mode
with iSCSI Offload Engine (iSOE) technol-
ogy enabled.
The test team utilized 10 Iometer
access specifications that were created to
simulate the typical I/O patterns of fre-
quently used applications or storage envi-
ronments for the tests: 2 large-block
specifications that evaluated video-on-
demand and decision support system
(DSS) performance on 512 KB and 1 MB
I/Os, and 8 small-block specifications that
evaluated Web file server, media streaming,
Microsoft SQL Server® log, OS paging, Web
server log, database online transaction pro-
cessing (OLTP), Microsoft Exchange e-mail,
and OS drive performance on 4 KB, 8 KB,
and 64 KB I/Os. The large-block tests
measured throughput in megabytes per
second along with processor utilization,
while the small-block tests measured
throughput in I/Os per second (IOPS)
along with processor utilization. For the
purposes of these tests, 8 percent proces-
sor utilization was set as the acceptable
threshold, based on an estimation of
the level that could significantly affect
processor-intensive applications.
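
Because the large-block results are reported in megabytes per second and the small-block results in IOPS, it can help to convert between the two when comparing workloads. The sketch below shows the arithmetic; the function names are invented for this example, and it assumes 1 MB = 1,024 KB, which may differ from the convention used in the underlying report.

```python
def iops_to_mb_per_sec(iops, block_size_kb):
    """Convert I/Os per second to megabytes per second.

    Assumes 1 MB = 1,024 KB (an assumption of this sketch).
    """
    return iops * block_size_kb / 1024


def mb_per_sec_to_iops(mb_per_sec, block_size_kb):
    """Convert megabytes per second to I/Os per second (same assumption)."""
    return mb_per_sec * 1024 / block_size_kb


# Hypothetical figures chosen for illustration, not taken from the test results.
print(iops_to_mb_per_sec(10_000, 8))    # 10,000 IOPS of 8 KB I/Os
print(mb_per_sec_to_iops(100, 512))     # 100 MB/sec of 512 KB I/Os
```

The conversion makes clear why the two groups use different units: a small-block workload can issue many thousands of I/Os per second while moving relatively little data, whereas a large-block workload saturates bandwidth with only a few hundred I/Os per second.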
Figure 2 shows the results for the
large-block tests. Each test was run three
times; the figure shows the throughput
from the median run along with the aver-
age processor utilization during this run.
In these tests, all three configurations
returned comparable throughput results
while maintaining processor utilization
well below the 8 percent threshold, indi-
cating that a software initiator with a stan-
dard NIC can offer an effective and
economical choice for these types of
workloads, without the need to invest in
additional hardware.
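
The reporting method described above (three runs, with throughput and average processor utilization taken from the median run) can be sketched in a few lines of Python. The `median_run` helper and the sample numbers are hypothetical and serve only to illustrate the selection step; they are not values from the Principled Technologies report.

```python
THRESHOLD_PERCENT = 8.0  # acceptable utilization threshold stated in the article


def median_run(runs):
    """Return the run with the median throughput.

    Each run is a (throughput, avg_cpu_percent) tuple; an odd number
    of runs is expected so that the median is an actual run.
    """
    if len(runs) % 2 == 0:
        raise ValueError("expected an odd number of runs")
    ordered = sorted(runs, key=lambda run: run[0])
    return ordered[len(ordered) // 2]


# Hypothetical measurements for one workload: (MB/sec, percent utilization).
example_runs = [(110.0, 3.1), (112.5, 2.9), (111.2, 3.4)]

throughput, cpu = median_run(example_runs)
print(f"median run: {throughput} MB/sec at {cpu}% processor utilization")
print("within threshold" if cpu <= THRESHOLD_PERCENT else "above threshold")
```

Note that the utilization reported is the one observed during the median-throughput run, not the median of the three utilization values.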
Figure 3 shows the results for the
small-block tests, which similarly report
throughput from the median of three runs
along with average processor utilization
during that run. As with the large-block tests, the three configurations provided comparable throughput across all 10 tests, and maintained processor utilization below the 8 percent threshold for 5 of the 10 tests. The processor utilization for the two software initiator configurations, however, was higher than this threshold for the other 5 tests. As the results show, the throughput levels correlate with processor utilization; for these types of small-block workloads, an iSCSI HBA may be a more effective option than a software initiator in environments sensitive to utilization levels.¹

Figure 1. Typical iSCSI initiator configurations
[Diagram: the protocol stack (Application, SCSI, iSCSI, TCP/IP, Network) between the host and the storage array for each of three configurations: software initiator and standard NIC, software initiator and NIC with TOE, and iSCSI HBA. Legend: layers that use the server processor versus layers offloaded to the NIC or HBA.]

Sidebar: Comparing iSCSI and Fibre Channel

To demonstrate the comparable performance and efficiency of Internet SCSI (iSCSI) and Fibre Channel offerings, in May 2009 Principled Technologies performed Dell-commissioned benchmark tests evaluating the throughput and processor utilization in iSCSI and Fibre Channel storage area networks (SANs) under a variety of workloads. The test environment used 10 custom access specifications with the Iometer benchmarking tool to measure throughput and processor utilization on a Dell PowerEdge 2950 server connected to a Dell/EMC CX4-120 array with a total of thirty 146 GB, 15,000 rpm drives. The PowerEdge 2950 was configured with two quad-core Intel Xeon E5405 processors at 2.0 GHz; 16 GB of RAM; an 80 GB, 7,200 rpm Serial ATA (SATA) hard drive; and the Microsoft Windows Server 2008 Enterprise x64 Edition OS.

In the iSCSI configuration, the server connected to the storage over a 10 Gigabit Ethernet (10GbE) network using a Broadcom NetXtreme II BCM57710 converged network interface card (iSCSI HBA) and two Dell PowerConnect 6248 switches; in the Fibre Channel configuration, the server connected to the storage over a 4 Gbps Fibre Channel network using a QLogic QLE2460 HBA and a Brocade SilkWorm 200E Fibre Channel switch (see Figure A). To help ensure similar configurations, the iSCSI network was constrained to use four Gigabit Ethernet (GbE) connections between the switches and the storage.

Figures B and C show the results. Each test was run three times; the figures show the throughput from the median run of each workload along with the average processor utilization during this run, with throughput measured in megabytes per second for the large-block tests and I/Os per second (IOPS) for the small-block tests. As these results show, both configurations provided comparable levels of throughput and processor utilization across all workloads, indicating that iSCSI can be as efficient and capable as Fibre Channel across a wide range of application types.*

* For detailed information on the test environment, benchmark workloads, and results, see “10Gb iSCSI Initiators,” by Principled Technologies, June 2009, www.principledtechnologies.com/clients/reports/dell/10Gb_iSCSI_initiators_06_09.pdf.

Figure A. iSCSI and Fibre Channel test configurations
[Diagram: iSCSI configuration: Dell PowerEdge 2950 server, 10GbE link, Dell PowerConnect 6248 switches joined by a stacking cable, GbE links, Dell/EMC CX4-120 array. Fibre Channel configuration: Dell PowerEdge 2950 server, 4 Gbps Fibre Channel link, Brocade SilkWorm 200E switch, 4 Gbps Fibre Channel link, Dell/EMC CX4-120 array.]

Figure B. Throughput and processor utilization for the iSCSI and Fibre Channel test configurations under large-block workloads
[Chart: throughput (MB/sec) and processor utilization (percent) for the Video on demand (512 KB) and DSS (1 MB) workloads. Series: iSCSI, Fibre Channel, processor utilization. The acceptable utilization threshold is marked.]

Figure C. Throughput and processor utilization for the iSCSI and Fibre Channel test configurations under small-block workloads
[Chart: throughput (IOPS) and processor utilization (percent) for the Web file server (4 KB), Web file server (8 KB), Web file server (64 KB), Media streaming (64 KB), SQL Server log (64 KB), OS paging (64 KB), Web server log (8 KB), Database OLTP (8 KB), Exchange e-mail (4 KB), and OS drive (8 KB) workloads. Series: iSCSI, Fibre Channel, processor utilization. The acceptable utilization threshold is marked.]

Evaluating iSCSI initiators in 10 Gigabit Ethernet environments

As technologies such as virtualization and
multi-core processing enable escalating
levels of server and storage consolidation
within the data center, high-bandwidth
networks become increasingly important.
10GbE networks hold the promise of
both meeting these bandwidth demands
and enabling a truly unified network
infrastructure—including taking advan-
tage of emerging standards such as the
Data Center Bridging (DCB) specification.
Such standards are expected to enable
deterministic latency and enhanced
quality-of-service functionality for shar-
ing single 10GbE links between multiple
classes of Ethernet traffic.²
The increased bandwidth in 10GbE-
based iSCSI networks will influence how
software and hardware initiators affect
the performance of host servers, given a
higher protocol processing load. To eval-
uate initiator performance in this type of
environment, in May 2009 Principled
Technologies performed Dell-commissioned
benchmark tests on a
10GbE-based iSCSI SAN. The workloads
and test environment were the same as
the GbE tests, but this time compared a
10GbE Intel NIC with and without Intel
I/O Acceleration Technology (I/OAT)
enabled for the software initiator tests
against a Broadcom NetXtreme II
BCM57710 network adapter functioning
as an iSCSI HBA. As with the GbE tests,
each workload was run three times, with
throughput and average processor utili-
zation from the median run serving as the
final results.
Figures 4 and 5 show the results. Both
configurations returned comparable
throughput results, but unlike the GbE
tests, the software initiator had an average
processor utilization above the 8 percent
threshold across all workloads—including
topping out at over 25 percent in the
small-block media streaming test. The
iSCSI HBA, in contrast, maintained a processor utilization well below this threshold for all workloads except the Web server log (8 KB) test, which came in just over the threshold at 8.88 percent.

Figure 2. Throughput and processor utilization for Gigabit Ethernet–based iSCSI configurations under large-block workloads
[Chart: throughput (MB/sec) and processor utilization (percent) for the Video on demand (512 KB) and DSS (1 MB) workloads. Series: software initiator with TOE disabled, software initiator with TOE enabled, iSCSI HBA, processor utilization. The acceptable utilization threshold is marked.]

Figure 3. Throughput and processor utilization for Gigabit Ethernet–based iSCSI configurations under small-block workloads
[Chart: throughput (IOPS) and processor utilization (percent) for the Web file server (4 KB), Web file server (8 KB), Web file server (64 KB), Media streaming (64 KB), SQL Server log (64 KB), OS paging (64 KB), Web server log (8 KB), Database OLTP (8 KB), Exchange e-mail (4 KB), and OS drive (8 KB) workloads. Series: software initiator with TOE disabled, software initiator with TOE enabled, iSCSI HBA, processor utilization. The acceptable utilization threshold is marked.]

¹ For the complete report, including detailed information on the test environment, benchmark workloads, methodology, and results, see “iSCSI 1Gb Software Initiator Performance Analysis,” by Principled Technologies, February 2009, DELL.COM/Downloads/Global/Products/Pwcnt/En/software-initiator-performance-analysis.pdf.
² For more information, see “10 Gigabit Ethernet: Unifying iSCSI and Fibre Channel in a Single Network Fabric,” by Achmad Chadran, Gaurav Chawla, and Ujjwal Rajbhandari, in Dell Power Solutions, September 2009, DELL.COM/Downloads/Global/Power/ps3q09-20090392-Chadran.pdf.

As these results illustrate, effective
deployment of iSCSI in a 10GbE environ-
ment can require a different approach
than doing so in a GbE environment.
Because of the higher bandwidth offered
by 10GbE network fabrics, using an HBA
would typically be a better option than
using a software initiator, helping main-
tain acceptable levels of processor utiliza-
tion on the host server across a wide
variety of workload types.³
When considering these results,
administrators should also keep in mind
that they reflect iSCSI throughput and
utilization using typical current hardware
components. Because successive
advances in processor technology have
tended to reduce utilization for a given
level of IOPS, future processor genera-
tions may reduce the need for HBAs
under these types of workloads—enabling
the effective use of software initiators as
10GbE networking becomes standardized
in enterprise data centers.
Choosing an optimal iSCSI configuration

In today’s economic climate, controlling
costs is an ongoing challenge for data
center administrators. Although some
10GbE networks may require a different
approach to handle the intensified pro-
tocol processing load, in typical GbE
environments, a standard software ini-
tiator alone may provide comparable
performance to a hardware initiator
while still maintaining acceptable levels
of processor utilization for many differ-
ent application workloads—offering a
simple, cost-effective way for organiza-
tions to implement iSCSI without the
need to invest in additional hardware
components.
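
As a rough summary of the guidance in this article, the selection logic might be sketched as follows. This is an illustrative rule of thumb only: the `suggest_initiator` function, its parameters, and its return strings are invented for this example, and the 8 percent default simply reuses the threshold chosen for the tests. Actual decisions should rest on measurements in the target environment.

```python
def suggest_initiator(link_gbps, software_cpu_percent=None, threshold_percent=8.0):
    """Suggest an initiator type (illustrative rule of thumb, not a Dell tool).

    link_gbps: Ethernet link speed in Gbps (1 for GbE, 10 for 10GbE).
    software_cpu_percent: host processor utilization measured with a
        software initiator under the target workload, if available.
    threshold_percent: acceptable utilization level for the environment.
    """
    if software_cpu_percent is not None:
        # A measurement for the real workload outweighs the general guidance.
        if software_cpu_percent <= threshold_percent:
            return "software initiator with standard NIC"
        return "iSCSI HBA"
    # No measurement available: fall back on the article's general findings.
    if link_gbps >= 10:
        return "iSCSI HBA"
    return "software initiator with standard NIC"


print(suggest_initiator(1))         # typical GbE environment
print(suggest_initiator(10))        # 10GbE environment
print(suggest_initiator(1, 12.0))   # GbE, but a utilization-sensitive workload
```

The sketch deliberately prefers a measured utilization figure over link speed, reflecting the point that some small-block workloads exceeded the threshold even on GbE.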
Ujjwal Rajbhandari is a product marketing
consultant for Dell storage solutions. He
has a B.E. in Electrical Engineering from
the Indian Institute of Technology, Roorkee,
and an M.S. in Electrical Engineering from
Texas A&M University.
Dan McConnell is a product marketing
strategist in the Dell Enterprise Storage
Group with over 10 years of experience in
server and storage planning and product
development. He graduated from the
Georgia Institute of Technology, special-
izing in Computer Engineering.

Figure 4. Throughput and processor utilization for 10 Gigabit Ethernet–based iSCSI configurations under large-block workloads
[Chart: throughput (MB/sec) and processor utilization (percent) for the Video on demand (512 KB) and DSS (1 MB) workloads. Series: Intel software initiator with I/OAT enabled, Broadcom iSCSI HBA, processor utilization. The acceptable utilization threshold is marked.]

Figure 5. Throughput and processor utilization for 10 Gigabit Ethernet–based iSCSI configurations under small-block workloads
[Chart: throughput (IOPS) and processor utilization (percent) for the Web file server (4 KB), Web file server (8 KB), Web file server (64 KB), Media streaming (64 KB), SQL Server log (64 KB), OS paging (64 KB), Web server log (8 KB), Database OLTP (8 KB), Exchange e-mail (4 KB), and OS drive (8 KB) workloads. Series: Intel software initiator with I/OAT enabled, Broadcom iSCSI HBA, processor utilization.]

³ For detailed information on the test environment, benchmark workloads, and results, see “10Gb iSCSI Initiators,” by Principled Technologies, June 2009, www.principledtechnologies.com/clients/reports/dell/10Gb_iSCSI_initiators_06_09.pdf.

Quick link
Dell iSCSI storage solutions: DELL.COM/iSCSI