STORAGE
Managing the information that drives the enterprise
Vol. 10 No. 2 | April 2011
INSIDE: THIN PROVISIONING • STORAGE FOR EXCHANGE 2010
Virtual DR Disaster recovery can be tough, but virtualized servers
and storage virtualization can make DR easier and a lot more
effective.
ALSO INSIDE Massive amounts of data will bury us!
Cast of cloud storage players taking shape
Optimize bandwidth for backup
STORAGE inside | April 2011
Gotta yottabyte? | EDITORIAL | p. 5
Four different news reports all point to the same fact: Data is growing uncontrollably. It's time for storage shops to start cleaning house. by RICH CASTAGNA

Some clarity for enterprise cloud storage | STORWARS | p. 9
Cloud storage is a next-generation IT infrastructure that's altering the data storage landscape. And its cast of key players is beginning to take shape. by TONY ASARO

Virtual disaster recovery | p. 12
Whether used singly or combined, server virtualization and storage virtualization are making an impact on the ability of IT to deliver disaster recovery, and to do so cost effectively. by LAUREN WHITEHOUSE

Thin provisioning in depth | p. 21
Thin provisioning can help you use your disk capacity much more efficiently, but you need to get under the hood to understand how the technology might work in your environment. by STEPHEN FOSKETT

Exchange 2010 and storage systems | p. 32
With Exchange Server 2010, Microsoft made some significant changes to the email app's database structure, and those changes may also affect the storage it resides on. by BRIEN M. POSEY

Backing up to the cloud requires new approach to bandwidth | HOT SPOTS | p. 43
A lot of attention has been focused on security issues related to cloud backup, but bandwidth and transfer issues may be bigger problems. by LAUREN WHITEHOUSE

Don't let the cloud obscure good judgment | READ/WRITE | p. 47
Cloud storage is likely to become a significant part of your data storage infrastructure. But test the waters before locking into a vendor. by ARUN TANEJA

Fibre Channel still top dog among disks | SNAPSHOT | p. 51
What kind of data drives are you using? Are they 6 Gig SAS? Solid state? Or good old Fibre Channel (FC)? More than half of the companies in our survey favor FC for their top tier. by RICH CASTAGNA

From our sponsors | p. 53
Useful links from our advertisers.
THIS JUST IN: Earth knocked off its axis due to weight of 295
exabytes of data! OK, maybe we’re just wobbling on our axis a
little bit, but that’s a heckuva lot of data, and you’re going to
need an awful lot of disks, chips, tape, paper and anything else
that might hold a petabyte here and there to accommodate it
all.
That number—295 exabytes—was reported in an article in Science
Express, a journal published by the American Association for the
Advancement of Science. The authors used some pretty complex
computations to come up with that number, which they actually
define as the amount of data we were able to store in 2007. Science
Express looks like a pretty serious pub—among the other articles in
the same issue were “Tomography of Reaction-Diffusion
Microemulsions Reveals Three-Dimensional Turing Patterns” and
“Dynamic Control of Chiral Space in a Catalytic Asymmetric Reaction
Using a Molecular Motor.” These folks aren’t fooling around . . .
and the fact they didn’t round the figure off to 300 exabytes adds
a little edge of precision that impresses stat junkies like
me.
Not only do we have to find a place to put all that data, but we’re
probably going to have to back it up and then stash away a copy or
two for disaster recovery. So that 295 exabytes could turn into a
few zettabytes of data. Can yottabytes of data be far behind?
MORE THIS JUST IN: According to IDC, in the fourth quarter of 2010, "total disk storage systems capacity shipped reached 5,127 petabytes, growing 55.7% year over year." According to my seventh-grade math, that's approximately 290 exabytes short of what we need, but it's still a lot of disk.
Sooner or later, we’re going to have to learn how to throw some of
this stuff away. Once the attic and basement are crammed full, and
little 0s and 1s are spilling out of the cupboards, we won’t have
any room for new data. What happens then? Your shop might not be in
exabyte territory yet, but a surprising number of companies have
crossed the petabyte threshold, and coping with capacity is an
ongoing struggle even for shops with far more modest amounts of
data to store.
EDITORIAL | RICH CASTAGNA

Gotta yottabyte?
If you thought yottabytes were some kind of snack food, you're in for a rude awakening . . . but right now we've got exabytes to deal with.
6 Storage April 2011
The problem is that data housecleaning tools either don’t exist or
aren’t up to the task at hand. Knowing what can be deep-sixed and
what needs to be preserved means you need to know what you have in
the first place. Few available products can give you much insight
into the state of your data stores. A few years ago, it looked like
data classification was poised to catch on—if not as a product
category then as an underpinning technology for a raft of storage
management chores, like identifying data that belongs in the trash
bin. Classification pretty much fizzled out, but maybe it can make a comeback now that we have renewed interest in automated storage tiering.
So what do you do? You can ask your users to clean up their acts by
voluntarily deleting all that useless old stuff. Somebody would
listen, right? I'll be the first to admit that probably half of what I produce can eventually end up in the data dumpster without any profound effect on humanity, my company or anyone for that matter. You could try data storage quotas that limit what each user can save; quotas can work, but you'll become the second least popular person in your company, right after the guy who's been stealing everyone's lunch from the cafeteria fridge.
EVEN MORE THIS JUST IN: According to CBC News, the Government of Canada is getting very serious about reducing the amount of data it stores: "The federal government has ordered a monster machine to chew up its discarded hard drives, USB thumb drives, CDs, and even ancient Beta videotapes." Why didn't I think of that? It's a
perfect solution: a monster machine that eats data. Let’s just hope
it has a healthy enough appetite to eat 295 exabytes or has some
hungry monster friends. But one CBC News reader had another idea:
“It would be easier and cheaper just to buy them sledgehammers.”
That’d work, too.
Or maybe just put everything on solid-state storage. An article in the Journal of Digital Forensics, Security and Law ominously titled "Solid State Drives: The Beginning of the End for Current Practice in Digital Forensic Recovery?", suggests that you don't really have to worry if you're drowning in data, just put it on solid-state devices. At the end of the article's abstract, the two Australian authors wrote: "Our experimental findings demonstrate
that solid-state drives (SSDs) have the capacity to destroy
evidence catastrophically under their own volition, in the absence of specific
instructions to do so from a computer.” Good news if you need to
put your data on a diet, I guess, but not so good for the
solid-state storage industry.
Talk about connecting the dots; stories about oceans of data,
skyrocketing disk sales, data munching machines and
not-so-solid-state storage, and all in the same week. Maybe it’s an
omen. Maybe it’s time to look for a real solution to soaring data
stores (and the associated soaring cost of keeping it all) instead
of just throwing more disk, tape or chips at the problem. I know it's counterintuitive for storage vendors to promote technologies that help their customers buy less stuff from them, but maybe there's a wily little startup out there with a great idea and a useful tool that can stem the data tide.
Rich Castagna ([email protected]) is editorial director of the Storage Media Group.
FOR THE LAST year or so, cloud storage has been on a roller coaster
ride in terms of hype, buzz and, yes, plenty of skepticism. But
it’s not just a passing fad. Private clouds aren’t just another way
of talking about old IT stuff in a new way. And public clouds are
real alternatives and becoming pervasive.
Cloud storage is a next-generation IT infrastructure that’s
altering the data storage landscape. The changes will happen in
obvious and not-so-obvious ways. The obvious ways include using
cloud storage for backup, disaster recovery and archiving. The less
obvious ways include new business models and applications being
developed specifically for the cloud.
EMC Atmos offers private and public cloud storage. And even though
EMC has been at it for a while, I personally haven’t run into any
companies using that technology. I know they’re out there, but the
only Atmos user I’ve spoken to is using it as a Centera
replacement. NetApp is offering FAS in the cloud with major service
providers and giving big incentives to NetApp salespeople to drive
this business. True to form, NetApp is leveraging FAS for
everything, including cloud storage. EMC’s and NetApp’s strategies
are polar opposites—EMC has a number of different storage systems
with little synergy among them even as they sometimes overlap, and
NetApp has a single storage system it uses for everything.
Interestingly, while they have diametrically opposed strategies,
they’re both very successful overall and you can’t argue with
success.
Hitachi has positioned its entire storage portfolio for cloud
storage, and Dell is leveraging its DX solution. Hewlett-Packard
hasn’t announced anything significant, and IBM doesn’t have
anything substantial to speak of yet.
Nirvanix is an interesting emerging enterprise cloud storage vendor
with an end-to-end solution that makes it extremely easy for
companies to use its cloud storage. Nirvanix provides an
easy-to-manage complete solution with front-end usability,
security, reliability, performance, multi-tenancy and global access
combined with back-end controls, reporting, management and
analytics. It isn’t just a storage system that scales and supports
HTTP, which so many other vendors tout as a cloud platform. Rather,
Nirvanix provides a holistic solution designed specifically for
businesses to use without being cloud experts.
StorWars | Tony Asaro
Some clarity for enterprise cloud storage
Cloud storage is moving beyond the hype and the cast of key players is beginning to take shape.

This is exactly the requirement I wrote about here nearly a year ago for
what was needed for cloud storage to succeed. Having a storage system built just for the cloud is important, but it's barely half the story. It's the application and management of that infrastructure that's essential to a true cloud storage solution.
Another trend in enterprise cloud storage is the use of open-source
file systems to build your own. Techie IT professionals are
considering Gluster, Hadoop and other extensible, scalable file
systems. But they face a two-fold challenge: They need to
bulletproof their homemade storage systems, and they need to build
the front-end and back-end applications to manage and offer these
services. It’s daunting but achievable; Amazon and Google are
examples of companies doing this even though it was outside their
core competencies.
Google is also an important player in cloud. More and more
companies are moving to Google Apps and, as a result, are using
Google infrastructure, which should also be considered enterprise
cloud storage as real businesses are replacing their on-premises
applications with Google’s offerings.
I haven't mentioned Amazon Simple Storage Service (Amazon S3) as an enterprise cloud storage service because it truly isn't that. Amazon S3 isn't a total solution, and Amazon seems to have no interest in making it one. Rather, it's a utility that you can build your own apps to use. However, I believe next-generation Web-based businesses will continue to use Amazon S3 and, in that sense, it will be a storage platform for the enterprise.
I can see enterprise cloud storage playing out with EMC driving Atmos within its own customer base, which sometimes works (e.g., Data Domain) and sometimes doesn't (e.g., Centera). But EMC will get customers and generate business, things it tends to do very
head-to-head with EMC Atmos. Those two vendors will dominate this
market, with other vendors having pockets of success here and there
but never assuming a dominant position. Of course, that depends on
who buys Nirvanix and how well they execute. Google will continue
to drive more business applications into its cloud, which will
impact storage vendors and application vendors such as Microsoft
and Oracle.
There is no endgame with the market going in one direction or the
other. Instead, most users will have a mixture of on-premises IT,
private cloud, public cloud and applications in the cloud. There
will be a handful of major leaders and the others will be forever
chasing their tails.
Tony Asaro is senior analyst and founder of Voices of IT
(www.VoicesofIT.com).
Virtual DISASTER RECOVERY
Storage and server virtualization make many of the most onerous
disaster recovery (DR) tasks relatively easy to execute, while
helping to cut overall DR costs.
BY LAUREN WHITEHOUSE
IF YOUR COMPANY still lacks a viable disaster recovery (DR) strategy, it might be time to start thinking virtualization. The initial drivers behind server virtualization adoption have been improving resource utilization and lowering costs through consolidation, but next-wave adopters have realized that virtualization can also improve availability.
Virtualization turns physical devices into sets of resource pools
that are independent of the physical asset they run on. With server
virtualization, decoupling operating systems, applications and data
from specific physical assets eliminates the economic and
operational issues of infrastructure
silos—one of the key ingredients to affordable disaster recovery.
Storage virtualization takes those very same benefits and extends
them
from servers to the underlying storage domain, bringing IT
organizations one step closer to the ideal of a virtualized IT
infrastructure. By harnessing the power of virtualization, at both
the server and storage level, IT organizations can become more
agile in disaster recovery.
REDUCE THE RISK Improving disaster recovery and business continuity
are perennial top-10 IT priorities because companies want to reduce
the risk of losing access to systems and data. While most shops
have daily data protection plans in place, fewer of them focus
their efforts on true disasters, which would include any event that
interrupts service at the primary production location. An event can
be many different things, including power failures, fires, floods,
other weather-related outages, natural disasters, pandemics or
terrorism. Regardless of the cause, unplanned downtime in the data
center can wreak havoc on IT’s ability to maintain business
operations.
The goal of a DR process is to recreate all necessary systems at a second location as quickly and reliably as possible. Unfortunately, for many firms, DR strategies are often cobbled together because there's nothing or no one mandating them, they're too costly or complex, or there's a false belief that existing backup processes are adequate for disaster recovery.
Backup technologies and processes will take you just so far when it
comes to a disaster. Tier 1 data (the most critical stuff) makes up
approximately 50% of an organization’s total primary data. When the
Enterprise Strategy Group (ESG) surveyed IT professionals
responsible for data protection, 53% said their organization could
tolerate one hour or less of downtime before their business
suffered revenue loss or some other type of adverse business
impact; nearly three-quarters (74%) fell into the
less-than-three-hour range. (The results of this survey were
published in the ESG research report, 2010 Data Protection Trends,
April 2010.) Under the best conditions, the time it takes to
acquire
replacement hardware, re-install operating systems and
applications, and recover data—even from a disk-based copy—will
likely exceed a recovery time objective (RTO) of one to three
hours.
Recovery from a mirror copy of a system is faster than recovering with traditional backup methods, but it's also more expensive and complex. Maintaining identical systems in two locations and synchronizing configuration settings and data copies can be a challenge. This often forces companies to prioritize or "triage" their data, providing greater protection to some tiers than others. ESG research found that tier 2 data comprises 28% of all primary data, and nearly half (47%) of IT organizations we surveyed noted three hours or less of downtime tolerance for tier 2 data. Therefore, if costs force a company to apply a different strategy or a no-protection strategy for "critical" (tier 1) vs. "important" (tier 2), some risks may be introduced.
BENEFITS OF SERVER VIRTUALIZATION FOR DR Virtualization has become
a major catalyst for change in x86 environments because it provides
new opportunities for more cost-effective DR. When looking at the
reasons behind server virtualization initiatives coming in the next
12 to 18 months, ESG research found that making use of virtual
machine replication to facilitate disaster recovery ranked second
behind consolidating more physical servers onto virtualization
platforms. (See the ESG research report, 2011 IT Spending
Intentions, published in January 2011, for details of the survey
results.)
Because server virtualization abstracts from the physical hardware
layer, it eliminates the need for identical hardware configurations
at production and recovery data centers, which provides several
benefits. And since virtualization is often a catalyst to refresh
the underlying infrastructure, there’s usually retired hardware on
hand. For some organizations that might not have been able to
secure the CapEx to outfit a DR configuration, there may be an
opportunity to take advantage of the “hand-me-down” hardware. Also,
by consolidating multiple applications on a single physical server
at the recovery data center, the amount of physical recovery
infrastructure required is reduced. This, in turn, minimizes
expensive raised floor space costs, as well as additional
power and cooling requirements. Leveraging the encapsulation and
portability features of virtual servers
aids in DR enablement. Encapsulating the virtual machine into a
single file enables mobility and allows multiple copies of the
virtual machine to be created and more easily transferred within
and between sites for business resilience and DR purposes—a
dramatic improvement over backing up data to portable media such as
tape and rotating media at a cold standby site. In addition,
protecting virtual machine images and capturing the system state of
the virtual machine are new concepts that weren’t available in the
physical world. In a recovery situation, there’s no need to
reassemble the operating system, reset configuration settings and
restore data. Activating a virtual machine image is a lot faster
than starting from a bare-metal recovery.
Flexibility is another difference. Virtualization eliminates the aforementioned need for a one-to-one physical mirror of a system for disaster recovery. IT has the choice of establishing physical-to-virtual (P2V) and virtual-to-virtual (V2V) failover configurations, locally and/or remotely, to enable rapid recovery without incurring the additional expense of purchasing and maintaining identical hardware. Virtualization also offers flexibility in configuring active-active scenarios (for example, a remote or branch office acts as the recovery site for the production site and vice versa) or active-passive (e.g., a corporate-owned or third-party hosting site acts as the recovery site, remaining dormant until needed).
Finally, virtualization delivers flexibility in the form of DR testing. To fully test a disaster recovery plan requires disabling the primary data center and attempting to fail over to the secondary. A virtualized infrastructure makes it significantly easier to conduct frequent nondisruptive tests to ensure the DR process is correct and the organization's staff is practiced in executing it consistently and correctly, including during peak hours of operation.
With server virtualization, a greater degree of DR agility can be achieved. IT's ability to respond to service interruptions can be greatly improved, especially with new automation techniques, such as those available for VMware virtualization technology (see "Automating DR in VMware environments," p. 17) and Microsoft System Center Virtual Machine Manager, which offers tools to determine which applications and services to restore in which order. Recovery
AUTOMATING DR IN VMWARE ENVIRONMENTS
VMware Inc. introduced a VMware vCenter management service in 2008
to automate, document and facilitate disaster recovery (DR)
processes. VMware vCenter Site Recovery Manager (SRM) turns manual
recovery runbooks into automated recovery plans, providing
centralized management of recovery processes via VMware vCenter.
SRM accelerates recovery, improves reliability and streamlines
management over manual DR processes.
VMware SRM automates the setup, testing and actual failover courses of action. With SRM, organizations can automate and manage failover between active-passive sites, a production data center (protection site) and a disaster recovery location (recovery site), or active-active sites, two sites that have active workloads and serve as recovery sites for each other.
SRM integrates with third-party storage- and network-based
replication solutions via a storage replicator adapter (SRA)
installed at both the primary and recovery sites. The SRA
facilitates discovery of arrays and replicated LUNs, and initiates
test and failover, making it much easier to ensure that the storage
replication and virtual machine configurations are established
properly. Datastores are replicated between sites via preconfigured
array- or network-based replication.
SRM doesn't actually perform data protection or data recovery, at least not yet. VMware pre-announced its forthcoming IP-based replication feature in SRM. It will be able to protect dissimilar arrays in local and remote locations, provide virtual machine-level granularity and support local (DAS or internal) storage. This opens up lots of possibilities for companies that don't have a SAN or don't want to be limited to a peered storage replication solution. Even those who have taken advantage of SRM with SAN-based replication between like storage arrays at production and recovery sites can extend recovery strategies to other tiers of workloads with an asynchronous solution.
can be quicker and the skills required by operations staff to
recover virtualized applications are less stringent.
USING STORAGE VIRTUALIZATION IN A DR PLAN As organizations become more comfortable with one form of virtualization, they don't have to make great intellectual or operational leaps to grasp the concept of virtualizing other data center domains. Often, IT organizations undertaking complete data center refresh initiatives position virtualization as a key part of the makeover and look to extract all possible efficiencies in one fell swoop by deploying virtualization in multiple technology areas. So it's not uncommon to see server virtualization combined with storage virtualization.
Like server virtualization, storage virtualization untethers data from dedicated devices. Storage virtualization takes multiple storage systems and treats those devices as a single, centrally managed pool of storage, enabling management from one console. It also enables data movement among different storage systems transparently, providing capacity and load balancing. In addition to lowering costs, improving resource utilization, increasing availability, simplifying upgrades and enabling scalability, the expected benefit of storage virtualization is easier and more cost-effective DR.
In a DR scenario, storage virtualization improves resource
utilization, allowing organizations to do more with less capacity
on hand. IT is likely to purchase and deploy far less physical
storage with thin, just-in-time provisioning of multiple tiers of
storage. By improving capacity utilization, organizations can
reduce the amount of additional capacity purchases and more easily
scale environments.
Virtualization allows storage configurations to vary between the
primary and the DR site. Flexibility in configuring dissimilar
systems at the production and recovery sites can introduce cost
savings (by allowing existing storage systems to be reclaimed and
reused), without introducing complexity. It also allows IT to
mirror primary storage to more affordable solutions at a remote
site, if desired.
Native data replication that integrates with the virtualized storage environment can provide improved functionality for virtual disaster
recovery. Remote mirroring between heterogeneous storage systems
(that is, more expensive at the primary site and less costly at the
recovery site) contributes to lower costs.
FINAL WORD ON VIRTUALIZATION Whether used singly or combined, server virtualization and storage virtualization are making an impact on IT's ability to deliver DR, and to deliver it cost effectively. If your company has been on the sidelines, crossing its collective fingers and hoping a disaster never strikes, it might be time to investigate virtualization. And if you have virtualization in place, you should have the basic elements for an effective and cost-efficient DR environment. It's time to take the next steps.

Lauren Whitehouse is a senior analyst focusing on backup and recovery software and replication solutions at Enterprise Strategy Group, Milford, Mass.
NOBODY WANTS TO pay for something they're not using, but enterprise data storage managers do it all the time. The inflexible nature of disk storage purchasing and provisioning leads to shockingly low levels of capacity utilization. Improving the efficiency of storage has been a persistent theme of the industry and a goal for most storage professionals for a decade, but only thin provisioning technology has delivered tangible, real-world benefits.
Thin provisioning in depth
Thin provisioning can help you use your disk capacity much more efficiently, but you need to get under the hood a little to understand how thin provisioning will work in your environment.
BY STEPHEN FOSKETT
The concept of thin provisioning may be simple to comprehend, but
it’s a complex technology to implement effectively. If an array
only allocates storage capacity that contains data, it can store
far more data than one that allocates all remaining (and
unnecessary) “white space.” But storage arrays are quite a few
steps removed from the applications that store and use data, and no
standard communication mechanism gives them insight into which data
is or isn’t being used.
Storage vendors have taken a wide variety of approaches to address
this issue, but the most effective mechanisms are difficult to
implement in existing storage arrays. That’s why next-generation
storage systems, often from smaller companies, have included
effective thin provisioning technology for some time, while
industry stalwarts may only now be adding this capability.
WHAT YOU SHOULD ASK ABOUT THIN PROVISIONING
WHEN EVALUATING a storage array that includes thin provisioning, consider the following questions, which reflect the broad spectrum of approaches to this challenge. Note that not all capabilities are required in all situations.
• Is thin provisioning included in the purchase price or is it an extra-cost option?
• Does the array support zero page reclaim? How often does the reclamation process run?
• What is the page size or thin provisioning increment?
• Does thin provisioning work in concert with snapshots, mirroring and replication? Is thick-to-thin replication supported?
• What does the array do when it entirely fills up? What's the process of alerting, freeing capacity and halting writes?
• Does the array support WRITE_SAME? What about SCSI UNMAP or ATA TRIM?
• Is there a VMware vStorage APIs for Array Integration (VAAI) "block zeroing" plug-in? Is it the basic T10 plug-in or a specialized one for this array family?
THIN ALLOCATION-ON-WRITE
Traditional storage provisioning maintains
a one-to-one map between internal disk drives and the capacity used
by servers. In the world of block storage, a server would “see” a
fixed-size drive, volume or LUN and every bit of that capacity
would exist on hard disk drives residing in the storage array. The
100 GB C: drive in a Windows server, for example, would access 100
GB of reserved RAID-protected capacity on a few disk drives in a
storage array.
The simplest implementation of thin provisioning is a
straightforward evolution of this approach. Storage capacity is aggregated into "pools" of same-sized pages, which are then
allocated to servers on demand rather than on initial creation. In
our example, the 100 GB C: drive might contain only 10 GB of files,
and this space alone would be mapped to 10 GB of capacity in the
array. As new files are written, the array would pull additional
capacity from the free pool and assign it to that server.
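The on-demand mapping described above can be sketched as a toy model in Python. This is purely illustrative: the page size, class name and data structures are invented for the example, not any vendor's design.

```python
class ThinPool:
    """Toy model of allocate-on-write thin provisioning: physical pages
    are mapped to a virtual LUN only when a write actually lands there."""

    def __init__(self, physical_pages, page_size=4096):
        self.page_size = page_size
        self.free_pages = list(range(physical_pages))
        self.mapping = {}  # (lun, virtual_page) -> physical page number

    def write(self, lun, offset, length):
        """Allocate backing pages for the written range on first touch."""
        first = offset // self.page_size
        last = (offset + length - 1) // self.page_size
        for vpage in range(first, last + 1):
            key = (lun, vpage)
            if key not in self.mapping:
                if not self.free_pages:
                    # the over-commit failure mode discussed below
                    raise RuntimeError("pool exhausted: over-committed")
                self.mapping[key] = self.free_pages.pop()

    def allocated_bytes(self):
        return len(self.mapping) * self.page_size


pool = ThinPool(physical_pages=1024)           # 4 MB of real capacity
pool.write(lun=0, offset=0, length=10 * 4096)  # server writes 40 KB
print(pool.allocated_bytes())                  # only 10 pages are backed
```

Note that rewriting an already-mapped range allocates nothing new; only first writes consume pages from the free pool.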
This type of “allocate-on-write” thin provisioning is fairly
widespread today. Most midrange and enterprise storage arrays, and
some smaller devices, include this capability either natively or as
an added-cost option. But there are issues with this
approach.
One obvious pitfall is that such systems are only thin for a time.
Most file systems use “clear” space for new files to avoid
fragmentation; deleted content is simply marked unused at the
file system layer rather than zeroed out or otherwise freed up at
the storage array. These systems will eventually gobble up their
entire allocation of storage even without much additional data
being written. This not only reduces the efficiency of the system
but risks “over-commit” issues, where the array can no longer meet
its allocation commitments and write operations come to a
halt.
That doesn’t suggest, however, that thin provisioning is useless
without thin reclamation (see “The enemies of thin,” p. 25), but
the long-term benefit of the technology may be reduced. Plus, since
most storage managers assume that thin storage will stay thin,
effectively reclaiming unused space is rapidly becoming a
requirement.
THE THIN RECLAMATION CHALLENGE
The tough part of thin provisioning
technology is reclaiming unused capacity rather than correctly
allocating it. Returning no-longer-used capacity to the
free pool is the key differentiator among thin provisioning
implementations, and the industry is still very much in a state of
flux in this regard.
The root cause of the thin reclamation challenge is a lack of
communication between applications and data storage systems. As
noted earlier, file systems aren’t generally thin-aware, and no
mechanism exists to report when capacity is no longer needed. The
key to effective thin provisioning is discovering opportunities to
reclaim unused capacity; there are essentially
THE ENEMIES OF THIN
"I MAY NEED 500 GB or more for this application," the DBA thinks, so just to play it safe she asks the storage administrator for 1 TB. The storage admin has the same idea, so he allocates 2 TB to keep the DBA out of his office. This familiar story is often blamed for the sad state of storage capacity utilization, but is that justified?
In most enterprise storage environments, poor capacity utilization can come from many sources:
• Annual and per-project purchasing cycles that encourage occasional over-buying of storage capacity that may never be used
• Ineffective resource monitoring and capacity planning processes that obscure capacity requirements
• Incomplete storage networking that strands capacity out of reach of the systems needing it
• Disjointed allocation procedures resulting in assigned-but-never-used storage capacity
• Inflexible operating systems and file systems that make it difficult to grow and shrink as storage demands change
Thin provisioning can be effective in many of these situations, but it's no magic bullet. Organizations with poor purchasing and capacity planning processes may not benefit much, and all the capacity in the world is useless if it can't be accessed over a segmented SAN. But even the most basic thin provisioning system can go a long way to repurpose never-used storage capacity.
two ways to accomplish this:
• The storage array can snoop the data it receives and stores, and attempt to deduce when an opportunity exists to reclaim capacity
• The server can be modified to send signals to the array, notifying it when capacity is no longer used
The first option is difficult to achieve but can be very effective, since it doesn't depend on operating system vendors, who don't seem eager to add thin-enhancing features to their file systems. Products like Data
Robotics Inc.’s Drobo storage systems snoop on certain known
partition and file system types to determine which disk blocks are
unused and then reclaim them for future use. But that approach is
extremely difficult in practice given the huge number of operating
systems, applications and volume managers in use.
Therefore, the key topic in enterprise thin provisioning involves
the latter approach: improving the communication mechanism between
the server and storage systems.
ZERO PAGE RECLAIM
Perhaps the best-known thin-enabling technology
is zero page reclaim. It works something like this: The storage
array divides storage capacity into “pages” and allocates them to
store data as needed. If a page contains only zeroes, it can be
“reclaimed” into the free-capacity pool. Any future read requests
will simply result in zeroes, while any writes will trigger another
page being allocated. Of course, no technology is as simple as
that.
Writing all those zeroes can itself be problematic. It
takes just as much CPU and I/O effort to write a 0 as a 1, and
inefficiency in these areas is just as much a concern for servers
and storage systems as storage capacity. The T10 Technical
Committee on SCSI Storage Interfaces has specified a SCSI command
(WRITE_SAME) to enable “deduplication” of those I/Os, and this has
been extended with a so-called “discard bit” to notify arrays that
they need not store the resulting zeroes.
Most storage arrays aren’t yet capable of detecting whole pages of
zeroes on write. Instead, they write them to disk, and a "scrubbing" process later detects these zeroed pages and discards them; until that happens, the pages still appear used. This process
can be run on an automated schedule or
THIN PROVISIONING AND TCO
COMPARING THE total cost of ownership (TCO) for enterprise storage solutions is controversial, with self-serving and incomplete models the norm for storage vendors. Before spending money on cost-saving, efficiency-improving technologies like thin provisioning, it's wise to create a model internally to serve as a reality check for vendor assumptions and promises.
A complete TCO includes more than just the cost of hardware and software—operations and maintenance, data center costs and the expenses associated with purchasing, migration and decommissioning storage arrays must be considered. And it's a good idea to consider the multiplier effect of inefficient allocation of resources: Leaving 1 GB unused for every one written doubles the effective cost of storage. With end-to-end storage utilization averaging below 25%, this multiplier can add up quickly.
Such cost models often reveal the startling fact that storage capacity on hard disk drives (or new solid-state disks or SSDs) is a small component of TCO—often less than 15% of total cost. But that doesn't mean driving better capacity utilization is a wasted effort. Eliminating the multiplier effect from inefficient utilization can have a far greater impact on TCO than merely packing more bits onto a disk drive.
Consider the operational impact of thin provisioning, as well as its mechanical impact on storage density. Thin systems may require less administration because capacity can be allocated without traditional constraints, but that could lead to a nightmare scenario of overallocated arrays running out of capacity and bringing apps to a halt. The best thin storage systems are also highly virtualized, flexible and instrumented, allowing improved operational efficiency and high utilization.
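The multiplier effect described in the sidebar is simple arithmetic; here's a quick sketch (the dollar figures are placeholders, not pricing data from the article):

```python
def effective_cost_per_gb(raw_cost_per_gb, utilization):
    """Effective cost of each gigabyte actually storing data when only
    `utilization` (a fraction between 0 and 1) of the purchased
    capacity ever holds data."""
    return raw_cost_per_gb / utilization

# "1 GB unused for every one written" means 50% utilization: cost doubles.
print(effective_cost_per_gb(1.00, 0.50))  # 2.0
# At the sub-25% end-to-end utilization cited above, each stored
# gigabyte effectively costs more than 4x its sticker price.
print(effective_cost_per_gb(1.00, 0.25))  # 4.0
```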
manually initiated by an administrator. And some arrays only detect
zeroed pages during a mirror or migration, further reducing
capacity efficiency.
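A background scrub of the kind described can be modeled in a few lines. This is purely illustrative: real arrays track candidate pages in metadata rather than rescanning stored data, and the page size here is arbitrary.

```python
PAGE = 4096  # assumed page size for the sketch

def scrub(pages):
    """Zero page reclaim sketch: pages containing only zero bytes are
    unmapped and returned to the free pool; returns pages reclaimed."""
    reclaimed = 0
    for page_no, data in list(pages.items()):
        if data == b"\x00" * PAGE:
            del pages[page_no]  # page goes back to the free pool
            reclaimed += 1
    return reclaimed

pages = {0: b"\x01" * PAGE, 1: b"\x00" * PAGE, 2: b"\x00" * PAGE}
print(scrub(pages))  # 2 zeroed pages reclaimed
print(len(pages))    # 1 page still holds real data
```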
BUILDING BRIDGES
Even if an array has a feature-complete zero page
reclaim capability, it will only be functional if zeroes are
actually written. The server must be instructed to write zeroes
where capacity is no longer needed, and that’s not the typical
default behavior. Most operating systems need a command, like Windows' "sdelete -c," or a tool such as NetApp's SnapDrive, to make this happen, and these are run only occasionally.
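What such a free-space zeroing tool does can be sketched roughly as follows. This is an assumed, simplified model of the technique, not sdelete's actual implementation; the byte cap keeps the demo small, where a real tool writes until the file system reports it's full.

```python
import os
import tempfile

def zero_free_space(directory, chunk=1 << 20, max_bytes=4 << 20):
    """Fill free space with zeroes so a thin array's zero page reclaim
    can later free it, then delete the fill file.  Returns bytes
    written (capped at max_bytes for this demo)."""
    written = 0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            while written < max_bytes:
                try:
                    f.write(b"\x00" * chunk)
                except OSError:      # disk full: all free space is now zeroed
                    break
                written += chunk
    finally:
        os.remove(path)              # the freed blocks remain zero-filled
    return written

print(zero_free_space(tempfile.gettempdir()))  # bytes zeroed (4 MB cap here)
```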
Some applications, including VMware ESX volumes, do indeed zero out new space, and ESX's "eagerzeroedthick" virtual disk format will even zero out capacity in advance. Although certain compatibility issues remain, notably with VMotion, ESX is becoming increasingly thin-aware. The
vStorage APIs for Array Integration (VAAI), added in ESX 4.1,
includes native “block zeroing” support for certain storage
systems. ESX uses a plug-in, either a special-purpose one or the
generic T10 WRITE_SAME support, to signal an array that VMFS
capacity is no longer needed.
Symantec Corp. is also leading the charge to support thin
provisioning. The Veritas Thin Reclamation API, found in the
Veritas Storage Foundation product, includes broad support for most
major storage arrays. It uses a variety of communication mechanisms
to release unneeded capacity, and is fully integrated with the VxFS
file system and volume manager. Storage Foundation also includes
the SmartMove migration facility, which assists thin arrays by only
transferring blocks containing data.
Thin awareness in other systems is coming more slowly. Another
standard command, ATA TRIM, is intended to support solid-state
storage, but it could also send thin reclamation signals, along
with its SCSI cousin, UNMAP. Microsoft and Linux now support TRIM,
and could therefore add thin provisioning support in the future as
well. They could also modify the way in which storage is allocated
and released in their file systems.
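A host-to-array reclamation signal of the TRIM/UNMAP variety reduces, in essence, to "these pages no longer hold live data." A minimal sketch follows; it is deliberately simplified, since the real SCSI and ATA commands carry block ranges with alignment and granularity rules the model ignores.

```python
class ThinLun:
    """Minimal model of host-to-array reclamation signaling in the
    spirit of SCSI UNMAP / ATA TRIM (names and structure invented)."""

    def __init__(self):
        self.mapped = set()  # virtual page numbers with backing store

    def write(self, page):
        self.mapped.add(page)

    def unmap(self, first_page, count):
        """Host signals that these pages no longer hold live data."""
        for p in range(first_page, first_page + count):
            self.mapped.discard(p)  # backing page returns to the free pool

lun = ThinLun()
for p in range(100):
    lun.write(p)
lun.unmap(20, 50)        # file system deleted 50 pages' worth of data
print(len(lun.mapped))   # 50 pages remain mapped
```

The point of the signal is that the array no longer has to deduce anything: the host states outright which capacity is reclaimable.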
GETTING THINNER
Thin provisioning is not without its challenges,
but the benefits are many. It’s one of the few technologies that
can improve real-world storage utilization
even when the core issue isn’t technology related. Indeed, the
ability of thin provisioning to mask poor storage forecasting and
allocation processes contributed to the negative image many,
including me, had of it. But as the technology improves and thin
reclamation becomes more automated, this technology will become a
standard component in the enterprise storage arsenal.
Stephen Foskett is an independent consultant and author
specializing in enterprise storage and cloud computing. He is
responsible for Gestalt IT, a community of independent IT thought
leaders, and organizes their Tech Field Day events. He can be found
online at GestaltIT.com, FoskettS.net and on Twitter at
@SFoskett.
Exchange 2010 and storage systems
The latest version of Exchange Server has some significant changes
that will impact the storage supporting the mail system.
BY BRIEN M. POSEY
WITH EXCHANGE SERVER 2010, Microsoft Corp. made
some major changes to the database structure that underlies the
email application. These architectural changes have a significant
impact on planning for Exchange Server’s data storage
requirements.
The biggest change Microsoft made was eliminating single-instance
storage (SIS). Previously, if a message was sent to multiple
recipients, only one copy of the message was stored within the
mailbox database. User mailboxes received pointers to the message
rather than a copy of the entire message.
The elimination of single-instance storage means that when a
message is sent to multiple recipients, each recipient receives a
full copy of the message. In terms
of capacity planning, the overall impact of this change will vary
depending on how many messages include attachments.
Text and HTML-based messages are typically small and will have a
minimal impact on capacity planning, and Microsoft further reduces
the impact by automatically compressing such messages. However, if
you have users who routinely send large attachments to multiple
recipients, those messages could have a major impact on database
growth. Microsoft’s primary goal in designing the new database
architecture was to decrease database I/O requirements. As such, Microsoft chose not to compress message attachments because of the additional I/O that would have been required to compress and decompress them.
It may seem odd that, at a time when storage managers are looking to reduce duplication in primary storage, Microsoft removed a data reduction feature from Exchange. But Microsoft scrapped
single-instance storage because Exchange mailbox databases perform
much more efficiently without it. Microsoft claims database I/O
requirements have been reduced by approximately 70% in Exchange
2010.
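The capacity effect of dropping single-instance storage is easy to model. This is a rough illustration only; it ignores Exchange's automatic compression of text and HTML bodies mentioned above.

```python
def stored_bytes(message_bytes, recipients, single_instance):
    """Capacity consumed when one message goes to `recipients` mailboxes,
    with and without single-instance storage (SIS)."""
    return message_bytes if single_instance else message_bytes * recipients

msg = 10 * 1024 * 1024          # a 10 MB attachment
print(stored_bytes(msg, 50, single_instance=True))   # pre-2010: one copy
print(stored_bytes(msg, 50, single_instance=False))  # Exchange 2010: 50 copies
```

This is why the article singles out large attachments sent to many recipients as the biggest driver of database growth under the new architecture.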
One of the most common methods of keeping Exchange 2010 mailbox
databases from growing too large is to use mailbox quotas. Quotas
prevent individual mailboxes from exceeding a predetermined size,
and the quotas in Exchange 2010 work as they did in previous
versions of Exchange with one notable exception. Exchange 2010
introduces the concept of archive mailboxes (discussed later). If a
user has been given an archive mailbox, the mailbox quota won’t
count the archive mailbox’s contents when determining how much
storage the user is consuming. Exchange does, however, let you
manage archive storage through a separate quota.
The use of mailbox quotas is a tried-and-true method for limiting
data storage consumption. But Microsoft has been encouraging
organizations to make use of low-cost storage rather than mailbox
quotas. The argument is that organizations can accommodate the
increased database size without spending a lot on expensive storage
solutions.
The low-cost storage recommendation is based on more than just
storage cost. Many organizations have been forced to set stringent
mailbox quotas that have forced users to delete important messages.
Ostensibly, cheaper storage will allow for larger mailbox quotas or
for the elimination of quotas altogether.
Previously, using lower-end storage subsystems in production Exchange Server environments was unheard of, but Exchange 2010's reduced I/O requirements make storage options such as SATA drives
practical. And Exchange Server 2010 is flexible in terms of the
types of storage it can use; it will work with direct-attached
storage (DAS) or storage-area network (SAN) storage (or with an
iSCSI connection to a storage pool). However, Microsoft does
prevent you from storing Exchange Server data on any storage device
that must be
CAN EXCHANGE SERVER'S ARCHIVING AND E-DISCOVERY REPLACE THIRD-PARTY PRODUCTS?
Prior to the release of Exchange Server 2010, an entire industry
emerged around creating archival and e-discovery products for
Exchange Server. Now that Exchange 2010 offers native support for
user archives and has built-in e-discovery capabilities, it seems
only natural to consider whether these new features can replace
third- party products.
Exchange 2010's e-discovery and archiving features may be sufficient for some smaller organizations, but they're not
enterprise-ready. The archiving and e-discovery features both have
limitations you won’t encounter with most third-party tools.
For example, Exchange 2010’s archive mailboxes aren’t a true
archiving solution. Archive mailboxes let users offload important
messages to a secondary mailbox that's not subject to strict retention policies or storage quotas. But if you want to do true
archiving at the organizational level you still must use Exchange’s
journaling feature. The journal works, but third-party archivers
provide much better control over message archival, retention and
disposal.
The situation’s the same for Exchange 2010’s multi-mailbox
e-discovery search feature. Multi-mailbox search has some major
limitations. For example, it can only be used with Exchange 2010
mailboxes, so you’ll still need a third-party product to search
legacy Exchange mailboxes or PSTs.
Multi-mailbox search also lacks some of the rich reporting options
and export capabilities commonly found in specialized e-discovery
products.
accessed through a mapped drive letter. So you won’t be able to
store a mailbox database on a network-attached storage (NAS)
system unless it supports iSCSI connectivity.
ADDITIONAL CONSIDERATIONS
Even though low-cost storage might
provide adequate performance, it’s still important to choose a
storage subsystem that also meets your organization’s reliability
requirements. For instance, if you opt for SATA storage, it’s best
to create a fault-tolerant SATA array. Microsoft recommends using
RAID 1+0 arrays. Some organizations use RAID 5 because it’s less
costly and still provides fault tolerance, but RAID 1+0 arrays
generally offer better performance.
It’s worth noting that database size can have a direct impact on
performance. As a general rule, mailbox databases on standalone
mailbox servers should be limited to 200 GB or less. If a mailbox
database grows larger than 200 GB, you may benefit from dividing
the database into multiple, smaller databases. For mailbox
databases that are part of a Database Availability Group, the
recommended maximum database size is 2 TB.
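The sizing guidance above reduces to a simple ceiling division. The numbers come from the article's recommendations; the helper name is invented for the example.

```python
def recommended_split(db_size_gb, in_dag=False):
    """How many mailbox databases to divide the data into, given the
    guidance above: 200 GB or less per database on a standalone
    mailbox server, 2 TB when the database belongs to a Database
    Availability Group."""
    limit_gb = 2048 if in_dag else 200
    return max(1, -(-db_size_gb // limit_gb))  # ceiling division

print(recommended_split(450))                # 3 databases on a standalone server
print(recommended_split(450, in_dag=True))   # 1 database is fine in a DAG
```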
DETERMINING STORAGE REQUIREMENTS
Determining the storage
requirements for an Exchange 2010 deployment can be a big job, but
Microsoft offers a free tool that can help. The Exchange 2010
Mailbox Server Role Requirements Calculator is an Excel spreadsheet
that calculates your Exchange storage requirements based on your
organization’s
Exchange usage. To use the Exchange
2010 Mailbox Server Role Requirements Calculator, fill in a series
of cells by answering questions related to the intended Exchange Server configuration and usage (see "Exchange 2010 Mailbox Server Role Requirements Calculator" screenshot at left). For instance, the spreadsheet asks questions about the average size of an email message
and the number of messages users send and receive each day.
Formulas built into the spreadsheet will use the information you
provide to determine the required storage architecture.
Keep in mind, however, that while the Exchange 2010 Mailbox Server
Role Requirements Calculator may be the best tool available for
estimating Exchange mailbox server storage requirements, the
recommendations it offers are only as accurate as the data you
provide. To compensate, Microsoft recommends you provision enough
disk space to accommodate at least 120% of the calculated maximum
database size.
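As a back-of-the-envelope cross-check on the calculator's output, the 120% guidance can be applied to a raw mailbox-count estimate. This is a gross simplification for illustration: the real spreadsheet also models message rates, retention, log space and more.

```python
def provisioned_gb(mailboxes, quota_gb, overhead_factor=1.2):
    """Disk to provision: maximum database size times the 120%
    safety factor Microsoft recommends (per the guidance above)."""
    return mailboxes * quota_gb * overhead_factor

# 500 mailboxes with a 2 GB quota -> provision at least 1,200 GB.
print(provisioned_gb(500, 2))
```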
EXCHANGE ARCHIVE MAILBOXES
There are other factors to consider that may impact your Exchange Server storage planning, such as whether you plan to implement user archive mailboxes, a new and optional feature. User archive mailboxes are secondary mailboxes that can be
used for long-term retention of messages. What makes archive
mailboxes different from other Exchange archiving methods is that
unlike a more traditional archive (such as a journal mailbox), the
user retains ownership of the items in the archive mailbox. As
such, each user’s archives are readily accessible.
Archive mailboxes are designed to take the place of PST files. But
unlike PST files, archive mailboxes are stored within a mailbox
database on the Exchange Server where they can be managed and
regulated by the Exchange administrator.
In the original RTM release of Exchange 2010, user archive mailboxes were in the same mailbox database as users' primary mailboxes. In SP1, Microsoft provided the option of relocating user archive mailboxes to a separate mailbox database that allows
the archives to be offloaded so they don’t impact the primary
mailbox storage.
Microsoft generally recommends placing the archive mailboxes on a low-end mailbox server that uses inexpensive direct-attached storage (such as a SATA array). Remember, if a mailbox database contains only archive mailboxes then it won't be subject to the same I/O load as a mailbox database that's
used to store the user's primary mailboxes. Another advantage to using low-cost storage for user archive mailboxes is that doing so makes it practical to set a high mailbox capacity quota on the archive mailboxes. (See "Can Exchange Server's archiving and e-discovery replace third-party products?" p. 34.)
JOURNAL JUGGLING
Another consideration to take into account is the
journal mailbox. If you use journaling to archive messages at the
hub transport level then all the archived messages are placed into
the journal mailbox.
I’ve never come across any Microsoft best practices for the
placement of journal mailboxes, but I like to put the journal
mailbox in its own mailbox database. This is because the journaling
process tends to be very I/O intensive and placing the journal
mailbox in a dedicated mailbox database ensures its I/O doesn’t
degrade the performance of the other mailbox databases. If all messages are journaled, locating the journal mailbox within the same store as the user mailboxes will double the I/O requirements because Exchange 2010 doesn't use single-instance storage. In other
words, journaling causes an extra copy of each message to be
created within the mailbox store.
If you were to create the journal mailbox in the same database as
the user mailboxes, it would have a major impact on the replication
process (assuming that database availability groups are being
used—see “Protecting Exchange Data,” p. 40).
Another advantage to locating the journal mailbox in a separate
mailbox database is that it makes it easy to manage storage quotas
and message retention based on mailbox function. You can create one
set of policies for user mailboxes and another set of requirements
for the journal mailbox.
DISCOVERY MAILBOX
The last type of mailbox you should consider when
planning for Exchange 2010 storage is the discovery mailbox. The
discovery mailbox is only used when a multi-mailbox search
(e-discovery) is performed. The search results are stored in the
discovery mailbox.
PROTECTING EXCHANGE DATA
Exchange Server has always been somewhat difficult to protect. If
you do a traditional nightly backup of your Exchange servers, a
failure could potentially result in the loss of a full day’s worth
of messages. For most companies, such a loss is unacceptable.
Exchange administrators have taken a number of different steps to
prevent substantial data loss. In Exchange 2007, for example, it
was a common practice to use continuous replication to replicate
mailbox data to another mailbox server. A continuous replication
solution provides fault tolerance and acts as a mechanism for pro-
tecting data between backups. (Of course, using a continuous data
protection solution such as System Center Data Protection Manager
is also a good option.)
Some observers feel Microsoft is working toward making Exchange
Server backups completely unnecessary. The idea is that Database
Availability Groups will eventually make Exchange resilient enough
that you won’t need backups.
Database Availability Groups are an Exchange 2010 feature that lets
you create up to 16 replicas of a mailbox database. These replicas reside on other mailbox servers, and it's even possible to
create database replicas in alternate data centers. Despite the
degree to which Database Availability Groups can protect mailbox
data, you shouldn’t abandon your backups just yet.
Having multiple replicas of each database makes it easier to protect Exchange Server, but if a mailbox database becomes corrupted
or gets infected with a virus, the corruption or viral code is
copied to the replica databases.
But Microsoft does offer a delayed playback feature in which lagged
copy servers are used to prevent transactions from being instantly
committed to replica databases. If a problem occurs, you’ll have
enough time to prevent the bad data from being committed to a
replica database. Once you've stopped the bad data from spreading, you can revert all your mailbox databases to match the state
of the uncorrupted replica.
While this approach sounds great in theory, Microsoft still has a
lot of work to do to make it practical. Right now the procedure
requires you to take an educated guess as to which transaction log
contains the first bit of corruption and then work through a
complicated manual procedure to prune the log files. So while
Exchange 2010’s storage architecture makes it easier to protect
your data by way of Database Availability Groups, you shouldn’t
rely on them as the only mechanism for protecting Exchange
data.
By default, the discovery mailbox is assigned a 50 GB quota. This
sounds large, but it may be too small for performing e-discovery in
a large organization.
When it comes to choosing a storage location for a discovery
mailbox, capacity is generally more important than performance.
While the e-discovery process is I/O intensive, the I/O load is
split between the database containing the user mailboxes and the
database holding the discovery mailbox.
If e-discovery isn’t a priority, then you may consider not even
bothering to create a discovery mailbox until you need it. If
that’s not an option, your best bet is to place the mailbox in a
dedicated mailbox database that lives on a low-cost storage system
with plenty of free disk space.
MORE PLANNING REQUIRED
Clearly, there are a number of
considerations that must be taken into account when planning an
Exchange Server storage architecture. Even though Exchange 2010
isn’t as I/O intensive as its predecessors, I/O performance should
still be a major consideration in the design process. Other
important considerations include capacity and fault tolerance.
Brien M. Posey is a seven-time Microsoft MVP for his work with
Exchange Server, Windows Server, Internet Information Server (IIS)
and File Systems/Storage. He has served as CIO for a nationwide
chain of hospitals and was once a network administrator for the
Department of Defense at Fort Knox.
AS DATA GROWTH and the costs associated with it keep rising,
leveraging storage infrastructure hosted by a service provider and
made available to subscribers over a network is gaining in
popularity. That means cloud storage resources are frequently being
combined with existing, on-premises backup technologies to provide
off-site copies for long-term retention and, in some cases, for
just- in-case-of-a-disaster copies. In addition, a few vendors are
attacking the issue of bandwidth and optimizing cloud backup
storage to ensure the implementation is up to the task and,
importantly, makes fiscal sense.
INTEREST IN CLOUD STORAGE ESG polled 611 IT professionals
responsible for evaluating, purchasing and/or operating corporate
IT and data centers in North America and Western Europe and found
61% were using or interested in using infrastructure as a service
(IaaS). With IaaS, the service provider owns the equipment and is
responsible for housing, running and maintaining it, with the
subscriber typically paying on a per-use basis. Subscribers have
access to a virtual pool of shared re- sources that promise
elasticity, so storage for off-site backup copies is avail- able on
demand. The costs associated with owning and managing resources at
a second site are reduced—and the need to maintain a secondary site
for off-site disk or tape storage is eliminated.
But backing up to and recovering from cloud storage may introduce
chal- lenges for large backup sets. The daily volume of backup
data, and the time needed to complete transfers, may require more
bandwidth and time than is available. IT organizations often
struggle with the tradeoff between the high costs of purchasing
more bandwidth and extending backup windows or recovery time.
hot spots | lauren whitehouse
Backing up to the cloud requires new approach to bandwidth
Can you use cloud storage for backup? Sure, but beware the
bandwidth and transfer issues
that can arise, and take note of the progress several key vendors
have made in this space.
OPTIMIZING BANDWIDTH FOR BACKUP
Alternatives exist to address these issues. Products from vendors such as NetEnrich, Pancetera Software and Riverbed Technology make a hybrid backup configuration (the combination of on-premises backup technologies and cloud-based storage services) more feasible via technologies that optimize bandwidth.
NetEnrich saw an opportunity to provide virtual storage in the cloud for EMC Data Domain customers. Subscribers use existing on-premises backup software and EMC Data Domain appliances to protect data locally. For an off-site copy in the cloud, backup data stored on the local EMC Data Domain appliance is replicated to NetEnrich's Data Recovery Vault, which is based on EMC Data Domain deduplication appliances at the NetEnrich data center. Data Domain dedupe and compression radically reduce the bandwidth required for remote replication of backup copies.
Pancetera Unite is a virtual appliance that dramatically reduces I/O and bandwidth usage for backup and replication of VMware virtual server environments. Pancetera's SmartRead and SmartMotion technologies optimize the capture and movement of virtual machines into the cloud. The product integrates with in-place backup environments to enable optimization without disrupting the status quo. i365 teamed with Pancetera, embedding the Pancetera Unite virtual appliance in i365's EVault data protection products. The combination reduces the overhead associated with VMware backups initiated by EVault, and optimizes data movement across the LAN or WAN. Pancetera can be combined with WAN acceleration technology to further accelerate transmissions.
Riverbed Whitewater cloud storage accelerator leverages the WAN optimization technology in existing Riverbed offerings to provide a complete data protection service to the cloud. Integrating seamlessly with existing backup technologies and cloud storage providers' APIs, the appliance-based product provides rapid replication of data to the cloud for off-site retention. A Riverbed Whitewater appliance is installed in the network to serve as the local target for backup jobs and is attached to the Internet for replication of data to Amazon S3 cloud storage. The deduplication and compression technologies that are the cornerstones of Riverbed products deliver WAN optimization and accelerate data transfer.
As users look to cloud storage services as a low-cost alternative to maintaining their own infrastructure, there are clear benefits:
• Provides a more cost-effective strategy than maintaining a corporate-owned and -operated secondary site.
• Eliminates capital and operating costs, including the acquisition and maintenance of hardware, data center floor space, and data center environmental factors such as power and cooling.
• Offers more predictable budgeting.
• Facilitates disaster recovery via a remote-based copy.
With bandwidth contributing significantly to the hybrid backup configuration bottom line, it makes sense to explore bandwidth-optimizing technologies such as deduplication, compression and WAN acceleration. The latest crop of products introduces hyper-efficiency in LAN/WAN transfer of data center-driven backup copies to cloud-based storage. This is a key area for data storage managers to focus on when considering their hybrid cloud scenarios.
Lauren Whitehouse is a senior analyst focusing on backup and
recovery software and replication solutions at Enterprise Strategy
Group, Milford, Mass.
read/write | arun taneja

Don't let the cloud obscure good judgment

While new and largely untested, cloud storage is likely to become a significant part of your data storage infrastructure.

Everything is "cloudy" these days. Hardly a day goes by without yet another player jumping on the cloud bandwagon. Some are legitimately tied to the cloud concept, but others are "cloud washing" or force-fitting their products to the cloud concept because they think if they don't they'll fall out of favor with IT users.

However, the questions I'm asked most by IT users are usually on the order of the following:

Our central IT supports several divisions, each of which also has its own IT. One division decided to make a deal with Amazon Web Services and transferred some data to S3 storage. Managers in another division have done deals with Nirvanix or Rackspace or AT&T Synaptic, and sent company data to them. What should we do? We don't want to suppress innovation, but we feel like we're losing control.

and . . .

Our storage vendor is asking us to create a private cloud using mostly the same products as before but now with additional federation products. Is the technology ready for building a private cloud?

Here's how I see it. The cloud is happening, whether you like it or not. It's a lot like what we saw with storage virtualization in 2000. I felt then that the concept had so much merit it was bound to happen, but it took much longer than seemed logical. That's simply the reality of IT. Even when a paradigm-shifting technology comes along, it takes time for it to get into daily use. The cloud is similar. Implemented correctly, it's supposed to improve storage utilization while allowing you to scale up or down at will. You can pay as you grow and enjoy an easy-to-use storage system. So, the question isn't why, but when and how.
FOLLOW THE CLOUD
My first piece of advice is don't fight the cloud. You'll need to develop in-house expertise to understand what cloud technology is, what's real and what's not, who's in the game and so on. Next, you'll want to experiment with public cloud offerings using data you can afford to mess around with. You can test the waters to see how scaling works, how services provide security, if data transfer speeds are adequate and so on. You'll also want to test out recovering files, full volumes and more. These tests should help you to develop guidelines you can provide to business divisions defining what data may or may not be sent outside the company, and how it needs to be managed. This will bring consistency to the enterprise while ensuring that innovation in cloud technology is being exploited.
Perhaps the easiest way to get into the game is to use a gateway product as an on-ramp to the cloud. You want to avoid writing your own cloud interface code even if you're familiar with the Web services APIs used by most cloud services. The gateway vendors have already done the heavy lifting and provide a standard way of interacting with existing applications (via NFS, CIFS, iSCSI, Fibre Channel), while accommodating the idiosyncrasies of each public cloud on the back end. Vendors in this category include Cirtas, Iron Mountain, LiveOffice, Mimecast (for email archiving), Nasuni, Nirvanix, StorSimple, TwinStrata and Zetta among others.
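The translation a gateway performs is easiest to see in miniature. The Python sketch below is purely illustrative (all class and method names are ours, and the in-memory back end stands in for a real provider's API): file-style reads and writes on the front end become object PUTs and GETs on the back end, which is the work the gateway vendors have already done for each cloud's particular interface.

```python
class InMemoryObjectStore:
    """Stand-in for a cloud provider's object-storage API.
    A real back end would issue authenticated HTTP requests instead."""
    def __init__(self):
        self.objects = {}

    def put(self, key: str, data: bytes):
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]

class CloudGateway:
    """Toy gateway front end: accepts file-style calls and translates
    them into object operations against the configured back end."""
    def __init__(self, backend):
        self.backend = backend

    def _key(self, path: str) -> str:
        # Flatten a file path into an object key
        return path.lstrip("/")

    def write_file(self, path: str, data: bytes):
        self.backend.put(self._key(path), data)

    def read_file(self, path: str) -> bytes:
        return self.backend.get(self._key(path))
```

A production gateway layers caching, deduplication, encryption and per-provider quirks on top of this basic mapping, which is exactly the heavy lifting you avoid by not writing your own interface code.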
BUILDING YOUR OWN CLOUD
For a private cloud, find out what your primary storage vendor is planning. Vendors are at different stages of product development and availability. EMC seems to be ahead right now, having announced and shipped VPLEX, an important federation technology that's crucial in building large clouds. But all major storage vendors have serious plans to deliver private storage cloud products and services. Not surprisingly, each wants you to build your private cloud almost exclusively with components from them, but from my perspective, no one in the market has all the pieces yet. You may consider other alternatives. Nirvanix, for instance, has created something it calls hNode, or a hybrid node. Essentially, it lets you create a private cloud using the same software Nirvanix uses for its own Storage Delivery Network (SDN); this would allow your private cloud to interface with a public cloud based on the Nirvanix architecture.
LONG-TERM CONSIDERATIONS
Whatever route you decide to take, keep in mind that it's one of the most strategic decisions you'll make. Once you sign on with a vendor you're likely to be locked in for a long time.

Vendors are all in learning mode today, just as we are. So take the time to study and experiment before jumping headlong onto the bandwagon.
Arun Taneja is founder and president at Taneja Group, an analyst
and consulting group focused on storage and storage-centric server
technologies. He can be reached at
[email protected].
snapshot

Fibre Channel still top dog among disks

With 6 Gb SAS disks and emerging solid-state products stirring up the data storage pot, and concerns about ever-escalating capacities, more attention is turning toward drive technologies. In our survey of more than 200 storage users, 69% had Fibre Channel (FC) arrays installed, helping FC remain the most widely installed storage array type. But at 62%, NAS is closing in on FC, and DAS, often overlooked or discounted, was the third most popular alternative. iSCSI SANs and multiprotocol arrays are each used by approximately one-third of respondents. FC disks account for 54% of all installed disks, followed by SATA and SAS. But that mix will likely change as respondents shop for the average 55 TB of disk capacity they expect to add this year. Storage managers' shopping carts will be filled with nearly equal parts of SATA (52%), FC (50%) and SAS (47%) disks. In addition, storage buyers expect SATA/SAS and PCIe solid-state storage to make up 13% and 11%, respectively, of their storage purchases. —Rich Castagna
“I’m waiting for the concept of ‘hybrid’ drives with a small amount
of their own SSD—popular in the desktop world now—to make it into
enterprise storage.”
—Survey respondent
[Snapshot survey charts: which type of drive is used for the company's highest storage tier; storage systems currently installed (Fibre Channel SAN 69%, iSCSI SAN 33%); average TB of disk capacity to be added in 2011; and the mix of currently installed disk types vs. 2011 planned additions.]
MAY

Automated Storage Tiering
Storage tiering is gaining renewed interest, largely due to the emergence of solid-state storage as a viable alternative for high-performance storage. But automated tiering doesn't just benefit the high end; it's an effective way to make efficient use of all installed storage resources. We'll describe which vendors are offering automated tiering capabilities and how they work.

Storage Purchasing Intentions
Over the last eight years, Storage magazine and SearchStorage.com have fielded a twice-yearly survey to determine the purchasing plans of storage professionals. This article reports and analyzes the results of the latest edition of the survey and provides insight into emerging trends.

Blueprint for Cloud-Based Disaster Recovery
Cloud storage services may seem perfect for disaster recovery (DR) planning, especially for smaller firms that may not have the resources for collocation facilities. But there's much more to plan for than just tucking a firm's data into a safe place in the cloud; getting it back when it's needed may be the key to whether cloud DR is an appropriate choice.

And don't miss our monthly columns and commentary, or the results of our Snapshot reader survey.
Editorial Director Rich Castagna
Creative Director Maureen Joyce
Steve Duplessie, Jacob Gsoedl, W. Curtis Preston
Executive Editor Ellen O’Brien
Senior News Director Dave Raffo
Senior News Writer Sonia Lelii
Features Writer Carol Sliwa
Editorial Assistant Allison Ehrhart
Managing Editor Heather Darcy
Features Writer Todd Erickson
TechTarget Conferences Director of Editorial Events Lindsay
Jeanloz
Editorial Events Associate Jacquelyn Hinds
Storage magazine Subscriptions: www.SearchStorage.com