
Green Data Center Storage

SYMANTEC TECHNOLOGY NETWORK

Sustained Cost-Effectiveness in Transitioning Times

Bruce Naegel, John Colgrove, W. David Schwaderer

October 02, 2007

Symantec™ Solutions


Executive Summary

Why You Should Read This Paper

In May 2006, IDC reported its 2005 data center survey results in a telebriefing titled End User Perspectives on Server Blade Adoption 1. According to this telebriefing, qualified survey respondents ranked their top three data center issues, in order, as 2:

1. Power provisioning
2. Floor space
3. Power consumption

Exacerbating matters, respondents also expected both processing requirements and power requirements to increase, with power requirements rising 25 percent by 2009, while other independent estimates suggest approximately a doubling every five years.

Subsequently, on November 29, 2006, Gartner, Inc. issued a news release 3 titled:

50 Percent of Data Centers Will Have Insufficient Power and Cooling Capacity by 2008.

Power, space, and cooling: exceed any of these three constraints within a data center, and the enterprise may well face building a new data center, sometimes for as much as $1,000 per square foot.

Hardware power and cooling problems are clearly a major and growing concern.

Yet, today, Gartner research vice president Rakesh Kumar estimates traditional data centers typically waste more than 60% of the energy they use to cool just their equipment. 4

The good news is that effective software utilization can significantly help reduce hardware power consumption and consequent cooling requirements. While more energy-efficient power supplies, processors, chipsets, and cooling solutions are beginning to address these problems, and will eventually do so, in the short term enterprises must vigilantly oversee data center power and cooling issues. To that end:

Symantec contends its software products can help customers reduce data storage energy consumption by as much as 50 percent and total data center energy consumption by as much as 25 percent. Moreover, Symantec software allows enterprises to exploit new energy-efficient servers and storage more easily, enabling them to continue conserving energy optimally.

This paper explains how CIOs can potentially save tens of millions of dollars in advance of potential governmental data center energy conservation mandates and tax penalties.

Symantec invites enterprises to request a free ROI analysis on the total savings Symantec software can bring to their specific data center facilities. We look forward to the ensuing engagements.

1 http://www.idc.com/getdoc.jsp?containerId=TB20060525
2 http://poland.emc.com/techlib/pdf/H2402_power_efficiency_storage_array_wp_ldv.pdf
3 http://www.gartner.com/it/page.jsp?id=499090
4 http://www.gartner.com/it/page.jsp?id=498224


Executive Overview

Today’s data center professionals are contending with unprecedented challenges that were largely unforeseen a short time ago. Collectively, the challenges invariably involve Data Center electrical energy, namely its:

• Decreasing Reliability
• Decreasing Availability
• Increasing Cost Per Watt
• Consumption Patterns
• Generated Heat Disposal

The situation is so pressing that enterprises increasingly find they cannot obtain additional power at any cost. 5

Indeed, power and cooling limitations are now usually the greatest data center growth impediment in addition to floor space availability. The problem is highlighted by a November 7, 2006 Business Week article titled Energy Could Eat Up to 40% of IT Budgets:

Large organisations spend between four and eight per cent - and sometimes as much as 10 per cent - of their IT budget on energy and Gartner predicts this will rise by up to four times within the next five years. 6

One 2006 data center survey conducted by power supply vendor Liebert found:

• 33% of respondents expect to be out of power and cooling capacity by the end of 2007
• 96% stated they would be out of capacity by 2011. 7

Unfortunately, traditional data center provisioning methodologies and simplistic rules-of-thumb that worked in the past are now proving inadequate, both economically and operationally. Worse, they likely provide a direct path to unpleasant near-term enterprise surprise as well as long-term disruptions that could be highly visible both internally and externally.

Industry consensus suggests that, if not efficiently and effectively addressed, these challenges will inevitably limit data center growth and flexibility that enterprise competitiveness requires. Indeed, the IT industry must now solve these many significant energy problems using a different level of thinking than the one used to create them.

That’s some of the news. Daily headlines suggest more: namely, the situation is already volatile and rapidly getting worse. Corporate compliance regulators and adversarial, litigious interests would likely agree, at your enterprise’s expense. Because significant change and its careful planning can require substantial lead time to effect, it is important to begin addressing these obvious challenges now, particularly within data storage systems.

This Symantec white paper presents a preliminary storage energy analysis and conservation methodologies for both data center servers and storage subsystems. The discussion enables IT professionals to identify and avoid unnecessary energy consumption, with a focus on data center storage subsystems. This discussion enables readers to determine how much energy they can potentially save within their data centers. The Symantec Technology Network 8 will post subsequent updates to this paper.

5 http://www.businessweek.com/technology/content/may2007/tc20070514_003603.htm
6 http://www.businessweek.com/globalbiz/content/nov2006/gb20061107_447569.htm
7 http://www.enterprisestorageforum.com/management/features/article.php/3639286

Clearly, significant change is at hand, requiring strong leadership in unstructured, transitional times. The good news is that efficient energy consumption and conservation is not only economically beneficial and possible, it also provides indisputable evidence of executive commitment to social responsibility in a world of increasing ecological awareness.

8 http://www.symantec.com/enterprise/stn/index.jsp


Table of Contents

Executive Summary
    Why You Should Read This Paper
    Executive Overview
Table of Contents
A Brief Instructional Retrospective
Your Data Center
Enterprise Data
Harnessing Enterprise Data Growth
Different kinds of Data
Combining Data with Disk Types
Disk Drive Growth
Data Growth Tyranny
MAID System Overview
Three Tier Savings Potential
Aggressive Dynamic Storage Tiering (DST)
Server Energy Savings
    Reduced High Availability Cost
    Managing Virtualization
Evolving To Green Data Center Storage
Conclusion
Call To Action
Appendix A – Estimating Storage Subsystem Electrical Power and Cooling Costs
Appendix B – RAID Overview
Appendix C – CC Storage Analysis Tool
Appendix D – A Sample 50% Data Storage Power Consumption Reduction


A Brief Instructional Retrospective

From 1949 to 1966, Joseph Eichler’s company, Eichler Homes, built approximately 11,000 Northern and Southern California homes. Achieving remarkable architectural acclaim and commercial success, these uncommonly attractive homes were spacious, often bathed inside with natural light from floor-to-ceiling glass rear and inner-courtyard walls. Even today, many passers-by find the homes visually appealing, if not stunning. Beyond looks, Eichler homes included numerous innovations, including concrete floors with embedded copper pipes to circulate heated water for unobtrusive radiant heating. Best of all, the acquisition entry price was relatively modest considering the truly superb livability and privacy. To many, Eichler Homes indisputably built best-of-class dwellings in their day.

Fast forward to today. Unfortunately, Eichler Homes constructed homes in an era when monthly home electrical bills were often ten dollars or less. Moreover, contemporary nuclear power pundits suggested that nuclear energy’s unlimited potential and availability would make energy so inexpensive that it would cost too much to meter it. But, as Niels Bohr once observed:

Prediction is very difficult, especially about the future.

Today, an original unmodified Eichler Home is an unqualified energy efficiency and monthly electric bill disaster. With no motivation to conserve energy, Eichler homes had roofs without insulation and used thin, single pane glass in windows and glass walls. Moreover, exterior wooden wall insulation was an option that buyers sometimes did not elect. In other words, Eichler homes heated their neighborhoods on cool days.

To rectify the energy efficiency problems, subsequent Eichler Home owners have insulated roofs and exterior walls, as well as replaced glass walls with other materials such as energy-efficient double pane glass or acrylics. To address a natural failure of the floor’s radiant heating system, some owners have resorted to other heating approaches, including retrofitting new gypsum flooring with heated water pipes when they wanted to preserve the traditional Eichler ambiance. In short, Eichler home energy efficiency problems have usually proven solvable. It’s just a matter of money, maybe more than the house originally cost in present value dollars. And, sadly, so it is with many data centers today…


Your Data Center

Many people, previously unaware of how energy inefficient original Eichler homes were, often gently chuckle when they learn the facts. However, what these people are possibly unaware of is that an analogous situation exists in most enterprise data centers. And, there are few chuckles to be had there because, if it does exist, it threatens the owning enterprise’s growth and vitality.

You see, electrical inefficiency unnecessarily consumes electric energy that society now considers a shared, finite resource. In an age of increasing transparency and Internet blogging, the public’s umbrage can be widespread, varied, harsh, and swift. Here, it is useful to recall that Ida M. Tarbell’s investigative journalism for McClure’s Magazine over a century ago eventually led to the government-dictated Standard Oil Company dismemberment, much to the chagrin of John D. Rockefeller, arguably history’s wealthiest American.

Here, it is useful to note that Congress’s December 20, 2006 Public Law 109–431 9 authorized the U.S. Environmental Protection Agency to study and promote the use of efficient computer servers in the United States under its Energy Star Program. On April 23, 2007, the EPA released its Public Review Draft, which is informative but cannot be cited at this time 10 because of its draft status. Finally, consider that various legislative discussions suggest banning low-efficiency incandescent lighting. It now seems certain that governmental oversight is inevitable.

From a civil litigation perspective, consider that the United States Supreme Court has now ruled that carbon dioxide and other greenhouse gases are air pollutants under the Clean Air Act, and that the current U.S. administration has violated law by refusing to limit carbon dioxide emissions.

It follows that adversarial litigants may now assert that enterprises failing to rectify data center power waste that needlessly generates carbon dioxide are committing a dereliction of their corporate governance and social responsibility duties. It’s really that serious.

The good news is that it is easy to initiate changes to eliminate data center power waste though admittedly harder to implement the changes. However, the solutions can easily pay for themselves by reclaiming underutilized data center resources. This can delay or subsidize future capital expenditures while eliminating considerable energy OPEX costs.

We begin by discussing data.

9 http://energystar.gov/ia/products/downloads/Public_Law109-431.pdf
10 http://www.energystar.gov/ia/products/downloads/EPA_Server_Report_Public_Draft_23Apr.pdf


Enterprise Data

Data is a deceptively simple-sounding, two-syllable word. About all that some people know about data is that enterprises derive vital information from data they must safely store in ways that permit only authorized users to read and perhaps change it. Other than that, it’s a puzzlement, as it should be.

But data needs careful protection because it is stored using mechanical devices such as hard disk drives that have a tendency to fail at inconvenient times. As recent events have shown, preventable data loss can be extraordinarily inconvenient when associated with litigation discovery processes. Courts have historically adopted a predisposition against oops as an acceptable data loss explanation.

Finally, authorized users need to access data for their purposes according to Service Level Agreements (SLAs), pre-agreed conventions that establish response time levels between user groups and the IT organization.

Barricading Data Against Loss

Because mechanical devices fail and thereby place data at risk, the IT industry uses various approaches that allow data storage subsystems to provide substantial immunity to the failure of one or more disks. You’ve likely heard of one: it’s called Redundant Array of Independent Disks, or RAID for short.

There are many RAID variants – RAID 0, RAID 1, RAID 1+0, RAID 5, and RAID 6 being the most well known. In the event you are unfamiliar with RAID, Appendix B contains a brief RAID overview.

For our purposes here, RAID technical details are not important. Each RAID variant has its own advantages, achieved by trading different amounts of storage space for different levels of performance and immunity to failure. But the important point is this:

When any RAID scheme provides protection against disk failure, there is a difference between the total amount of data an enterprise has and how much disk space the data occupies.

RAID protection always requires more space than unprotected data requires.

• With simple RAID 1 support (mirroring), the required space is 200 percent (twice) what unprotected data requires.

• With RAID 5 support that protects the capacity of four disks using the capacity of a fifth, the required space is 125 percent (5/4) of what unprotected data requires.

Again, Appendix B has RAID technology information including a discussion on how to calculate effective storage utilization for various RAID configurations.
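The space arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `raid_space_factor` and its parameters are hypothetical and are not part of any Symantec tool. It assumes a simple model in which a RAID group is described by its count of data disks and redundancy disks.

```python
from fractions import Fraction


def raid_space_factor(data_disks: int, redundancy_disks: int) -> Fraction:
    """Raw disk space required per unit of unprotected data (hypothetical helper)."""
    if data_disks <= 0 or redundancy_disks < 0:
        raise ValueError("need at least one data disk and no negative redundancy")
    return Fraction(data_disks + redundancy_disks, data_disks)


# RAID 1 (mirroring): one mirror disk for every data disk.
raid1 = raid_space_factor(1, 1)
# RAID 5 (4+1): one disk of parity protects four disks of data.
raid5 = raid_space_factor(4, 1)

for name, factor in (("RAID 1", raid1), ("RAID 5 (4+1)", raid5)):
    print(f"{name}: {float(factor) * 100:.0f} percent of the unprotected size")
# RAID 1: 200 percent of the unprotected size
# RAID 5 (4+1): 125 percent of the unprotected size
```

The same ratio applies to other group sizes, so wider RAID 5 groups trade a smaller space overhead for more disks sharing one unit of parity.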


Harnessing Enterprise Data Growth

A primary challenge with enterprise data is that it continues to grow. This is partly a consequence of the increasing scope of the IT mission as well as regulatory compliance statutes that require electronic communication retention. It’s also a consequence of data naturally increasing with record and file size increases (emails, attachments, richer documents, larger spreadsheets, etc.). One potential opportunity for substantial power savings involves the amount of inactive data existing on systems. Indeed, Symantec often finds that an enterprise has provisioned more than five times the storage its active data requires.

As another example, email usually exhibits explosive growth in any enterprise. Moreover, because email and other electronic systems proliferate identical copies of files, one way to economize storage consumption is to use de-duplication technologies. Here, programs such as Symantec’s Enterprise Vault 11 email and Veritas NetBackup PureDisk 12 packages can make significant contributions.

One common estimate is that enterprise data grows at a 70% compound annual growth rate (CAGR) or more. This discussion uses a 70% CAGR because that often proves conservative. However, some enterprises experience a 130% growth rate, and your enterprise likely experiences a different rate yet.

Figure 1 shows how serious a problem a conservative 70% growth rate can be by using carefully constructed triangles. Just glancing at the figure suggests that without proper data storage capacity planning and provisioning, in a few years, your enterprise’s data is probably not going to fit within your storage subsystems. 13 If so, there is little time to waste: carpe diem.
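For readers who want to reproduce the growth arithmetic, the following Python sketch compounds a 70% CAGR over three years and also computes the net growth that remains once the 50% device capacity growth mentioned in footnote 13 is netted out. The function names are hypothetical and chosen only for this illustration.

```python
def projected_size(current: float, cagr: float, years: int) -> float:
    """Data size after compounding `cagr` annually for `years` years."""
    return current * (1.0 + cagr) ** years


def net_device_growth(data_cagr: float, capacity_cagr: float) -> float:
    """Annual growth in required devices when per-device capacity also grows."""
    return (1.0 + data_cagr) / (1.0 + capacity_cagr) - 1.0


# Multiples of today's data at a 70% CAGR.
for year in range(4):
    print(f"year {year}: {projected_size(1.0, 0.70, year):.2f}X current data")
# year 0: 1.00X current data
# year 1: 1.70X current data
# year 2: 2.89X current data
# year 3: 4.91X current data

# 70% data growth against 50% device capacity growth.
print(f"net device growth: {net_device_growth(0.70, 0.50):.0%}")
# net device growth: 13%
```

Substituting an enterprise's own measured growth rate for 0.70 shows how quickly the planning horizon changes.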

Finally, comprehensive data center data storage assessments invariably reveal substantial unutilized, unallocated, hence collectively wasted, storage capacity. An excellent tool to help perform such discovery efforts is Symantec’s Veritas CommandCentral Storage 14 (CC Storage), a tool that identifies waste and file usage patterns.

Appendix C contains a brief overview and example CC Storage screen captures.

11 http://www.symantec.com/enterprise/products/overview.jsp?pcid=1018&pvid=322_1
12 http://www.symantec.com/enterprise/products/overview.jsp?pcid=1018&pvid=1381_1
13 Note that a 50% CAGR in storage device capacity partially offsets the severity of the continuing data growth challenge. Thus, an adjusted growth rate in required storage devices would be approximately 1.7/1.5, or about 13% per year, in this discussion.
14 http://www.symantec.com/enterprise/products/overview.jsp?pcid=1020&pvid=19_1


Figure 1 – The effect of a 70% data growth rate

Different kinds of Data

As indicated earlier, IT organizations support their enterprises according to pre-determined SLAs. Some SLA terms require extremely fast access to selected data while allowing slower access for other data. Other SLAs might specify required data retention periods before the data can be discarded. Yet other SLAs might specify different access speeds depending on time periods, such as high-speed access during quarter-end and year-end reporting and slower access during other periods.

We now define transaction data as data that currently experiences high read, write, and update activity and therefore requires high performance storage. Non-transaction data is other data that does not currently experience such high activity and can reside on lower performance storage.

We have mentioned that Appendix B shows how different forms of RAID provide different performance levels. As it turns out, there are multiple types of hard disks that can also help meet the varying objectives that SLAs dictate. 15

Two generally recognized groups are:

High performance disks – These expensive disks provide low-latency (fast) responses to read/write requests and high, sustained transfer rates (throughput). In every possible manner, this type of disk is highly optimized for high performance and well suited to store transaction data, that is, data involved in high transaction rate applications.

15 While the IT industry attempts to place any disk into one of two groups, the large population of available models provides a continuum of performance and capacity tradeoffs.

[Figure 1 labels: Current Data (now); 1.7X in 1 year; 2.89X in 2 years; 4.9X in 3 years; each step represents 70% growth.]


Typically, high performance disks have relatively modest capacities – 36.8 or 73.4 Gbytes at the time of this writing. This modest capacity helps accelerate performance in several ways.

Internally, the storage platters can be very small, allowing them to spin faster and generate less air friction heat. The smaller platters naturally decrease the distance the read-write arm moves to perform operations. In addition, the faster rotation rate and faster electronics increase the media transfer rate, which increases the throughput.

Finally, smaller capacities and higher sustained transfers allow applications to read all data on a disk in a relatively short time, say, less than 20 minutes. This is important in backup operations and RAID data recovery rebuild operations.

But, high performance disks require relatively substantial power for their modest storage capacities. And this substantial power generates substantial heat. High performance disks are therefore both expensive to buy and expensive to operate. The only visible relief in this situation is an eventual migration to 2.5” high performance disks which can use as much as 40 percent less power.

High capacity disks – These relatively inexpensive disks are highly optimized to provide substantially more (e.g., 10X) storage capacity than high performance disks while consuming lower total power. Their storage platters are bigger and spin slower, thereby reducing air friction heat and helping keep them from tearing themselves apart.

Consequently, high capacity disks have higher access latency and lower sustained transfer rates. But because they require substantially less power, their Watts per GByte ratio is much more favorable than that of high performance disks, though they can rarely match the performance level high performance disks exhibit.

Historically, the IT industry has regarded high capacity disks as being less reliable than the high performance disks enterprises often use. Interestingly enough, a recent Carnegie Mellon University Computer Science Department report titled Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? suggests that high capacity disks have reliability comparable to high performance disks, much to the surprise of the IT industry. The report’s authors indicate that they detected “little difference in replacement rates between SCSI, Fibre Channel and SATA drives, potentially an indication that disk-independent factors, such as operating conditions, affect replacement rates more than component specific factors.”

Combining Data with Disk Types

We have seen that different enterprise data has different application response and protection requirements, dictated by associated SLAs. Moreover, different disk types have different performance and capacity characteristics. Finally, different RAID variants provide different performance and protection levels.

Transaction data is well suited to RAID 0+1 protection using performance-oriented drives, requiring 100 percent more disk capacity.

Non-transaction data is well suited to RAID 5 or RAID 6 protection using capacity-oriented drives. RAID 5 (4+1) systems require 25 percent more disk capacity and RAID 6 (4+2) systems require 50 percent more disk capacity.

While it can prove tempting to place transaction data on capacity-oriented drives, it is important to know that the drives may not physically allow applications to meet the application SLAs. This is a career-limiting mistake to make.


Figure 2 illustrates the concept.

Figure 2 – Matching Data Types with Appropriate Drives to meet SLAs

In examining Figure 2, there are two storage tiers resulting from various SLA requirements. The upper tier is called Tier 1 and the lower tier is called Tier 2.

Note that this is a rather simplistic rendering because there are a variety of disk drive models with different performance and capacity tradeoffs as well as several RAID variants that can collectively combine to provide a continuum of different application performance levels.

[Figure 2 diagram labels: RAID 0+1 mirrored transaction data on performance drives; RAID 5 (4+1) non-transaction data on capacity drives, with each stripe shown as blocks 1, 2, 3, 4, and ck.]


In addition, Figure 2 fails to depict the snapshot and backup information that automated processes create for disaster and error recovery. Figure 3 therefore depicts a closer look at Tier 1, representing backup and snapshot data as shaded figures.

Figure 3 – Mirrored Tier 1 data from Figure 2 with recovery snapshots and backup copies unnecessarily consuming Tier 1 capacity.

Now, the best snapshot and backup information an enterprise can have is snapshot and backup information that is never used because potential failures never occurred. Therefore, recovery data can often reside in Tier 2 where it is available with reasonable access speed and more economically stored.

However, the reality is that such snapshot and backup information often resides in the same tier where the original information resides, unnecessarily consuming copious high performance storage capacity while simultaneously consuming increasingly expensive electric power. Both resources are usually in short supply, becoming scarcer and more expensive every day.

The solution is to migrate as much data as possible as soon as possible from the top tier to the lower tier. This is easily accomplished by using Symantec’s Storage Foundation Dynamic Storage Tiering (DST) feature, Veritas Volume Manager (VxVM), and CC Storage products. These products interoperate with many leading storage subsystem offerings from different vendors that provide Tier 1 and Tier 2 data storage support.

Dynamic Storage Tiering provides continuous migration support for file system data (not data contained within database structures) using automated methods that respond to changing conditions. Conversely, Veritas Volume Manager and CC Storage provide scheduled data migration services for both file system data and data contained within database structures. Collectively, these offerings provide effective enterprise data migration capability back and forth between tiers.


Symantec also has other products (e.g., CC Storage and Storage Foundation Volume Manager) that can identify storage migration candidates as well as tools to help migrate the data. These will be described in more detail in an update to this white paper.

Too Many Data Copies

Enterprises usually retain too many data copies. Mirroring may protect mission critical application data, which is also probably protected by another replica in a disaster recovery (DR) data center. The DR replica is likely also mirrored, making four total copies at this point.

Next, enterprises often have a surprising number and variety of snapshots as well as other data copies. Some copies:

• enable accelerated backup and recovery
• are available for data mining or end-of-period processing
• enable data transformations to move data between applications
• exist as safety precautions before various administrative operations
• are available for testing and development

Often snapshot operations create the copies, leaving them in the same high performance disk array as the original data. Often they are also mirrored or otherwise protected at the same RAID level as the production data because it is operationally easier to configure snapshot storage the same way as the production storage.

In typical customer environments Symantec often sees from 10 to 30 copies of each production data byte. Many of these copies are no longer needed or are misplaced and the enterprise does not even know they still exist. Clearly, reducing the number of data copies reduces storage capacity requirements and storage power consumption. Once reduced, Symantec’s Veritas Volume Manager (VxVM) can move snapshots and other copies from Tier 1 performance arrays to Tier 2 capacity-oriented arrays.
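To see how copy counts climb into this range, consider the following Python sketch. It assumes a simplified model that is not taken from any Symantec product: every site holds one primary copy and a number of snapshots, and mirroring doubles whatever it protects. The function and parameter names are hypothetical.

```python
def total_copies(sites: int, primary_mirrored: bool,
                 snapshots_per_site: int, snapshots_mirrored: bool) -> int:
    """Count physical copies of each production byte under a simple model."""
    primary = 2 if primary_mirrored else 1        # mirroring doubles the primary
    per_snapshot = 2 if snapshots_mirrored else 1  # and each snapshot, if enabled
    return sites * (primary + snapshots_per_site * per_snapshot)


# Mirrored production data plus a mirrored DR replica, no snapshots.
print(total_copies(sites=2, primary_mirrored=True,
                   snapshots_per_site=0, snapshots_mirrored=False))  # 4

# Add three snapshots per site, configured with the same mirroring.
print(total_copies(sites=2, primary_mirrored=True,
                   snapshots_per_site=3, snapshots_mirrored=True))   # 16
```

In this model, a handful of mirrored snapshots at each site is enough to push the total well past ten copies.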

Pareto’s Law

Since data accesses often follow Pareto’s Law (80% of data accesses involve only 20% of the data), aggressively migrating effectively dormant application and application recovery data from Tier 1 to Tier 2 usually meets application performance SLAs while liberating substantial Tier 1 capacity, sometimes as much as 80%. The trick is to know what data to migrate and when. This is possible by using Symantec analysis tools such as CC Storage. CC Storage enables Symantec customers to identify storage for migration, saving them both capital costs and power after the migration.
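The selection idea can be illustrated with a short Python sketch. This is not how CC Storage works internally; it is a minimal example, assuming that per-file access counts are already available, and every name and sample figure in it is hypothetical. It ranks files by access count, keeps the most active files until they cover 80% of all accesses, and treats the remainder as Tier 2 migration candidates.

```python
from typing import Dict, List, Tuple


def split_hot_cold(access_counts: Dict[str, int],
                   hot_share: float = 0.80) -> Tuple[List[str], List[str]]:
    """Return (hot, cold) file lists; hot files cover `hot_share` of accesses."""
    total = sum(access_counts.values())
    target = hot_share * total
    hot: List[str] = []
    cold: List[str] = []
    covered = 0
    # Visit files from most to least accessed.
    ranked = sorted(access_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ranked:
        if covered < target:
            hot.append(name)      # still needed to reach the access target
            covered += count
        else:
            cold.append(name)     # candidate for migration to Tier 2
    return hot, cold


# Hypothetical access counts gathered over some observation period.
sample = {
    "orders.db": 500,
    "invoices.db": 300,
    "q1_report.xls": 100,
    "old_mail.pst": 60,
    "archive.tar": 40,
}
hot, cold = split_hot_cold(sample)
print("stay on Tier 1:", hot)    # ['orders.db', 'invoices.db']
print("move to Tier 2:", cold)   # ['q1_report.xls', 'old_mail.pst', 'archive.tar']
```

A real assessment would also weigh file sizes, SLA terms, and period-end access patterns, which this sketch deliberately ignores.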


As indicated earlier, through purpose-built construction, the lower Tier 2 usually cannot exhibit the high-speed application performance that the upper tier exhibits. So, migrating data from Tier 1 to Tier 2 must proceed judiciously to ensure continuous SLA fulfillment.

In this respect, migration can seem like performing in-flight airplane engine maintenance. Symantec has therefore developed a first-approximation tool that measures file accesses to help identify what data should migrate. Appendix C briefly discusses this tool, which is available on the Symantec Technology Network (STN).

If desired, released storage capacity can be removed to save data center electricity and space. Alternatively, consolidated and unused Tier 1 storage can be powered off and left in place, thereby saving as much as 80% of Tier 1’s relatively high electrical requirements. Leaving the unused storage in place when possible provides future and emergency Tier 1 storage expansion while deferring capital outlays.

Disk Drive Growth

While disk storage capacity naturally increases with each areal density (bit density) increase, data is growing far faster and will be for the foreseeable future. Hence, the total number of disk drives an enterprise needs necessarily grows commensurately simply from storage capacity requirements.

However, performance-oriented disks are not increasing their I/O operations per second (IOPs) nearly as fast as their capacity. Consequently, this further increases the required number of disks, because many applications have increasing IOPs requirements in addition to increasing storage needs. Since the number of disk drives and the amount of array subsystem DRAM cache primarily determine storage power consumption, the power consumption situation will likely become worse. Here, it is important to note that most Tier 1 arrays presently use 36 GByte or 73 GByte disk drives because of stringent IOPs SLA performance requirements. Though new performance-oriented Tier 1 arrays may arrive with 146 GByte drives, many data centers also likely have existing legacy arrays in service that use 36 GByte drives or, perhaps, even 18 GByte drives.

From a power and cooling perspective, it may appear tempting to replace, say, eight older arrays that use 18 GByte drives, with one new array that uses 146 GByte drives and has the equivalent storage capacity with 1/8th the drives. However, 18 GByte drive IOPs performance nearly equals 146 GByte drive IOPs performance. So the IOPs performance loss would likely be too great with such a new, single array. Finally, other considerations such as the actual environmental impact of building new equipment to replace working, less­efficient equipment further exacerbate migration challenges.
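
The IOPs loss from such a consolidation can be estimated with a short sketch. The inputs are illustrative assumptions, not measurements: eight arrays of 14 drives each, and 174 IOPs per drive for both the old and the new drives (the text only says the two are nearly equal).

```python
import math


def consolidate(old_drive_count, old_drive_gb, new_drive_gb, iops_per_drive):
    """Compare drive count and aggregate IOPs before and after consolidation.

    Assumes old and new drives each deliver the same IOPs.
    """
    capacity_gb = old_drive_count * old_drive_gb
    new_drive_count = math.ceil(capacity_gb / new_drive_gb)
    return {
        "new_drive_count": new_drive_count,
        "old_iops": old_drive_count * iops_per_drive,
        "new_iops": new_drive_count * iops_per_drive,
        "iops_retained": new_drive_count / old_drive_count,
    }


# Hypothetical inputs: eight 14-drive arrays of 18 GByte drives.
result = consolidate(old_drive_count=8 * 14, old_drive_gb=18,
                     new_drive_gb=146, iops_per_drive=174)
print(result)
```

Under these assumptions the single new array holds the same data on one eighth of the drives, and therefore retains only about one eighth of the aggregate IOPs.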

Data Growth Tyranny

Because of continuing data growth, any existing data storage subsystem will eventually exceed available capacity unless more capacity becomes available. However, the data center may eventually not have enough space or electric power to support expanded storage capacity. This is the point where enterprises begin to consider building a new data center that can easily cost tens or hundreds of millions of dollars over time.

Clearly what is required is a different level of thinking than the incremental one used to arrive at this situation. The incremental solution involves expanding Tier 2 capacity with extremely high capacity disks. But that consumes increasing space and power because data is growing faster than disk capacity is.

Moreover, unlike Tier 1 storage, traditional Tier 2 storage has no place to migrate data to that would free Tier 2 storage capacity and allow enterprises to remove or power off storage devices. An alternative solution augments traditional Tier 2 storage with a storage resource that is optimized to provide high capacity while simultaneously consuming minimal electrical power. That approach is available in the form of Copan Systems’ Massive Array of Inactive Disks (MAID) technology.

MAID System Overview

The Storage Networking Industry Association (SNIA) defines a MAID system as:

A storage system comprising an array of disk drives that are powered down individually or in groups when not required.

Copan’s MAID system also provides massive storage capacity in minimal space by housing disks in a three dimensional array versus a two dimensional array found in many storage systems. Like all MAID systems, Copan’s massive storage system consumes a small fraction of the electric power that equivalent capacity, traditional Tier 2 storage systems require.

Three Tier Savings Potential

In effect, MAID systems provide a third data storage tier to migrate data to from the second tier in traditional two-tiered systems. Data residing on a MAID system is not archived in the traditional sense because it is readily accessible to applications as though it were on a Tier 2 high capacity data storage subsystem.

The difference is that MAID storage subsystems electrically power off idle disks and power them back on when an application needs access to the dormant data. This approach is similar to sensors that turn room lighting on and off depending on whether anyone is in the room. There is a slight initial access delay, usually less than 30 seconds, before the data is available. Thereafter, applications usually enjoy standard Tier 2 access performance.

Because of this slight delay, the type of data that a MAID system best stores is data that must be readily available, albeit with a longer access latency, but is not likely to experience high access rates. The storage industry refers to online data with such negligible access activity as persistent data.

Aggressive Dynamic Storage Tiering (DST)

With a MAID Tier 3 data storage system and aggressive migration to lower tiers using Veritas Storage Foundation’s Dynamic Storage Tiering (DST) feature, it is possible to significantly reduce the amount of data stored in traditional Tier 1 and Tier 2 configurations.

Within many enterprises today, Tier 1 might have 20% of the total data and Tier 2 might have 80%. Through aggressive DST, this can become, say, 5% in Tier 1 and 95% in Tier 2. Introducing MAID storage as a third tier and applying aggressive DST to Tier 2 may then result in 5% in Tier 1, 20% in Tier 2, and 75% in Tier 3, with substantially reduced electrical power requirements.

While Tier 2 may use the same capacity drives as a Tier 3 MAID system, therefore exhibiting a similar storage density (i.e. space efficiency expressed as GBytes per data center square foot) as Tier 3, Tier 3 substantially reduces the electrical power traditional Tier 2 requires because it stores infrequently accessed data on disks that are usually powered off.

Power-wise, Tier 3 would therefore be more economical than Tier 2 storage because it requires significantly fewer watts per GByte. Note that it is also possible for Tier 3 storage to reside in a different data center, thereby further reducing space and electrical requirements in crowded or underpowered primary data center facilities. Finally, it is important to approximate the potential power savings between tiers. Since power is proportional to the number of disk drives, their characteristics (IOPS and capacity), and individual power consumption, Tier 2 is often 8 times more efficient than Tier 1 (6x the capacity per drive times a 30% power difference, plus a little cache). Similarly, Tier 3 is often 6 times as efficient as Tier 2 (it uses the same disks but powers most disks off most of the time).

Therefore, it seems reasonable that combining this tier efficiency difference and improving utilization likely saves more than half the power used for storage in almost any environment.
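
The effect of the tier efficiency ratios on the example data distributions can be approximated with a rough sketch. It assumes relative power per stored GByte of 1 for Tier 1, 1/8 for Tier 2, and 1/48 for Tier 3 (1/8 times 1/6), and it ignores controllers, cache, and utilization changes; the dictionary and function names are illustrative only.

```python
# Relative power per stored GByte, normalized to Tier 1 = 1.0 (assumed ratios).
REL_POWER_PER_GB = {"tier1": 1.0, "tier2": 1.0 / 8, "tier3": 1.0 / 8 / 6}


def relative_power(distribution):
    """Weighted relative power for a {tier: fraction_of_data} distribution."""
    total = sum(distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("tier fractions must sum to 1.0")
    return sum(frac * REL_POWER_PER_GB[tier] for tier, frac in distribution.items())


before = {"tier1": 0.20, "tier2": 0.80}
after = {"tier1": 0.05, "tier2": 0.20, "tier3": 0.75}

savings = 1.0 - relative_power(after) / relative_power(before)
print(f"Estimated storage power savings: {savings:.0%}")
```

With these assumed ratios the estimate comes out at roughly 70%, which is consistent with the claim of saving more than half the storage power.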

Server Energy Savings

While this paper focuses on saving power through optimized data capacity utilization by storage class, there are also substantial savings available from optimizing and reallocating servers. These savings fall into two areas:

- Reduced High Availability Cost
- Managing Virtualization

Reduced High Availability Cost

Most high availability configurations consist of one active server and one passive standby server. This approach provides simplicity, high availability, and quick switchover time. However, it is possible to save money on both server costs and power by using one passive standby server to back up multiple active servers (as many as 32, for example).

If one assumes a high availability configuration uses only 7 active servers and one passive standby server, the calculation is:

• One-to-one: 7 active servers and 7 passive servers = 14 servers

• Seven-to-one: 7 active servers and 1 passive server = 8 servers, 6 eliminated

% Savings = (14 servers - 8 servers) / 14 servers = 43%

Such efforts can generate considerable energy savings considering that a single server can consume $600 to $1,200 per year in power and cooling costs. Note that such configurations are available for both Veritas Cluster Server (VCS) and Veritas Application Director (VAD).
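
The same calculation can be generalized to other cluster sizes with a short sketch. The function name and result keys are illustrative; the $600 to $1,200 annual cost range is the per-server figure quoted above.

```python
import math


def standby_consolidation(active_servers, actives_per_standby,
                          annual_cost_range=(600, 1200)):
    """Compare one-to-one standby with N-to-one standby configurations."""
    one_to_one_total = 2 * active_servers
    standbys = math.ceil(active_servers / actives_per_standby)
    shared_total = active_servers + standbys
    eliminated = one_to_one_total - shared_total
    low, high = annual_cost_range
    return {
        "servers_eliminated": eliminated,
        "fraction_saved": eliminated / one_to_one_total,
        "annual_savings_usd": (eliminated * low, eliminated * high),
    }


example = standby_consolidation(active_servers=7, actives_per_standby=7)
print(example)
```

For the seven-to-one example this gives 6 eliminated servers, about 43% fewer servers, and $3,600 to $7,200 per year in avoided power and cooling costs.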

Managing Virtualization

Server virtualization allows for better processing load distribution across server hardware, therefore optimizing server hardware and power consumption. Server virtualization is available from a number of vendors including HP (for HP-UX), IBM (for AIX), Microsoft (for Windows), Sun (for Solaris), VMware (for Windows and Linux), and Xen (for Linux).

Three Symantec products help manage virtualization software:

1. Altiris
2. Veritas Application Director 16 (VAD) in Release 1.1
3. Veritas Cluster Server 17 (VCS) in current releases

Note that this support is designed to expand over time and that more information on these approaches will appear in future versions of this white paper.

16 http://www.symantec.com/enterprise/products/overview.jsp?pcid=1020&pvid=1221_1
17 http://www.symantec.com/enterprise/products/overview.jsp?pcid=1019&pvid=20_1

Evolving To Green Data Center Storage

A green data center uses green data center servers and storage. So, it is important to ensure your data center contains both types of resources.

There are many steps and considerations involved in transforming a traditional data center into a green data center. Here are some of them:

First, determine precisely how much power a data center is using and how the power company generates the power. Note that many factors, including weather, affect consumption rates.

In addition, develop operating plans that include minimum energy growth and even stepped reductions for power shortage brownouts of varying severity and duration. While the required effort and cost to conserve energy may appear as a problem today, ad hoc attempts to run a data center with a power rationing allotment at 50% of today’s power are probably a much larger problem.

Next, it is important to determine how the energy is consumed: voltage and power conversion losses; switches; routers; blade server racks; storage subsystem bays; water chillers; Heating, Ventilation, and Air Conditioning (HVAC); etc. This usually is very difficult to do but is important if a data center hopes to receive rip-and-replace electric power conservation rebates, is participating in a carbon credit program, or is projecting financial returns. The potential fractional benefit of green storage systems will immediately be obvious after this step.

Next, it is important to assess the cooling system, fixing obvious problems such as inappropriate cold air leakage with caulking, blocking rack open spaces with filler panels, and using newly available sponge plugs that seal up raised floor power and cable cutouts. Various vendors offer a variety of thermal assessment services that also review rack placement and HVAC vent positioning as well as provide what-if scenario planning. Keep in mind that losing cooling for more than one minute destroys some rack equipment (one minute ride-through time).

Next, turn off unused equipment and identify older, inefficient server equipment that can be upgraded to newer, more efficient configurations or completely eliminated through virtualization consolidation.

Enterprises often poorly utilize their storage with too many data copies. Moreover, if both storage capacity utilization and IOPs utilization are low, then an enterprise is likely powering and cooling storage it doesn’t need. Here, Symantec’s CC Storage product can help track storage utilization and find unused storage.

As you repopulate your data center with new devices, be sure to fully amortize the hardware that directly supports disk drives. By this, we mean doing things such as fully populating disk enclosure devices, ordering storage bays in increments as they are required, and using longer Fibre Channel loop lengths on lower performance storage systems.

Keep in mind that every eliminated watt now consumed by a device eliminates a corresponding air conditioning watt. It also reduces the power required by stand­by generators and the wattage required by Disaster Recovery sites that may be required to replicate multiple failing sites simultaneously. Finally, dispose of eliminated equipment through equipment reclamation vendors.

Conduct an in-depth Information Lifecycle Management analysis to identify feasible ways to perform aggressive Dynamic Storage Tiering (DST). Veritas Volume Manager (VxVM) and its Dynamic Storage Tiering features can help migrate data to newer, more power-efficient storage solutions.

Engage with your storage subsystem vendors to determine ways to further reduce energy consumption. Many have power consumption estimators that will prove useful in planning efforts.

Finally, a variety of organizations exist to help with power efficiency and cooling issues. These include:

The Green Grid – http://www.thegreengrid.org/

The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) – http://www.ashrae.org/

The Uptime Institute – http://www.upsite.com/

80 Plus – http://www.80plus.org/

Monitoring these organizations for new developments and technologies can help address power and cooling issues.

Conclusion

Acute data center power and cooling concerns are an increasingly pervasive industry problem. How much an enterprise can save by evolving to green data centers varies by enterprise. However, reducing electrical power consumption is an enterprise imperative for numerous reasons in addition to financial ones.

From a governance perspective, increased efficiency can help avoid significant OPEX power cost as well as the tremendous headaches and cost when more power is required. The time and cost required to dig a new utility line to run a data center are tremendous, and doing so is increasingly not feasible.

Successful energy consumption reduction not only enables enterprises to reduce operating costs, it also delays costs associated with new, unnecessary equipment and building new data centers that are increasingly expensive.

Because of increasing public awareness of environmental issues, it likely is advantageous to consider energy conservation expenditures more as a prudent insurance policy premium rather than as an investment that must meet a specific Return on Investment (ROI) threshold. Power consumption differences will always exist between different equipment generations. Servers are presently receiving tremendous focus and enterprises are buying more power efficient servers. But storage will become an increasing power consumption concern.

Usually, enterprises will have older equipment in service for some time. Future servers will be still more power efficient, rendering current “power efficient” servers relatively inefficient. The challenge will be to migrate enterprise data agilely to newer devices and subsystems while maintaining SLAs. As storage vendors make increasingly power-efficient disks, analogous equipment replacement lifecycle considerations will appear.

Creating cost-effective storage for green data centers is initially achieved through careful planning and evolution. Over time, cost-effective storage is not a project; it is a sustained approach nurtured by a larger energy conservation culture and executive leadership. Thus, being green isn’t a one-time event or destination; it’s an ongoing process that effectively utilizes the most power-efficient equipment possible in the most effective way.

As energy efficient processors and virtualization techniques begin addressing server energy consumption, storage energy efficiency will become increasingly important. Because data usage patterns are much more complex than server utilization patterns, it is important to begin evolving to green data storage now.

In the final analysis, though hardware power and cooling requirements are the source of many Data Center problems, Symantec software can help remediate them. Thus, addressing Data Center power and cooling challenges requires a total end-to-end system perspective.

As a hardware-neutral software provider with software supporting heterogeneous systems, Symantec is uniquely positioned to partner with enterprises seeking to reduce storage system energy costs.

Again:

Symantec contends its software products can help customers reduce data storage energy consumption by as much as 50 percent and total data center energy consumption by as much as 25 percent. Moreover, Symantec software allows enterprises to exploit new energy-efficient servers and storage more easily, enabling them to continue conserving energy optimally.

Call To Action

Symantec invites enterprises to request a free ROI energy conservation analysis on the total savings Symantec software can bring to their specific data center facilities. We look forward to the ensuing engagements.

Appendix A – Estimating Storage Subsystem Electrical Power and Cooling Costs

This appendix illustrates a general methodology to estimate power and cooling energy requirements for data storage subsystems.

We begin by considering how many hard disks of various capacities are required to store a PetaByte (PB) of data where one PB is 1,000,000 GBytes (GB) of data.

Important first simplification to note:

For preliminary calculation ease, we now assume any disk (with any interface with any capacity with any rotational speed) requires 10 Watts to operate. We compensate for this clearly incorrect simplifying assumption later.

Since there are 1,000 Watts in a Kilowatt (KW), we have Table A-1.

Individual Disk       Number of Disks Required to Store 1 PB    KW Required to Store 1 PB
Formatted Capacity    (1,000,000 GB / Formatted Capacity)       (Assuming 10 Watts/Disk)

18.3 GB               54496                                     544.96
36.7 GB               27248                                     272.48
73.4 GB               13624                                     136.24
100 GB                10000                                     100.00
146.8 GB              6812                                      68.12
300 GB                3334                                      33.34
500 GB                2000                                      20.00
750 GB                1334                                      13.34
1000 GB               1000                                      10.00

Table A-1 – KW required to power 1 PB data on powered-on disks, assuming 10 Watts/Disk

Important second simplification to note:

For preliminary calculation ease, we now assume each KWH costs $.10 (U.S.). We compensate for this clearly incorrect simplifying assumption later.

Using one KW for one hour consumes one Kilowatt hour (KWH), the standard electrical billing unit. Assume each KWH costs $.10 (U.S.), again waiting to correct this simplifying assumption. Since there are 8,766 hours in a year (24 * 365.25), we have Table A-2.

Individual Disk       KWH per Year Required to Store 1 PB    $ Cost (U.S.) per Year Required to Store 1 PB
Formatted Capacity    (Assuming 10 Watts/Disk)               (Assuming 10 Watts/Disk and $0.10 per KWH)

18.3 GB               4,777,120                              $477,712.00
36.7 GB               2,388,560                              $238,856.00
73.4 GB               1,194,280                              $119,428.00
100 GB                876,600                                $87,660.00
146.8 GB              597,140                                $59,714.00
300 GB                292,258                                $29,225.80
500 GB                175,320                                $17,532.00
750 GB                116,938                                $11,693.80
1000 GB               87,660                                 $8,766.00

Table A-2 – Cost to power 1 PB data on powered-on disks, assuming 10 Watts/Disk

Now, every KW that data center devices consume requires a corresponding data center cooling equipment KW to remove the generated heat. Therefore we have Table A-3.

Individual Disk       $ Cost (U.S.) per Year Required to Power and Cool 1 PB
Formatted Capacity    (Assuming 10 Watts/Disk and $0.10 per KWH)

18.3 GB               $955,424
36.7 GB               $477,712
73.4 GB               $238,856
100 GB                $175,320
146.8 GB              $119,428
300 GB                $58,452
500 GB                $35,064
750 GB                $23,388
1000 GB               $17,532

Table A-3 – Cost to power and cool 1 PB data on powered-on disks, assuming 10 Watts/Disk and $0.10 (U.S.) per KWH
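
The three tables follow from a few lines of arithmetic, sketched below under the same simplifying assumptions (10 Watts per disk, $0.10 per KWH, one cooling KW per device KW). The function and key names are illustrative. Disk counts are rounded up to whole disks, which matches the tabulated values; the 18.3 GB row matches only if an unrounded capacity of 18.35 GB is assumed.

```python
import math

HOURS_PER_YEAR = 24 * 365.25  # 8,766 hours
PB_IN_GB = 1_000_000


def table_row(capacity_gb, watts_per_disk=10.0, usd_per_kwh=0.10):
    """Compute one row of Tables A-1 through A-3 for a given disk capacity."""
    disks = math.ceil(PB_IN_GB / capacity_gb)   # Table A-1, disk count
    kw = disks * watts_per_disk / 1000.0        # Table A-1, KW
    kwh_per_year = kw * HOURS_PER_YEAR          # Table A-2, KWH
    power_usd = kwh_per_year * usd_per_kwh      # Table A-2, cost to power
    return {
        "disks": disks,
        "kw": kw,
        "kwh_per_year": kwh_per_year,
        "power_usd": power_usd,
        "power_and_cooling_usd": 2 * power_usd,  # Table A-3
    }


for capacity in (36.7, 73.4, 100, 146.8, 300, 500, 750, 1000):
    row = table_row(capacity)
    print(f"{capacity:>7} GB  {row['disks']:>6}  {row['kw']:>7.2f} KW  "
          f"${row['power_and_cooling_usd']:>10,.0f}")
```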

We now correct the simplifying assumptions and account for other real-world factors:

1. Different disk models, even with the same storage capacity, have different Watt requirements.

• If a disk typically uses 12 watts, then it uses 1.2 times the power (12 actual / 10 assumed) the above calculations assumed.

• If a disk typically uses 15 watts, then it uses 1.5 times the power (15 actual / 10 assumed) the above calculations assumed.

• If a disk typically uses 16 watts, then it uses 1.6 times the power (16 actual / 10 assumed) the above calculations assumed.

• If a disk typically uses 18 watts, then it uses 1.8 times the power (18 actual / 10 assumed) the above calculations assumed.

Moreover, a given disk can consume 30% more power when it is busy than when it is idle.

2. How much power a disk array consumes depends on its work load. Symantec’s performance lab recently performed an informative test that measured the power a disk array containing 14 drives consumed.

• The array controller alone and 14 powered-off disks consumed 68 watts.

• The array controller and seven idle, spinning disks consumed 170 watts.

• The array controller and 14 idle, spinning disks consumed 255 watts.

• The array controller and 14 disks performing sequential reads consumed 280 watts.

• The array controller and 14 disks performing violent seeks consumed 350 watts.

3. Your enterprise’s power rate is not likely $0.10/KWH. If it is, say, $0.12/KWH, then your enterprise pays 1.2 times as much [($0.12/KWH) / ($0.10/KWH)] as the above calculations consider.

4. Different disks offer different advantages such as differing performance, power consumption, reliability, and storage capacity. Therefore it is natural to expect that enterprise data centers use different disk models for different applications with any specific application not necessarily requiring 1 PetaByte (1,000 TeraBytes) of storage. Assume a single disk model stores 675 Terabytes of data. It follows that this is .675 PB of data (675/1,000).
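
The lab measurements in item 2 above can be turned into approximate per-disk figures. The sketch below assumes the controller draws a constant 68 watts regardless of load and that, in the seven-disk measurement, the remaining seven disks were powered off; neither assumption is stated in the test description.

```python
CONTROLLER_WATTS = 68.0  # controller with all 14 disks powered off

# label: (number of spinning disks, total measured watts)
MEASUREMENTS = {
    "7 idle": (7, 170.0),
    "14 idle": (14, 255.0),
    "14 sequential reads": (14, 280.0),
    "14 violent seeks": (14, 350.0),
}


def watts_per_disk(active_disks, total_watts, controller_watts=CONTROLLER_WATTS):
    """Average per-disk draw after subtracting the assumed controller draw."""
    return (total_watts - controller_watts) / active_disks


per_disk = {label: watts_per_disk(n, w) for label, (n, w) in MEASUREMENTS.items()}
for label, watts in per_disk.items():
    print(f"{label:<20} {watts:5.1f} W per disk")
```

Under these assumptions an idle spinning disk draws roughly 13 to 15 watts and a heavily seeking disk roughly 20 watts, which illustrates why a single fixed watts-per-disk figure is only a starting point.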

Table A-3 shows that it requires $119,428 (U.S.) to power and cool 1 PB of data on disks that provide 146.8 GB capacity and require 10 Watts power. However, suppose an application stores 675 TeraBytes of data on hard disks each providing 146.8 GB of storage. Moreover, suppose that each disk requires 15 Watts to operate and that your enterprise pays $0.12/KWH.

We now have three correction factors to the $119,428 result from Table A-3:

1. Watts Correction Factor – 1.5
2. Power Rate Correction Factor – 1.2
3. Stored Data Correction Factor – .675

But, there are two more:

1. Power Supply efficiency – Power supplies are not 100 percent efficient when they convert one form of power (e.g. 110 V) into another (e.g. 5 V). The lost power results in radiated heat, usually inside the data center. If a power supply is 85% efficient, it requires 1/.85 or 1.176 power units to produce 1 power unit at the new power level.

2. Support electronics – Disk drives reside in disk enclosures containing cooling fans and other electronics. The disk enclosures populate storage bays connected by electronic links to storage controllers. This disk drive support infrastructure also requires power and cooling in addition to what the disk drives require. In this example, we use a conservative 18% (1.18) levy for this infrastructure.

Thus, for our example application, the total cost to operate, power, and cool the storage subsystem is:

$119,428 * 1.5 * 1.2 * .675 * 1.176 * 1.18 –or– $201,359

If an enterprise uses 14 Watt, 750 GB drives to store 2.5 PB of data and pays $0.115 per KWH, the power and cooling cost calculation would be:

$23,388 * 1.4 * 1.15 * 2.5 * 1.176 * 1.18 –or– $130,632
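
The five correction factors can be combined in a small helper, sketched here with illustrative names. The defaults (85% power supply efficiency, 18% support electronics levy, and the 10 Watt and $0.10 per KWH baseline) are the values used in this appendix; the base cost is taken from Table A-3 for the chosen disk capacity.

```python
def adjusted_annual_cost(base_usd_per_pb, watts_per_disk, usd_per_kwh, stored_pb,
                         psu_efficiency=0.85, support_overhead=0.18,
                         assumed_watts=10.0, assumed_usd_per_kwh=0.10):
    """Scale a Table A-3 base cost by the five correction factors."""
    factors = {
        "watts": watts_per_disk / assumed_watts,
        "power_rate": usd_per_kwh / assumed_usd_per_kwh,
        "stored_data": stored_pb,
        "power_supply": 1.0 / psu_efficiency,
        "support_electronics": 1.0 + support_overhead,
    }
    cost = base_usd_per_pb
    for value in factors.values():
        cost *= value
    return cost, factors


# First example: 675 TB on 15 Watt, 146.8 GB disks at $0.12 per KWH.
cost, factors = adjusted_annual_cost(119_428, watts_per_disk=15,
                                     usd_per_kwh=0.12, stored_pb=0.675)
print(f"${cost:,.0f}", factors)
```

Because the sketch uses the unrounded 1/0.85 rather than 1.176 for the power supply factor, its result differs slightly from a hand calculation with rounded factors.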

These two total calculations attempt to consider power to operate storage controllers, but not power required to simultaneously recharge batteries following a failure. Keep in mind that storage subsystem disks are likely very busy and drawing maximum power immediately following a power failure restoration.

This is precisely the same time that data center UPS batteries are recharging and also placing substantial additional load on the power system. This can be important to worst-case power consumption planning efforts.

Finally, note that voltage conversion and power line transmission losses require a power generation plant to produce 1.25 watts for every watt that enters the data center. Thus, the energy a power plant must produce to support a data center is approximately 25% greater than what the data center directly consumes.

Appendix B – RAID Overview

There are many forms and variants of RAID. Here are some of them:

While providing no protection against disk failure, RAID 0 spreads data uniformly across a number of disks. This is useful when many users need simultaneous access to data such as a shared database. Here, spreading the data out across many disks allows multiple overlapped concurrent accesses, though the probability of data loss increases with each additional disk used.

RAID 1 stores data in the necessary disk space and simultaneously makes one or more copies of the data on other disks. So, the multiple copies occupy much more disk capacity than the actual data values require. Clearly, there is a difference between the total amount of data an enterprise has and how much disk space the data requires.

RAID 0+1 combines RAID 0 and RAID 1. A data set resides on multiple disks and a second, independent set of disks contains mirror copies of the data. The mirroring provides protection against the loss of disks, and using twice as many disks allows twice as many concurrent accesses, greatly accelerating performance.

RAID 5 similarly spreads data across a number of disks, say four, and then uses the data to create a checksum value that can reconstruct lost data if one of the four disks fails. The storage subsystem can store the value on a fifth disk drive but usually uses the five drives together, spreading the data and checksum information across the set of disks.

Like RAID 5, RAID 6 spreads data across a number of disks, say four, and then uses the data to create two or more checksum values that can reconstruct lost data if more than one of the four data disks fails. The storage subsystem can store the values on a fifth and sixth disk drive but usually uses the drives together, spreading the data and checksum information across the set of disks.

Clearly, there is a difference between the total amount of data an enterprise has and how much disk space the data requires.

Calculating various RAID Capacity Factors

The formula to compute a RAID set capacity factor is:

(Total Drives in the RAID set) / # of Data Drives

RAID 1: RAID 1 replicates data. A 2-way mirror has 2X the total number of drives as compared to data drives. A 3-way mirror has 3X the number of drives. These yield 2X and 3X capacity factors respectively. This scheme also applies to RAID 0+1 or 1+0 that combines striping and mirroring.

RAID 5: RAID 5 is usually a number of data drives plus one additional parity drive that completes the RAID set. A RAID set with 5 data drives and one parity drive has 6 drives total and 5 data drives worth of capacity, or a RAID capacity factor of 6/5.

RAID 6: RAID 6 has 2 parity drives per RAID set. So a RAID set with 5 data drives would have 7 total drives / 5 data drives or a RAID capacity factor of 7/5.
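
The capacity factor formula can be expressed directly in code. This is a minimal sketch with illustrative function names; it uses exact fractions so that factors such as 6/5 and 7/5 are not blurred by floating-point rounding.

```python
import math
from fractions import Fraction


def raid_capacity_factor(data_drives, redundancy_drives):
    """(Total drives in the RAID set) / (number of data drives)."""
    return Fraction(data_drives + redundancy_drives, data_drives)


def drives_required(data_gb, drive_gb, factor):
    """Whole drives needed to hold data_gb at the given RAID capacity factor."""
    return math.ceil(Fraction(data_gb) / Fraction(drive_gb) * factor)


two_way_mirror = raid_capacity_factor(1, 1)  # RAID 1, 2-way mirror
raid5_5_plus_1 = raid_capacity_factor(5, 1)  # RAID 5, 5 data + 1 parity
raid6_5_plus_2 = raid_capacity_factor(5, 2)  # RAID 6, 5 data + 2 parity
print(two_way_mirror, raid5_5_plus_1, raid6_5_plus_2)
```

For example, `drives_required(10_000, 500, raid5_5_plus_1)` gives the number of 500 GB drives needed to hold 10,000 GB of data in 5+1 RAID 5 sets.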

I/O Performance Measurements:

Many factors affect I/O performance. Two primary factors are the total number of disk drives and individual disk drive performance. A future revision of this white paper will address this topic in more detail.

Appendix C – CC Storage Analysis Tool

CC Storage is a Storage Area Network (SAN) hardware management and monitoring application that discovers SAN resources: switches, Host Bus Adapters (HBAs), disk arrays, host servers, and applications.

CC Storage also has the ability to monitor file system activity and determine which files are active and inactive, hence eligible for archiving or migration to more energy efficient storage devices (Figures C-1 through C-4). In addition to understanding where the inactive files are, CC Storage can also help to reconfigure the SAN and help move the data to more appropriate storage. Data administrators can identify inactive data on Tier 1 storage and help to migrate it to the Tier 2 storage using SAN reconfiguration by:

- Opening a connection between the source and the destination
- Copying the data to Tier 2
- Closing the connection

Thus, CC Storage is a powerful primary storage subsystem management tool that enables enterprises to reduce power and cooling costs significantly through increased storage utilization.

Figure C-1 – CC Storage File System Analysis Screen Shot

Figure C-2 – CC Storage File Aging Summary Report

Figure C-3 – CC Storage File Types Usage Summary Report

Figure C-4 – CC Storage File Types Aging Summary – By File Types Modified Report

Appendix D – A Sample 50% Data Storage Power Consumption Reduction

This appendix calculates the power savings for a before-and-after scenario, providing an example of what is achievable by migrating less active data from one storage class to another.

Drive Specifications and Conditions:

1. The drives for this analysis are all 15K RPM and 7200 RPM drives. Note that 10K RPM drives draw current similar to 15K RPM drives, but exhibit lower performance. Thus, using 10K RPM drives shows less performance difference between Scenario 1 and Scenario 2. Using 15K RPM drives only ensures the analysis is conservative in some cases.

2. The performance and power consumption figures were taken from a major disk vendor. Given the competitiveness of the drive industry, similar figures can be assumed for other brands of drives.

3. Drive power and performance numbers were averaged among 2 generations of vendor 15K RPM drives across the 3 interfaces (SCSI, FC, SAS). Drive activity is assumed to be 50% seeking and 50% idle (only spinning).

4. The 7200 RPM 500 GB drives are from the same vendor. The same activity assumption (50% seeking and 50% spinning) is also used.

5. A drive power consumption and performance table is:

Drive Capacity                            73.4 GB    146.8 GB   500 GB
Drive RPM                                 15,000     15,000     7,200
Seek Time Avg                             3.75 ms    3.75 ms    8.5 ms
Rotational Latency                        2 ms       2 ms       4.16 ms
Seek + Latency                            5.75 ms    5.75 ms    12.66 ms
Drive IOPs                                174        174        79
Power Seeking (Watts)                     15.1       17         12
Power Spin Only (Watts)                   9.8        11.8       8
Avg Power (1/2 seek, 1/2 spin) (Watts)    12.4       14.4       10

Initial Conditions:

6. The assumed conditions are that Tier 1 storage is 50% of the total storage and uses 73.4 GB drives (RAID 1). Tier 2 storage is 50% of the total storage and uses 146.8 GB drives in RAID 5 (4+1 RAID sets). A RAID 1 configuration is mirrored and therefore has twice the number of drives required for just the data. A RAID 5 set with 4 data drives and 1 parity drive has 1.25 times the number of drives (5/4) required to just store the data.

Final conditions:

7. The power conservation scheme moves Tier 1 inactive storage from 73 GB RAID 1 storage to RAID 5 storage with 500 GB 7200 RPM drives (5 +1 RAID sets). A 5+1 RAID set has 5 data drives and 1 parity drive or 6/5 (1.2) x the number of drives required to store the data.

8. The power conservation scheme moves Tier 2 inactive storage from 146 GB RAID 5 (RAID 4+1) to RAID 5 storage (RAID 5+1 sets). This storage is MAID and powers­on at most 25% of the drives at any time. This means first data accesses delay until the initial data is available. After it is available, performance is the same as it is for Tier 2 storage using the same drives.

9. Performance usually comprises two categories: IOPS and transfer rate. High sustained transfer rates are achievable even with lower-performance drives such as the 7200 RPM drives. IOPS is more directly related to drive seek time and latency, and therefore differs between RAID sets of 7200 RPM and 15,000 RPM drives.

Read I/O performance is measured by the number of active drives in the configuration. Write performance is generally higher for RAID 1 than for RAID 5. However, caching algorithms have improved RAID 5 write performance over time, so read I/O performance is the parameter measured.

10. Pareto’s law states that 80% of the activity occurs on 20% of the population, and this analysis applies that maxim in both scenarios. To provide a performance measurement margin, the analysis uses the parameters listed in the Final 2 row below.

| Category | 73 GB Tier 1 | 146 GB Tier 2 | 500 GB Tier 2 | 500 GB Tier 3 |
|---|---|---|---|---|
| Initial | 50% | 50% | | |
| Final | 10% | 10% | 40% | 40% |
| Final 2 | 25% | 25% | 25% | 25% |

11. The calculations used 642 TB (642,000 GB) as a representative data center capacity for a Fortune 500 company. This figure was taken from a survey by The InfoPro. The savings percentage is constant across other storage configurations.

12. Power Consumption = Number of Disk Drives x Power Consumption per Drive

13. Number of Drives Tier 1 = (Total Capacity x % Tier 1 x RAID Factor (mirroring = 2)) / Capacity per Disk Drive (73.4 GB)

14. Number of Drives Tier 2 = (Total Capacity x % Tier 2 x RAID Factor (5/4)) / Capacity per Disk Drive (146.8 GB)

15. Number of Drives Tier 2A = (Total Capacity x % Tier 2A x RAID Factor (6/5)) / Capacity per Disk Drive (500 GB)

(Item 15 also applies to Tier 3.)
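The drive-count and power formulas above can be written as three small functions. This is a sketch under stated assumptions, not the authors' tooling: the function names are hypothetical, and the `power_on_factor` parameter is taken from row I of the scenario tables rather than from item 12 itself.

```python
def raid_factor(data_drives: int, redundancy_drives: int) -> float:
    """Total drives per data drive: 1+1 mirroring -> 2, 4+1 RAID 5 -> 5/4."""
    return (data_drives + redundancy_drives) / data_drives


def drives_in_tier(total_gb: float, tier_fraction: float,
                   factor: float, drive_gb: float) -> float:
    """Number of drives = capacity in tier x RAID factor / capacity per drive."""
    return total_gb * tier_fraction * factor / drive_gb


def tier_power(num_drives: float, power_per_drive: float,
               power_on_factor: float = 1.0) -> float:
    """Power = drives x power per drive x fraction of drives powered on."""
    return num_drives * power_per_drive * power_on_factor


# Worked example: Tier 1 before migration (50% of 642,000 GB, RAID 1, 73.4 GB drives).
tier1_drives = drives_in_tier(642_000, 0.50, raid_factor(1, 1), 73.4)
tier1_power = tier_power(tier1_drives, 12.4)
print(f"Tier 1 drives: {tier1_drives:.0f}, power: {tier1_power:.0f}")
```

The sketch keeps fractional drive counts and rounds only for display, which appears to be how the scenario tables were produced. Rounding the drive count up to whole drives would shift the totals slightly.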

Scenario 1 (Before Aggressive Tier Migration):

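| Category | Tier 1 | Tier 2 | Tier 2A | Tier 3 |
|---|---|---|---|---|
| A. Total Storage GB | 642,000 | 642,000 | 642,000 | 642,000 |
| B. % in Tier | 50% | 50% | | |
| C. GB in Tier (A x B) | 321,000 | 321,000 | | |
| D. Drive Capacity | 73.4 | 146.8 | | |
| E. # of Data Drives (C/D) | 4,373 | 2,187 | | |
| F. RAID Factor | 2 | 1.25 | | |
| G. # of Drives (E x F) | 8,747 | 2,733 | | |
| H. Power per Drive | 12.4 | 14.4 | | |
| I. Power on Factor | 100% | 100% | | |
| Power/Tier (G x H x I) | 108,458 | 39,360 | | |

Total Power: 147,817

| Category | Tier 1 | Tier 2 |
|---|---|---|
| # of Data Drives | 4,373 | 2,187 |
| IOPS/Drive | 174 | 174 |
| IOPS/Tier | 760,954 | 380,477 |

Total Read IOPS: 1,141,431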
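The read IOPS figures count only data drives (row E), not the mirror or parity drives added by the RAID factor. A minimal sketch of that calculation follows, assuming fractional drive counts are carried through and rounded only for display; the function name is illustrative.

```python
def tier_read_iops(tier_gb: float, drive_gb: float,
                   iops_per_drive: float) -> float:
    """Read IOPS for a tier: data drives (capacity / drive size) x IOPS per drive."""
    data_drives = tier_gb / drive_gb
    return data_drives * iops_per_drive


# Scenario 1: 321,000 GB in each of Tier 1 (73.4 GB) and Tier 2 (146.8 GB).
tier1 = tier_read_iops(321_000, 73.4, 174)
tier2 = tier_read_iops(321_000, 146.8, 174)
print(f"Tier 1: {tier1:.0f}  Tier 2: {tier2:.0f}  Total: {tier1 + tier2:.0f}")
```

Calling the same function with the Scenario 2 capacities (147,660 GB per tier) gives the corresponding read IOPS after migration.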

Scenario 2 (After Aggressive Tier Migration):

| Category | Tier 1 | Tier 2 | Tier 2A | Tier 3 |
|---|---|---|---|---|
| A. Total Storage GB | 642,000 | 642,000 | 642,000 | 642,000 |
| B. % in Tier | 23% | 23% | 27% | 27% |
| C. GB in Tier (A x B) | 147,660 | 147,660 | 173,340 | 173,340 |
| D. Drive Capacity | 73.4 | 146.8 | 500 | 500 |
| E. # of Data Drives (C/D) | 2,012 | 1,006 | 347 | 347 |
| F. RAID Factor | 2 | 1.25 | 1.2 | 1.2 |
| G. # of Drives (E x F) | 4,023 | 1,257 | 416 | 416 |
| H. Power per Drive | 12.4 | 14.4 | 10 | 10 |
| I. Power on Factor | 100% | 100% | 100% | 25% |
| Power/Tier (G x H x I) | 49,891 | 18,105 | 4,160 | 1,040 |

Total Power: 73,196

| Category | Tier 1 | Tier 2 |
|---|---|---|
| # of Data Drives | 2,012 | 1,006 |
| IOPS/Drive | 174 | 174 |
| IOPS/Tier | 350,039 | 175,019 |

Total Read IOPS: 525,058

Power Scenario 1 = 147,817 (Before Aggressive Tier Migration)

Power Scenario 2 = 73,196 (After Aggressive Tier Migration)

73,196 / 147,817 ≈ 0.495, so Scenario 2 draws about half the power of Scenario 1: approximately 50% data storage power and energy savings.
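The two scenario totals and the resulting savings can be recomputed end to end from the per-tier inputs. The sketch below is illustrative only: the tuple layout and names are assumptions, the tier fractions are those in row B of the scenario tables, and the 25% power-on factor for Tier 3 reflects the MAID assumption in item 8.

```python
TOTAL_GB = 642_000

# (tier, fraction of capacity, drive GB, RAID factor,
#  average power per drive, power-on factor)
SCENARIO_1 = [
    ("Tier 1", 0.50, 73.4, 2.0, 12.4, 1.0),
    ("Tier 2", 0.50, 146.8, 1.25, 14.4, 1.0),
]
SCENARIO_2 = [
    ("Tier 1", 0.23, 73.4, 2.0, 12.4, 1.0),
    ("Tier 2", 0.23, 146.8, 1.25, 14.4, 1.0),
    ("Tier 2A", 0.27, 500.0, 1.2, 10.0, 1.0),
    ("Tier 3", 0.27, 500.0, 1.2, 10.0, 0.25),  # MAID: at most 25% powered on
]


def total_power(tiers, total_gb: float = TOTAL_GB) -> float:
    """Sum drives x power per drive x power-on factor over all tiers."""
    power = 0.0
    for _name, fraction, drive_gb, raid, per_drive, power_on in tiers:
        drives = total_gb * fraction * raid / drive_gb
        power += drives * per_drive * power_on
    return power


before = total_power(SCENARIO_1)
after = total_power(SCENARIO_2)
ratio = after / before
print(f"before={before:,.0f}  after={after:,.0f}")
print(f"after/before={ratio:.3f}  savings={1 - ratio:.1%}")
```

Because every term scales linearly with `TOTAL_GB`, the ratio does not change when the total capacity changes, which is consistent with the statement in item 11 that the savings percentage is constant.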

About the Authors

Bruce Naegel is…

John Colgrove is ….

W. David Schwaderer is the Symantec Technology Network (STN) Editor-in-Chief.

He has authored six commercial software programs and ten technical books, including Data Lifecycles: Managing Data for Strategic Advantage, published by John Wiley & Sons Ltd. His soon-to-be-published eleventh book is titled Innovation Survival: Concept, Courage and Change. He is also working on a twelfth book, titled Data Center of the Future, that examines the tectonic forces reshaping today’s enterprise data centers.

David has a master's degree in Applied Mathematics from the California Institute of Technology and an MBA from the University of Southern California. He lectures at Stanford on the subject of innovation.

Symantec Corporation, 20330 Stevens Creek Blvd. Cupertino, CA 95014 United States of America

Copyright © 2007 Symantec Corporation. All rights reserved. Symantec, the Symantec Logo and VERITAS are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.