A Guide to Modern Storage Architectures - Infinio


www.infinio.com | contact@infinio.com | 617-374-6500


Traditional Storage Architectures
Modern Complexities: Virtualization, Scale Out, and Cloud
Hybrid Arrays
All-Flash Arrays
Hyper-Converged Infrastructures
Storage Acceleration
Infinio’s Storage Acceleration Platform


A decade ago, the data center was a vastly different world. Traditional storage arrays, configured as SAN or NAS, sat as centralized storage units accessible by multiple physical servers. The rotational speed of the spinning media limited how fast data could be read or written and, when a user needed better performance, they bought another tray of disks. From an architectural standpoint, the major innovation at the time was whether you had a native block device with file support layered on or a native file server layered with block storage.

The storage industry was able to keep up, for a while. Spinning disks got faster. Advances like unified block and file enabled users to achieve efficiencies through centralized, common management and handle multiple hosts. Tiering, although primitive and coarse-grained by today’s abstraction standards, enabled improvements in speed and performance.


BUT THE OLD WAYS JUST COULDN’T KEEP UP.

The demands placed on the data center continued to increase. Although memory was getting less expensive, adding more and more trays of spinning disks took up more space in the data center and increased energy usage. With 10GbE access, the pipeline expanded further, and systems needed more processing power to manage the flood of data in both primary and backup systems, or risk slowing data access (and the applications it supports) to unacceptable levels.

Storage in particular began feeling the impact of these changes. Performance and capacity began to emerge as separate storage resources, shackled together but with vastly different profiles with respect to both growth and economics.

Specifically, three trends in the data center began to put pressure on traditional storage:

• Virtualization: consolidating more I/O onto the same storage resources

• Scale-out application architectures: generating more work as they scale out into available CPU and memory

• Flash: supporting orders of magnitude greater performance while requiring special processing


VIRTUALIZATION

Traditional SAN and NAS arrays were designed for a world without virtualization. While not new, virtualization has had a profound impact on storage architectures and workloads over the past few years.

Simply put, server virtualization means that a single physical server can be shared by multiple, independent virtual machines. No longer are there dedicated LUNs for each physical application; gone too are the optimizations built for that architecture. It’s no longer efficient to employ simplistic mechanisms to improve the ability of drive heads to access data from a specific location.

“Most companies spend $2-3 on storage for every dollar spent on server virtualization projects.”
William Blair & Co.


SERVER VIRTUALIZATION DRAMATICALLY CHANGED THE EFFECTIVENESS OF THIS APPROACH IN A RANGE OF WAYS:

• Generally, servers are running more workloads on the same number of drives, which negatively impacts performance. The greater number of VMs on a single system may compete with one another for both storage capacity and performance.

• VMs that share resources can’t take advantage of optimizations designed for individual access patterns: all I/O is merged, causing the so-called “blender effect.”

• Multiple VMs may require the ability to shift workloads dynamically, an operation that adds further complexity to the data center.

• Particularly in virtual desktop environments, synchronization peaks around time-oriented events (like antivirus scans or workers’ morning log-in) may result in mass-seek overload and dull the system’s responsiveness.
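
The “blender effect” in the list above can be illustrated with a small sketch. It assumes a deliberately simplified model that is not from the original guide: each hypothetical VM reads consecutive blocks from its own region, and the shared array sees the requests interleaved round-robin. The function names are made up for illustration.

```python
from itertools import zip_longest


def sequential_fraction(requests):
    """Fraction of requests that hit the block right after the previous one."""
    if len(requests) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(requests, requests[1:]) if cur == prev + 1)
    return hits / (len(requests) - 1)


def vm_stream(start_block, length):
    """One VM reading `length` consecutive blocks starting at `start_block`."""
    return list(range(start_block, start_block + length))


def blend(streams):
    """Round-robin merge of per-VM streams, as a shared array might see them."""
    merged = []
    for group in zip_longest(*streams):
        merged.extend(block for block in group if block is not None)
    return merged


# Four hypothetical VMs, each scanning its own 1,000-block region.
streams = [vm_stream(vm * 100_000, 1_000) for vm in range(4)]
per_vm = [sequential_fraction(s) for s in streams]
blended = sequential_fraction(blend(streams))
print(per_vm, blended)
```

Each stream is fully sequential on its own, yet the merged stream contains no back-to-back sequential requests, which is why optimizations tuned for a single application's access pattern stop helping once many VMs share the same drives.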


SCALE-OUT APPLICATION ARCHITECTURES

Several trends around scalability and workload (the advent of cloud storage and Big Data, in particular) have imposed changes on the way modern applications are built and how they access storage.

From an architectural standpoint, this is a fundamental change from overwriting data to appending it. Because these architectures are distributed, they rely on copies for data protection, rather than traditional RAID. As a result, users have to contend with both the explosion of storage, driven by machine-generated data, and a new multiplier: copies used for data protection.

In the cloud, Amazon and others have found ways to make storage outside the datacenter economical and attractive. These web-scale architectures, while most appropriate for the Facebooks of the world, are finding their way into everyone’s datacenter at a smaller scale. For example, VMware and Oracle both have scale-out designs that are well-known and familiar. And those are the most traditional of applications.

When you have a lot of compute in a scale-out architecture (as you do with VMware and Oracle, for example), storage is under significantly more pressure. The same spindles must handle an increased workload compared to when they were sized 1:1 for individual scale-up applications. The goal of these systems is to evenly and automatically distribute the data across multiple systems. However, this efficiency comes with a price: the cluster as a whole is processing significantly more data, putting more performance pressure on storage. Similarly, scale-out applications in the datacenter that are starting to use replicas rather than RAID for data protection are driving a huge amount of data capacity requirements into storage.


FLASH

Enter flash-based SSD devices. Flash is a core technology that can be deployed in several ways, such as SSDs in hosts, PCIe cards, and SSDs in arrays. Other new architectures like NV-based DIMMs are also emerging in this space. Because it is orders of magnitude faster (without being correspondingly more expensive), flash is a huge disruption in the economics of storage.

This puts pressure on existing storage systems in a few ways. First, flash’s additional performance capabilities drive significantly more IOPS through existing controllers than legacy systems were designed to handle. Equally impactful to existing platforms is that flash’s architecture often needs special handling to address particular challenges.

[Illustration: the relative speed of different media in the data center, on a scale marked 1 minute, 2 weeks, 1-2 months, and 10 years.]

COMPLEXITIES TO CONSIDER

Like other technologies, individual flash devices (like SSDs) have some architectural challenges that need to be addressed, either by storage controllers or by software. For example:

Wear balancing: Over time, flash cells wear out. It’s desirable to have them wear out at about the same time, so some storage controllers include special algorithms to spread wear evenly across cells.

Write amplification: Flash only allows writes to an empty cell; if a cell has content, it must be erased before it can be rewritten. Further increasing overhead, writes happen at the page level while erasures occur at the level of the larger block, so valid data sharing a block must be copied elsewhere before the block is erased.

Garbage collection: Because cells must be empty to be written to, a cycle of “clean-up” needs to occur in order to dispose of outdated data. This background process can degrade both active read and write performance while it occurs.
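
Two of the challenges above lend themselves to a short sketch. The following is a toy model under stated assumptions, not how any particular controller works: `WearLeveler` is a hypothetical allocator that always erases the least-worn block, and `write_amplification` is the usual ratio of physical page writes to host-requested page writes. The block count and page counts are illustrative only.

```python
import heapq


class WearLeveler:
    """Toy allocator that always reuses the least-erased block (hypothetical)."""

    def __init__(self, num_blocks):
        # Heap of (erase_count, block_id); the least-worn block is on top.
        self._heap = [(0, block_id) for block_id in range(num_blocks)]
        heapq.heapify(self._heap)

    def erase_next(self):
        """Erase the least-worn block and return its id."""
        erase_count, block_id = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (erase_count + 1, block_id))
        return block_id

    def erase_counts(self):
        """Erase counts of all blocks, sorted from least to most worn."""
        return sorted(count for count, _ in self._heap)


def write_amplification(host_pages, relocated_pages):
    """Physical page writes divided by the page writes the host asked for."""
    return (host_pages + relocated_pages) / host_pages


leveler = WearLeveler(num_blocks=8)
for _ in range(1_000):
    leveler.erase_next()
counts = leveler.erase_counts()
print(counts[0], counts[-1])  # least- and most-erased block counts

# Rewriting 1 page in a block that still holds 63 other valid pages.
print(write_amplification(host_pages=1, relocated_pages=63))
```

With this policy the erase counts of any two blocks never differ by more than one. The amplification figure shows why a small host write can cost many physical writes when valid pages have to be relocated before a block erase.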


As is often the case in the lifecycle of technology, the drivers of the challenges users face can also be the key to the solution. Industry leaders have leveraged the disruptors that inspired virtualization, flash, and cloud (greater speed and CPU) to develop several distinct, innovative approaches to balancing cost and performance. We’ll discuss these approaches in the sections that follow.

But first, let’s take a broad look at the ways in which users have integrated flash into their systems.

In the early days, before purpose-built hybrid arrays emerged, users handled their need for speed and performance by adding SSDs to their existing arrays and developing tiering strategies (eventually automated) to make the best use of their mix based on the level and frequency of data access.

Similarly, some users created “all-flash” arrays by purchasing legacy-architecture arrays filled with SSDs. However, both of these approaches fell short of delivering on the promise of flash: they didn’t leverage flash in the most effective ways, and they fell victim to many of the complexities of flash discussed earlier. The other critical thing about both of these approaches is this: the architecture of the datacenter and of the storage arrays stayed basically the same.

[Figure: Traditional data center architecture]


HYBRID ARRAYS

Hybrid arrays represent the next mainstream generation of disk arrays, with controllers that are better suited to utilize flash resources. The latest hybrid arrays use flash for more specific tasks: as a read cache and a write log buffer, for example.

These hybrid arrays access data from either flash or their disk pool, but flash is not exposed directly to applications. The goal of this storage architecture is to present an optimal mix of flash, to optimize the array’s ability to handle increased IOPS, and spinning drives, to optimize capacity utilization.

Typically used in scenarios where there is a set of mixed enterprise workloads, hybrid arrays offer mainstream users an option for benefiting from flash when optimizing cost is more important than avoiding occasional latency.

The hybrid landscape continues to rapidly expand and develop. For now, the hybrid array approach of combining SSDs and spinning drives seems to be the new “status quo” for organizations purchasing new arrays.

WHY HYBRID ARRAYS?

Hybrid arrays are most effective in environments with a mix of general workloads where a moderate price point is more important than guaranteed performance for all applications.

Many customers choosing hybrid arrays are doing so when their environments aren’t equipped to absorb major changes to datacenter architecture.
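
The read-cache role that flash plays in a hybrid array can be sketched as follows. This is a minimal illustration, assuming a least-recently-used eviction policy and a dictionary standing in for the disk pool. Real arrays choose their own caching policies and also use flash for write logging, which is not modeled here. `HybridArray` and its fields are hypothetical names.

```python
from collections import OrderedDict


class HybridArray:
    """Toy model: a small flash read cache in front of a large disk pool."""

    def __init__(self, disk_blocks, flash_capacity):
        self.disk = disk_blocks              # block address -> data (capacity tier)
        self.flash = OrderedDict()           # LRU read cache (performance tier)
        self.flash_capacity = flash_capacity
        self.flash_hits = 0
        self.disk_reads = 0

    def read(self, address):
        if address in self.flash:
            self.flash.move_to_end(address)  # mark as most recently used
            self.flash_hits += 1
            return self.flash[address]
        data = self.disk[address]            # slow path: go to spinning disk
        self.disk_reads += 1
        self.flash[address] = data
        if len(self.flash) > self.flash_capacity:
            self.flash.popitem(last=False)   # evict the least recently used block
        return data


array = HybridArray({addr: f"block-{addr}" for addr in range(1_000)}, flash_capacity=10)
# A hypothetical workload that keeps re-reading the same 5 hot blocks.
for _ in range(20):
    for addr in range(5):
        array.read(addr)
print(array.flash_hits, array.disk_reads)
```

The application only ever calls `read`; it never addresses flash directly, which mirrors the point that flash is not exposed to applications. A workload whose hot set exceeds the flash capacity would fall back to disk latency, the “occasional latency” trade-off mentioned above.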


ALL-FLASH ARRAYS

The all-flash trend began with putting SSDs into existing arrays, but a range of complexities sparked the emergence of a new generation of storage. As we discussed earlier, controller functions like wear leveling become imperative in arrays leveraging flash, to avoid wearing the flash cells unevenly. Space management can be a challenge too, since flash only writes to empty cells. And cleaning up storage may slow I/O, as multiple reads and writes execute while blocks are being erased. The newest all-flash arrays are built with controller logic that handles these technical challenges.

Essentially, an all-flash array is like tacking a single high-performance, high-cost tier onto your existing data center. It’s not just fast; it’s a guarantee of fast for everything connected to it, an SLA no other architecture has yet promised. But while the approach is ideal for handling a large number of mission-critical applications, it may be too expensive for many organizations. The deduplication that all-flash array vendors tout as the key to delivering spinning-drive economics may represent a false promise: that same deduplication can be applied to spinning drives as well, once again separating the costs of flash and spinning drives.

Economics aside, the reality is that many organizations do not demand that much speed for every application.

WHY ALL-FLASH ARRAYS?

All-flash arrays are best suited for environments where a significant chunk of applications need a guaranteed level of performance, and there is a budget to support this.

While hybrid arrays may offer low-latency access for 95% of workloads, AFAs guarantee that level of performance for all workloads.


ARCHITECTING A NEW DATACENTER

One of the most disruptive (and wildly successful) innovations in data center architecture is the shift to a model where storage is processed using resources on the server side. Server-side storage has been implemented in multiple ways (hyper-converged, software-defined, software services), but of these, hyper-converged infrastructure is the first to gain mainstream acceptance in the marketplace.

The key point from a storage perspective is that processing is done on the server side. Why? Because the server is loaded with memory and processor cores; exploiting existing, underutilized resources is its holy grail.

In contrast to the dedicated storage arrays we’ve just discussed, server-side storage processes storage across one or more servers and is co-located with the applications it serves.

The forms of server-side storage vary in presentation, although not really in the technology itself. In hyper-converged infrastructure, customers purchase integrated hardware and software building blocks that deliver compute, memory, storage, and network capacity as an integrated unit for a specific workload. In software-defined solutions, the software that runs those same functions in the hyper-converged model is typically available separately to run on any choice of hardware platform.


HYPER-CONVERGED INFRASTRUCTURE

Architecturally, a hyper-converged system binds hardware (typically a server with direct-attached storage and flash cards) with the software needed to run virtual workloads, including a hypervisor, system management, configuration, and virtual networking tools. These technologies do not connect to traditional storage stacks and instead offer next-generation storage services such as scale-out performance, caching, encryption, and deduplication.

But hyper-converged infrastructure has its limitations, not the least of which is the reality that users have to change their whole data center to implement it. While all-flash and hybrid arrays have unique attributes, users still buy and manage networking and servers the same way they always have, so those arrays have a lesser impact on the architecture of the data center. By contrast, hyper-converged infrastructure requires users to buy into a new “building block” for their environments, changing their management tools and processes. Thus, these systems are typically deployed in small and medium businesses or remote office environments.

OTHER CONSIDERATIONS INCLUDE:

• Storage Efficiency: The scale-out architecture of hyper-converged infrastructure has driven a data protection scheme based on replication, rather than traditional RAID. While RAID for data protection might have a 10-20% overhead for capacity, the replicas necessary to protect hyper-converged infrastructure might see a capacity overhead closer to 300-400%.

• Pre-defined Scaling: Compared to traditional IT, there is less flexibility to scale individually the pieces of the infrastructure that you need more or less of; expansion is done by buying another predefined block of all the resources.

• Multiple User Management Experience: In a hyper-converged environment, silos of IT personnel may need to be reorganized to streamline management. Familiar server and storage monitoring tools are often replaced with different interfaces.

WHY HYPER-CONVERGED?

Enables simplified, unified design, management, deployment, and support by integrating the components of the IT infrastructure.

Provides predictable building blocks that can be aggregated together to meet growth needs.
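
The storage efficiency consideration above comes down to simple arithmetic, sketched here under stated assumptions. Overhead is defined as extra raw capacity relative to usable capacity, and the 300-400% figure is read as the raw capacity consumed by three to four full copies. That reading is an interpretation, and the RAID group sizes and the 100 TB figure are hypothetical.

```python
def raid_overhead(data_disks, parity_disks):
    """Extra raw capacity for parity, as a fraction of usable capacity."""
    return parity_disks / data_disks


def replication_overhead(copies):
    """Extra raw capacity for replicas, as a fraction of usable capacity."""
    return copies - 1


def raw_capacity_needed(usable_tb, overhead):
    """Raw capacity required to store `usable_tb` at the given overhead."""
    return usable_tb * (1 + overhead)


usable_tb = 100  # hypothetical amount of data to protect
for label, overhead in [
    ("RAID 5 (8 data + 1 parity)", raid_overhead(8, 1)),
    ("RAID 6 (8 data + 2 parity)", raid_overhead(8, 2)),
    ("3 replicas", replication_overhead(3)),
    ("4 replicas", replication_overhead(4)),
]:
    raw = raw_capacity_needed(usable_tb, overhead)
    print(f"{label}: {overhead:.0%} overhead, {raw:.1f} TB raw")
```

Under these assumptions, parity protection adds a fraction of the usable capacity, while each additional replica adds the full usable capacity again.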


STORAGE ACCELERATION

Many organizations that understand the benefits of leveraging server-side resources to improve storage performance seek a solution that enables this without massive disruption to datacenter architecture.

Enter storage acceleration.

Storage acceleration enables IT managers to improve storage performance by aggregating server-side resources. This approach creates a low-latency performance layer, enabling organizations to purchase or use lower-cost storage for capacity purposes. Whether organizations are looking to improve existing storage or design a new datacenter, storage acceleration enables them to manage the resources of performance and capacity separately without changing the architecture or the operations of the storage side.

This architecture can deliver the lowest cost/IOPS by using less expensive commodity-based resources on the server side, and the lowest cost/GB by focusing shared arrays on being optimized for capacity.

From a technology standpoint, this approach enables organizations to separate the acquisition and management of performance resources from that of capacity resources.

From a business standpoint, it provides a way to add performance to an existing infrastructure without a rip-and-replace and its inherent cost in hardware, software, IT time investment, and downtime. These resources can also be significantly less expensive than the same hardware deployed within a proprietary package.

WHY STORAGE ACCELERATION?

Provides low-latency server-side access to the fastest storage resources, at a low $/IOPS.

Enables organizations to maintain their existing infrastructure investment in shared storage platforms, even reducing $/GB on newer platforms.
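
One way to reason about a server-side performance layer is as a weighted average of two latencies. The sketch below assumes a simple model in which a fixed fraction of reads is served from server-side memory and the rest go to the shared array. The latency values are hypothetical placeholders, not measurements, and the model ignores the extra benefit of reduced load on the array.

```python
def effective_read_latency_ms(offload_fraction, cache_latency_ms, array_latency_ms):
    """Average read latency when a fraction of reads is served server-side."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    return (offload_fraction * cache_latency_ms
            + (1.0 - offload_fraction) * array_latency_ms)


# Hypothetical latencies, chosen only to make the arithmetic easy to follow.
CACHE_MS = 0.1   # assumed server-side RAM hit
ARRAY_MS = 5.0   # assumed round trip to the shared array

for offload in (0.0, 0.5, 0.75, 0.9):
    latency = effective_read_latency_ms(offload, CACHE_MS, ARRAY_MS)
    print(f"{offload:.0%} of reads offloaded -> {latency:.2f} ms average")
```

Because the array only has to serve the remaining reads, it can be sized for capacity rather than peak IOPS, which is the separation of performance and capacity described above.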


INFINIO’S STORAGE ACCELERATION PLATFORM

Storage acceleration is still an emerging field, but Infinio has been providing a solution in this space since 2013. The Infinio solution is highly efficient with resources, transparent to existing storage operations, and non-disruptive. All of these qualities enhance the benefits of separating storage into its atomic qualities, capacity and performance, including:

• 10x improvement in latency

• SSD-class performance without additional hardware

• Reduced performance costs ($/IOPS)

• Scale-out I/O with application growth

• Reduced capacity costs for any array ($/GB)


INFINIO STORAGE ACCELERATION FOR THE VIRTUALIZED DATACENTER

At the core of Infinio’s solution to deliver storage performance separately from storage capacity is an architecture built on the understanding that most datacenters contain significant amounts of duplicate data, especially across VMware clusters. Infinio’s content-based architecture exploits this fact, tracking content (rather than location), which results in a performance layer with inline deduplication. It is this deduplication that enables Infinio to deliver high performance (10X improvement in latency) on just small amounts of RAM, starting at just 8GB per host. When this deduplication is combined with Infinio’s scale-out global architecture, just 5 nodes of Infinio can have access to hundreds of GB of effective cache.

And it’s not just the efficiency that makes Infinio different. Core to the design of the product has always been a commitment to seamless integration into an existing environment. As such, Infinio can be installed in under 30 minutes with no downtime, disruption, guest agents, or changes to storage configuration. Turning acceleration on or off is a single click, as is removing Infinio entirely at the end of an evaluation.

Once implemented, Infinio enables you to continue using your familiar storage tools, like snapshots, replication, and thin provisioning, as well as customizations you’ve made in your environment around backup system integration or reporting.
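
The general idea of tracking content rather than location, described above, can be sketched with a content-addressed cache. This is an illustration of the technique only and makes no claim about Infinio’s actual implementation: the class, the use of SHA-256 as the content key, and the block sizes are all assumptions.

```python
import hashlib


class ContentAddressedCache:
    """Toy read cache that stores each distinct block once, keyed by content."""

    def __init__(self):
        self.location_to_digest = {}   # (vm, block address) -> content digest
        self.digest_to_data = {}       # content digest -> one shared copy

    def put(self, vm, address, data):
        digest = hashlib.sha256(data).hexdigest()
        self.location_to_digest[(vm, address)] = digest
        self.digest_to_data.setdefault(digest, data)   # dedupe identical content

    def get(self, vm, address):
        digest = self.location_to_digest.get((vm, address))
        return None if digest is None else self.digest_to_data[digest]

    def logical_bytes(self):
        """Bytes the cache appears to hold, counting every location."""
        return sum(len(self.digest_to_data[d]) for d in self.location_to_digest.values())

    def physical_bytes(self):
        """Bytes of RAM actually used for block data."""
        return sum(len(data) for data in self.digest_to_data.values())


cache = ContentAddressedCache()
os_block = b"\x00" * 4096                  # stand-in for a shared OS image block
for vm in ("vm-a", "vm-b", "vm-c"):
    cache.put(vm, 0, os_block)             # same content at three locations
    cache.put(vm, 1, vm.encode() * 512)    # content unique to each VM
print(cache.logical_bytes(), cache.physical_bytes())
```

The gap between logical and physical bytes is the “effective cache” gain: the more duplicate content the VMs share, the more data a fixed amount of RAM can serve. A real cache would also need eviction and invalidation on writes, which this sketch leaves out.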


WHAT YOU CAN EXPECT

Infinio eliminates bottlenecks and enables you to scale out to achieve the performance you need within your existing infrastructure. With Infinio, you enjoy:

• 10x improved response time

• 65-85% reduction of reads from storage

• Extended life for storage systems

• Lower TCO for new storage acquisition

• Deduplication to drive high resource utilization

• Simple installation, enabling you to evaluate without downtime, disruption, or changes

• Investment preservation, since it co-exists with your existing storage system tools and reports

“We noticed the results almost instantly, with a visible reduction of storage latency on the VDI desktops and decreased workload on our filers.”
Nathan Manzi, Systems Engineer at Minara Resources


REAL-WORLD SUCCESS

Attivio, one of Infinio’s earliest customers, has seen significant storage performance improvements over the long term, including:

• Improved storage performance with no added hardware

• Sustained read offload of 88%; 93% of bandwidth offloaded

• Sustained 5x performance improvement over 16 weeks

• Installed with no downtime or service interruption

“Why spend a lot of money on one isolated shelf of SSDs when you can get that benefit across the environment for less with Infinio?”
Sean Lutner, VP of IT at Attivio


When mobile workers complained about slow response and poor application performance, Budd Van Lines deployed Infinio to get business moving, achieving:

• Improved VDI performance (read response times reduced by a factor of 2.5; 75% of requests offloaded)

• Eased network bandwidth by offloading storage requests

• Installed quickly without affecting production or users

“If we had installed Infinio earlier, we would not have had to purchase 10Gb switches to support our growth and prepare for busy season. Infinio’s read offload saves us enough bandwidth traffic that we could have saved significant money by not buying those pricey switches.”
Doug Soltez, Budd Van Lines, VP and CIO


Storage architectures, and their users, have had to respond to wave after wave of disruptive innovations over the past ten years. Cloud. Virtualization. Scale-out applications. From spinning disks to solid state to a mash-up of hybridized approaches that leverage a mix of more than one (including server resources), the goal has always been to simplify, to speed, and to maintain predictable performance. The moving target: the right size, for the right cost, and never paying too much.

Storage acceleration may be the latest approach, but we don’t think it’s just a passing trend. Infinio’s system can accelerate I/O (with an average of 10X improvement in latency) with just small amounts of RAM. And, what’s more, it provides that kind of performance with your current architecture and operations. Whether you are improving performance in an existing environment or building a new one, that is a game changer.

FOR MORE ON INFINIO’S APPROACH TO ACCELERATING THE PERFORMANCE OF YOUR DATA CENTER, CALL US AT 617-374-6500, OR VISIT WWW.INFINIO.COM.

© 2015 Infinio Systems, Inc. All rights reserved.