Insider's Guide: The Data Protection Imperative
STORAGE VIRTUALIZATION: AN INSIDER'S GUIDE

Jon William Toigo
CEO, Toigo Partners International
Chairman, Data Management Institute

Copyright 2013 by the Data Management Institute LLC. All Rights Reserved. Trademarks and trade names for products discussed in this document are the property of their respective owners. Opinions expressed here are those of the author.
Part 4: The Data Protection Imperative

A confluence of three trends is making disaster preparedness and data protection more important than ever before: the increased use of server and desktop virtualization; growing legal and regulatory mandates around data governance, privacy, and preservation; and increased dependency on automation, in a challenging business environment, as a means to make fewer staff more productive. Business continuity is now a mission-critical undertaking. The good news is that storage virtualization can deliver the right tools to ensure the availability of data assets, the foundation for any successful business continuity or disaster recovery capability.
The Data Protection Imperative

DATA PROTECTION MANAGEMENT: THE ESSENTIAL TASK OF BUSINESS CONTINUITY

Data protection and business continuity are subjects that nobody likes to talk about, but that everyone in contemporary business and information technology must consider. Today, a confluence of three trends, a kind of perfect storm, is making disaster protection planning and disaster preparedness more important than ever before.

First is the increased use of server and desktop virtualization technologies in business computing: technologies that, for all their purported benefits, also have the downside of being a risk multiplier. With hypervisor-based server hosting, the failure of one hosted application can cause many other application guests on the same physical server to fail. While other efficiencies may accrue to server hypervisor computing, the risks that the strategy introduces must be clearly understood in order to avoid catastrophic outcomes during operation.

A second trend underscoring the need for data protection and business continuity planning is the growing regime of regulatory and legal mandates around data preservation and privacy that affect a growing number of industry segments. Some of these rules apply to nearly every company, and most carry penalties if businesses cannot show that reasonable efforts have been taken to safeguard data.

Third, and perhaps most compelling, is the simple fact that companies are more dependent than ever before on the continuous operation of IT automation. In the current economic reality, the need to make fewer staff more productive has created a much greater dependency on the smooth operation of information systems, networks, and storage infrastructure. Even a short-term outage can have significant consequences for the business.
Bottom line: for many companies, business continuity and data protection have moved from nice-to-have to must-have status. Past debates over the efficacy of investments in preparedness are increasingly moot.
Put simply, there is no safe place to construct a data center: historical data on weather and seismic events, natural and man-made disaster potentials, and other catastrophic scenarios demonstrate that all geographies are subject to what most people think of when they hear the word "disaster." Moreover, from a statistical standpoint, big disasters, those with a broad geographical footprint, represent only a small fraction of the overall causes of IT outages. Only about 5 percent of disasters are the cataclysmic events that grab a spot on the 24-hour cable news
channels. Most downtime is the result of equipment and software maintenance; some call it planned downtime, though efforts are afoot to eliminate planned downtime altogether through clustering and high-availability engineering. The next big slices of the outage pie chart involve problems that fall more squarely in the disaster category: those resulting from software failures, human errors (carbon robots), and IT hardware failures.

According to one industry study of 3,000 firms in North America and Europe, IT outages in 2010 resulted in 127 million hours of downtime, equivalent to about 65,000 employees drawing salaries without performing work for an entire year! The impact of downtime in tangible terms, such as lost revenues, and intangible terms, such as lost customer confidence, can only be estimated. One study placed idle labor costs across all industry verticals at nearly $1 million per hour.

Despite this data, the truism of disaster preparedness still applies: fewer than 50% of businesses have any sort of prevention and recovery capability. Of those that do, fewer than 50% actually test their plans, the equivalent of having no plan whatsoever. The reasons are simple. First, planning requires money, time, and resources whose allocation may be difficult to justify given that the resulting capability may never need to be used. Second, plans are typically difficult to manage, since effective planning typically involves multiple data protection techniques and recovery processes that lack cost-effective testing and validation methods. Third, many vendors have imbued customers with a false sense of security regarding the invulnerability of their product or architecture, conflating the notion of
high-availability architecture with business continuity strategy (the former is actually a subset of the latter).

Constructing the plan itself follows a well-defined roadmap. Following an interruption event, three things need to happen:

1. The data associated with critical applications must be recovered to a usable form.
2. Applications need to be re-instantiated and connected to their data.
3. Users need to be reconnected to their re-hosted applications.

These three central tasks need to occur quickly, as the duration of an interruption event is usually what differentiates an inconvenience from a disaster. Taken together, the three tasks may be measured using the metric "time to data." Time to data, sometimes referred to as a recovery time objective, is both the expression of the goal of a plan and a measure of the efficacy of a strategy applied to realize that goal.

Data Recovery is Key

The process for building a comprehensive continuity capability requires book-length description. (One is being developed online as a free "blook" at book.drplanning.org.) The much-condensed version has three basic components. To do a good job of developing a continuity capability, you need to know your data, or more specifically, what data belongs to what applications and what business processes those applications serve. Data and apps inherit their criticality, their priority of restore, from the business processes that they serve. So, those relationships must be understood.
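The "time to data" metric can be sketched as simple arithmetic: the sum of the durations of the three recovery tasks, compared against the recovery time objective set for an application. All phase names and durations below are hypothetical illustrations, not figures from the text:

```python
# Sketch: "time to data" as the sum of the three recovery tasks,
# compared against an assumed recovery time objective (RTO).
# Durations are hypothetical illustrations.
recovery_phases_hours = {
    "recover data to usable form": 4.0,
    "re-instantiate applications and connect data": 1.5,
    "reconnect users to re-hosted applications": 0.5,
}

rto_hours = 8.0  # assumed RTO for this application's criticality tier

time_to_data = sum(recovery_phases_hours.values())
print(f"time to data: {time_to_data} h; meets RTO: {time_to_data <= rto_hours}")
# prints: time to data: 6.0 h; meets RTO: True
```

The point of the sketch is that data recovery typically dominates the total, which is why the strategy chosen for replicating data largely determines whether the objective can be met.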
The next step is to apply the right stratagems for data recovery, application re-hosting, and reconnecting users to each application and its data, based on that earlier criticality assessment. Third, plans must be tested, both routinely and on an ad hoc basis. Testing is the long-tail cost of continuity plans; the decisions we make about recovery objectives, and the methods we use to build recovery strategies, must take into account how those strategies will be tested, so as to reduce the cost of a continuity program that virtually nobody wants to spend money on.

As a practical matter, data recovery is almost always the slowest part of recovery efforts following an outage, but this is contingent on many things. First, how is data being replicated? Is it backed up to tape, or mirrored by software or disk array hardware to an alternative hardware kit? Is the data accessible and in good condition for restore at the designated recovery site?

Chances are good that a company uses a mixture of data protection techniques today. That's a good thing, since data is not all the same, and budgetary sensibility dictates that the most expensive recovery strategies be applied only to the most critical data. Still, planners need to ensure that the approaches being taken are coordinated and monitored on an ongoing basis. From this perspective, a data protection management service that provides a coherent way to configure, monitor, and manage the various data replication functions would be a boon. With such a service in place, it would be much simpler to ascertain whether the right data is being replicated, whether cost- and time-appropriate techniques are being applied based on data criticality, and whether data is being replicated successfully on an ongoing basis. As a rule, a built-in service is superior to one that is bolted on when it comes to data protection.
It follows, therefore, that a data protection management service should be designed into the storage infrastructure itself and in such a way as to enable its use across heterogeneous hardware repositories.
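One core check such a data protection management service would run can be sketched as follows: for each dataset, verify that the most recent successful replica is fresh enough for that dataset's criticality tier. The tier names, policy windows, and dataset records below are hypothetical illustrations, not part of any specific product:

```python
# Sketch of a data protection management check: flag any dataset whose
# last successful replica is older than the maximum age allowed for its
# criticality tier. Tiers, windows, and datasets are hypothetical.
from datetime import datetime, timedelta

# Assumed policy: criticality tier -> maximum allowed replica age.
max_replica_age = {
    "critical":  timedelta(minutes=15),  # e.g. continuous mirroring
    "important": timedelta(hours=4),     # e.g. periodic snapshots
    "archival":  timedelta(days=1),      # e.g. nightly tape backup
}

def stale_datasets(datasets, now):
    """Return names of datasets whose last replica exceeds its tier's limit."""
    return [
        name
        for name, tier, last_replica in datasets
        if now - last_replica > max_replica_age[tier]
    ]

now = datetime(2013, 6, 1, 12, 0)
datasets = [
    ("orders_db", "critical",  datetime(2013, 6, 1, 11, 50)),  # 10 min old
    ("hr_files",  "important", datetime(2013, 6, 1, 6, 0)),    # 6 h old
]
print(stale_datasets(datasets, now))  # ['hr_files']
```

Because the policy table, not the storage hardware, drives the check, the same service can span heterogeneous repositories and mixed protection techniques, which is the coordination problem the text describes.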
Moreover, the ideal data protection management service should be able to manage different types of data protection services, such as those that are used today to provide defense in depth to data assets. Defense in depth is a concept derived from a realistic appraisal of the risks confronting data assets. Dif