Delphix Database Virtualization Platform
Technical White Paper
Revision: 5 June 2012
You can find the most up-to-date technical documentation at:
http://www.delphix.com/support
The Delphix Web site also provides the latest product updates.
If you have comments about this documentation, submit your feedback to:
© 2011 Delphix Corp. All rights reserved.
The Delphix logo and design are registered trademarks of Delphix Corp. in the United States and/or other jurisdictions.
All other marks and names mentioned herein may be trademarks of their respective companies.
Delphix Corp.
275 Middlefield Road, Suite 50
Menlo Park, CA 94025
www.delphix.com
Table of Contents
Executive Summary
Application Lifecycle Challenges
Delphix Platform Overview
Delphix 3-Tier Architecture
Tier 1: Database Virtualization Layer
Tier 2: Storage Optimization Layer
Tier 3: Cloud Automation Layer
High Availability Architecture
Summary
Executive Summary
From financials and order management to eCommerce and customer support, nearly every business process in the modern enterprise is powered by database-driven applications. As a result, maximizing the agility and availability of application environments while minimizing associated costs has emerged as a top priority for CIOs across the globe. Unfortunately, this goal remains elusive, and organizations continue to struggle with huge cost inefficiencies, a crippling lack of agility, and unmitigated downtime risks within application environments.
Much of the cost inefficiency stems from the long tail of redundant data in application development lifecycles. According to an ESG Group survey, on average, organizations create up to 10 full copies of each production database for development, testing, training, reporting, and other purposes. Continued double-digit growth in structured application data and added redundancy of data backups only magnify storage costs. While these copies are essential for a robust and agile production lifecycle, they also become a storage capital expenditure multiplier.
Beyond hard capital costs lies the bigger problem of poor application agility due to operational overhead. Ongoing application lifecycle projects for upgrades, customizations, expansions, etc. require frequent movement of data across production and pre-production environments. For IT teams, this translates into regular database refresh and provisioning tasks that take days to weeks of coordinated effort each time.
In addition to cost and labor sprawl, organizations also face significant business downtime risks despite considerable investments in data protection. Simply put, the scale of modern businesses makes even edge case failures like database corruption a tangible risk that must be mitigated. Downtime costs for major enterprise applications range from $10,000 to over $70,000 per minute. At these levels, even a day or two of recovery time from disk or tape will add up to tens of millions of dollars in downtime costs.
Delphix addresses these growing challenges through patent-pending database virtualization software. Enterprises of all sizes, across the globe and across industries, are leveraging Delphix to realize:
• Greater Agility: 500% workforce multiplier
• Reduced Costs: 10x reduction in application storage costs
• Lower data risk: Reduced downtime costs and faster recovery
Application Lifecycle Challenges
Most enterprises are unaware of the full extent of the redundancy and complexity required to support production operations. Due to the criticality of applications like ERP and CRM, enterprises create several copies of production databases before going live with even small changes. Analysts estimate that enterprises make up to 10 copies on average for each production database. For most business-critical applications, full copies are created for development, quality assurance, user acceptance testing, reporting, etc. While these copies are essential for a robust and agile production lifecycle, they also become a storage cost multiplier. Continued double-digit growth in structured application data and added redundancy from data backups to disk and tape only magnify storage costs.
Production: Tip of the Database Iceberg
Even worse, the need to constantly move and refresh data sets compounds the complexity. As applications grow and change, databases have to be refreshed in development, then moved to testing, QA, staging, and back to production. Some environments, such as staging for data warehouses, have to be refreshed weekly or daily. Using stale or partial data can prolong timelines or impede proper testing, adding risk to development efforts. Unfortunately, it takes the average IT organization up to several weeks of coordinated effort to provision a new copy of a database. Even refreshing a downstream database from a production source takes most organizations up to a week of cross-team effort.
Production databases are generally backed up to disk and eventually to tape to enable recovery from failures. Physical disaster recovery protection is also often deployed for increased availability. However, most organizations still lack protection for edge cases like logical corruption, which physical disaster recovery solutions would simply propagate because they copy data as is. Even though such failures are rare, their probability grows with time and scale. Near-term recovery options like flashback storage provide only a small window of quick recovery. Once outside that window, recovery is slow and business downtime costs can quickly run into the millions.
According to the Standish Group, average downtime costs for major database-driven enterprise applications exceed $10,000 per minute. Specific applications, such as those used for trading in the financial sector, can cost far more: over $70,000 per minute. Even at the low end, a two-day recovery period would add up to $10,000/min × 60 min/hr × 48 hr = $28,800,000 in downtime costs.
Standish Group – Downtime Cost Report
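The downtime arithmetic above can be reproduced with a short calculation. The following is a minimal sketch in Python; the function name `downtime_cost` is purely illustrative, and the rates are the per-minute figures quoted in this paper.

```python
def downtime_cost(cost_per_minute, hours):
    """Return the total downtime cost in dollars for an outage lasting `hours` hours."""
    minutes = hours * 60
    return cost_per_minute * minutes


# Two-day (48-hour) recovery at the low-end and high-end rates cited in this paper.
low_end = downtime_cost(10_000, 48)
high_end = downtime_cost(70_000, 48)
print(f"Low end:  ${low_end:,}")
print(f"High end: ${high_end:,}")
```

The low-end result matches the $28,800,000 figure above; the same two-day outage at the trading-application rate comes to roughly seven times that amount.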
Downstream database copies are generally an afterthought when it comes to security or availability. Data backups are less common in pre-production environments, yet outages can certainly set back development projects. Additionally, copies that are proliferated for project-specific uses (such as functional testing) are often unaccounted for, rarely decommissioned, and generally persist in the environment, creating significant risk of audit failures and data breaches.
Delphix Platform Overview
The Delphix solution answers a simple question: why make and move all these copies? What if a single, virtual authority could stay synchronized with production? From that single authority, all downstream copies could be served from a shared footprint—eliminating not only redundant infrastructure but also all of the operational time and complexity required to shuffle fresh data from place to place.
More importantly, virtualizing non-production databases can empower IT to better serve the business by delivering on critical requests from lines of business to improve applications that drive sales or operational efficiency. What if developers or business analysts could provision or refresh databases in a self-service model through standard web browsers?
With database virtualization, enterprises can spend less on infrastructure, while simplifying operational IT complexity. More importantly, Delphix provides these powerful benefits with no changes required to production—enabling a high return on investment, at very low risk.
Delphix software installs as a virtual machine, on premises or on cloud infrastructure, using SAN storage, so it provides maximum flexibility for current and future infrastructure standards. Once installed, Delphix links to physical production databases via standard APIs and looks like another application to the database. Delphix asks the physical source database to send a copy of all its file and log blocks to the Delphix Server, which shrinks the data down to as little as 25% of the original size.
After the initial loading, Delphix maintains synchronization with source databases based on policy—e.g. once daily or within seconds of the last transaction. Once linked, Delphix maintains a TimeFlow of the source database—a rolling record of file and log changes retained by policy (e.g. two weeks). From any time within that retention window, a virtual database (VDB) can be instantly provisioned from the Delphix Server. VDBs are served from the shared storage footprint of the source database TimeFlow, so no additional hardware is required.
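The retention behavior described above can be pictured with a small model. This is only a sketch under stated assumptions: the `TimeFlow` class, its method names, and the daily sync schedule below are hypothetical and are not the Delphix API; they simply illustrate a rolling window from which any retained point in time can be requested.

```python
from bisect import bisect_left, insort
from datetime import datetime, timedelta


class TimeFlow:
    """Hypothetical model of a rolling, policy-retained record of sync points."""

    def __init__(self, retention):
        self.retention = retention      # e.g. timedelta(days=14)
        self.sync_points = []           # sorted datetimes of completed syncs

    def record_sync(self, when):
        """Register a completed sync, keeping the list sorted."""
        insort(self.sync_points, when)

    def expire(self, now):
        """Drop sync points that have aged out of the retention window."""
        cutoff = now - self.retention
        first_kept = bisect_left(self.sync_points, cutoff)
        del self.sync_points[:first_kept]

    def can_provision(self, target):
        """True if `target` lies between the oldest and newest retained sync points."""
        if not self.sync_points:
            return False
        return self.sync_points[0] <= target <= self.sync_points[-1]


now = datetime(2012, 6, 5, 12, 0, 0)
flow = TimeFlow(retention=timedelta(days=14))
for days_ago in range(20):              # one sync per day for 20 days
    flow.record_sync(now - timedelta(days=days_ago))
flow.expire(now)

print(flow.can_provision(now - timedelta(days=3, seconds=5)))   # inside the window
print(flow.can_provision(now - timedelta(days=16)))             # aged out
```

With a two-week policy, the request for a point three days back succeeds, while the request for a point sixteen days back falls outside the retained window.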
Multiple VDBs can be provisioned from any point in time in a TimeFlow, down to the second. With built-in log synchronization, multiple databases can be provisioned all at the exact same point in time, which is critical for applications that require federated or time-consistent datasets (e.g., SAP). Once provisioned, VDBs are independent, read-write databases, and changes made to VDBs by users or applications are written to new, compressed blocks managed by the Delphix software appliance.
Delphix virtualizes the data files and logs that make up a database, enabling a business to share those data blocks across lifecycle environments like development, testing, QA, and operational reporting. Users can do anything they normally do to a database (add or drop tables, change schemas, etc.), and users and applications see no difference between a VDB and a physical database. Delphix currently supports Oracle 10g and 11g on Linux, Solaris, HP-UX, and AIX; support for other databases and versions is forthcoming.
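The idea that VDBs share source blocks while writing their own changes to new blocks is commonly implemented with copy-on-write. The sketch below assumes that technique for illustration only; the `Snapshot` and `VirtualDatabase` classes are hypothetical names, and block compression is omitted for brevity.

```python
class Snapshot:
    """Immutable set of blocks captured from a source database (hypothetical model)."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)     # block number -> bytes

    def read(self, block_no):
        return self._blocks[block_no]


class VirtualDatabase:
    """Read-write view over a shared snapshot; only changed blocks are stored privately."""

    def __init__(self, snapshot):
        self.snapshot = snapshot
        self.private_blocks = {}        # blocks written by this VDB only

    def read(self, block_no):
        # Prefer this VDB's own version of the block, else fall back to the shared one.
        if block_no in self.private_blocks:
            return self.private_blocks[block_no]
        return self.snapshot.read(block_no)

    def write(self, block_no, data):
        # Writes never touch the shared snapshot, so other VDBs are unaffected.
        self.private_blocks[block_no] = data


source = Snapshot({0: b"customers", 1: b"orders", 2: b"invoices"})
dev = VirtualDatabase(source)
qa = VirtualDatabase(source)

dev.write(1, b"orders-v2")
print(dev.read(1), qa.read(1), len(dev.private_blocks))
```

Here the development VDB sees its own change, the QA VDB still sees the original block, and only one extra block is stored for the two environments combined.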
Delphix 3-Tier Architecture
When it comes to databases, Delphix virtualizes the core: the data files and logs of a database. Delphix Server architecture has three technology tiers:
§ Database Virtualization: abstracts database snapshots and log files to present fully functional, high-performance, read/write VDBs to database servers
§ Storage Optimization: efficiently manages the underlying storage to minimize the amount of capacity necessary in a shared VDB environment
§ Cloud Automation: web UI enables self-service VDB provisioning and ongoing management.
Tier 1: Database Virtualization Layer
The Delphix Database Virtualization layer establishes and maintains synchronization of data between source databases and a Delphix Server. This creates a resource called a TimeFlow: a window of changes recorded for source databases that can be used to provision or refresh virtual databases from any retained point in time. A TimeFlow can be synchronized on demand, on a scheduled daily basis as a batch process (e.g., at 2 AM each night), or in near-real time.
KEY FEATURES
§ Deployment in minutes on standard x86 servers
§ Sync with production databases via standard APIs
§ Provision virtual databases (VDBs) in seconds
§ Synchronize multiple databases to any point in time
§ Maximize performance with quality of service controls
§ Offload production workloads
§ Rapidly restore and recover databases
§ Reduce storage requirements by over 10x
LogSync: Time Machine for Databases
The Database Virtualization layer also provides fully integrated log management via LogSync: synchronization of changed log blocks, retention of logs by policy, and automatic application of appropriate logs via TimeFlow. This allows end users to provision or refresh VDBs with high granularity, down to the second.
LogSync provides powerful flexibility and recoverability for production environments. Application outages often result in loss or corruption of data in database tables. Repairing or recovering data can be a complicated and time-consuming process, if recovery is feasible at all. For instance, if a customer-facing application crashes, it may result in data loss to a portion of the application data (e.g., current open orders). When the production application returns, it may be some time before the data loss is identified. By that time, new data has been entered into the database. Historically, businesses might try to recover the data from a backup and then manually apply logs to get access to the right dataset at the right point in time. If logs are not archived separately, or backup data cannot be recovered, recovery may not be possible.
With Delphix LogSync, it becomes trivial to access multiple points in time for a database, down to the second, eliminating the hours or days it takes to recover backup data and manually apply log files from archives. The Database Virtualization layer enables a business to access multiple time points quickly and easily (e.g., seconds before failure), verify correct data, and then repair data through standard means, reducing the risk of data loss and downtime for production applications.
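Point-in-time access of this kind generally works by starting from the most recent snapshot at or before the requested time and replaying logged changes up to that time. The sketch below illustrates that general technique only; the function `provision_at`, the key-value representation of database state, and the sample order data are all assumptions made for illustration, not a description of Delphix internals.

```python
from datetime import datetime


def provision_at(snapshots, log_records, target):
    """Rebuild state as of `target` from a base snapshot plus replayed log records.

    snapshots:   list of (time, state_dict) tuples
    log_records: list of (time, key, value) tuples, sorted by time
    """
    candidates = [s for s in snapshots if s[0] <= target]
    if not candidates:
        raise ValueError("no snapshot at or before the requested time")
    base_time, base_state = max(candidates, key=lambda s: s[0])

    state = dict(base_state)            # never modify the shared snapshot
    for when, key, value in log_records:
        if base_time < when <= target:  # apply only changes up to the target second
            state[key] = value
    return state


snapshots = [(datetime(2012, 6, 1, 2, 0, 0), {"order_42": "open"})]
log_records = [
    (datetime(2012, 6, 1, 9, 0, 0), "order_43", "open"),
    (datetime(2012, 6, 1, 9, 30, 15), "order_42", "lost"),   # the failure
]

before = provision_at(snapshots, log_records, datetime(2012, 6, 1, 9, 30, 14))
after = provision_at(snapshots, log_records, datetime(2012, 6, 1, 9, 30, 15))
print(before["order_42"], after["order_42"])
```

Requesting the state one second before the failure returns the intact order, which can then be compared against the damaged state and repaired through standard means.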
Businesses architect database systems for performance, so virtualization of databases must address performance concerns. The Delphix Server has been designed to operate as a caching tier to augment the I/O performance of the storage subsystem assigned to the Delphix application.
Designed for Performance
The majority of I/O requests for databases are serviced by the memory in the database server where database executables run. Delphix extends the benefits of the memory in database servers by servicing I/O requests quickly from the memory in the Delphix Server. With one or more VDBs launched, many block requests will be serviced by Delphix memory, especially because VDBs have a high probability of sharing data blocks. As a result, Delphix can provide high performance even when provisioning VDBs over NFS.
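The benefit of a shared memory cache can be illustrated with a simple least-recently-used (LRU) cache keyed by block. This is a sketch under assumptions: the paper does not state which eviction policy Delphix uses, so LRU is chosen here only as a familiar example, and the `SharedBlockCache` class is hypothetical.

```python
from collections import OrderedDict


class SharedBlockCache:
    """LRU cache of storage blocks shared by every VDB that reads through it."""

    def __init__(self, capacity, read_from_storage):
        self.capacity = capacity
        self.read_from_storage = read_from_storage   # callable: block id -> bytes
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)         # mark as most recently used
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1
        data = self.read_from_storage(block_id)
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)           # evict least recently used
        return data


storage = {n: bytes([n]) * 4 for n in range(4)}
cache = SharedBlockCache(capacity=2, read_from_storage=storage.__getitem__)

cache.read(0)    # first VDB reads block 0: served from storage
cache.read(0)    # second VDB reads the same shared block: served from memory
print(cache.hits, cache.misses)
```

Because VDBs provisioned from the same source tend to request the same blocks, the second read never reaches the storage subsystem.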
Once primed by use, the Database Virtualization cache can provide more performance than would otherwise be possible for the underlying storage subsystem, especially for transaction workloads. Solid-state disks (SSDs) can be added to the Delphix Server to further boost the shared performance cache. Not only does Delphix caching technology service shared read requests, it also logs data to quickly commit writes and minimize inefficient disk spindle movements, while preserving data consistency in failure cases. Along with compression and decompression on the fly, Delphix conserves spindle movements for the underlying storage, maximizing performance for concurrent, consolidated VDB workloads.
Finally, performance levels can be guaranteed through intuitive quality of service settings, allowing businesses to consolidate with confidence, knowing that minimum performance can be guaranteed for developers or other users of VDBs.
Set Quality of Service Per VDB
Tier 2: Storage Optimization Layer
Storage Optimization comprises the second tier of the Delphix Server technology architecture. Modern storage arrays have long used block mapping and snapshot technologies to provide efficient storage for different points in time. Delphix wraps these mature snapshot and cloning technologies with application awareness and intelligence, abstracting key features, such as log management and synchronization across copies, so they can be managed by database and application teams. Delphix Storage Optimization adds to snapshot efficiency by identifying database block boundaries and compressing databases at the block level, providing as much as 2-4x additional efficiency while maintaining the ability to access data block by block, which is critical for performance.
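Compressing each block independently is what keeps block-by-block access possible: a single block can be decompressed without touching its neighbors. The sketch below illustrates this with Python's standard `zlib` module. The 8 KB block size and the use of zlib are assumptions for illustration; the paper does not specify the compression algorithm Delphix uses.

```python
import zlib

BLOCK_SIZE = 8192   # assumed database block size, in bytes


def compress_blocks(data):
    """Compress `data` one block at a time so each block stays independently readable."""
    return [
        zlib.compress(data[offset:offset + BLOCK_SIZE])
        for offset in range(0, len(data), BLOCK_SIZE)
    ]


def read_block(compressed_blocks, block_no):
    """Decompress a single block without touching any other block."""
    return zlib.decompress(compressed_blocks[block_no])


# Three blocks of repetitive, table-like sample data.
data = (b"ORDER-0001 " * 3000)[:BLOCK_SIZE * 3]
blocks = compress_blocks(data)

stored = sum(len(b) for b in blocks)
print(len(blocks), stored < len(data))
print(read_block(blocks, 1) == data[BLOCK_SIZE:2 * BLOCK_SIZE])
```

The achievable ratio depends entirely on the data; the highly repetitive sample here compresses far better than a typical database would.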
Furthermore, Delphix filters incoming data streams, eliminating temporary, empty, or scratch blocks and driving even more data reduction. Storage arrays lack sophisticated application awareness; as a result, they generally cannot refresh data on top of an existing volume without requiring redundant storage at the point of refresh. With the Database Virtualization layer and the Storage Optimization layer, Delphix can stay synchronized and enable data refreshes, while maintaining storage efficiency. As a result, Storage Optimization provides the following benefits:
§ TimeFlow Reduction: up to 30x data reduction for storing multiple points in time for a database compared to storing full copies
§ VDB Consolidation: 10 to 20 concurrent, consolidated VDBs per Delphix Server
§ Compression and Filtering: up to 4x data reduction, even over snapshot and cloning technologies
§ Sync Ratio: up to 10x less data moved to refresh from source databases (Delphix only requests changed blocks), reducing load on production databases and networks
§ Refresh Efficiency: maintains efficiency even while staying synchronized or refreshing data from one or more source databases.
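The filtering and changed-block behavior listed above can be sketched as a single pass over incoming blocks. This is an illustration under assumptions: the paper does not say how changed blocks are detected, so the digest comparison below is a stand-in technique, and the function name `blocks_to_sync` is hypothetical.

```python
import hashlib


def digest(data):
    """Fingerprint of a block's contents."""
    return hashlib.sha256(data).hexdigest()


def blocks_to_sync(previous_digests, current_blocks):
    """Return only the blocks worth transferring on this sync.

    previous_digests: block id -> digest recorded at the last sync
    current_blocks:   block id -> current bytes on the source
    """
    selected = {}
    for block_id, data in current_blocks.items():
        if not data.strip(b"\x00"):
            continue                        # filter out empty (all-zero) blocks
        if previous_digests.get(block_id) == digest(data):
            continue                        # unchanged since the last sync
        selected[block_id] = data
    return selected


previous = {0: digest(b"customers"), 1: digest(b"orders")}
current = {
    0: b"customers",        # unchanged
    1: b"orders-v2",        # changed
    2: b"\x00" * 16,        # empty block
    3: b"invoices",         # new
}
print(sorted(blocks_to_sync(previous, current)))
```

Only the changed block and the new block are selected, so two of the four source blocks are transferred.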
Tier 3: Cloud Automation Layer
Storage Optimization provides hard cost savings, but the Delphix Server Cloud Automation layer may actually provide the most strategic business benefits. Today’s application rollouts and development projects often face infrastructure and management delays. If a developer needs another environment to test an unforeseen issue, it can often trigger a chain of approvals, which can take days to weeks of review, or worse, require additional hardware procurement, which can take months or quarters. With Delphix, DBAs can manage private clouds for their databases, where they can easily provision or refresh supporting environments for projects and developers. By lowering the time, complexity, and infrastructure requirements to stand up a new database sandbox, Delphix cuts through organizational complexity and dependencies, even enabling developer or analyst self-service.
Private Cloud: Centralized Management or Self Service
Self-service not only transforms organizational behavior, it can accelerate and enable innovation that would otherwise be lost. Delphix Cloud Automation provides secure user management, support for LDAP, and granular user roles and privileges, so it can be managed centrally by DBAs or provide self-service provisioning for developers and analysts.
In addition, Delphix Cloud Automation has been designed to address all the complex workflows common in application and database projects. In database-related projects, developers may need fresh data from production, QA environments may need copies of development databases, and promotion or cutover may require moving data back into production. These are multi-directional data flows. Delphix automates all directions of data flow for database projects. TimeFlow enables easy refresh or creation of new environments from production.
If QA needs the development copy, a DBA can use the Delphix feature that enables creation of a VDB from another VDB and the delegation feature that provides rights and privileges to specific users in a given environment. Finally, with the V2P (virtual to physical) feature, DBAs can easily promote or push a full, physical copy of a VDB to a production, staging, or UAT environment.
In environments where data security is a concern, users can automatically mask sensitive data, such as Social Security numbers, through integrated support for pre- and post-scripting, obfuscating private information before VDBs become accessible to users. Delphix also automates all the complex, fault-prone parameterization required to provision copies of databases, from changing Oracle SIDs to editing cache or Oracle SGA settings.
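A masking step of this kind might look like the following. This is a minimal sketch, assuming the sensitive values appear as text in the common NNN-NN-NNNN format; the function `mask_ssn` is hypothetical, and a real masking script would operate on database columns rather than free text.

```python
import re

# Matches Social Security numbers written as NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssn(text):
    """Replace every Social Security number in `text` with a fixed placeholder."""
    return SSN_PATTERN.sub("XXX-XX-XXXX", text)


record = "Jane Doe, SSN 123-45-6789, order 12345"
print(mask_ssn(record))
```

The order number is left untouched because it does not match the pattern; only the Social Security number is obfuscated.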
Templates for Configurations and Parameters
With Cloud Automation, these highly complex and error-prone configurations can be stored as templates, which can be easily re-applied when provisioning or refreshing VDBs. The end result is a solution designed to consolidate all the dynamic, lifecycle copies of databases created in business environments, while reducing the time and complexity of provisioning to nearly zero.
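Conceptually, a configuration template is a set of default parameters that is re-applied at provisioning time with a few per-environment overrides. The sketch below shows that idea only; the `apply_template` function, the template keys, and the sample values are hypothetical and do not reflect the actual Delphix template format.

```python
def apply_template(template, overrides):
    """Return provisioning parameters: template defaults plus per-VDB overrides."""
    unknown = set(overrides) - set(template)
    if unknown:
        # Reject parameters the template does not define, to catch typos early.
        raise KeyError(f"parameters not defined in template: {sorted(unknown)}")
    params = dict(template)     # copy so the stored template is never modified
    params.update(overrides)
    return params


# Hypothetical template covering the kinds of settings mentioned in this paper.
DEV_TEMPLATE = {
    "oracle_sid": "DEVDB",
    "sga_target": "2G",
    "db_cache_size": "512M",
}

qa_params = apply_template(DEV_TEMPLATE, {"oracle_sid": "QADB"})
print(qa_params)
```

Rejecting undefined parameters is a deliberate choice in this sketch: a mistyped setting fails at provisioning time instead of silently producing a misconfigured copy.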
High Availability Architecture
A Delphix Server can be replicated to another Delphix Server, either in the same datacenter or across the WAN to an alternate site, with minimal bandwidth requirements. If a source Delphix Server fails, VDBs can be easily re-provisioned by the target Delphix Server with a few simple commands. In addition, by selectively replicating only masked VDBs, an enterprise can manage across different network zones for production and non-production environments.
Replicate Locally or Across Sites
Besides replication, Delphix enables network multi-path support for network redundancy, SAN multi-path support for SAN redundancy, optional software mirroring, and end-to-end data integrity checking. The Delphix stack itself has been designed with component availability and an internal architecture that enables online updates and patches for components. System-level upgrades can be performed with a fast restart. As a standards-based application, Delphix itself can be easily backed up via NDMP to all major third-party backup applications. Additionally, Delphix is delivered as a virtual or soft appliance and can be part of a VMware cluster. In this configuration, another layer of high availability can be put in place by using vMotion to stand up an identical Delphix instance on a second machine in the event of a hardware failure.
Delphix is in Oracle’s standard ISV and certification programs. By policy, Oracle does not certify storage snapshot, cloning, and file system technologies, which are well understood and mature, and Delphix presents VDBs over standard protocols such as NFS. Oracle performs data integrity checks as part of standard operations, so consistency and integrity of VDBs are verified at every provisioning point—a far higher standard to meet than backup or archive products.
Finally, in most enterprise environments, lifecycle copies of databases for development and testing often do not have the same level of protection as production systems, in terms of replication, snapshots, or frequency of backups. Yet these copies embody a great deal of work and potential business value. Delphix enables the easy setup of TimeFlow policies for both source databases and VDBs, which provides down-to-the-second, instant recovery for production and non-production copies. Along with Delphix Server replication and backup, Delphix can improve the level of availability and reliability in enterprise datacenters, while radically reducing costs and infrastructure.
Summary
Database virtualization is the next logical step in the evolution of the datacenter. In many environments, it may be the single largest opportunity for operational simplification and capital cost savings available. Today, many projects, such as an end-of-quarter adjustment to a CRM application to drive sales efficiency, never become reality due to capital costs and operational overhead. Software infrastructure like server virtualization and Delphix, however, facilitates innovation, allowing businesses to capture potentially lost opportunity value. With Delphix, enterprises can spend less and move faster than the competition.
By combining consolidation and data reduction technologies, Delphix cuts database storage expenditures by over 90% and can even pay for itself instantly if new database storage needs to be procured. Many IT organizations have been forced to do more with less. The ability to slash complexity and time for provisioning by over 90% frees IT personnel to focus on higher priorities and projects with higher returns. The operational gains and elasticity of more copies without added cost can collectively yield a 5X workforce acceleration benefit. Finally, by enabling instant recovery of databases, Delphix acts as an extended recovery layer over existing data protection investments.
The Delphix virtual appliance can be deployed in under an hour on standard hardware or on cloud infrastructure, providing coverage across today’s and tomorrow’s systems. With Delphix, IT teams can quickly and easily manage a private cloud for databases, enabling self-service provisioning and refresh for developers and analysts. Effectively, Delphix provides the driving benefits of cloud adoption and broader virtualization, while overcoming the key barriers.
© 2012 Delphix Corp. All rights reserved.