

DEDUPLICATION STORAGE

White Paper

Abstract

Conceptually understanding OpenStorage software and the API-based integration with Symantec NetBackup provides a clear view of the business value and technical merits of the integration. This guide moves past the conceptual stage to solution planning and deployment. Best practice guidelines are covered with the goal of eliminating implementation challenges. Knowledge and experience gained from assisting early adopters is logically presented for the overall benefit of those deploying an OpenStorage solution.

Symantec NetBackup OpenStorage
Data Domain Deduplication Storage Best Practices Guide


Table of Contents

1 Introduction
  1.1 Target Audience
  1.2 Executive Summary
2 Planning
  2.1 Naming Conventions
  2.2 Networks
  2.3 Documentation
3 Optimized Duplication
  3.1 Storage Units and Storage Server Access
  3.2 Network Considerations
  3.3 Throttling Optimized Duplication
  3.4 Optimized Duplication Failures
  3.5 Seeding Remote Data Domain Systems
  3.6 Duplication Job Configuration Options
4 Duplication to Tape
  4.1 Tape Creation from the Primary NetBackup Copy
  4.2 Tape Creation from a Non-Primary NetBackup Copy
5 Disaster Recovery
  5.1 Within the Same NetBackup Domain
  5.2 To a Different NetBackup Domain
6 Additional References
7 Conclusion
8 Appendix – Migration to OpenStorage
  8.1 Multiple Protocols on One Data Domain System
  8.2 Existing Backups – Retain or Duplicate?
  8.3 Are Storage Lifecycle Policies Required?
  8.4 NetBackup Policy Modification
  8.5 Legacy Replication
  8.6 Deleting Legacy Storage Units


1 Introduction

Data Domain deduplication storage with OpenStorage software is not difficult to install or configure. Deployment is straightforward in most environments. However, deployments involving multiple sites and a complex environment may experience issues with naming conventions, network infrastructure, or site generated documentation detailing the installation. Therefore, OpenStorage implementations should be well planned and documented so that they can be deployed more quickly with fewer challenges when compared to the use of ad-hoc techniques.

Deployment is often followed by a series of trials or a period of testing intended to prove that the solution is functioning as planned. In this guide, OpenStorage best practices are examined and discussed to assist in eliminating the bottlenecks associated with deployment and functional testing of the solution.

1.1 Target Audience

System users and vendor staff associated with and performing the OpenStorage deployment are encouraged to use this guide to take advantage of substantial real world knowledge gained from assisting other customers. A prerequisite understanding of OpenStorage terms and concepts is recommended and can be gained by reviewing the following documentation:

• Data Domain OpenStorage Primer

• Data Domain OpenStorage (OST) User Guide

• Veritas NetBackup™ Shared Storage Guide

1.2 Executive Summary

Backups and the creation of duplicate backup copies with OpenStorage are the focus of this document. Network configurations, optimized duplication, and disaster recovery are examined. Recommended best practices, as well as strategies that are not recommended are covered with the goal of enhancing OpenStorage solution planning and deployment.

OpenStorage software provides API-based integration between Data Domain storage systems and NetBackup. The API gives NetBackup visibility into the properties and capabilities of the Data Domain storage system, control of the backup images stored in the system and Wide Area Network (WAN) efficient replication to remote Data Domain storage systems.

Supported with NetBackup 6.5 and higher, OpenStorage enabled Data Domain storage systems and the Symantec NetBackup OpenStorage Option provide key enhancements for disk-based data protection strategies:

• NetBackup optimized duplication - Backup image duplication based on Data Domain deduplication and WAN efficient replication that is controlled, monitored, and cataloged by NetBackup.

• Integrated NetBackup reporting of Data Domain replication job status.

• Recovery of replicated backup images in their entirety or at a granular level via the NetBackup user interface.

• Sharing of OpenStorage storage units among heterogeneous NetBackup media servers.

• NetBackup media server load balancing, eliminating the need to manually divide client backups across NetBackup media servers utilizing OpenStorage storage units.

• Tape consolidation – Backup images from remote locations and branch offices can be replicated to a centralized location where they can be duplicated to tape under the control of NetBackup.

2 Planning

Deciding to change naming conventions halfway through a deployment can be painful, even more painful if production backups were executed to previously named components that later need to be deleted such that they can be renamed.

Likewise, reconfiguring portions of the IP network that connect NetBackup media servers and Data Domain deduplication storage systems halfway through a deployment can also create a less than optimal experience when testing with production backups. Combining name changes with network changes is made worse when nothing is properly documented.

While configuration changes are both possible and supported, a production environment may not be the best place to learn these techniques for the first time. Production environments differ from lab environments in that the severity of a situation is likely to be less pronounced in the lab.

Creating a plan and documenting the configuration forms the foundation for a successful deployment and subsequent test phases.

2.1 Naming Conventions

Nomenclature, the assigning of names to OpenStorage specific components, is considered important for numerous reasons. The theme is to use a naming convention that will be easily understood by the user, system engineer, and potentially any support personnel involved with the OpenStorage solution.


2.1.1 Data Domain Hostname

This is the hostname used to identify a system. Hostnames are resolved via DNS or Hosts files. Hostnames are not specifically NetBackup objects, but when used as storage server names they tell the media servers at the TCP/IP level how to connect to the storage server.

• Use the assigned fully-qualified hostnames.

• Do not use IP addresses in place of hostnames when registering storage servers as this will limit the ability to route optimized duplication traffic exclusively through the registered interface.

• Generally speaking, use DNS to resolve hostnames to IP addresses that are routable through the environment.

• Use Hosts files to resolve those same hostnames to non-routable IP addresses.

• Do NOT create secondary hostnames to associate with alternate or local IP interfaces.

High performance Data Domain systems can accommodate data transfer rates in single or multi-stream modes that exceed the maximum performance of 1 GbE (Gigabit Ethernet) networks. Using hostnames based on different network interfaces to increase the total available bandwidth to a single Data Domain system configured as a storage server is explicitly not recommended. The reasoning behind this as well as the recommended best practice remedy is covered in the “Networks” section of this document.

2.1.2 Storage Server

This is a logical object defined within NetBackup that “points” to a Data Domain system. NetBackup communicates with the storage server and uses credentials supplied by the tpconfig utility to request use of a disk pool for backup and restore operations.

• There should be only one storage server defined per Data Domain system.

• Use the Data Domain system’s fully qualified hostname as the storage server name.

• This name must be unique across the enterprise.

• The OpenStorage plug-in uses standard TCP/IP name resolution to find the corresponding Data Domain system.

At present, a single Data Domain system can potentially be configured as a storage server more than once within a single NetBackup domain. This would be true in cases where a given NetBackup media server could connect to the Data Domain system using different network names over different interfaces. The recommended best practice for this challenge is covered in the “Networks” section of this document.

Additionally, a single Data Domain system can be illegally configured as a storage server on more than a single NetBackup installation. Currently, this is an unsupported configuration. Symantec NetBackup operates under the premise that a NetBackup master server “owns” the storage capacity associated with a specific Data Domain system configured as a storage server.

2.1.3 Logical Storage Unit

Also referred to as an “LSU”, a Logical Storage Unit is a disk target within a storage server:

• Create one LSU per Data Domain system.

• Name the object with a “-lsu” extension to allow for easy identification as a logical storage unit object.

At present, multiple LSUs can be configured on a single Data Domain system. This can create conflicts for advanced NetBackup features such as media server load balancing, intelligent capacity planning, etc. and is not recommended. When combined with other illegal configurations such as multiple storage servers on a single Data Domain system, this can negatively impact the ability of NetBackup to effectively take advantage of such advanced features.

2.1.4 Disk Pool

A disk pool is a NetBackup object that correlates to one or more LSUs.

• Create one disk pool per LSU.

• Name the object with a “-dp” extension to allow for easy identification as a disk pool object.

2.1.5 Storage Unit

This is a logical target that is created within NetBackup. A storage unit that is created of the type “disk” and the disk type of “OpenStorage (Data Domain)” has three properties: a storage unit name, a disk pool target, and a list of media servers.

• Create one storage unit per disk pool.

• Name the object with a “-su” extension to allow for easy identification as a storage unit object type.

• Only media servers that have credentials defined for the storage server can use the storage unit.

• Select either “use any available media server” or a specific list of media servers as appropriate.

The selection of multiple media servers within the storage unit definition effectively enables media server load balancing. Client backups that can potentially connect to two or more media servers can take advantage of media server load balancing, where the least loaded media server (best candidate media server) is selected for use by the NetBackup master server.
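The naming rules in sections 2.1.1 through 2.1.5 can be applied mechanically when planning a deployment. The following is a minimal Python sketch, not part of NetBackup or the Data Domain software. It assumes the short hostname is used as the base for object names (the guide specifies the suffixes but not the base), and the hostname shown in the usage note is hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class OstNames:
    """Planned object names for one Data Domain system."""
    storage_server: str  # fully qualified hostname (section 2.1.2)
    lsu: str             # "-lsu" suffix (section 2.1.3)
    disk_pool: str       # "-dp" suffix (section 2.1.4)
    storage_unit: str    # "-su" suffix (section 2.1.5)


def ost_names(fqdn: str) -> OstNames:
    """Derive one set of names per Data Domain system from its FQDN."""
    try:
        ipaddress.ip_address(fqdn)
    except ValueError:
        pass  # not an IP address, which is what the guide asks for
    else:
        raise ValueError("register storage servers by hostname, not IP address")
    if "." not in fqdn:
        raise ValueError(f"expected a fully qualified hostname, got {fqdn!r}")
    # Assumption: the short hostname serves as the base for object names.
    base = fqdn.split(".", 1)[0]
    return OstNames(
        storage_server=fqdn,
        lsu=f"{base}-lsu",
        disk_pool=f"{base}-dp",
        storage_unit=f"{base}-su",
    )
```

For example, ost_names("dd690-a.example.com") yields the storage server name dd690-a.example.com with dd690-a-lsu, dd690-a-dp, and dd690-a-su, one of each per Data Domain system.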

2.1.6 Storage Unit Group

This is a logical target that is created within NetBackup. A storage unit group is a collection of storage units that can be used based on selection criteria.

• To allow for N+1 redundancy consider the optional use of storage unit groups.

• Create one storage unit group per OpenStorage storage unit using a “-sug” extension.

• The preferred storage unit could be the first storage unit in the list of storage units.

• A second storage unit can be added to the group for use should the first storage unit be rendered non-operational.

• Select the “Failover” storage unit selection algorithm.

The use of storage unit groups is optional. Use of the “Failover” selection algorithm is a best practice as it facilitates sending the same backups to the same Data Domain OpenStorage Server which will equate to a higher data deduplication ratio. In the event that the preferred storage unit enters a non-operational state, backups will be sent to an alternate storage unit. This methodology may be of interest for mission critical or otherwise important backup jobs. Alternatively, you may elect not to use storage unit groups for backups that do not require N+1 redundancy.

2.2 Networks

Varying degrees of network complexity are associated with a given OpenStorage deployment. At a minimum, a single Data Domain system configured as a storage server is network connected to a NetBackup media server. NetBackup optimized duplication adds additional requirements as does a configuration that leverages media server load balancing.

This section reviews sample network topologies:

• NetBackup media server and Data Domain systems sharing a common LAN configured for optimized duplication.

• NetBackup media servers and Data Domain systems in geographically different locations configured for optimized duplication.

• NetBackup media server load balancing with a Data Domain system.

[Figure 1 diagram labels: NetBackup Media Server; LAN; Data Domain OpenStorage Storage Units]

Figure 1: Optimized duplication with a common LAN

Figure 1 shows a simple example of a NetBackup master/media server LAN connected to two Data Domain systems. In this use case both backup and optimized duplication traffic use the same NIC (Network Interface Card) on a given Data Domain system.

[Figure 2 diagram labels: NetBackup Master Server; NetBackup Media Server (two); LAN; WAN; Data Domain OpenStorage Storage Units]

Figure 2: Geographically dispersed configuration

Figure 2 shows a configuration with two NetBackup media servers and two Data Domain systems. Each NetBackup media server has a local network connection to a co-located Data Domain system. A WAN also connects the Data Domain systems. Backup traffic uses the local network connection between a given NetBackup media server and its co-located Data Domain system. Optimized duplication traffic uses a separate NIC on each Data Domain system. This configuration may be preferred in cases where backup and restore data transfer rates require the use of a 10 GbE network, and a lower bandwidth network is able to accommodate optimized duplication traffic.

[Figure 3 diagram labels: NetBackup Client; LAN; NetBackup Media Servers; Data Domain OpenStorage Storage Unit]

Figure 3: NetBackup media server load balancing

Figure 3 shows a NetBackup client that can be backed up through a number of different NetBackup media servers. The OpenStorage storage unit has been configured so that each NetBackup media server can access its resources. This enables NetBackup media server load balancing, where the least loaded media server is used to fulfill a backup request. Additionally, this configuration allows NetBackup to bypass an offline media server when fulfilling a backup or restore request.
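The media server load balancing behavior described for Figure 3 and in section 2.1.5 (use the least loaded media server, bypass any that are offline) can be illustrated with a short Python sketch. This is a conceptual model only. The guide does not describe how NetBackup actually measures load, so the active job count used here is an assumed stand-in, and all names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class MediaServer(NamedTuple):
    name: str
    active_jobs: int      # assumed proxy for "load"
    online: bool = True


def pick_media_server(candidates: Sequence[MediaServer]) -> Optional[MediaServer]:
    """Return the least loaded online media server, or None if all are offline."""
    online = [server for server in candidates if server.online]
    if not online:
        return None
    # min() returns the first server with the lowest job count.
    return min(online, key=lambda server: server.active_jobs)
```

With candidates carrying 5, 2, and 0 active jobs where the idle server is offline, the sketch selects the server with 2 jobs, which mirrors the bypass behavior described above.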

Typical deployments may employ a combination of local and geographically dispersed components that leverage NetBackup media server load balancing as well as optimized duplication.

2.2.1 Problematic Configurations

The high data transfer rates achievable with Data Domain DD690 systems can easily exceed the available bandwidth of a single 1 GbE network connection. To date, challenges have been associated with attempts to achieve high data transfer rates while circumventing the need for a 10 GbE backup network.

One creative solution utilizes multiple 1 GbE network connections from a NetBackup media server to a single DD690 system. Although single stream performance would still be bound to the 1 GbE limitations of approximately 125 MB/s, the potential aggregate data transfer rate would be much greater.

[Figure 4 diagram labels: NetBackup Media Server; Eth0, Eth1, Eth2, Eth3; Data Domain OpenStorage Storage Unit]

Figure 4: Attempting to overcome the single 1 GbE bottleneck

This configuration requires using four unique network names for a single Data Domain system. As a result it also requires four unique storage server instances within NetBackup. Adding four unique LSUs to the configuration yields the ability to create four unique disk pools on the single Data Domain system. The four disk pools are used in configuring four storage units. This configuration results in numerous issues and should be avoided.

• Capacity reporting within NetBackup is skewed as NetBackup believes that there are four separate physical storage servers.

• Manual assignment of NetBackup policies to one of four storage units is required in order to route backups over a specific network interface.

• NetBackup media server load balancing does not function normally. Storage unit groups can be created that utilize “load balance” selection criteria to overcome this challenge.

• Administrative complexity and overhead are increased with additional network names, storage servers, LSUs, disk pools, storage units, and possibly storage unit groups.

Other solutions that may pose similar issues are worth noting.

• Combining an OpenStorage storage unit and a basic disk storage unit on the same Data Domain system can create capacity reporting issues. While the simultaneous use of all Data Domain supported protocols is possible, NetBackup intelligent capacity management will not be aware of space allocated to basic disk or VTL (Virtual Tape Library) operations.

[Figure 5 diagram labels: NetBackup Media Server; Data Domain OpenStorage Storage Unit; Disk Pool = DataDomain-dp; Basic Disk = /backup/nfs; Basic Disk = /backup/cifs; LSU = DataDomain-lsu; NFS Export = /backup/nfs; CIFS Share = /backup/cifs]

Figure 5: Combining OpenStorage and basic disk storage units

Using a single Data Domain system as an OpenStorage storage unit and a basic disk storage unit (NFS mount or CIFS share) is not recommended. NetBackup assumes complete and total ownership of any OpenStorage storage unit space. This example is also applicable to the sharing of a Data Domain system between OpenStorage and VTL protocols.

• Using an OpenStorage Storage Unit for NetBackup catalog backups.

Figure 6: NetBackup limitations

NetBackup catalog backups cannot be written to “DiskPool” type storage units, which include OpenStorage storage units. This NetBackup limitation is known to exist in product versions 6.5.0 through 6.5.2. The limitation may eventually be removed in a future version of NetBackup.

• Using an OpenStorage Storage Unit in more than one NetBackup domain.

[Figure 7 diagram labels: NetBackup Master Server 1; NetBackup Master Server 2; LAN; Multiple NetBackup domains; Data Domain OpenStorage Storage Unit; Storage server configured for use in multiple NetBackup domains]

Figure 7: Unsupported storage server in multiple NetBackup domains

Using OpenStorage storage units in a multiple NetBackup master server configuration appears attractive as two sites could potentially replicate to each other and effectively serve as disaster recovery vehicles for each other. The configuration is not supported by NetBackup, however, as only one NetBackup master server can effectively own or control a given storage server.

2.2.2 Recommended Configurations

Best practice recommendations are based on known reference deployments that exhibit desirable behavior and performance characteristics. Simplicity is preferred over complexity. Ease of deployment, simplified administration, and predictable results have yielded these general themes:

• Data Domain recommends interconnecting NetBackup media servers and Data Domain systems using a dedicated backup area network.

There is no fundamental reason to commingle NetBackup client network traffic with the network that connects NetBackup media servers and storage servers. Whenever possible, the network used for NetBackup media server to Data Domain system communications should be segregated from other production networks.

While not always possible based on customer criteria and pre-existing NetBackup media server and network infrastructure deployments, the use of a dedicated backup network is preferred when compared to mixed use network configurations.

[Figure 8 diagram labels: 10 GbE Private Network]

Figure 8: Dedicated backup network

Dedicated backup networks provide a number of tangible benefits.

• Dedicated backup networks segregate NetBackup media server and storage unit traffic from other network traffic. Contention issues are constrained to backup and recovery jobs. Known available bandwidth can be managed from the perspective of achieving aggressive data protection and recovery service levels.

• Dedicated backup networks lay the foundation for a scalable infrastructure should data protection network bandwidth requirements change over time.

• Data Domain recommends the use of a 10 GbE network infrastructure in cases where single stream or aggregate data transfer rates in excess of 125 MB/s are required between a single NetBackup media server and the Data Domain system.

When deploying a Data Domain system that can accommodate data transfer rates exceeding the capabilities of 1 GbE networks, the use of a 10 GbE infrastructure overcomes data transfer rate bottlenecks. Single stream performance that exceeds 125 MB/s dictates the need for a 10 GbE network connection. Aggregate performance that exceeds 125 MB/s from a single NetBackup media server also dictates the need for a 10 GbE network connection.

• Network topology without a 10 GbE infrastructure

As discussed previously in the naming conventions section of this document, the use of only one storage server and LSU per Data Domain system is strongly recommended. This restricts the ability to use multiple 1 GbE interfaces between a single NetBackup media server and a single Data Domain system configured as a storage server.

The network connecting NetBackup media servers to a given Data Domain system can incorporate the use of multiple 1 GbE interfaces so long as there is only one connection per NetBackup media server.
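The 125 MB/s figure used throughout section 2.2 follows directly from the 1 GbE line rate: 1,000,000,000 bits per second divided by 8 bits per byte is 125,000,000 bytes per second. The Python sketch below applies the guide's rule that either a single stream or an aggregate rate above that figure, measured from one NetBackup media server, calls for a 10 GbE connection. The function name and inputs are hypothetical planning aids, and the threshold is the raw line rate, so real-world throughput on 1 GbE will be somewhat lower.

```python
# 1 GbE line rate expressed in MB/s (decimal megabytes): 1e9 bits/s / 8 / 1e6.
GBE_LINK_MB_S = 1_000_000_000 / 8 / 1_000_000  # 125.0


def required_link(single_stream_mb_s: float, aggregate_mb_s: float) -> str:
    """Suggest the link type between one media server and the Data Domain system.

    Both rates are the planner's own targets for a single NetBackup media
    server; exceeding 125 MB/s on either one indicates 10 GbE.
    """
    peak = max(single_stream_mb_s, aggregate_mb_s)
    return "10 GbE" if peak > GBE_LINK_MB_S else "1 GbE"
```

A media server that needs 60 MB/s per stream but 300 MB/s in aggregate is therefore a 10 GbE candidate, while one that stays at or below 125 MB/s on both measures can use a single 1 GbE connection.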

[Figure 9 diagram labels: 1 GbE Private Network (four)]

Figure 9: Recommended use of multiple 1 GbE networks

In the example shown in Figure 9, each of four NetBackup media servers connects to a specific NIC on a Data Domain system configured as a single storage server. Each NetBackup media server is configured to use DNS or a local host file such that the storage server name resolves to a specific interface on the Data Domain system. This configuration accommodates NetBackup media server load balancing as it utilizes a single storage server, single disk pool, and a single storage unit. By default the storage unit is defined to allow all four NetBackup media servers to use the shared disk pool resource. This 1 GbE topology imposes limits on maximum single stream as well as aggregate data transfer rates from any single NetBackup media server to the Data Domain system. The combined data transfer rate of all NetBackup media servers can result in an aggregate data transfer rate that seeks to better utilize resources and achieve the maximum throughput possible on the Data Domain system.

• Optimized duplication traffic can use the same network connection as the NetBackup media server, or it can use an alternate NIC.

Backup and recovery data streams are fully inflated, where every byte of the stream passes over a network connection. Optimized duplication replicates deduplicated data between source and destination targets, and typically requires only a fraction of the network bandwidth consumed by backup or recovery jobs. The choice in deciding what network interface to use for optimized duplication is usually based on deployment requirements.

In cases where optimized duplication traffic flows between geographically different locations, some customers have chosen to use a separate dedicated network connection. This connection links source and destination Data Domain systems specifically for the purpose of replication controlled by NetBackup initiated optimized duplication. User requirements to track WAN link usage may also favor this approach.

[Figure 10 diagram labels: 10 GbE Private Network (two); 1 GbE Public Network]

Figure 10: Separate replication network for optimized duplication

Using a separate network for replication controlled by NetBackup optimized duplication is optional, and is usually deployed in cases where the source and destination Data Domain systems reside in geographically different locations. Note this also serves to separate regular backup and restore network traffic from replication traffic that may be traveling over a wide area network.

2.3 Documentation

With multiple NetBackup media servers, multiple Data Domain systems, and the potential use of multiple networks combined with different geographical locations, the importance of documenting the deployed solution cannot be overemphasized. Proper documentation enables various site and vendor groups including management, data protection administrators, and network administrators to understand and maintain the deployed solution. Should the need arise to modify, alter, or enhance the solution, documentation lays the groundwork for moving forward. Should the need for technical support or other assistance be required, documentation can assist in rapid problem isolation and resolution.

• Topology Diagram: This basic diagram consists of a map of physical components labeled using the recommended naming conventions. Also included are the individual networks and IP addresses of the components. This common sense approach makes it possible for others within or outside of the organization to quickly understand the overall view of the deployed solution should one person be on vacation or otherwise unable to assist when needed.

• Data Collection: Collecting and recording relevant configuration information is consistent with the creation of best practice documentation. On the NetBackup master server and each NetBackup media server used for the OpenStorage solution the following commands (or their equivalent) should be executed:


3 Optimized Duplication

Simple in principle, optimized duplication is also simple to configure once requirements are understood.

3.1 Storage Units and Storage Server Access

The first item to consider is source and destination storage units. If both reside on the same NetBackup media server, that NetBackup media server needs credentials to access both the source and destination Data Domain OpenStorage servers. When the source and destination storage units reside on different NetBackup media servers, the NetBackup media server initiating the optimized duplication job requires credentials to access both the source and destination storage servers. What this means is that even though a particular NetBackup media server may never directly back up to or recover from a particular Data Domain OpenStorage server, it still needs access credentials when the Data Domain OpenStorage server is an optimized duplication destination.

[Figure 11 diagram labels: NetBackup Media Server (two); 10 GbE Private Network (two); 1 GbE Public Network; Optimized duplication source storage unit; Optimized duplication destination storage unit]

Figure 11: Separate source and destination NetBackup media servers

Figure 11 depicts optimized duplication between two OpenStorage storage units. The NetBackup media server initiating an optimized duplication job needs to have credentials to access both the source and destination OpenStorage storage units.
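The credential requirement in section 3.1 reduces to a simple check: the initiating media server must hold credentials for both the source and the destination storage server. The Python sketch below models that rule for planning purposes. The credentials mapping is a hand-maintained record (for example, transcribed from site documentation), not something read from NetBackup, and every name in it is hypothetical.

```python
from typing import Dict, Set


def can_initiate_optimized_duplication(
    media_server: str,
    source_storage_server: str,
    destination_storage_server: str,
    credentials: Dict[str, Set[str]],
) -> bool:
    """Return True if media_server holds credentials for both storage servers.

    credentials maps a media server name to the set of storage server
    names for which credentials have been defined on it.
    """
    granted = credentials.get(media_server, set())
    return (
        source_storage_server in granted
        and destination_storage_server in granted
    )
```

A media server that only ever backs up to the local Data Domain system still fails this check until credentials for the remote system are added, which is the situation the section warns about.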

Credentials are set by means of the NetBackup tpconfig command on each NetBackup media server requiring access to a given OpenStorage storage unit. This allows the NetBackup media server to use the OpenStorage storage unit for backup and recovery jobs, as well as for optimized duplication. In cases where optimized duplication uses a destination OpenStorage storage unit that may be geographically distant from the NetBackup media server initiating optimized duplication, the storage unit definition should not

For host configuration:4uname –a

4hostname

For network configuration:4ifconfig

4cat /etc/hosts

4route

4cat /etc/resolv.conf

4grep hosts /etc/nsswitch.conf

For NetBackup configuration:4/usr/openv/netbackup/bin/admincmd/bpstulist –L

4cat /usr/openv/netbackup/bp.conf

4cat /usr/openv/netbackup/bin/version

For OpenStorage configuration:4/usr/openv/netbackup/bin/admincmd/bpstsinfo –pi –stype

DataDomain

4/usr/openv/netbackup/bin/admincmd/nbdevquery –listdp

4/usr/openv/netbackup/bin/admincmd/nbdevquery –listdv –stype DataDomain

4/usr/openv/netbackup/bin/admincmd/nbdevquery –liststs

4/usr/openv/volmgr/bin/tpconfig –dsh –all_hosts

4/usr/openv/volmgr/bin/tpconfig –dsh –stype DataDomain

On each Data Domain system the following commands should be executed:

For host configuration:
• hostname
• system show version
• user show list

For network configuration:
• net show settings
• net show hardware
• net show dns
• net hosts show
• net aggregate show
• net config

For OpenStorage configuration:
• ost status
• ost lsu show
• ost show user-name
• ost show connections


The net effect of network bandwidth throttling may impact recovery point objectives for disaster recovery. Other effects might include the queuing of jobs as optimized duplication jobs contribute to destination storage unit concurrent jobs. Once a storage unit’s maximum concurrent jobs parameter has been reached, new jobs requiring the use of the storage unit will be queued as they await storage unit resource availability.

Based on service level requirements, it might be possible to limit the quantity of network bandwidth required by limiting the amount of data that needs to be replicated. One possibility worth considering is the optimized duplication of only full backups, where incremental backups are not duplicated. Different storage lifecycle policies can be employed for full and incremental backups should this methodology align with service level objectives.

3.4 Optimized Duplication Failures

When an optimized duplication job fails, duplication job retry will attempt to use conventional duplication. This equates to sending the fully inflated complete backup image from the source OpenStorage storage unit through one (or possibly two) NetBackup media server(s) to the destination OpenStorage storage unit.

At present there is no known way to configure a NetBackup storage lifecycle policy such that a failed optimized duplication job will not be retried. However, a manually driven NetBackup utility “nbstlutil” can be used to cancel pending duplication operations.

3.5 Seeding Remote Data Domain Systems

New deployments looking to utilize optimized duplication may have network bandwidth limitations between sites that could cause the first week of jobs to run substantially longer. One solution to this dilemma is to seed the remote Data Domain OpenStorage Server locally, and then after a week or so relocate the Data Domain system to the intended site.

The challenge this presents is that optimized duplication failures may occur while the remote system is in transit. A solution is to adjust the appropriate NetBackup storage unit so that no failures occur. By setting the storage unit “maximum concurrent jobs” parameter to zero, optimized duplication jobs will enter a queued state instead of failing, as shown in figure 13.

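When the destination storage unit is geographically distant from the NetBackup media server that initiates optimized duplication, the storage unit definition should not allow the geographically distant NetBackup media server to use the storage unit for backup or recovery jobs. This is easily accomplished from within the NetBackup storage unit dialog window as shown in figure 12.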

Figure 12: Storage Unit dialog – Use only specific media servers

In the example shown in figure 12, the NetBackup storage unit named “dd120b-stu” has been configured to allow only the NetBackup media server named “NBU65OST_Media2” to use it for backup and restore jobs. The NetBackup media server named “NBU65OST_Media1” has credentials to access the storage unit for the purpose of initiating optimized duplication jobs.

3.2 Network Considerations

Replicating backup images under the control of NetBackup optimized duplication includes the ability to use the same network that is used for backup and restore operations, or to use a different network.

The network used for optimized duplication is based on network name resolution on the source Data Domain system. The destination Data Domain system is known to the source Data Domain system based on the IP address supplied by DNS, or by a local hosts file entry. Populating the source Data Domain system’s hosts file with the desired IP address of the destination Data Domain system is all that is required to use a specific NIC and network. If this value is not present, NetBackup will perform optimized duplication using the same network it uses to access the source and destination Data Domain systems.
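The name resolution behavior described above can be sketched as a small lookup model. This is a hypothetical Python illustration rather than Data Domain code: the host name `dd120b`, both IP addresses, and the function name are assumptions, and the sketch assumes that a hosts file entry, when present, takes precedence over DNS.

```python
def replication_target_ip(dest_name, hosts_entries, dns_records):
    """Pick the IP the source system would use to reach the destination.

    Assumption: a local hosts entry, when present, wins over DNS.
    """
    if dest_name in hosts_entries:
        return hosts_entries[dest_name]
    return dns_records[dest_name]


# Hypothetical addresses: DNS returns the address on the backup network,
# while the hosts entry points at a dedicated replication network.
dns_records = {"dd120b": "192.0.2.20"}
hosts_entries = {"dd120b": "10.10.10.20"}

print(replication_target_ip("dd120b", hosts_entries, dns_records))  # 10.10.10.20
print(replication_target_ip("dd120b", {}, dns_records))             # 192.0.2.20
```

With the hosts entry in place, replication traffic is directed at the address on the dedicated network; without it, the DNS-supplied address is used.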

3.3 Throttling Optimized Duplication

Throttling is controlled at a global level on each Data Domain system. The rate of network bandwidth used by the replication process can be limited based on various criteria, such as a scheduled or temporary rate. Caution should be exercised, as throttling back network bandwidth consumption may lengthen optimized duplication job run times.


• Data Domain recommends using NetBackup storage lifecycle policies to control optimized duplication.
• Storage lifecycle policies facilitate setting different retention periods for backup and duplication jobs.
• Data Domain recommends using “Fixed” retention periods versus “Staged capacity managed” and “Expire after duplication” retention period types.
• Data classification can be used in conjunction with storage lifecycle policies if desired, but doing so is not a requirement for optimized duplication.

Figure 14: Storage lifecycle policy

The example shown in figure 14 contains a backup storage destination equal to storage unit “dd120a-stu” with a fixed retention period of one week. The example also includes a duplication destination equal to “dd120b-stu” with a fixed retention period of six months. The storage lifecycle policy has optionally been assigned a data classification value equal to “Platinum”. When the duplication task is executed it will result in an optimized duplication job that appears in the NetBackup activity monitor.

Storage lifecycle policy duplication relies upon certain default settings that control the point at which a duplication job will be launched. An optional configuration file can be created to customize lifecycles to run duplication jobs based on customer requirements. Out-of-the-box defaults for NetBackup version 6.5 include:

• MIN_KB_SIZE_PER_DUPLICATION_JOB 8192
• MAX_KB_SIZE_PER_DUPLICATION_JOB 25600
• MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 30

Optimized duplication testing with backup images less than 8 GB in size may be delayed by up to 30 minutes as a result of the default settings. In a “test only” environment it may make sense to lower “MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB” to a value of less than 30 minutes.

In some environments the default settings may be appropriate. The default settings can be adjusted by creating a “LIFECYCLE_PARAMETERS” file. The “Veritas NetBackup™ Administrator’s Guide, Volume I” should be consulted for additional information before adjusting these values.
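The effect of these settings can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not NetBackup's actual scheduling logic: the function names are hypothetical, the file layout of one “NAME value” pair per line is assumed, the 5-minute override is an arbitrary test value, and `batch_size` is expressed in the same unit as the size threshold.

```python
# Defaults quoted above for NetBackup 6.5.
DEFAULTS = {
    "MIN_KB_SIZE_PER_DUPLICATION_JOB": 8192,
    "MAX_KB_SIZE_PER_DUPLICATION_JOB": 25600,
    "MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB": 30,
}


def render_lifecycle_parameters(overrides=None):
    """Return file text with one 'NAME value' pair per line (assumed layout)."""
    settings = dict(DEFAULTS)
    settings.update(overrides or {})
    return "\n".join(f"{name} {value}" for name, value in settings.items()) + "\n"


def should_launch(batch_size, oldest_image_age_minutes, settings=DEFAULTS):
    """Simplified model: launch once the batch is large enough or old enough."""
    big_enough = batch_size >= settings["MIN_KB_SIZE_PER_DUPLICATION_JOB"]
    waited_long_enough = (
        oldest_image_age_minutes
        >= settings["MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB"]
    )
    return big_enough or waited_long_enough


# Test-only override: force small batches out after 5 minutes (hypothetical).
test_settings = {**DEFAULTS, "MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB": 5}

print(render_lifecycle_parameters({"MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB": 5}))
print(should_launch(100, 10))                 # False with the defaults
print(should_launch(100, 10, test_settings))  # True with the 5-minute override
```

In this model a small test image that has waited 10 minutes is held back under the defaults but released once the forced-launch interval is lowered, which mirrors the delay described above.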

Figure 13: Storage Unit dialog – Maximum concurrent jobs

Figure 13 shows the storage unit “Maximum concurrent jobs” parameter set to a value of zero. Use this technique when relocating a Data Domain system from a local site to a final destination site so that related optimized duplication jobs will enter a queued state instead of failing.

3.5.1 Storage Unit Maximum Concurrent Jobs

Data Domain systems have NVRAM, memory, model and DD OS (Data Domain Operating System) dependent recommendations regarding maximum write, read, and replication stream counts. The best practice recommendation is to set the NetBackup storage unit “maximum concurrent jobs” parameter to equal the combined backup and replication stream count values for a given Data Domain system based on NVRAM, memory, model, and DD OS version. Restore jobs are not considered in this setting as they are typically performed infrequently as required.

Stream count information is available from Data Domain for setting the NetBackup storage unit “maximum concurrent jobs” parameter.

NetBackup monitors the number of jobs running on a particular storage unit in order to enforce the maximum concurrent jobs value. Monitored jobs include backup, duplication, and restore operations. The storage unit “maximum concurrent jobs” parameter, however, does not distinguish between backup and duplication jobs. There is no facility within NetBackup that can be used to limit backup and duplication jobs separately. Solution architects should be cautious and monitor the number of simultaneously executing backup and duplication jobs based on prescribed limits in order to assure optimal performance.
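The arithmetic behind this recommendation, and the reason separate monitoring is still needed, can be sketched in Python. The stream counts below are hypothetical placeholders, not published Data Domain figures, and the function names are illustrative only.

```python
def max_concurrent_jobs(backup_streams, replication_streams):
    """Combined backup and replication stream counts; restores are excluded."""
    return backup_streams + replication_streams


def within_limits(running_backups, running_duplications,
                  backup_streams, replication_streams):
    """Check a job mix against the per-type recommendations.

    NetBackup enforces only the combined total, so this per-type check is
    something an administrator would have to monitor separately.
    """
    return (running_backups <= backup_streams
            and running_duplications <= replication_streams)


# Hypothetical stream counts; real values come from Data Domain for the
# specific model, memory, NVRAM and DD OS version.
BACKUP_STREAMS = 20
REPLICATION_STREAMS = 10

limit = max_concurrent_jobs(BACKUP_STREAMS, REPLICATION_STREAMS)
print(limit)  # 30

# 28 backups + 2 duplications fits under the combined limit of 30,
# yet exceeds the recommended backup stream count of 20.
print(28 + 2 <= limit)                                            # True
print(within_limits(28, 2, BACKUP_STREAMS, REPLICATION_STREAMS))  # False
```

The second check shows a job mix that the storage unit setting would admit even though it exceeds the per-type recommendation.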

3.6 Duplication Job Configuration Options

NetBackup storage lifecycle policies provide an ideal vehicle for initial backups as well as the ability to create duplicate backup images. Storage lifecycle policy duplication tasks initiate optimized duplication jobs on OpenStorage storage units.


Take for example the execution of a storage lifecycle policy used in conjunction with optimized duplication. The initial backup to the first Data Domain OpenStorage Server will be “copy 1”, and it will also be the primary copy. Optimized duplication of this copy to a second Data Domain OpenStorage Server will result in the creation of “copy 2”. “Copy 2” will be a non-primary copy as long as “copy 1” is still being retained, or until “copy 2” is manually set to primary.

Users seeking to create tape-based copies from “copy 2” may elect to allow “copy 1”, the primary copy, to expire. At the point where “copy 1” expires, “copy 2” is set to primary by NetBackup. Execution of a properly configured NetBackup Vault Option policy will then use “copy 2” to create “copy 3”. A NetBackup Vault policy accomplishes this objective by selecting only backups that occurred far enough in the past that “copy 1” no longer exists.

Figure 16: NetBackup Vault Option profile

The NetBackup Vault Option provides the ability to specify granular selection criteria of backup images for duplication. In the example shown above, backups started between 16 and 15 days prior to the execution of a Vault job will be selected for inclusion. This strategy allows “copy 1” of a backup image to expire, and “copy 2” (which has been set to primary by NetBackup) to be used to fulfill a duplication request.
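The relationship between the Vault selection window and the retention of “copy 1” can be checked with simple date arithmetic. The Python sketch below is illustrative only: the Vault run time and the 14-day retention for “copy 1” are assumed values, the function names are hypothetical, and retention is measured from the backup start time as a simplification.

```python
from datetime import datetime, timedelta


def vault_window(run_time, start_days_ago=16, end_days_ago=15):
    """Return (start, end) of the backup-start window selected by the profile."""
    return (run_time - timedelta(days=start_days_ago),
            run_time - timedelta(days=end_days_ago))


def copy_expired(backup_time, retention_days, now):
    """True once the copy's fixed retention period has elapsed."""
    return backup_time + timedelta(days=retention_days) <= now


run_time = datetime(2008, 12, 31, 6, 0)  # hypothetical Vault job start
start, end = vault_window(run_time)
print(start, end)  # 2008-12-15 06:00:00 2008-12-16 06:00:00

# Assumed retention for copy 1: 14 days. Even the newest backup in the
# window (started exactly 15 days ago) has already lost copy 1.
print(copy_expired(end, 14, run_time))  # True
```

Choosing a window that begins after the retention period of “copy 1” has elapsed is what ensures that every selected image has “copy 2” as its primary copy.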

4.2 Tape Creation from a Non-Primary NetBackup Copy

The “bpduplicate” command can be executed via command line or a script based solution with the copy number (cn) parameter. This enables the creation of tape based duplicate backup images from non-primary NetBackup image copies. See the NetBackup Command Guides for additional information.
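A script based approach might assemble the command line as in the following Python sketch, which only builds the argument list and does not run anything. The backup ID and the tape storage unit name are hypothetical, and the `-backupid` and `-dstunit` options are given as recalled from the NetBackup Command Guides; they should be verified against the guide for the installed version before use.

```python
BPDUPLICATE = "/usr/openv/netbackup/bin/admincmd/bpduplicate"


def bpduplicate_args(backup_id, copy_number, dest_storage_unit):
    """Build (but do not run) a bpduplicate command line.

    -cn selects the source copy; -backupid and -dstunit are assumptions
    to be checked against the NetBackup Command Guides.
    """
    return [
        BPDUPLICATE,
        "-backupid", backup_id,
        "-cn", str(copy_number),
        "-dstunit", dest_storage_unit,
    ]


# Hypothetical backup ID and tape storage unit name.
args = bpduplicate_args("client1_1229000000", 2, "tape-stu")
print(" ".join(args))
```

Building the argument list separately from execution makes it easy to log or review the exact command before a script submits it.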

4 Duplication to Tape

Requirements to retain long term copies of backup images on removable tape media are easily integrated with OpenStorage solutions. Multiple means of accomplishing this objective currently exist, with additional functionality likely to be forthcoming in new NetBackup versions.

• NetBackup supports the duplication of backup images, not media or specific tape cartridges.
• NetBackup supports the creation, cataloging, and tracking of up to ten copies of a particular backup image.
• The default value for “Maximum backup copies” is two.
• The “Maximum backup copies” parameter can be adjusted with the NetBackup administrative GUI via “Host Properties > Master Servers > Global Attributes”.

Figure 15: Maximum backup copies

By default the NetBackup global attribute “Maximum backup copies” is set to a value of two. As shown in figure 15, altering the value to accommodate additional copies is easily performed via the administrative GUI.

4.1 Tape Creation from the Primary NetBackup Copy

The default NetBackup behavior in versions 6.5.2 and earlier is to create all duplicates using the primary backup image copy as source data. Regardless of backup image copy number, the primary copy (by default) will be used to fulfill the duplication request.

Four basic methods of initiating a duplication job using the primary copy currently exist within NetBackup:

• Ad-hoc GUI based duplication job initiation using the NetBackup catalog utility
• Command line or script driven “bpduplicate” commands
• Storage lifecycle policies
• NetBackup Vault Option

The underlying requirement for these techniques is that the desired source backup image is the primary NetBackup copy, which may not always be the case.


Setting a non-primary copy to primary is easily accomplished by right-clicking the image and then selecting “Set Primary Copy” from the pop-up menu.

Figure 19: Setting primary copy

Setting a non-primary backup image copy to primary can be accomplished via a pop-up menu. This enables the use of a particular backup image to fulfill restore requests. This can be useful in cases where recovery from a specific geographical location is desired or in cases where the original primary copy is not available.

5.2 To a Different NetBackup Domain

OpenStorage currently provides no automated ability to synchronize or share NetBackup catalog data across multiple NetBackup domains. OpenStorage backups can be imported into foreign NetBackup domains using standard NetBackup import procedures.

6 Additional References

Data Domain secure access customer support site: https://my.datadomain.com/
• OpenStorage (OST) User Guide
• OpenStorage (OST) Quick Start

Symantec: http://www.symantec.com/business/support/documentation.jsp?language=english&view=manuals&pid=15143
• NetBackup Administration Guides
• NetBackup Shared Storage Guide
• NetBackup Vault Administrators Guide
• NetBackup Command Guides

5 Disaster Recovery

Using optimized duplication to create duplicate backup images assists in accommodating a variety of disaster recovery scenarios.

5.1 Within the Same NetBackup Domain

This scenario assumes that the NetBackup instance in which recovery is to be performed is the same NetBackup instance that performed the initial backup and subsequent optimized duplication job.

The first copy of a backup image created by NetBackup is known as “copy 1”. When initially written, “copy 1” is also known as the “primary copy”. In NetBackup version 6.5.2 and earlier, the primary copy is the copy that is used to fulfill duplication requests as well as restore requests. An optimized duplication job will create “copy 2” of the backup image.

If “copy 1” has a retention period of two weeks and “copy 2” has a retention period of one year, “copy 1” will be the primary copy until it expires, at which point “copy 2” will become the primary copy.

In the case where “copy 1” has not expired and is still the primary copy, and a need arises to recover data from “copy 2”, “copy 2” must be set to primary such that it can be used to fulfill the restore request. Setting a particular copy to primary can be performed via the NetBackup GUI catalog utility.
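The primary copy rules described above can be modeled in a few lines of Python. This is a simplified sketch, not NetBackup's catalog logic: the dates are hypothetical, the function name is illustrative, and it assumes that when no manual selection applies the lowest-numbered unexpired copy is the primary.

```python
from datetime import date


def primary_copy(copies, today, manual_primary=None):
    """Return the copy number used for restores and duplication.

    copies maps copy number -> expiration date. A manually selected primary
    is honoured while it is unexpired; otherwise the lowest-numbered
    unexpired copy is treated as primary (a simplifying assumption).
    """
    live = sorted(n for n, expires in copies.items() if expires > today)
    if not live:
        return None
    if manual_primary in live:
        return manual_primary
    return live[0]


# Hypothetical image backed up on 2008-12-01: copy 1 kept two weeks,
# copy 2 kept one year.
copies = {1: date(2008, 12, 15), 2: date(2009, 12, 1)}

print(primary_copy(copies, date(2008, 12, 10)))                    # 1
print(primary_copy(copies, date(2008, 12, 10), manual_primary=2))  # 2
print(primary_copy(copies, date(2008, 12, 20)))                    # 2
```

The three calls correspond to the three situations in the text: “copy 1” still retained, “copy 2” manually promoted, and “copy 1” expired.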

Figure 17: NetBackup catalog “copy 1” – primary copy

The NetBackup catalog utility can be used to select the primary copy of a backup image.

5.1.1 From a Non-Primary Backup Copy

Assuming a particular backup image has been duplicated, “copy 2” can be used to fulfill a restore request if it is the primary copy. “Copy 2” can be queried from the NetBackup catalog utility in the same way as “copy 1”.

Figure 18: NetBackup catalog “copy 2” – not primary

The NetBackup catalog utility can be used to select “copy 2” of a backup image.


8.2 Existing Backups – Retain or Duplicate?

At some point during the migration process, all ongoing backup jobs will theoretically use newly configured OpenStorage storage units. Backups performed to legacy basic disk or VTL media manager storage units will still exist based on the retention periods of the backup policies used to perform the backups. Should these backups be duplicated to new OpenStorage storage units and then be expired, or should they simply be left to expire naturally?

When the retention period of backups written to legacy basic disk or VTL media manager storage units is relatively short, natural expiration may be the logical choice as it imposes no additional administrative overhead on the migration process. Other factors to consider are the number of backups that would need to be duplicated, and the quantity of data that would need to be duplicated. Duplicating a large number of backup images or a large quantity of backup data may not be realistic based on the additional workload it will impose on the backup infrastructure.

Duplicating existing backups to new OpenStorage storage units has the key advantage of allowing the legacy infrastructure to be deleted. Once all backup images on basic disk or VTL media manager storage units have been duplicated to OpenStorage storage units, the old storage units can be deleted.

8.3 Are Storage Lifecycle Policies Required?

Storage lifecycle policies provide a plan-based view of backup and duplication jobs, and can be associated with a data classification rank. While not explicitly required, they provide a simple and effective vehicle to perform backups followed by optimized duplication between two storage servers.

In cases where a single Data Domain system is configured as a storage server, or where optimized duplication between storage servers will be performed using alternative means, the use of a storage lifecycle policy is optional.

8.4 NetBackup Policy Modification

Once OpenStorage storage units are configured, existing NetBackup policies that use basic disk or VTL media manager storage units can be updated to use the OpenStorage storage units. Change an existing NetBackup policy by selecting the appropriate OpenStorage storage unit or storage lifecycle policy and saving the policy. All backups performed with the NetBackup policy from this point forward will use the new OpenStorage storage unit. Optimized duplication will be invoked automatically by NetBackup if configured within a selected storage lifecycle policy.

7 Conclusion

Data Domain support for Symantec NetBackup OpenStorage advances the ability to use disk as disk, stores more data on disk with inline deduplication, and simplifies the creation of backup copies with optimized duplication.

Creating duplicate backup copies with optimized duplication enables advanced disaster recovery strategies. Disaster recovery copies of backup images are created faster, and are available at the disaster recovery location sooner when compared to tape-based solutions.

8 Appendix – Migration to OpenStorage

Existing deployments using Data Domain systems as basic disk storage units or VTL media manager storage units may at some point be migrated to an OpenStorage solution. This appendix item explores the questions and strategies involved with migration.

• Can a Data Domain system be used simultaneously as both an OpenStorage storage unit and a basic disk storage unit?
• Can a Data Domain system be used simultaneously as both an OpenStorage storage unit and a VTL media manager storage unit?
• Should existing backup images be retained until their expiration, or should they be duplicated to a new OpenStorage storage unit?
• Are storage lifecycle policies required for use with OpenStorage?
• How are existing NetBackup policies modified to use OpenStorage storage units?
• Should legacy Data Domain replication be de-configured?
• Can existing replicas be imported into the NetBackup catalog?

8.1 Multiple Protocols on One Data Domain System

While not specifically recommended, the simultaneous use of multiple protocols on the same Data Domain system is supported. One Data Domain system can simultaneously serve as a storage server, basic disk storage unit, and VTL. This functionality is particularly useful when migrating from basic disk or VTL usage to an OpenStorage solution.


DEDUPLICATION STORAGE www.datadomain.com

Data Domain | 2421 Mission College Blvd., Santa Clara, CA 95054 | 866-WE-DDUPE, 408-980-4800

Copyright © 2008 Data Domain, Inc. All rights reserved. Data Domain, Inc. believes information in this publication is accurate as of its publication date. This publication could include technical inaccuracies or typographical errors. The information is subject to change without notice. Changes are periodically added to the information herein; these changes will be incorporated in new editions of the publication. Data Domain, Inc. may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time. Reproduction of this publication without prior written permission is forbidden. The information in this publication is provided “as is”. Data Domain, Inc. makes no representations or warranties of any kind, with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Data Domain and Global Compression are trademarks of Data Domain, Inc. All other brands, products, service names, trademarks, or registered service marks are used to identify the products or services of their respective owners.

WP-OSTBPG-1208

8.5 Legacy Replication

At the point where legacy replication is no longer required, it can be disabled. OpenStorage optimized duplication requires that the Data Domain replication license remain intact on both source and destination storage servers.

8.5.1 Can Legacy Replicas Be Imported?

Existing replicated backup images that were created without OpenStorage continue to have the same limitations they had before the OpenStorage solution was implemented. If the source copy of a particular backup image is cataloged by NetBackup, any replica copies of the backup image are in effect already cataloged and cannot be imported. It is important to note that legacy replicas are not known to NetBackup as “copy 2” of a particular backup image. These replicas were created without the knowledge of NetBackup and have the same backup identifier as the source image.

Note that duplicate copies of backup images created with OpenStorage optimized duplication do not need to be imported because the NetBackup catalog is already aware of their existence and is tracking them accordingly.

8.6 Deleting Legacy Storage Units

Once the OpenStorage solution has been implemented and all pre-existing backup images have expired, the legacy storage unit components can safely be deleted from the NetBackup configuration.