WHITE PAPER
NFS FILE MIGRATION TO A DELL EMC ISILON CLUSTER
Guidance for optimal data migration of NFS workflows
ABSTRACT
This paper provides technical information and recommendations to help you migrate
data from a single NFS protocol workflow on another NAS vendor to a Dell EMC Isilon
storage cluster. It includes the best practices for planning, setting up, and executing the
migration.
December 2016
The information in this publication is provided “as is.” DELL EMC Corporation makes no representations or warranties of any kind with
respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose.
Use, copying, and distribution of any DELL EMC software described in this publication requires an applicable software license.
DELL EMC2, DELL EMC, the DELL EMC logo are registered trademarks or trademarks of DELL EMC Corporation in the United States
and other countries. All other trademarks used herein are the property of their respective owners. © Copyright 2016 DELL EMC
Corporation. All rights reserved. Published in the USA. 12/16 White Paper H12517
DELL EMC believes the information in this document is accurate as of its publication date. The information is subject to change without
notice.
DELL EMC is now part of the Dell group of companies.
Table of Contents
INTRODUCTION ........................................................................................................................6
Assumptions ...................................................................................................................................... 6
Audience ........................................................................................................................................... 6
Prerequisites ..................................................................................................................................... 7
The challenge of data migration ........................................................................................................ 7
Risk management ............................................................................................................................. 7
Data integrity ..................................................................................................................................... 8
Data availability ................................................................................................................................. 8
PROJECT PHASES AND METHODOLOGY OVERVIEW ........................................................8
Discovery and planning phase .......................................................................................................... 8
Key aspects of the planning phase ............................................................................................................ 8
Migration approach and requirements ........................................................................................................ 9
Migration methodology ............................................................................................................................. 10
Migration toolset selection ........................................................................................................................ 10
Testing the migration methodology ................................................................................................. 10
Data migration testing .............................................................................................................................. 10
User acceptance testing ........................................................................................................................... 11
Cutover methodology testing ................................................................................................................... 11
Rollback strategy testing .......................................................................................................................... 11
Executing the migration ................................................................................................................... 11
Data transfer ............................................................................................................................................ 12
Cutover .................................................................................................................................................... 12
Acceptance .............................................................................................................................................. 12
Rollback ................................................................................................................................................... 12
Repetition ................................................................................................................................................ 12
Post-migration ................................................................................................................................. 13
SINGLE PROTOCOL NFS DATA MIGRATION ..................................................................... 13
Challenges of a single protocol NFS data migration ........................................................................ 13
Data-specific considerations ........................................................................................................... 13
Migration requirements and customer data collection ..................................................................... 15
Requirements gathering ........................................................................................................................... 15
Current infrastructure and data analysis ................................................................................................... 16
Determine migration methodology ................................................................................................... 16
Migration methodology considerations ..................................................................................................... 16
Migration sequencing ............................................................................................................................... 17
Type of migration ..................................................................................................................................... 18
Host-based migrations ............................................................................................................................. 18
Isilon-based migrations ............................................................................................................................ 20
Migration tool selection and use ............................................................................................................... 21
Migration tools ......................................................................................................................................... 22
rsync ........................................................................................................................................................ 22
Isilon-based migrations – isi_vol_copy ..................................................................................................... 26
Isilon-based migrations from VNX—isi_vol_copy_vnx .............................................................................. 28
MIGRATION PREPARATION ................................................................................................. 29
Infrastructure and environment setup .............................................................................................. 29
Source host preparation .................................................................................................................. 29
Migration host preparation—Source and target access ................................................................... 30
Isilon cluster configuration preparation ............................................................................................ 31
Additional Isilon cluster considerations ............................................................................................ 32
NFS group membership limitation ................................................................................................... 32
Isilon guidelines for large workloads ................................................................................................ 35
MIGRATION APPROACH—TESTING AND PROOF OF CONCEPT .................................... 37
DATA VALIDATION ............................................................................................................... 37
PERFORMANCE .................................................................................................................... 38
USER ACCEPTANCE TESTING—DATA AND WORKFLOW TESTING .............................. 38
START OF MIGRATION EXECUTION ................................................................................... 39
PRE-CUTOVER PREPARATION ........................................................................................... 39
CUTOVER EVENT .................................................................................................................. 40
THE GO OR NO-GO DECISION ............................................................................................. 41
ROLLBACK ............................................................................................................................ 41
MIGRATION EVENT COMPLETION ...................................................................................... 42
STEADY STATE ..................................................................................................................... 42
CONCLUSION ........................................................................................................................ 42
APPENDIX: SAMPLE MIGRATION USE CASE .................................................................... 43
Introduction
This white paper outlines the recommended approach for migrating single protocol Network File System (NFS) data from
other network-attached storage (NAS) systems to a Dell EMC® Isilon® storage cluster. Single protocol NFS data is defined
as data read, written, or modified using the NFSv2 or NFSv3 protocols. This paper includes best practices for Isilon cluster
configuration, tool selection, and host setup to optimize an NFS-based data migration, along with best practices to optimize
performance, management, and support.
Although this paper addresses a single NFS protocol data migration, the approach and many of the best practices
described can be used as a foundation for other types of data migration.
Much of the relevant information for planning, provisioning, and supporting end-user directories on an Isilon storage cluster
is available through white papers and guides from Dell EMC Isilon at http://support.emc.com. As a result, this paper avoids
duplicating this other content and includes only the information that pertains to setting up and operating an Isilon cluster as
a destination for a single protocol NFS data migration.
Assumptions
This document focuses on the data migration of NFS; it does not specifically address the migration to an Isilon cluster of
NFS exports, local users and groups, or any other NFS configuration from another NAS system.
This document should not be used in a multiprotocol migration. A multiprotocol migration requires many additional
considerations, and the specific actions required for such a migration may be different.
Additionally, this paper assumes that:
The source data is accessed only through a single protocol NFS workflow. Further, authentication and authorization for POSIX users (UID/GID) need to be consistent on the source and destination migration clusters. Lightweight Directory Access Protocol (LDAP), Network Information Services (NIS), distributed local files, and Active Directory (AD) with RFC2307 enabled (SFU) are all centralized methods of authentication that should be successfully implemented before the migration begins.
An authoritative and consistent source for user authentication is assumed (for example, AD with RFC2307, LDAP, NIS, and so on). If there are multiple sources of authentication from disparate clusters targeted for migration (that is, several different local /etc/passwd files), they must be inspected for collisions and, if necessary, manually combined to prevent UID and GID collisions. Identity management that prefers a single external directory service is recommended. If necessary, work with the customer to establish a consistent and authoritative set of users and groups.
Workflows will be for NFSv3 with the possibility of adding Server Message Block (SMB) workflows in the future; NFSv4 is not addressed in this document. The Dell EMC Isilon file system will have a “balanced” on-disk identity setting for global access control list (ACL) policy.
Files with POSIX mode bits only are in scope for this document; additional permissions such as SMB ACLs are not covered.
Multiple source clusters may have identical export directories. A directory consolidation plan must be developed to address directory name collisions in the newly created single namespace on the Isilon cluster.
Files and directories have unique permissions that restrict access to the intended users and groups; a typical migration preserves and transfers them. Post-migration permission transformation is not covered in this paper.
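The UID and GID collision check called out above can be sketched with standard tools. The passwd entries below are demo stand-ins created for illustration; in practice you would compare each source filer's /etc/passwd (or `getent passwd` output):

```shell
# Illustrative sketch: detect UID collisions across the passwd files of two
# source systems. The files created here are demo stand-ins; real input would
# be each filer's /etc/passwd or the output of `getent passwd`.
P1=$(mktemp) && P2=$(mktemp)
printf 'alice:x:1001:1001::/home/alice:/bin/sh\n' >  "$P1"
printf 'bob:x:1002:1002::/home/bob:/bin/sh\n'     >> "$P1"
printf 'carol:x:1001:1001::/home/carol:/bin/sh\n' >  "$P2"

# A UID mapped to different user names on different sources is a collision
# that must be reconciled before the migration begins.
collisions=$(cat "$P1" "$P2" | awk -F: '{print $3, $1}' | sort -u \
    | awk '{print $1}' | uniq -d)
echo "colliding UIDs: $collisions"
```

Here UID 1001 is assigned to `alice` on one source and `carol` on the other, so the sketch reports it as a collision to resolve manually.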
Audience
This paper is intended for experienced system and storage administrators who are familiar with file services and network
storage administration.
The document assumes you have a working knowledge of the following:
NAS systems
The NFS storage protocol, as is appropriate for the specific migration requirements
The Isilon scale-out storage architecture and the Dell EMC Isilon OneFS® operating system
Additional Isilon features, including Dell EMC Isilon SmartConnect™, SmartPools® policy management, SnapshotIQ™, and SmartQuotas™
File system management concepts and practices, including provisioning, permissions, and performance optimization
Integration practices for connecting and establishing authentication relationships using centralized sources (LDAP, NIS, local files, and so on)
Basic shell commands, command line operation, and basic shell scripting
While this paper is intended to provide a consolidated reference point for data migrations to an Isilon storage cluster, it is
not intended to be the authoritative source of information on the technologies and features used to provide and support
a file services platform. In the event that additional services are required, Dell EMC IT services are available to assist with
streamlining data migrations, reducing risk, and minimizing impact.
Prerequisites
Some of the features that are described or recommended in this document may require separate per-node licensing from
Dell EMC Isilon. For more information, please contact your Dell EMC Isilon representative.
The challenge of data migration
The migration of a storage system’s data and all the existing user access permissions is a complex process. Moving the
data while limiting downtime and protecting the data can be challenging. While you execute a migration, it is critical that
access to the data is available at all times and that data integrity is ensured to protect against data loss or corruption.
It is critical to understand that a data migration is a unique project. Few environments are the same, and, as a result, each
migration should be considered
a unique event. No preexisting approach will necessarily be appropriate for all migrations. That is not to say that common
approaches cannot be used after you evaluate and understand the requirements of a specific migration. The goal of this
white paper is to introduce the recommended approach to designing and executing
an NFS data migration to an Isilon cluster. Every migration is different; this paper provides some examples and guidance.
Dell EMC Professional Services provides expertise in working with customers to build an individual plan that meets their
needs. Whether you will use Dell EMC Professional Services for the migration or you plan to manage the migration in
house, the following are key areas that must be considered before the start of any migration project:
Investigate the composition of the source data. Is it a deep directory structure or a wide structure with many files per directory? The type and number of files and directories will directly influence how the migration is planned and executed.
Understand the sequence of a migration project—this is critical to its success. The ability to predict and manage the time required to execute the data movement is paramount—it may be the single biggest factor that will affect the project’s success.
Maintaining data availability throughout the lifecycle of the data migration project is also critical. Little of today’s data can be unavailable for days at a time. In order to maintain data availability, you will need a strategy to maintain access to data throughout the migration process.
Risk management
It is not uncommon to have a number of challenges or perceived problems that can potentially act as blocking issues or barriers to executing the migration. With sufficient planning and testing, such perceived risks can be addressed and managed successfully.
Common risks associated with data migrations include the following:
Amount of data (that is, large total volume, high file counts) and the potential for needing an extended period of time and effort to migrate it
Potential for performance impacts to the existing data solution and the customer network during the migration
Maintaining continuous access to the customer’s data throughout the migration
Potential for required changes to the data permission models on the new target system for the migration
Inherent challenges with moving multiple client connections to the new target system
Maintaining a consistent security model after the migration
Execution of the actual cutover event—with many moving elements that can increase the probability of error
This paper helps you understand these challenges and risks, and enables you to develop a data migration methodology
that manages these risks while implementing Isilon best practices to facilitate and optimize the migration.
Data integrity
Data integrity is critical. The data must be moved exactly as it is, and any modification to the data during migration may
impact the availability of the data and the success of the migration. The goal of the project is to ensure that the data is
successfully migrated and that its integrity is not compromised during its movement. In most cases, this includes the
migration of all relevant file metadata, as well as the underlying data blocks. A complete backup of the source data should
be made, and the validity of the backup should be verified before the migration begins.
Cases that demand the strongest integrity guarantees can be addressed with checksums. An MD5 checksum can be
calculated for each file on both the source and destination systems post-migration to verify bit-for-bit integrity.
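As a sketch of that checksum pass, the standard `md5sum` utility can build a manifest on the source and replay it against the target. The temporary directories below are local stand-ins for the mounted source export and Isilon target export:

```shell
# Demo stand-ins for the mounted source export and the Isilon target export.
SRC=$(mktemp -d) && DST=$(mktemp -d)
echo "alpha" > "$SRC/a.txt"
mkdir "$SRC/dir" && echo "beta" > "$SRC/dir/b.txt"
cp -a "$SRC/." "$DST/"                      # stands in for the migration copy

# 1. Build a manifest of per-file MD5 checksums, relative to the source root.
( cd "$SRC" && find . -type f -print0 | xargs -0 md5sum ) > /tmp/migration.md5

# 2. Replay the manifest against the target; md5sum exits non-zero on any
#    mismatch, flagging files that were altered in flight.
if ( cd "$DST" && md5sum --quiet -c /tmp/migration.md5 ); then
    echo "checksum verification passed"
else
    echo "checksum MISMATCH detected" >&2
fi
```

Because the manifest paths are relative, the same file works from either root; on large datasets, expect the checksum pass to add significant read load and wall-clock time on both systems.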
Data availability
Any migration activity will require a transition or cutover from the existing source systems to the new destination systems
so that a customer’s data clients (end users) can continue to access their data once the data has been moved. This
cutover will require a window of time when the data is unavailable to a customer’s end users. Minimizing this time is the
goal of all migrations, and the time needed for this
process is often determined by the type and the amount of data. A number of migration strategies can be employed to
reduce the period when data is unavailable during the cutover.
Project phases and methodology overview
A data migration project should be broken into distinct phases. The goal of the project phases is to develop a robust and
repeatable migration strategy that aids the migration’s execution and leads to a successful migration cutover.
Discovery and planning phase
The goal of the discovery and planning phase is to design a migration methodology and plan that enables you to execute
the project with minimal risk and downtime. Following are components of the discovery and planning phase:
Qualify the project
Identify the migration scope
Understand expectations
Identify risks
Define the timeline
Identify all migration requirements (for example, a rollback plan)
Key aspects of the planning phase
During this phase, a detailed review of the existing source environment and data is undertaken, and then developed into a
plan to migrate the data to the new target Isilon environment. The planning phase should be completed and validated
before the other project phases are started. The key aspects of the planning phase include discovery of the existing
infrastructure, the data, and the Isilon cluster.
Infrastructure discovery
This is the point where the infrastructure of the existing storage system, network architecture, and the network path
between the source data and the Isilon cluster
are evaluated.
If, for example, multiple source filers are to be combined into a single Isilon cluster, then an extensive analysis of existing
exports should occur. Export directory naming collisions (that is, server1:/exports/home and
server2:/exports/home are to be combined from different source filers) should be investigated to verify that the
data and directories contained within them will be able to coexist in a unified export on the target Isilon cluster (for example,
isilon:/ifs/data/home):
server1:/exports/home
/user1
/user2
/user3
server2:/exports/home
/user4
/user5
/user1
If the /exports/home directories are to be unified into a single /home on the Isilon cluster, the duplicate /user1 directory
must be addressed. Possible solutions include combining (if the user is the same) or renaming the directories (if there are
different users), along with making a corresponding UID change, if necessary. Work with the customer to address these
issues and craft a solution that will be minimally disruptive to their existing environment.
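A simple sketch can surface these collisions before consolidation. The directories below are local stand-ins for server1:/exports/home and server2:/exports/home mounted for analysis:

```shell
# Demo stand-ins for server1:/exports/home and server2:/exports/home,
# mounted locally for the export analysis.
S1=$(mktemp -d) && S2=$(mktemp -d)
mkdir "$S1/user1" "$S1/user2" "$S1/user3"
mkdir "$S2/user4" "$S2/user5" "$S2/user1"

# Names that appear under both exports collide in the unified /home and
# must be combined (same user) or renamed (different users) beforehand.
dupes=$( { ls "$S1"; ls "$S2"; } | sort | uniq -d )
echo "colliding directories: $dupes"
```

Matching the example above, the sketch reports /user1 as the name that must be reconciled before the two exports can share a single /home on the Isilon cluster.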
Data discovery
This is the point where you analyze the data and workflows that you plan to migrate and determine how they map to the
target end state on the Isilon cluster.
Quotas
If quotas are utilized on the source volume, they will need to be recreated on
the target Isilon cluster. Implementation of the quotas, however, should not take place until after the migration is complete
to avoid any potential issues while transferring data.
Isilon cluster configuration design and discovery
This is the point where activities such as the design of the Isilon network, disk pools, shares, and authentication can affect
the migration design. The configuration of the cluster is critical to the success of the migration.
The output of the discovery phase influences the migration design and drives the execution of the project.
Migration approach and requirements
The analysis of the data that you collected during the discovery phase drives the migration requirements and the migration
plan. The migration requirements break down into subcategories that ask the questions what, how, and when:
o What—What are you migrating?
All the data or a subset of the data
Replicate the existing data as it is, or transform it during the migration
Copy the data but implement a new security model
Take a hybrid approach
o How—How are you going to migrate the data, security, and workflows?
Tools used to copy the data and security
Cutover strategy; how will client connections be moved?
If the data has a rapid rate of change, how will you accommodate it?
Data is static and can be moved without impact
Full data copies and follow-up incremental copies to gather recently updated data
Clients access this data currently by method x/y/z
Limit access to the old data and redirect during the cutover
o When—When are you implementing the cutover?
A single mass event
Several large cutover events
A series of smaller cutovers sustained over a longer time frame
Rolling migration
Once these requirements have been clearly defined, a migration methodology can be developed to address them.
Migration methodology
Analysis of the migration requirements leads to the development of a migration methodology. The migration methodology
follows a waterfall methodology with phases generally occurring on completion of the prior phase. Although the preparation
for upcoming phases can occur before prior phases are completed, the execution is defined by the completion of its
dependent phase.
Figure 1. Sample migration plan
Figure 1 shows a sample migration plan. The plan addresses how all aspects of the migration are achieved: sequencing,
tools, timing, communication, and implementation. After you develop a migration plan, a proof of concept can help
you evaluate the approach and test the phases of the plan.
Migration toolset selection
After you finish the discovery phase and develop a methodology, you can select a toolset (host based, array based, and so
on) to copy the data and permissions.
Testing the migration methodology
After you develop a migration methodology, you must review, validate, and test the migration plan. A test migration is
usually run on a subset of the data. Running a
test migration is also invaluable in helping to estimate the performance and timing
of a migration.
Data migration testing
Testing the actual data movement process and execution is the first phase in testing the overall methodology. The data
migration testing determines whether the proposed methodology meets the requirements and accomplishes the goals of
the project.
The role of data migration testing is as follows:
Validates the tool selection; does the tool do what you want it to? Does it copy the data and attributes? Does it preserve hard and/or soft links?
Validates the data transfer; is the data moved as expected?
Validates that the permissions are copied over; are the permissions correct, functional, and operational?
Benchmarks the performance of the data transfer; how long does it take to run full and incremental data transfers?
Tests the new data: Is the data available? Are the read/write settings correct? Does the new workflow function correctly?
Gives you the option to experiment with different methods, tools, and flags
Enables you to tune the process to achieve the best results
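One such tool-validation probe is checking that hard and soft links survive the copy. In this sketch, GNU `cp -a` stands in for whichever copy tool is actually under evaluation, and the probe dataset is created locally:

```shell
# Build a small probe dataset containing a hard link and a symlink.
SRC=$(mktemp -d) && DST=$(mktemp -d)
echo "data" > "$SRC/original"
ln "$SRC/original" "$SRC/hardlink"
ln -s original "$SRC/symlink"

# Run the candidate tool; GNU `cp -a` stands in here for the tool under
# evaluation (rsync, isi_vol_copy, and so on).
cp -a "$SRC/." "$DST/"

# Hard links share an inode number, so compare inodes with stat; a
# preserved symlink should still be a link, not a dereferenced copy.
if [ "$(stat -c %i "$DST/original")" = "$(stat -c %i "$DST/hardlink")" ]; then
    echo "hard links preserved"
fi
[ -L "$DST/symlink" ] && echo "symlink preserved"
```

The same pattern extends to other attributes worth validating, such as ownership (`stat -c %u:%g`), mode bits (`stat -c %a`), and modification times (`stat -c %Y`).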
The testing should give you confidence that the data will be accessible and available after all the data
and users are transferred to the new system.
User acceptance testing
Before you execute the full migration and cutover, user acceptance testing (UAT) should be undertaken against the new
storage system and a sample of the migrated data. UAT validates that the data is ready for cutover by checking that:
Data is accessible; users and applications can access the data correctly
Permission models are correct; the required security is applied to the migrated data
Workflows are operational; there are no issues with using the data
Cutover methodology testing
Cutover methodology testing helps determine how you will move client connections—and how the clients will respond to
the cutover. Through testing, you can gauge how long it takes to move the connections, what kind of issues may occur,
and how to troubleshoot any issues. Testing the cutover strategy thoroughly provides feedback on how to execute the final
cutover.
Rollback strategy testing
You should also test your rollback methodology. The rollback testing should validate that your plan to failback or abort a
migration works so that you are prepared in case any issues occur during the cutover. Be sure to validate that access to
the data on the old system can be restored quickly and efficiently without affecting users.
Executing the migration
Once all the methodology and processes have been completed and validated, you can move on to the main migration.
The core migration phases are as follows:
Data transfer—all the data is migrated from the old system to the new system
Cutover—connections and clients are moved to the data on the new storage
Acceptance—the new data source is ratified
Rollback—a process used only if required
Steady state and repeat—the migration phase is considered complete, but additional separate migrations may occur
Post-migration monitoring—the new system and data are monitored following the cutover
Data transfer
A standard approach to data transfer is to execute an initial “full” data copy that moves all the initially identified data,
followed by a series of incremental copies that move only the data that has changed since the full copy ran. This approach
gives you the most flexibility in executing the cutover, because the incremental copies complete substantially faster than
the initial full copy.
Cutover
After data migration, the process of actively moving clients from the old storage system to the new storage system will
occur during a cutover event.
At a high level, the cutover plan includes the following steps:
For the old source—remove write access to ensure that clients are unable to write any new data
Execute a final incremental copy to migrate any remaining data from the old system to the new system
Test new target data and connectivity; selective UAT
Go or no go—decide whether to move forward with the cutover event
Update the client-to-storage connection mechanisms—Domain Name System (DNS), Distributed File System (DFS), virtual IPs (VIPs), and so on
Monitor the new storage system—monitor load and connections as the clients are transferred
Validate clients—review and validate that clients can successfully connect and operate
Validate workflow—verify that business operations work as expected
The cutover is complete
Acceptance
Once the data and client connections have been migrated over to the new storage solution, the storage availability and
workflow acceptance of the new data and storage solution must be validated.
Once you begin writing new data to the new storage system, the ease with which you can roll back to the old storage
system diminishes: any changed data would need to be copied back to the old environment. Unless this newly written data
can be discarded, rewritten, or manually reconciled, Dell EMC Isilon strongly recommends that any rollback be executed
before significant changes are made to the data on the new storage system.
Rollback
Be sure that you have a fully tested rollback plan in place. A rollback may be needed for a variety of reasons:
Client connectivity or storage name resolution issues develop following the cutover.
The final incremental copy is not completed during the outage window, so not all data was migrated.
An unplanned IT outage or issue occurs at the same time as the migration.
Data access on the new storage system is invalid and workflows are impaired.
The goal of a rollback plan is to quickly restore access to the old data storage solution. Assuming that the cutover was
executed correctly, restoring the prior data access should be straightforward, and it should be possible to implement the
data restore with minimal additional disruption. The primary goal is to restore access within the cutover window so that no
additional downtime and interruption to data occurs. It is critical to have a tested rollback plan that can be used if an issue
with the cutover occurs.
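Where clients mount the storage through /etc/fstab rather than automount maps or DNS aliases, the cutover and its rollback each reduce to a one-line change per client. A minimal sketch, with purely illustrative host and export names:

```
# /etc/fstab (client side) after cutover, illustrative names only.
# Rollback: comment the isilon line and restore the oldfiler line.
#oldfiler:/vol/share/proj       /mnt/proj   nfs   defaults   0 0
isilon:/ifs/data/filer1/proj    /mnt/proj   nfs   defaults   0 0
```

Because the mount point is unchanged, client workflows see the same path before and after either change.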
Repetition
After you validate the data transfer through cutover and client acceptance, most migration projects consist of multiple
migration cycles. The methodology can be executed again on different datasets in migration waves that encompass the
entire project.
Post-migration
Following the migration cutover, it is important to monitor both the new storage system and the old storage system. You
should find that client connections are moving to the new system and that active data connections are no longer initiated
on the old storage system. Can clients connect to and work with the new storage system without issues? As connection
counts increase on the new storage system, you should monitor the load and performance, and make performance
adjustments as needed.
You should be monitoring the following items during and after the cutover:
New system: System load and performance, number of connections, movement of users, security, and performance
Old system: Are users still connecting to it? Are there legacy connections being made to it from old applications?
If user quotas were utilized on the source system, it is appropriate to reimplement them on the target system following the
migration.
The Dell EMC Isilon OneFS SmartLock® feature should likewise be implemented post-migration. Files should be committed
only once the data has been completely migrated.
You should have a transition plan for what you will do with the old storage system. Some common approaches are as
follows:
Keep it around for a while but with administrator access only
Provide read-only access for users
Mothball the system while the new system transitions
Decommission it
Purge the data after a defined retention period has been reached
Single protocol NFS data migration
Although this paper addresses a single protocol NFS data migration, the approach and many of the best practices
described can be used as the foundation for other types of data migrations.
Challenges of a single protocol NFS data migration
Moving large amounts of data presents a number of challenges:
It is difficult to perform such a migration without downtime. Most source clusters are already heavily loaded, must keep data available at all times, and operate at near capacity, which is often part of the reason for migrating in the first place. The migration itself, however, can place significant additional load on the source cluster.
A large number of exports may need to be migrated. You must move not only the data but also the exports and export permissions. This introduces a second type of migration (configuration) that must be undertaken during the project.
The consolidation of multiple source filers into a single unified namespace and directory structure can be difficult to manage.
There may be a large number of differently connected clients that require separate cutover and validation events.
NFS exports may be mounted deeper in the exported tree.
There may be restricted exports to specific hosts and unique permissions. Export options must be verified.
There may be a high rate of change. Often large environments contain a large number of concurrently connected clients. In such cases, you must account for the rapid rate of data change during and after cutover.
Data-specific considerations
When you design a migration strategy, you should determine how you would like your data to appear after it has been
migrated.
Consider the scenario of multiple smaller filers being consolidated into a single Isilon cluster. UID/GID collisions
may occur if the source filers are not using a consistent source of authentication. In that case, manual remediation
may be necessary to combine and fix user and group accounts that have duplicate IDs.
For example:
filer1: has users user1 (UID:305), user2 (UID:423), and user3 (UID:424).
filer2: has users user1 (UID:305), user4 (UID:423), and user5 (UID:424).
UID 423 is shared by user2 (on filer1) and user4 (on filer2), so one of them must be assigned a different UID, and
the ownership of that user's files must be corrected during the migration. The same issue occurs with user3 and
user5, who share UID 424. One possible resolution would be to make user4 UID:1423 and change ownership of all
their files prior to the cutover. Likewise, you could make user5 UID:1424 and modify ownership on all of their files.
Keep in mind that OneFS will store all UID/GID information regardless of the source. OneFS does not require NFS
authentication.
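The collision check itself can be scripted from passwd-format dumps of each filer's user database. A minimal sketch (the file names and contents below are illustrative, mirroring the example above):

```shell
# Build sample passwd-format dumps matching the example above.
cat > /tmp/filer1_passwd <<'EOF'
user1:x:305:100::/home/user1:/bin/sh
user2:x:423:100::/home/user2:/bin/sh
user3:x:424:100::/home/user3:/bin/sh
EOF
cat > /tmp/filer2_passwd <<'EOF'
user1:x:305:100::/home/user1:/bin/sh
user4:x:423:100::/home/user4:/bin/sh
user5:x:424:100::/home/user5:/bin/sh
EOF

# Report UIDs that exist on both filers under different user names.
awk -F: 'NR==FNR { uid[$3] = $1; next }
         ($3 in uid) && uid[$3] != $1 {
             printf "UID %s collides: %s vs %s\n", $3, uid[$3], $1
         }' /tmp/filer1_passwd /tmp/filer2_passwd
```

Against the example data this reports collisions on UIDs 423 and 424; the same approach extends to group files for GID collisions.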
Duplicate export paths present a similar issue:
filer1: has exports /vol/share/acct, /vol/share/work, and /vol/share/eng
filer2: has exports /vol/share/acct2, /vol/share/work and /vol/share/engineering
The problem is that /vol/share/work is the same from both source filers. This issue must be discussed with the
customer prior to migration. A plan for directory consolidation must be developed to deal with export path collisions. One
typical solution is to have an additional directory layer that identifies the original source filer:
isilon: would have exports: /ifs/data/filer1/acct, /ifs/data/filer2/acct2, /ifs/data/filer1/work,
/ifs/data/filer2/work, /ifs/data/filer1/eng, and /ifs/data/filer2/engineering
With this methodology, duplicate export paths can safely be consolidated from multiple source filers into a single cohesive
namespace. Clients, however, must be updated to reflect the updated export paths.
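The naming convention above is simple enough to script. A sketch, where target_path is our own illustrative helper, not an Isilon tool:

```shell
# Map a source filer name and export path to a collision-free target path
# by inserting the filer name as an extra directory layer.
target_path() {
    # $1 = source filer name, $2 = source export path
    printf '/ifs/data/%s/%s\n' "$1" "$(basename "$2")"
}

target_path filer1 /vol/share/work   # /ifs/data/filer1/work
target_path filer2 /vol/share/work   # /ifs/data/filer2/work
```

The two colliding /vol/share/work exports now land in distinct target directories.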
The metadata of files, in particular, can add complexity to a migration. You must identify the metadata that you want to
migrate with the data. The following metadata can affect your migration strategy:
File properties such as access time, created time, modified times, and owners
File attributes such as read only or archive (an Isilon cluster does not support the compressed and encrypted attributes)
Extended proprietary file attributes that are in use; these are not supported on Isilon clusters
Local users and groups; are these defined on the files?
Deduplication is in use, or archive stub files or Mac OS X resource forks are present
Other data-specific considerations include:
Date/access time/creation time retention requirements—These may not be preserved across migrations depending on which tool is used, for example, creation time is not preserved with isi_vol_copy.
Symbolic links will break—Depending on where the symbolic links connect to, the underlying paths will probably change after a migration and require rebuilding.
Automount maps will need to be repointed (NIS, NIS+)—Similar to symbolic links, the export paths may change and the hostname may also change.
Character encoding—Verify that it is the same on source and target; international characters in file names may be problematic.
How does the data need to appear post-migration:
Direct replication of all data and attributes
Move the data, then make updates, fix problems, change the security, and so on
Migrate just the data and implement an entirely new security model
Migration requirements and customer data collection
Before you can plan your migration, you must collect requirements.
Requirements gathering
The data migration planning begins with identifying the data that you want to move from the old storage system to the new
storage system. Here is what you need to document:
Current state—what is the current state of:
Source infrastructure
Existing storage platforms
Network design and implementation
Name resolution infrastructure: DNS, DFS, or global namespace
Servers, clients, OS, and applications
Source infrastructure configurations
Volumes
Shares/exports
Access
Authentication
Source data
Logical structure—data layout and directory depth
Is the structure wide or deep?
Physical structure—total size, minimum/maximum/average file size
Number of files
Source data security
Current security model and how file access is enforced
Local users and groups
POSIX permissions
LDAP users and groups
Target state—what will be the target state:
Target infrastructure: Isilon cluster
Network configuration
Target configurations:
Directory layout and structure
Shares/exports
Access and authentication model
Target data
Logical structure—same as the source or new system
Physical structure—same as the source or new system
Target data security
Keep it the same as the current security model
Migrate and change the security model
Move the data and implement a new security model
How to gather the data:
Interview stakeholders
Gather documents: network diagrams, run books, and infrastructure and application details
Create a list of exports
Develop storage reports, and so on
Review the share permissions
Examine the directory structure (shallow versus deep), file composition (small versus large), and the number of files
Current infrastructure and data analysis
Start the migration design phase by collecting the data needed to develop the migration requirements.
Best practice
Create and utilize a standardized data collection and migration planning document, along with a standard target
configuration guide. By using a structured document to gather and collect all your source data and information, you can
identify your migration requirements, which will lead to clear migration design decisions.
Why: This will simplify and consolidate migration planning and implementation.
You need to collect the following information:
The amount of data; the actual file data, not compressed or deduplicated data
If there is deduplicated data, the amount of such data; this number will need to be added to the total
The number of directories and files; identify the directory trees and the quantity of them
The directory structure: shallow and flat, wide and deep, or otherwise
The number of directories with more than 10,000 files in them
The number of exports; are there share name collisions or reuse on multiple source hosts?
The way these exports are used—for home directories, application, or group use
How permissions are applied to source—at the individual or at the group level
The number of source locations; single source system or multiple
How clients access data; protocols and how they resolve storage names
The rate of file changes; how often and where files are changing
Networking architecture; source systems and network between it and the Isilon cluster
Source system load; understand what load the source storage is under and how much additional overhead from the migration would be tolerable
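Much of this inventory can be gathered from any NFS client with standard tools. A sketch run here against a throwaway sample tree; in practice you would point it at the mounted source export:

```shell
# Create a small sample tree standing in for a mounted source export.
SRC=$(mktemp -d)
mkdir -p "$SRC/proj/sub"
echo data1 > "$SRC/proj/file1"
echo data2 > "$SRC/proj/sub/file2"

# Totals that feed the migration plan.
du -sh "$SRC"                                             # total size
echo "files: $(find "$SRC" -type f | wc -l | tr -d ' ')"  # number of files
echo "dirs:  $(find "$SRC" -type d | wc -l | tr -d ' ')"  # number of directories
```

Similar find invocations can flag directories containing more than 10,000 files or compute minimum/maximum/average file sizes.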
Determine migration methodology
After you collect information on the source system, the data, and the infrastructure, you are ready to develop a migration
methodology.
Migration methodology considerations
The text that follows details some of the key elements you must consider as you develop a migration
methodology.
Logical migration design
By analyzing the structure and layout of the source data, you can make logical migration design decisions—structuring the
migration into distinct executable units. A goal of the migration methodology is to identify logical boundaries that facilitate
the cutover of your clients and workflows.
Some logical migration boundaries are as follows:
Hosts, filers, servers, and arrays
Volumes
Exports
Directories—users or groups
Best practice—Define migration boundaries
Identify logical migration boundaries: Identify clear, well-defined data structures for migration and cutover—for example,
entire exports or directories. Be aware of the size of the data inside a migration boundary, as the size of the data affects the
outage window required to complete a cutover.
Why: This best practice organizes the migration into segment waves, making the migration easier to manage.
After you segment the logical boundaries into distinct migration phases, you can address other elements of your workflow,
such as metadata, that you need to migrate.
File attributes and security
Most data migration also includes the migration of the file’s metadata: ownership, access times, creation time, and security
descriptors. Before you can execute your migration, determine how you plan to handle metadata and file security.
Common migration approaches:
Migrate data files on an as-is basis (with no change to permissions or ownership).
Migrate data and permissions, and then adjust the permissions on the destination (recalibrate the permissions).
Migrate the data only. Create new permissions on the destination, or create a new security design.
Migrate away from an existing security model and implement a new model. Dell EMC Isilon recommends that you use a central authentication scheme on Isilon clusters. For example, if the NAS system that you are migrating from uses several directory services, you should consider consolidating the directory services into a single directory service for the new NAS system.
Best practice—Understand the attributes of the source data before the migration
Be data aware: Identify any DOS attributes, nonstandard extended file attributes, and nonstandard permissions that are
not supported by an Isilon cluster. Also, identify your local users or groups and have a plan to deal with them.
Why: Before you execute the migration, you may need to take additional steps to prepare the data for migration so that it
will be available on the new storage system.
Migration sequencing
The execution of a migration will likely require multiple iterations of the data transfer. If the source data is constantly
changing, try to find a window when the source data can be locked in a read-only state, or deny access to clients. Once
access to the source data is removed, the final data transfer can take place. Otherwise, differences between the data on
the source and the target system might result.
The recommended approach for a data migration is to use a multistep migration. A multistep migration consists of an initial
“full” or “level 0” data copy. The initial data copy is followed by a series of “incremental copies” that update only the new or
changed data. The initial data copy moves an entire copy of the source data. It can often take a long time to execute
because all the data must be assessed and transferred over the network to the migration target.
After the initial copy completes, additional differential transfers copy only the data that has changed since the initial full data
copy was executed. Additionally, any data that is deleted on the source will also be deleted on the target through the
incremental process. The size of an incremental copy is affected by the rate of change of the source data.
You should run multiple successive incremental copies to guarantee the integrity and consistency of any data that
encountered issues during the initial full copy. Incremental copies also keep the two data sources in sync with each
other and reduce the catch-up work required by the final copy.
A final incremental data copy should always be executed as part of the migration cutover plan to ensure that all the latest
data is on the new target storage system.
Best practice—Run initial full copies followed by incremental copies
Run initial full copies followed by multiple incremental copies. Always execute a final incremental data copy during cutover
to ensure that the latest data from the source
is migrated.
Why: Executing multiple migration passes will ensure that all the data is transferred and that the latest version of the files
will be stored on the target storage system.
Type of migration
You must determine how the migration will be executed. There are two possibilities: an indirect execution from a host and a
direct execution from an Isilon cluster. With a host-based migration, an intermediary host executes a copy process, moving
data from the source system to the target system through the host, as shown in Figure 2.
Figure 2. Host-based migration
With a host-based migration, all data is transferred through an intermediary host en route to the Isilon cluster.
If the source system is supported, the Isilon cluster can execute a direct source-to-Isilon data copy by using the Isilon
OneFS isi_vol_copy command, which copies data using the Network Data Management Protocol (NDMP).
Host-based migrations
A host-based approach might be selected for a number of reasons:
The source system does not support an Isilon-based migration—isi_vol_copy is not supported.
Connectivity is restricted—storage is on different networks; a host may bridge the networks.
There is flexibility in execution—separate the execution from the administration of the storage systems.
There are security restrictions—these can be used to limit access to systems.
In a host-based migration, the toolset executing the migration makes a connection to the source and to the target system,
and then copies the data through the host. For the purpose of this paper, the primary host-based tool is rsync.
Best practice—Select a suitable host
Select a suitable host to run the migration that has adequate network bandwidth and processing power.
Why: Because all the data will move through the host, incorrect sizing may lead to a
bottleneck or an interruption in the migration. Using multiple hosts may facilitate
multistreamed migrations in which you can maximize network usage and the Isilon
nodes by executing multiple migrations concurrently. A host with 10 Gb network
connectivity is highly recommended.
Some common considerations are as follows:
Adequate resources to execute the migrations (CPU, RAM, and network); 10-gigabit network infrastructure where possible
Connectivity between the host and the source and target storage systems
Availability; the host is stable and reliable—no reboots or downtime occur
Dedicated host—not running a lot of other parallel workloads and restricted user access
The migration host should be optimized for the migration workload, with as much network throughput as possible,
because it sends and receives all the data to be migrated. Figure 3 shows an Isilon-based migration.
Figure 3. Isilon-based migration
With an Isilon-based migration, data is pulled directly from the source system to the Isilon cluster utilizing isi_vol_copy.
Another method of migration can be achieved by directly running the Linux rsync replication utility on the Isilon cluster itself.
This approach is shown in Figure 4.
Figure 4. Isilon-based migration with rsync on individual nodes
Similar to the OneFS isi_vol_copy tool, rsync can be run natively on the individual Isilon nodes against locally mounted
NFS source exports that are mounted directly on each Isilon node. Data is transferred directly from the source cluster to
the Isilon cluster, reducing latency and network congestion and eliminating the need for external host computers to move
the data.
Isilon-based migrations
If the source system is capable of supporting an Isilon-based migration by use of isi_vol_copy or by direct access with
rsync, the connectivity exists, and the migration methodology supports this approach, a direct migration may be the
more suitable technique. The main advantage of the direct approach is that there is no need for an intermediary host to
execute the process or for the data to traverse an external host.
Type of Isilon-based migration
There are two primary types of Isilon-based migrations:
NDMP-based migration with isi_vol_copy
Rsync-based migration—use the UNIX rsync tool to connect and either push or pull data directly to the Isilon target, running natively on the Isilon cluster
NetApp migration
A NetApp Isilon-based NDMP migration requires the following:
Isilon requirements: Isilon OneFS 6.5.5.6 or later
NetApp requirements: Data ONTAP 7.x or Data ONTAP 8.x operating in 7-mode
It is anticipated that additional source systems will be supported in future releases of Isilon OneFS.
As in all migration strategies, it is critical to evaluate the migration methodology against the selected approach to determine
if the method selected will facilitate your migration goals.
Best practice—Evaluate migration approach
Evaluate and select the most appropriate migration approach by selecting the method that meets your specific migration
requirements, provides cutover flexibility, and optimizes
data throughput.
Why: The selected approach will impact the migration schedule and planning.
Once you have identified the migration approach you will use, you can select the appropriate migration tool.
Migration tool selection and use
The data migration requirements will help define the tool selected to execute the data migration.
Tool selection
A number of tools are available and will work with your migration. Any file copy method that can connect over NFS to the
source and target storage can be used to move data between the systems. Dell EMC Isilon recommends that you use a
tool that can be automated and which provides robust functionality—a tool that can copy attributes, security, logging, and
so on.
The common NFS data copy tools are shown in Table 1.
isi_vol_copy
Advantages: Included with Isilon OneFS; pulls across all user and group permissions; supports both SMB and NFS protocols; utilizes NDMP; provides a direct source-to-target migration.
Disadvantages: Supported only against specific source storage systems (NetApp systems running Data ONTAP 6.5 and later with NDMP v4); limited error reporting.

isi_vol_copy_vnx
Advantages: Included with Isilon OneFS; pulls across all user and group permissions; supports both SMB and NFS protocols; utilizes NDMP; provides a direct source-to-target migration.
Disadvantages: Supported only against specific source storage systems (VNX 7.x OE and Celerra DART 5.6.x or later); limited error reporting.

rsync
Advantages: The Dell EMC preferred tool; designed for synchronizing directories; sends only differences in data when files change; offers a large set of switches; can be scripted; open source and widely available.
Disadvantages: Limited error reporting.

tar, cpio
Advantages: Can be scripted; open source and widely available; good for one full push of data.
Disadvantages: Limited error reporting; designed for backup and restore, not active copying; does not make incremental copies.

Table 1. Summary of NFS copy tools
Tool versions
It is important to understand that different tools may behave differently on different hosts. Dell EMC Isilon strongly
recommends that you test tool versions and observe their behavior.
Best practice—Use the correct tool for the job
For NetApp filer: use isi_vol_copy
For VNX: use isi_vol_copy_vnx
For general NFS filers: use rsync
Why: Using the correct tool for the job will give you the best chance for a successful migration.
Best practice—Use the correct version of the tool
Rsync is available on nearly every UNIX and Linux distribution, as well as natively on the Isilon cluster. You must use the
correct version of the tool for the OS of the host that is running it.
Why: Using the correct version of the tool will optimize throughput and performance, that is, use a 64-bit version if your
host OS is 64-bit.
Best practice—Use the latest version of migration tools
You should always use the latest versions of the chosen file copy tool.
Why: Performance is optimized when you use the latest versions of a file copy tool, and they often have newer features
and bug fixes.
Migration tools
The following section provides an overview of the primary NFS migration tools that can be used in Isilon data migrations.
rsync
Overview:
The rsync tool provides a method for copying files, directories, and subdirectories from NFS exports to other NFS exports
with the ownership and attributes intact. It was designed to efficiently synchronize files and directories from one location to
another, minimizing data transfer while using delta encoding where appropriate. If the source and destination have many
files (and parts of files) in common, the utility need only transfer the differences between them. Incremental change copies
are thus extremely efficient.
Rsync can operate in both a local and remote mode (as a service) and behaves similarly to rcp. It can “pull” or “push” files
from filers.
Rsync should be run as root to preserve file permissions and ownership. It can also use Secure Shell (SSH), if
necessary, for secure environments.
Source code is available, and rsync is implemented on nearly every modern operating system.
Usage:
rsync [options] <source> <destination>
Features:
Enables you to copy file data, ownership, and time stamp information
Is extremely efficient for incremental copies
For a full list of rsync features and switches, run the following from a command shell:
man rsync
Sample rsync command:
rsync $OPTS [$SOURCE:]$SOURCEDIR [$TARGET:]$TARGETDIR
where the variables are typically defined as:
OPTS="--force --ignore-errors --delete-excluded --exclude-from=$EXCLUDES --delete --backup -a"
EXCLUDES=/path/filestoexclude
SOURCE=name of source filer
SOURCEDIR=/path/sourcefiles
TARGET=name of isilon node destination system
TARGETDIR=/path/targetdir
Rsync by default runs in a local mode, but with the addition of [$HOST:] in front of either $SOURCEDIR or $TARGETDIR,
it can transfer files remotely between systems.
For example:
rsync -avh /tmp/foo root@host2:/tmp/bar
If run from the source system, it would transfer the local directory (/tmp/foo) to the remote host (host2) and place them in
the target directory (/tmp/bar).
Note that a shell script is usually created to automate and distribute the rsync jobs. Entire migrations can be automated
and incremental passes scripted to run automatically. Review the scripts with the customer to verify that the sequence of
commands matches the expected migration plan.
Best practice—rsync and compression
If the data is mostly binaries or large uncompressible files, Dell EMC Isilon does not recommend that you use
compression, as this will slow the migration considerably. Text files, however, will readily compress and, if the source data
consists of text files, using this option will greatly speed the migration. Know the source file composition.
Why: Trying to compress non-compressible data will greatly slow the migration.
Best practice—Exclude snapshots from replication
It generally does not make sense to migrate snapshots because they will not automatically work on the target system as
intended. Therefore, exclude them from the migration to speed the process.
Why: Snapshots will not migrate.
For example:
rsync -avh --exclude='.snapshot*' /tmp/foo root@host2:/tmp/bar
If run from the source system, this would transfer the local directory (/tmp/foo) to the remote host (host2) and place data in
the target directory (/tmp/bar) while excluding snapshots.
Best practice—Watch for spaces in names
Be aware that spaces in file and directory names can cause problems.
For example, a directory named “/spaces in my name” and a file named “some file.avi” would require special handling on
both a command line and in a script:
rsync -av foo@foomachine:'/spaces\ in\ my\ name/some\ file.avi' /local_directory/
The “\” character is used before spaces to prevent the shell from parsing the next word as a separate argument.
Why: Spaces in file names and directories can cause scripts to fail. Because of this, be alert for them.
Best practice—Starting rsync switches
Suggested initial rsync switches:
-a: archive mode; equals -rlptgoD (no -H, -A, -X)
--delete: delete extraneous files on the target system (useful on incremental copies if the source data has been deleted)
--force: force the deletion of directories even if they are not empty (during incremental passes if directories are deleted on
the source)
-z, --compress: compress the data being transferred (if the data is compressible)
Why: Dell EMC Isilon recommends that you start with a baseline of switches and test the copy, validate the results and
behavior of the copy, and make the appropriate adjustments to the rsync switches. No single default set of switches will
work for all migrations. Remember, rsync can be run multiple times incrementally, and different directories/exports may
require different options.
You should become familiar with many of the rsync switches and their use. The following highlights a few possible options
that you should be familiar with. It is important to recognize that each migration will require different switches because of
the unique requirements of each dataset.
A few useful switches to be aware of include the following:
-r, --recursive
Recurses into directories
-l, --links
Copies symlinks as symlinks
-p, --perms
Preserves permissions
-h
Outputs numbers in a human-readable format
--progress
Shows progress during the transfer
-z,--compress
Compresses the file data during the transfer
-g,--group
Preserves the group
-o,--owner
Preserves the owner
-D
Preserves special files and device files
--protect-args
Enables you to transfer files that contain white space; you can either specify --protect-args or escape the white space with
a “\”
--stats
Provides a detailed list of the total number of files, files transferred, benchmarks, and an average transfer speed
-t,--times
Preserves modification times
-v,--verbose
Increases verbosity
-n,--dry-run
Performs a trial run with no changes being made
--exclude=PATTERN
Excludes files matching PATTERN
--exclude-from=FILE
Reads exclude patterns from FILE
Symbolic links and hard links
Be aware of symbolic links within the source file system. They may not point to the same target after migration if paths
change.
Rsync has multiple methods of dealing with symbolic links. Choose the most appropriate option after consulting with the
customer. These links can also be addressed in a separate migration pass.
By default, links are not transferred at all. A message such as “skipping non-regular file” is generated for any symbolic links
that rsync encounters. Switches to deal with links include:
--links
Symbolic links are recreated with the same target on the destination. Note that --archive implies --links.
-L,--copy-links
Symbolic links are “collapsed” by copying their referent, rather than the symbolic link.
-H,--hard-links
Preserves hard links
--safe-links
Ignores symbolic links that point outside the tree that is being replicated. This is useful for preventing sensitive system files
such as /etc/passwd from being inadvertently copied.
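Before choosing a link-handling switch, it helps to inventory the links that are at risk. A sketch using find against a sample tree (substitute the real source mount): absolute symlinks are the ones whose targets will not follow the data.

```shell
# Build a sample tree with one absolute and one relative symlink.
TREE=$(mktemp -d)
touch "$TREE/file"
ln -s /etc/hosts "$TREE/abs_link"   # absolute target: likely breaks post-migration
ln -s ./file     "$TREE/rel_link"   # relative target: survives if layout is preserved

# List only the symlinks with absolute targets (GNU find).
find "$TREE" -type l -lname '/*'
```

Reviewing this list with the customer determines whether --links, --copy-links, or --safe-links is the right choice, or whether the links need a separate remediation pass.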
Best practice—Know the rsync switches
You should understand all the rsync switches and when and how to use them.
Why: Different migrations will require the use of different switches to meet the requirements of the data copy and the final
state of the migrated data. Discuss this with the customer before the migration begins to determine the optimal selection
of switches.
Best practice—Parallelizing the rsync processes
Examine the source directory structure and look for obvious ways to divide the source directory tree into smaller, more
manageable chunks.
For example, suppose that migrating a file system of 4,000,000 files as a single job takes six hours in this
hypothetical example. Consider if the file system tree were divided into something like the following:
drwxr-xr-x 2 root root 179 Jul 19 15:00 directory_a
drwxr-xr-x 2 root root 179 May 1 00:00 directory_b
Running two simultaneous rsync jobs would cut the migration time roughly in half (assuming the content
of the directories is nearly equally balanced):
rsync -av --include="/directory_a*" --exclude="/*" --progress remote::/ /localdir/
rsync -av --include="/directory_b*" --exclude="/*" --progress remote::/ /localdir/
The best performance would result from spreading requests across multiple Isilon nodes and multiple source network
interfaces. Multiple rsync jobs can be run on individual nodes as well, but these processes tend to be network limited. You
want to spread the load across as much of the Isilon cluster as possible, maximizing the available bandwidth on each
node. Be aware if the NFS option “map root to nobody” is implemented, as this may affect access to files.
Why: You will see increased performance, but you may be limited by network bandwidth and source cluster throughput.
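One way to script this division of labor is to generate one rsync command per top-level source directory and then launch the jobs concurrently. The sketch below assumes the source is already mounted locally; the paths are placeholders, and the switch set should match your chosen rsync options.

```shell
# Sketch: emit one rsync command per top-level directory of a locally
# mounted copy of the source tree, so the jobs can be launched in parallel.
gen_rsync_jobs() {
    src_root=$1   # local mount of the source, e.g. /mnt/source
    dest=$2       # target path, e.g. /ifs/data/target
    for d in "$src_root"/*/; do
        name=$(basename "$d")
        echo "rsync -aH --numeric-ids --progress $src_root/$name/ $dest/$name/"
    done
}
```

The emitted commands can then be distributed across migration hosts, or run concurrently on one host with, for example, `gen_rsync_jobs /mnt/source /ifs/data/target | xargs -P 4 -I CMD sh -c CMD`, keeping per-node network limits in mind.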
Isilon-based migrations – isi_vol_copy
Overview:
isi_vol_copy is a native Isilon OneFS tool that supports data migration through the Network Data Management Protocol (NDMP). The tool allows the cluster to mimic the behavior of a backup target and allows data to be copied directly from the source system to the Isilon cluster, preserving permissions and ownership.
Usage:
isi_vol_copy <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ]
[-sa user: | user:password]
[-sport ndmp_src_port]
[-dhost dest_ip_addr]
[-maxino maxino]
[-h]
Features:
Utilizes native NDMP functionality and connectivity
Supports full and incremental backup levels
Migrates data and all security and attribute information
Will restore the set of permissions and ACLs that existed on the source data
Will migrate NFS and SMB source data
Does not impact or interact with client data access
Provides a dedicated data transfer pipe between the source and cluster
Starting with OneFS 7.0.2, it supports the Backup Restartable Extension, so that full backups can be interrupted and restarted from a checkpoint
Limitations:
Source NAS arrays have limits on the number of NDMP threads and simultaneous backup jobs; you should therefore avoid overloading the source NAS system.
It can be limited by source filer network bandwidth.
Sample isi_vol_copy command
isi_vol_copy <source_filer_IP>:/<source> -sa <ndmpuser>:<ndmppassword> /ifs/data/<source_filer> -full
Best practice—isi_vol_copy target data use
Do not alter data on the target Isilon system until after the isi_vol_copy has completed.
Why: Altering target data mid-copy will create problems and may force you to re-run a full copy.
Best practice—Simultaneous isi_vol_copy use
Do not execute multiple isi_vol_copy jobs that write to the same target directory; that is, do not point all of your isi_vol_copy migrations at the same target directory. For example:
filer1:/vol/sourcedir -> isilon:/ifs/data
filer2:/vol/sourcedir2 -> isilon:/ifs/data
Why: This creates problems for the copy process and may require remediation after migration.
Instead: Use an additional directory level:
filer1:/vol/sourcedir -> isilon:/ifs/data/filer1/sourcedir
filer2:/vol/sourcedir2-> isilon:/ifs/data/filer2/sourcedir2
If consolidation is required, this can occur after the data is migrated and any potential merging of identically named
subdirectories can be addressed.
Best practice—isi_vol_copy use
isi_vol_copy is optimized to stream as much data as possible across a network; always monitor load on the source and
target systems for any potential impact.
Why: Because isi_vol_copy is optimized to stream as much data as possible, take care not to overwhelm older source systems and create potential link saturation or disk problems, especially if there are users connected who are attempting to access files.
Best practice—isi_vol_copy limits
Dell EMC Isilon recommends that you use fewer than 40 million files per volume transfer when using isi_vol_copy.
Why: All programs have limits, and this is the recommended maximum when using isi_vol_copy for each individual
transfer. Larger source volumes should be broken up into smaller chunks (that is, use a separate isi_vol_copy stream for
multiple subdirectories instead of one large transfer of an entire volume).
Once the initial copy is complete, then incremental copies can be run:
isi_vol_copy filer1:/vol/sourcedir -sa root:<password> /ifs/data/filer1/source -incr
Important: Do not start an incremental copy job until a full copy has completed successfully. Unlike rsync, which performs incremental copies automatically, isi_vol_copy must be explicitly invoked with -incr to perform an incremental copy.
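The full-then-incremental sequence can be wrapped in a small driver script. This is a sketch only, assuming the isi_vol_copy usage shown earlier; it is run on the cluster, and the source, destination, and credential values are placeholders.

```shell
# Hypothetical driver: run the full copy once, then an incremental copy.
# The incremental step can be re-invoked (e.g. from cron) until cutover.
run_migration() {
    src=$1    # e.g. filer1:/vol/sourcedir
    dest=$2   # e.g. /ifs/data/filer1/sourcedir
    cred=$3   # e.g. root:<password>
    # Full copy must succeed before any incremental is attempted.
    isi_vol_copy "$src" -sa "$cred" "$dest" -full || return 1
    isi_vol_copy "$src" -sa "$cred" "$dest" -incr
}
```

Checking the exit status of the full copy before starting the incremental enforces the ordering requirement described above.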
An entire volume does not need to be migrated; subsets of source directories can be migrated individually.
For example:
A volume, /export/vol1, is exported containing subdirectories /work, /scratch, /tmp, and /home. You could migrate the entire
vol1 or any/all of the individual subdirectories under vol1, for example, /export/vol1/work and /export/vol1/home might be
the only necessary directories to move.
Isilon-based migrations from VNX—isi_vol_copy_vnx
Overview:
isi_vol_copy_vnx is a native Isilon OneFS tool that supports data migration from VNX through NDMP. The tool allows the cluster to mimic the behavior of a backup target and allows data to be copied directly from the source system to the Isilon cluster, preserving permissions and ownership.
Usage:
isi_vol_copy_vnx <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ]
[-sa user: | user:password]
[-sport ndmp_src_port]
[-dport ndmp_data_port]
[-dhost dest_ip_addr]
[-h]
Features:
Utilizes native NDMP functionality and connectivity
Supports full and incremental backup levels
Migrates data and all security and attribute information
Will restore the set of permissions and ACLs that existed on the source data
Will migrate NFS and SMB source data
Does not impact or interact with client data access
Provides a dedicated data transfer pipe between the source and cluster
Limitations:
Source filers have a limit on the number of NDMP threads and simultaneous backup jobs; you should therefore avoid overrunning the source filer.
It can be limited by source filer network bandwidth.
Check with Dell EMC Isilon Support for the latest compatibility with tools, DART codes, and OneFS.
Migration preparation
After you finish planning the migration and selecting the tools you will use, you can prepare the source and target systems
for the migration.
Infrastructure and environment setup
Network connectivity
Because all the data in the migration will traverse the network, you should optimize the network infrastructure and
connectivity between the source system(s) and the target Isilon cluster.
Common recommendations include the following:
Maximize network bandwidth; 10 Gb/s is preferred to 1 Gb/s, optimized end to end, with a Maximum Transmission Unit (MTU) of 9000 bytes
Limit hops and latency between the source and target storage systems
Isolate migration traffic so that it does not compete with client access
Limit potential network bottlenecks that can occur with routers, firewalls, Intrusion Detection System (IDS), and shared network infrastructure
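When jumbo frames (MTU 9000) are configured, it is worth verifying that they actually pass cleanly end to end before the migration starts. The following is a sketch using standard Linux ping flags; the target address is a placeholder.

```shell
# Verify a 9000-byte MTU path: 8972 = 9000 - 20 (IP header) - 8 (ICMP header).
# -M do sets "don't fragment", so failure indicates the path does not carry
# jumbo frames end to end (a router or switch is fragmenting or dropping).
check_jumbo() {
    target=$1
    if ping -c 2 -M do -s 8972 "$target" >/dev/null 2>&1; then
        echo "jumbo OK to $target"
    else
        echo "jumbo FAILED to $target"
    fi
}
```

Running this from the migration host against both the source system and an Isilon node interface confirms the MTU is consistent across the whole path, not just on the local interface.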
Best practice—Optimize the network for the migration traffic
Optimize the migration network path; try to keep other production traffic off this network and limit the network devices the traffic traverses (firewalls, IDS, and so on). Ideally, try to create a dedicated private migration network that is optimized for only the migration traffic.
Why: Separating the migration traffic from other network traffic allows for maximum throughput and reduces potential impact to existing production traffic by limiting network saturation.
Migration account
In order for the migration data to be copied from the source to the target system, the tool accessing the data must be able
to access all of the source and target data.
Commonly used migration accounts:
root
User accounts created explicitly for the execution of isi_vol_copy, for example, ndmp
The account used to connect to the source and target storage systems will depend on the security model implemented in
the environment.
Best practice—Use a specific migration account to execute migration tasks
Use a specific migration account, or an account with group membership that has the required access to all source and target data (for example, root).
Why: Using a dedicated account allows for oversight and management of migration data access. It also allows migration tasks and users to be separated from other production accounts.
Source host preparation
The source data storage system should be prepared and optimized for the migration.
Best practice—Restrict access to the source exports
Access to the source cluster exports can be restricted to prevent users from changing data on the source cluster instead of on the migration target cluster. Change exports to read-only status once the migration and incremental copies are completed to prevent clients from writing to them.
Why: This allows you to separate migration events from normal production access. The same process can be used post-cutover to deny reads and writes by normal users to the source cluster. This prevents updates to the data during data cutover and post-migration while still allowing administrative access.
Migration host preparation—Source and target access
The migration host should be prepared and optimized for running the migration copies:
Limit workload and access to optimize throughput
Restrict access and reduce service issues with the host
Prepare all migration jobs as scripts
Test and validate network throughput
Best practice—Watch out for root_squash
On the source cluster, exports sometimes restrict access by using root_squash, which prevents remote root users from retaining root privileges. This access is needed for migrating data, so use the “no_root_squash” option to turn off root squashing for the migration host.
Why: Root access (or its equivalent) is needed to migrate all files and directories.
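On a Linux NFS source, for example, the corresponding export entry might look like the following sketch. The path and the migration-host address (192.0.2.50) are placeholders; the exact syntax on other NAS platforms varies by vendor.

```
# Hypothetical /etc/exports entry on a Linux source: the migration host keeps
# root privileges; all other clients remain squashed and read-only.
/export/vol1  192.0.2.50(rw,no_root_squash,sync)  *(ro,root_squash,sync)
```

After editing the file, `exportfs -ra` re-exports the updated table without restarting the NFS service.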
We can also set root squashing on the Isilon NFS exports as shown in Figure 5.
Figure 5. Setting root squashing on Isilon NFS exports
In Figure 5, “User/Group Mappings: Use default: Map root users to nobody” has been set for the export /ifs/data/work. This
should be disabled to allow the root full access to the file system. Note that this can be set on an export-by-export basis. In
addition, root access to a specific client, 192.168.43.200, has been restricted in the figure example. Typically, this would be
the host performing the migration.
Isilon cluster configuration preparation
All primary setup and configuration of the Isilon cluster should be completed before data migration begins. The
configuration includes, but is not limited to, the following:
Authentication provider integration—ensures that all authentication providers are online and fully operational.
If local users are used, ensure that all UIDs and GIDs are created and tested.
Access zone and role-based access control (RBAC) setup—complete any zone and RBAC setup.
Exports for clients are created and tested.
Networking design and setup—complete the setup and implementation of the network configuration.
SmartPools—complete the implementation of any SmartPools policies to limit post-migration work.
Dell EMC Isilon SyncIQ®—prepare any existing SyncIQ policies to operate alongside any data migration events.
SnapshotIQ—prepare any SnapshotIQ policies to operate alongside any data migration events.
SmartLock—execute all preliminary SmartLock work prior to migration.
SmartQuotas—disable SmartQuotas until the migration is completed.
Dell EMC Isilon recommends that you use a dedicated Isilon migration directory export to execute all migrations against.
Using a dedicated administrative migration export with the appropriate access and configuration can facilitate the migration
without impacting workflows or data permissions. Normal user clients will not mount this export; it will be mounted only by
migration hosts. Data can then be moved into place after the migration is completed with minimal disruption.
Best practice—Implement a logical NFS export path structure methodology
Be aware of export rules and how they interact with each other:
Path—should be unique, though nesting is possible if used with caution
Permission options—restricted by user ID mapping and IP addresses
Security—netgroups, authentication system, and Kerberos
The order of evaluation is path, client ACLs, then security types (unix, krb5).
For example, if you have exports:
/ifs/data --client=8.8.8.0/24
/ifs/data/something --client=10.10.10.0/24
and your client IP is 10.10.10.10, you would not have access to /ifs/data/something because the export, /ifs/data, has a
different IP restriction. The client must be able to traverse the path if the exports are nested. Access for 10.10.10.0/24
would need to be added to the /ifs/data export:
/ifs/data --client=8.8.8.0/24 --client=10.10.10.0/24
/ifs/data/something --client=10.10.10.0/24
Why: Complex and restrictive rules may prevent clients from connecting to exports that are nested. Clients may
encounter problems mounting a directory that is nested from a different export. If the parent directory has more restrictive
permissions, a client may not be able to mount a child export of that directory.
Best practice—Create the NFS exports
Create the new Isilon NFS exports prior to data migration.
Why: This will allow the creation and setup of the exports and export permissions prior to data migration and cutover for
initial testing and access validation/UAT.
Best practice—Do not use the default /ifs export
Avoid relying on the default /ifs export, which lets clients mount any subdirectory and have open access to the whole file system.
Why: The default export should be used for easy setup only, as it can be a potential security issue.
Best practice—Create the correct NFS export permissions
Set up the correct export permissions on the newly created user exports.
Why: Setting the correct export permissions will allow you to test and validate workflows when test migrations are
undertaken and maintain security.
Note: The migration methodology may include adding explicit “deny” permission settings on users or directories so that they cannot write data to these exports until the cutover has been executed, as well as restricting specific IP addresses to prevent clients from accessing the exports.
Additional Isilon cluster considerations
The following are some additional Isilon cluster considerations that may need to be addressed prior to and during a data
migration.
NFS group membership limitation
The NFS standard by default does not support membership in more than 16 groups per individual user. This limitation can be addressed on a Dell EMC Isilon cluster by enabling “Map Lookup UID” under NFS Settings -> NFS Export Settings -> Export Behavior Settings -> Map Lookup UID, as shown in Figure 6.
Figure 6. Increasing the number of group memberships in an Isilon cluster
Additional information can be found in the Dell EMC document “NFS supplemental groups limited to 15 in OneFS 6.5.4 and
earlier” (Article Number:000089550).
Production or preproduction cluster
An important consideration when planning and executing a data migration is the current status of the Isilon cluster. Is the
cluster in production or will the migration mark the initial cutover to active production traffic? The primary goal should be to
lessen any impact on a production cluster during migration activities, so appropriate steps should be taken to address
these concerns.
Common factors to be aware of while migrating to a cluster are as follows:
Administratively destructive actions
The saturation of network links
Cluster load and ingest, and the impact they have on production workflows
Access zones and role-based access control
If the cluster uses an Isilon access zone or RBAC, the migration methodology may need to be adjusted to accommodate
this configuration. Currently, OneFS 7.0 only allows for NFS exports in the default System Zone and no other zones.
NFS RPC threads
By default, the number of NFS RPC threads is set to 16 per node. This number can be increased for specific workflows.
Consult Dell EMC Isilon Support for assistance.
Isilon OneFS SmartConnect or direct node connections
The current status of the cluster may dictate that you should try to optimize and segregate migration traffic within the
clusters network configuration. You can do this by:
Using SmartConnect to autobalance traffic
Separating migration traffic from existing production traffic by using a direct node or separate SmartConnect zone connection for migration traffic
If you use SmartConnect, you should validate and optimize the configuration before you transfer data across the network.
Isilon OneFS SyncIQ considerations
If the data to be migrated will be replicated to a secondary cluster through a SyncIQ policy, additional planning should be
undertaken to address the impact of the data migration and its interaction with active SyncIQ policies. With this scenario,
you should:
Pause active SyncIQ policies if they include migration paths
Schedule SyncIQ jobs to run outside of data copy windows
Utilize SmartConnect zones for copying and SyncIQ replication
Isilon OneFS SmartPools
Any SmartPools data policies should be in place prior to data migration, or additional cluster overhead may be required to
move data within the cluster post-migration.
Isilon OneFS SnapshotIQ
Any active SnapshotIQ policies should be analyzed for any impact during the data migration.
Antivirus integration
You should review and disable any active antivirus scanning policies that may be running against the target data.
Best practice—Disable antivirus scanning
Disable active antivirus scanning on migrated data during the initial full and incremental copies.
Why: The large influx of data associated with the migration can place an excessive load on the antivirus scanning architecture and create a slowdown and potential bottleneck for inbound data.
Isilon guidelines for large workloads
Be careful not to exceed the maximum configuration values listed in Table 2.
Guideline | Tested or default value | Theoretical or max. practical value | Comments
NFS max. read size | 128 KB | 1 MB | Applies to both NFSv3 and NFSv4. Prior to OneFS 7.0, the maximum read size was 128 KB.
NFS max. write size | 512 KB | 1 MB | Applies to both NFSv3 and NFSv4. Prior to OneFS 7.0, the maximum write size was 512 KB.
NFSv3 connections (per node) | 1,000 | N/A | The number of TCP sockets available on the node is typically what limits NFS connections. Unlike the Isilon SMB server, nfsd uses file handles instead of file descriptors to represent files and internally handles work items much differently. 1,000 connections is a very conservative tested limit and represents 1,000 mounts over 20 exports. NFS connection testing is an ongoing test effort; no maximum connection limit for NFSv3 has been established at this time.
NFSv3 exports (per cluster) | 750 | 2,000 | Beyond 2,000 exports, manageability becomes a problem. Cluster size does not matter.
nfsd threads (per node) | 16 | 16 | This is a kernel limit, exposed via the sysctl command. The value should not be changed without first consulting Dell EMC Isilon Support. This limit represents the maximum number of simultaneous work items the server can service; work items beyond this number are queued and serviced when resources become available.
File name length | 255 B | 255 B | Most Unicode character encodings (like UTF-8, the OneFS default) allow multiple bytes per character (UTF-8 allows up to 4 B/character), so the 255 B could represent anywhere from 63 to 255 characters.
Path length | 1024 B | 1024 B | This is the maximum absolute path (for example, /ifs/data/foo/bar/baz/) length that can be passed into a syscall, not the maximum depth of a directory in the file system (see “Directory depth”).
Directory depth | 8,470 | Unlimited* | * No specific hard limit is in place, but several other limits could come into play (inode limits, metadata storage limits, and so on). In tests, command line (shell) utilities begin to experience problems at a depth of 8,470 (EBADF from many commands); at a depth of about 30,000, internal utilities (for example, Job Engine TreeDelete) also begin to fail. For utilities that make calls with absolute paths (for example, cd /1/2/3/…), depth is limited by “Path length” as described above; this applies to path-based OneFS commands like “isi snapshot” and “isi quota”. For utilities that access relative paths (for example, cd 1, cd 2, cd 3…), these higher limits may apply, although the value of extraordinarily deep directories is questionable.
File size | 4 TB | 4 TB | This is the hardcoded OneFS limit. Note that Job Engine performance can be impacted on files larger than 1 TB due to inefficient per-file threading.
Table 2. Isilon guidelines for large workloads
Migration approach—Testing and proof of concept
Once you have developed the migration approach, selected the toolset, and prepared the infrastructure for the data
migration, you can proceed with your initial testing of the methodology. The goal of the testing is to validate the outcome—
is data migrated, are the permissions moved, and are the timestamps moved? The testing phase also allows you to tune
and modify the migration approach to optimize all parts of the process.
The recommended testing approach is as follows:
Run the full copy—benchmark and monitor
Review and validate—potentially look at tuning or tweaking the methodology and re-run the full copy
Run the incremental copy—benchmark and monitor
Review and validate—potentially look at tuning or tweaking the methodology and re-run the incremental copy
Continue to run the incremental copy—and continue to monitor it
Dell EMC Isilon recommends that you test different copy methodologies to tune and optimize the throughput while meeting
your migration requirements.
Best practice—Execute multiple test migrations to validate the methodology
Dell EMC Isilon recommends that you execute multiple migration tests on smaller subsets of different data.
Why: Because different data will tend to have different properties and access profiles, it is important to test all data types
and how the migration methodology may need to be modified for different datasets.
Critical areas to evaluate and monitor during data migration testing are the following:
Network performance—throughput, saturation, and impact
The time to execute a full data copy—will allow for refinement of project plans
The time needed to execute an incremental copy after a set number days after the data change occurs—will help define cutover windows
Cluster load, source load, and host load—will help tune and refine the migration methodology
Best practice—Test all phases of the migration methodology
Execute all steps in the migration methodology to identify the time involved and to verify that the proposed methodology
fulfills all the migration requirements.
Why: It is important to identify issues with the methodology before executing production migrations and cutovers.
Data validation
After you migrate the data, you must validate the data and the file attributes. You must verify that:
File data copied correctly—data is intact and integrity is maintained
File security, ownership, and attributes migrated correctly
File timestamps are correct
Next, review the access control entries on a file by running ls -led followed by a file name, as shown in Figure 7.
Figure 7. Reviewing access control entries on a file
You should also validate the data. Common methods include the following:
File size compares
Checksum/file hash compares—MD5 checksums
Tools—MD5, sum, and checksum
Audit and review directory structures
Once you have reviewed the data attributes directly, it is critical that you validate that the data works in client workflows.
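A checksum comparison between the source and target trees can be scripted along the following lines. This is a sketch, assuming a host that mounts both trees and has GNU coreutils md5sum available; the paths are placeholders, and for very large datasets this should be run on a sampled subset rather than every file.

```shell
# Sketch: compare per-file MD5 checksums between source and target mounts.
# Relative paths are used so only file content and names are compared.
verify_tree() {
    src=$1; dst=$2
    src_sums=$(cd "$src" && find . -type f -exec md5sum {} + | sort -k 2)
    dst_sums=$(cd "$dst" && find . -type f -exec md5sum {} + | sort -k 2)
    if [ "$src_sums" = "$dst_sums" ]; then
        echo "OK: checksums match"
    else
        echo "MISMATCH detected"
        return 1
    fi
}
```

File-size and directory-structure comparisons (for example, with `du` and `diff` over `find` listings) are cheaper first passes; checksums catch silent corruption those miss.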
Performance
A migration often moves a large amount of data. You must ensure that the migration methodology, toolset, and
environment are optimized for performance and throughput to work within the migration timeline. The common areas to
focus on when evaluating performance are as follows:
Identify bottlenecks—attempt to identify the worst-performing component
Disable antivirus scanning processes on target and/or source file systems during initial migration copies to minimize CPU impact on client access and to avoid extending copy times.
WAN bandwidth (physical circuit limitations) and concurrency can impact other systems that replicate data (SAN, backup, and so on) over a shared link. This could affect replication performance for SyncIQ jobs that need to run to completion before certain cutovers can be conducted.
Review the timing of the execution—how time of day and day of week tests were executed versus performance
Collect metrics on the data copies, network throughput, source, host, and target systems—evaluate the copy as a whole
Best practice—Time the incremental copies
Benchmark the incremental copies by timing how long they take to execute so that you can plan and orchestrate the
cutover phases appropriately.
Why: Knowing how long an incremental copy takes helps you determine the length of time required to execute a cutover and the data outage window that may be required.
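A simple way to benchmark the incremental copies is to wrap whichever copy command you use with timing. The sketch below is generic; the copy command passed in is a placeholder for your rsync or isi_vol_copy invocation.

```shell
# Sketch: run any copy command and report elapsed wall-clock time, so
# incremental runs can be benchmarked for cutover planning.
timed_copy() {
    start=$(date +%s)
    "$@"                 # e.g. rsync -aH src/ dst/  or an incremental copy
    rc=$?
    end=$(date +%s)
    echo "elapsed_seconds=$((end - start)) exit_code=$rc"
    return $rc
}
```

Logging these figures over several days of incrementals shows how elapsed time tracks the daily change rate, which is exactly the number needed to size the outage window.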
User acceptance testing—Data and workflow testing
The final step of migration data testing is the UAT in which the data is tested for integrity with existing workflows. Dell EMC
Isilon recommends that you use test workflows because this data should be considered test data only at this time and may
be removed by later migration steps.
Best practice—Check workflows with test migrated data
Review all workflows on test migrated data.
Why: It is critical to validate that newly migrated data can be integrated into workflows (that is, user home directory or group share access, and so on) at cutover time without issues. By testing the workflows, you can ensure that cutovers will occur without incident.
Start of migration execution
After you complete all the testing and validation, you can begin to move into the production migration phases. All the
information obtained from testing and tuning should be used to modify and optimize the overall methodology so that the
production migrations are as clean and quick as possible.
The migration execution phases are as follows:
Execute the initial full copy
Execute incremental copies to keep the new storage up to date and as close to production as possible
Based on performance, you can execute multiple migrations simultaneously if they are supported and if the network infrastructure can support the additional load
Multiple hosts (if using a host-based migration) can replicate data in parallel; be aware of load placed on the source cluster.
To reduce contention, coordinate hosts so that they are not all writing to the same subdirectory. Spread hosts across the
Isilon cluster to maximize network bandwidth and processing power.
Alternative migration methodology: Mount the source NFS exports directly on the individual Isilon nodes and run the rsync
commands directly on the cluster. The benefit with this method is that if you have multiple nodes, your transfer bandwidth
scales, and you remove the “middle man” from the process. Data moves directly from the source filer to the Isilon cluster,
without having to go to a host first and then back out to the Isilon cluster. As a result, latency is greatly reduced and
migration times will drop.
Best practice—Continue to run incremental copies
Continue to run incremental copies, even if the cutovers are not scheduled.
Why: This will keep the source and target data more in sync and require less data transfer during the final pre-cutover
copy.
Depending on the size of the data migrations, the initial full copies may take a while to execute. During this time, you can
prepare for the final cutover events.
Pre-cutover preparation
After you start to migrate data, you can begin to prepare your cutover events.
Best practice—Create a detailed migration plan
Create a detailed migration plan with all the specific steps and timing of the migration execution.
Why: This document will dictate the commands and work that are being executed. The plan controls the entire migration
from start to finish. All roles, tasks, and responsibilities are defined.
The detailed migration plan dictates how the migration is executed.
Best practice—Create a cutover document
Create a cutover document that defines the high-level cutover tasks, responsibilities, and timings. The document should
outline the phases and sequence in which tasks are executed.
Why: This document will outline the sequence of events that need to occur during a cutover. It can be used to track and
monitor the progress of the cutover.
Best practice—Create a schedule and define outage windows
Have a well-defined cutover schedule and outage window.
Why: The schedule helps execute the migration cutover. The outage window can be scheduled when you have clearly
determined that access to storage will be unavailable and that you can make storage system changes without impact to clients.
Best practice—Create a communication plan
Have a communication plan.
Why: This communication plan will clearly outline the protocols needed to keep all users up to date on the status of a
migration and enable storage administrators to stay focused on the execution of the cutover and not be distracted by
information requests from end users.
Best practice—Prepare the DNS name resolution infrastructure for cutover
Lower the DNS time to live (TTL).
Why: This will facilitate the cutover of clients using DNS name resolution by reducing the time between authoritative
updates to DNS.
Additional pre-cutover preparation steps often include the following:
Prepare the DFS namespace, if applicable
Create CNAMEs in DNS
Update scripts used by clients for storage connections
Prepare clients and applications
Cutover event
Once you have migrated the data and prepared the environment for the cutover, the actual final migration event can occur.
In general, the high-level cutover sequence resembles the following steps:
Initiate the migration cutover window—communicate the event
Restrict access or make source data read only—prevent new writes to the old data source
Execute a final incremental—copy all final data to the new storage system
Validate final incremental—validate that the source data is ready for the cutover
Execute final testing—the final cutover testing is completed
Make a go or no-go call on a full cutover—decide if the migration should continue or roll back
Update connection and name resolution protocols; DNS, CNAMEs, DFS, and scripts
Enable new storage to read/write—enable writes to the new storage system
Continue testing and user acceptance—continue to test and monitor as production traffic moves over to the new storage system
Execute the redirection of clients to the new storage system—initiate the client redirection process
Monitor—assess the cutover, new storage system, and clients
When executing a cutover event, the best practices that follow are recommended.
Best practice—Follow the cutover schedule
Follow a cutover schedule.
Why: By following a well-defined schedule, you can monitor and control the migration. Dell EMC Isilon recommends that
you execute cutovers during off hours or when the number of active connections is low.
Best practice—Test the migrated cutover data
Prepare a number of data and workflow tests to execute against the migrated data. Have a number of well-defined
production use cases, data tests, and test users available to conduct post-cutover testing and review.
Why: Having a well-defined use case and users to validate the migration cutover will help you in making the decision to
continue with the cutover.
Best practice—Monitor clients and application during migrations
Monitor client and application connections to the new storage system during the cutover.
Why: This will verify that your cutover methodology is working as defined and that clients are moving and connecting to the
new storage system successfully.
Best practice—Develop a client connection remediation plan
Have a client remediation plan in place that is ready to execute against clients that exhibit issues connecting to the new data targets.
Why: Have a well-defined strategy to handle client connection issues, including a dedicated support line, email address,
or an IT desk.
The go or no-go decision
During the migration cutover window, a critical point will be reached. This threshold determines whether you will continue
with the cutover or abort the cutover and roll back to the existing storage system.
Common abort cutover situations include the following:
Final incremental does not complete in the outage window
Cutover methodology fails; clients are not connecting correctly
Security issues with the new storage system
Workflow issues post-cutover
Load and availability problems
Other unknown issues
Best practice—Clearly define your cutover criteria
It is critical that you have a series of cutover criteria that clearly defines when a migration will continue or be aborted and
rolled back.
Why: The criteria remove uncertainty, aid decision-making, and dictate the best action to take.
Once you begin to write data to the new storage system, reverting to the old system becomes much more complicated
because you now need to reconcile data with the original storage system.
Rollback
If a decision to abort the cutover is made, there should be a well-defined rollback plan in place that was developed and
tested ahead of time so that you can restore data access as quickly as possible.
Rollback plan:
Prevent any new writes to the new storage system
Move client connections back to the old storage system
Enable writes to the old storage system
Best practice—Develop a rollback plan
Have a clearly defined rollback strategy that is easy to implement and which can restore user access to data quickly and
cleanly. Also, make sure the plan is tested.
Why: A rollback plan will help you restore client data access quickly in the event that a migration cutover event fails.
If any data has already been written to the new storage system and a rollback is executed, then steps to remediate this
data must be taken to restore the new data back to the original storage system.
Common strategies for reconciling data during a rollback are as follows:
Manually reconcile the data—identify and manually move any data from the new storage system to the old one.
Perform a reverse incremental—run migration-type jobs in the reverse direction to update the old storage system.
Discard the data—consider the data noncritical and decide not to reconcile it.
Rewrite the data from the client or application to the old storage system—allow applications and clients to rewrite the data.
The goal of any rollback strategy is to limit the impact on end users and restore data access as seamlessly as possible. It is
for this reason that your migration cutover criteria should be well defined and that the rollback strategy should have been
well tested in the event that you need to implement it.
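The reverse-incremental strategy can be sketched with the same rsync toolset used for the forward migration. This is a sketch only: the function name and paths are illustrative, and `--update` is used so that files clients have already rewritten on the old system after rollback are not clobbered.

```shell
# Sketch: push data written to the new cluster during the failed cutover
# back to the old storage system. --update skips any file on the old
# side that is newer than the cluster's copy, so post-rollback client
# writes on the old system are preserved.
reverse_incremental() {
    local new_side="$1" old_side="$2"
    rsync -a --update "$new_side/" "$old_side/"
}

# Example (illustrative paths, assuming both sides are mounted locally):
# reverse_incremental /ifs/data/filer1/acct /import/acct
```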
Migration event completion
After you successfully complete a cutover, you should continue to monitor the new storage system.
Best practice—Monitor the new storage post-cutover
Continue to monitor the new storage system after the cutover event for any issues that may result from the cutover.
Why: Production load and workflow may be unpredictable; closely monitor the new storage system to rectify any post-
migration issues.
Steady state
Repetition
Most data migrations will consist of multiple cutover events. Once you have developed a well-structured migration
methodology, these additional cutover events should be run with the same plan and strategy.
Lessons learned
After you complete a migration, you should assess the success and failures of the methodology. If an additional migration
needs to be performed, the lessons you’ve learned from the migration process will enable you to refine the process.
Ask yourself the following questions:
What worked during the migration and cutover?
What did not work during the migration and cutover?
Can the migration methodology be modified or optimized?
Conclusion
The goal of this paper is to supply you with solid guidance for conducting an NFS single protocol file migration from a NAS system
to an Isilon cluster. The guidance is based on
a comprehensive set of industry knowledge and best practices on the technical aspects and process of data migration. As stated
in the beginning of this document, this paper does not aim to be an exhaustive authoritative source on the subject of NFS single
protocol migrations, but rather a comprehensive reference document that covers the key areas that will help ensure your success.
Dell EMC can provide comprehensive services, including migration services, Isilon training and education, and residency
services, to reduce risk and maximize system uptime and service levels during and after a data and system migration.
Appendix: Sample migration use case
This appendix provides a sample high-level overview of how to collect information for and plan a migration of NFS source
directories from a single NFS server. Keep in mind that the information in this section provides only a skeleton of some of the
information that you would want to collect, as well as an overview of the strategy that you will want to define for your migration.
The use case that follows answers the following question at a high level: What are the recommendations and best practices as
well as supported Isilon configurations to migrate all directories and data?
Tables A-1 through A-3 describe a sample migration from an NFS filer storage system to an Isilon cluster.
Table A-1. Source data
Source configuration and data:
Single source system—NFS filer, 4 x 1 Gb Ethernet ports
Total data: 25 TB
Max. file size: 4 GB
Min. file size: 0 B
Avg. file size: 256 KB
File count: 8,000,000
8 top-level exports: acct, engineering, home, production, RandD, scratch, temp, work
User home directories: each user has a single home directory under a higher-level export, /exports/home.
Sample directory structure:
/exports
  /acct
  /engineering
  /home
  /production
  /RandD
  /scratch
  /temp
  /work
Table A-2. Exports and permissions
drwxr-xr-x 10 root wheel 512 Aug 27 14:50 .
drwxr-xr-x 21 root wheel 512 Aug 27 14:50 ..
drwxrwxr-x 2 root rd 512 Aug 27 14:50 RandD
drwxrwxr-x 2 root acct 512 Aug 27 14:50 acct
drwxrwxr-x 2 bob eng 512 Aug 27 14:50 engineering
drwxrwxr-x 2 root users 512 Aug 27 14:50 home
drwxrwxr-x 2 prod prod 512 Aug 27 14:50 production
drwxrwxrwx 2 root wheel 512 Aug 27 14:50 scratch
drwxrwxrwx 2 root wheel 512 Aug 27 14:50 temp
drwxrwxr-x 2 root eng 512 Aug 27 14:50 work
Table A-3. Additional source data information
Additional source information:
All LDAP: single domain
Source system network connectivity: 4 x 1 Gb
No firewalls, IDS, or QoS
Monthly full backups
Antivirus scanning in place
Same data center as the Isilon cluster
No deduplication or offline files
DNS: 2 CNAMEs
No routing or VLAN restrictions
Additional Isilon information:
Dell EMC Isilon X200 x 3: ~61 TB
OneFS 7.0.x
LDAP authentication
3 x LACP (2 x 1 Gb/s each)
SmartQuotas
SnapshotIQ
Requirements
All data and permissions are moved as is with no changes. All existing POSIX permissions and ownership are retained.
Eight cutover events; 12-hour window—Saturday 8:00 P.M. through Sunday 8:00 A.M.
One migration is performed per weekend.
Each user has a defined quota; quota limits are to be replicated on the Isilon cluster.
Migration project assumptions (including, but not limited to):
Customer will have approved change controls submitted for any migration activity.
Migration plan and design will have been reviewed and approved by the customer prior to the start of the cutovers.
Any recommended array OS upgrades (and firmware updates) necessary for the migration will be applied before any migration cutover activity occurs.
Source NAS and target Isilon systems must be in a known good state prior to conducting the migration.
The customer will have successfully completed a full system backup and verified its reliability.
Strategy
For an Isilon-based migration, mount source NFS exports directly on individual Isilon nodes
Conduct pilot migration; validate methodology, document performance metrics, refine and tune rsync switches and scripts; test migrated data with clients and users
Investigate sizes and file counts in each export; this will help determine the order for the migration (that is, the largest directories will take the longest time and should be started first)
Validate change rates and time to execute incremental copies
Have completed a full backup of all file systems that are to be migrated before the cutover; verify that the backup is good
Develop a detailed project timeline and cutover schedule with the customer
Develop a backout plan, and review it with the customer
Execute the migration in phases: execute initial full copies, followed by nightly incremental copies
Use a DNS update methodology to redirect clients
During the cutover, make sure that each source file system is changed to a read-only state after the source directory is successfully replicated to prevent clients from making any changes
Reduce DNS TTLs in advance of cutover windows
Develop client communication; the customer should provide dedicated cutover IT support desk/personnel
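The "investigate sizes and file counts" step above can be scripted with standard tools. The following is a minimal sketch, assuming the source exports are mounted locally (the `/import` base path and function name are illustrative, not from the source system):

```shell
# Sketch: inventory each export's file count and on-disk size so the
# migration order can be planned (largest first). Paths are illustrative.
inventory_export() {
    local dir="$1"
    local files kib
    files=$(find "$dir" -type f | wc -l)       # regular files only
    kib=$(du -sk "$dir" | awk '{print $1}')    # size in KiB
    printf '%s\t%s files\t%s KiB\n' "$dir" "$files" "$kib"
}

# Example usage over all mounted exports:
# for d in /import/*; do inventory_export "$d"; done
```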
Source system configuration
If possible, restrict access so that clients cannot modify data during the cutover (change to read only).
Isilon configuration
Pre-create NFS exports with identical export permissions
Disable snapshots on data until the cutover is completed
Disable antivirus scanning
Toolset selection
Use rsync.
Example:
rsync -a --delete sourcefiler:/exports/acct /ifs/data/exports/acct --exclude '.snapshot*'
Migration testing
Map source and target directories from the migration host
Replicate a small test set of data
Use a small set of test users to validate full data access (discard or overwrite data following the test)
Migration
This example directory structure can be broken down into at least eight separate rsync jobs. Several of the jobs can be run
in parallel, assuming the source cluster can tolerate the additional load while the migration is occurring. Monitor the source
cluster and scale it up or down accordingly.
If the customer has a preference for the order of directory migration, then plan the transfers accordingly. Otherwise, start
with the largest directories because they will take the most time to replicate.
Example:
Initial migration testing indicated that three simultaneous rsync jobs would be an acceptable additional load on the source
cluster. Run three rsync jobs concurrently so that you do not overload the source cluster. Note that in other scenarios,
additional nodes could run additional rsync jobs in parallel, if the source cluster has enough performance and bandwidth to
accommodate this.
Mount an individual export to a different node and start an rsync:
Isilon node 1:
Mount filer1:/export/acct with a command similar to:
mount filer1:/export/acct /import/acct
(mounts the remote export on a locally created directory)
Rsync to /ifs/data/filer1/acct with a command similar to:
rsync -av /import/acct /ifs/data/filer1 --exclude '.snapshot*'
(This copies files in "archive" mode, which preserves symbolic links, devices, attributes, permissions, ownership,
and so on in the transfer. No compression is applied, and snapshot directories are excluded from the transfer.)
Repeat with Isilon node 2:
Mount filer1:/export/engineering and rsync to /ifs/data/filer1/engineering.
Do the same with Isilon node 3:
Mount filer1:/export/home and rsync to /ifs/data/filer1/home.
Monitor the source cluster to verify that it is not overloaded with the additional strain of migrating data. The idea is to avoid
impacting clients as the migration progresses.
Repeat this process with the remaining exports until all of the source data is migrated.
Once an initial full copy of the source data has been completed, incremental copies should be run to propagate any
changes that were made once the migration began.
On the day of the cutover, final incremental copies should be run and access to the source cluster should be restricted if
possible to prevent clients from writing data that may not be migrated.
User acceptance testing
Verify that data and permissions on the Isilon cluster are replicated correctly following the final copies
Review user access and verify that users have connectivity and correct permissions
Use a small set of test users to validate that full data access occurs during the migration event to identify any problems early on
Verify that a user can read/write to a file, create a new file and directory, and traverse the directory structure
Monitor performance of the Isilon cluster and client connections as load increases
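The first verification item above can be spot-checked with a small script. The sketch below compares a source mount and its migrated copy by an aggregate content checksum; paths and function names are illustrative, and permissions and ownership still need a separate review.

```shell
# Sketch: compare two directory trees by an order-independent aggregate
# checksum over their regular files. Catches content drift after the
# final incremental; it does NOT compare permissions or ownership.
checksum_tree() {
    # Relative paths keep the comparison independent of the mount point.
    (cd "$1" && find . -type f -exec cksum {} + | sort | cksum)
}

verify_export() {
    local src="$1" dst="$2"
    if [ "$(checksum_tree "$src")" = "$(checksum_tree "$dst")" ]; then
        echo "OK: $src and $dst match"
    else
        echo "MISMATCH: $src and $dst differ" >&2
        return 1
    fi
}

# Example (illustrative paths):
# verify_export /import/acct /ifs/data/filer1/acct
```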
Cutover plan
Make source file systems read only
Execute final incremental copies
Update DNS
Verify that clients can connect to Isilon
Test automated workflows, if possible
Initiate user logoff and re-logon
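The "Update DNS" step in the cutover plan (and its reversal in the rollback plan) can be scripted. This is a minimal sketch assuming a BIND-style dynamic DNS setup via `nsupdate`; the alias, target names, TTL, and key path are all illustrative assumptions.

```shell
# Sketch: generate an nsupdate batch that repoints a storage CNAME.
# The same function serves cutover (point at the cluster) and rollback
# (point back at the old filer). Names and TTL are illustrative.
make_cname_update() {
    local alias="$1" target="$2" ttl="${3:-300}"
    cat <<EOF
update delete ${alias}. CNAME
update add ${alias}. ${ttl} CNAME ${target}.
send
EOF
}

# Cutover:  make_cname_update storage.example.com cluster.example.com | nsupdate -k /etc/rndc.key
# Rollback: make_cname_update storage.example.com oldfiler.example.com | nsupdate -k /etc/rndc.key
```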
Rollback plan
Reverse DNS update
Make old source file systems read/write and remove any restrictions
Remove any connections to the Isilon cluster and stop exports
Note that any data that was written to the Isilon cluster is considered lost; no reverse or reconciliation process will be performed.
Exit criteria
DNS resolves to new target storage and shares
Confirm that clients can successfully read/write data to directories
Confirm that workflows were successfully completed with no user or permission problems
Verify that there are no connectivity issues
Post-cutover
Conduct a customer meeting to review and triage the migration
Create documentation of the migration protocol
Conduct a “lessons learned” discussion with both the internal customer and their team