WHITE PAPER
NFS FILE MIGRATION TO A DELL EMC ISILON CLUSTER
Guidance for optimal data migration of NFS workflows
ABSTRACT
This paper provides technical information and recommendations to help you migrate
data from a single NFS protocol workflow on another NAS vendor to a Dell EMC Isilon
storage cluster. It includes the best practices for planning, setting up, and executing the
migration.
December 2016
The information in this publication is provided “as is.” DELL EMC Corporation makes no representations or warranties of any kind with
respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose.
Use, copying, and distribution of any DELL EMC software described in this publication requires an applicable software license.
DELL EMC2, DELL EMC, the DELL EMC logo are registered trademarks or trademarks of DELL EMC Corporation in the United States
and other countries. All other trademarks used herein are the property of their respective owners. © Copyright 2016 DELL EMC
Corporation. All rights reserved. Published in the USA. 12/16 White Paper H12517
DELL EMC believes the information in this document is accurate as of its publication date. The information is subject to change without
notice.
DELL EMC is now part of the Dell group of companies.
Table of Contents
INTRODUCTION ........................................................................................................................6
Assumptions ...................................................................................................................................... 6
Audience ........................................................................................................................................... 6
Prerequisites ..................................................................................................................................... 7
The challenge of data migration ........................................................................................................ 7
Risk management ............................................................................................................................. 7
Data integrity ..................................................................................................................................... 8
Data availability ................................................................................................................................. 8
PROJECT PHASES AND METHODOLOGY OVERVIEW ........................................................8
Discovery and planning phase .......................................................................................................... 8
Key aspects of the planning phase ............................................................................................................ 8
Migration approach and requirements ........................................................................................................ 9
Migration methodology ............................................................................................................................. 10
Migration toolset selection ........................................................................................................................ 10
Testing the migration methodology ................................................................................................. 10
Data migration testing .............................................................................................................................. 10
User acceptance testing ........................................................................................................................... 11
Cutover methodology testing ................................................................................................................... 11
Rollback strategy testing .......................................................................................................................... 11
Executing the migration ................................................................................................................... 11
Data transfer ............................................................................................................................................ 12
Cutover .................................................................................................................................................... 12
Acceptance .............................................................................................................................................. 12
Rollback ................................................................................................................................................... 12
Repetition ................................................................................................................................................ 12
Post-migration ................................................................................................................................. 13
SINGLE PROTOCOL NFS DATA MIGRATION ..................................................................... 13
Challenges of a single protocol NFS data migration ........................................................................ 13
Data-specific considerations ........................................................................................................... 13
Migration requirements and customer data collection ..................................................................... 15
Requirements gathering ........................................................................................................................... 15
Current infrastructure and data analysis ................................................................................................... 16
Determine migration methodology ................................................................................................... 16
Migration methodology considerations ..................................................................................................... 16
Migration sequencing ............................................................................................................................... 17
Type of migration ..................................................................................................................................... 18
Host-based migrations ............................................................................................................................. 18
Isilon-based migrations ............................................................................................................................ 20
Migration tool selection and use ............................................................................................................... 21
Migration tools ......................................................................................................................................... 22
rsync ........................................................................................................................................................ 22
Isilon-based migrations – isi_vol_copy ..................................................................................................... 26
Isilon-based migrations from VNX—isi_vol_copy_vnx .............................................................................. 28
MIGRATION PREPARATION ................................................................................................. 29
Infrastructure and environment setup .............................................................................................. 29
Source host preparation .................................................................................................................. 29
Migration host preparation—Source and target access ................................................................... 30
Isilon cluster configuration preparation ............................................................................................ 31
Additional Isilon cluster considerations ............................................................................................ 32
NFS group membership limitation ................................................................................................... 32
Isilon guidelines for large workloads ................................................................................................ 35
MIGRATION APPROACH—TESTING AND PROOF OF CONCEPT .................................... 37
DATA VALIDATION ............................................................................................................... 37
PERFORMANCE .................................................................................................................... 38
USER ACCEPTANCE TESTING—DATA AND WORKFLOW TESTING .............................. 38
START OF MIGRATION EXECUTION ................................................................................... 39
PRE-CUTOVER PREPARATION ........................................................................................... 39
CUTOVER EVENT .................................................................................................................. 40
THE GO OR NO-GO DECISION ............................................................................................. 41
ROLLBACK ............................................................................................................................ 41
MIGRATION EVENT COMPLETION ...................................................................................... 42
STEADY STATE ..................................................................................................................... 42
CONCLUSION ........................................................................................................................ 42
APPENDIX: SAMPLE MIGRATION USE CASE .................................................................... 43
Introduction
This white paper outlines the recommended approach for migrating single protocol Network File System (NFS) data from
other network-attached storage (NAS) systems to a Dell EMC® Isilon® storage cluster. Single protocol NFS data is defined
as data read, written, or modified using the NFSv2 or NFSv3 protocols. This paper includes best practices for Isilon cluster
configuration, tool selection, and host setup to optimize an NFS-based data migration, along with best practices to optimize
performance, management, and support.
Although this paper addresses a single NFS protocol data migration, the approach and many of the best practices
described can be used as a foundation for other types of data migration.
Much of the relevant information for planning, provisioning, and supporting end-user directories on an Isilon storage cluster
is available through white papers and guides from Dell EMC Isilon at http://support.emc.com. As a result, this paper avoids
duplicating this other content and includes only the information that pertains to setting up and operating an Isilon cluster as
a destination for a single protocol NFS data migration.
Assumptions
This document focuses on the data migration of NFS; it does not specifically address the migration to an Isilon cluster of
NFS exports, local users and groups, or any other NFS configuration from another NAS system.
This document should not be used in a multiprotocol migration. A multiprotocol migration requires many additional
considerations, and the specific actions required for such a migration may be different.
Additionally, this paper assumes that:
The source data is accessed only through a single protocol NFS workflow. Further, authentication and authorization for POSIX users (UID/GID) need to be consistent on the source and destination migration clusters. Lightweight Directory Access Protocol (LDAP), Network Information Services (NIS), distributed local files, and Active Directory (AD) with RFC2307 enabled (SFU) are all centralized methods of authentication that should be successfully implemented before the migration begins.
An authoritative and consistent source for user authentication is assumed (for example, AD with RFC2307, LDAP, NIS, and so on). If there are multiple sources of authentication from disparate clusters targeted for migration (that is, several different local /etc/passwd files), they must be inspected for collisions and, if necessary, manually combined to prevent UID and GID collisions. Identity management that prefers a single external directory service is recommended. If necessary, work with the customer to establish a consistent and authoritative set of users and groups.
Workflows will be for NFSv3 with the possibility of adding Server Message Block (SMB) workflows in the future; NFSv4 is not addressed in this document. The Dell EMC Isilon file system will have a “balanced” on-disk identity setting for global access control list (ACL) policy.
Files with POSIX mode bits only are in scope for this document; additional permissions such as SMB ACLs are not covered.
Multiple source clusters may have identical export directories. A directory consolidation plan must be developed to address directory name collisions in the newly created single namespace on the Isilon cluster.
Files and directories have unique permissions that restrict access to the intended users and groups; a typical migration preserves and transfers them. Post-migration permission transformation is not covered in this paper.
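The UID and GID collision check called out above can be sketched with standard tools. The passwd entries below are demo stand-ins created for illustration; in practice you would compare each source filer's /etc/passwd (or `getent passwd` output):

```shell
# Illustrative sketch: detect UID collisions across the passwd files of two
# source systems. The files created here are demo stand-ins; real input would
# be each filer's /etc/passwd or the output of `getent passwd`.
P1=$(mktemp) && P2=$(mktemp)
printf 'alice:x:1001:1001::/home/alice:/bin/sh\n' >  "$P1"
printf 'bob:x:1002:1002::/home/bob:/bin/sh\n'     >> "$P1"
printf 'carol:x:1001:1001::/home/carol:/bin/sh\n' >  "$P2"

# A UID mapped to different user names on different sources is a collision
# that must be reconciled before the migration begins.
collisions=$(cat "$P1" "$P2" | awk -F: '{print $3, $1}' | sort -u \
    | awk '{print $1}' | uniq -d)
echo "colliding UIDs: $collisions"
```

Here UID 1001 is assigned to `alice` on one source and `carol` on the other, so the sketch reports it as a collision to resolve manually.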
Audience
This paper is intended for experienced system and storage administrators who are familiar with file services and network
storage administration.
The document assumes you have a working knowledge of the following:
NAS systems
The NFS storage protocol, as is appropriate for the specific migration requirements
The Isilon scale-out storage architecture and the Dell EMC Isilon OneFS® operating system
Additional Isilon features, including Dell EMC Isilon SmartConnect™, SmartPools® policy management, SnapshotIQ™, and SmartQuotas™
File system management concepts and practices, including provisioning, permissions, and performance optimization
Integration practices for connecting and establishing authentication relationships using centralized sources (LDAP, NIS, local files, and so on)
Basic shell commands, command line operation, and basic shell scripting
While this paper is intended to provide a consolidated reference point for data migrations to an Isilon storage cluster, it is
not intended to be the authoritative source of information on the technologies and features used to provide and support
a file services platform. In the event that additional services are required, Dell EMC IT services are available to assist with
streamlining data migrations, reducing risk, and minimizing impact.
Prerequisites
Some of the features that are described or recommended in this document may require separate per-node licensing from
Dell EMC Isilon. For more information, please contact your Dell EMC Isilon representative.
The challenge of data migration
The migration of a storage system’s data and all the existing user access permissions is a complex process. Moving the
data while limiting downtime and protecting the data can be challenging. While you execute a migration, it is critical that
access to the data is available at all times and that data integrity is ensured to protect against data loss or corruption.
It is critical to understand that a data migration is a unique project. Few environments are the same, and, as a result, each
migration should be considered
a unique event. No preexisting approach will necessarily be appropriate for all migrations. That is not to say that common
approaches cannot be used after you evaluate and understand the requirements of a specific migration. The goal of this
white paper is to introduce the recommended approach to designing and executing
an NFS data migration to an Isilon cluster. Every migration is different; this paper provides some examples and guidance.
Dell EMC Professional Services provides expertise in working with customers to build an individual plan that meets their
needs. Whether you will use Dell EMC Professional Services for the migration or you plan to manage the migration in
house, the following are key areas that must be considered before the start of any migration project:
Investigate the composition of the source data. Is it a deep directory structure or a wide structure with many files per directory? The type and number of files and directories will directly influence how the migration is planned and executed.
Understand the sequence of a migration project—this is critical to its success. The ability to predict and manage the time required to execute the data movement is paramount—it may be the single biggest factor that will affect the project’s success.
Maintaining data availability throughout the lifecycle of the data migration project is also critical. Little of today’s data can be unavailable for days at a time. In order to maintain data availability, you will need a strategy to maintain access to data throughout the migration process.
Risk management
It is not uncommon to have a number of challenges or perceived problems that can potentially act as blocking issues or barriers to executing the migration. With sufficient planning and testing, such perceived risks can be addressed and managed successfully.
Common risks associated with data migrations include the following:
Amount of data (that is, large total volume, high file counts) and the potential for needing an extended period of time and effort to migrate it
Potential for performance impacts to the existing data solution and the customer network during the migration
Maintaining continuous access to the customer’s data throughout the migration
Potential for required changes to the data permission models on the new target system for the migration
Inherent challenges with moving multiple client connections to the new target system
Maintaining a consistent security model after the migration
Execution of the actual cutover event—with many moving elements that can increase the probability of error
This paper helps you understand these challenges and risks, and enables you to develop a data migration methodology
that manages these risks while implementing Isilon best practices to facilitate and optimize the migration.
Data integrity
Data integrity is critical. The data must be moved exactly as it is, and any modification to the data during migration may
impact the availability of the data and the success of the migration. The goal of the project is to ensure that the data is
successfully migrated and that its integrity is not compromised during its movement. In most cases, this includes the
migration of all relevant file metadata, as well as the underlying data blocks. A complete backup of the source data should
be made, and the validity of the backup should be verified before the migration begins.
Cases that demand the strongest integrity guarantees can be addressed with checksums. An MD5 checksum can be
calculated for each file on both the source and destination systems post-migration to verify bit-for-bit integrity.
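As a sketch of that checksum pass, the standard `md5sum` utility can build a manifest on the source and replay it against the target. The temporary directories below are local stand-ins for the mounted source export and Isilon target export:

```shell
# Demo stand-ins for the mounted source export and the Isilon target export.
SRC=$(mktemp -d) && DST=$(mktemp -d)
echo "alpha" > "$SRC/a.txt"
mkdir "$SRC/dir" && echo "beta" > "$SRC/dir/b.txt"
cp -a "$SRC/." "$DST/"                      # stands in for the migration copy

# 1. Build a manifest of per-file MD5 checksums, relative to the source root.
( cd "$SRC" && find . -type f -print0 | xargs -0 md5sum ) > /tmp/migration.md5

# 2. Replay the manifest against the target; md5sum exits non-zero on any
#    mismatch, flagging files that were altered in flight.
if ( cd "$DST" && md5sum --quiet -c /tmp/migration.md5 ); then
    echo "checksum verification passed"
else
    echo "checksum MISMATCH detected" >&2
fi
```

Because the manifest paths are relative, the same file works from either root; on large datasets, expect the checksum pass to add significant read load and wall-clock time on both systems.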
Data availability
Any migration activity will require a transition or cutover from the existing source systems to the new destination systems
so that a customer’s data clients (end users) can continue to access their data once the data has been moved. This
cutover will require a window of time when the data is unavailable to a customer’s end users. Minimizing this time is the
goal of all migrations, and the time needed for this
process is often determined by the type and the amount of data. A number of migration strategies can be employed to
reduce the period when data is unavailable during the cutover.
Project phases and methodology overview
A data migration project should be broken into distinct phases. The goal of the project phases is to develop a robust and
repeatable migration strategy that aids the migration’s execution and leads to a successful migration cutover.
Discovery and planning phase
The goal of the discovery and planning phase is to design a migration methodology and plan that enables you to execute
the project with minimal risk and downtime. Following are components of the discovery and planning phase:
Qualify the project
Identify the migration scope
Understand expectations
Identify risks
Define the timeline
Identify all migration requirements (for example, a rollback plan)
Key aspects of the planning phase
During this phase, a detailed review of the existing source environment and data is undertaken, and then developed into a
plan to migrate the data to the new target Isilon environment. The planning phase should be completed and validated
before the other project phases are started. The key aspects of the planning phase include discovery of the existing
infrastructure, the data, and the Isilon cluster.
Infrastructure discovery
This is the point where the infrastructure of the existing storage system, network architecture, and the network path
between the source data and the Isilon cluster
are evaluated.
If, for example, multiple source filers are to be combined into a single Isilon cluster, then an extensive analysis of existing
exports should occur. Export directory naming collisions (that is, server1:/exports/home and
server2:/exports/home are to be combined from different source filers) should be investigated to verify that the
data and directories contained within them will be able to coexist in a unified export on the target Isilon cluster (for example,
isilon:/ifs/data/home):
server1:/exports/home
/user1
/user2
/user3
server2:/exports/home
/user4
/user5
/user1
If the /exports/home directories are to be unified into a single /home on the Isilon cluster, the duplicate /user1 directory
must be addressed. Possible solutions include combining (if the user is the same) or renaming the directories (if there are
different users), along with making a corresponding UID change, if necessary. Work with the customer to address these
issues and craft a solution that will be minimally disruptive to their existing environment.
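A simple sketch can surface these collisions before consolidation. The directories below are local stand-ins for server1:/exports/home and server2:/exports/home mounted for analysis:

```shell
# Demo stand-ins for server1:/exports/home and server2:/exports/home,
# mounted locally for the export analysis.
S1=$(mktemp -d) && S2=$(mktemp -d)
mkdir "$S1/user1" "$S1/user2" "$S1/user3"
mkdir "$S2/user4" "$S2/user5" "$S2/user1"

# Names that appear under both exports collide in the unified /home and
# must be combined (same user) or renamed (different users) beforehand.
dupes=$( { ls "$S1"; ls "$S2"; } | sort | uniq -d )
echo "colliding directories: $dupes"
```

Matching the example above, the sketch reports /user1 as the name that must be reconciled before the two exports can share a single /home on the Isilon cluster.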
Data discovery
This is the point where you analyze the data and workflows that you plan to migrate and determine how they map to the
target end state on the Isilon cluster.
Quotas
If quotas are utilized on the source volume, they will need to be recreated on
the target Isilon cluster. Implementation of the quotas, however, should not take place until after the migration is complete
to avoid any potential issues while transferring data.
Isilon cluster configuration design and discovery
This is the point where activities such as the design of the Isilon network, disk pools, shares, and authentication can affect
the migration design. The configuration of the cluster is critical to the success of the migration.
The output of the discovery phase influences the migration design and drives the execution of the project.
Migration approach and requirements
The analysis of the data that you collected during the discovery phase drives the migration requirements and the migration
plan. The migration requirements break down into subcategories that ask the questions what, how, and when:
o What—What are you migrating?
All the data or a subset of the data
Replicate the existing data as it is, or transform it during the migration
Copy the data but implement a new security model
Take a hybrid approach
o How—How are you going to migrate the data, security, and workflows?
Tools used to copy the data and security
Cutover strategy; how will client connections be moved?
If the data has a rapid rate of change, how will you accommodate it?
Data is static and can be moved without impact
Full data copies and follow-up incremental copies to gather recently updated data
Clients access this data currently by method x/y/z
Limit access to the old data and redirect during the cutover
o When—When are you implementing the cutover?
A single mass event
Several large cutover events
A series of smaller cutovers sustained over a longer time frame
Rolling migration
Once these requirements have been clearly defined, a migration methodology can be developed to address them.
Migration methodology
Analysis of the migration requirements leads to the development of a migration methodology. The migration methodology
follows a waterfall methodology with phases generally occurring on completion of the prior phase. Although the preparation
for upcoming phases can occur before prior phases are completed, the execution is defined by the completion of its
dependent phase.
Figure 1. Sample migration plan
Figure 1 shows a sample migration plan. The plan addresses how all aspects of the migration are achieved: sequencing,
tools, timing, communication, and implementation. After you develop a migration plan, a proof of concept can help
you evaluate the approach and test the phases of the plan.
Migration toolset selection
After you finish the discovery phase and develop a methodology, you can select a toolset (host based, array based, and so
on) to copy the data and permissions.
Testing the migration methodology
After you develop a migration methodology, you must review, validate, and test the migration plan. A test migration is
usually run on a subset of the data. Running a
test migration is also invaluable in helping to estimate the performance and timing
of a migration.
Data migration testing
Testing the actual data movement process and execution is the first phase in testing the overall methodology. The data
migration testing determines whether the proposed methodology meets the requirements and accomplishes the goals of
the project.
The role of data migration testing is as follows:
Validates the tool selection; does the tool do what you want it to? Does it copy the data and attributes? Does it preserve hard and/or soft links?
Validates the data transfer; is the data moved as expected?
Validates that the permissions are copied over; are the permissions correct, functional, and operational?
Benchmarks the performance of the data transfer; how long does it take to run full and incremental data transfers?
Tests the new data: Is the data available? Are the read/write settings correct? Does the new workflow function correctly?
Gives you the option to experiment with different methods, tools, and flags
Enables you to tune the process to achieve the best results
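One such tool-validation probe is checking that hard and soft links survive the copy. In this sketch, GNU `cp -a` stands in for whichever copy tool is actually under evaluation, and the probe dataset is created locally:

```shell
# Build a small probe dataset containing a hard link and a symlink.
SRC=$(mktemp -d) && DST=$(mktemp -d)
echo "data" > "$SRC/original"
ln "$SRC/original" "$SRC/hardlink"
ln -s original "$SRC/symlink"

# Run the candidate tool; GNU `cp -a` stands in here for the tool under
# evaluation (rsync, isi_vol_copy, and so on).
cp -a "$SRC/." "$DST/"

# Hard links share an inode number, so compare inodes with stat; a
# preserved symlink should still be a link, not a dereferenced copy.
if [ "$(stat -c %i "$DST/original")" = "$(stat -c %i "$DST/hardlink")" ]; then
    echo "hard links preserved"
fi
[ -L "$DST/symlink" ] && echo "symlink preserved"
```

The same pattern extends to other attributes worth validating, such as ownership (`stat -c %u:%g`), mode bits (`stat -c %a`), and modification times (`stat -c %Y`).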
The testing should give you confidence that the data will be accessible and available after all the data
and users are transferred to the new system.
User acceptance testing
Before you execute the full migration and cutover, user acceptance testing (UAT) should be undertaken against the new
storage system and a sample of the migrated data. UAT validates that the data is ready for cutover by checking that:
Data is accessible; users and applications can access the data correctly
Permission models are correct; the required security is applied to the migrated data
Workflows are operational; there are no issues with using the data
Cutover methodology testing
Cutover methodology testing helps determine how you will move client connections—and how the clients will respond to
the cutover. Through testing, you can gauge how long it takes to move the connections, what kind of issues may occur,
and how to troubleshoot any issues. Testing the cutover strategy thoroughly provides feedback on how to execute the final
cutover.
Rollback strategy testing
You should also test your rollback methodology. The rollback testing should validate that your plan to failback or abort a
migration works so that you are prepared in case any issues occur during the cutover. Be sure to validate that access to
the data on the old system can be restored quickly and efficiently without affecting users.
Executing the migration
Once all the methodology and processes have been completed and validated, you can move on to the main migration.
The core migration phases are as follows:
Data transfer—all the data is migrated from the old system to the new system
Cutover—connections and clients are moved to the data on the new storage
Acceptance—the new data source is ratified
Rollback—a process used only if required
Steady state and repeat—the migration phase is considered complete, but additional separate migrations may occur
Post-migration monitoring—the new system and data are monitored following the cutover
Data transfer
A standard approach to data transfer is to execute an initial “full” data copy that moves all the initially identified data,
followed by a series of incremental copies that move only the data that has changed since the full copy ran. This approach
gives you the most flexibility in executing the cutover, because the incremental copies complete substantially faster than
the initial full copy.
Cutover
After data migration, the process of actively moving clients from the old storage system to the new storage system will
occur during a cutover event.
At a high level, the cutover plan includes the following steps:
For the old source—remove write access to ensure that clients are unable to write any new data
Execute a final incremental copy to migrate any remaining data from the old system to the new system
Test new target data and connectivity; selective UAT
Go or no go—decide whether to move forward with the cutover event
Update the client-to-storage connection mechanisms—Domain Name System (DNS), Distributed File System (DFS), virtual IPs (VIPs), and so on
Monitor the new storage system—monitor load and connections as the clients are transferred
Validate clients—review and validate that clients can successfully connect and operate
Validate workflow—verify that business operations work as expected
The cutover is complete
Acceptance
Once the data and client connections have been migrated over to the new storage solution, the storage availability and
workflow acceptance of the new data and storage solution must be validated.
Once you begin writing new data to the new storage system, the ease with which you can roll back to the old storage
system diminishes: any changed data would need to be copied back to the old environment. Unless this newly written data
can be discarded, rewritten, or manually reconciled, Dell EMC Isilon strongly recommends that any rollback be executed
before significant changes are made to the data on the new storage system.
Rollback
Be sure that you have a fully tested rollback plan in place. A rollback may be needed for a variety of reasons:
Client connectivity or storage name resolution issues develop following the cutover.
The final incremental copy is not completed during the outage window, so not all data was migrated.
An unplanned IT outage or issue occurs at the same time as the migration.
Data access on the new storage system is invalid and workflows are impaired.
The goal of a rollback plan is to quickly restore access to the old data storage solution. Assuming that the cutover was
executed correctly, restoring the prior data access should be straightforward, and it should be possible to implement the
data restore with minimal additional disruption. The primary goal is to restore access within the cutover window so that no
additional downtime and interruption to data occurs. It is critical to have a tested rollback plan that can be used if an issue
with the cutover occurs.
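Where clients mount the storage through /etc/fstab rather than automount maps or DNS aliases, the cutover and its rollback each reduce to a one-line change per client. A minimal sketch, with purely illustrative host and export names:

```
# /etc/fstab (client side) after cutover, illustrative names only.
# Rollback: comment the isilon line and restore the oldfiler line.
#oldfiler:/vol/share/proj       /mnt/proj   nfs   defaults   0 0
isilon:/ifs/data/filer1/proj    /mnt/proj   nfs   defaults   0 0
```

Because the mount point is unchanged, client workflows see the same path before and after either change.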
Repetition
After you validate the data transfer through cutover and client acceptance, most migration projects consist of multiple
migration cycles. The methodology can be executed again on different datasets in migration waves that encompass the
entire project.
Post-migration
Following the migration cutover, it is important to monitor both the new storage system and the old storage system. You
should find that client connections are moving to the new system and that active data connections are no longer initiated
on the old storage system. Can clients connect to and work with the new storage system without issues? As connection
counts increase on the new storage system, you should monitor the load and performance, and make performance
adjustments as needed.
You should be monitoring the following items during and after the cutover:
New system: System load and performance, number of connections, movement of users, security, and performance
Old system: Are users still connecting to it? Are there legacy connections being made to it from old applications?
If user quotas were utilized on the source system, it is appropriate to reimplement them on the target system following the
migration.
The Dell EMC Isilon OneFS SmartLock® feature should likewise be implemented post-migration. Files should be committed
only once the data has been completely migrated.
You should have a transition plan for what you will do with the old storage system. Some common approaches are as
follows:
Keep it around for a while but with administrator access only
Provide read-only access for users
Mothball the system while the new system transitions
Decommission it
Purge the data after a defined retention period has been reached
Single protocol NFS data migration
Although this paper addresses a single protocol NFS data migration, the approach and many of the best practices
described can be used as the foundation for other types of data migrations.
Challenges of a single protocol NFS data migration
Moving large amounts of data presents a number of challenges:
It is difficult to perform such a migration without downtime. Most source clusters are already heavily loaded, must keep data available at all times, and operate at near capacity, which is often part of the reason for migrating in the first place. The migration itself, however, can place significant additional load on the source cluster.
A large number of exports may need to be migrated. You must move not only the data but also the exports and export permissions. This introduces a second type of migration (configuration) that must be undertaken during the project.
The consolidation of multiple source filers into a single unified namespace and directory structure can be difficult to manage.
There may be a large number of differently connected clients that require separate cutover and validation events.
NFS exports may be mounted deeper in the exported tree.
There may be restricted exports to specific hosts and unique permissions. Export options must be verified.
There may be a high rate of change. Often large environments contain a large number of concurrently connected clients. In such cases, you must account for the rapid rate of data change during and after cutover.
Data-specific considerations
When you design a migration strategy, you should determine how you would like your data to appear after it has been
migrated.
Consider the scenario of multiple smaller filers being consolidated into a single Isilon cluster. UID/GID collisions
may occur if the source filers are not using a consistent source of authentication. In that case, manual remediation
may be necessary to combine and fix user and group accounts that have duplicate IDs.
For example:
filer1: has users user1 (UID:305), user2 (UID:423), and user3 (UID:424).
filer2: has users user1 (UID:305), user4 (UID:423), and user5 (UID:424).
UID 423 is shared by user2 (on filer1) and user4 (on filer2), so one of them must be assigned a different UID, and
the ownership of that user's files must be corrected during the migration. The same issue occurs with user3 and
user5, who share UID 424. One possible resolution would be to make user4 UID:1423 and change ownership of all
their files prior to the cutover. Likewise, you could make user5 UID:1424 and modify ownership on all of their files.
Keep in mind that OneFS will store all UID/GID information regardless of the source. OneFS does not require NFS
authentication.
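The collision check itself can be scripted from passwd-format dumps of each filer's user database. A minimal sketch (the file names and contents below are illustrative, mirroring the example above):

```shell
# Build sample passwd-format dumps matching the example above.
cat > /tmp/filer1_passwd <<'EOF'
user1:x:305:100::/home/user1:/bin/sh
user2:x:423:100::/home/user2:/bin/sh
user3:x:424:100::/home/user3:/bin/sh
EOF
cat > /tmp/filer2_passwd <<'EOF'
user1:x:305:100::/home/user1:/bin/sh
user4:x:423:100::/home/user4:/bin/sh
user5:x:424:100::/home/user5:/bin/sh
EOF

# Report UIDs that exist on both filers under different user names.
awk -F: 'NR==FNR { uid[$3] = $1; next }
         ($3 in uid) && uid[$3] != $1 {
             printf "UID %s collides: %s vs %s\n", $3, uid[$3], $1
         }' /tmp/filer1_passwd /tmp/filer2_passwd
```

Against the example data this reports collisions on UIDs 423 and 424; the same approach extends to group files for GID collisions.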
Duplicate export paths present a similar issue:
filer1: has exports /vol/share/acct, /vol/share/work, and /vol/share/eng
filer2: has exports /vol/share/acct2, /vol/share/work and /vol/share/engineering
The problem is that /vol/share/work is the same from both source filers. This issue must be discussed with the
customer prior to migration. A plan for directory consolidation must be developed to deal with export path collisions. One
typical solution is to have an additional directory layer that identifies the original source filer:
isilon: would have exports: /ifs/data/filer1/acct, /ifs/data/filer2/acct2, /ifs/data/filer1/work,
/ifs/data/filer2/work, /ifs/data/filer1/eng, and /ifs/data/filer2/engineering
With this methodology, duplicate export paths can safely be consolidated from multiple source filers into a single cohesive
namespace. Clients, however, must be updated to reflect the updated export paths.
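The naming convention above is simple enough to script. A sketch, where target_path is our own illustrative helper, not an Isilon tool:

```shell
# Map a source filer name and export path to a collision-free target path
# by inserting the filer name as an extra directory layer.
target_path() {
    # $1 = source filer name, $2 = source export path
    printf '/ifs/data/%s/%s\n' "$1" "$(basename "$2")"
}

target_path filer1 /vol/share/work   # /ifs/data/filer1/work
target_path filer2 /vol/share/work   # /ifs/data/filer2/work
```

The two colliding /vol/share/work exports now land in distinct target directories.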
The metadata of files, in particular, can add complexity to a migration. You must identify the metadata that you want to
migrate with the data. The following metadata can affect your migration strategy:
File properties such as access time, created time, modified times, and owners
File attributes such as read only or archive (an Isilon cluster does not support the compressed and encrypted attributes)
Extended proprietary file attributes that are in use; these are not supported on Isilon clusters
Local users and groups; are these defined on the files?
Deduplication is in use, or archive stub files or Mac OS X resource forks are present
Other data-specific considerations include:
Date/access time/creation time retention requirements—These may not be preserved across migrations depending on which tool is used, for example, creation time is not preserved with isi_vol_copy.
Symbolic links will break—Depending on where the symbolic links connect to, the underlying paths will probably change after a migration and require rebuilding.
Automount maps will need to be repointed (NIS, NIS+)—Similar to symbolic links, the export paths may change and the hostname may also change.
Character encoding—Verify that it is the same on source and target; international characters in file names may be problematic.
How does the data need to appear post-migration:
Direct replication of all data and attributes
Move the data, then make updates, fix problems, change the security, and so on
Migrate just the data and implement an entirely new security model
Migration requirements and customer data collection
Before you can plan your migration, you must collect requirements.
Requirements gathering
The data migration planning begins with identifying the data that you want to move from the old storage system to the new
storage system. Here is what you need to document:
Current state—what is the current state of:
Source infrastructure
Existing storage platforms
Network design and implementation
Name resolution infrastructure: DNS, DFS, or global namespace
Servers, clients, OS, and applications
Source infrastructure configurations
Volumes
Shares/exports
Access
Authentication
Source data
Logical structure—data layout and directory depth
Is the structure wide or deep?
Physical structure—total size, minimum/maximum/average file size
Number of files
Source data security
Current security model and how file access is enforced
Local users and groups
POSIX permissions
LDAP users and groups
Target state—what will be the target state:
Target infrastructure: Isilon cluster
Network configuration
Target configurations:
Directory layout and structure
Shares/exports
Access and authentication model
Target data
Logical structure—same as the source or new system
Physical structure—same as the source or new system
Target data security
Keep it the same as the current security model
Migrate and change the security model
Move the data and implement a new security model
How to gather the data:
Interview stakeholders
Gather documents: network diagrams, run books, and infrastructure and application details
Create a list of exports
Develop storage reports, and so on
Review the share permissions
Examine the directory structure (shallow versus deep), file composition (small versus large), and the number of files
Current infrastructure and data analysis
Start the migration design phase by collecting the data needed to develop the migration requirements.
Best practice
Create and utilize a standardized data collection and migration planning document, along with a standard target
configuration guide. By using a structured document to gather and collect all your source data and information, you can
identify your migration requirements, which will lead to clear migration design decisions.
Why: This will simplify and consolidate migration planning and implementation.
You need to collect the following information:
The amount of data; the actual file data, not compressed or deduplicated data
If there is deduplicated data, the amount of such data; this number will need to be added to the total
The number of directories and files; identify the directory trees and the quantity of them
The directory structure: shallow and flat, wide and deep, or otherwise
The number of directories with more than 10,000 files in them
The number of exports; are there share name collisions or reuse on multiple source hosts?
The way these exports are used—for home directories, application, or group use
How permissions are applied to source—at the individual or at the group level
The number of source locations; single source system or multiple
How clients access data; protocols and how they resolve storage names
The rate of file changes; how often and where files are changing
Networking architecture; source systems and network between it and the Isilon cluster
Source system load; understand what load the source storage is under and how much additional overhead from the migration would be tolerable
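Much of this inventory can be gathered from any NFS client with standard tools. A sketch run here against a throwaway sample tree; in practice you would point it at the mounted source export:

```shell
# Create a small sample tree standing in for a mounted source export.
SRC=$(mktemp -d)
mkdir -p "$SRC/proj/sub"
echo data1 > "$SRC/proj/file1"
echo data2 > "$SRC/proj/sub/file2"

# Totals that feed the migration plan.
du -sh "$SRC"                                             # total size
echo "files: $(find "$SRC" -type f | wc -l | tr -d ' ')"  # number of files
echo "dirs:  $(find "$SRC" -type d | wc -l | tr -d ' ')"  # number of directories
```

Similar find invocations can flag directories containing more than 10,000 files or compute minimum/maximum/average file sizes.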
Determine migration methodology
After you collect information on the source system, the data, and the infrastructure, you are ready to develop a migration
methodology.
Migration methodology considerations
The text that follows details some of the key elements you must consider as you develop a migration
methodology.
Logical migration design
By analyzing the structure and layout of the source data, you can make logical migration design decisions—structuring the
migration into distinct executable units. A goal of the migration methodology is to identify logical boundaries that facilitate
the cutover of your clients and workflows.
Some logical migration boundaries are as follows:
Hosts, filers, servers, and arrays
Volumes
Exports
Directories—users or groups
Best practice—Define migration boundaries
Identify logical migration boundaries: Identify clear, well-defined data structures for migration and cutover—for example,
entire exports or directories. Be aware of the size of the data inside a migration boundary, as the size of the data affects the
outage window required to complete a cutover.
Why: This best practice organizes the migration into segment waves, making the migration easier to manage.
After you segment the logical boundaries into distinct migration phases, you can address other elements of your workflow,
such as metadata, that you need to migrate.
File attributes and security
Most data migration also includes the migration of the file’s metadata: ownership, access times, creation time, and security
descriptors. Before you can execute your migration, determine how you plan to handle metadata and file security.
Common migration approaches:
Migrate data files on an as-is basis (with no change to permissions or ownership).
Migrate data and permissions, and then adjust the permissions on the destination (recalibrate the permissions).
Migrate the data only. Create new permissions on the destination, or create a new security design.
Migrate away from an existing security model and implement a new model. Dell EMC Isilon recommends that you use a central authentication scheme on Isilon clusters. For example, if the NAS system that you are migrating from uses several directory services, you should consider consolidating the directory services into a single directory service for the new NAS system.
Best practice—Understand the attributes of the source data before the migration
Be data aware: Identify any DOS attributes, nonstandard extended file attributes, and nonstandard permissions that are
not supported by an Isilon cluster. Also, identify your local users or groups and have a plan to deal with them.
Why: Before you execute the migration, you may need to take additional steps to prepare the data for migration so that it
will be available on the new storage system.
Migration sequencing
The execution of a migration will likely require multiple iterations of the data transfer. If the source data is constantly
changing, try to find a window when the source data can be locked in a read-only state, or deny access to clients. Once
access to the source data is removed, the final data transfer can take place. Otherwise, differences between the data on
the source and the target system might result.
The recommended approach for a data migration is to use a multistep migration. A multistep migration consists of an initial
“full” or “level 0” data copy. The initial data copy is followed by a series of “incremental copies” that update only the new or
changed data. The initial data copy moves an entire copy of the source data. It can often take a long time to execute
because all the data must be assessed and transferred over the network to the migration target.
After the initial copy completes, additional differential transfers copy only the data that has changed since the initial full data
copy was executed. Additionally, any data that is deleted on the source will also be deleted on the target through the
incremental process. The size of an incremental copy is affected by the rate of change of the source data.
You should run multiple successive incremental copies to guarantee the integrity and consistency of any data that
encountered issues during the initial full copy. Incremental copies also keep the two data sources in sync with each
other and reduce the catch-up work required by the final copy.
A final incremental data copy should always be executed as part of the migration cutover plan to ensure that all the latest
data is on the new target storage system.
Best practice—Run initial full copies followed by incremental copies
Run initial full copies followed by multiple incremental copies. Always execute a final incremental data copy during cutover
to ensure that the latest data from the source
is migrated.
Why: Executing multiple migration passes will ensure that all the data is transferred and that the latest version of the files
will be stored on the target storage system.
Type of migration
You must determine how the migration will be executed. There are two possibilities: an indirect execution from a host and a
direct execution from an Isilon cluster. With a host-based migration, an intermediary host executes a copy process, moving
data from the source system to the target system through the host, as shown in Figure 2.
Figure 2. Host-based migration
With a host-based migration, all data is transferred through an intermediary host en route to the Isilon cluster.
If the source system is supported, the Isilon cluster can execute a direct source-to-Isilon data copy by using the Isilon
OneFS isi_vol_copy command, which copies data using the Network Data Management Protocol (NDMP).
Host-based migrations
A host-based approach might be selected for a number of reasons:
The source system does not support an Isilon-based migration—isi_vol_copy is not supported.
Connectivity is restricted—storage is on different networks; a host may bridge the networks.
There is flexibility in execution—separate the execution from the administration of the storage systems.
There are security restrictions—these can be used to limit access to systems.
In a host-based migration, the toolset executing the migration makes a connection to the source and to the target system,
and then copies the data through the host. For the purpose of this paper, the primary host-based tool is rsync.
Best practice—Select a suitable host
Select a suitable host to run the migration that has adequate network bandwidth and processing power.
Why: Because all the data will move through the host, incorrect sizing may lead to a
bottleneck or an interruption in the migration. Using multiple hosts may facilitate
multistreamed migrations in which you can maximize network usage and the Isilon
nodes by executing multiple migrations concurrently. A host with 10 Gb network
connectivity is highly recommended.
Some common considerations are as follows:
Adequate resources to execute the migrations (CPU, RAM, and network); 10-gigabit network infrastructure where possible
Connectivity between the host and the source and target storage systems
Availability; the host is stable and reliable—no reboots or downtime occur
Dedicated host—not running a lot of other parallel workloads and restricted user access
The migration host should be optimized for the migration workload, with as much network throughput as possible,
because it sends and receives all the data to be migrated. Figure 3 shows an Isilon-based migration.
Figure 3. Isilon-based migration
With an Isilon-based migration, data is pulled directly from the source system to the Isilon cluster utilizing isi_vol_copy.
Another method of migration can be achieved by directly running the Linux rsync replication utility on the Isilon cluster itself.
This approach is shown in Figure 4.
Figure 4. Isilon-based migration with rsync on individual nodes
Similar to the OneFS isi_vol_copy tool, rsync can be run natively on the individual Isilon nodes against locally mounted
NFS source exports that are mounted directly on each Isilon node. Data is transferred directly from the source cluster to
the Isilon cluster, reducing latency and network congestion and eliminating the need for external host computers to move
the data.
Isilon-based migrations
If the source system is capable of supporting an Isilon-based migration by use of isi_vol_copy or by direct access with
rsync, the connectivity exists, and the migration methodology supports this approach, a direct migration may be the
more suitable technique. The main advantage of the direct approach is that there is no need for an intermediary host to
execute the process or for the data to traverse an external host.
Type of Isilon-based migration
There are two primary types of Isilon-based migrations:
NDMP-based migration with isi_vol_copy
Rsync-based migration—use the UNIX rsync tool to connect and either push or pull data directly to the Isilon target, running natively on the Isilon cluster
NetApp migration
A NetApp Isilon-based NDMP migration requires the following:
Isilon requirements: Isilon OneFS 6.5.5.6 or later
NetApp requirements: Data ONTAP 7.x or Data ONTAP 8.x operating in 7-mode
It is anticipated that additional source systems will be supported in future releases of Isilon OneFS.
As in all migration strategies, it is critical to evaluate the migration methodology against the selected approach to determine
if the method selected will facilitate your migration goals.
Best practice—Evaluate migration approach
Evaluate and select the most appropriate migration approach by selecting the method that meets your specific migration
requirements, provides cutover flexibility, and optimizes
data throughput.
Why: The selected approach will impact the migration schedule and planning.
Once you have identified the migration approach you will use, you can select the appropriate migration tool.
Migration tool selection and use
The data migration requirements will help define the tool selected to execute the data migration.
Tool selection
A number of tools are available and will work with your migration. Any file copy method that can connect over NFS to the
source and target storage can be used to move data between the systems. Dell EMC Isilon recommends that you use a
tool that can be automated and which provides robust functionality—a tool that can copy attributes, security, logging, and
so on.
The common NFS data copy tools are shown in Table 1.
isi_vol_copy
Advantages: Included with Isilon OneFS; pulls across all user and group permissions; supports both SMB and NFS protocols; utilizes NDMP; provides a direct source-to-target migration.
Disadvantages: Supported only against specific source storage systems (NetApp systems running Data ONTAP 6.5 and later with NDMP v4); limited error reporting.

isi_vol_copy_vnx
Advantages: Included with Isilon OneFS; pulls across all user and group permissions; supports both SMB and NFS protocols; utilizes NDMP; provides a direct source-to-target migration.
Disadvantages: Supported only against specific source storage systems (VNX 7.x OE and Celerra DART 5.6.x or later); limited error reporting.

rsync
Advantages: The Dell EMC preferred tool; designed for synchronizing directories; sends only differences in data when files change; offers a large set of switches; can be scripted; open source and widely available.
Disadvantages: Limited error reporting.

tar, cpio
Advantages: Can be scripted; open source and widely available; good for one full push of data.
Disadvantages: Limited error reporting; designed for backup and restore, not active copying; does not make incremental copies.

Table 1. Summary of NFS copy tools
Tool versions
It is important to understand that different tools may behave differently on different hosts. Dell EMC Isilon strongly
recommends that you test tool versions and observe their behavior.
Best practice—Use the correct tool for the job
For NetApp filer: use isi_vol_copy
For VNX: use isi_vol_copy_vnx
For general NFS filers: use rsync
Why: Using the correct tool for the job will give you the best chance for a successful migration.
Best practice—Use the correct version of the tool
Rsync is available on nearly every UNIX and Linux distribution, as well as natively on the Isilon cluster. You must use the
correct version of the tool for the OS of the host that is running it.
Why: Using the correct version of the tool will optimize throughput and performance, that is, use a 64-bit version if your
host OS is 64-bit.
Best practice—Use the latest version of migration tools
You should always use the latest versions of the chosen file copy tool.
Why: Performance is optimized when you use the latest versions of a file copy tool, and they often have newer features
and bug fixes.
Migration tools
The following section provides an overview of the primary NFS migration tools that can be used in Isilon data migrations.
rsync
Overview:
The rsync tool provides a method for copying files, directories, and subdirectories from NFS exports to other NFS exports
with the ownership and attributes intact. It was designed to efficiently synchronize files and directories from one location to
another, minimizing data transfer while using delta encoding where appropriate. If the source and destination have many
files (and parts of files) in common, the utility need only transfer the differences between them. Incremental change copies
are thus extremely efficient.
Rsync can operate in both a local and remote mode (as a service) and behaves similarly to rcp. It can “pull” or “push” files
from filers.
Rsync should be run as root to preserve file permissions and ownership. It can also use Secure Shell (SSH), if
necessary, for secure environments.
Source code is available, and rsync is implemented on nearly every modern operating system.
Usage:
rsync [options] <source> <destination>
Features:
Enables you to copy file data, ownership, and time stamp information
Is extremely efficient for incremental copies
For a full list of rsync features and switches, run the following from a command shell:
man rsync
Sample rsync command:
rsync $OPTS [$SOURCE:]$SOURCEDIR [$TARGET:]$TARGETDIR
where the variables are typically defined as:
OPTS="--force --ignore-errors --delete-excluded --exclude-from=$EXCLUDES --delete --backup -a"
EXCLUDES=/path/filestoexclude
SOURCE=name of source filer
SOURCEDIR=/path/sourcefiles
TARGET=name of isilon node destination system
TARGETDIR=/path/targetdir
Rsync by default runs in a local mode, but with the addition of [$HOST:] in front of either $SOURCEDIR or $TARGETDIR,
it can transfer files remotely between systems.
For example:
rsync -avh /tmp/foo root@host2:/tmp/bar
If run from the source system, it would transfer the local directory (/tmp/foo) to the remote host (host2) and place them in
the target directory (/tmp/bar).
Note that a shell script is usually created to automate and distribute the rsync jobs. Entire migrations can be automated
and incremental passes scripted to run automatically. Review the scripts with the customer to verify that the sequence of
commands matches the expected migration plan.
Best practice—rsync and compression
If the data is mostly binaries or large uncompressible files, Dell EMC Isilon does not recommend that you use
compression, as this will slow the migration considerably. Text files, however, will readily compress and, if the source data
consists of text files, using this option will greatly speed the migration. Know the source file composition.
Why: Trying to compress non-compressible data will greatly slow the migration.
Best practice—Exclude snapshots from replication
It generally does not make sense to migrate snapshots because they will not automatically work on the target system as
intended. Therefore, exclude them from the migration to speed the process.
Why: Snapshots will not migrate.
For example:
rsync -avh --exclude='.snapshot*' /tmp/foo root@host2:/tmp/bar
If run from the source system, this would transfer the local directory (/tmp/foo) to the remote host (host2) and place data in
the target directory (/tmp/bar) while excluding snapshots.
Best practice—Watch for spaces in names
Be aware that spaces in file and directory names can cause problems.
For example, a directory named “/spaces in my name” and a file named “some file.avi” would require special handling on
both a command line and in a script:
rsync -av foo@foomachine:'/spaces\ in\ my\ name/some\ file.avi' /local_directory/
The “\” character is used before spaces to prevent the shell from parsing the next word as a separate argument.
Why: Spaces in file names and directories can cause scripts to fail. Because of this, be alert for them.
Best practice—Starting rsync switches
Suggested initial rsync switches:
-a: archive mode; equals -rlptgoD (no -H, -A, -X)
--delete: delete extraneous files on the target system (useful on incremental copies if the source data has been deleted)
--force: force the deletion of directories even if they are not empty (during incremental passes if directories are deleted on
the source)
-z, --compress: compress the data being transferred (if the data is compressible)
Why: Dell EMC Isilon recommends that you start with a baseline of switches and test the copy, validate the results and
behavior of the copy, and make the appropriate adjustments to the rsync switches. No single default set of switches will
work for all migrations. Remember, rsync can be run multiple times incrementally, and different directories/exports may
require different options.
You should become familiar with many of the rsync switches and their use. The following highlights a few possible options
that you should be familiar with. It is important to recognize that each migration will require different switches because of
the unique requirements of each dataset.
A few useful switches to be aware of include the following:
-r, --recursive
Recurses into directories
-l, --links
Copies symlinks as symlinks
-p, --perms
Preserves permissions
-h
Outputs numbers in a human-readable format
--progress
Shows progress during the transfer
-z,--compress
Compresses the file data during the transfer
-g,--group
Preserves the group
-o,--owner
Preserves the owner
-D
Preserves special files and device files
--protect-args
Enables you to transfer files that contain white space; you can either specify --protect-args or escape the white space with
a “\”
--stats
Provides a detailed list of the total number of files, files transferred, benchmarks, and an average transfer speed
-t,--times
Preserves modification times
-v,--verbose
Increases verbosity
-n,--dry-run
Performs a trial run with no changes being made
--exclude=PATTERN
Excludes files matching PATTERN
--exclude-from=FILE
Reads exclude patterns from FILE
Symbolic links and hard links
Be aware of symbolic links within the source file system. They may not point to the same target after migration if paths
change.
Rsync has multiple methods of dealing with symbolic links. Choose the most appropriate option after consulting with the
customer. These links can also be addressed in a separate migration pass.
By default, links are not transferred at all. A message such as “skipping non-regular file” is generated for any symbolic links
that rsync encounters. Switches to deal with links include:
--links
Symbolic links are recreated with the same target on the destination. Note that --archive implies --links.
-L,--copy-links
Symbolic links are “collapsed” by copying their referent, rather than the symbolic link.
-H,--hard-links
Preserves hard links
--safe-links
Ignores symbolic links that point outside the tree that is being replicated. This is useful for preventing sensitive system files
such as /etc/passwd from being inadvertently copied.
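Before choosing a link-handling switch, it helps to inventory the links that are at risk. A sketch using find against a sample tree (substitute the real source mount): absolute symlinks are the ones whose targets will not follow the data.

```shell
# Build a sample tree with one absolute and one relative symlink.
TREE=$(mktemp -d)
touch "$TREE/file"
ln -s /etc/hosts "$TREE/abs_link"   # absolute target: likely breaks post-migration
ln -s ./file     "$TREE/rel_link"   # relative target: survives if layout is preserved

# List only the symlinks with absolute targets (GNU find).
find "$TREE" -type l -lname '/*'
```

Reviewing this list with the customer determines whether --links, --copy-links, or --safe-links is the right choice, or whether the links need a separate remediation pass.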
Best practice—Know the rsync switches
You should understand all the rsync switches and when and how to use them.
Why: Different migrations will require the use of different switches to meet the requirements of the data copy and the final
state of the migrated data. Discuss this with the customer before the migration begins to determine the optimal selection
of switches.
Best practice—Parallelizing the rsync processes
Examine the source directory structure and look for obvious ways to divide the source directory tree into smaller, more
manageable chunks.
For example, suppose that migrating a file system of 4,000,000 files as a single job takes six hours in this
hypothetical example. Consider if the file system tree were divided into something like the following:
drwxr-xr-x 2 root root 179 Jul 19 15:00 directory_a
drwxr-xr-x 2 root root 179 May 1 00:00 directory_b
Running two simultaneous rsync jobs would cut the migration time roughly in half (assuming the content
of the directories is nearly equally balanced):
rsync -av --include="/directory_a*" --exclude="/*" --progress remote::/ /localdir/
rsync -av --include="/directory_b*" --exclude="/*" --progress remote::/ /localdir/
The best performance would result from spreading requests across multiple Isilon nodes and multiple source network
interfaces. Multiple rsync jobs can be run on individual nodes as well, but these processes tend to be network limited. You
want to spread the load across as much of the Isilon cluster as possible, maximizing the available bandwidth on each
node. Be aware if the NFS option “map root to nobody” is implemented, as this may affect access to files.
Why: You will see increased performance, but you may be limited by network bandwidth and source cluster throughput.
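One way to script this division of labor is to generate one rsync command per top-level source directory and then launch the jobs concurrently. The sketch below assumes the source is already mounted locally; the paths are placeholders, and the switch set should match your chosen rsync options.

```shell
# Sketch: emit one rsync command per top-level directory of a locally
# mounted copy of the source tree, so the jobs can be launched in parallel.
gen_rsync_jobs() {
    src_root=$1   # local mount of the source, e.g. /mnt/source
    dest=$2       # target path, e.g. /ifs/data/target
    for d in "$src_root"/*/; do
        name=$(basename "$d")
        echo "rsync -aH --numeric-ids --progress $src_root/$name/ $dest/$name/"
    done
}
```

The emitted commands can then be distributed across migration hosts, or run concurrently on one host with, for example, `gen_rsync_jobs /mnt/source /ifs/data/target | xargs -P 4 -I CMD sh -c CMD`, keeping per-node network limits in mind.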
Isilon-based migrations – isi_vol_copy
Overview:
isi_vol_copy is a native Isilon OneFS tool that supports data migration through the Network Data Management Protocol (NDMP). The tool allows the cluster to mimic the behavior of a backup target and allows data to be copied directly from the source system to the Isilon cluster, preserving permissions and ownership.
Usage:
isi_vol_copy <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ]
[-sa user: | user:password]
[-sport ndmp_src_port]
[-dhost dest_ip_addr]
[-maxino maxino]
[-h]
Features:
Utilizes native NDMP functionality and connectivity
Supports full and incremental backup levels
Migrates data and all security and attribute information
Will restore the set of permissions and ACLs that existed on the source data
Will migrate NFS and SMB source data
Does not impact or interact with client data access
Provides a dedicated data transfer pipe between the source and cluster
Starting with OneFS 7.0.2, it supports the Backup Restartable Extension, so that full backups can be interrupted and restarted from a checkpoint
Limitations:
Source NAS arrays have limits on the number of NDMP threads and simultaneous backup jobs; you should therefore avoid overloading the source NAS system.
It can be limited by source filer network bandwidth.
Sample isi_vol_copy command
isi_vol_copy <source_filer_IP>:/<source> -sa <ndmpuser>:<ndmppassword> /ifs/data/<source_filer> -full
Best practice—isi_vol_copy target data use
Do not alter data on the target Isilon system until after the isi_vol_copy has completed.
Why: Altering target data mid-copy will create problems and may force you to re-run a full copy.
Best practice—Simultaneous isi_vol_copy use
Do not execute multiple isi_vol_copy jobs that write to the same target directory; that is, do not point all of your isi_vol_copy migrations at the same target directory. For example:
filer1:/vol/sourcedir -> isilon:/ifs/data
filer2:/vol/sourcedir2 -> isilon:/ifs/data
Why: This creates problems for the copy process and may require remediation after migration.
Instead: Use an additional directory level:
filer1:/vol/sourcedir -> isilon:/ifs/data/filer1/sourcedir
filer2:/vol/sourcedir2-> isilon:/ifs/data/filer2/sourcedir2
If consolidation is required, this can occur after the data is migrated and any potential merging of identically named
subdirectories can be addressed.
Best practice—isi_vol_copy use
isi_vol_copy is optimized to stream as much data as possible across a network; always monitor load on the source and
target systems for any potential impact.
Why: Because isi_vol_copy is optimized to stream as much data as possible, take care not to overwhelm older source systems and create potential link saturation or disk problems, especially if there are users connected who are attempting to access files.
Best practice—isi_vol_copy limits
Dell EMC Isilon recommends that you use fewer than 40 million files per volume transfer when using isi_vol_copy.
Why: All programs have limits, and this is the recommended maximum when using isi_vol_copy for each individual
transfer. Larger source volumes should be broken up into smaller chunks (that is, use a separate isi_vol_copy stream for
multiple subdirectories instead of one large transfer of an entire volume).
Once the initial copy is complete, then incremental copies can be run:
isi_vol_copy filer1:/vol/sourcedir -sa root:<password> /ifs/data/filer1/source -incr
Important: Do not start an incremental copy job until a full copy has completed successfully. Unlike rsync, which performs incremental copies automatically, isi_vol_copy must be explicitly invoked with -incr to perform an incremental copy.
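The full-then-incremental sequence can be wrapped in a small driver script. This is a sketch only, assuming the isi_vol_copy usage shown earlier; it is run on the cluster, and the source, destination, and credential values are placeholders.

```shell
# Hypothetical driver: run the full copy once, then an incremental copy.
# The incremental step can be re-invoked (e.g. from cron) until cutover.
run_migration() {
    src=$1    # e.g. filer1:/vol/sourcedir
    dest=$2   # e.g. /ifs/data/filer1/sourcedir
    cred=$3   # e.g. root:<password>
    # Full copy must succeed before any incremental is attempted.
    isi_vol_copy "$src" -sa "$cred" "$dest" -full || return 1
    isi_vol_copy "$src" -sa "$cred" "$dest" -incr
}
```

Checking the exit status of the full copy before starting the incremental enforces the ordering requirement described above.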
An entire volume does not need to be migrated; subsets of source directories can be migrated individually.
For example:
A volume, /export/vol1, is exported containing subdirectories /work, /scratch, /tmp, and /home. You could migrate the entire
vol1 or any/all of the individual subdirectories under vol1, for example, /export/vol1/work and /export/vol1/home might be
the only necessary directories to move.
Isilon-based migrations from VNX—isi_vol_copy_vnx
Overview:
isi_vol_copy_vnx is a native Isilon OneFS tool that supports data migration from VNX through NDMP. The tool allows the cluster to mimic the behavior of a backup target and allows data to be copied directly from the source system to the Isilon cluster, preserving permissions and ownership.
Usage:
isi_vol_copy_vnx <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ]
[-sa user: | user:password]
[-sport ndmp_src_port]
[-dport ndmp_data_port]
[-dhost dest_ip_addr]
[-h]
Features:
Utilizes native NDMP functionality and connectivity
Supports full and incremental backup levels
Migrates data and all security and attribute information
Will restore the set of permissions and ACLs that existed on the source data
Will migrate NFS and SMB source data
Does not impact or interact with client data access
Provides a dedicated data transfer pipe between the source and cluster
Limitations:
Source filers have a limit on the number of NDMP threads and simultaneous backup jobs; you should therefore avoid overrunning the source filer.
It can be limited by source filer network bandwidth.
Check with Dell EMC Isilon Support for the latest compatibility with tools, DART codes, and OneFS.
Migration preparation
After you finish planning the migration and selecting the tools you will use, you can prepare the source and target systems
for the migration.
Infrastructure and environment setup
Network connectivity
Because all the data in the migration will traverse the network, you should optimize the network infrastructure and
connectivity between the source system(s) and the target Isilon cluster.
Common recommendations include the following:
Maximize network bandwidth; 10 Gb/s is preferred to 1 Gb/s, optimized end to end, with a Maximum Transmission Unit (MTU) of 9000 bytes
Limit hops and latency between the source and target storage systems
Isolate migration traffic so that it does not compete with client access
Limit potential network bottlenecks that can occur with routers, firewalls, Intrusion Detection System (IDS), and shared network infrastructure
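When jumbo frames (MTU 9000) are configured, it is worth verifying that they actually pass cleanly end to end before the migration starts. The following is a sketch using standard Linux ping flags; the target address is a placeholder.

```shell
# Verify a 9000-byte MTU path: 8972 = 9000 - 20 (IP header) - 8 (ICMP header).
# -M do sets "don't fragment", so failure indicates the path does not carry
# jumbo frames end to end (a router or switch is fragmenting or dropping).
check_jumbo() {
    target=$1
    if ping -c 2 -M do -s 8972 "$target" >/dev/null 2>&1; then
        echo "jumbo OK to $target"
    else
        echo "jumbo FAILED to $target"
    fi
}
```

Running this from the migration host against both the source system and an Isilon node interface confirms the MTU is consistent across the whole path, not just on the local interface.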
Best practice—Optimize the network for the migration traffic
Optimize the migration network path; try to keep other production traffic off this network and limit the network devices the traffic traverses (firewalls, IDS, and so on). Ideally, try to create a dedicated private migration network that is optimized for only the migration traffic.
Why: Separating the migration traffic from other network traffic allows for maximum throughput and reduces potential impact to existing production traffic by limiting network saturation.
Migration account
In order for the migration data to be copied from the source to the target system, the tool accessing the data must be able
to access all of the source and target data.
Commonly used migration accounts:
root
User accounts created explicitly for the execution of isi_vol_copy, for example, ndmp
The account used to connect to the source and target storage systems will depend on the security model implemented in
the environment.
Best practice—Use a specific migration account to execute migration tasks
Use a specific migration account, or an account with group membership that has the required access to all source and target data (for example, root).
Why: Using a dedicated account allows for oversight and management of migration data access. It also allows migration tasks and users to be separated from other production accounts.
Source host preparation
The source data storage system should be prepared and optimized for the migration.
Best practice—Restrict access to the source exports
Access to the source cluster exports can be restricted to prevent users from changing data on the source cluster instead of on the migration target cluster. Change exports to read-only status once the migration and incremental copies are completed to prevent clients from writing to them.
Why: This allows you to separate migration events from normal production access. The same process can be used post-cutover to deny reads and writes by normal users to the source cluster. This prevents updates to the data during data cutover and post-migration while still allowing administrative access.
Migration host preparation—Source and target access
The migration host should be prepared and optimized for running the migration copies:
Limit workload and access to optimize throughput
Restrict access and reduce service issues with the host
Prepare all migration jobs as scripts
Test and validate network throughput
Best practice—Watch out for root_squash
On the source cluster, exports sometimes restrict access by using root_squash, which prevents remote root users from retaining root privileges. This access is needed for migrating data, so use the “no_root_squash” option to turn off root squashing for the migration host.
Why: Root access (or its equivalent) is needed to migrate all files and directories.
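On a Linux NFS source, for example, the corresponding export entry might look like the following sketch. The path and the migration-host address (192.0.2.50) are placeholders; the exact syntax on other NAS platforms varies by vendor.

```
# Hypothetical /etc/exports entry on a Linux source: the migration host keeps
# root privileges; all other clients remain squashed and read-only.
/export/vol1  192.0.2.50(rw,no_root_squash,sync)  *(ro,root_squash,sync)
```

After editing the file, `exportfs -ra` re-exports the updated table without restarting the NFS service.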
We can also set root squashing on the Isilon NFS exports as shown in Figure 5.
Figure 5. Setting root squashing on Isilon NFS exports
In Figure 5, “User/Group Mappings: Use default: Map root users to nobody” has been set for the export /ifs/data/work. This
should be disabled to allow the root full access to the file system. Note that this can be set on an export-by-export basis. In
addition, root access to a specific client, 192.168.43.200, has been restricted in the figure example. Typically, this would be
the host performing the migration.
Isilon cluster configuration preparation
All primary setup and configuration of the Isilon cluster should be completed before data migration begins. The
configuration includes, but is not limited to, the following:
Authentication provider integration—ensures that all authentication providers are online and fully operational.
If local users are used, ensure that all UIDs and GIDs are created and tested.
Access zone and role-based access control (RBAC) setup—complete any zone and RBAC setup.
Exports for clients are created and tested.
Networking design and setup—complete the setup and implementation of the network configuration.
SmartPools—complete the implementation of any SmartPools policies to limit post-migration work.
Dell EMC Isilon SyncIQ®—prepare any existing SyncIQ policies to operate alongside any data migration events.
SnapshotIQ—prepare any SnapshotIQ policies to operate alongside any data migration events.
SmartLock—execute all preliminary SmartLock work prior to migration.
SmartQuotas—disable SmartQuotas until the migration is completed.
Dell EMC Isilon recommends that you use a dedicated Isilon migration directory export to execute all migrations against.
Using a dedicated administrative migration export with the appropriate access and configuration can facilitate the migration
without impacting workflows or data permissions. Normal user clients will not mount this export; it will be mounted only by
migration hosts. Data can then be moved into place after the migration is completed with minimal disruption.
Best practice—Implement a logical NFS export path structure methodology
Be aware of export rules and how they interact with each other:
Path—should be unique, though nesting is possible if used with caution
Permission options—restricted by user ID mapping and IP addresses
Security—netgroups, authentication system, and Kerberos
The order of evaluation is path, client ACLs, then security types (unix, krb5).
For example, if you have exports:
/ifs/data --client=8.8.8.0/24
/ifs/data/something --client=10.10.10.0/24
and your client IP is 10.10.10.10, you would not have access to /ifs/data/something because the export, /ifs/data, has a
different IP restriction. The client must be able to traverse the path if the exports are nested. Access for 10.10.10.0/24
would need to be added to the /ifs/data export:
/ifs/data --client=8.8.8.0/24 --client=10.10.10.0/24
/ifs/data/something --client=10.10.10.0/24
Why: Complex and restrictive rules may prevent clients from connecting to exports that are nested. Clients may
encounter problems mounting a directory that is nested from a different export. If the parent directory has more restrictive
permissions, a client may not be able to mount a child export of that directory.
Best practice—Create the NFS exports
Create the new Isilon NFS exports prior to data migration.
Why: This will allow the creation and setup of the exports and export permissions prior to data migration and cutover for
initial testing and access validation/UAT.
Best practice—Do not use the default /ifs export
Avoid relying on the default /ifs export, which lets clients mount any subdirectory and have open access to the whole file system.
Why: The default export should be used for easy setup only, as it can be a potential security issue.
Best practice—Create the correct NFS export permissions
Set up the correct export permissions on the newly created user exports.
Why: Setting the correct export permissions will allow you to test and validate workflows when test migrations are
undertaken and maintain security.
Note: The migration methodology may include adding explicit “deny” permission settings on users or directories so that they cannot write data to these exports until the cutover has been executed, as well as restricting specific IP addresses to prevent clients from accessing the exports.
Additional Isilon cluster considerations
The following are some additional Isilon cluster considerations that may need to be addressed prior to and during a data
migration.
NFS group membership limitation
The NFS standard by default does not support membership in more than 16 groups per individual user. This limitation can be addressed on a Dell EMC Isilon cluster by enabling “Map Lookup UID” under NFS Settings -> NFS Export Settings -> Export Behavior Settings -> Map Lookup UID, as shown in Figure 6.
Figure 6. Increasing the number of group memberships in an Isilon cluster
Additional information can be found in the Dell EMC document “NFS supplemental groups limited to 15 in OneFS 6.5.4 and
earlier” (Article Number:000089550).
Production or preproduction cluster
An important consideration when planning and executing a data migration is the current status of the Isilon cluster. Is the
cluster in production or will the migration mark the initial cutover to active production traffic? The primary goal should be to
lessen any impact on a production cluster during migration activities, so appropriate steps should be taken to address
these concerns.
Common factors to be aware of while migrating to a cluster are as follows:
Administratively destructive actions
The saturation of network links
Cluster load and ingest, and the impact they have on production workflows
Access zones and role-based access control
If the cluster uses an Isilon access zone or RBAC, the migration methodology may need to be adjusted to accommodate
this configuration. Currently, OneFS 7.0 only allows for NFS exports in the default System Zone and no other zones.
NFS RPC threads
By default, the number of NFS RPC threads is set to 16 per node. This number can be increased for specific workflows.
Consult Dell EMC Isilon Support for assistance.
Isilon OneFS SmartConnect or direct node connections
The current status of the cluster may dictate that you should try to optimize and segregate migration traffic within the
clusters network configuration. You can do this by:
Using SmartConnect to autobalance traffic
Separating migration traffic from existing production traffic by using a direct node or separate SmartConnect zone connection for migration traffic
If you use SmartConnect, you should validate and optimize the configuration before you transfer data across the network.
Isilon OneFS SyncIQ considerations
If the data to be migrated will be replicated to a secondary cluster through a SyncIQ policy, additional planning should be
undertaken to address the impact of the data migration and its interaction with active SyncIQ policies. With this scenario,
you should:
Pause active SyncIQ policies if they include migration paths
Schedule SyncIQ jobs to run outside of data copy windows
Utilize SmartConnect zones for copying and SyncIQ replication
Isilon OneFS SmartPools
Any SmartPools data policies should be in place prior to data migration, or additional cluster overhead may be required to
move data within the cluster post-migration.
Isilon OneFS SnapshotIQ
Any active SnapshotIQ policies should be analyzed for any impact during the data migration.
Antivirus integration
You should review and disable any active antivirus scanning policies that may be running against the target data.
Best practice—Disable antivirus scanning
Disable active antivirus scanning on migrated data during the initial full and incremental copies.
Why: The large influx of data associated with the migration can place an excessive load on the antivirus scanning architecture and create a slowdown and potential bottleneck for inbound data.
Isilon guidelines for large workloads
Be careful not to exceed the maximum configuration values listed in Table 2.
Guideline | Tested or default value | Theoretical or max. practical value | Comments
NFS max. read size | 128 KB | 1 MB | Applies to both NFSv3 and NFSv4. Prior to OneFS 7.0, the maximum read size was 128 KB.
NFS max. write size | 512 KB | 1 MB | Applies to both NFSv3 and NFSv4. Prior to OneFS 7.0, the maximum write size was 512 KB.
NFSv3 connections (per node) | 1,000 | N/A | The number of TCP sockets available on the node is typically what limits NFS connections. Unlike the Isilon SMB server, nfsd uses file handles instead of file descriptors to represent files and internally handles work items much differently. 1,000 connections is a very conservative tested limit and represents 1,000 mounts over 20 exports. NFS connection testing is an ongoing test effort; no maximum connection limit for NFSv3 has been established at this time.
NFSv3 exports (per cluster) | 750 | 2,000 | Beyond 2,000 exports, manageability becomes a problem. Cluster size does not matter.
nfsd threads (per node) | 16 | 16 | This is a kernel limit, exposed via the sysctl command. The value should not be changed without first consulting Dell EMC Isilon Support. This limit represents the maximum number of simultaneous work items the server can service; work items beyond this number are queued and serviced when resources become available.
File name length | 255 B | 255 B | Most Unicode character encodings (like UTF-8, the OneFS default) allow multiple bytes per character (UTF-8 allows up to 4 B/character), so the 255 B could represent anywhere from 63 to 255 characters.
Path length | 1024 B | 1024 B | This is the maximum absolute path (for example, /ifs/data/foo/bar/baz/) length that can be passed into a syscall, not the maximum depth of a directory in the file system (see “Directory depth”).
Directory depth | 8,470 | Unlimited* | * No specific hard limit is in place, but several other limits could come into play (inode limits, metadata storage limits, and so on). In tests, command line (shell) utilities begin to experience problems at a depth of 8,470 (EBADF from many commands); at a depth of about 30,000, internal utilities (for example, Job Engine TreeDelete) also begin to fail. For utilities that make calls with absolute paths (for example, cd /1/2/3/…), depth is limited by “Path length” as described above; this applies to path-based OneFS commands like “isi snapshot” and “isi quota”. For utilities that access relative paths (for example, cd 1, cd 2, cd 3…), these higher limits may apply, although the value of extraordinarily deep directories is questionable.
File size | 4 TB | 4 TB | This is the hardcoded OneFS limit. Note that Job Engine performance can be impacted on files larger than 1 TB due to inefficient per-file threading.
Table 2. Isilon guidelines for large workloads
Migration approach—Testing and proof of concept
Once you have developed the migration approach, selected the toolset, and prepared the infrastructure for the data
migration, you can proceed with your initial testing of the methodology. The goal of the testing is to validate the outcome—
is data migrated, are the permissions moved, and are the timestamps moved? The testing phase also allows you to tune
and modify the migration approach to optimize all parts of the process.
The recommended testing approach is as follows:
Run the full copy—benchmark and monitor
Review and validate—potentially look at tuning or tweaking the methodology and re-run the full copy
Run the incremental copy—benchmark and monitor
Review and validate—potentially look at tuning or tweaking the methodology and re-run the incremental copy
Continue to run the incremental copy—and continue to monitor it
Dell EMC Isilon recommends that you test different copy methodologies to tune and optimize the throughput while meeting
your migration requirements.
Best practice—Execute multiple test migrations to validate the methodology
Dell EMC Isilon recommends that you execute multiple migration tests on smaller subsets of different data.
Why: Because different data will tend to have different properties and access profiles, it is important to test all data types
and how the migration methodology may need to be modified for different datasets.
Critical areas to evaluate and monitor during data migration testing are the following:
Network performance—throughput, saturation, and impact
The time to execute a full data copy—will allow for refinement of project plans
The time needed to execute an incremental copy after a set number days after the data change occurs—will help define cutover windows
Cluster load, source load, and host load—will help tune and refine the migration methodology
Best practice—Test all phases of the migration methodology
Execute all steps in the migration methodology to identify the time involved and to verify that the proposed methodology
fulfills all the migration requirements.
Why: It is important to identify issues with the methodology before executing production migrations and cutovers.
Data validation
After you migrate the data, you must validate the data and the file attributes. You must verify that:
File data copied correctly—data is intact and integrity is maintained
File security, ownership, and attributes migrated correctly
File timestamps are correct
Next, review the access control entries on a file by running ls -led followed by a file name, as shown in Figure 7.
Figure 7. Reviewing access control entries on a file
You should also validate the data. Common methods include the following:
File size compares
Checksum/file hash compares—MD5 checksums
Tools—MD5, sum, and checksum
Audit and review directory structures
Once you have reviewed the data attributes directly, it is critical that you validate that the data works in client workflows.
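A checksum comparison between the source and target trees can be scripted along the following lines. This is a sketch, assuming a host that mounts both trees and has GNU coreutils md5sum available; the paths are placeholders, and for very large datasets this should be run on a sampled subset rather than every file.

```shell
# Sketch: compare per-file MD5 checksums between source and target mounts.
# Relative paths are used so only file content and names are compared.
verify_tree() {
    src=$1; dst=$2
    src_sums=$(cd "$src" && find . -type f -exec md5sum {} + | sort -k 2)
    dst_sums=$(cd "$dst" && find . -type f -exec md5sum {} + | sort -k 2)
    if [ "$src_sums" = "$dst_sums" ]; then
        echo "OK: checksums match"
    else
        echo "MISMATCH detected"
        return 1
    fi
}
```

File-size and directory-structure comparisons (for example, with `du` and `diff` over `find` listings) are cheaper first passes; checksums catch silent corruption those miss.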
Performance
A migration often moves a large amount of data. You must ensure that the migration methodology, toolset, and
environment are optimized for performance and throughput to work within the migration timeline. The common areas to
focus on when evaluating performance are as follows:
Identify bottlenecks—attempt to identify the worst-performing component
Disable antivirus scanning processes on target and/or source file systems during initial migration copies to minimize CPU impact on client access and to avoid extending copy times.
WAN bandwidth (physical circuit limitations) and concurrency can impact other systems that replicate data (SAN, backup, and so on) over a shared link. This could affect replication performance for SyncIQ jobs that need to run to completion before certain cutovers can be conducted.
Review the timing of the execution—how time of day and day of week tests were executed versus performance
Collect metrics on the data copies, network throughput, source, host, and target systems—evaluate the copy as a whole
Best practice—Time the incremental copies
Benchmark the incremental copies by timing how long they take to execute so that you can plan and orchestrate the
cutover phases appropriately.
Why: Knowing how long an incremental copy takes helps you determine the length of time required to execute a cutover and the data outage window that may be required.
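A simple way to benchmark the incremental copies is to wrap whichever copy command you use with timing. The sketch below is generic; the copy command passed in is a placeholder for your rsync or isi_vol_copy invocation.

```shell
# Sketch: run any copy command and report elapsed wall-clock time, so
# incremental runs can be benchmarked for cutover planning.
timed_copy() {
    start=$(date +%s)
    "$@"                 # e.g. rsync -aH src/ dst/  or an incremental copy
    rc=$?
    end=$(date +%s)
    echo "elapsed_seconds=$((end - start)) exit_code=$rc"
    return $rc
}
```

Logging these figures over several days of incrementals shows how elapsed time tracks the daily change rate, which is exactly the number needed to size the outage window.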
User acceptance testing—Data and workflow testing
The final step of migration data testing is the UAT in which the data is tested for integrity with existing workflows. Dell EMC
Isilon recommends that you use test workflows because this data should be considered test data only at this time and may
be removed by later migration steps.
Best practice—Check workflows with test migrated data
Review all workflows on test migrated data.
Why: It is critical to validate that newly migrated data can be integrated into workflows (that is, user home directory or group share access, and so on) at cutover time without issues. By testing the workflows, you can ensure that cutovers will occur without incident.
Start of migration execution
After you complete all the testing and validation, you can begin to move into the production migration phases. All the
information obtained from testing and tuning should be used to modify and optimize the overall methodology so that the
production migrations are as clean and quick as possible.
The migration execution phases are as follows:
Execute the initial full copy
Execute incremental copies to keep the new storage up to date and as close to production as possible
Based on performance, you can execute multiple migrations simultaneously if they are supported and if the network infrastructure can support the additional load
Multiple hosts (if using a host-based migration) can replicate data in parallel; be aware of load placed on the source cluster.
To reduce contention, coordinate hosts so that they are not all writing to the same subdirectory. Spread hosts across the
Isilon cluster to maximize network bandwidth and processing power.
Alternative migration methodology: Mount the source NFS exports directly on the individual Isilon nodes and run the rsync
commands directly on the cluster. The benefit with this method is that if you have multiple nodes, your transfer bandwidth
scales, and you remove the “middle man” from the process. Data moves directly from the source filer to the Isilon cluster,
without having to go to a host first and then back out to the Isilon cluster. As a result, latency is greatly reduced and
migration times will drop.
Best practice—Continue to run incremental copies
Continue to run incremental copies, even if the cutovers are not scheduled.
Why: This will keep the source and target data more in sync and require less data transfer during the final pre-cutover
copy.
Depending on the size of the data migrations, the initial full copies may take a while to execute. During this time, you can
prepare for the final cutover events.
Pre-cutover preparation
After you start to migrate data, you can begin to prepare your cutover events.
Best practice—Create a detailed migration plan
Create a detailed migration plan with all the specific steps and timing of the migration execution.
Why: This document will dictate the commands and work that are being executed. The plan controls the entire migration
from start to finish. All roles, tasks, and responsibilities are defined.
The detailed migration plan dictates how the migration is executed.
Best practice—Create a cutover document
Create a cutover document that defines the high-level cutover tasks, responsibilities, and timings. The document should
outline the phases and sequence in which tasks are executed.
Why: This document will outline the sequence of events that need to occur during a cutover. It can be used to track and
monitor the progress of the cutover.
Best practice—Create a schedule and define outage windows
Have a well-defined cutover schedule and outage window.
Why: The schedule helps execute the migration cutover. The outage window can be scheduled when you have clearly
determined that access to storage will be unavailable and that you can make storage system changes without impact to clients.
Best practice—Create a communication plan
Have a communication plan.
Why: This communication plan will clearly outline the protocols needed to keep all users up to date on the status of a
migration and enable storage administrators to stay focused on the execution of the cutover and not be distracted by
information requests from end users.
Best practice—Prepare the DNS name resolution infrastructure for cutover
Lower the DNS time to live (TTL).
Why: This will facilitate the cutover of clients using DNS name resolution by reducing the time between authoritative
updates to DNS.
Additional pre-cutover preparation steps often include the following:
Prepare the DFS namespace, if applicable
Create CNAMEs in DNS
Update scripts used by clients for storage connections
Prepare clients and applications
Cutover event
Once you have migrated the data and prepared the environment for the cutover, the actual final migration event can occur.
In general, the high-level cutover sequence resembles the following steps:
Initiate the migration cutover window—communicate the event
Restrict access or make source data read only—prevent new writes to the old data source
Execute a final incremental—copy all final data to the new storage system
Validate final incremental—validate that the source data is ready for the cutover
Execute final testing—the final cutover testing is completed
Make a go or no-go call on a full cutover—decide if the migration should continue or roll back
Update connection and name resolution protocols; DNS, CNAMEs, DFS, and scripts
Enable new storage to read/write—enable writes to the new storage system
Continue testing and user acceptance—continue to test and monitor as production traffic moves over to the new storage system
Execute the redirection of clients to the new storage system—initiate the client redirection process
Monitor—assess the cutover, new storage system, and clients
When executing a cutover event, the best practices that follow are recommended.
Best practice—Follow the cutover schedule
Follow a cutover schedule.
Why: By following a well-defined schedule, you can monitor and control the migration. Dell EMC Isilon recommends that
you execute cutovers during off hours or when the number of active connections is low.
Best practice—Test the migrated cutover data
Prepare a number of data and workflow tests to execute against the migrated data. Have a number of well-defined
production use cases, data tests, and test users available to conduct post-cutover testing and review.
Why: Having a well-defined use case and users to validate the migration cutover will help you in making the decision to
continue with the cutover.
Best practice—Monitor clients and application during migrations
Monitor client and application connections to the new storage system during the cutover.
Why: This will verify that your cutover methodology is working as defined and that clients are moving and connecting to the
new storage system successfully.
Best practice—Develop a client connection remediation plan
Have a client remediation plan in place that is ready to execute against clients that exhibit issues connecting to the new data targets.
Why: Have a well-defined strategy to handle client connection issues, including a dedicated support line, email address,
or an IT desk.
The go or no-go decision
During the migration cutover window, a critical point will be reached. This threshold determines whether you will continue
with the cutover or abort the cutover and roll back to the existing storage system.
Common abort cutover situations include the following:
Final incremental does not complete in the outage window
Cutover methodology fails; clients are not connecting correctly
Security issues with the new storage system
Workflow issues post-cutover
Load and availability problems
Other unknown issues
Best practice—Clearly define your cutover criteria
It is critical that you have a series of cutover criteria that clearly defines when a migration will continue or be aborted and
rolled back.
Why: The criteria remove uncertainty, aid decision-making, and dictate the best action to take.
Once you begin to write data to the new storage system, reverting to the old system becomes much more complicated
because you now need to reconcile data with the original storage system.
Rollback
If a decision to abort the cutover is made, there should be a well-defined rollback plan in place that was developed and
tested ahead of time so that you can restore data access as quickly as possible.
Rollback plan:
Prevent any new writes to the new storage system
Move client connections back to the old storage system
Enable writes to the old storage system
Best practice—Develop a rollback plan
Have a clearly defined rollback strategy that is easy to implement and which can restore user access to data quickly and
cleanly. Also, make sure the plan is tested.
Why: A rollback plan will help you restore client data access quickly in the event that a migration cutover event fails.
If any data has already been written to the new storage system and a rollback is executed, then steps to remediate this
data must be taken to restore the new data back to the original storage system.
Common strategies for reconciling data during a rollback are as follows:
Manually reconcile the data—identify and manually move any data from the new storage system to the old one.
Perform a reverse incremental—run migration-type jobs in the reverse direction to update the old storage system.
Discard the data—consider the data noncritical and decide not to reconcile it.
Rewrite the data from the client or application to the old storage system—allow applications and clients to rewrite the data.
The goal of any rollback strategy is to limit the impact on end users and restore data access as seamlessly as possible. It is
for this reason that your migration cutover criteria should be well defined and that the rollback strategy should have been
well tested in the event that you need to implement it.
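The reverse-incremental strategy can be sketched with the same rsync toolset used for the forward migration. This is a sketch only: the function name and paths are illustrative, and `--update` is used so that files clients have already rewritten on the old system after rollback are not clobbered.

```shell
# Sketch: push data written to the new cluster during the failed cutover
# back to the old storage system. --update skips any file on the old
# side that is newer than the cluster's copy, so post-rollback client
# writes on the old system are preserved.
reverse_incremental() {
    local new_side="$1" old_side="$2"
    rsync -a --update "$new_side/" "$old_side/"
}

# Example (illustrative paths, assuming both sides are mounted locally):
# reverse_incremental /ifs/data/filer1/acct /import/acct
```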
Migration event completion
After you successfully complete a cutover, you should continue to monitor the new storage system.
Best practice—Monitor the new storage post-cutover
Continue to monitor the new storage system after the cutover event for any issues that may result from the cutover.
Why: Production load and workflow may be unpredictable; closely monitor the new storage system to rectify any post-
migration issues.
Steady state
Repetition
Most data migrations will consist of multiple cutover events. Once you have developed a well-structured migration
methodology, these additional cutover events should be run with the same plan and strategy.
Lessons learned
After you complete a migration, you should assess the success and failures of the methodology. If an additional migration
needs to be performed, the lessons you’ve learned from the migration process will enable you to refine the process.
Ask yourself the following questions:
What worked during the migration and cutover?
What did not work during the migration and cutover?
Can the migration methodology be modified or optimized?
Conclusion
The goal of this paper is to supply you with solid guidance for conducting an NFS single protocol file migration from a NAS system
to an Isilon cluster. The guidance is based on
a comprehensive set of industry knowledge and best practices on the technical aspects and process of data migration. As stated
in the beginning of this document, this paper does not aim to be an exhaustive authoritative source on the subject of NFS single
protocol migrations, but rather a comprehensive reference document that covers the key areas that will help ensure your success.
Dell EMC can provide comprehensive services, including migration services, Isilon training and education, and residency
services, to reduce risk and maximize system uptime and service levels during and after a data and system migration.
Appendix: Sample migration use case
This appendix provides a sample high-level overview of how to collect information for and plan a migration of NFS source
directories from a single NFS server. Keep in mind that the information in this section provides only a skeleton of some of the
information that you would want to collect, as well as an overview of the strategy that you will want to define for your migration.
The use case that follows answers the following question at a high level: What are the recommendations and best practices as
well as supported Isilon configurations to migrate all directories and data?
Tables A-1 through A-3 describe a sample migration from an NFS filer storage system to an Isilon cluster.
Table A-1. Source data
Source configuration and data:
Single source system—NFS filer, 4 x 1 Gb Ethernet ports
Total data: 25 TB
Max. file size: 4 GB
Min. file size: 0 B
Avg. file size: 256 KB
File count: 8,000,000
8 top-level exports: acct, engineering, home, production, RandD, scratch, temp, work
User home directories: each user has a single home directory under a higher-level export, /exports/home.
Sample directory structure:
/exports
  /acct
  /engineering
  /home
  /production
  /RandD
  /scratch
  /temp
  /work
Table A-2. Exports and permissions
drwxr-xr-x 10 root wheel 512 Aug 27 14:50 .
drwxr-xr-x 21 root wheel 512 Aug 27 14:50 ..
drwxrwxr-x 2 root rd 512 Aug 27 14:50 RandD
drwxrwxr-x 2 root acct 512 Aug 27 14:50 acct
drwxrwxr-x 2 bob eng 512 Aug 27 14:50 engineering
drwxrwxr-x 2 root users 512 Aug 27 14:50 home
drwxrwxr-x 2 prod prod 512 Aug 27 14:50 production
drwxrwxrwx 2 root wheel 512 Aug 27 14:50 scratch
drwxrwxrwx 2 root wheel 512 Aug 27 14:50 temp
drwxrwxr-x 2 root eng 512 Aug 27 14:50 work
Table A-3. Additional source data information
Additional source information:
All LDAP: single domain
Source system network connectivity: 4 x 1 Gb
No firewalls, IDS, or QoS
Monthly full backups
Antivirus scanning in place
Same data center as the Isilon cluster
No deduplication or offline files
DNS: 2 CNAMEs
No routing or VLAN restrictions
Additional Isilon information:
Dell EMC Isilon X200 x 3: ~61 TB
OneFS 7.0.x
LDAP authentication
3 x LACP (2 x 1 Gb/s each)
SmartQuotas
SnapshotIQ
Requirements
All data and permissions are moved as is with no changes. All existing POSIX permissions and ownership are retained.
Eight cutover events; 12-hour window—Saturday 8:00 P.M. through Sunday 8:00 A.M.
One migration is performed per weekend.
Each user has a defined quota; quota limits are to be replicated on the Isilon cluster.
Migration project assumptions (including, but not limited to):
Customer will have approved change controls submitted for any migration activity.
Migration plan and design will have been reviewed and approved by the customer prior to the start of the cutovers.
Any recommended array OS upgrades (and firmware updates) necessary for the migration will be applied before any migration cutover activity occurs.
Source NAS and target Isilon systems must be in a known good state prior to conducting the migration.
The customer will have successfully completed a full system backup and verified its reliability.
Strategy
For an Isilon-based migration, mount source NFS exports directly on individual Isilon nodes
Conduct pilot migration; validate methodology, document performance metrics, refine and tune rsync switches and scripts; test migrated data with clients and users
Investigate sizes and file counts in each export; this will help determine the order for the migration (that is, the largest directories will take the longest time and should be started first)
Validate change rates and time to execute incremental copies
Have completed a full backup of all file systems that are to be migrated before the cutover; verify that the backup is good
Develop a detailed project timeline and cutover schedule with the customer
Develop a backout plan, and review it with the customer
Execute the migration in phases: execute initial full copies, followed by nightly incremental copies
Use a DNS update methodology to redirect clients
During the cutover, make sure that each source file system is changed to a read-only state after the source directory is successfully replicated to prevent clients from making any changes
Reduce DNS TTLs in advance of cutover windows
Develop client communication; the customer should provide dedicated cutover IT support desk/personnel
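The "investigate sizes and file counts" step above can be scripted with standard tools. The following is a minimal sketch, assuming the source exports are mounted locally (the `/import` base path and function name are illustrative, not from the source system):

```shell
# Sketch: inventory each export's file count and on-disk size so the
# migration order can be planned (largest first). Paths are illustrative.
inventory_export() {
    local dir="$1"
    local files kib
    files=$(find "$dir" -type f | wc -l)       # regular files only
    kib=$(du -sk "$dir" | awk '{print $1}')    # size in KiB
    printf '%s\t%s files\t%s KiB\n' "$dir" "$files" "$kib"
}

# Example usage over all mounted exports:
# for d in /import/*; do inventory_export "$d"; done
```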
Source system configuration
If possible, restrict access so that clients cannot modify data during the cutover (change to read only).
Isilon configuration
Pre-create NFS exports with identical export permissions
Disable snapshots on data until the cutover is completed
Disable antivirus scanning
Toolset selection
Use rsync.
Example:
rsync -a --delete sourcefiler:/exports/acct /ifs/data/exports/acct --exclude '.snapshot*'
Migration testing
Map source and target directories from the migration host
Replicate a small test set of data
Use a small set of test users to validate full data access (discard or overwrite data following the test)
Migration
This example directory structure can be broken down into at least eight separate rsync jobs. Several of the jobs can be run
in parallel, assuming the source cluster can tolerate the additional load while the migration is occurring. Monitor the source
cluster and scale it up or down accordingly.
If the customer has a preference for the order of directory migration, then plan the transfers accordingly. Otherwise, start
with the largest directories because they will take the most time to replicate.
Example:
Initial migration testing indicated that three simultaneous rsync jobs would be an acceptable additional load on the source
cluster. Run three rsync jobs concurrently so that you do not overload the source cluster. Note that in other scenarios,
additional nodes could run additional rsync jobs in parallel, if the source cluster has enough performance and bandwidth to
accommodate this.
Mount an individual export to a different node and start an rsync:
Isilon node 1:
Mount filer1:/export/acct with a command similar to:
mount filer1:/export/acct /import/acct
(mounts the remote export on a locally created directory)
Rsync to /ifs/data/filer1/acct with a command similar to:
rsync -av /import/acct /ifs/data/filer1 --exclude '.snapshot*'
(This copies files in "archive" mode, which preserves symbolic links, devices, attributes, permissions, ownership,
and so on in the transfer. No compression is applied, and snapshot directories are excluded from the transfer.)
Repeat with Isilon node 2:
Mount filer1:/export/engineering and rsync to /ifs/data/filer1/engineering.
Do the same with Isilon node 3:
Mount filer1:/export/home and rsync to /ifs/data/filer1/home.
Monitor the source cluster to verify that it is not overloaded with the additional strain of migrating data. The idea is to avoid
impacting clients as the migration progresses.
Repeat this process with the remaining exports until all of the source data is migrated.
Once an initial full copy of the source data has been completed, incremental copies should be run to propagate any
changes that were made once the migration began.
On the day of the cutover, final incremental copies should be run and access to the source cluster should be restricted if
possible to prevent clients from writing data that may not be migrated.
User acceptance testing
Verify that data and permissions on the Isilon cluster are replicated correctly following the final copies
Review user access and verify that users have connectivity and correct permissions
Use a small set of test users to validate that full data access occurs during the migration event to identify any problems early on
Verify that a user can read/write to a file, create a new file and directory, and traverse the directory structure
Monitor performance of the Isilon cluster and client connections as load increases
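The first verification item above can be spot-checked with a small script. The sketch below compares a source mount and its migrated copy by an aggregate content checksum; paths and function names are illustrative, and permissions and ownership still need a separate review.

```shell
# Sketch: compare two directory trees by an order-independent aggregate
# checksum over their regular files. Catches content drift after the
# final incremental; it does NOT compare permissions or ownership.
checksum_tree() {
    # Relative paths keep the comparison independent of the mount point.
    (cd "$1" && find . -type f -exec cksum {} + | sort | cksum)
}

verify_export() {
    local src="$1" dst="$2"
    if [ "$(checksum_tree "$src")" = "$(checksum_tree "$dst")" ]; then
        echo "OK: $src and $dst match"
    else
        echo "MISMATCH: $src and $dst differ" >&2
        return 1
    fi
}

# Example (illustrative paths):
# verify_export /import/acct /ifs/data/filer1/acct
```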
Cutover plan
Make source file systems read only
Execute final incremental copies
Update DNS
Verify that clients can connect to Isilon
Test automated workflows, if possible
Initiate user logoff and re-logon
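The "Update DNS" step in the cutover plan (and its reversal in the rollback plan) can be scripted. This is a minimal sketch assuming a BIND-style dynamic DNS setup via `nsupdate`; the alias, target names, TTL, and key path are all illustrative assumptions.

```shell
# Sketch: generate an nsupdate batch that repoints a storage CNAME.
# The same function serves cutover (point at the cluster) and rollback
# (point back at the old filer). Names and TTL are illustrative.
make_cname_update() {
    local alias="$1" target="$2" ttl="${3:-300}"
    cat <<EOF
update delete ${alias}. CNAME
update add ${alias}. ${ttl} CNAME ${target}.
send
EOF
}

# Cutover:  make_cname_update storage.example.com cluster.example.com | nsupdate -k /etc/rndc.key
# Rollback: make_cname_update storage.example.com oldfiler.example.com | nsupdate -k /etc/rndc.key
```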
Rollback plan
Reverse DNS update
Make old source file systems read/write and remove any restrictions
Remove any connections to the Isilon cluster and stop exports
Note that any data that was written to the Isilon cluster is considered lost; no reverse or reconciliation process will be performed.
Exit criteria
DNS resolves to new target storage and shares
Confirm that clients can successfully read/write data to directories
Confirm that workflows were successfully completed with no user or permission problems
Verify that there are no connectivity issues
Post-cutover
Conduct a customer meeting to review and triage the migration
Create documentation of the migration protocol
Conduct a “lessons learned” discussion with both the internal customer and their team