EMC RecoverPoint Architecture and Basic Concepts


March 25, 2013 | David Ring | RecoverPoint | EMC, RecoverPoint

This is my first blog post on RecoverPoint. In it I will cover some of the basic concepts and terminology around RecoverPoint, along with the Gen5 hardware appliance specification.

•Overview

•Gen5 Hardware

•Terminology

Overview

RecoverPoint runs on a dedicated appliance (RPA) and provides continuous data protection for storage arrays, protecting data both locally and remotely. It provides bi-directional replication and enables the recovery of data to any point in time while replicating over any distance: within the same site (CDP), to a distant site (CRR), or both concurrently (CLR). Data transfer within a site uses Fibre Channel (FC) connectivity; for transfer between sites both FC and IP (WAN) are supported.

Synchronous replication is supported when the remote sites are connected through FC and provides a zero RPO. In a synchronous configuration the lag between the production and the remote copy is always zero, since RecoverPoint does not acknowledge a write before it reaches the remote site. Asynchronous replication provides crash-consistent protection and recovery to specific points in time.


An example of a local Continuous Data Protection (CDP) solution:

From the image above you can see that the splitter sends one copy of each write to the production LUN and another to the RPA. The write is acknowledged by both the LUN and the RPA. The RPA writes the data to the journal volume along with a timestamp and bookmark metadata. The data is then distributed to the local replica in a write-order-consistent manner. This means that even if your consistency group contains many LUNs, all the data written across them remains write-order consistent.
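The write path described above can be sketched as a toy model. This is only an illustration of the idea, not RecoverPoint's implementation: the class and method names (`ToyCdp`, `write`, `distribute`, `image_at`) are hypothetical, a sequence number stands in for the timestamp, and the journal is assumed to start from empty volumes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Block = Tuple[str, int]  # (LUN name, offset)


@dataclass
class JournalEntry:
    seq: int                        # write order; stands in for the timestamp
    lun: str
    offset: int
    data: bytes
    bookmark: Optional[str] = None  # optional user label for this point in time


@dataclass
class ToyCdp:
    production: Dict[Block, bytes] = field(default_factory=dict)
    replica: Dict[Block, bytes] = field(default_factory=dict)
    journal: List[JournalEntry] = field(default_factory=list)
    distributed: int = 0            # journal entries already applied to the replica

    def write(self, lun: str, offset: int, data: bytes,
              bookmark: Optional[str] = None) -> None:
        """Splitter: one copy goes to the production LUN, one to the journal."""
        self.production[(lun, offset)] = data
        self.journal.append(
            JournalEntry(len(self.journal), lun, offset, data, bookmark))

    def distribute(self, count: Optional[int] = None) -> None:
        """Apply pending journal entries to the replica in original write order."""
        end = len(self.journal)
        if count is not None:
            end = min(end, self.distributed + count)
        for entry in self.journal[self.distributed:end]:
            self.replica[(entry.lun, entry.offset)] = entry.data
        self.distributed = end

    def image_at(self, seq: int) -> Dict[Block, bytes]:
        """Rebuild the image as of journal entry `seq` (inclusive)."""
        image: Dict[Block, bytes] = {}
        for entry in self.journal[:seq + 1]:
            image[(entry.lun, entry.offset)] = entry.data
        return image
```

Because entries are replayed strictly in journal order, writes that span several LUNs in the same group reach the replica in the order the hosts issued them, which is the property the consistency group exists to preserve.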

An example of a Continuous Remote Replication (CRR) solution:

If we examine the IO sequence of the CRR solution, we can see that the IO is again split, with one copy sent to the production LUN and the other to the RPA. As mentioned, replication can then be:

1. Asynchronous – In asynchronous replication the write IO from the host is sent to the RPA, which acknowledges it as soon as the data arrives in its memory.


2. Synchronous – In synchronous mode the RPA does not acknowledge a write until it reaches either the memory of the DR site's RPA or the DR site's persistent storage, depending on whether the “measure lag to remote RPA” flag is enabled in the configuration. Sync replication can run over FC or IP, provided the full round-trip latency does not exceed 4ms over FC or 10ms over IP.
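As a quick illustration of the latency rule above, here is a minimal sketch that checks a measured round-trip time against the two limits quoted in this post. The function and constant names are hypothetical; this is not a RecoverPoint API, and the measured value would have to come from your own link testing.

```python
# Round-trip limits quoted in the text above, in milliseconds.
SYNC_RTT_LIMIT_MS = {"FC": 4.0, "IP": 10.0}


def sync_replication_allowed(link_type: str, round_trip_ms: float) -> bool:
    """Return True if the link's full round-trip latency is within the sync limit."""
    key = link_type.upper()
    if key not in SYNC_RTT_LIMIT_MS:
        raise ValueError(f"unknown link type: {link_type!r}")
    return round_trip_ms <= SYNC_RTT_LIMIT_MS[key]
```

For example, a 6ms round trip would pass this check for an IP link but fail it for an FC link.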

In a concurrent local and remote (CLR) solution, CDP and CRR run simultaneously.

The RecoverPoint family consists of three license offerings:

• RecoverPoint/CL (Classic) for replicating across EMC arrays and non-EMC storage platforms with the use of VPLEX. Supports all EMC array splitters. Note: capacity is ordered per RPA cluster, not per RP system.

• RecoverPoint/EX for VMAXe™, VPLEX™, VNX™ series, VNXe3200, CLARiiON® CX3 and CX4 series, XtremIO, ScaleIO and Celerra® unified storage environments.

• RecoverPoint/SE for VNX series, VNXe3200, CLARiiON CX3 and CX4 series, and Celerra unified storage environments.

Gen5 Hardware

The RecoverPoint appliance (RPA) is a 1U hardware-based server (Intel R1000). The specification of the RPA is as follows:

• 2 x quad-core Sandy Bridge processors

• 2 x 300GB 10K RPM 2.5” SAS drives in a RAID 1 configuration

• 6 x 1GbE ports (RJ-45): WAN, LAN and remote management; the other 3 ports are unused

• 16GB DDR3 memory

• PCIe slot 1: quad-port 8Gb FC QLogic 2564 card (PCIe slot 2 is empty)

From the image below you can see the port usage for WAN and LAN, and the HBA port sequence (left to right) 3-2-1-0. For each RPA, we use two Ethernet cables: one connects the management (LAN) interface to eth1 and the other connects the WAN interface to eth0.

GEN5 RPA:

Note: RecoverPoint clusters must have a minimum of 2 RPAs and a maximum of 8 RPAs. Cluster sizes must be the same at each site of an installation. A RecoverPoint environment can have up to 5 clusters, either local or remote, although RP/SE is limited to two clusters. Gen4 and Gen5 RPAs can co-exist in the same RP cluster.
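The sizing rules in the note above lend themselves to a simple sanity check. The sketch below is a hypothetical helper, not part of any RecoverPoint tooling; it takes the number of RPAs in each cluster and reports any rule from this post that the layout breaks.

```python
from typing import List

MIN_RPAS, MAX_RPAS = 2, 8                # RPAs per cluster, per the note above
MAX_CLUSTERS = {"default": 5, "SE": 2}   # RP/SE is limited to two clusters


def check_environment(rpas_per_cluster: List[int],
                      license_type: str = "default") -> List[str]:
    """Return a list of rule violations (empty if the layout looks valid)."""
    problems = []
    limit = MAX_CLUSTERS.get(license_type, MAX_CLUSTERS["default"])
    if len(rpas_per_cluster) > limit:
        problems.append(
            f"{len(rpas_per_cluster)} clusters exceeds the limit of {limit}")
    for index, count in enumerate(rpas_per_cluster):
        if not MIN_RPAS <= count <= MAX_RPAS:
            problems.append(
                f"cluster {index} has {count} RPAs (allowed: {MIN_RPAS}-{MAX_RPAS})")
    if len(set(rpas_per_cluster)) > 1:
        problems.append("cluster sizes differ between sites")
    return problems
```

For instance, `check_environment([2, 4])` would flag the mismatched cluster sizes, while `check_environment([2, 2, 2], "SE")` would flag the cluster count.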

Terminology

Splitter – The array-based splitter ensures that the RPA receives a copy of each write to the protected LUN. At the production site the splitter splits each IO so that both the RPA and the storage receive a copy of the write, while maintaining write-order fidelity. At the DR site the splitter is responsible for blocking unexpected writes from hosts and supporting the various types of image access.

RecoverPoint Repository Volumes – Dedicated volumes on the SAN-attached storage at each site; one repository volume is required for each RPA cluster. The repository holds the configuration information about the RPAs and consistency groups. Repository volumes are exposed only to the RPAs. The minimum size for the repository is 2.86GB.

RecoverPoint Journal Volumes – One or more SAN-attached storage volumes for each copy used in a consistency group (the production copy, local replica copy, and remote replica copy). Like repository volumes, journal volumes are exposed only to the RPAs, not to the hosts. There are two types of journal volumes:

1. Replica journals – Hold snapshots that are either waiting to be distributed or have already been distributed to the replica storage, along with the metadata for each image and its bookmarks. A replica journal holds as many snapshots as its capacity allows.

2. Production journals – Used when there is a link failure between sites. In this situation marking information is written to the production journal, and the marked changes are synced to the replica when the link comes back online. This process is known as delta marking (marking mode). The production journal does not contain snapshots used for point-in-time (PIT) recovery.

Note: The minimum journal volume size is 10GB for a standard consistency group and 40GB for a distributed consistency group.
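To make the delta marking idea more concrete, here is a deliberately simplified sketch. It assumes that "marking information" can be modelled as a set of changed block numbers; the class and its methods are hypothetical and say nothing about how RecoverPoint stores marking data internally.

```python
from typing import Dict, List, Set


class ToyMarkingReplicator:
    """Toy model: track changed blocks while the link is down, resync on recovery."""

    def __init__(self) -> None:
        self.production: Dict[int, bytes] = {}
        self.replica: Dict[int, bytes] = {}
        self.link_up = True
        self.marked: Set[int] = set()  # stands in for the production journal's marking data

    def write(self, block: int, data: bytes) -> None:
        self.production[block] = data
        if self.link_up:
            self.replica[block] = data   # normal replication
        else:
            self.marked.add(block)       # only remember *where* the change happened

    def link_restored(self) -> List[int]:
        """Resync only the marked blocks and return the block numbers that were sent."""
        self.link_up = True
        sent = sorted(self.marked)
        for block in sent:
            self.replica[block] = self.production[block]
        self.marked.clear()
        return sent
```

The point of the sketch is that a block overwritten several times during the outage is transferred only once, with its latest contents, which also illustrates why no point-in-time history is kept for that period.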


Replication Set – A protected SAN-attached storage volume at the production site together with its replica (local or remote).

Consistency Group – Consists of replication sets grouped together to ensure write-order consistency across all the replication sets’ primary volumes. A configuration change on a consistency group, such as changing its compression or bandwidth limits, applies to all of its replication sets. A RecoverPoint system supports a maximum of 128 CGs, with a maximum of 64 CGs per RPA. If an RPA in the cluster fails, the CGs running on that RPA fail over to another RPA in the cluster.

Distributed Consistency Group – To obtain higher throughput, a CG can be configured as a distributed consistency group (DCG), which can use up to 4 RPAs (a standard CG uses 1 RPA). A maximum of 8 DCGs can be configured, and the limit of 128 CGs per RP system covers standard CGs and DCGs combined.
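The consistency group limits quoted in this post can be collected into a small planning check. The sketch below is a hypothetical helper rather than a RecoverPoint tool. It makes two labelled assumptions: a DCG counts against the per-RPA limit of every RPA it uses, and `journal_gb` is the size of the group's smallest journal volume.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

MAX_CGS_PER_SYSTEM = 128                 # standard CGs and DCGs combined
MAX_CGS_PER_RPA = 64
MAX_DCGS = 8
MAX_RPAS_PER_DCG = 4
MIN_JOURNAL_GB = {False: 10, True: 40}   # keyed by "is this a DCG?"


@dataclass
class ConsistencyGroup:
    name: str
    rpas: List[str]      # one RPA for a standard CG, up to four for a DCG
    journal_gb: float    # assumption: size of the group's smallest journal volume

    @property
    def distributed(self) -> bool:
        return len(self.rpas) > 1


def check_groups(groups: List[ConsistencyGroup]) -> List[str]:
    """Return a list of limit violations (empty if the plan fits the limits)."""
    problems = []
    if len(groups) > MAX_CGS_PER_SYSTEM:
        problems.append(f"{len(groups)} CGs exceeds {MAX_CGS_PER_SYSTEM} per system")
    dcg_count = sum(1 for g in groups if g.distributed)
    if dcg_count > MAX_DCGS:
        problems.append(f"{dcg_count} DCGs exceeds the limit of {MAX_DCGS}")
    per_rpa: Counter = Counter()
    for g in groups:
        if not 1 <= len(g.rpas) <= MAX_RPAS_PER_DCG:
            problems.append(f"{g.name}: uses {len(g.rpas)} RPAs (allowed: 1-{MAX_RPAS_PER_DCG})")
        if g.journal_gb < MIN_JOURNAL_GB[g.distributed]:
            problems.append(f"{g.name}: journal smaller than {MIN_JOURNAL_GB[g.distributed]}GB")
        per_rpa.update(g.rpas)  # assumption: a DCG counts against each of its RPAs
    for rpa, count in sorted(per_rpa.items()):
        if count > MAX_CGS_PER_RPA:
            problems.append(f"{rpa}: {count} CGs exceeds {MAX_CGS_PER_RPA} per RPA")
    return problems
```

A check like this is only as good as the limits fed into it, so the constants should be confirmed against the release notes for the RecoverPoint version in use.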

Image Access – Refers to providing host access to the replica volumes while still keeping track of source changes. Image access can be physical (also known as logged), which provides access to the actual physical volumes, or virtual, which provides rapid access to a virtual image of the same volumes.

In the next RecoverPoint blog post I will detail sizing and performance characteristics for the journal and replica volumes.