IBM Active Cloud Engine/Active File Management


Page 1: IBM Active Cloud Engine/Active File Management

IBM Active Cloud Engine/Active File Management

Kalyan Gunda

[email protected]

Page 2: IBM Active Cloud Engine/Active File Management

Agenda

Why is ACE needed?

Inside ACE

Use Cases

Page 3: IBM Active Cloud Engine/Active File Management

Data Movement across sites

How do you move Data across sites today?

FTP, Parallel FTP

SCP

Back up to tape and ship via FedEx

Issues

Pre-planned, user-initiated

Replica management

What if this data needs to move to multiple sites very frequently?

Page 4: IBM Active Cloud Engine/Active File Management

Data Movement between sites

What if there is a tool

That pulls data on demand

No explicit user initiation

That moves data periodically & smartly

That moves only changed data

That effectively uses the network

That manages these replicas, keeping staleness under control?

Is there such a tool?

Page 5: IBM Active Cloud Engine/Active File Management

Panache/ACE/AFM

ACE Global provides

Seamless data movement between clusters

On demand

Periodically

Continuously

A persistent, scalable, POSIX-compliant cache for a remote file system

Available even during disconnection

Page 6: IBM Active Cloud Engine/Active File Management

Moving data between locations can be slow, and the data copies themselves can become stale

And the data is not persistent…

[Diagram: one site writes once; several sites read]

But customers need to collaborate immediately, with up-to-date changes

Page 7: IBM Active Cloud Engine/Active File Management

Inside ACE

Page 8: IBM Active Cloud Engine/Active File Management

Panache Overview: Reads

[Diagram: the home site cluster and a GPFS Panache scale-out cache (storage array, storage nodes, interface nodes, gateway node). Both show the directory tree /home/appl/data/web with spreadsheet.xls and drawing.ppt. Protocol labels: NFS, CIFS, HTTP, VFS.]

Remote user reads a file from the local edge device

On-demand read from the home site

File is cached locally on disk

Reads can run disconnected
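
The read path on this slide (read locally, fetch from home on a miss, keep the copy on disk, keep serving while disconnected) can be sketched roughly as below. This is only an illustration under simplifying assumptions: two plain directories stand in for the home and cache file systems, and the `ReadThroughCache` class and its `connected` flag are hypothetical names, not part of Panache or GPFS.

```python
import os
import shutil


class ReadThroughCache:
    """Toy on-demand file cache: 'home' and 'cache' are plain directories."""

    def __init__(self, home_dir, cache_dir):
        self.home_dir = home_dir
        self.cache_dir = cache_dir
        self.connected = True  # set to False to simulate a WAN outage

    def read(self, rel_path):
        cached = os.path.join(self.cache_dir, rel_path)
        if not os.path.exists(cached):
            if not self.connected:
                raise FileNotFoundError(
                    f"{rel_path} is not cached and home is unreachable"
                )
            # Cache miss: fetch from home on demand and persist to local disk.
            os.makedirs(os.path.dirname(cached), exist_ok=True)
            shutil.copyfile(os.path.join(self.home_dir, rel_path), cached)
        # Cache hit (or freshly fetched): always serve from the local copy.
        with open(cached, "rb") as f:
            return f.read()
```

Once a file has been read a single time, later reads are served from the cache directory, so they still succeed after `connected` is set to `False`; only files that were never fetched fail.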

Page 9: IBM Active Cloud Engine/Active File Management

Asynchronous write back

[Diagram: the home cluster and the Panache scale-out cache (storage nodes, interface nodes), both showing /home/appl/data/web/spreadsheet.xls and /home/appl/data/web/drawing.ppt.]

Remote user writes a file to the local edge device

File is cached locally on disk, and the write is logged to an in-memory queue

Write back to home happens periodically, or when the network is connected

Page 10: IBM Active Cloud Engine/Active File Management

Asynchronous Updates (write, create, remove)

Updates at the cache site are pushed back lazily
    Masks the latency of the WAN

Data is written to GPFS at the cache site synchronously
    The gateway (GW) node queues the update for later execution
    Performance is identical to a local file system update

Writeback is asynchronous
    Configurable async delay
    GW nodes queue updates and write back to home as network bandwidth permits
    Write back tends to coalesce updates and accommodate out-of-order and parallel writes to files and directories, maximizing WAN bandwidth utilization

Users can force a sync if needed
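
The queue-and-coalesce behaviour described on this slide might look roughly like the sketch below. It is a simplified model, not the gateway node's actual implementation: dictionaries stand in for the cache and home file systems, the `WriteBackQueue` class is a hypothetical name, and coalescing is reduced to "keep only the latest queued operation per path".

```python
from collections import OrderedDict


class WriteBackQueue:
    """Toy gateway queue: local writes return at once, home is updated on flush."""

    def __init__(self, home):
        self.home = home              # dict standing in for the home file system
        self.local = {}               # dict standing in for the cache file system
        self.pending = OrderedDict()  # path -> latest queued operation

    def write(self, path, data):
        self.local[path] = data       # synchronous update at the cache site
        self.pending.pop(path, None)  # coalesce: drop any older queued op
        self.pending[path] = ("write", data)

    def remove(self, path):
        self.local.pop(path, None)
        self.pending.pop(path, None)
        self.pending[path] = ("remove", None)

    def flush(self):
        """Push queued operations to home; returns how many were sent."""
        sent = 0
        while self.pending:
            path, (op, data) = self.pending.popitem(last=False)
            if op == "write":
                self.home[path] = data
            else:
                self.home.pop(path, None)
            sent += 1
        return sent
```

Calling `flush()` explicitly corresponds to a user forcing a sync; in a real deployment the flush would be driven by the configured delay and by available bandwidth. Two writes to the same path before a flush cost only one transfer over the WAN.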

Page 11: IBM Active Cloud Engine/Active File Management

Expiration of Data

Staleness control
    Defined based on time since disconnection
    Once the cache has expired, no access to it is allowed
    Manual expire/unexpire option for the admin

Allowed only for read-only (RO) mode caches

Disabled for single-writer (SW) and local-updates (LU) caches, as they are sources of data themselves
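
The expiration rules on this slide can be expressed as a small check. The sketch below is an assumption-laden illustration: the `CacheFileset` class, its field names, and the use of plain numbers of seconds for time are all hypothetical and do not reflect how the product stores this state.

```python
class CacheFileset:
    """Toy staleness control for a cache fileset."""

    def __init__(self, mode, expiration_timeout):
        self.mode = mode                              # "ro", "sw" or "lu"
        self.expiration_timeout = expiration_timeout  # seconds
        self.disconnected_since = None                # None while home is reachable
        self.manually_expired = False                 # set by the admin

    def is_expired(self, now):
        if self.mode != "ro":
            return False  # SW and LU caches are data sources, so never expire
        if self.manually_expired:
            return True
        if self.disconnected_since is None:
            return False
        return now - self.disconnected_since >= self.expiration_timeout

    def check_access(self, now):
        """Raise if the fileset has expired; otherwise access is allowed."""
        if self.is_expired(now):
            raise PermissionError("cache fileset has expired")
```

A read-only fileset with a 60-second timeout that lost its connection at time 1000 would still be readable at 1030 and would be refused from 1060 onward, until the connection returns or the admin unexpires it.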

Page 12: IBM Active Cloud Engine/Active File Management

Panache WAN Caching Features

Feature                         Panache support
Writable cache                  Yes
Granularity                     Fileset (directory tree)
Policy-based pre-fetching       Yes (uses GPFS policy engine rules)
Policy-based cache eviction     Yes (uses GPFS policy engine rules)
Disconnected mode operations    Yes (can also expire based on configured timeout)
Data transport protocol         NFS (uses a standard protocol to move data from any filer)
Streaming support               Yes (GPFS policy rules select files to replicate)
Locking support                 No (only local cluster-wide locks)
Sparse file support             Yes (can read as sparse files)
Namespace caching               Yes (gets directory structure along with data)
Parallel data transfer          Yes
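
To give a feel for what a cache eviction policy decides, here is a rough Python sketch. It is not GPFS policy rule syntax; the `choose_evictions` function, the soft-limit parameter, the least-recently-accessed ordering, and the rule that files not yet written back are skipped are all assumptions made for illustration.

```python
def choose_evictions(files, used_bytes, soft_limit_bytes):
    """Return paths to evict, least recently accessed first.

    files: iterable of (path, size_bytes, last_access_time, is_dirty).
    Dirty files (not yet written back to home) are never evicted here,
    which is an assumption of this sketch.
    """
    victims = []
    candidates = sorted((f for f in files if not f[3]), key=lambda f: f[2])
    for path, size, _atime, _dirty in candidates:
        if used_bytes <= soft_limit_bytes:
            break  # enough space has been reclaimed
        victims.append(path)
        used_bytes -= size
    return victims
```

Evicting a file in this model only drops the cached data; the file can be fetched again from home on the next read.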

Page 13: IBM Active Cloud Engine/Active File Management

Use Cases

Page 14: IBM Active Cloud Engine/Active File Management

Use Case: Central/Branch Office

Data is created, maintained, and updated at the central site

Branch/edge sites periodically prefetch (via policy) or pull on demand

Data is revalidated when accessed

A typical scenario is an iTunes-like music site

[Diagram: HQ primary site (writer) and edge site (reader), linked by periodic prefetch and on-demand pull]
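
Revalidation on access can be pictured as a cheap freshness check before serving cached data. The sketch below assumes a modification-time comparison and a refresh interval; the `read_with_revalidation` function, the dictionary layout, and both callback parameters are hypothetical names chosen for this example.

```python
def read_with_revalidation(entry, now, refresh_interval,
                           fetch_home_mtime, fetch_home_data):
    """Return cached data, refetching it if home's copy has changed.

    entry: dict with keys "data", "mtime" and "checked_at".
    fetch_home_mtime / fetch_home_data: callables that query the home site.
    """
    if now - entry["checked_at"] >= refresh_interval:
        home_mtime = fetch_home_mtime()
        if home_mtime != entry["mtime"]:
            # Home changed since the last check: pull the new contents.
            entry["data"] = fetch_home_data()
            entry["mtime"] = home_mtime
        entry["checked_at"] = now
    return entry["data"]
```

Within the refresh interval the cached copy is served without contacting home at all, which keeps reads fast while bounding how stale a branch site can become.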

Page 15: IBM Active Cloud Engine/Active File Management

Use Case: Non-Dependent Writers

Each site writes to its own dedicated fileset/directory

A central system holds all home directories; backup and HSM are managed from it

[Diagram: User A's home directory (writer), User B's home directory (writer), and the backup site]

Page 16: IBM Active Cloud Engine/Active File Management

Use Case: Ingest and Disseminate

Central site gets updates frequently

Regional/edge sites can periodically prefetch or pull on demand

Data is revalidated

[Diagram: data ingest location (writer), with periodic prefetch, periodic pull, and on-demand pull to the other sites, and a backup site]

Page 17: IBM Active Cloud Engine/Active File Management

Use Case: Global Namespace (Mesh)

[Diagram: three sites, SONAS1.ibm.com, SONAS2.ibm.com, and SONAS3.ibm.com, with file systems labeled store1 and store2. Each site is home for two filesets and caches the other four.]

Home for data1 and data2
    Local filesets: /data1, /data2
    Cache filesets: /data3, /data4, /data5, /data6

Home for data3 and data4
    Local filesets: /data3, /data4
    Cache filesets: /data1, /data2, /data5, /data6

Home for data5 and data6
    Local filesets: /data5, /data6
    Cache filesets: /data1, /data2, /data3, /data4

Clients at every site connect to:
    SONAS:/data1
    SONAS:/data2
    SONAS:/data3
    SONAS:/data4
    SONAS:/data5
    SONAS:/data6

Each cache site exports the same namespace view

Every fileset is accessible from all sites
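
The mesh layout boils down to a table that says which site is home for each fileset; every other site holds that fileset as a cache. A minimal sketch follows, with the caveat that the site labels `site_a`, `site_b`, and `site_c` are placeholders (the slide does not make clear which SONAS host is home for which filesets) and both function names are hypothetical.

```python
# Which site is home for each fileset. Site labels are placeholders.
FILESET_HOME = {
    "/data1": "site_a", "/data2": "site_a",
    "/data3": "site_b", "/data4": "site_b",
    "/data5": "site_c", "/data6": "site_c",
}


def fileset_role(site, fileset):
    """Return 'local' if the site is home for the fileset, else 'cache'."""
    return "local" if FILESET_HOME[fileset] == site else "cache"


def exported_namespace(site):
    """Every site exports all filesets; only the role behind each differs."""
    return {fs: fileset_role(site, fs) for fs in sorted(FILESET_HOME)}
```

Clients see the same six paths no matter which site they connect to; whether a path is served from local data or from a cache of another site's data is invisible to them.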

Page 18: IBM Active Cloud Engine/Active File Management

Thank You