A COST-EFFECTIVE HIGH-BANDWIDTH STORAGE ARCHITECTURE

G.A. Gibson, D.F. Nagle, K. Amiri, J. Butler, F.W. Chang, H. Gobioff, C. Hardin, E. Riedel, D. Rochberg, J. Zelenka
Carnegie-Mellon U.
ASPLOS ‘98



Paper highlights

• Introduces the Network-Attached Secure Disk (NASD) architecture, characterized by:
  – Direct transfers to clients
  – Secure interfaces via cryptographic support
  – Asynchronous non-critical path oversight (client can perform most operations without synchronous appeals to the file manager)
  – Variable-size data objects


Motivation

• High demand for storage bandwidth caused by:
  – Multimedia applications
  – Data-intensive applications such as data mining
• Want to achieve scalable bandwidth
  – Bandwidth that grows linearly with the number of storage devices and client processors


Storage architecture overview (I)

1. Local file system:
• Sole solution for stand-alone computers

[Diagram labels: Computer, Disk]


Storage architecture overview (II)

2. Distributed FS:
• Provides basic file services
• If the server processes data for clients, we have a distributed database system
• Server machine could become a bottleneck when the number of drives increases

[Diagram labels: Server, Disk, Client]
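The bottleneck argument can be made concrete with a back-of-the-envelope model. The sketch below is only an illustration: the function name and the bandwidth figures used with it are hypothetical and do not come from the paper. It assumes that all data flows through the server when one is present, so aggregate bandwidth is capped by the server's own throughput.

```python
def aggregate_bandwidth(n_drives, drive_mb_s, server_mb_s=None):
    """Toy model of deliverable bandwidth in MB/s (all inputs hypothetical).

    With a server in the data path, clients never see more than the
    server can forward; without one, bandwidth grows with the drive count.
    """
    raw = n_drives * drive_mb_s
    if server_mb_s is None:          # drives talk to clients directly
        return raw
    return min(raw, server_mb_s)     # store-and-forward through the server


# Adding drives behind a fixed server stops helping once the server saturates.
for n in (2, 4, 8, 16):
    print(n, aggregate_bandwidth(n, 10, server_mb_s=50), aggregate_bandwidth(n, 10))
```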


Storage architecture overview (III)

3. Distributed FS with RAID controller:
• Improves reliability but adds one more layer

[Diagram labels: Server, Disk, Client, K]


Storage architecture overview (IV)

4. Distributed FS with DMA:
• Lets disks and clients exchange data without server intervention

[Diagram labels: Server, Disk, Client, K, Bulk data transfers]


Storage architecture overview (V)

5. Network-Attached Secure Disk:
• Disk takes over several of the functions of the server

[Diagram labels: Server, NASD, Client, R/W]


Storage architecture overview (VI)

6. Network-Attached Secure Disk with Cheops (their own file striping system)

[Diagram labels: Server, NASD, Client, R/W, K]
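The slide only names Cheops as a file striping system. As a generic illustration of what striping over several drives involves (not the Cheops layout itself, which the slides do not describe), the sketch below maps a logical file offset to a drive and an offset within that drive's component object, assuming plain round-robin placement. The function name and parameters are hypothetical.

```python
def locate(offset, stripe_unit, n_drives):
    """Map a logical byte offset to (drive index, offset on that drive).

    Assumes round-robin striping: stripe unit 0 goes to drive 0,
    unit 1 to drive 1, and so on, wrapping around after n_drives units.
    """
    unit_no, within = divmod(offset, stripe_unit)
    drive = unit_no % n_drives
    local = (unit_no // n_drives) * stripe_unit + within
    return drive, local


# Consecutive stripe units land on different drives, so a large sequential
# read can be served by several drives in parallel.
print([locate(off, 64, 4) for off in (0, 64, 128, 192, 256)])
```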


Related work

• Disk-like network attached storage (Cambridge’s Universal File System, 1980)
• Virtual volumes and virtual disks (mid 90’s)
• Derived Virtual Devices (ISI’s Netstation, 1996)
• Capabilities (1966)


Enabling technology (I)

• I/O-bound applications: multimedia, data mining of retail transactions
• New drive attachment technologies: trend toward encapsulating drive communication over a serial, switched, packet-based SAN
• Excess of on-drive transistors: drives can now be more intelligent


Enabling technology (II)

• Convergence of peripheral and interprocessor networks:
  – Clusters of workstations use Internet protocols
    • Not special-purpose interconnects
• Cost-ineffective storage servers:
  – Server now much more expensive than the disks it manages


Network-Attached Secure Disks

• Modify storage devices to transfer data directly to clients
  – Eliminate server bottleneck
• Present a flat name space of variable-length objects
  – Simple yet flexible
• Do not provide full file system functionality
  – Other FS tasks left to file manager


One way to look at NASD

• Conventional architectures divide file system tasks between:
  – A metadata service (directories and i-nodes)
  – A block storage service
• The authors propose to:
  – Let storage units handle block allocation
  – Let them communicate directly with clients


Network-Attached Secure Disks

• Architecture characterized by:
  – Direct transfers to clients
  – Secure interfaces via cryptographic support
  – Asynchronous non-critical path oversight
    • Client can perform most operations without synchronous calls to file manager
  – Variable-length objects


A NASD System


NASD interface (I)

• Fewer than 20 requests, including:
  – Read and write object data
  – Read and write object attributes
  – Create and remove object
  – Create, resize, and remove partition
  – Construct a copy-on-write object version
  – Set security key
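To make the shape of this interface more tangible, here is a minimal in-memory sketch of a few of the listed requests (create, remove, read and write data, read and write attributes). It is a toy model under stated assumptions, not the prototype's API: the class and method names are hypothetical, the drive is assumed to choose object names itself, and partitions, copy-on-write versions, and security keys are left out.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class NasdObject:
    data: bytearray = field(default_factory=bytearray)
    attrs: dict = field(default_factory=dict)    # size, timestamps, ...


class ToyDrive:
    """Hypothetical stand-in for one partition of a NASD drive."""

    def __init__(self):
        self._objects = {}                       # flat name space: id -> object
        self._ids = itertools.count(1)

    def create(self):
        oid = next(self._ids)                    # assumption: drive picks the name
        self._objects[oid] = NasdObject()
        return oid

    def remove(self, oid):
        del self._objects[oid]

    def write(self, oid, offset, payload):
        obj = self._objects[oid]
        end = offset + len(payload)
        if end > len(obj.data):                  # variable-length: grow on demand
            obj.data.extend(b"\0" * (end - len(obj.data)))
        obj.data[offset:end] = payload
        obj.attrs["size"] = len(obj.data)

    def read(self, oid, offset, length):
        return bytes(self._objects[oid].data[offset:offset + length])

    def get_attrs(self, oid):
        return dict(self._objects[oid].attrs)

    def set_attr(self, oid, key, value):
        self._objects[oid].attrs[key] = value


drive = ToyDrive()
oid = drive.create()
drive.write(oid, 0, b"hello")
print(drive.read(oid, 0, 5), drive.get_attrs(oid))
```

Note that the client addresses data by object name and byte offset; block allocation stays inside the drive, which is the division of labor the earlier slides describe.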


NASD interface (II)

• Resizable partitions allow capacity quotas to be managed by a drive administrator
• Objects with well-known names and structures allow configuration and bootstrap of drives and partitions
  – Also enable file systems to find a fixed starting point for an object hierarchy and a complete list of allocated object names


NASD interface (III)

• Object attributes:
  – Provide timestamps, size, …
  – Allow capacity to be reserved and objects to be linked for clustering
  – Include an uninterpreted block of attribute space
    • Can be used by any application


NASD interface (IV)

• NASD security is based on cryptographic capabilities
  – Drive checks that the client holds the private part of a capability authorizing the operation
• Data integrity and privacy ensured through encryption
  – Costly, but expected to be implemented in special hardware
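One common way to build such capabilities is with a keyed hash, and the sketch below follows that pattern. It is an assumption-laden illustration, not the paper's exact scheme: the function names, the field layout of the public part, the shared key, and the choice of HMAC-SHA-256 are all hypothetical. The file manager and the drive share a secret key; the file manager derives the private part from the public part, the client uses the private part to tag each request, and the drive recomputes both to check the tag.

```python
import hashlib
import hmac
import time

DRIVE_KEY = b"secret shared by file manager and drive"   # hypothetical key


def issue_capability(object_id, rights, expires_at, key=DRIVE_KEY):
    """File manager side: return (public part, private part)."""
    public = f"{object_id}:{rights}:{expires_at}".encode()
    private = hmac.new(key, public, hashlib.sha256).digest()
    return public, private


def sign_request(private, request):
    """Client side: prove possession of the private part without sending it."""
    return hmac.new(private, request, hashlib.sha256).digest()


def drive_accepts(public, request, tag, op, now=None, key=DRIVE_KEY):
    """Drive side: rebuild the private part from the public part, check the tag.

    Simplification: the drive does not parse the request to confirm that it
    names the same object as the capability.
    """
    _object_id, rights, expires_at = public.decode().split(":")
    now = time.time() if now is None else now
    if op not in rights or now > float(expires_at):
        return False
    private = hmac.new(key, public, hashlib.sha256).digest()
    expected = hmac.new(private, request, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


public, private = issue_capability(7, "r", time.time() + 60)
request = b"read object 7, offset 0, length 4096"
print(drive_accepts(public, request, sign_request(private, request), "r"))
```

In this arrangement the drive needs no per-client state and no call to the file manager at access time, which is what lets oversight stay off the critical path.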


Prototype implementation (I)

• Working prototype of the NASD drive software runs as a kernel module in Digital UNIX

• Each NASD prototype drive runs on a DEC Alpha 3000/400 (133 MHz, 64 MB) with two disks attached by two 5 MB/s SCSI busses

• Performance of this old machine is similar to that expected from future drive controllers


Prototype implementation (II)

• Two drives managed by a software striping driver approximate the rates expected from more modern drives

• NASD object system implements its own internal object access, cache, and disk space management

• Prototype uses DCE RPC over UDP/IP for communication
  – Severely limited prototype performance


File Systems for NASD (I)

• Implemented NFS and AFS on top of simulated NASDs
• Files and directories stored as objects
• NFS implementation was straightforward
  – Can store additional file attributes in uninterpreted block of attribute space
  – Can piggyback capabilities on file manager’s response to lookup requests


File Systems for NASD (II)

• AFS implementation required more thought
  – New RPC calls were added to obtain and relinquish capabilities
  – File manager does not know when an actual write takes place
    • Replaced callbacks by leases
• NASD-NFS and NFS had benchmark times “within 5% of each other”
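The slide does not say how the leases work, so the following is only a generic sketch of the idea: instead of promising to call clients back when a file changes, the file manager hands out time-limited permissions and simply waits for conflicting ones to expire or be relinquished. The class name, the conflict rule (one writer or many readers), and the time units are hypothetical.

```python
class LeaseTable:
    """Toy file-manager lease bookkeeping (hypothetical, not the AFS port)."""

    def __init__(self, duration):
        self.duration = duration
        self._leases = {}            # file id -> {client: (mode, expiry)}

    def grant(self, file_id, client, mode, now):
        """Return the lease expiry time, or None if a live lease conflicts."""
        holders = self._leases.setdefault(file_id, {})
        for c in [c for c, (_, exp) in holders.items() if exp <= now]:
            del holders[c]           # expired leases need no callback
        others = [m for c, (m, _) in holders.items() if c != client]
        if (mode == "w" and others) or (mode == "r" and "w" in others):
            return None
        expiry = now + self.duration
        holders[client] = (mode, expiry)
        return expiry

    def relinquish(self, file_id, client):
        self._leases.get(file_id, {}).pop(client, None)


table = LeaseTable(duration=30)
print(table.grant("f", "alice", "r", now=0))    # read lease granted
print(table.grant("f", "bob", "w", now=10))     # refused while alice's lease is live
print(table.grant("f", "bob", "w", now=31))     # granted after expiry
```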


File Systems for NASD (III)

• Also implemented a simple parallel file system (not discussed in class)


Conclusions

• NASD:
  – Supports direct device-to-client operation
  – Provides secure interfaces
  – Lets file managers provide clients with capabilities that allow them to interact directly with the devices (asynchronous oversight)
  – Lets devices serve variable-length objects with additional attributes