Network Computing Laboratory

Integrating Portable and Distributed Storage

Niraj Tolia, Jan Harkes, Michael Kozuch, and M. Satyanarayanan
CMU and Intel Research Pittsburgh

USENIX FAST 2004

Presenter: Yongjoon Son
2005. 11. 23

Korea Advanced Institute of Science and Technology

One-line Comment

This paper describes a technique called lookaside caching that integrates portable storage devices into distributed file systems.

Contents

Motivation
Use Scenario
Lookaside Caching
Lookaside Caching Flowchart
Implementation
Evaluation
  Kernel Compile
  Internet Suspend/Resume
  Trace Replay
Broader Use of Lookaside Caching
Discussion

Motivation

Sneakernet
  “Transfer of electronic information by physically carrying removable media from one computer to another” (Wikipedia)
  Why is it alive and well today in spite of advances in networking and distributed file systems?

Full confidence that you will be able to access that data anywhere, regardless of network quality, network or server outages, and machine configuration.
The capacity and the performance of portable storage are ever-improving.

Limitations
  Ensuring that you have the right device and the current version
  Guarding against loss, theft and damage
  Some chores

Integration Value

Attribute               Portable Devices              Distributed File Systems
Performance             (stable)                      (variable)
Availability
Robustness
Sharing/Collaboration
Consistency
Capacity
Security                (mainly physical security)    (network security)
Ubiquity

Use Scenario

You cannot be sure that you will have network connectivity at your destination, so you decide to bring a portable device.
If the network doesn’t work at the destination, you use the device directly. Though the data may be stale, you can still access it.
Otherwise, you access the up-to-date data through the distributed file system (dfs), and the portable device can still work as a cache.

On a slow network or with a heavily loaded server, you will benefit from improved performance.
With network or server outages, you will benefit from improved availability if your dfs supports disconnected operation and if you have hoarded all your meta-data.

Lookaside Caching

Once a client possesses valid meta-data for an object, it looks up the corresponding file content in the mounted portable storage devices before accessing the file server.
Minimal disruption of the existing usage model

Based upon AFS2-style whole-file caching
Lookaside caching extends the definition of meta-data to include a cryptographic hash of data contents.
  Additional 20 bytes if SHA-1 is used as the hash
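
To make the 20-byte figure concrete, here is a minimal sketch of a meta-data record that carries a SHA-1 digest of the file contents. The record layout and the names ObjectMetadata and make_metadata are hypothetical illustrations, not Coda's actual data structures.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMetadata:
    """Hypothetical per-object meta-data record (not Coda's real layout)."""
    name: str
    length: int
    sha1: bytes  # SHA-1 digest of the file contents, always 20 bytes


def make_metadata(name: str, contents: bytes) -> ObjectMetadata:
    # The digest depends only on the contents, so identical data found
    # anywhere (server, cache, portable device) yields the same value.
    return ObjectMetadata(name, len(contents), hashlib.sha1(contents).digest())


meta = make_metadata("notes.txt", b"hello lookaside")
print(len(meta.sha1))  # 20
```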

[Figure: lookaside caching in the Coda architecture. Labels: client program; Venus (cache manager); Vice (servers); user level; kernel VFS with Coda FS, Ext2 FS, ISO 9660 FS and NFS; system call and return from syscall; network; portable storage with lookaside indexes.]

Lookaside Caching Flowchart

[Flowchart, without lookaside: open() -> fetch attributes -> cached? Yes: done. No: network fetch -> done.]

Lookaside Caching Flowchart

[Flowchart, with lookaside: open() -> fetch attributes + hash -> cached? Yes: done. No: found on portable device? Yes: done. No: network fetch -> done.]
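
The flowchart can be sketched as a short function. This is an illustration under stated assumptions, not Venus code: the server, the client cache and the portable device are plain dictionaries, cache validity is approximated by comparing hashes, and the names open_file and sha1 are hypothetical.

```python
import hashlib
from typing import Dict, Tuple


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def open_file(name: str,
              cache: Dict[str, bytes],
              server: Dict[str, bytes],
              portable: Dict[bytes, bytes]) -> Tuple[bytes, str]:
    """Return (contents, source), where source names the path taken."""
    # "fetch attributes + hash": a real client would issue an RPC here.
    expected = sha1(server[name])

    # cached? (assumption: a cached copy is valid if its hash matches)
    cached = cache.get(name)
    if cached is not None and sha1(cached) == expected:
        return cached, "cache"

    # portable? Look the hash up on the device and verify the contents,
    # so stale or corrupted data on the device is never used.
    candidate = portable.get(expected)
    if candidate is not None and sha1(candidate) == expected:
        cache[name] = candidate
        return candidate, "portable"

    # network fetch
    data = server[name]
    cache[name] = data
    return data, "network"


server = {"a.txt": b"alpha v2", "b.txt": b"beta"}
portable = {sha1(b"alpha v2"): b"alpha v2", sha1(b"beta v0"): b"beta v0"}
cache: Dict[str, bytes] = {}
print(open_file("a.txt", cache, server, portable)[1])  # portable
print(open_file("a.txt", cache, server, portable)[1])  # cache
print(open_file("b.txt", cache, server, portable)[1])  # network
```

The stale copy of b.txt on the device is simply never matched, because its hash differs from the one in the up-to-date meta-data.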

Implementation

Implemented lookaside caching in the Coda file system on Linux.
The implementation consists of four parts:

A small change to the client-server protocol
  ViceGetAttr() -> ViceGetAttrPlusSHA()
  ViceValidateAttrs() -> ViceValidateAttrsPlusSHA()
  If the server doesn’t support lookaside caching, the client falls back to using the original RPCs
A quick index check in the code path for handling a cache miss
A tool for generating lookaside indexes
  mkdb utility to generate the index file
  Index generated from a normal file system tree
  Lazy update process
    Allows users to change data
A set of user commands to include or exclude specific lookaside devices
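
As a rough illustration of what an index generator does, the sketch below walks a file system tree and maps each file's SHA-1 digest to its path relative to the root. It is not the mkdb utility and does not reproduce its index format, which the slides do not describe; build_index is a hypothetical name.

```python
import hashlib
import os
from typing import Dict


def build_index(root: str) -> Dict[str, str]:
    """Map hex SHA-1 digest -> path relative to root for every file under root."""
    index: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            digest = hashlib.sha1()
            with open(path, "rb") as f:
                # Read in blocks so large files do not need to fit in memory.
                for block in iter(lambda: f.read(65536), b""):
                    digest.update(block)
            # Keep the first path seen when several files share contents.
            index.setdefault(digest.hexdigest(), os.path.relpath(path, root))
    return index
```

Because the index is keyed by content hash, an index that has fallen behind the device contents can cause a missed lookup but not a wrong result, as long as the client verifies the hash of whatever it reads.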

Lookaside Commands on Client

Dynamic inclusion or exclusion of lookaside devices is done through user-level commands.
Multiple lookaside devices can be in use at the same time.
The devices are searched in order of inclusion.
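
The search-order rule can be sketched with a small registry. The class and method names below are hypothetical and do not correspond to the actual Coda user commands; each device is represented only by its index, a dictionary from hash to path.

```python
from typing import Dict, List, Optional, Tuple


class LookasideDevices:
    """Hypothetical registry of lookaside devices, searched in inclusion order."""

    def __init__(self) -> None:
        self._devices: List[Tuple[str, Dict[str, str]]] = []

    def include(self, name: str, index: Dict[str, str]) -> None:
        # Re-including a device moves it to the end of the search order.
        self.exclude(name)
        self._devices.append((name, index))

    def exclude(self, name: str) -> None:
        self._devices = [(n, i) for n, i in self._devices if n != name]

    def lookup(self, digest: str) -> Optional[Tuple[str, str]]:
        # The first device that was included and holds the hash wins.
        for name, index in self._devices:
            if digest in index:
                return name, index[digest]
        return None


devices = LookasideDevices()
devices.include("usb", {"aa": "/mnt/usb/f"})
devices.include("dvd", {"aa": "/mnt/dvd/f", "bb": "/mnt/dvd/g"})
print(devices.lookup("aa"))  # ('usb', '/mnt/usb/f')
```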

Evaluation

The performance of lookaside caching depends on
  The workload
  The network quality
  The overlap between data on the lookaside device and data accessed from the distributed file system

Three different benchmarks
  Kernel compile benchmark
  Virtual machine migration benchmark (ISR)
  Single-user trace replay benchmark

Kernel Compile

Benchmark Description
  The kernel being compiled is version 2.4.18.
  The kernel on the lookaside device varied.

[Table: Key characteristics of the Linux kernel versions used in the compilation benchmark]

Kernel Compile

Experiment Setup
Client
  3.0 GHz Pentium 4 processor, 2 GB SDRAM
  File cache size was large enough to prevent eviction during the experiments
  Operated in write-disconnected mode
  The client file cache was always cold at the start of an experiment.
  To discount the effect of a cold I/O buffer cache on the server, a warming run was done prior to each set of experiments.
File Server
  2.0 GHz Pentium 4 processor, 1 GB SDRAM
Both ran Red Hat 9.0 Linux and Coda 6.0.2, and were connected by 100 Mb/s Ethernet

Kernel Compile

Experiment Setup
Run at four different bandwidth settings
  100 Mb/s, 10 Mb/s, 1 Mb/s and 100 Kb/s
  NISTNet network router to control bandwidth
  No extra latency was added at 100 Mb/s and 10 Mb/s. For 1 Mb/s and 100 Kb/s, 10 ms and 100 ms of latency were added, respectively.
Lookaside device
  512 MB Hi-Speed USB flash memory keychain
  Manufacturer claim: read 48 Mb/s, write 36 Mb/s

[Table: Measured portable storage device performance]

Kernel Compile

Result
The performance metric is the elapsed time to compile the 2.4.18 kernel.
Negative performance improvement means that the overhead of lookaside caching exceeds its benefit.
Since the client cache manager already monitors bandwidth to servers, it would be simple to suppress lookaside at high bandwidths.
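
One way such suppression might look is a simple cutoff on the bandwidth estimate. The slides give no cutoff value, so the 10 Mb/s default below is an arbitrary placeholder, and should_use_lookaside is a hypothetical name rather than anything in Coda.

```python
def should_use_lookaside(estimated_bandwidth_bps: float,
                         cutoff_bps: float = 10_000_000) -> bool:
    """Consult lookaside devices only when the network is slower than the cutoff.

    cutoff_bps is an arbitrary placeholder, not a value from the paper.
    """
    return estimated_bandwidth_bps < cutoff_bps


# The four bandwidth settings used in the kernel compile benchmark.
for label, bps in [("100 Mb/s", 100e6), ("10 Mb/s", 10e6),
                   ("1 Mb/s", 1e6), ("100 Kb/s", 100e3)]:
    print(label, should_use_lookaside(bps))
```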

Internet Suspend/Resume

Benchmark Description
The ISR prototype layers VMware on Coda, and represents VM state as a tree of 256 KB files.
Part of the VM state can be copied to the device at suspend.
Common Desktop Application (CDA)
  Models an interactive Windows user.
  Visual Basic scripting drives Microsoft Office applications such as Word, Excel, PowerPoint, Access and Internet Explorer.
  113 independently-timed operations such as find-and-replace, open-document, and save-as-html.
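
Representing VM state as many 256 KB files means that a small change to the state invalidates only the affected files. The sketch below illustrates this with a byte buffer standing in for a disk image; it is not ISR code, and chunk_digests is a hypothetical helper.

```python
import hashlib
from typing import List

CHUNK_SIZE = 256 * 1024  # 256 KB, the file size used by the ISR prototype


def chunk_digests(image: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split an image into fixed-size chunks and hash each one."""
    return [hashlib.sha1(image[offset:offset + chunk_size]).hexdigest()
            for offset in range(0, len(image), chunk_size)]


image = bytearray(4 * CHUNK_SIZE)           # a 1 MB stand-in for VM state
before = chunk_digests(bytes(image))
image[2 * CHUNK_SIZE + 5] = 1               # modify one byte in the third chunk
after = chunk_digests(bytes(image))
changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
print(len(before), changed)  # 4 [2]
```

Only the chunk whose hash changed would need to come from the server; the other chunks can still be satisfied from a lookaside device that holds the older state.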

Internet Suspend/Resume

Experiment Setup
Clients
  2.0 GHz Pentium 4 processor, 1 GB RAM, Red Hat Linux 7.3
  VMware Workstation 3.1, 8 GB Coda file cache
  VM is configured to have 256 MB of RAM and 4 GB of disk, and runs Windows XP as the guest OS
Server
  1.2 GHz Pentium III Xeon processor, 1 GB RAM, Red Hat Linux 7.3
100 Mb/s Ethernet, NISTNet network emulator
A USB flash memory keychain is updated at suspend with the minimal state needed for resume. This is a single 41 MB file corresponding to the compressed physical memory image of the suspended virtual machine.
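
A back-of-the-envelope calculation shows why keeping the 41 MB file on the device matters on slow links. The sketch assumes 1 MB = 10^6 bytes, ignores latency and protocol overhead, and borrows the four bandwidth settings listed for the kernel compile benchmark, so the numbers are idealized lower bounds rather than measured results.

```python
FILE_BYTES = 41 * 10**6  # 41 MB memory image, taking 1 MB as 10^6 bytes

BANDWIDTHS_BPS = {
    "100 Mb/s": 100e6,
    "10 Mb/s": 10e6,
    "1 Mb/s": 1e6,
    "100 Kb/s": 100e3,
}


def ideal_transfer_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Transfer time with no latency or protocol overhead."""
    return size_bytes * 8 / bandwidth_bps


for label, bps in BANDWIDTHS_BPS.items():
    print(f"{label}: {ideal_transfer_seconds(FILE_BYTES, bps):.2f} s")
# 100 Mb/s: 3.28 s
# 10 Mb/s: 32.80 s
# 1 Mb/s: 328.00 s
# 100 Kb/s: 3280.00 s
```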

Internet Suspend/Resume

Results
Performance metrics
  Resume Latency (How slow is the resume step?)
  Total Operation Latency (After resume, how much is work slowed?)

[Figure: timeline with the points VM started, VM usable and End of Benchmark, and the intervals Resume Latency and Total Operation Latency.]

[Results: Resume Latency (1) and Total Operation Latency (2)]

(1) A single 41 MB file corresponding to the compressed physical memory image of the suspended virtual machine (USB flash memory)

(2) The VM state captured after installation of Windows XP and the Microsoft Office suite (DVD)

Trace Replay

Benchmark Description

Experiment Setup
  Same as the Kernel Compile experiment

Update Ops: the percentage of operations that change the file system state, such as mkdir, close-after-write, etc.

Working Set: the size of the data accessed during trace execution
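
The two trace characteristics can be computed from a trace as sketched below. The record format (operation, path, size in bytes) and the set of update operations are assumptions made for illustration; the slides name only mkdir and close-after-write and do not describe the trace format.

```python
from typing import Dict, List, Tuple

# Assumed subset of state-changing operations; the slides list only these two.
UPDATE_OPS = {"mkdir", "close-after-write"}

TraceRecord = Tuple[str, str, int]  # (operation, path, size in bytes)


def trace_summary(trace: List[TraceRecord]) -> Tuple[float, int]:
    """Return (update ops as a percentage, working set size in bytes)."""
    if not trace:
        return 0.0, 0
    updates = sum(1 for op, _path, _size in trace if op in UPDATE_OPS)
    # Count each distinct object once, at the largest size observed.
    sizes: Dict[str, int] = {}
    for _op, path, size in trace:
        sizes[path] = max(size, sizes.get(path, 0))
    return 100.0 * updates / len(trace), sum(sizes.values())


trace = [("open", "/a", 100), ("mkdir", "/d", 0),
         ("open", "/a", 100), ("close-after-write", "/b", 50)]
print(trace_summary(trace))  # (50.0, 150)
```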

Trace Replay

Results
Performance metric
  The time taken for trace replay completion
Vary the amount of data found on the device
The overhead of lookaside caching dominates
  Due to the large number of meta-data accesses

Broader Use of Lookaside Caching

Content-Addressable Storage (CAS)
  Experiment
    Extend the prototype implementation
    For the ISR benchmark, evaluate the performance benefit of using a LAN-attached CAS provider with the same contents as the DVD
    CAS provider is on a faster machine than the file server
Cooperative Caching
  A collection of dfs clients with mutual trust can export each other’s file caches as CAS providers.
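
A CAS provider can be thought of as a store that answers "give me the data with this hash". The sketch below is an in-memory stand-in, not the prototype's interface: CASProvider and fetch_from_peers are hypothetical names, and the peers are local objects rather than machines on a LAN.

```python
import hashlib
from typing import Dict, Iterable, Optional


class CASProvider:
    """Hypothetical in-memory content-addressable store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha1(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> Optional[bytes]:
        return self._blobs.get(digest)


def fetch_from_peers(digest: str, peers: Iterable[CASProvider]) -> Optional[bytes]:
    """Ask each peer in turn; accept data only if it hashes to the digest."""
    for peer in peers:
        data = peer.get(digest)
        if data is not None and hashlib.sha1(data).hexdigest() == digest:
            return data
    return None  # caller falls back to the file server


peer_a, peer_b = CASProvider(), CASProvider()
wanted = peer_b.put(b"shared chunk")
print(fetch_from_peers(wanted, [peer_a, peer_b]))  # b'shared chunk'
```

Verifying the hash on receipt protects against corrupted data, but cooperative caching still relies on the mutual trust mentioned above, since a peer learns which hashes its neighbors request.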

Discussion

[Figure: comparison of where VM state resides in Internet Suspend/Resume with Lookaside Caching and in SoulPad. Labels: VM state; part of VM state.]

If you were to commercialize this, which model would you prefer?