1 Mobile File Systems: Disconnected and Weakly Connected File Systems 3/29/2004 Richard Yang.

Mobile File Systems: Disconnected and Weakly Connected File Systems

3/29/2004

Richard Yang

Outline

Admin. and recap

Mobile file systems

Admin.

Project proposal due tonight at 11:59pm

at most one page
• what is the problem?
• why is the problem important?
• what are the major potential challenges?
• what is your methodology?

please send proposal to the TA
• david.goldenberg@yale.edu

TCP: congestion control in the Internet

TCP is window-based
• to use the stability of self-clocking

TCP adjusts the congestion window using the AIMD algorithm
• AIMD is a special case of the simplest possible control rules
• AIMD constantly probes for network state
– achieves dynamic equilibrium
– converges to a fair state

Throughput of TCP is inversely proportional to the square root of the packet loss rate

In wireless networks, losses due to corruption are interpreted as congestion indications, and thus slow down transmission

Indirect TCP splits the connection

Snoop TCP preserves end-to-end semantics

Outline

Admin. and recap

Mobile file systems

Server (maintains a collection of files/objects)

Client (inserts, deletes and updates files/objects by connecting to the server)

Network File Systems

NFS assumes “strong” connectivity

Motivation

Mobile users must be able to work on files (on remote file servers) while disconnected/weakly connected, e.g. take your laptop on a trip

The Problems Caused by Disconnection

Read miss stalls progress (the user has to stop working)

Delayed writes may cause inconsistency if concurrent writes by multiple users are allowed

To reduce read misses, persistently store files in local caches; this is called hoarding

The idea of hoarding was proposed in the CMU CODA project http://www.coda.cs.cmu.edu/

Discussion: what problems should the CODA system address?

Using Hoarding to Reduce Read Miss

A volume is the unit of management (hoarding), e.g., the home directory of one user
• a volume is smaller than a disk partition
• typical volume size is 10MB

Each volume is a partial subtree of the name space
% cfs makemount u.smith /coda/usr/smith

CODA Groups Files into Volumes

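States of the CODA client and the transitions between them:

Hoarding: hoard data in anticipation of disconnection; prioritized cache management

Emulation: persistent storage; client modification log

Reintegration: log replay; resolving conflicts (write/write); seek user feedback when in doubt

Hoarding to Emulation: disconnection

Emulation to Reintegration: physical reconnection

Reintegration to Hoarding: logical reconnection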

CODA Client (Venus)
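The three client states can be sketched as a small state machine. This is only an illustration under simplifying assumptions: the class `Venus` here is a toy, the server is a plain dictionary, the method names are hypothetical, and conflict resolution during reintegration is left out.

```python
from enum import Enum


class State(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"


class Venus:
    """Toy model of the client cache manager; the server is a plain dict."""

    def __init__(self, server):
        self.server = server      # path -> contents (stand-in for a file server)
        self.cache = {}           # hoarded copies kept on the client
        self.cml = []             # client modification log
        self.state = State.HOARDING

    def hoard(self, path):
        # While connected, copy a file into the local cache ahead of time.
        self.cache[path] = self.server[path]

    def disconnect(self):
        self.state = State.EMULATION

    def read(self, path):
        if path in self.cache:
            return self.cache[path]
        if self.state is State.EMULATION:
            # A read miss while disconnected cannot be served.
            raise LookupError(f"read miss on {path} while disconnected")
        self.hoard(path)
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        if self.state is State.EMULATION:
            self.cml.append((path, data))   # log the update for later replay
        else:
            self.server[path] = data        # connected: write through

    def reconnect(self):
        # Physical reconnection: replay the log (conflict handling omitted).
        self.state = State.REINTEGRATION
        for path, data in self.cml:
            self.server[path] = data
        self.cml.clear()
        # Logical reconnection: back to normal operation.
        self.state = State.HOARDING


path = "/coda/usr/smith/notes.txt"
server = {path: "v1"}
venus = Venus(server)
venus.hoard(path)
venus.disconnect()
venus.write(path, "v2")          # logged locally, not yet on the server
before = server[path]
pending = list(venus.cml)
venus.reconnect()                # replays the log to the server
```

While disconnected, the write only reaches the local cache and the log; the server copy changes when the log is replayed on reconnection.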

Flow Steps of Client Operations

The figure shows the case in which the network is connected.

File Servers and Replicated Servers

CODA uses replicated file servers to improve reliability

A volume is stored by a group of servers called its Volume Storage Group (VSG)

Read/write policy: read-one, write-many

AVSG (accessible VSG): all VSG members that are currently accessible

Read: Serving a Cache Miss

A read also installs a callback at each server so that the server will call back if the content changes
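A minimal sketch of read-one, write-many with callbacks follows. The classes `Server` and `Client` and all method names are hypothetical, and server failures and version checks are ignored; it only illustrates how a callback lets a server invalidate a cached copy.

```python
class Server:
    def __init__(self):
        self.files = {}        # path -> contents
        self.callbacks = {}    # path -> set of clients holding a callback

    def install_callback(self, path, client):
        self.callbacks.setdefault(path, set()).add(client)

    def fetch(self, path, client):
        # Serving a read also records a callback for this client.
        self.install_callback(path, client)
        return self.files[path]

    def store(self, path, data, writer):
        self.files[path] = data
        # Call back every other client that caches this file.
        for client in self.callbacks.pop(path, set()):
            if client is not writer:
                client.invalidate(path)
        self.install_callback(path, writer)


class Client:
    def __init__(self, servers):
        self.servers = servers   # the accessible servers of the volume
        self.cache = {}

    def read(self, path):
        if path not in self.cache:                      # cache miss
            data = self.servers[0].fetch(path, self)    # read from one server
            for s in self.servers[1:]:
                s.install_callback(path, self)          # callback at the rest
            self.cache[path] = data
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        for s in self.servers:                          # write to all of them
            s.store(path, data, self)

    def invalidate(self, path):
        self.cache.pop(path, None)


s1, s2 = Server(), Server()
alice, bob = Client([s1, s2]), Client([s1, s2])
alice.write("f", "v1")
first = bob.read("f")            # cached by bob, callbacks installed
alice.write("f", "v2")           # servers call bob back
stale_copy_dropped = "f" not in bob.cache
second = bob.read("f")           # miss again, fetches the new contents
```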

Two Phase Update

COP1 (CODA optimistic phase 1): sends the write

COP2: sends the status from all servers

Summary: CODA

Outline

Admin. and recap

Mobile file systems: dealing with disconnection

CODA

SEER: automatic prediction of related files to avoid manual user configuration of hoarding

SEER: A Predictive Hoarding System

Views user activities as composed of projects rather than individual files

Predicts the files in a project and fetches them together

Discussion: how do you predict all of the files a project may use?

Basic Idea of SEER: Semantic Distance

Quantifies the user's intuition about the relationship between files
• a smaller distance means a closer relationship

Infers relationships

static (done by an external investigator), e.g.,
• observes directory structure/membership
• observes naming conventions
• #include in a program

dynamic
• watches the user's behavior

Lifetime Semantic Distance

Looks at file open/close (not file content!)

Lifetime semantic distance: the lifetime semantic distance between an open of file A and an open of file B is defined as 0 if A has not been closed before B is opened, and as the number of intervening file opens (including the open of B) otherwise

This yields multiple lifetime semantic distances between pairs of events on two files
• SEER needs a distance between two files, not between events
• it uses the geometric mean to convert them to a single distance

Sample file access sequence

Semantic distance
- AB, AC is 0
- AD is 3
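The definition can be sketched in a few lines. The trace below is made up (the slide's own access sequence is not reproduced here) but it is chosen so that it gives the same values, AB = AC = 0 and AD = 3. The function name and the trace encoding are assumptions, the sketch assumes a file is not reopened while it is already open, and the geometric-mean step that reduces many event distances to one file distance is not shown.

```python
def lifetime_distance(trace, i, j):
    """Distance from the open at trace[i] to the later open at trace[j].

    trace is a list of (op, file) pairs with op in {"open", "close"}.
    """
    op_a, file_a = trace[i]
    op_b, _ = trace[j]
    assert op_a == "open" and op_b == "open" and i < j
    between = trace[i + 1 : j]
    # 0 if the first file is still open when the second one is opened.
    if ("close", file_a) not in between:
        return 0
    # Otherwise count the opens after trace[i], up to and including trace[j].
    return sum(1 for op, _ in trace[i + 1 : j + 1] if op == "open")


trace = [
    ("open", "A"),   # index 0
    ("open", "B"),   # index 1
    ("close", "B"),
    ("open", "C"),   # index 3
    ("close", "A"),
    ("close", "C"),
    ("open", "D"),   # index 6
    ("close", "D"),
]
d_ab = lifetime_distance(trace, 0, 1)   # A still open when B is opened
d_ac = lifetime_distance(trace, 0, 3)   # A still open when C is opened
d_ad = lifetime_distance(trace, 0, 6)   # A closed; opens of B, C, D intervene
```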

Basic Idea of SEER: Clustering Algorithm

Based on an algorithm by Jarvis and Patrick

Allows overlapping clusters

Steps

Calculate the n nearest neighbors of each file

Phase 1: if two points (files here) have at least kn overlapping neighbors, combine their clusters into one

Phase 2: if two points have at least kf but fewer than kn overlapping neighbors, overlap the clusters, i.e., add each point to the other's cluster

Relation        Action
kn ≤ x          Combine clusters
kf ≤ x < kn     Overlap clusters
x < kf          No action

(x is the number of shared neighbors)

Summary of clustering algorithm

Example

Seven files, A-G: {A} {B} {C} {D} {E} {F} {G}

Phase 1: {A, B}, then {A, B, C}; {D, E} and {F, G}, then {D, E, F, G}

Phase 2: two pairs, {A, C} and {C, D}
• {A, C}: same cluster already
• {C, D}: overlap clusters

Final result: {A, B, C, D} {C, D, E, F, G}

Table: number of shared neighbors for each pair of files A-G (From/To matrix, with entries marked kn or kf)
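The two phases can be sketched as follows. This is an illustration rather than SEER's implementation: the function takes precomputed shared-neighbor counts instead of deriving them from nearest-neighbor lists, and the counts and thresholds in the usage are invented so that the outcome matches the final result of the seven-file example.

```python
def cluster(files, shared, kn, kf):
    """shared maps a pair (a, b) to its number of shared near neighbors."""
    # Phase 1: merge the clusters of every pair with at least kn shared neighbors.
    parent = {f: f for f in files}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]   # path compression
            f = parent[f]
        return f

    for (a, b), x in shared.items():
        if x >= kn:
            parent[find(a)] = find(b)

    clusters = {}
    for f in files:
        clusters.setdefault(find(f), set()).add(f)

    # Phase 2: for kf <= x < kn, add each file to the other's cluster
    # without merging the two clusters.
    for (a, b), x in shared.items():
        if kf <= x < kn and find(a) != find(b):
            clusters[find(a)].add(b)
            clusters[find(b)].add(a)

    return sorted(sorted(c) for c in clusters.values())


# Hypothetical counts: five pairs reach kn, two pairs fall between kf and kn.
shared = {
    ("A", "B"): 6, ("B", "C"): 5, ("D", "E"): 7, ("F", "G"): 5, ("E", "F"): 5,
    ("A", "C"): 3, ("C", "D"): 4,
}
result = cluster("ABCDEFG", shared, kn=5, kf=3)
```

The pair (A, C) changes nothing because both files already share a cluster after phase 1, while (C, D) places C and D in both clusters.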

Using Both Lifetime Semantic Distance and the Input of External Investigator

Essentially gives application specific info

Example: large directory distance => looser relationship
• subtract the directory distance from the shared-neighbor count

Real World Anomalies: Special Cases

Many special cases
• the authors use a heuristic to solve each

Shared libraries
• e.g., library X might cause unwanted clustering
• heuristic: files that represent more than a certain percentage of all references (1%) are marked as “frequently-referenced”
• eliminate them from the calculation
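A minimal sketch of the frequently-referenced filter, assuming the reference history is available as a flat list of file names. The function names and the sample trace are hypothetical; only the 1% threshold comes from the slide.

```python
from collections import Counter


def frequently_referenced(references, threshold=0.01):
    """Files that account for more than `threshold` of all references."""
    counts = Counter(references)
    total = len(references)
    return {f for f, c in counts.items() if c / total > threshold}


def filter_trace(references, threshold=0.01):
    # Drop frequently-referenced files before computing distances.
    hot = frequently_referenced(references, threshold)
    return [f for f in references if f not in hot]


# Hypothetical trace: one shared library dominates, 150 files appear once each.
references = ["libX.so"] * 50 + [f"file{i}" for i in range(150)]
hot = frequently_referenced(references)
filtered = filter_trace(references)
```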

Critical files (e.g., startup files)
• rarely accessed but important
• use heuristics and hoard them
– a special control file that specifies such files
– detect by name, e.g., .login etc.

Temporary files (e.g., in /tmp)
• transient, and don't reflect real relationships
• might displace other important files from the n closest
• heuristic: ignore files in /tmp etc. completely

Simultaneous access, e.g., reading mail and compiling code
• independent streams are intermixed!
• maintain reference history on a per-process basis

More Special Cases …

Performance Evaluation: Methodology

Input: trace-driven simulation

Measure: miss-free hoard size

• size a hoard would have to be to ensure no misses (remember our goal!)
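One way to compute this measure from a trace is sketched below. It assumes the hoard is filled strictly in priority order, so the hoard must reach down to the lowest-ranked file that is actually referenced; the function name, the ranking, and the sizes are hypothetical.

```python
def miss_free_hoard_size(ranked_files, sizes, referenced):
    """Smallest hoard, filled in rank order, that holds every referenced file.

    ranked_files: hoard candidates, highest priority first.
    sizes: file -> size (any unit).
    referenced: files actually used while disconnected.
    """
    needed = set(referenced)
    total = 0
    for f in ranked_files:
        if not needed:
            break                 # everything referenced is already hoarded
        total += sizes[f]
        needed.discard(f)
    if needed:
        raise ValueError("some referenced files were never ranked")
    return total


ranked = ["a", "b", "c", "d"]
sizes = {"a": 10, "b": 20, "c": 5, "d": 40}
used = {"a", "c"}
working_set = sum(sizes[f] for f in used)             # 15
hoard = miss_free_hoard_size(ranked, sizes, used)     # a + b + c
```

The gap between the two numbers comes from files such as `b` that are hoarded ahead of a needed file but never used.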

Results

Graph: sorted working set sizes

SEER's miss-free hoard size is consistently slightly larger than the working set size

Outline

Mobile file systems: dealing with disconnection

CODA: hoarding

SEER: automatic prediction of related files to avoid manual user configuration of hoarding

Bayou: automatic conflict update

Bayou: Managing Update Conflicts

Basic idea: application specific conflict detection and update

Two mechanisms for automatic conflict detection and resolution
• dependency check
• merge procedure

Bayou Write Operation: An Example
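A write that carries a dependency check and a merge procedure might be sketched as below. This is a toy under stated assumptions, not Bayou's interface: the replica is a plain dictionary, the room-reservation scenario and all names (`bayou_write`, `reserve`) are hypothetical, and logging, ordering, and propagation between replicas are ignored.

```python
def bayou_write(db, update, dep_check, expected, merge_proc):
    """Apply one write to a replica's data (db is a plain dict here)."""
    if dep_check(db) == expected:
        update(db)            # no conflict detected: apply the update
        return "applied"
    merge_proc(db)            # conflict: let the application resolve it
    return "merged"


def reserve(slot, who, alternates):
    """Build the parts of a room-reservation write."""

    def update(db):
        db[slot] = who

    def dep_check(db):
        return slot in db     # query: is the slot already taken?

    def merge_proc(db):
        # Application-specific resolution: fall back to a free alternate.
        for alt in alternates:
            if alt not in db:
                db[alt] = who
                return

    return update, dep_check, False, merge_proc


db = {"9am": "alice"}
first = bayou_write(db, *reserve("9am", "bob", ["10am", "11am"]))
second = bayou_write(db, *reserve("11am", "carol", []))
```

Bob's request conflicts with Alice's booking, so his merge procedure moves him to 10am; Carol's request passes its dependency check and is applied directly.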

Outline

Mobile file systems: dealing with disconnection

CODA: hoarding

SEER: automatic prediction of related files to avoid manual user configuration of hoarding

Bayou: automatic conflict update

Mobile file systems: dealing with low bandwidth

LBFS: efficient file comparison and merging

Motivation

The CODA system assumes that modifications are kept as logs (CML)
• a user sends the logs to the servers to update them

If the storage of a client is limited, it may not be able to save logs
• then, upon reconnection, the cache manager needs to find the difference between the stored file and its local cached copy
• the same problem exists for the rsync tool!

Question: how to efficiently find the differences between two remote files (when the network connection is slow)?

LBFS: Low-Bandwidth File System

Break Files into chunks and transfer only modified chunks

Fixed chunk size does not work well; why?

Flexible Chunk Size

Compute the hash value of every 48-byte block
• if the hash value equals a magic value, it is a chunk boundary
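A minimal sketch of content-defined chunking along these lines is shown below. It uses CRC32 from Python's `zlib` as a stand-in hash (LBFS itself uses Rabin fingerprints), and the mask and magic value are hypothetical choices; only the 48-byte window comes from the slide. Minimum and maximum chunk sizes are left out.

```python
import random
import zlib

WINDOW = 48     # bytes hashed at each position, as on the slide
MASK = 0xFF     # hypothetical: compare only the low 8 bits of the hash
MAGIC = 0x00    # hypothetical magic value


def chunk(data):
    """Split data at every position whose preceding 48-byte window hashes to MAGIC."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        window = data[end - WINDOW : end]
        if (zlib.crc32(window) & MASK) == MAGIC:
            chunks.append(data[start:end])   # boundary right after this window
            start = end
    if start < len(data):
        chunks.append(data[start:])          # trailing chunk without a boundary
    return chunks


rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(20000))
new = b"X" + old                             # one byte inserted at the front
old_chunks, new_chunks = chunk(old), chunk(new)
reusable = [c for c in old_chunks if c in set(new_chunks)]
```

Because a boundary depends only on the 48 bytes before it, an insertion shifts the later boundaries along with the data, so every chunk after the first one is unchanged and does not need to be sent again. With fixed-size chunks the same one-byte insertion would change every chunk.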
