Naming (1) Chapter 4. Chapter 4 topics What’s in a name? Approaches for naming schemes Directories...

Post on 02-Jan-2016

Naming (1)

Chapter 4

Chapter 4 topics

• What’s in a name?

• Approaches for naming schemes

• Directories and location services

• Distributed garbage collection

Name Space

• Name Space is a general term used for the “space” of all possible names using given rules or constructions.

• For example, suppose “names” are four-letter words with no digits or other characters. With a 26-letter alphabet, the name space contains 26^4 = 456,976 names, so that is how many objects can be named in this name space.
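The size of such a name space can be checked with a few lines of Python. This is only a sketch: it assumes the "letters" are the 26 case-insensitive letters a-z, and the function name is made up for illustration.

```python
import string

# Assumption: "letters" means the 26 case-insensitive letters a-z.
ALPHABET = string.ascii_lowercase
NAME_LENGTH = 4


def name_space_size(alphabet_size: int, name_length: int) -> int:
    """Number of distinct fixed-length names over an alphabet."""
    return alphabet_size ** name_length


size = name_space_size(len(ALPHABET), NAME_LENGTH)
print(size)  # 456976
```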

Name Spaces

Tanenbaum uses “name space” to mean a naming graph with a single root node that can store all the names in the name space.

What is a Name?

A name is an identifier that:

• Identifies a resource
– Uniquely?
– Does it describe the resource?

• Enables us to locate that resource
– Directly?
– With help?

• How is the name used?
– To disambiguate? To access? To locate?

Names

• Must humans remember or recognize it?

• Is the resource static?
– Never moves
– Change in location should change the name
– Resource may move
– Resource is mobile

• Name vs Identifier vs Address

Approaches to Naming

• Globally unique identifier
– Example: Ethernet address
– Solves identification, but not description or location

• Hierarchically assigned globally unique identifier (hierarchy is location-based)
– Examples: telephone number, IP address
– Solves identification, not description
– Helps with location

Approaches to Naming

• Hierarchically assigned name (hierarchy is description-based)
– Examples: Domain Name System (DNS), URL
– Solves identification
– Helps with description
– Still has problems with location

• Globally unique name
– Example: TCP/IP protocol ports
– Extensibility problems

URI, URL, URN

URI: Uniform Resource Identifier
– IETF meta-standard
– Defines naming schemes / protocols
– Each naming scheme has its own mechanism

URL: Uniform Resource Locator
– Uses DNS to map to a host
– The host knows how to map the remainder to a resource

URN: Uniform Resource Name
– Idea: a permanent URL

Naming: Why an Issue for Application Developers?

DNS is a widely accepted standard
– Only names machines
– Doesn’t handle mobility

URI / URN will become standard
– Can be descriptive
– Globally unique, uses a registry
– Persistent
– But expensive to create

Distributed Database Example: R*

R* was developed at IBM Almaden Research and was one of the first distributed relational databases.

Wanted mobility of resources
– Supports fault tolerance
– But movement is rare

Performance is critical

Solution: two components to a name
– Unique ID assigned by the “birthplace”
– Local catalog maps the ID to:
• Birthplace (maintains current location)
• Presumed current location
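The lookup described on this slide can be sketched as follows. This is a toy model, not R* code: the class and function names (`ObjectId`, `Site`, `move`, `locate`) and the site names A, B, C are invented for illustration. A site first tries its presumed location and falls back to the birthplace only when that hint turns out to be stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectId:
    birthplace: str  # site that created the object
    serial: int      # number unique within the birthplace


class Site:
    def __init__(self, name):
        self.name = name
        self.stored = set()    # ObjectIds currently stored at this site
        self.born_here = {}    # ObjectId -> current site (kept up to date)
        self.catalog = {}      # ObjectId -> presumed current site (a hint)
        self._next_serial = 0

    def create(self):
        oid = ObjectId(self.name, self._next_serial)
        self._next_serial += 1
        self.stored.add(oid)
        self.born_here[oid] = self.name
        return oid


def move(sites, oid, src, dst):
    """Move an object and tell only its birthplace about the new location."""
    sites[src].stored.remove(oid)
    sites[dst].stored.add(oid)
    sites[oid.birthplace].born_here[oid] = dst


def locate(sites, asker, oid):
    """Return (site holding the object, whether the local hint was right)."""
    site = sites[asker]
    guess = site.catalog.get(oid, oid.birthplace)
    if oid in sites[guess].stored:
        site.catalog[oid] = guess
        return guess, True
    actual = sites[oid.birthplace].born_here[oid]  # fall back to the birthplace
    site.catalog[oid] = actual                     # refresh the stale hint
    return actual, False


sites = {name: Site(name) for name in ("A", "B", "C")}
oid = sites["A"].create()
first = locate(sites, "C", oid)    # hint defaults to the birthplace
move(sites, oid, "A", "B")
second = locate(sites, "C", oid)   # hint is stale, the birthplace is asked
third = locate(sites, "C", oid)    # hint was refreshed by the previous call
```

Because movement is rare, the common case is a single lookup at the presumed location; the extra trip to the birthplace is paid only after a move.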

Security Considerations

Does the name give away information?
– Social Security Numbers
– URL
– Batched IDs (e.g., Ethernet)
– Sequentially assigned IDs

Solution: define what the name SHOULD do
– Ensure it meets the goals
– Look for reasons it doesn’t

Directories

• Unless you use physical locations for names and objects never move, you will need directories.

• How to organize?

• Who uses it? How often?

• How to modify (when object moves)?

X.500: What is it?

Goal: Global “white pages”
– Look up anyone, anywhere
– Developed by the telecommunications industry
– ISO standard directory for OSI networks

Idea: Distributed Directory
– Application uses a Directory User Agent to access a Directory Access Point

Directory Information Base (X.501)

Tree structure
– Root is the entire directory
– Levels are “groups”
• Country
• Organization
• Individual

Entry structure
– Unique name
• Built from the tree
– Attributes: type/value pairs
– Schema enforces type rules

Alias entries
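How a unique name is built from the tree, and how an alias entry points at another entry, can be sketched like this. It is a simplified model rather than an X.500 implementation: the `Entry` class, the slash-separated root-first notation, and the sample values ("Example Org", "Alice Example", the phone number) are assumptions made for illustration. The attribute types C, O, CN and telephoneNumber are standard directory attribute types.

```python
class Entry:
    def __init__(self, attr_type, value, parent=None, alias_of=None):
        self.rdn = (attr_type, value)   # naming attribute for this level
        self.parent = parent            # None means directly under the root
        self.alias_of = alias_of        # set only for alias entries
        self.attributes = {attr_type: value}  # type/value pairs

    def distinguished_name(self):
        """Concatenate the naming attributes on the path from the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(f"{node.rdn[0]}={node.rdn[1]}")
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def dereference(self):
        """Follow alias links until a real entry is reached."""
        node = self
        while node.alias_of is not None:
            node = node.alias_of
        return node


# Hypothetical sample data: country -> organization -> individual.
country = Entry("C", "US")
org = Entry("O", "Example Org", parent=country)
person = Entry("CN", "Alice Example", parent=org)
person.attributes["telephoneNumber"] = "555-0100"
alias = Entry("CN", "Alice", parent=org, alias_of=person)

print(person.distinguished_name())  # /C=US/O=Example Org/CN=Alice Example
```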

Linking and Mounting (book)

Position in hierarchy affects performance - search time

Name Space Distribution

An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

Implementation of Name Resolution (1)

Iterative name resolution.

Implementation of Name Resolution (2)

Recursive name resolution. Root name server does more work, but can cache intermediate results for future requests.
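The difference between the two styles can be sketched with toy name servers. Everything here is hypothetical: the `NameServer` class, the three-level tree, the name and the address (taken from a documentation-only range). The path is written root-first, the reverse of how a DNS name is read. In iterative mode the client contacts every server itself; in recursive mode each server forwards the rest of the path and caches the answer.

```python
class NameServer:
    def __init__(self, label):
        self.label = label
        self.children = {}    # label -> NameServer for the next level
        self.addresses = {}   # label -> address, for names this server owns
        self.cache = {}       # remaining path -> address (recursive mode only)
        self.requests = 0     # how many requests this server has handled

    def lookup(self, label):
        """Iterative step: resolve one label to an address or the next server."""
        self.requests += 1
        if label in self.addresses:
            return self.addresses[label]
        return self.children[label]

    def resolve_recursive(self, path):
        """Resolve the whole remaining path, caching the result."""
        self.requests += 1
        key = tuple(path)
        if key not in self.cache:
            head, rest = path[0], path[1:]
            if rest:
                self.cache[key] = self.children[head].resolve_recursive(rest)
            else:
                self.cache[key] = self.addresses[head]
        return self.cache[key]


def resolve_iterative(root, path):
    """The client walks the tree itself, one server per label."""
    server = root
    for label in path[:-1]:
        server = server.lookup(label)
    return server.lookup(path[-1])


def build_tree():
    root, org, example = NameServer("root"), NameServer("org"), NameServer("example")
    root.children["org"] = org
    org.children["example"] = example
    example.addresses["www"] = "192.0.2.10"
    return root, org, example


path = ["org", "example", "www"]

root_i, org_i, example_i = build_tree()
iterative_answers = [resolve_iterative(root_i, path) for _ in range(2)]

root_r, org_r, example_r = build_tree()
recursive_answers = [root_r.resolve_recursive(path) for _ in range(2)]
```

After two lookups of the same name, every server in the iterative tree has handled two requests, while in the recursive tree the second lookup is answered from the root's cache and the lower servers are not contacted again.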

Implementation of Name Resolution (3)

The comparison between recursive and iterative name resolution with respect to communication costs.

Naming versus Locating Entities

a) A traditional name service has a direct, single-level mapping between names and addresses. This works if addresses do not change frequently and names are identifiers.

b) Two-level mapping: a name service maps a human-friendly name to an identifier, and a location service maps the identifier to an address.
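A minimal sketch of the two-level scheme in (b), assuming two in-memory tables. The class names, the identifier "id-42" and the addresses are hypothetical. The point is that a move changes only the location service; the name-to-identifier binding stays stable.

```python
class NameService:
    """Maps human-friendly names to stable identifiers."""

    def __init__(self):
        self._ids = {}

    def register(self, name, entity_id):
        self._ids[name] = entity_id

    def lookup(self, name):
        return self._ids[name]


class LocationService:
    """Maps identifiers to current addresses."""

    def __init__(self):
        self._addresses = {}

    def update(self, entity_id, address):
        self._addresses[entity_id] = address

    def lookup(self, entity_id):
        return self._addresses[entity_id]


def resolve(name, names, locations):
    """Two-step resolution: name -> identifier -> address."""
    return locations.lookup(names.lookup(name))


names = NameService()
locations = LocationService()
names.register("printer", "id-42")
locations.update("id-42", "192.0.2.7")
before_move = resolve("printer", names, locations)

# The entity moves: only the location service is updated.
locations.update("id-42", "198.51.100.3")
after_move = resolve("printer", names, locations)
```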

Home-Based Approaches

The principle of Mobile IP.

4.3 Removing Unreferenced Entities

• Called Distributed Garbage Collection

• Many languages (e.g., Java) and distributed middleware systems provide for the recycling of memory objects that are no longer referenced.

• The discussion uses the terminology of Java RMI, with skeletons and proxies.

The Problem of Unreferenced Objects

An example of a graph representing objects containing references to each other.

Garbage Collection in a Centralized System

Simple solution: stop all allocation and deallocation of objects, then:

1. Mark all objects as unreferenced.

2. Go through all pointers and mark the objects they reach as referenced.

3. Delete the objects still marked unreferenced and resume processing.

Works if all objects and references are on the same machine, but time consuming.
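The three steps can be sketched as a small mark-and-sweep routine. This assumes the heap is given as a dictionary from object name to the names it references, and that the roots (pointers held by the program itself) are known; both are simplifications for illustration.

```python
def mark_and_sweep(heap, roots):
    """heap: dict mapping object name -> list of names it references.
    roots: names referenced directly by the program.
    Deletes unreachable objects from heap and returns their names."""
    # Step 1: mark every object as unreferenced.
    referenced = {name: False for name in heap}

    # Step 2: follow all pointers from the roots and mark what is reached.
    stack = list(roots)
    while stack:
        name = stack.pop()
        if not referenced[name]:
            referenced[name] = True
            stack.extend(heap[name])

    # Step 3: delete everything still marked unreferenced.
    garbage = {name for name, reached in referenced.items() if not reached}
    for name in garbage:
        del heap[name]
    return garbage


# "c" and "d" reference each other but cannot be reached from the root "a".
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
collected = mark_and_sweep(heap, roots=["a"])
```

Note that the unreachable cycle c, d is collected here, because reachability is computed from the roots rather than from per-object counters.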

A Little More Efficient

Reference Counting (not distributed)

The object maintains a reference counter.

• When an object is created with a reference pointer, its reference counter is set to one.

• The reference counter is increased or decreased as additional pointers are created or removed.

• If the reference counter (RC) goes to zero, the object is GC’ed.

Problem with this scheme: unreachable objects that reference each other (a cycle) never reach a count of zero, so they are never collected.
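A short sketch of (non-distributed) reference counting that also shows the cycle problem. The `RefCountedHeap` class and its method names are invented for illustration; each object starts with a count of one for the pointer held by its creator.

```python
class RefCountedHeap:
    def __init__(self):
        self.rc = {}    # object name -> reference counter
        self.out = {}   # object name -> names it points to

    def new(self, name):
        # Created with one reference: the pointer held by the creator.
        self.rc[name] = 1
        self.out[name] = []

    def add_ref(self, src, dst):
        """Object src gets a pointer to dst."""
        self.out[src].append(dst)
        self.rc[dst] += 1

    def drop_creator_pointer(self, name):
        """The creator discards its pointer to the object."""
        self._decrement(name)

    def _decrement(self, name):
        self.rc[name] -= 1
        if self.rc[name] == 0:
            # Free the object and release every pointer it held.
            targets = self.out.pop(name)
            del self.rc[name]
            for target in targets:
                self._decrement(target)


heap = RefCountedHeap()

# A chain a -> b is reclaimed once the creator's pointers are dropped.
heap.new("a")
heap.new("b")
heap.add_ref("a", "b")
heap.drop_creator_pointer("b")
heap.drop_creator_pointer("a")

# A cycle x <-> y is never reclaimed: each keeps the other's count at one.
heap.new("x")
heap.new("y")
heap.add_ref("x", "y")
heap.add_ref("y", "x")
heap.drop_creator_pointer("x")
heap.drop_creator_pointer("y")
```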

Distributed Garbage Collection

• In a distributed system (DS), the objects and pointers may be on different machines. It is difficult to stop processing on one machine, let alone across a whole DS.

• Proxies and skeletons: When a distributed object is created, a skeleton is created for it. When it is referenced, a proxy is created at the client machine (referencer) to talk to the skeleton at the object site.

Proxies and Skeletons

[Figure: a process holding a proxy, connected to the skeleton of the object.]

Distributed Reference Counting (1)

• Where are the reference counters maintained? How can the RC be increased and decreased from a remote proxy?

• Solution: The object skeleton maintains the RC. Messages that update the RC must be protected against duplication or loss.

Reference Counting (2)

The problem of maintaining a proper reference count in the presence of unreliable communication.
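One common way to protect the counter is to retransmit until acknowledged and have the skeleton ignore messages it has already applied. The sketch below simulates a lost acknowledgement. It is an illustration only: the `Skeleton` class, the message id format "P1:1" and the `ack_lost` list are assumptions, not part of Java RMI or any particular middleware.

```python
class Skeleton:
    def __init__(self, deduplicate=True):
        self.rc = 0
        self.deduplicate = deduplicate
        self.seen = set()   # ids of messages already applied

    def handle(self, msg_id, delta):
        if self.deduplicate and msg_id in self.seen:
            return "ack"    # duplicate: acknowledge but do not re-apply
        self.seen.add(msg_id)
        self.rc += delta
        return "ack"


def send_until_acked(skeleton, msg_id, delta, ack_lost):
    """Retransmit the same message until an ack gets through.
    ack_lost lists, per attempt, whether the ack is dropped on the way back."""
    for attempt, lost in enumerate(ack_lost, start=1):
        skeleton.handle(msg_id, delta)
        if not lost:
            return attempt
    raise RuntimeError("no acknowledgement received")


safe = Skeleton(deduplicate=True)
naive = Skeleton(deduplicate=False)
for skeleton in (safe, naive):
    # The first ack is lost, so the proxy sends the increment twice.
    send_until_acked(skeleton, "P1:1", +1, ack_lost=[True, False])
```

With deduplication the retransmitted increment is counted once; without it the counter ends up one too high and the object would never be collected.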

Reference Counting (3)

(a) Copying a reference to another process and incrementing the counter too late.

(b) A solution, but increased message traffic.

Advanced Reference Counting (1)

• Idea: Eliminate race between increase and decrease messages by having only messages which decrease the count.

• Also, make it possible to copy references without communicating with object. This has advantages and disadvantages.

Advanced Reference Counting (2)

1. When object O is created, it has a TOTAL WEIGHT, TW, and a PARTIAL WEIGHT, PW. Initially TW = PW = 2^N.

2. When a reference is created, half the PW of the object (TW and PW are stored at the skeleton) is assigned to the proxy at process P1.

3. If a remote reference is duplicated, half the PW at P1 is passed to P2. (The skeleton is not aware of this.)

4. If a remote reference is passed on (moved rather than copied) to P2, P2 gets all of the PW. Again, object O doesn’t need to know.

5. When a reference is destroyed, a message is sent to the object’s skeleton to decrement TW by that reference’s PW.

6. When O’s TW equals the PW stored at its skeleton, no remote references remain and the object can be GC’ed.
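The weight bookkeeping in the steps above can be sketched as follows. This is a single-process simulation with invented class names; message passing is replaced by direct method calls, and N = 7 is an arbitrary choice. The invariant is that TW always equals the skeleton's PW plus the PWs of all live proxies.

```python
class Skeleton:
    def __init__(self, n):
        self.total = 2 ** n     # TW
        self.partial = 2 ** n   # PW held by the skeleton itself

    def create_reference(self):
        """Step 2: hand half of the skeleton's PW to a new proxy."""
        if self.partial < 2:
            raise ValueError("partial weight exhausted")
        half = self.partial // 2
        self.partial -= half
        return Proxy(self, half)

    def release(self, weight):
        """Step 5: a destroyed reference returns its weight."""
        self.total -= weight

    def collectable(self):
        """Step 6: no weight is held outside the skeleton."""
        return self.total == self.partial


class Proxy:
    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight    # this reference's PW

    def duplicate(self):
        """Step 3: split the weight without contacting the skeleton."""
        if self.weight < 2:
            raise ValueError("partial weight has reached 1")
        half = self.weight // 2
        self.weight -= half
        return Proxy(self.skeleton, half)

    def destroy(self):
        self.skeleton.release(self.weight)
        self.weight = 0


obj = Skeleton(n=7)           # TW = PW = 128
p1 = obj.create_reference()   # skeleton PW 64, p1 weight 64
p2 = p1.duplicate()           # p1 weight 32, p2 weight 32
p1.destroy()                  # TW drops to 96
still_referenced = not obj.collectable()
p2.destroy()                  # TW drops to 64, equal to the skeleton's PW
now_collectable = obj.collectable()
```

Only destruction sends a message to the skeleton, which is what removes the race between increment and decrement messages.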

Advanced Reference Counting (3)

(a) The initial assignment of weights in weighted reference counting.

(b) Weight assignment when creating a new reference.

Advanced Reference Counting (4)

(c) Weight assignment when copying a reference from P1 for P2.

Advanced Reference Counting (5)

• Problem: a reference’s partial weight can only be halved a limited number of times (the N in 2^N) before it reaches 1, so only a limited number of copies can be made this way without resorting to an additional scheme.

• Additional scheme: the process P1 holding an existing reference can create its own skeleton so that it can duplicate more references on its own. However, this gives rise to extra indirections, which degrade performance.

Advanced Reference Counting (6)

Creating an indirection when the partial weight of a reference has reached 1.

Generation Reference Counting (1)

• This scheme lets a process create an unlimited number of copies of a reference.

• Advantages: copies can themselves be copied indefinitely without communicating with the object. Creating or destroying a reference requires communication with the object or with the creator of the reference, but not both.

• Disadvantage: it still requires reliable communication.

Generation Reference Counting (2)

• The object skeleton keeps a table G, where G[i] is the number of outstanding references for generation i.

• When an object is created with a reference, that reference is considered generation 1. Any reference created by the object is generation 1.

• When a new reference is created, it is told its generation, and its copy counter is initially zero. If it copies the reference for P2, it increments its own copy counter. The copy at P2 gets a generation one higher than that of the reference it was copied from.

• When a remote reference with copy counter X and generation Y is deleted, a message carrying X and Y is sent to the object skeleton. G[Y] is decremented by one for the removed reference, and G[Y+1] is increased by X for the X copies made by the process at generation Y. Note: at any given time, G probably will not accurately reflect reality, and individual counts may be negative.

• When all entries G[i] are zero, the object can be GC’ed.
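The table updates described above can be sketched in a few lines. As before, this is a single-process simulation with invented class names, and it follows the slide's convention that references created by the object are generation 1. It also shows a count going temporarily negative when a copy is deleted before the reference it was copied from.

```python
from collections import defaultdict


class Skeleton:
    FIRST_GENERATION = 1  # convention used on the slide

    def __init__(self):
        self.G = defaultdict(int)   # generation -> outstanding references

    def create_reference(self):
        self.G[self.FIRST_GENERATION] += 1
        return Reference(self, self.FIRST_GENERATION)

    def on_delete(self, generation, copies):
        self.G[generation] -= 1           # the deleted reference itself
        self.G[generation + 1] += copies  # copies it handed out

    def collectable(self):
        return all(count == 0 for count in self.G.values())


class Reference:
    def __init__(self, skeleton, generation):
        self.skeleton = skeleton
        self.generation = generation
        self.copies = 0   # copy counter, initially zero

    def copy(self):
        """Copy without contacting the skeleton."""
        self.copies += 1
        return Reference(self.skeleton, self.generation + 1)

    def delete(self):
        self.skeleton.on_delete(self.generation, self.copies)


obj = Skeleton()
p1 = obj.create_reference()   # G[1] = 1
p2 = p1.copy()                # p1.copies = 1, p2 is generation 2
p2.delete()                   # G[2] becomes -1: the skeleton has not yet
                              # heard that a generation-2 copy exists
negative_seen = obj.G[2]
collectable_early = obj.collectable()
p1.delete()                   # G[1] = 0, G[2] = -1 + 1 = 0
collectable_late = obj.collectable()
```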

Generation Reference Counting (3)

Creating and copying a remote reference in generation reference counting.