Christian Spann, Steffen Kachele, Jorg Domaschka, Franz J.
Hauck | [email protected] | September 08, 2010 |Aspectix Research Team, Institute for Distributed Systems,
Ulm University, Germany
Virtual Nodes: An Approach to Reliability and Availability
XtreemOS Summer School 2010
Page 2 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
XtreemOS vs Reliability
Targeting large peer-to-peer grids
- Off-the-shelf computers
- Connected via the Internet
- No central infrastructure, fully decentralised
- Churn, unreliable nodes
Unreliable environment ⇒ ? ⇐ Need for reliable services (e.g. security, monitoring, ...)
Questions to answer
- How can reliability be achieved?
Reliability
Snapshots:
- Save state of application from time to time
- In case of failures: load snapshot
But:
- May invalidate client state
- User may experience downtime
- Bad for login or security services
⇒ Reliability ≠ Availability
- Which entity monitors the application?
  - Has to be reliable and available
  - Has to be distributed
⇒ Self-containment
Questions to answer
- How can reliability be increased?
- How can availability be increased?
- Is there a self-contained solution?
Availability
Replication:
- Availability by redundancy
- Provide identical entities at multiple sites
- Contains snapshots as special case
- Consistency protocol ensures reliability
Outline
Motivation
Replication: An Introduction
Virtual Nodes
Distributed Servers with Virtual Nodes
Conclusions
What to replicate?
- Data object
- Database
- Computing task
- Remote object
- Service
⇒ Sophisticated algorithms for all fields ...
⇒ ... and a general model
General Replication Model
[Figure: the five phases as a timeline for a client and Replicas 1 to 3. Phase 1: client contact, Phase 2: server coordination, Phase 3: execution, Phase 4: agreement coordination, Phase 5: client response; two arrows are labelled "Update".]
Replication protocol:
- Decides on the use of a phase
- Different approaches per phase
- Different demands on the code
General Replication Model
5 Phases
1. Request: client submits operation
2. Server coordination: synchronize the execution (e.g., message ordering)
3. Execution: operation is executed (by one or more replicas)
4. Agreement coordination: result of the operation (e.g., guarantee atomicity)
5. Response: send outcome back to client
Classification
Active replication:
- State-machine replication
- Decentralised approach
- Request processed by all replicas
- Simple due to symmetry
- Quick reaction to failures
- Demanding with respect to determinism (most of the time)
  - Message ordering
  - Execution order
Classification
Active replication:
[Figure: active replication on the five-phase timeline (client request, server coordination, execution, agreement coordination, client response) with a client and Replicas 1 to 3; the request is distributed by atomic broadcast and each of the three replicas performs the update.]
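Active replication can be sketched in a few lines of Java. This is a minimal illustration, assuming an in-process stand-in for atomic broadcast; `CounterReplica`, `AtomicBroadcast` and `ActiveReplicationDemo` are hypothetical names and not part of the Virtual Nodes framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical replica: a deterministic state machine holding one integer.
class CounterReplica {
    private int state = 0;

    // Phase 3: every replica executes every request.
    int execute(IntUnaryOperator request) {
        state = request.applyAsInt(state);
        return state;
    }

    int state() { return state; }
}

// Stand-in for atomic broadcast (phase 2): one loop delivers each request
// to all replicas, so all of them see the same order of requests.
class AtomicBroadcast {
    private final List<CounterReplica> group = new ArrayList<>();

    void join(CounterReplica replica) { group.add(replica); }

    // Returns the first reply (phase 5); deterministic requests yield equal replies.
    int broadcast(IntUnaryOperator request) {
        int firstReply = 0;
        boolean haveReply = false;
        for (CounterReplica replica : group) {
            int reply = replica.execute(request);
            if (!haveReply) {
                firstReply = reply;
                haveReply = true;
            }
        }
        return firstReply;
    }
}

class ActiveReplicationDemo {
    public static void main(String[] args) {
        AtomicBroadcast abcast = new AtomicBroadcast();
        CounterReplica r1 = new CounterReplica();
        CounterReplica r2 = new CounterReplica();
        CounterReplica r3 = new CounterReplica();
        abcast.join(r1);
        abcast.join(r2);
        abcast.join(r3);

        abcast.broadcast(s -> s + 5);              // first update
        int reply = abcast.broadcast(s -> s * 2);  // second update
        // prints "10 10 10 10"
        System.out.println(reply + " " + r1.state() + " " + r2.state() + " " + r3.state());
    }
}
```

Because every replica runs the same deterministic operations in the same order, no agreement phase is needed in this sketch; any replica's reply can be returned to the client.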
Classification
Passive replication:
- Primary backup replication
- Centralised approach
- Request processed by a single replica (primary)
- New state/state changes transferred to backups
- Failure of primary requires re-election
- Can handle nondeterminism (sometimes)
Classification
Passive replication:
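[Figure: passive replication on the five-phase timeline (client request, server coordination, execution, agreement coordination, client response) with a client and Replicas 1 to 3; one replica performs the update, the change is sent by view-synchronous broadcast (VS CAST) and applied by the other two replicas.]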
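Passive replication can be sketched in the same style. This is a minimal illustration under simplifying assumptions: the group is an in-process list, the first member acts as primary, and the full state is copied to the backups after every request. `BackupableReplica` and `PrimaryBackupGroup` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replica that stores a single integer as its state.
class BackupableReplica {
    private int state = 0;
    private int executed = 0;   // number of requests this replica ran itself

    // Runs on the primary only (phase 3).
    int execute(int delta) {
        executed++;
        state += delta;
        return state;
    }

    // Runs on backups: install the state computed by the primary (phase 4).
    void apply(int newState) { state = newState; }

    int state() { return state; }
    int executed() { return executed; }
}

// Stand-in for a replica group; the first member acts as primary.
class PrimaryBackupGroup {
    private final List<BackupableReplica> members = new ArrayList<>();

    void join(BackupableReplica replica) { members.add(replica); }

    // Simulates a crash of the primary; the next member takes over.
    void crashPrimary() { members.remove(0); }

    int invoke(int delta) {
        BackupableReplica primary = members.get(0);
        int reply = primary.execute(delta);
        for (int i = 1; i < members.size(); i++) {
            members.get(i).apply(primary.state());   // state transfer to backups
        }
        return reply;
    }
}

class PassiveReplicationDemo {
    public static void main(String[] args) {
        PrimaryBackupGroup group = new PrimaryBackupGroup();
        BackupableReplica primary = new BackupableReplica();
        BackupableReplica backup = new BackupableReplica();
        group.join(primary);
        group.join(backup);

        group.invoke(5);                 // executed by the primary, applied by the backup
        group.crashPrimary();            // the backup becomes the new primary
        System.out.println(group.invoke(2));   // prints 7
    }
}
```

The backup never executed the first request itself, yet it continues from the correct state after the takeover. That is why this scheme tolerates some non-determinism in the executed code.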
Outline
Motivation
Replication: An Introduction
Virtual Nodes
  Overview
  Determinism
  Nested Invocations
  Scheduling
Distributed Servers with Virtual Nodes
Conclusions
Environment
What to replicate?
- Data object
- Database
- Computing task
- Remote object/Service
Why Objects and Services? "Can't you just use databases?"
- Applications do not need/use stable storage
- Uniform programming model
- Support for legacy applications
Virtual Nodes: XtreemOS Approach to Reliability
Replication Framework
- Java-based, highly configurable
- Support for changing replica groups
- Support for deterministic multithreading
- Multiple middleware interfaces (CORBA, J-RMI, SOAP, ...)
- Support for nested invocations (SOA)
- Optimization for read-only invocations
- Self-contained: independent of other nodes and services
- Service implementation orthogonal to replication
  - Except for non-deterministic methods
  - Except for state transfer
  - ...
Architecture: Overview
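[Figure: architecture overview. Client side: Appl., Client MW, Client Repl. Replica/server side: Impl., Server MW, Server Repl. The components are arranged in a middleware layer and a replication layer; the figure also carries the label "Replication Framework".]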
Architecture: Client-side
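[Figure: client-side architecture. Components shown between the application (Appl.) and the network: Communication, Replica Tracker, Marshalling, Binding Information; grouped into a middleware layer and a replication layer.]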
Architecture: Server-side
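[Figure: server-side architecture. Components shown between the network and the implementation (Impl): Client Communication, Replica Communication, Marshalling, Dispatching, Binding Information, Semantics, Group Information, Strategy, Group Communication, Semantic Information, Persistency, Scheduler; grouped into a middleware layer and a replication layer.]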
Example: the use of time
// remote method
public void buyItem(Item item) {
    // wall-clock time is read locally, so each replica may store a different value
    long timestamp = System.currentTimeMillis();
    state.store(item, timestamp);
}

// remote method
public void deleteOldItems(long olderThan) {
    for (Item item : state.items()) {
        long timestamp = state.getTimestamp(item);
        if (timestamp < olderThan) state.remove(item);
    }
}
Example: the use of time (2)
Active Replication: all replicas execute requests
[Figure, three animation steps: init (timestamps 1 2 | 1 2); buyItem(i) (timestamps 3 and 4 added); deleteOldItems(3) (4 shown)]
Matters when:
- State leaks to the clients
- Non-determinism affects 'critical' state (i.e., state that is required to initialise a new replica)
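One possible remedy is to stop reading the local clock inside the replicated method and pass in a timestamp that all replicas share. The sketch below assumes the replication layer can attach one agreed timestamp to each ordered request; this is an assumption, not something the slides describe, and `ShopReplica` is a hypothetical stand-in for the service above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shop replica: the timestamp arrives with the ordered request
// instead of being read from the local clock.
class ShopReplica {
    private final Map<String, Long> items = new HashMap<>();

    void buyItem(String item, long agreedTimestamp) {
        items.put(item, agreedTimestamp);
    }

    void deleteOldItems(long olderThan) {
        // removeIf avoids modifying the map while iterating over it
        items.values().removeIf(timestamp -> timestamp < olderThan);
    }

    Map<String, Long> snapshot() { return new HashMap<>(items); }
}

class AgreedTimeDemo {
    public static void main(String[] args) {
        ShopReplica r1 = new ShopReplica();
        ShopReplica r2 = new ShopReplica();
        long agreed = System.currentTimeMillis();   // read once, then shared by all replicas
        for (ShopReplica replica : new ShopReplica[] { r1, r2 }) {
            replica.buyItem("book", agreed);
            replica.deleteOldItems(agreed);   // keeps "book": its timestamp is not older
        }
        System.out.println(r1.snapshot().equals(r2.snapshot()));   // prints true
    }
}
```

With a shared timestamp, `deleteOldItems` removes the same items on every replica, so the replica states stay identical.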
Example: the use of time (3)
Passive Replication: only one replica executes requests
- More resilient to non-deterministic code (no inconsistency with other replicas)
- But not entirely, corner cases exist (client state may be invalidated at leader change)
Determinism — Overview
In general:
- May be an issue with any replication protocol
- Dependent on application semantics
In Java: only with native methods
Yet, plenty of them
- Default hashCode implementation
- Access to file system
- Time and date
- New I/O and new concurrency implementations
- Thread creation, execution, and scheduling
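The default `hashCode` case can be made concrete. `Object.hashCode` is identity-based and may differ between JVM instances, so the iteration order of a `HashSet` of such objects can differ between replicas. The sketch below shows one way around this, assuming the application can define value-based `equals`/`hashCode` and iterate in insertion order; `StableItem` is a hypothetical type.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical item type with value-based equals/hashCode instead of the
// identity-based default inherited from Object.
final class StableItem {
    private final String name;

    StableItem(String name) { this.name = name; }

    String name() { return name; }

    @Override
    public boolean equals(Object other) {
        return other instanceof StableItem && ((StableItem) other).name.equals(name);
    }

    @Override
    public int hashCode() { return Objects.hash(name); }
}

class DeterministicIterationDemo {
    // LinkedHashSet iterates in insertion order, independent of hash values.
    static List<String> namesInOrder(Set<StableItem> items) {
        List<String> names = new ArrayList<>();
        for (StableItem item : items) {
            names.add(item.name());
        }
        return names;
    }

    public static void main(String[] args) {
        Set<StableItem> items = new LinkedHashSet<>();
        items.add(new StableItem("pen"));
        items.add(new StableItem("book"));
        items.add(new StableItem("pen"));   // duplicate by value, ignored
        System.out.println(namesInOrder(items));   // prints [pen, book]
    }
}
```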
Nested Invocations
Invocations of one replicated service to another
[Figure: replica groups Group A and Group B]
Nested Invocation for Active Replication
Naive approach:
- Many messages
- Multiple identical requests: process only once
  - Identify and filter at receiver
Better:
- Only one replica issues nested invocations (owner)
- Other replicas are blocked
- Reply is broadcast to other replicas
Issues:
- Determining an owner
- Handling failure of owner
- Identify duplicate invocations (a consequence of failure handling)
- Reply may arrive before request has been triggered
Nested Invocation for Active Replication (II)
Vnode approach:
- Owner: the replica contacted by the client
- Failure of owner
  - Oldest replica becomes owner
  - Re-executes request if reply is not available
- Cache at invoked service for filtering duplicate messages
  - Requires deterministic messageIds
  - Disallows the invocation of replication-unaware services
- Early arrival of replies
  - Ensure that reply can be mapped to a surrounding request
  - Group communication properties (total order) guarantee that surrounding request arrives prior to reply
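The duplicate filter at the invoked service can be sketched as a reply cache keyed by message id. This is an illustration only: `ReplyCache` is a hypothetical class, and the id scheme (id of the surrounding request plus a per-request invocation counter) is an assumed way of making ids deterministic, not the framework's documented format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical reply cache at the invoked service.
class ReplyCache {
    private final Map<String, String> replies = new HashMap<>();
    private int executions = 0;

    // Assumed id scheme: every replica of the calling group derives the same
    // id for the same nested invocation.
    static String messageId(String surroundingRequestId, int invocationNo) {
        return surroundingRequestId + "#" + invocationNo;
    }

    // Executes the operation only for unseen ids; duplicates get the cached reply.
    synchronized String handle(String messageId, Supplier<String> operation) {
        if (replies.containsKey(messageId)) {
            return replies.get(messageId);
        }
        executions++;
        String reply = operation.get();
        replies.put(messageId, reply);
        return reply;
    }

    synchronized int executions() { return executions; }
}

class ReplyCacheDemo {
    public static void main(String[] args) {
        ReplyCache cache = new ReplyCache();
        String id = ReplyCache.messageId("req-7", 1);

        String first = cache.handle(id, () -> "booked");    // sent by the owner
        String second = cache.handle(id, () -> "booked");   // re-sent by the new owner after a failure
        System.out.println(first + " " + second + " " + cache.executions());   // prints "booked booked 1"
    }
}
```

The filter only works if the re-executing replica produces the same id as the failed owner. A replication-unaware service has no such cache, so a re-sent invocation would be executed twice.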
Nested Invocation for Passive Replication
- Straightforward: only one active replica, except
  - Corner case: leader crashes during nested invocation:
    - Use deterministic message ids
    - Replies are part of persistent state as long as surrounding request is still running
Deterministic Scheduling
Replication Requires Determinism
- Multithreading is non-deterministic
- Single-threaded execution
  - Slow and deadlock-prone
  - Denies use of condition variables (wait, notify)
  - Does not make use of multi-CPU/multi-core machines
Deterministic Multithreading
- Deterministic thread switching: limited concurrency
- Four algorithms with different properties
  - Single active thread (SAT, Reiser et al.)
  - Multiple active threads (MAT, Reiser et al.)
  - Loose synchronization algorithm (LSA, Basile et al.)
  - Preemptive deterministic scheduling (PDS, Basile et al.)
- No one-size-fits-all solution
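To illustrate what deterministic thread switching means, here is a deliberately simple turn-based gate. It is not one of the four algorithms listed above (SAT, MAT, LSA, PDS); it only shows that request threads can be forced to run in an agreed order regardless of how the JVM schedules them. `TurnGate` and `TurnGateDemo` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical gate that lets request threads run one at a time, in the
// order of their agreed request sequence numbers.
class TurnGate {
    private int nextTurn = 0;

    synchronized void awaitTurn(int turn) throws InterruptedException {
        while (nextTurn != turn) {
            wait();
        }
    }

    synchronized void finishTurn() {
        nextTurn++;
        notifyAll();
    }
}

class TurnGateDemo {
    // Starts n request threads in reverse order and returns the execution order.
    static List<Integer> run(int n) {
        TurnGate gate = new TurnGate();
        List<Integer> log = new ArrayList<>();   // only touched by the thread holding the turn
        List<Thread> threads = new ArrayList<>();
        for (int i = n - 1; i >= 0; i--) {
            final int seqNo = i;
            Thread thread = new Thread(() -> {
                try {
                    gate.awaitTurn(seqNo);
                    log.add(seqNo);
                    gate.finishTurn();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(4));   // prints [0, 1, 2, 3]
    }
}
```

The gate gives determinism at the cost of all concurrency, which is the trade-off the listed algorithms try to relax in different ways.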
Scheduler Integration
Intercept Java Synchronisation Statements:
- synchronized methods and blocks
  - synchronized instance methods
  - synchronized static methods
  - synchronized blocks
- wait(), notify(), and notifyAll() calls
Interception: Replace Statements by Calls to Scheduler
- synchronized: pair of lock/unlock invocations
- All other: simple replacement
- On source code or byte code level
- Transparent to service developer
- Appropriate also for legacy applications
Interception by Code Transformation
public class Queue extends ... {
    public synchronized String remove() {
        while (data.size() == 0) wait();
        return data.remove(0);
    }

    public synchronized void append(String x) {
        data.add(x);
        notify();
    }
}

⇒

public class Queue extends ... {
    public String remove() {
        scheduler.lock(this);
        try {
            while (data.size() == 0) scheduler.wait(this);
            return data.remove(0);
        } finally {
            scheduler.unlock(this);
        }
    }

    public void append(String x) {
        scheduler.lock(this);
        try {
            data.add(x);
            scheduler.notify(this);
        } finally {
            scheduler.unlock(this);
        }
    }
}
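The transformed code needs a scheduler object offering `lock`, `unlock`, `wait` and `notify`. The sketch below is a hypothetical pass-through implementation built on `java.util.concurrent.locks`, so that the transformed queue can be run; it is not the framework's scheduler and it is not deterministic. A deterministic scheduler would decide inside these four methods which thread may continue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical pass-through scheduler with the four calls used by the
// transformed Queue. It only forwards to ordinary locks and conditions.
class PassThroughScheduler {
    private static final class Monitor {
        final ReentrantLock lock = new ReentrantLock();
        final Condition condition = lock.newCondition();
    }

    // One monitor per object identity, mirroring Java's built-in monitors.
    private final Map<Object, Monitor> monitors =
            Collections.synchronizedMap(new IdentityHashMap<>());

    private Monitor monitorOf(Object o) {
        return monitors.computeIfAbsent(o, key -> new Monitor());
    }

    void lock(Object o) { monitorOf(o).lock.lock(); }

    void unlock(Object o) { monitorOf(o).lock.unlock(); }

    // Overloads (not overrides) of Object.wait()/notify(); the caller must
    // hold the lock of o, as with the built-in methods.
    void wait(Object o) throws InterruptedException { monitorOf(o).condition.await(); }

    void notify(Object o) { monitorOf(o).condition.signal(); }
}

// The transformed queue from the slide, made self-contained.
class SchedulerQueue {
    private final PassThroughScheduler scheduler = new PassThroughScheduler();
    private final List<String> data = new ArrayList<>();

    String remove() throws InterruptedException {
        scheduler.lock(this);
        try {
            while (data.size() == 0) scheduler.wait(this);
            return data.remove(0);
        } finally {
            scheduler.unlock(this);
        }
    }

    void append(String x) {
        scheduler.lock(this);
        try {
            data.add(x);
            scheduler.notify(this);
        } finally {
            scheduler.unlock(this);
        }
    }
}

class SchedulerQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        SchedulerQueue queue = new SchedulerQueue();
        Thread producer = new Thread(() -> queue.append("hello"));
        producer.start();
        System.out.println(queue.remove());   // blocks until the producer has appended
        producer.join();
    }
}
```

Routing all synchronisation through one object is the point of the transformation: swapping `PassThroughScheduler` for a deterministic implementation changes the scheduling policy without touching the service code.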
Time to Take a Breath
How far we've come
- Easy-to-use replication framework
- Highly configurable
  - Replication protocol, scheduler, access protocol
- Support for nested invocations
- Deterministic scheduling for high performance
  - Transparently integrated into system
Where are we going?
- Scheduler optimisations
- (Semi-)automatic detection and removal of non-determinism
- Integration of data-centric replication protocol
- Increased client-side transparency
Outline
Motivation
Replication: An Introduction
Virtual Nodes
Distributed Servers with Virtual Nodes
  Motivation: Client Transparency
  Excursus: Mobile IPv6
  Distributed Servers
  Integration
  Discussion
Conclusions
Client Transparency
[Figure: client-side architecture. Components shown between the application (Appl.) and the network: Communication, Replica Tracker, Marshalling, Binding Information; grouped into a middleware layer and a replication layer.]
- Client has to install additional software
- Application developer has to be aware of replication
- Violates the goal of transparency
Increase Client Transparency
[Figure, animation steps on the combined client/server architecture diagram: (1) Remove Replica Layer, (2) Remove Binding Information, (3) Add Middleware Proxy. In the final step the client side shows only Appl. and Marshalling/Communication, and a component labelled MW Proxy has been added. The server-side components are unchanged except that Binding Information no longer appears.]
"How do clients keep track of service location?"
Replica Tracking
Use a location service?
- How is the location service being tracked?
Use client-side daemon?
- Will work most of the time
- Still no guarantee
- Additional traffic due to polling
- No fixed address: initial contact difficult
Our Approach: Exploit Mobile IPv6
- Uses standardized techniques
- Does not require any modifications at client side
- Provided by XtreemOS Distributed Servers
Mobile IPv6 in a Nutshell
Mobile nodes reachable while away from home networks
- Correspondent node (CN): any node talking to mobile node
Mobile Node: Two Addresses
- Home address (HoA): identifies mobile node, never changes
- Care-of address (CoA): represents mobile node's current location
Transparency for High-level Protocols:
- Mobile nodes addressed by HoA
- IP level translates HoA to CoA
- Location changes are announced by the mobile node
"Sounds nice, but how does the IP level know?"
Mobile IPv6 in a Nutshell (II)
Home Agent (HA)
I Router in home network
I Mobile node informs HA about CoA
I Knows mapping from HoA to CoA
“Hey, wait a second! You do use a central entity! Isn’t this cheating!?!”
Yes, but . . .
I Routers are not switched off spontaneously
I Routers run a small software system and tend to be less buggy
I No network depends on a single router
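The HoA-to-CoA mapping kept by the home agent can be pictured with a small sketch. This is a toy model, not the Mobile IPv6 wire protocol: the class name `HomeAgent`, the methods `binding_update` and `resolve`, and the example addresses are assumptions made for illustration.

```python
from typing import Dict


class HomeAgent:
    """Toy home agent holding the HoA -> CoA mapping (hypothetical model)."""

    def __init__(self) -> None:
        self.bindings: Dict[str, str] = {}

    def binding_update(self, hoa: str, coa: str) -> None:
        # The mobile node announces its current location to the home agent.
        self.bindings[hoa] = coa

    def resolve(self, hoa: str) -> str:
        # Without a binding the node is assumed to be at home, so traffic
        # is delivered to the home address itself.
        return self.bindings.get(hoa, hoa)


ha = HomeAgent()
hoa = "2001:db8:1::1"                      # documentation-prefix addresses
ha.binding_update(hoa, "2001:db8:2::42")   # node moved to a visited network
ha.binding_update(hoa, "2001:db8:3::7")    # node moved again
print(ha.resolve(hoa))
```

Correspondent nodes keep addressing the HoA throughout; only the binding held by the home agent changes when the node moves.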
Page 39 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Distributed Servers
Distributed Server (Vrije Universiteit Amsterdam)
I Group of nodes pretending to be a mobile node
I Identified by home address
I Node addresses represent care-of addresses
Features
I One node registers at home agent (contact node)
I Nodes can hand back and forth single connections (cooperatively)
I Contact node can change (cooperatively)
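A minimal sketch of the group behaviour listed above, under the assumption that the group can be modelled as plain in-memory bookkeeping. The class `DistributedServer` and its methods are hypothetical names, not the API of the Vrije Universiteit implementation.

```python
from typing import Dict, List


class DistributedServer:
    """Group of nodes that looks like one mobile node (illustrative only)."""

    def __init__(self, home_address: str, members: List[str]) -> None:
        if not members:
            raise ValueError("a distributed server needs at least one node")
        self.home_address = home_address      # identity seen by clients
        self.members = list(members)          # node addresses act as CoAs
        self.contact = self.members[0]        # node registered at the home agent
        self.connections: Dict[str, str] = {}  # connection id -> serving node

    def accept(self, conn_id: str) -> None:
        # New connections arrive at the contact node.
        self.connections[conn_id] = self.contact

    def hand_over(self, conn_id: str, target: str) -> None:
        # Cooperative handover of a single connection to another member.
        if target not in self.members:
            raise ValueError("target is not a group member")
        if conn_id not in self.connections:
            raise KeyError(conn_id)
        self.connections[conn_id] = target

    def change_contact(self, target: str) -> None:
        # Cooperative change: the new contact node registers its address
        # as the care-of address for the shared home address.
        if target not in self.members:
            raise ValueError("target is not a group member")
        self.contact = target
```

Existing connections stay with whichever node serves them when the contact node changes; only new connections go to the new contact node.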
Page 40 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Outline
Motivation
Replication: An Introduction
Virtual Nodes
Distributed Servers with Virtual Nodes
  Motivation: Client Transparency
  Excursus: Mobile IPv6
  Distributed Servers
  Integration
  Discussion
Conclusions
Page 41 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Integrated Approach
Benefit:
I Virtual Nodes: fault-tolerance for Distributed Servers
I Distributed Servers: anycast mechanism for Virtual Nodes
Facts:
I Handover requires an old socket state ⇒ replication of state
I Only reasonable with active replication
Failure Detection:
I Minimize experienced downtime: change contact node quickly
I Minimize false positives: exclude group members slowly
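The two failure-detection goals above pull in opposite directions, which suggests two separate timeouts. The sketch below assumes a heartbeat-based detector; the class name, the status labels and the threshold values are illustrative assumptions, not taken from the Virtual Nodes implementation.

```python
from typing import Dict


class TwoThresholdDetector:
    """Heartbeat detector with a short and a long timeout (illustrative)."""

    def __init__(self, suspect_after: float, exclude_after: float) -> None:
        if not 0 < suspect_after < exclude_after:
            raise ValueError("need 0 < suspect_after < exclude_after")
        self.suspect_after = suspect_after  # silence before the contact role moves
        self.exclude_after = exclude_after  # silence before group exclusion
        self.last_heard: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        self.last_heard[node] = now

    def status(self, node: str, now: float) -> str:
        silence = now - self.last_heard[node]
        if silence >= self.exclude_after:
            return "exclude"   # remove the node from the replica group
        if silence >= self.suspect_after:
            return "suspect"   # move the contact role, keep the member
        return "alive"


detector = TwoThresholdDetector(suspect_after=0.5, exclude_after=10.0)
detector.heartbeat("replica-a", now=0.0)
print(detector.status("replica-a", now=1.0))
```

A node that is only slow loses the contact role quickly but stays in the group, so a false suspicion costs one contact change rather than a group reconfiguration.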
Page 42 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Invocation
1. Client sends request to contact node
2. Contact node copies socket state
3. Contact node broadcasts request and socket
4. All nodes process request
5. Contact node sends reply to client
6. Contact node broadcasts new socket state
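The six steps can be walked through in a single-process simulation. This is a sketch under strong assumptions: the group broadcast is a plain loop, the socket state is reduced to two byte counters, and `Replica`, `SocketState` and `invoke` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class SocketState:
    bytes_received: int = 0
    bytes_sent: int = 0


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.total = 0                     # replicated application state
        self.socket_copy = SocketState()   # last socket state seen by this node

    def process(self, request: bytes) -> bytes:
        # Processing must be deterministic so that all replicas agree.
        self.total += len(request)
        return f"total={self.total}".encode()


def invoke(contact: Replica, group: List[Replica], request: bytes,
           socket: SocketState) -> Tuple[bytes, SocketState]:
    # Step 1: the client request arrives at the contact node.
    socket = replace(socket, bytes_received=socket.bytes_received + len(request))
    # Step 2: the contact node copies the socket state (immutable snapshot).
    snapshot = socket
    # Step 3: request and socket state are broadcast to the group.
    for node in group:
        node.socket_copy = snapshot
    # Step 4: all nodes process the request.
    replies = {node.name: node.process(request) for node in group}
    # Step 5: only the contact node sends its reply to the client.
    reply = replies[contact.name]
    socket = replace(socket, bytes_sent=socket.bytes_sent + len(reply))
    # Step 6: the new socket state is broadcast.
    for node in group:
        node.socket_copy = socket
    return reply, socket
```

Because every replica holds the snapshot from step 3 and computes the same reply in step 4, any of them has what it needs to take over the connection if the contact node fails before step 5 completes.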
Page 43 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Discussion
Fault-tolerance:
I No fault-tolerance during steps 1 and 2
I Steps 3 – 5: Handover reveals #bytes sent and received
I Allows the remaining bytes of the reply to be sent
Minimal overhead (copying socket)
I Step 6 purely for garbage collection
I Piggyback on other requests
Changing contact node
I No effect on client
I Other replicas need to know
I Causes an additional group message
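How a takeover node might finish an interrupted reply can be sketched as follows. The assumption is that the replicated socket snapshot records the send counter from before the request and that the handover reveals the current counter; `remaining_reply` is a hypothetical helper, not part of either system.

```python
def remaining_reply(reply: bytes, sent_before_request: int, sent_now: int) -> bytes:
    """Return the part of `reply` that the client has not yet received."""
    already_sent = sent_now - sent_before_request
    if not 0 <= already_sent <= len(reply):
        raise ValueError("byte counters do not match this reply")
    return reply[already_sent:]


# The failed contact node had sent 3 of 7 reply bytes before it crashed.
print(remaining_reply(b"total=5", sent_before_request=100, sent_now=103))
```

This only works because all replicas computed the same reply bytes, which is why the approach relies on active replication.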
Page 44 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Conclusions
XtreemOS:
I Challenge for reliability and availability
I Replication can solve both issues
XtreemOS Virtual Nodes:
I Configurable replication framework for fault-tolerance
I Support for multiple middleware systems at client-side
I Deterministic multithreading
XtreemOS Distributed Servers:
I Anycast due to Mobile IPv6
I Group of nodes pretends to be a mobile node
I Handing over of connections
Integration:
I Both systems are orthogonal
I Increases client-side transparency
Page 45 Virtual Nodes | Jorg Domaschka | Aspectix Research Team, Ulm University | 2010-07-08
Papers
I Matthias Wiesmann et al.: Understanding Replication in Databases and Distributed Systems. ICDCS ’00.
I Hans P. Reiser et al.: Consistent Replication of Multithreaded Distributed Objects. SRDS ’06.
I Hans P. Reiser et al.: Deterministic Multithreading for Replicated CORBA Objects. PDCS ’06.
I Claudio Basile et al.: Active Replication of Multithreaded Applications. Transactions on Parallel and Distributed Systems, May 2006.
I Michal Szymaniak et al.: Enabling Service Adaptability with Versatile Anycast. Concurrency and Computation: Practice and Experience, September 2007.
I Jorg Domaschka et al.: Multithreading Strategies for Replicated Objects. Middleware ’08.
I Jorg Domaschka et al.: Virtual Nodes: a Re-Configurable Replication Framework for Highly-available Grid Services. Middleware (Companion) ’08.