Implementing Remote Procedure Calls Birrell and Nelson Presentation By: Khawaja Shams.

Date posted: 22-Dec-2015

Transcript of Implementing Remote Procedure Calls Birrell and Nelson Presentation By: Khawaja Shams.

Page 1: Implementing Remote Procedure Calls Birrell and Nelson Presentation By: Khawaja Shams.

Implementing Remote Procedure Calls

Birrell and Nelson

Presentation By: Khawaja Shams

Page 2: The Idea

• Extend procedure calls to enable transfer of control & data across a network
• High Level overview:
  – Pass the parameters across the network to invoke procedure remotely
  – Pass the result back to caller

Page 3: Why Procedures

• Clean and Simple Semantics
• Efficiency through simplicity
• Generality

Page 4: Issues to Consider

• Precise semantics of calls
• Semantics of address-containing arguments w/o shared address space
• Integration with existing/future programming languages
• Binding
• Suitable protocols for data/control transfer between caller/callee
• Data integrity and security

Page 5: Components

• Cedar
• Dorados
• Ethernet
  – 3 Mbps and 10 Mbps
• Assumes most communication will happen on local Ethernet

Page 6: Aims

• Make distributed computation easy
• Provide communication with as much ease as local procedure calls
  – No added complexities to the programmer (timeout, etc)
• Encourage people to build more distributed systems

Page 7: Secondary Aims

• Make RPC communication efficient
  – Within a factor of 5 beyond necessary transmission times of a network
• Secure Communications
  – Few distributed systems attempted this in the past
  – No RPC systems had attempted this

Page 8: Decisions

• Procedure calls vs. Message Passing
• Parallel Paradigm
• Shared Address space between computers
  – Hard to integrate with the programming language
  – Could exploit address mapping mechanisms of virtual memory
  – Conclusion: Feasible, but does not seem like it is worth the extra work

Page 9: RPC Structure

1. User
2. User stub
   1. Places specs of target procedure and arguments into packets
3. RPC Communications
   • Reliably passes packets to the callee
4. Server stub
5. Server
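The five-part structure above can be sketched in a few lines. This is only an illustration under stated assumptions: the JSON packet format, the `add` procedure, and every function name are hypothetical, and the "network" is a direct in-process call rather than the packet protocol from the paper.

```python
import json

# Hypothetical server-side procedure; the name and table are illustrative only.
def add(a, b):
    return a + b

SERVER_PROCEDURES = {"add": add}

def server_stub(packet: bytes) -> bytes:
    """Unpack the call packet, invoke the real procedure, pack the result."""
    call = json.loads(packet.decode("utf-8"))
    result = SERVER_PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")

def rpc_communications(packet: bytes) -> bytes:
    """Stand-in for the network: hands the packet to the callee, returns its reply."""
    return server_stub(packet)

def user_stub_add(a, b):
    """Looks like a local procedure to the user, but builds a call packet instead."""
    packet = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = rpc_communications(packet)
    return json.loads(reply.decode("utf-8"))["result"]

# User code: an ordinary-looking procedure call.
total = user_stub_add(2, 3)
```

The user code never touches packets; only the two stubs know about the packet layout.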

Page 10: Simple Call over RPC

Page 11: Binding

• Naming: What?
  – Type
  – Instance
• Location: Where?
  – Statically include location in app
    • Binds too early
  – Broadcast
    • Too much interference for bystanders
    • Not convenient for binding to machines not on the local network
  – Distributed database

Page 12: Binding – cont.

• Grapevine
  – Distributed database that is widely and reliably available
  – Replicates data
  – Entries are of two types:
    • Groups (corresponds to Types)
    • Individuals (corresponds to Instance)

Page 13: Binding on the Callee

• Exports maintained in a table
• For each export, the table contains:
  – Interface name
  – Dispatcher procedure from server stub
  – 32-bit machine-relative unique id of the export
• When caller imports:
  – Give it 1) the 32-bit id, and 2) the index
  – This serves as a verification scheme
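A minimal sketch of the export table and the id check, assuming a simple counter for the unique ids. The class name, method names, and the `first_id` parameter are hypothetical, chosen only to show why handing the importer both the index and the id lets the callee reject a binding made before a restart.

```python
import itertools

class ExportTable:
    """Callee-side table of exported interfaces (illustrative only)."""

    def __init__(self, first_id):
        # Assumption: ids come from a counter that starts at a different
        # value after every restart, so old ids never match new entries.
        self._ids = itertools.count(first_id)
        self._entries = []  # index -> (interface_name, dispatcher, unique_id)

    def export(self, interface_name, dispatcher):
        unique_id = next(self._ids) & 0xFFFFFFFF  # keep it to 32 bits
        self._entries.append((interface_name, dispatcher, unique_id))
        # The importer is given the table index and the unique id.
        return len(self._entries) - 1, unique_id

    def dispatch(self, index, unique_id, *args):
        if not 0 <= index < len(self._entries):
            raise LookupError("no such export")
        _name, dispatcher, current_id = self._entries[index]
        if current_id != unique_id:
            raise LookupError("stale binding: export id does not match")
        return dispatcher(*args)

table = ExportTable(first_id=100)
index, uid = table.export("Echo", lambda text: text)
echoed = table.dispatch(index, uid, "hello")

# After a simulated restart the same index holds a new id, so the old
# binding is rejected instead of silently reaching the wrong export.
restarted = ExportTable(first_id=200)
restarted.export("Echo", lambda text: text)
```

The index makes the lookup cheap; the id is what makes it safe.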

Page 14: Binding: In Action

Page 15: A few words on …

• No heavy state management
  – With many imports
  – With idle users (we’ll get to this)
• Authentication
  – Through Grapevine
• When to bind
  – User specifies only type and not instance
  – Instance is an RName (so lookup at bind time)
  – Specify a network address in the app
  – Dynamically instantiate interfaces & import them.

Page 16: PAUSE

• Is RPC hiding too many details from the programmers?
• Distributed Communication with No Timeout?
• RPC vs. Local Procedure Calls
  – Looks like
  – Feels like it (maybe…)
  – However, it is not a local call!
• Paper claims that it removes unnecessary difficulties, but it leaves real difficulties of distributed systems.
  – These difficulties can be easily overlooked with this level of abstraction

Page 17: Transport Protocol

• Existing Protocols
  – Work, but not fast enough
  – Meant for bulk data transfer
• Custom Protocols
  – Potentially 10x faster
  – Focus on minimizing elapsed time b/w request and response
  – Quick setup and teardown
  – Minimal state
  – Make servers scalable

Page 18: Guarantees

• If call returns:
  – Procedure executed exactly once
• Else
  – Exception reported to user
  – Procedure invoked 0 or 1 times
  – No upper bound on how long to wait for results
    • Unless communication breakdown
    • No time out if server deadlocks
    • Argument: make it as similar to local calls as possible

Page 19: Simple Calls

• Focus on efficiency for calls where all the arguments/results fit in one packet each
• Reduce explicit ACKs for quick/frequent requests
  – Response can be the ack for request
  – Next request can be the ack for response

Page 20: The Call Identifier

• <Machine Id, PID, seq #>
• Allows:
  – Elimination of duplicate call packets
  – Caller to determine if the result packet is the result of the current request
• Activity = <Machine_id, Process>
  – Nice property: each activity can have at most one procedure executing remotely
  – Call disregarded if the seq # is not greater than the last seq # used for activity
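The duplicate-elimination rule can be sketched as a small callee-side table keyed by activity. The class and field names are hypothetical, and actual execution of the procedure is reduced to appending to a list.

```python
from typing import NamedTuple

class CallId(NamedTuple):
    machine_id: str
    process: int
    seq: int

class Callee:
    """Tracks the last sequence number seen per activity (illustrative only)."""

    def __init__(self):
        self.last_seq = {}   # (machine_id, process) -> last seq # accepted
        self.executed = []   # calls that were actually run

    def receive(self, call_id: CallId, proc_name: str) -> bool:
        activity = (call_id.machine_id, call_id.process)
        # Disregard the call unless its seq # is greater than the last one used.
        if call_id.seq <= self.last_seq.get(activity, -1):
            return False
        self.last_seq[activity] = call_id.seq
        self.executed.append((call_id, proc_name))
        return True

callee = Callee()
first = CallId("machine-a", 7, 1)
accepted_first = callee.receive(first, "Null")
accepted_retransmit = callee.receive(first, "Null")  # same packet again
accepted_next = callee.receive(CallId("machine-a", 7, 2), "Null")
accepted_other = callee.receive(CallId("machine-b", 7, 1), "Null")
```

Because each activity has at most one outstanding call, one integer per activity is all the callee has to remember.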

Page 21: Simple Call Illustrated

Page 22: Connection Maintenance

• During Idle Connections:
  – Only state on callee is an entry in the table <activity, seq#>
  – Caller only has the machine wide counter
  – No pings required to maintain this state
• No Teardown!

Page 23: Complicated Calls

• If the call takes too long:
  – Caller periodically sends a probe packet to callee that the callee acks
  – First probe sent after ~1 RTT. Interval between subsequent probes increases gradually until there is one probe/5 minutes
  – Probes retransmitted n times if no ack
  – Notice how this ONLY detects communication failure
• Could have also done this on the callee side: transmit a keep alive if response is not generated quickly
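A rough sketch of the probe schedule described above. The slides say only that the interval starts near one round trip and grows gradually to one probe every 5 minutes; the doubling factor and the function name are assumptions made for illustration.

```python
def probe_intervals(rtt_seconds, growth=2.0, cap_seconds=300.0, count=12):
    """Return the waits (in seconds) before each of the first `count` probes.

    Assumption: the interval is multiplied by `growth` each time; the slides
    give only the starting point (~1 RTT) and the cap (5 minutes).
    """
    intervals = []
    interval = rtt_seconds
    for _ in range(count):
        intervals.append(min(interval, cap_seconds))
        interval *= growth
    return intervals

schedule = probe_intervals(1.0, count=4)
capped = probe_intervals(100.0, count=3)
```

The schedule tells the caller only that the callee machine is still answering; a deadlocked server procedure keeps acking probes indefinitely.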

Page 24: Complicated Calls – Cont.

• What if the arguments don’t fit in one packet?
  – Sent in multiple packets, each requesting an explicit ack except the last one
  – Caller and callee send packets alternately
• Problem: only one packet in the buffer
  – Really bad for networks with high bandwidth, but high latency
  – Remember: this was optimized for simple calls
  – Mitigated by multiplexing through multiple processes
• Future research: automatically switch to a bulk data transfer protocol for large requests
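The multi-packet scheme and its cost can be sketched as follows. The packet dictionary layout and both function names are hypothetical; the throughput function is just the generic stop-and-wait bound (one packet per round trip), not a figure from the paper.

```python
def fragment(data: bytes, max_payload: int):
    """Split arguments into packets; all but the last ask for an explicit ack."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    if not chunks:
        chunks = [b""]
    last = len(chunks) - 1
    return [
        {"seq": i, "payload": chunk, "please_ack": i != last}
        for i, chunk in enumerate(chunks)
    ]

def stop_and_wait_throughput(payload_bytes: int, round_trip_seconds: float) -> float:
    """Upper bound in bits/second when only one packet may be outstanding."""
    return payload_bytes * 8 / round_trip_seconds

packets = fragment(b"x" * 2500, max_payload=1000)
ack_flags = [p["please_ack"] for p in packets]
bound = stop_and_wait_throughput(1000, 0.5)
```

The bound depends only on packet size and round-trip time, which is why extra bandwidth does not help on a high-latency link.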

Page 25: Complicated Calls – Cont.

Page 26: Exception Handling

• Callee can return exception packet to caller
  – Packet handled by RPCRuntime on caller
  – RPCRuntime raises the exception in the proper process
  – If there is a catch phrase, it is executed and results are passed to callee. If catch phrase jumps out, callee is notified
  – RPCRuntime can also raise exception during communication failure

Page 27: Security

• Grapevine is the authentication/key distribution service
• Full end-to-end encryption
  – Confidentiality and integrity
• Complex security details left for another paper

Page 28: Performance Results

• 2 Dorados connected by 3 Mbps Ethernet
• Lightly loaded network, 5-10% capacity
• Accomplished 2 Mbps on the 3 Mbps network through parallel calls
• Did not measure cost of export/import

Page 29: Performance Results

Page 30: Performance of Firefly RPC

By Schroeder and Burrows

Presentation by Khawaja Shams
Draws From Presentation by Swati Agarwal

Page 31: Goals

• Make RPC much faster
  – Otherwise, developers may be tempted to design their own communication protocols
• Good same machine performance
  – FireFly RPC costs 60x a local procedure call
  – Compatible with Bershad’s system: 20x latency

Page 32: Hardware

• Multiple VAX processors can access a shared memory system through a coherent cache
• 5 MicroVAX II CPUs: one connected to a QBus I/O bus
• Network access through a DEQNA device controller that connects QBus to a 10 Mbps Ethernet.

Page 33: Measurements

• PROCEDURE: Null( ) ;
  – Measures base latency of RPC
• PROCEDURE: MaxResult(VAR OUT buf: ARRAY OF CHAR)
  – Measures server to caller throughput
  – No argument, 1440 byte array result
• PROCEDURE: MaxArg(VAR IN buf: ARRAY OF CHAR)
  – Measures caller to server throughput
  – 1440 byte array as arg, no result
• Notice the array size: 1440
  – Considering the max Ethernet packet size of 1514, and the 74 byte header, this fills the packet completely
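The packet-size arithmetic above, worked out. The first two constants come from the slide. The breakdown of the 74 bytes is an assumption: it supposes standard Ethernet, IPv4 (no options), and UDP headers, and treats whatever remains as the RPC header.

```python
MAX_ETHERNET_PACKET = 1514  # bytes, from the slide
TOTAL_HEADER = 74           # bytes, from the slide

# Space left for the argument or result array.
payload = MAX_ETHERNET_PACKET - TOTAL_HEADER

# Assumption: standard header sizes for the lower layers.
ETHERNET_HEADER = 14
IPV4_HEADER = 20  # without options
UDP_HEADER = 8

# Under that assumption, the remainder would be the RPC header.
rpc_header = TOTAL_HEADER - (ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER)
```

So MaxResult and MaxArg each exercise the largest call that still fits in a single packet.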

Page 34: Results

Page 35: Comments on Result

• Base latency of RPC: 2.6 ms
• Notice how the throughput almost doubles when going from single thread to two threads
• Max bandwidth results with 6 threads: 4.7 Mbps
• 7 threads accomplish 741 RPCs/s
• Are these tests really a true indicator of RPC throughput?
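A back-of-the-envelope check on these numbers. It assumes a single thread issues Null() calls back to back at the base latency, that the 741 RPCs/s figure is for Null(), and that the 4.7 Mbps figure counts only the 1440-byte payload; none of these is stated on the slide.

```python
BASE_LATENCY_S = 2.6e-3  # base latency of one RPC, from the slide

# One thread issuing back-to-back calls (assumption).
single_thread_calls_per_s = 1 / BASE_LATENCY_S

# How much 7 threads gain over that single-thread rate.
speedup_7_threads = 741 / single_thread_calls_per_s

# Calls per second implied by 4.7 Mbps of 1440-byte payloads (assumption).
payload_bits = 1440 * 8
calls_per_s_at_peak = 4.7e6 / payload_bits
```

Under these assumptions a single thread manages roughly 385 calls/s, 7 threads a little under twice that, and the peak bandwidth corresponds to roughly 408 full-packet calls/s.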

Page 36: Notes on VARs

• VAR OUT: If the result fits in a single packet, then the server stub passes the address of the result in the result packet buffer to the server procedure, so that the server can directly write to the buffer (avoids copy)
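The copy-avoidance idea can be sketched with a `memoryview`, which lets the server procedure write straight into the packet buffer. The function names are hypothetical and the header is left as zero bytes; only the 1514 and 74 byte sizes come from the slides.

```python
PACKET_SIZE = 1514  # bytes, max Ethernet packet per the slides
HEADER_SIZE = 74    # bytes, header size per the slides

def max_result(buf: memoryview) -> None:
    """Hypothetical server procedure: fills its VAR OUT argument in place."""
    buf[:] = b"A" * len(buf)

def server_stub_max_result(packet: bytearray) -> bytearray:
    # Hand the server a view of the result area inside the packet itself,
    # so the result is never copied from a separate buffer.
    result_area = memoryview(packet)[HEADER_SIZE:]
    max_result(result_area)
    return packet

packet = bytearray(PACKET_SIZE)
reply = server_stub_max_result(packet)
```

The reply is the same object as the incoming packet, with the result already in place behind the header.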

Page 37: Marshalling Time

Marshalling time scales linearly with size for simple args

Notice the huge latency due to memory allocation and invoking library functions

Page 38: Steps in RPC

• Caller
  – Obtain a packet buffer with partially filled-in header, marshal arguments into the packet, transmit, unmarshal result from result packet, send packet back to pool
• Server
  – Unmarshal the call’s arguments, hand the arguments to the server component; when the server component returns, marshal the results in the result packet (the saved call packet)
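The caller and server steps can be sketched with a small packet pool. Everything here is illustrative: the pool class, the `add` call, and the big-endian integer layout are assumptions, and transmission is replaced by a direct function call.

```python
import struct

PACKET_SIZE = 1514
HEADER_SIZE = 74

class PacketPool:
    """Fixed set of reusable packet buffers (illustrative only)."""

    def __init__(self, count):
        self._free = [bytearray(PACKET_SIZE) for _ in range(count)]

    def obtain(self):
        return self._free.pop()

    def release(self, packet):
        self._free.append(packet)

    def available(self):
        return len(self._free)

def server_side(packet):
    # Unmarshal the arguments, run the procedure, then marshal the result
    # into the saved call packet instead of allocating a new one.
    a, b = struct.unpack_from("!ii", packet, HEADER_SIZE)
    struct.pack_into("!i", packet, HEADER_SIZE, a + b)
    return packet

def call_add(pool, a, b):
    packet = pool.obtain()                                   # obtain buffer
    try:
        struct.pack_into("!ii", packet, HEADER_SIZE, a, b)   # marshal args
        reply = server_side(packet)                          # stand-in for transmit
        (result,) = struct.unpack_from("!i", reply, HEADER_SIZE)  # unmarshal
        return result
    finally:
        pool.release(packet)                                 # back to the pool

pool = PacketPool(2)
answer = call_add(pool, 20, 22)
```

No buffer is allocated on the call path; the same packet carries the arguments out and the result back.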

Page 39: Transporter Mechanism

• Transporter
  – Fill RPC header in call packet
  – Call Sender - fills in other headers
  – Send packet on Ethernet (queue it, notify Ethernet controller)
  – Register outstanding call in RPC call table, wait for result packet (not part of RPC fast path)
• Packet-arrival interrupt on server
• Wake server thread - Receiver
• Return result (send+receive)

Page 40: Reducing Latency

• Usage of direct assignments rather than calling library procedures for marshalling
• Starter, Transporter and Ender through procedure variables not through table lookup
• Interrupt routine wakes up correct thread
  – OS doesn’t de-multiplex incoming packet
    • For Null(), going through OS (for both call and result) takes 4.5 ms
• Server reuses call packet for result
• RPC packet buffers reside in shared memory to eliminate the need for extra address mapping operations or copying when doing RPC.
  – Can be a security issue in a time shared environment

Page 41: Accounting for the time

Page 42:

Page 43:

• Write fast path code in assembly not in Modula-2+
  – Speeded up by a factor of 3
  – Application behavior unchanged

Page 44: Improvement Speculation

• Different Network Controller
  – Save 11% on Null() and 28% on MaxResult
• Faster Network – 100 Mbps Ethernet
  – Null – 4%, MaxResult – 18%
• Faster CPUs
  – Null – 52%, MaxResult – 36%
• Omit UDP checksums
  – Ethernet controller occasionally makes errors
• Redesign RPC Protocol

Page 45:

• Omit layering on IP and UDP
• Busy Wait – caller and server threads
  – Time for wakeup can be saved
• Recode RPC run-time routines

Page 46: Varying Numbers of Processors

Page 47: Comments

• Reducing caller processors from 5 to 2 increases the latency by only 10%
• Sharp jump in latency for uniprocessor caller
• Uniprocessors are not the focus of Firefly RPC
• On a uniprocessor, the fast path is often abandoned because of lock conflicts

Page 48: Conclusion

• Main Objective:
  – Make RPC primary communication mechanism between address spaces (inter machine and same machine)
• Make RPC fast so developers have no excuse not to use it
• Functions and characteristics of RPC maintained
• Not the fastest RPC in the game