Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the...

13
Distributed systems Lecture 8: PubSub; Security; NASD/AFS/Coda Dr Robert N. M. Watson 1 Last time Looked at replication in distributed systems Strong consistency: Approximately as if only one copy of object Requires considerable coordination on updates Transactional consistency & quorum systems Weak consistency: Allow clients to potentially read stale values Some guarantees can be provided (FIFO, eventual, session), but at additional cost to availability Amazon/Google case studies: Dynamo, MapReduce, BigTable, Spanner. 2

Transcript of Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the...

Page 1: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

DistributedsystemsLecture8:PubSub; Security;NASD/AFS/Coda

Dr RobertN.M.Watson

1

Lasttime• Lookedatreplicationindistributedsystems• Strongconsistency:– Approximatelyasifonlyonecopyofobject– Requiresconsiderablecoordinationonupdates– Transactionalconsistency&quorumsystems

• Weakconsistency:– Allowclientstopotentiallyreadstalevalues– Someguaranteescanbeprovided(FIFO,eventual,session),butatadditionalcosttoavailability

• Amazon/Googlecasestudies:Dynamo,MapReduce,BigTable,Spanner.

2

Page 2: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Publish-subscribe(PubSub)• Getmoreflexibilitywithpublish-subscribe:

– Publishers advertiseandpublishevents– Subscribers registerinterestintopics (i.e.propertiesofevents)– Event-service notifiessubscribersofrelevantpublishedevents

• Similartoreliablemulticast,withoutorderingfocus:– Asynchronousstructure– Allowsone-to-manycommunication– Dynamicmembership:publishers/subscribersjoining/leaving

• Sometimesdescribedascontent-centricnetworking– Engagesnotjusthosts,butalsonetworkrouters– Focusisondata,notnetworkmessaging– Reliability andpersistency partoftheprogrammingmodel

• IneffectthemodelbeingimplementedbymanyContentDistributionNetworks(CDNs)suchasAkami,Netflix

3

Publish-subscribe:prosandcons• PubSub usefulfor‘adhoc’systemssuchasembedded

systemsorsensornetworks:– Client(s)can‘listen’foroccasionalevents– Don’tneedtodefinesemanticsofentiresysteminadvance(e.g.

whattodoifgetevent<X>)– Promotedinrecentresearchforhigher-levelapplications

• Leadstonatural“reactive”programming:– When<X>,<Y>occurthendo<Z>– Event-drivensystemslikeApama canhelpunderstandbusiness

processesinreal-time• But:

– Canbeawkwardtouseifapplicationdoesn’tfit– Anddifficulttomakeperformwell…

4

Page 3: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Distributed-systemsecurity• Distributedsystemsspanadministrativedomains;contentfrom

manyusersandorganizations• Naturaltoextendauthentication,accesscontrol,audit,to

distributedsystem,butcanwe:– Distributelocalnotionsofauser overmanymachines?– Enforcesystem-widepropertiessuchaspersonaldataprivacy?– Allowsystemsoperatedbydifferentpartiestointeractsafely?– Notrequirethatnetworksbesafefrommonitoring/tampering?– Toleratecompromiseasubsetofnodesinthesystem?– Providereliableservicetomostusersevenwhenunderattack?– Acceptandtoleratenation-stateactorsasadversaries?

• Forasystemtooffersecureservices,itmust(itself)besecure– TrustedComputingBase(TCB)– theminimumsoftware(orhardware)

requiredforasystemtobesecure

5

Accesscontrol

• Distributedsystemsmaywanttoallowaccesstoresourcesbasedonasecuritypolicy

• Aswithlocalsystems,threekeyconcepts:– Identification:whoyouare(e.g.username)– Authentication:provingwhoyouare(e.g.password)– Authorization:determiningwhatyoucando

• Canconsiderauthoritytocoveractionsanauthenticatedsubjectmayperformonobjects– AccessMatrix =setofrows,onepersubject,whereeachcolumnholdsallowedoperationsonsomeobject

6

Page 4: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Recall:access-controlmatrix

• A(i,j)– Rowsrepresentprincipals(sometimesgroups)– Columnsrepresentobjects– Cell(i,j)containaccessrightsofrowi onobjectj

• Accessmatrixistypicallylarge&sparse:– Justkeepnon-NULLentriesbycolumnorbyrow

7

Object1 Object2 Object3 …

User1 +read

User2 +read +write +read

Group1 -read +read+write…

Accesscontrollists(ACLs)

• Keepcolumns:foreachobject,keeplistofsubjectsandallowableaccess

• ACLsstoredwithobjects(e.g.localfilesystem)• Keyprimitives:get/set• Likeaguestlistonthedoorofanightclub• ACLchangeshould(arguably)immediatelygrant/denyfurtheraccess–Whatdoesthismeanfordistributedsystems?

8

Page 5: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Capabilities

• Capabilitiesareunforgeable tokensofauthority– Keeprows:foreachsubjectS,keeplistofobjects/allowableaccesses

– Capabilitiesstoredwithsubjects(e.g.processes)– Bitlikeakeyoraccesscardthatyoucarryaround

• Keyprimitive:delegation– Clientcandelegatecapabilitiesitholdstootherclients(orservers)inthesystemtoactonitsbehalf

– Downside:revocationmaynowbemorecomplex

9

Accesscontrolindistributedsystems

• Singlesystemsoftenhavesmallnumberofusers(subjects)andlargenumberofobjects:– E.g.ahundredofusersinaUnixsystem– Tracksubjects(e.g.userIDs)andstoreACLswithobjects(e.g.files)

• Distributedsystemsarelarge&dynamic:– Canhavehuge(andunknown?)numberofusers– Interactionsvianetwork– noexplicit‘login’orper-userprocess

• Capabilitymodelisamorenaturalfit:– Clientpresentscapabilitywithrequestforoperation– Systemonlyperformsoperationifcapabilitychecksout– AvoidsynchronousRPCstocheckidentities/access-controlpolicies

• Notmutuallyexclusive:ACLsasapolicyforgrantingcapabilities• Can’ttrustnodesorlinks:relyoncryptographywithsecretkeys

10

Page 6: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Cryptographiccapabilities• Howcanwemakecapabilitiesunforgeable?• Capabilityservercouldissuecapabilities

– Userpresentscredentials(e.g.,username,password)andrequestscapabilitiesrepresentingspecificrights

– e.g.capabilityserverhassecretkeyk andaone-wayfunctionf()– Issuesacapability<ObjID,access,f(k,ObjID,access)>– Simpleexampleisf(k,o,a)=SHA256(k|o|a)

• Clienttransmitscapabilitywithrequest– Ifobjectserverknowsk,cancheckoperation

• Canusesamecapabilitytoaccessmanyservers– Andoneservercanuseitonyourbehalf(e.g.,webtiercan

requestobjectsfromstoragetieronuser’sbehalf)• Morematureschememightusepublickeycrypto(why?)

11

Distributedcapabilityexample:NASD

• Network-AttachedSecureDisks(NASD)– Gibson,etal1997(CMU)• Clientsaccessremotedirectlydisksratherthanviathroughservers• “FileManager”grantsclientsystemscapabilitiesdelegatingdirect

accesstoobjectsonnetwork-attacheddisks12

BlockServer

Client

FileManager

FileManageraccounts:UserID1,PW1UserID2,PW2…

BlockServer

2.Clientenclosescapabilitywithrequesttoauthorizeit

1.Clientexchangescredentialsforcryptographiccapabilitytoobject

FileManagerandBlock

Serveragreeonsecretk

Page 7: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Capabilities:prosandcons• Relativelysimpleandprettyscalable• Allowanonymousaccess(i.e.serverdoesnotneedtoknowidentityofclient)– Andhenceeasilyallowsdelegation

• Howeverthisalsomeans:– Capabilitiescanbestolen(unauthorizedusers)…– …andaredifficulttorevoke(likesomeonecuttingacopyofyourhousekey)

• Canaddresstheseproblemsby:– Havingtime-limitedvalidity(e.g.30seconds)– Incorporatingversionintocapability,andstoringversionwiththeobject:increasingversion=>revokeallaccess

13

CombiningACLsandcapabilities

• RecalloneproblemwithACLswasinabilitytoscaletolargenumberofusers(subjects)

• Howeverinpracticewemayhaveasmall-ishnumberofauthoritylevels– e.g.moderatorversuscontributoronchatsite

• Role-BasedAccessControl(RBAC):– Have(small-ish)well-definednumberofroles– StoreACLsatobjectsbasedonroles– Allowsubjectstoenter rolesaccordingtosomerules– Issuecapabilitieswhichattesttocurrentrole

14

Page 8: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Role-basedaccesscontrol(RBAC)• Generalideaisverypowerful– Separates{principal→role },{role →privilege }– Developersofindividualservicesonlyneedtofocusontherightsassociatedwitharole

– Easilyhandlesevolution(e.g.anindividualmovesfrombeinganundergraduatetoanalumnus)

• Possibletohavesophisticatedrulesforroleentry:– e.g.enterdifferentroleaccordingtotimeofday– orentirerolehierarchy(1Bstudent<=CSTstudent)– orparametric/complexroles(“thedoctorwhoiscurrentlytreatingyou”)

15

Single-systemsignon• Distributedsystemsinvolvemanymachines

– Frustratingtohavetoauthenticatetoeachone!• Single-systemsignoneasesuserburdenwhilemaintainingsecurity

– E.g.Kerberos,MicrosoftActiveDirectoryletyouauthenticatetoasingledomaincontroller

– Bootstrapusingapasswordorprivatekey/certificateonsmartcard– Getasessionkeyandaticket (~=acapability)– Ticketisforaccesstotheticket-grantingserver(TGS)– Whenwishtoe.g.logontoanothermachine,oraccessaremote

volume,s/wasksTGS foraticketforthatresource– Notice:principalsmightcouldbeusers…orservices

• Otherwide-area“federated”schemes– Multi-realmKerberos,OpenID,Shibboleth

16

Page 9: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

AFSandCoda• TwoCMUdistributedfilesystemsthathelpedcreateourunderstandingofdistributed-systemscalability– AFS:AndrewFileSystem“campus-wide”scalability– Coda:Addwritereplication,weaklyconnectedorfullydisconnectedoperationformobileclients

• Scaledistributedfilesystemstoglobalscaleusingamaturesetofconcurrentanddistributed-systemideas

• RPC,close-to-opensemantics,pureandimpurenames,explicitcachemanagement,security,versionvectors,optimisticconcurrency,multicast,journaling,…

17

TheAndrewFileSystem(AFS)• CarnegieMellonUniversity(1980s)addressperformance,

scalability,securityweaknessesofNFS• Global-scaledistributedfilesystem

– /afs/cs.cmu.edu/user/rnw,/afs/ibm.com/public– Cells transparentlyincorporatedozensorhundredsofservers– Clientstransparentlymergenamespacesandhidefile

replication/migration– Authentication/accesscontrolw/Kerberos,groupservers– Cryptographicprotectionofallcommunications– Maturenon-POSIXfilesystemsemantics(close-to-open,ACLs)

• Stillinuseatlargeinstitutionstoday;opensourcedasOpenAFS• InspirationmanyaspectsofDistributedComputingEnvironment

(DCE),Microsoft’sDistributedFileSystem(DFS),andNFSv4

18

Page 10: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Fileserver pool

Ubik quorumdatabases

AFS3per-cellarchitecture• Client-serverandserver-serverRPC• Ubik quorumdatabaseforauthentication,

volumelocation,andgroupmembership• Namespacepartitionedintovolumes;e.g.,

/afs/cmu.edu/user/rnw/public_htmltraversesfourvolumes

• UniqueViceIDs:{CellID,VolumeID,FID}• Volumeserversallowlimitedredundancy

orhigher-performancebulkfileI/O:– read-writeonasingleserver (~rnw)– read-onlyreplicasonmultipleservers (/bin)

• Inter-serversnapshottingallowsin-usevolumestobemigrated(withclienthelp)

19

DBserver

DBserver DBserver

FileserverFileserver

Fileserver

Fileserver

Localcachefiles

Persistentclient-sidecachinginAFS

• AFSimplementspersistentcachesonclient-sidedisks• Vnode operationsonremotefilesareredirectedtolocalcontainerfilesforlocalI/Operformance

• Non-POSIXClose-to-opensemanticsallowwritestobesenttotheserveronlyonclose()

20

Synthesized/afs namespace

a

/

afsandrew.cmu.edu

athena.mit.edu

usr cache

b

c

ab

c

Page 11: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

AFScallbackpromises

• Serversissuecallbackpromisesonfilesheldinclientcaches• Whenafileserverreceivesawrite-close()fromoneclient,it

issuescallbacks toinvalidatecopiesinotherclientcaches• UnlikeNFS,nosynchronousRPCisrequiredwhenopeninga

cachedfile:ifcallbackisnotbeenbroken,cacheisfresh• However,clientwrite-close()issynchronous:can’treturn

untilcallbacksacknowledgedbyotherclients– why?21

Client1

Client2

Client3

Fileserver

a2

a1

a1

a2

TheCodaFileSystem• DevelopedatCarnegieMellonUniversityinthe1990s• Startingpoint:open-sourcedAFS2fromIBM• Improveavailability throughoptimisticreplicationandclient-sidecaching/journaling:– Improveavailabilitythroughread-writereplication– Improveperformanceforweaklyconnectedclients– Supportmobile(sometimes)fullydisconnectedclients

• Exploitnewnetworkfeaturestoimproveperformance:– MulticastRPCtoefficientlysendRPCstogroupsofservers

• Keydesignchallenge:tradeoffexposingweakconsistencytouserinreturnforavailability

22

Page 12: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Codaread-writeserverreplication• VolumeStorageGroups(VSGs)ratherthanper-volumeservers• Eachfilehasaversionvector

– Likeavectorclockonlyper-object ratherthanperprocess– Eachvectorentrycorresponds tooneVSGserver’sversionofthefile

• ReachableVSGsubsetistheAccessibleVolumeStorageGroup(AVSG)• Clientsreadfromanyserver,multicastwrites toall

– Whenfullyonline (AVSG=VSG),close()issynchronous;writesordered– Onpartition/serveroutage(AVSG� VSG),writesarestillpermitted– Asserversrecover,clientaccesstriggersserver-serverresolution– Ifversionvectorsallowcausalordertobeestablished,automaticresolution– Mostnon-causaldirectoryconflictscanbeautomaticallyresolved(why?)– Forfiles,user-directedorapplication-specificconflictresolutionisrequired

• Whatifauserisaskedtoresolveaconflictonafiletheydidn’tmodify?

23

Codadisconnectedoperation• Mobilecomputing– butdeviceshadweakorintermittentconnectivity• Codaallowedclientoperationstocontinueagainstthepersistentcache

evenwhenoperatingdisconnected(AVSG=�)• Hoarding:priortogoingoffline,userscanprovideCodawithpolicyasto

whichfilesshouldbepreemptivelyloadedintothecache(e.g.,user~)• OfflinewritesareloggedintheClientModificationLog(CML)

– Whengoingbackonline, CMLisreplayedagainstAVSG (reintegration)– CMLoptimizationdeletesNOPsequences:e.g.,create+delete atempfile– Client-serverconflicts,aswithserver-server,aredetectedviaversionvectors– User/applicationmusthandleconflictsthatcan’tberesolvedautomatically– Isthisbetterthantheserver-serverconflictresolution case?

• IfEthernetunplugged,softwarebuildsgofaster– why?– Clevertrickforweaklyconnectedclients:ifnetworkisbottleneck,takevolume

offlineandlogchanges,trickling thembackasynchronouslyuntil caughtup• TheseideashaveinfluencedsystemslikeMicrosoft’s“offlinefolders”

24

Page 13: Distributed systems - University of Cambridge...– Reliabilityand persistencypart of the programming model • In effect the model being implemented by many Content Distribution Networks

Summary(1)• Distributedsystemsareeverywhere• Coreproblemsinclude:– Inherentlyconcurrentsystems– Anymachinecanfail…– …ascanthenetwork(orpartsofit)– Andwehavenonotionofglobaltime

• Despitethis,wecanbuildsystemsthatwork– Basicinteractionsarerequest-response– CanbuildsynchronousRPC/RMIontopofthis…– Orasynchronousmessagequeuesorpub/sub

25

Summary(2)

• Coordinatingactionsoflargersetsofcomputersrequireshigher-levelabstractions– Processgroupsandorderedmulticast– Consensusprotocols,and– ReplicationandConsistency

• Variousmiddlewarepackages(e.g.CORBA,EJB)provideimplementationsofmanyofthese:– Butworthknowingwhat’sgoingon“underthehood”

• Recenttrendstowardsevenhigher-level:– MapReduce andfriends

26