Content Delivery Infrastructure - web.eecs.umich.eduCDN.pdfContent Delivery Infrastructure ......
Computer Networks
Lecture 9: HTTP
Content Delivery Infrastructure
Peer-to-peer (p2p):
• hybrid p2p with a centralized server
• pure p2p
• hierarchical p2p
• end-host (p2p) multicast
Content-Distribution Network (CDN)
• HTTP Overview
• HTTP Performance
• HTTP Caching
• Content Distribution Network
A Web Page
A web page consists of a base HTML file which may include references to one or more objects
• an object can be another HTML file, a JPEG image, a Java applet, an audio file, a flash video, etc.
• each object is addressable by a URL
• example URL: http://www.mgoblue.com/images/pic.gif
  • protocol: http
  • host name: www.mgoblue.com
  • path name: /images/pic.gif
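The three URL parts can be pulled apart with Python's standard library. This is only an illustrative sketch using the example URL from the slide; the variable names are arbitrary.

```python
from urllib.parse import urlsplit

# Example URL taken from the slide.
url = "http://www.mgoblue.com/images/pic.gif"
parts = urlsplit(url)

print(parts.scheme)    # protocol
print(parts.hostname)  # host name
print(parts.path)      # path name
```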
HTTP Overview
HTTP: HyperText Transfer Protocol
• Web’s application-layer protocol
• client/server model
  • client: browser that requests, receives, and “displays” Web objects
  • server: sends objects in response to requests
• HTTP 1.0: RFC 1945
• HTTP 1.1: RFC 2068
• HTTP/2: RFC 7540 (May 2015)
[Figure: clients (PC running Firefox, Mac running Safari) and a server running Apache Web server]
HTTP Overview
Uses TCP:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed
HTTP is “stateless”
• server maintains no information about past client requests
Aside: protocols that maintain “state” are complex!
• past history (state) must be maintained
• if server/client crashes, their views of “state” may be inconsistent, and must be reconciled
HTTP 1.x Request Message
Two types of HTTP messages: request, response
HTTP request message:
• in ASCII (human-readable format)
• general format, by example:
    GET /somedir/page.html HTTP/1.1
    Host: www.someschool.edu
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language: fr
    (extra carriage return, line feed)
• the first line is the request line, followed by header lines
• the extra carriage return, line feed (an empty line) indicates end of message
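To make the line-ending convention concrete, here is a minimal sketch that assembles the example request as raw bytes. The helper name build_get_request is hypothetical, not part of any library; the header values are the ones shown in the slide.

```python
def build_get_request(host, path, extra_headers=None):
    """Assemble an HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; one extra CRLF (an empty line) ends the message.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request(
    "www.someschool.edu",
    "/somedir/page.html",
    {"User-agent": "Mozilla/4.0", "Connection": "close", "Accept-language": "fr"},
)
print(request.decode("ascii"))
```

The bytes could be written to a TCP socket connected to port 80 of the server; that step is left out so the sketch runs without a network.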
Method Types (HTTP 1.1)
• GET, POST, HEAD
• PUT
  • uploads file in entity body to path specified in URL field
• DELETE
  • deletes file specified in the URL field
Uploading form input, alternatives:
1. POST method:
  • web pages often include form input
  • input is uploaded to server in entity body
2. as parameter to GET URL method:
  • input is uploaded in URL field of request line: www.somesite.com/animalsearch?monkeys&banana
  • here monkeys&banana are the input parameters
HTTP 1.x Response Message Example
first line: status line (protocol, status code, status phrase)
    HTTP/1.1 200 OK
header lines
    Connection: close
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 …...
    Content-Length: 6821
    Content-Type: text/html
blank line
data, e.g., requested HTML file
    data data data data data ...
HTTP 1.x Response: Status Line
HTTP-version 3-digit-response-code Reason-phrase
• 1XX – informational
• 2XX – success
  • 200 OK: request succeeded, requested object later in this message
• 3XX – redirection
  • 301 Moved Permanently: requested object moved, new location specified later in this message (“Location:” in header)
  • 302 Found (called “Moved Temporarily” in HTTP 1.0)
  • 304 Not Modified
• 4XX – client error
  • 400 Bad Request: request message not understood by server
  • 404 Not Found: requested document not found on this server
• 5XX – server error
  • 505 HTTP Version Not Supported
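The status-line layout and the five response classes can be illustrated with a small parser. This is a sketch only; parse_status_line and the STATUS_CLASSES table are hypothetical names introduced for the example.

```python
# First digit of the 3-digit response code -> response class.
STATUS_CLASSES = {
    "1": "informational",
    "2": "success",
    "3": "redirection",
    "4": "client error",
    "5": "server error",
}


def parse_status_line(line):
    """Split 'HTTP-version code reason-phrase' into its three fields plus the class."""
    # Split on the first two spaces only, since the reason phrase may contain spaces.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason, STATUS_CLASSES.get(code[0], "unknown")


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```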
Client-side States: Cookies
HTTP is “stateless”
• server maintains no information about past client requests
• but sometimes it may be useful to keep per-client states, for example for:
  • authorization
  • shopping carts
  • wish list
  • recommendations
  • user session state (Web e-mail)
States or user ID (to look up server-side states) kept at client side using cookies
Four components:
1. cookie header line in the HTTP response message
2. cookie header line in HTTP request message
3. cookie file kept on client host and managed by client browser
4. back-end database at Web server
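[Figure: cookie exchange between client and Amazon server]
• client’s cookie file initially holds ebay: 8734
• client sends usual http request msg; server creates ID 1678 for user
• server returns usual http response + Set-cookie: 1678; client’s cookie file now holds amazon: 1678, ebay: 8734
• client sends usual http request msg with cookie: 1678; server takes cookie-specific action, e.g., wish list, and returns usual http response msg
• one week later: client sends usual http request msg with cookie: 1678; server takes cookie-specific action and returns usual http response msg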
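The four cookie components can be mimicked in a few lines. The sketch below is a toy model with no real networking: the Server and Browser classes and the "visits" counter are hypothetical, and header names are spelled as on the slide.

```python
import itertools


class Server:
    """Toy Web server with a back-end database keyed by cookie ID."""

    def __init__(self, first_id=1678):
        self._ids = itertools.count(first_id)
        self.backend = {}  # back-end database: user ID -> per-user state

    def handle(self, request_headers):
        response_headers = {}
        cookie = request_headers.get("cookie")
        if cookie is None or cookie not in self.backend:
            # First visit: create an ID and hand it to the client.
            cookie = str(next(self._ids))
            self.backend[cookie] = {"visits": 0}
            response_headers["Set-cookie"] = cookie
        # Cookie-specific action: here we only count visits.
        self.backend[cookie]["visits"] += 1
        return response_headers


class Browser:
    """Toy client that keeps a cookie file, one entry per site."""

    def __init__(self):
        self.cookie_file = {"ebay": "8734"}

    def visit(self, site, server):
        headers = {}
        if site in self.cookie_file:
            headers["cookie"] = self.cookie_file[site]
        response = server.handle(headers)
        if "Set-cookie" in response:
            self.cookie_file[site] = response["Set-cookie"]
        return response


amazon = Server()
browser = Browser()
first = browser.visit("amazon", amazon)   # server assigns an ID
second = browser.visit("amazon", amazon)  # client presents the ID
print(first, second, browser.cookie_file)
```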
“Abuse” of Cookies
Excellent marketing opportunities and concerns for privacy:
• cookies permit sites to learn a lot about you
• you may unknowingly supply personal info to sites
• advertising companies track your preferences and viewing history across sites, example scenario:
  • ad company contracted with (1) a social networking site, (2) a bookstore, and (3) a clothing store
  • you view your friend’s travel photos to Hawaii at the social networking site
  • when you visit the bookstore, a travel book about Hawaii is pushed to you
  • when you visit the clothing store, a pair of swimming goggles is pushed to you
  • at all three places a travel agency’s extra-low price, expiring in 30 seconds, Hawaii vacation package is pushed to you
Object Request Response Time
RTT (round-trip time): time for a small packet to travel from client to server and back
Response time:
• 1 RTT to initiate TCP connection
• 1 RTT for HTTP request and the first few bytes of HTTP response to return
• file transmission time
• total = 2 RTT + transmit time
[Figure: client/server timeline: initiate TCP connection (1 RTT), request file (1 RTT), time to transmit file, file received]
HTTP 1.0
HTTP 1.0 uses non-persistent connections:
• at most one object is sent over a TCP connection
• object transmission completion detected by recv() returning 0 (connection closed)
• why is this not a good design?
[Figure: client/server timeline from 0 RTT to 4 RTT with SYN, ACK, DAT, and FIN segments: client opens TCP connection; client sends HTTP request for HTML; server reads from disk; client parses HTML; client opens TCP connection; client sends HTTP request for image; server reads from disk; image begins to arrive]
HTTP 1.1
HTTP 1.1 uses persistent connections:
• server leaves connection open after sending responses
• subsequent HTTP messages between the same client/server to fetch multiple objects are sent over the same connection
[Figure: client/server timeline from 0 RTT to 2 RTT with ACK and DAT segments: client sends HTTP request for HTML; server reads from disk; client parses HTML; client sends HTTP request for image; server reads from disk; image begins to arrive]
How to Mark End of Message?
Three options:
1. Content-Length in header:
    HTTP/1.1 200 OK
    Connection: close
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 …...
    Content-Length: 6821
    Content-Type: text/html
2. Implied length, e.g., 304 (cache fresh) never has content
3. Transfer-Encoding: chunked (HTTP 1.1)
  • after headers, each chunk comprises content length in hex, CRLF, then body, CRLF; a chunk of length 0 indicates end of message
    HTTP/1.1 200 OK<CRLF>
    Transfer-Encoding: chunked<CRLF>
    <CRLF>
    23<CRLF>
    This is the data in the first chunk<CRLF>
    1A<CRLF>
    and this is the second one<CRLF>
    0<CRLF>
    <CRLF>
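A short encoder/decoder pair may help make the chunked framing concrete. This is a minimal sketch that ignores trailers and does no error handling; encode_chunked and decode_chunked are hypothetical helper names. The sizes are computed from the payloads, so the first chunk (35 bytes) is framed as 23 in hex and the second (26 bytes) as 1A.

```python
def encode_chunked(chunks):
    """Frame each chunk as: size in hex, CRLF, data, CRLF; finish with a 0-size chunk."""
    out = b""
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would be read as the end marker
        out += format(len(chunk), "X").encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"


def decode_chunked(body):
    """Reassemble the payload from a chunked body (no trailer support)."""
    data = b""
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line may carry ';extension' parameters; ignore them.
        size = int(body[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return data
        data += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"This is the data in the first chunk", b"and this is the second one"])
print(wire)
print(decode_chunked(wire))
```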
Pipelined and Parallel Connections
Persistent without pipelining:
• client issues new request only when previous response has been received
• one RTT for each referenced object
Persistent with pipelining:
• client sends requests as soon as it encounters a referenced object
• as little as one RTT for all referenced objects
• default in HTTP 1.1
Browsers can open parallel TCP connections to fetch referenced objects (even in HTTP 1.0)
HTTP Modeling
Assume Web page consists of:
• 1 base HTML page (of size L bits)
• M images (each also of size L bits)
Non-persistent HTTP:
• M+1 TCP connections in series
• response time = (M+1)*2*RTT + (M+1)*L/µ, where µ is the path speed
Persistent HTTP (with pipelining):
• 2 RTTs to request and receive base HTML file
• 1 RTT to request and receive M images
• response time = 3*RTT + (M+1)*L/µ
HTTP Modeling
Assume Web page consists of:
• 1 base HTML page (of size L bits)
• M images (each also of size L bits)
Non-persistent HTTP with n parallel connections
• suppose n divides M evenly
• 1 TCP connection for base file
• M/n rounds of n parallel connections for images
• n-parallel response time = (M/n + 1)*2*RTT + (M/n + 1)*L/µ
compare:
• non-persistent response time = (M+1)*2*RTT + (M+1)*L/µ
• persistent response time = 3*RTT + (M+1)*L/µ
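The three response-time formulas can be evaluated side by side. The sketch below uses the parameter values this lecture uses for its plots (RTT = 100 msec, L = 5 Kbytes, M = 10, n = 5). The path speed µ = 1 Mbps and the conversion 1 Kbyte = 1000 bytes are assumptions made only for this example, and the function names are hypothetical.

```python
def non_persistent(M, L, rtt, mu):
    # M+1 connections in series, each costing 2 RTTs plus one object transmission.
    return (M + 1) * 2 * rtt + (M + 1) * L / mu


def persistent_pipelined(M, L, rtt, mu):
    # 2 RTTs for the base file, 1 RTT for all pipelined image requests.
    return 3 * rtt + (M + 1) * L / mu


def parallel_non_persistent(M, L, rtt, mu, n):
    # Assumes n divides M evenly: M/n rounds of n parallel connections.
    rounds = M / n
    return (rounds + 1) * 2 * rtt + (rounds + 1) * L / mu


RTT = 0.1            # seconds
L = 5 * 1000 * 8     # bits, assuming 1 Kbyte = 1000 bytes
M, n = 10, 5
MU = 1_000_000       # bits/sec, assumed path speed

t_np = non_persistent(M, L, RTT, MU)
t_pp = persistent_pipelined(M, L, RTT, MU)
t_par = parallel_non_persistent(M, L, RTT, MU, n)
print(f"non-persistent: {t_np:.2f} s, persistent: {t_pp:.2f} s, {n}-parallel: {t_par:.2f} s")
```

Under these assumed values the persistent and parallel variants come out close to each other, and both are well below the serial non-persistent case.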
HTTP Response time (in seconds)
[Plot: HTTP response time (in seconds) for RTT = 100 msec, L = 5 Kbytes, M = 10, and n = 5]
For low bandwidth, transmission time dominates over connection and response time ⇒ performance of persistent connections comparable to that of parallel connections
HTTP Response time (in seconds)
[Plot: HTTP response time (in seconds) for RTT = 1 sec, L = 5 Kbytes, M = 10, and n = 5]
For larger RTT, TCP establishment and slow start delays dominate over response time ⇒ persistent connections now give significant improvement, particularly in high bandwidth × delay networks
HTTP/2
Based on Google’s SPDY (2009)
RFC 7540 came out in May 2015 (written by the two authors of SPDY)
Chrome browser already has SPDY built-in
Problems with HTTP 1.x:
• pipelining still suffers from head-of-line blocking (if the first item is large, the rest have to wait)
• parallel streams solve HoL blocking, but on a bandwidth-limited channel, too many streams clog up the channel
HTTP/2
Some changes from 1.1:
• headers no longer in text format
• separate control and data headers
• stream multiplexing over a single TCP connection:
  • each stream has an ID, data is tagged with stream ID
  • each stream can also have different priority
• server push: don’t have to wait for client to parse page before initiating download
• header compression
Performance improvement: up to 64% reduction in page load time [Grigorik]
Web Caches (Proxy Server)
Goal: satisfy client request without involving origin server
• user sets browser to direct all web accesses via cache
• browser sends all HTTP requests to cache
  • if object is not cached, cache requests object from origin server, then returns object to client
  • else cache returns object
• cache acts as both client and server
• typically cache is installed by ISP (university, company, residential ISP)
• must be transparent, allow for plug-n-play
[Figure: clients, proxy server, and origin server]
Web Caching Example: No Caching
Parameters:
• average object size = 100,000 bits
• avg. # of requests to servers = 15/sec
• Internet latency between a router on the public Internet and any server = 2 secs
Resulting performance:
• utilization on LAN = 15%
• utilization on access link = 100% ⇒ over-utilized link causes long queue (delay of minutes)
• total delay = Internet delay + access delay + LAN delay = 2 secs + minutes + milliseconds
[Figure: origin servers on the public Internet; institutional network with 10 Mbps LAN; 1.5 Mbps access link]
Web Caching Example: No Caching
Possible solution
• increase access link bandwidth to, say, 10 Mbps (often a costly upgrade)
Performance:
• utilization on LAN = 15%
• utilization on access link = 15%
• total delay = Internet delay + access delay + LAN delay = 2 secs + msecs + msecs
[Figure: origin servers on the public Internet; institutional network with 10 Mbps LAN; 10 Mbps access link]
Web Caching Example: With Caching
Another solution: install cache
• assume hit rate of 0.4
Performance:
• 40% requests will be satisfied almost immediately
• 60% requests satisfied by origin server
• utilization of access link reduced to 60%, resulting in negligible delays (say 10 msecs)
• avg. total delay = Internet delay + access delay + LAN delay = .6*(2.01) secs + msecs < 1.4 secs
[Figure: origin servers on the public Internet; institutional network with 10 Mbps LAN and cache; 1.5 Mbps access link]
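The utilization and delay figures in the caching example follow from simple arithmetic, sketched below with the example's own parameters. The helper name utilization is hypothetical, and the millisecond-scale LAN delay is left out of the average, as in the bound given above.

```python
def utilization(requests_per_sec, object_bits, link_bps):
    """Offered load divided by link capacity."""
    return requests_per_sec * object_bits / link_bps


REQ_RATE = 15          # requests/sec
OBJ_SIZE = 100_000     # bits per object
LAN_BPS = 10_000_000   # 10 Mbps LAN
ACCESS_BPS = 1_500_000 # 1.5 Mbps access link
HIT_RATE = 0.4

lan_util = utilization(REQ_RATE, OBJ_SIZE, LAN_BPS)
access_util = utilization(REQ_RATE, OBJ_SIZE, ACCESS_BPS)

# With a cache, only misses cross the access link.
miss_rate = 1 - HIT_RATE
access_util_cached = utilization(miss_rate * REQ_RATE, OBJ_SIZE, ACCESS_BPS)

# Misses pay 2 secs Internet delay plus about 10 msecs access delay; hits are treated as free.
avg_delay = miss_rate * (2 + 0.01)

print(f"LAN {lan_util:.0%}, access link {access_util:.0%}, "
      f"access link with cache {access_util_cached:.0%}, avg delay {avg_delay:.3f} s")
```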
Conditional GET
Goal: don’t send object if cache has up-to-date version
• cache: specifies date of cached copy in HTTP request
    If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
    HTTP/1.0 304 Not Modified
May be used with or without TTL; TTL is hard to set, as it depends on site content
[Figure: exchange between cache and server]
• object not modified: cache sends HTTP request msg with If-modified-since: <date>; server returns HTTP response HTTP/1.0 304 Not Modified
• object modified: cache sends HTTP request msg with If-modified-since: <date>; server returns HTTP response HTTP/1.0 200 OK <data>
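The two cases of the conditional GET exchange can be simulated without a network. This is a toy sketch: the OriginServer and Cache classes, the object bodies, and the dates are all hypothetical, and the header name is spelled as on the slide.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


class OriginServer:
    """Toy origin server holding a single object and its modification time."""

    def __init__(self, body, last_modified):
        self.body = body
        self.last_modified = last_modified  # timezone-aware UTC datetime

    def get(self, headers):
        since = headers.get("If-modified-since")
        if since is not None and self.last_modified <= parsedate_to_datetime(since):
            return 304, {}, b""  # cached copy is up to date: no object in the response
        stamp = format_datetime(self.last_modified, usegmt=True)
        return 200, {"Last-Modified": stamp}, self.body


class Cache:
    """Toy cache that revalidates its copy on every fetch."""

    def __init__(self, origin):
        self.origin = origin
        self.copy = None  # (body, Last-Modified string) once populated

    def fetch(self):
        headers = {}
        if self.copy is not None:
            headers["If-modified-since"] = self.copy[1]
        status, response_headers, body = self.origin.get(headers)
        if status == 200:
            self.copy = (body, response_headers["Last-Modified"])
        return status, self.copy[0]


# Hypothetical object and dates, chosen only for illustration.
origin = OriginServer(b"version 1", datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc))
cache = Cache(origin)
r1 = cache.fetch()  # cold cache: full response
r2 = cache.fetch()  # object not modified
origin.body = b"version 2"
origin.last_modified = datetime(2024, 2, 1, 12, 0, 0, tzinfo=timezone.utc)
r3 = cache.fetch()  # object modified
print(r1, r2, r3)
```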
Cooperative Caching
Multiple caches may form a distributed cache
Instead of going directly to origin server, a cache may query one or more other caches for object first, e.g., cse cache queries ece cache first
To eliminate frequent inter-cache query-reply, each cache may push an index of its contents to other caches, i.e., ece cache tells cse cache all the objects it is holding
Frequently, this “index” is in the form of a Bloom Filter
[Figure: cse and ece networks, each with its own cache (cse cache, ece cache), connected through the public Internet to origin servers]
Bloom Filter
An efficient, lossy way of describing a set, comprising:
• a bit vector of length w
• a family of independent hash functions
  • each maps an element of the set to an integer in [0, w)
To insert an element:
• for each hash function, set the bit the element hashes to
To search for an element:
• for each hash function, examine the bit the element hashes to
• if any bit is not set, the element is definitely not in the set
• if all the bits are set, the element may be in the set (potential for false positive)
[Figure: bit vector illustrating one insert and two searches]
Bloom Filter
The false positive rate is a well-defined (though not linear) function of:
1. width (w),
2. the number of hash functions, and
3. the number of elements in the set
• wider filters are always more accurate
• optimal tradeoff between filter storage and accuracy is when about half of the bits are set
Bloom Filters are also useful in maintaining p2p supernode backbone and distributed storage in data center network
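A minimal Bloom filter along the lines described above might look like the sketch below. Deriving the k hash functions by salting SHA-256 is a choice made for this example, not something the lecture prescribes, and the class and function names are hypothetical. The false_positive_rate function uses the standard approximation (1 - e^(-kn/w))^k for k hash functions and n inserted elements.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, w, k):
        self.w = w                # width of the bit vector
        self.k = k                # number of hash functions
        self.bits = [False] * w

    def _positions(self, item):
        # Derive k hash values by salting SHA-256 with the function index (an assumption).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.w

    def insert(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))


def false_positive_rate(w, k, n):
    """Standard approximation for a filter of width w, k hashes, n elements."""
    return (1 - math.exp(-k * n / w)) ** k


ece_index = BloomFilter(w=1024, k=3)
for name in ["a.gif", "b.html", "c.mp4"]:
    ece_index.insert(name)
print(ece_index.might_contain("a.gif"))
print(false_positive_rate(1024, 3, 100), false_positive_rate(2048, 3, 100))
```

In the cooperative caching setting, a cache would push a filter such as ece_index to its peers instead of the full list of objects it holds.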
Variable Delay
[Figure: stages of a request: browser cache, DNS resolution, TCP open, 1st byte response, last byte response]
Sources of variability of delay
• browser cache hit/miss, need for cache revalidation
• DNS cache hit/miss, multiple DNS servers, errors
• TCP handshake, packet loss, high RTT, server accept queue
• RTT, busy server, CPU overhead (e.g., CGI script)
• response size, receive buffer size, congestion
Limitations of Web Caching
Significant fraction (>50%) of HTTP objects are not cacheable
Why not?
• dynamic data: stock prices, scores, web cams
• scripts: results based on passed parameters
• use of cookies: results may be based on passed data
• advertising/analytics: owner wants to measure # hits
  • random strings in content ensure unique counting
• HTTPS: encrypted data is not cacheable
• multimedia: object larger than cache or not allowed to be cached due to intellectual property rights
How to ensure scalability of web server when content is not cacheable?
Content Distribution Networks (CDNs)
Streaming large files (e.g., video) from a single origin server in real time requires a large amount of bandwidth
Solution: replicate content to hundreds of servers throughout the Internet
• place servers in edge/access network
• content pre-downloaded to servers
• when user downloads content, direct user to the server closest to it
• placing content “close to” user avoids network delay and loss of long paths
[Figure: origin server in N. America feeds a CDN distribution node, which feeds CDN servers in S. America, Europe, and Asia]
CDNs vs. Content Owners
Maintaining your own network of such servers is expensive (both CAPEX and OPEX)
CDN providers maintain a network of servers and sell content replication service to multiple content owners
• example of content owners: ABC, HBO, Netflix
• example of CDN providers: Akamai, Limelight
  • Akamai has ~25K servers spread over ~1K clusters world-wide
CDN replicates owners’ content in CDN servers
When owner updates content, CDN updates servers
Some large content owners operate their own CDNs: Amazon, Google/YouTube, Netflix (virtual)
Sample Delivery (Example Only)
[Figure: objects index.html, logo.gif, shirtad.gif, stadium.mp4, and tvlogo.mp4 delivered across servers www1.cdi.ex, www2.cdi.ex, and www3.cdi.ex] [Frank13]
Why don’t we store index.html and shirtad.gif at the CDN also?
Content Distribution Network
CDN nodes create application-layer overlay network
Larger CDNs may have their own WANs, e.g., Google’s B4, that interconnect with the rest of the Internet like any other ISP’s network
CDN directs a request to the server closest to the client (how?)
[Figure, after Walrand: Tier-1 backbones, ISPs, IXPs, access aggregators, and CDNs, e.g., Akamai, Amazon, Google]
Client Redirection
How to direct clients to a particular server?
As part of application: HTTP redirect
• pros: application-level, fine-grained control
• cons: additional load and RTTs, hard to cache
As part of naming: DNS
• pros: well-suited to caching, reduce RTTs
• cons: relies on proxies and estimations, not accurate
Pros and cons of each?
DNS-based Redirection
Clients are directed to the closest server as part of the DNS name resolution process:
1. client asks its local DNS resolver to resolve CDN’s server’s name
2. the local DNS resolver is directed to CDN’s authoritative name server by DNS
3. CDN’s name server either returns the address of server closest to the DNS resolver or an ordered list of addresses, ranked by distance to local DNS resolver
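Step 3 can be pictured as a lookup in a table of estimated distances, keyed by the local DNS resolver rather than by the client. The sketch below is purely illustrative: the resolver name, the server addresses (documentation-range IPs), and the distance values are invented for the example, and a real CDN name server would weigh further inputs.

```python
# Hypothetical map: local DNS resolver -> estimated distance (msec) to each CDN server.
DISTANCE_MAP = {
    "resolver.campus.example": {
        "203.0.113.20": 85,
        "198.51.100.10": 12,
        "192.0.2.30": 140,
    },
}


def resolve(resolver, distance_map):
    """Return CDN server addresses ranked by estimated distance to the resolver."""
    distances = distance_map[resolver]
    return sorted(distances, key=distances.get)


ranked = resolve("resolver.campus.example", DISTANCE_MAP)
closest = ranked[0]
print(closest, ranked)
```

Because the lookup is keyed by the resolver, a client far from its resolver would still be sent to the server nearest the resolver.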
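CDN Example
[Figure: CDN example with numbered steps 1 to 5, involving the origin server, the client’s local name server, the CDN’s authoritative DNS server, and a nearby CDN server]
• HTTP request for home.ex/index.html, which contains cdi.ex/stadium.mp4 (to origin server)
• DNS query for cdi.ex (to client’s local name server)
• DNS query for cdi.ex (to CDN’s authoritative DNS server)
• HTTP request for cdi.ex/stadium.mp4 (to nearby CDN server)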
Server Selection
How to choose which server to direct a client?
• server load
• client-server distance
  • CDN maintains a “map”, estimating distances between access ISPs and CDN nodes
  • CDN’s name server uses “map” to determine server closest to the local DNS resolver
  • DNS resolver used as proxy for client ⇒ inaccurate location
  • CDN doesn’t know client’s address at name resolution time
  • distance can be measured using different metrics, e.g., latency, loss rate ⇒ only estimated
• delivery cost (ISP pricing)