Computer Networks

Lecture 24: Cloud Computing and Data Center Networking

Utility Computing

August 2006: Amazon Elastic Compute Cloud, EC2 + S3
• first successful IaaS offering

IaaS == Infrastructure as a Service
• swipe your credit card, and spin up your VM

Provides utility computing:
• computing resources as a metered service (“pay as you go”)
• ability to dynamically provision virtual machines

Why utility computing?
• cost: CAPEX vs. OPEX
• scalability: “infinite” capacity
• elasticity: scale “out” (or in) on demand

[Joshi & Lagar-Cavilla, Lin]

“I think there is a world market for about five computers.” (commonly attributed to Thomas J. Watson, IBM)

Evolution into PaaS

Platform as a Service (PaaS) is higher level
• SimpleDB (relational tables)
• Simple Queue Service
• elastic load balancing
• flexible payment service

PaaS diversity (and lock-in)
• Amazon’s Elastic Beanstalk (upload your JAR)
• Microsoft Azure: .NET, SQL
• Google AppEngine: Python, Java, GQL, memcache
• Heroku: Ruby, Python, Node.js, PHP, Java
• Joyent: Node.js and JavaScript

[Joshi & Lagar-Cavilla]

IaaS vs. PaaS

Hardware-centric vs. API-centric

Never care about drivers again
• or sys-admins, or power bills

You can scale if you have the money
• you can deploy on two continents
• and ten thousand servers
• and 20 TB of storage

[Diagram: IaaS exposes an x86 machine and byte-level storage; PaaS exposes a JAR runtime and key-value storage]

[Joshi & Lagar-Cavilla]

Your New Concerns

App provider:
• how will I horizontally scale my application?
• how will my application deal with distribution?
  • latency, partitioning, concurrency
• how will I guarantee availability?
  • failures will happen
  • dependencies are unknown

Cloud provider:
• how will I maximize multiplexing?
• can I scale and provide performance guarantees?
• how can I diagnose infrastructure problems?

[Joshi & Lagar-Cavilla]

From the Cloud-User’s POV

The cloud is like the IP layer
• it provides a best-effort substrate
• is cost-effective
• is on-demand
• provides compute and storage infrastructure

But you have to build your own reliable service
• fault tolerance
• availability, durability, QoS

[Joshi & Lagar-Cavilla]

Everything as a Service

Utility computing = Infrastructure as a Service (IaaS)
• why buy machines when you can rent cycles?
• examples: Amazon’s EC2, Rackspace

Platform as a Service (PaaS)
• give me a nice API and take care of the maintenance, upgrades, …
• examples: Google AppEngine, Heroku

Software as a Service (SaaS)
• just run it for me!
• examples: Gmail, Google Docs, Salesforce, Adobe’s Creative Cloud, Microsoft’s Office 365

[Lin]

Cloud Computing: Summary

NIST’s definition: services accessed over a standardized network with the following characteristics:

• on-demand self-service: a customer can order compute resources without any human interaction with the provider

• resource pooling: the provider’s physical and virtual resources are pooled to serve multiple customers dynamically

• rapid elasticity: resources appear unlimited and can be scaled up or down rapidly

• measured service: metered usage (and billing)

• broad network access: available over the Internet, platform-independent: mobile, laptops, tablets

Anatomy of a Datacenter

Source: Barroso and Hölzle (2009); Source: IEEE Spectrum and Google

How Much Power Is Needed?

• 0.0003 kWh to answer a typical Google search
• 0.05 kWh to use a laptop for an hour
• 0.1 kWh to run a ceiling fan for an hour
• 1.1 kWh to use a coffee maker for an hour

How much power is 30 MW?
• 6,000 average homes with central air (~5 kW/home)
• 300 fast-food restaurants
• 45 large retail stores
• 37 grocery stores
• 30 large home-improvement stores
• 1.5 Sears Towers
• 1 computer data center
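The equivalences above follow from simple unit conversion; a quick arithmetic sketch using the slide’s figures:

```python
# Rough equivalences for a 30 MW data center, using the slide's figures.
DATACENTER_KW = 30 * 1000            # 30 MW = 30,000 kW
KW_PER_HOME = 5                      # average home with central air

homes = DATACENTER_KW / KW_PER_HOME
print(homes)                         # 6000.0 homes

# At 0.0003 kWh per search, one hour of 30 MW could in principle
# answer about 100 million typical Google searches.
searches_per_hour = DATACENTER_KW / 0.0003
print(f"{searches_per_hour:,.0f}")
```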

Data Center Networks

Tens to hundreds of thousands of hosts, often closely coupled, in close proximity:
• e-commerce (e.g., Amazon)
• content servers (e.g., Netflix, YouTube, Apple, Microsoft)
• search engines, data mining (e.g., Google)

Challenges:
• multiple applications, each serving massive numbers of clients
• managing/balancing load, avoiding processing, networking, and data bottlenecks

Inside a 40-ft Microsoft container, Chicago data center

[Diagram: conventional data center network hierarchy: Internet, border router, access router, Tier-1/core switches, Tier-2/aggregation switches with load balancers, Top-of-Rack (ToR)/edge switches, server (blade) racks 1–8]

Data Center Networks

Load balancer: a layer-4 “switch”
• receives external client requests
• directs workload within the data center
• returns results to the external client (hiding data center internals from the client)

Potential Network Bottleneck

Host–ToR links: 1 Gbps; ToR–Tier-2 and Tier-1–Tier-2 links: 10 Gbps each

10 hosts on rack 1 each talk to a different host on rack 5; similarly between racks 2–6, 3–7, and 4–8

40 flows share the 10 Gbps A–B link, so each gets only 10/40 = 250 Mbps, only ¼ of the 1 Gbps host–ToR capacity

[Diagram: racks 1–8 under ToR switches; the 40 flows converge on the 10 Gbps link between switches A and B; host links are 1 Gbps]
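The per-flow share above is simple division; a sketch with the slide’s link speeds:

```python
# Per-flow throughput when 40 flows share one 10 Gbps inter-switch link.
HOST_LINK_MBPS = 1000        # host-to-ToR: 1 Gbps
TRUNK_MBPS = 10 * 1000       # ToR-Tier-2 and Tier-1-Tier-2: 10 Gbps

flows = 10 * 4               # 10 hosts per rack, rack pairs 1-5, 2-6, 3-7, 4-8

per_flow_mbps = TRUNK_MBPS / flows
print(per_flow_mbps)         # 250.0 Mbps

fraction_of_host_link = per_flow_mbps / HOST_LINK_MBPS
print(fraction_of_host_link) # 0.25: only a quarter of the 1 Gbps host link
```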

Fat-tree Topology with k = 4

Rich interconnection among switches, a.k.a. a Clos network
• increased throughput between racks

Equal-Cost Multi-Path (ECMP) routing
• increased reliability via redundancy
• originally intended for data centers with off-the-shelf parts

[Diagram: k = 4 fat-tree: Tier-1/core switches, Tier-2/aggregation switches, Top-of-Rack (ToR)/edge switches, server hosts]

Fat-tree Architecture

k-ary fat-tree: three-layer topology
• k pods, each consisting of (k/2)² hosts and two layers of switches; each layer has k/2 k-port switches
• each ToR switch connects to k/2 hosts and k/2 Tier-2 switches
• each Tier-2 switch connects to k/2 ToR and k/2 Tier-1 switches
• (k/2)² Tier-1 switches, each connecting to all k pods
• supports k³/4 hosts, k < 256; the fat-tree does not scale indefinitely

[Diagram: k-ary fat-tree layers: Tier-1/core, Tier-2/aggregation, ToR/edge switches, server hosts]

[Beyer]
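The component counts follow mechanically from k; a minimal sketch:

```python
# Component counts for a k-ary fat-tree (k must be even).
def fat_tree_sizes(k: int) -> dict:
    assert k % 2 == 0
    return {
        "pods": k,
        "hosts_per_pod": (k // 2) ** 2,
        "tor_switches": k * (k // 2),      # k/2 edge switches per pod
        "tier2_switches": k * (k // 2),    # k/2 aggregation switches per pod
        "tier1_switches": (k // 2) ** 2,   # core layer
        "hosts": k ** 3 // 4,
    }

print(fat_tree_sizes(4))   # 4 pods, 16 hosts total
print(fat_tree_sizes(48))  # 27,648 hosts with 48-port switches
```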

Cost Analysis

Maximum possible cluster size with all hosts capable of fully utilizing their uplink capacity

A hierarchical design uses higher-speed, and more expensive, switches higher up in the hierarchy (scale up)

Addressing in Fat-tree

Use the 10.0.0.0/8 private addressing block

Pod switches have address 10.pod.switch.1
• pod and switch in range [0, k−1], based on position

Tier-1 switches have address 10.k.i.j
• i and j denote the switch’s position among the (k/2)² Tier-1 switches

Hosts have address 10.pod.switch.ID
• ID in range [2, (k/2) + 1]; for k = 4, ID can only be 2 or 3

[Diagram: example k = 4 addresses: pod switches 10.0.0.1 and 10.0.3.1; hosts 10.0.0.2, 10.0.0.3, 10.0.1.3, 10.2.1.2, 10.2.1.3]

[Beyer]
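The scheme can be enumerated mechanically; a sketch (not from the slides) that generates all addresses for a given k:

```python
# Enumerate fat-tree addresses under the 10.pod.switch.x scheme.
def fat_tree_addresses(k: int):
    half = k // 2
    # Pod switches: 10.pod.switch.1, pod and switch in [0, k-1]
    pod_switches = [f"10.{pod}.{sw}.1" for pod in range(k) for sw in range(k)]
    # Tier-1 switches: 10.k.i.j, with (i, j) indexing the (k/2)^2 core grid
    tier1 = [f"10.{k}.{i}.{j}" for i in range(1, half + 1)
                               for j in range(1, half + 1)]
    # Hosts: 10.pod.switch.ID, hanging off ToR switches, ID in [2, k/2 + 1]
    hosts = [f"10.{pod}.{sw}.{hid}" for pod in range(k)
                                    for sw in range(half)
                                    for hid in range(2, half + 2)]
    return pod_switches, tier1, hosts

pod_switches, tier1, hosts = fat_tree_addresses(4)
print(len(pod_switches), len(tier1), len(hosts))  # 16 4 16
print(tier1[0], hosts[0])                         # 10.4.1.1 10.0.0.2
```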

Forwarding in Fat-tree

Tier-1 switches contain (10.pod.0.0/16, port) entries
• statically forward inter-pod traffic on the specified port
• e.g., 10.4.1.1’s routing table:

[Beyer]

Top-of-Rack(ToR)/edgeswitches

Tier-1/coreswitches

Tier-2/aggregationswitches

Prefix        Output port
10.0.0.0/16   0
10.1.0.0/16   1
10.2.0.0/16   2
10.3.0.0/16   3
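A Tier-1 switch’s table depends only on k; a sketch that reproduces the table above, assuming port p faces pod p:

```python
# Build a Tier-1 (core) switch routing table: one /16 entry per pod.
def tier1_table(k: int) -> dict:
    # Port p connects toward pod p, so inter-pod traffic is
    # statically forwarded with no per-flow state.
    return {f"10.{pod}.0.0/16": pod for pod in range(k)}

print(tier1_table(4))
# {'10.0.0.0/16': 0, '10.1.0.0/16': 1, '10.2.0.0/16': 2, '10.3.0.0/16': 3}
```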

[Diagram: Tier-1 switch 10.4.1.1 with ports 0–3; pod 2 contains pod switch 10.2.1.1 and hosts 10.2.0.2, 10.2.0.3, 10.2.1.2, 10.2.1.3]

Tier-2’s Two-Level Lookup Table

The prefix table contains (10.pod.switch.0/24, port) entries
• the switch value is the ToR switch number
• used for forwarding intra-pod traffic

The suffix table is used for forwarding inter-pod traffic

[Diagram: Tier-2 switch with ports 0–3]

[Beyer]

Recall: for k = 4, the host ID can only be 2 or 3

Tier-2’s Forwarding Algorithm

The prefix table prevents intra-pod traffic from leaving the pod

The suffix table handles inter-pod traffic based on host IDs:
• ensures a spread of traffic across Tier-1 switches
• prevents packet reordering by assigning a single static path to each host-to-host communication
• better than having a single path between subnets

[Beyer]
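A minimal sketch of the two-level lookup (prefix first, then suffix), assuming a Tier-2 switch in pod 2 of a k = 4 fat-tree; the tables and port numbers are illustrative, and the suffix rule is simplified to a modulo on the host ID:

```python
# Two-level lookup at a Tier-2 switch: match /24 prefixes for intra-pod
# traffic first; otherwise hash the host ID to pick an uplink, which
# spreads inter-pod flows across Tier-1 switches deterministically.
def two_level_lookup(dst: str, prefix_table: dict, suffix_table: dict) -> int:
    a, b, c, d = (int(x) for x in dst.split("."))
    prefix = f"10.{b}.{c}.0/24"
    if prefix in prefix_table:                  # intra-pod: down toward a ToR
        return prefix_table[prefix]
    return suffix_table[d % len(suffix_table)]  # inter-pod: uplink by host ID

# Illustrative tables for a Tier-2 switch in pod 2 (k = 4):
prefix_table = {"10.2.0.0/24": 0, "10.2.1.0/24": 1}  # down-ports to ToRs
suffix_table = {0: 2, 1: 3}                          # up-ports to Tier-1

print(two_level_lookup("10.2.0.3", prefix_table, suffix_table))  # 0 (intra-pod)
print(two_level_lookup("10.0.1.2", prefix_table, suffix_table))  # 2 (uplink)
```

Because the uplink depends only on the destination host ID, every packet of a given host-to-host flow takes the same static path, matching the no-reordering property above.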

ToR Switch’s Forwarding

Inter-rack traffic relies on the switch’s original backward-learning algorithm

Assumes forwarding tables are generated by a central controller with full knowledge of the topology
• the central controller is also responsible for detecting switch failures and re-routing traffic
• and for answering ARP and DHCP requests

[Beyer]

Fat-tree Routing Example

[Diagram: k = 4 fat-tree with server hosts, ToR/edge, Tier-2/aggregation, and Tier-1/core switches]

Packets from source 10.0.1.2 to destination 10.2.0.3 take the dashed path

Two-Level Lookup Implementation

Implemented in hardware using a TCAM
• TCAM: Ternary (0, 1, don’t-care) Content-Addressable Memory
• can perform parallel lookups across the table
• stores don’t-care bits, making it suitable for variable-length prefixes

Prefixes are preferred over suffixes

[Diagram: TCAM lookup of an incoming address in the two-level table of switch 10.2.2.1 in the example network]

[Beyer]
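Ternary matching can be emulated in software with (value, mask) pairs; a sketch (entries are illustrative) where don’t-care bits have mask 0, and prefix entries are listed before suffix entries so prefixes win:

```python
# Emulate a ternary match: each entry is (value, mask, port); a don't-care
# bit has mask 0. Real TCAM hardware checks all entries in parallel; here
# we scan in priority order, with prefix entries before suffix entries.
def tcam_lookup(addr, entries):
    for value, mask, port in entries:
        if addr & mask == value & mask:
            return port
    return None

def ip(s):  # dotted quad -> 32-bit int
    a, b, c, d = (int(x) for x in s.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

entries = [
    (ip("10.2.0.0"), ip("255.255.255.0"), 0),  # prefix 10.2.0.0/24
    (ip("10.2.1.0"), ip("255.255.255.0"), 1),  # prefix 10.2.1.0/24
    (ip("0.0.0.2"),  ip("0.0.0.255"),    2),   # suffix host ID 2 -> uplink
    (ip("0.0.0.3"),  ip("0.0.0.255"),    3),   # suffix host ID 3 -> uplink
]
print(tcam_lookup(ip("10.2.0.3"), entries))  # 0: prefix preferred over suffix
print(tcam_lookup(ip("10.0.1.3"), entries))  # 3: suffix match on host ID
```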

Topology Power/Heat Dissipation

Packaging Problem

The fat-tree has significant cabling overhead
• 1 GigE switches are used to reduce cost
• the lack of 10 GigE ports leads to more cabling

A packaging solution exists for k = 48
• it generalizes to other values of k

Cabling in general can be a problem in data center networks…

[Beyer]

Other DC Network Topologies

VL2:
• also based on a Clos network
• but has a more flexible addressing scheme
• runs link-state routing
• does network load balancing

Other topologies have the hosts themselves also serve as routers

Network Security Evolved

Virtual private clouds
• internal VLANs within the cloud
• virtual network functions (VNFs): virtual gateways and virtual firewalls, i.e., middleboxes implemented in software
• remove external addressability
• MPLS VPN connection to the cloud gateway
• but this doesn’t protect external-facing assets
• providers: Amazon, Google, Microsoft, etc.

[Amazon AWS]

Information Leakage

Is your target in a cloud?
• traceroute
• network triangulation

Every VM gets its own private/public IP

Are you on the same machine as the target?
• IP addresses
• latency checks
• side channels (cache interference)

Can you get on the same machine?
• pigeonhole principle (n items, m containers, n > m ⇒ some containers must be shared)
• placement locality

[Joshi & Lagar-Cavilla]

Source: Voas and Zhang, “Cloud Computing: New Wine or Just a New Bottle?” IT Professional, 11(2):15–17, March 2009

[Joshi & Lagar-Cavilla]

[Timeline: IBM PC (1981) → Ethernet 802.3 (1983) → Commercial Internet (1995) → SETI@home (1999) → Amazon EC2 (2006)]

Datacenter network as a switch

The circle is now complete…