CS 2630 Computer Organization What did we accomplish in 8 weeks? Brandon Myers University of Iowa

Page 1: CS 2630 Computer Organizationhomepage.cs.uiowa.edu/~bdmyers/cs2630_su17/public/lectures/lectur… · ( •But more concretely... require. ... run on any MIPS processor, regardless

CS 2630 Computer Organization

What did we accomplish in 8 weeks?

Brandon Myers

University of Iowa

Page 2:

Course evaluations

• ICON | student (course?) tools | Evaluations

Page 3:

Why take 2630?

• Brandon’s esoteric answer: graduates of a Computer Science program should have an appreciation for how real computers work
• ACM and IEEE’s answer: Computer Organization and Architecture is a topic in Computer Science Curricula 2013 (https://www.acm.org/education/CS2013-final-report.pdf)
• But more concretely...

Page 4:

Why take 2630?

1. It will be up to you to design our new computer systems (software AND hardware)... computer architects have been panicking for nearly a decade and they are not calming down

Page 5:

Why take 2630?

2. At some point you will have to measure a system you’ve built: performance (latency & throughput), energy usage, reliability, ... To be able to measure/interpret/improve your system, it helps to understand how more of the computer works

https://commons.wikimedia.org/wiki/File:Toyota_mirai_trimmed.jpg

What metrics would you measure to know how good a car is?

What can the engineer do to change the values of those metrics?

Page 6:

The insights you brought to the course: CA Topics

Page 7:

App

High-level language (e.g., C, Java)

Instruction set architecture (e.g., MIPS)

Compiler

Operating system (e.g., Linux, Windows)

Memory system, I/O system, Processor

Datapath & Control

Digital logic

Circuits

Devices (e.g., transistors)

Physics

Page 8:

lw $t0, 4($s0)
addi $t0, $t0, 10
sw $t0, 8($s0)

You learned how to write assembly code in HW2 (usually the compiler does the work for us)

rug: don’t need to write assembly code for a particular architecture. Instead write portable Java/Python/C code

bumps: some C code isn’t portable; some programmers write snippets of assembly code when the compiler doesn’t do the best thing

Page 9:

10001110000010000000000000000100
00100001000010000000000000001010
10101110000010000000000000001000

lw $t0, 4($s0)
addi $t0, $t0, 10
sw $t0, 8($s0)

Project 1 – MiniMA the MIPS assembler

rug: we can write MIPS programs in a language made of human-readable characters, use pseudoinstructions, refer to labels even though the machine reads binary numbers
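
To make the assembler's job concrete, here is a minimal sketch (not MiniMA itself) of how the three instructions above could be packed into 32-bit words. The helper name `encode_i_type` and the two small lookup tables are assumptions made for illustration; they cover only the opcodes and registers that appear in this example, using the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate).

```python
# Illustrative tables only: just the opcodes and registers used in the example.
OPCODES = {"lw": 0b100011, "addi": 0b001000, "sw": 0b101011}
REGISTERS = {"$t0": 8, "$s0": 16}


def encode_i_type(op, rs, rt, imm):
    """Pack opcode (6 bits), rs (5), rt (5) and a 16-bit immediate into a 32-bit string."""
    word = (
        (OPCODES[op] << 26)
        | (REGISTERS[rs] << 21)
        | (REGISTERS[rt] << 16)
        | (imm & 0xFFFF)  # keep only the low 16 bits of the immediate
    )
    return format(word, "032b")


# For loads and stores, rs is the base register and rt is the data register.
program = [
    encode_i_type("lw", "$s0", "$t0", 4),     # lw   $t0, 4($s0)
    encode_i_type("addi", "$t0", "$t0", 10),  # addi $t0, $t0, 10
    encode_i_type("sw", "$s0", "$t0", 8),     # sw   $t0, 8($s0)
]

for word in program:
    print(word)
```

Concatenating the three printed words should give the bit string shown on the slide. A real assembler would also handle R-type and J-type formats, labels, and pseudoinstructions, which this sketch leaves out.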

Page 10:

[Figure: the assembled machine code for several source files, plus the libraries math.a and stdlib.a, combined into the executable browser.exe]

rug: linker allows us to write our programs modularly

Page 11:

PEER INSTRUCTION (actually, a survey)

a. strongly agree
b. agree
c. neutral
d. disagree
e. strongly disagree

1. I understand the relationship between bits, numbers, and information.
2. I understand the stored program concept
3. I understand the role of the instruction set architecture in a computer
4. I understand why abstractions are essential for building complex systems
5. I understand why the digital abstraction is important
6. I understand why the synchronous abstraction is important
7. I understand the tradeoffs in the memory hierarchy
8. I understand how problems can be decomposed into a datapath and a control
9. I appreciate the layers of the computing stack and why they may need to change in the near future.

Page 12:

10001110000010000000000000000100
00100001000010000000000000001010
10101110000010000000000000001000

browser.exe

In Project 2-2 you played the role of the Loader by loading your hex image into the Instruction memory

rug: our program has the illusion of having access to the entire address space (e.g., all 2^32 bytes) of the computer

Page 13:

In Project 2-2, you designed, built, and tested a processor that runs assembled MIPS programs

datapath and control

rug: machine code for the MIPS architecture ought to run on any MIPS processor, regardless of its design (its microarchitecture)

bumps: choices about the architecture sometimes are based on assumptions about the microarchitecture (e.g., MIPS branch delay slot).

Page 14:

In Project 2-1 and HW4 you built components (like register files and finite state machines) from sequential logic

rug: we can build a complex system out of basic components. The synchronous abstraction allows us to not have to worry about interfaces between components

Page 15:

Digital logic

Circuits

Instruction set architecture (e.g., MIPS)

GPU instruction set

neural network structure and weights

You don’t have to build the 5-stage pipelined MIPS processor

http://people.cs.pitt.edu/~cho/cs2410/papers/yeager-micromag96.pdf

MIPS R10000 (out-of-order superscalar)

you don’t have to build a MIPS processor

systolic array and control

Page 16:

every component is made of logic gates; you learned how to build logic circuits in HW3

Page 17:

logic gate made of pMOS and nMOS transistors arranged in a CMOS configuration

rug: It is a bit easier just to think of gates that are functions as opposed to transistors attaching output to Vsource or Vground

rug: CMOS ensures every gate has a pure 0 or 1 output! This idea is the digital abstraction that lets the layers above compose two electrical circuits without worrying about how they affect each other.
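
As a rough switch-level illustration of that idea, the sketch below models a two-input CMOS NOR gate. The choice of NOR and the function names are assumptions made for this example. Each transistor is treated as an ideal switch: an nMOS conducts when its gate input is 1, a pMOS when its gate input is 0.

```python
def nmos_conducts(gate_input):
    """Idealized nMOS: a closed switch when the gate input is 1."""
    return gate_input == 1


def pmos_conducts(gate_input):
    """Idealized pMOS: a closed switch when the gate input is 0."""
    return gate_input == 0


def cmos_nor(a, b):
    """Two-input NOR: series pMOS pull-up network, parallel nMOS pull-down network."""
    pull_up = pmos_conducts(a) and pmos_conducts(b)    # both in series to the supply
    pull_down = nmos_conducts(a) or nmos_conducts(b)   # either one connects to ground
    # In a CMOS gate exactly one network conducts, so the output is never floating or shorted.
    if pull_up == pull_down:
        raise RuntimeError("output would be floating or shorted")
    return 1 if pull_up else 0


for a in (0, 1):
    for b in (0, 1):
        print(a, b, cmos_nor(a, b))
```

The check that exactly one network conducts is the code-level counterpart of the "pure 0 or 1 output" claim: for every input combination the output is tied to either the supply or ground, never both and never neither.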

Page 18:

layout engineer’s view of a NOR gate

https://commons.wikimedia.org/wiki/File:NOR_gate_layout.png

rug: When building a functional digital logic circuit, no need to worry about how it is arranged on the silicon

Page 19:

https://commons.wikimedia.org/wiki/File:NOR_gate_layout.png

nMOS cross section

https://commons.wikimedia.org/wiki/File:MOSFET_functioning_body.svg

rug: device engineers provide layout engineers with “design rules”. If they obey the required spacing between components then the transistors will work

Page 20:

OFF / ON

rug: When operating a transistor in the saturation regimes it looks like an electrical switch between voltages GND and VDD. Part of supporting the digital abstraction.

Page 21:

It’s not enough to just build software these days

Project Catapult: custom hardware running part of Bing search

computer vision processors running Google’s augmented reality platform Tango

“holographic” processor for Microsoft’s augmented reality platform HoloLens

Sparc M7 chip is built specifically for accelerating database queries

all in production, not just research

Google TPU is built for machine learning

Page 22:

Life beyond Logisim?

• Logisim’s main mode of input is schematic entry
• Much digital logic design uses hardware description languages (HDL) like Verilog (look up Verilog in your textbook index)
• HDL is not much different than what you did, except it is textual instead of graphical
• Typically have powerful compilers that make development easier than using Logisim, e.g., write a statement like

    case (ALUCtrl) {
        0: R = X + Y
        1: R = X - Y
        …
    }

And you get an ALU!
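
For comparison, here is a small behavioral sketch of the same idea in Python rather than an HDL. It is only a software model, not synthesizable hardware. The 32-bit width and the function name `alu` are assumptions, and only the two operations spelled out above are implemented; the elided cases are left unhandled on purpose.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit operands and result, as in MIPS


def alu(alu_ctrl, x, y):
    """Behavioral model of the case statement: select an operation by control code."""
    if alu_ctrl == 0:
        return (x + y) & MASK32  # addition, wrapping to 32 bits
    if alu_ctrl == 1:
        return (x - y) & MASK32  # subtraction, wrapping to 32 bits
    # Other control codes are not specified in the slide, so they are rejected here.
    raise ValueError(f"unsupported ALUCtrl value: {alu_ctrl}")


print(alu(0, 2, 3))       # add
print(hex(alu(1, 2, 3)))  # subtract; the negative result wraps around
```

The point of the HDL version is that a synthesis tool turns a description at this level into gates, whereas in Logisim the adder, subtractor, and multiplexer had to be wired by hand.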

Page 23:

Did we really build a real processor?

• Yes! You implemented much of the MIPS Instruction Set Architecture. Your Project 2-2 could run Linux (at 4 kHz clock frequency) given a bootloader program and Linux compiled for MIPS

No, I mean like real hardware

Page 24:

No, I mean like real hardware

If we use a hardware compiler, we could turn your Logisim files (look inside; it’s just some XML listing a bunch of components and wires) into an FPGA design or standard-cell VLSI design

There is more to learn about how to deal with the details of these design flows, but you have a good starting point

Page 25:

Administrivia

• Final Exam
  • Friday, 10am-12 in 205 MLH!
  • open notes/book, no electronics
  • reminder: practice materials on ICON in Files | Exams
• Review tomorrow in class
  • take at least one of the previous final exams and grade yourself
  • come prepared with the list of problems or specific questions

Page 26:

Stuff in your textbook I recommend you look at!

• Ch 3.4 Finite State Machines
  • state transition diagrams are to sequential logic AS truth tables are to combinational logic
  • involves ideas that relate to other parts of CS, like finite automata
• Ch 8 Memory systems
  • How do we deal with the tradeoff between capacity and speed in memory technologies? (one answer: caches)
  • How do we run two programs on the computer simultaneously if both think they own the whole address space? (one answer: virtual memory)

Page 27:

What courses next?

• CS:3620 Operating Systems
• CS:3210 Programming Languages and Tools (in C++)
• CS:3640 Introduction to Networks and Their Applications
• CS:3820 Programming Language Concepts
• CS:4640 Computer Security
• CS:4700 High Performance and Parallel Computing
• CS:5610 High Performance Computer Architecture
• CS:4980 Topics in CS (Compiler Construction on Raspberry Pi)
• CS:4980 Topics in CS (Advanced Computer Architecture this Spring!)

Page 28:

What’s to learn next: operating systems

Page 29:

Questions we didn’t get to answer fully in CS 2630

Operating systems

• how do multiple programs share the computer?
  • 2-64 processors
  • 1 network interface
  • 1 memory
  • 1 keyboard, mouse, screen
  • 100’s of running programs
• how do you keep programs isolated from each other or one program from consuming all resources?
• how do you implement syscalls?
• how do you load the OS code into memory when you power on the computer?

Page 30:

What’s to learn next: computer architecture

Page 31:

What’s to learn next: computer architecture

• the role of parallelism in microarchitectures
• every implementation and its effect on performance...

$$\frac{\text{seconds}}{\text{program}} = \frac{\text{seconds}}{\text{cycle}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{instructions}}{\text{program}}$$

...and cost and energy
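
A quick worked example of the performance equation, as a sketch with assumed numbers: the 4 kHz clock mentioned for Project 2-2, one cycle per instruction (an assumption about a single-cycle design), and a made-up program of one million instructions. The function name is hypothetical.

```python
def seconds_per_program(seconds_per_cycle, cycles_per_instruction, instructions_per_program):
    """Multiply the three factors of the performance equation."""
    return seconds_per_cycle * cycles_per_instruction * instructions_per_program


clock_hz = 4000            # 4 kHz clock, so each cycle lasts 1/4000 of a second
cpi = 1                    # assumed: every instruction takes exactly one cycle
instructions = 1_000_000   # assumed workload size, for illustration only

estimate = seconds_per_program(1 / clock_hz, cpi, instructions)
print(f"{estimate:.0f} seconds")
```

Each factor is owned by a different layer: the instruction count depends on the program, compiler, and ISA; cycles per instruction depend on the microarchitecture; and seconds per cycle depend on the circuit and device technology.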

Page 32:

Parallelism in architectures

pipelining

vector

superscalar

dataflow

and others... multicore, VLIW, multithreading, ...

Page 33:

Vector machines

found in...
• early supercomputers
• Intel AVX
• GPUs

SIMD: single instruction, multiple data

Page 34:

Superscalar machines

Replicate resources, e.g.,
• two decoders, 2-wide instruction cache read port: fetch two instructions at a time
• two ALUs: execute two instructions at a time
• more register file write ports: write back two registers in one cycle

found in... most CPUs in servers and smartphones

superscalar + pipelining

superscalar

Page 35:

Dataflow machines

a processor needs to get the instruction and the input data to the same physical place at the same time (known as “dataflow locality”)

Dataflow machines have a bunch of Execution units of various kinds; the data “flows” through the operators

Challenges?

Page 36:

What’s to learn next: parallel computing

Page 37:

Metric for performance comparison: Time

My program runs in 100 seconds

If I “parallelize it” on 10 processors, I saw that it runs in 12 seconds

What is the speedup?

Tserial / Tparallel = 100 / 12 = 8.33X

Page 38:

Predicting parallel running time (Tpar) from serial running time

My program runs in Tser = 100 seconds

If I “parallelize it” on 10 processors, how fast will it run (i.e., what is Tpar)?

$$\frac{T_{\text{original}}}{T_{\text{improved}}} = \frac{1}{(1 - r) + \frac{r}{s}}$$

$$T_{\text{improved}} = T_{\text{original}} \cdot \left( (1 - r) \cdot 1 + r \cdot \frac{1}{s} \right)$$

In this form, it is called Amdahl’s law: it says your speedup is limited by how much of the program is improved (e.g., parallelized)

r = fraction of program that is able to be improved
s = speedup when applying the improvement
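
The relationship can be tried out numerically. The sketch below uses hypothetical function names and takes the running example (Tser = 100 seconds, 10 processors). It assumes the parallel part speeds up by exactly the processor count (s = 10), which is an idealization, and shows how to work backwards from an observed speedup to the fraction r.

```python
def improved_time(t_original, r, s):
    """Amdahl's law: the unimproved fraction (1 - r) runs as before, the rest runs s times faster."""
    return t_original * ((1 - r) + r / s)


def speedup(r, s):
    """Overall speedup T_original / T_improved for improvable fraction r and local speedup s."""
    return 1 / ((1 - r) + r / s)


def improved_fraction(observed_speedup, s):
    """Solve 1/speedup = (1 - r) + r/s for r, given an observed overall speedup."""
    return (1 - 1 / observed_speedup) / (1 - 1 / s)


# Running example: 100 s serial, 12 s measured on 10 processors.
r_observed = improved_fraction(100 / 12, 10)
t_par = improved_time(100, r_observed, 10)

print(f"implied parallel fraction r = {r_observed:.4f}")
print(f"predicted time with that r  = {t_par:.2f} s")
print(f"limit on speedup as s grows = {1 / (1 - r_observed):.1f}x")
```

Under these assumptions, a measured 12 seconds corresponds to an improvable fraction of roughly 0.98, and no number of processors could push the speedup past 1 / (1 - r).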

Page 39:

Amdahl’s law applied to parallelization

https://en.wikipedia.org/wiki/Amdahl's_law#/media/File:AmdahlsLaw.svg

[Figure: Amdahl’s law plot, annotated with r and s]

Page 40:

A sequential abstract machine model you already know

• RAM: random access memory
  • just like any other computational step, accessing memory has a cost of 1

[Figure: the RAM model, a processor attached to a memory]

Page 41:

One of the foundational parallel machine models: Parallel Random Access Machine (PRAM)

• All processors are attached to a shared memory
• Memory access takes 1 step
• More realistic variants of PRAM incur greater cost for “conflicting” memory accesses
• used very often for understanding the speedup limits of parallel algorithms; not very realistic

Page 42:

One of the foundational parallel machine models: Bulk synchronous parallel (BSP)

https://en.wikipedia.org/wiki/Bulk_synchronous_parallel

[Figure: one BSP superstep, labeled with per-processor work w_1, w_2, ..., w_p, communication h_i, and synchronization cost l]

(see blackboard notes)

this abstract machine does not support as many algorithms as CTA, but it is simpler
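
The blackboard notes are not part of this transcript, so the following is only a sketch of the commonly stated BSP cost of one superstep: the slowest processor's local work, plus the largest number of messages any processor sends or receives times a per-message cost, plus the barrier cost. The per-message cost `g`, the function name, and the sample numbers are assumptions not found on the slide.

```python
def superstep_cost(work, messages, g, l):
    """Cost of one BSP superstep under the usual model: max(w_i) + max(h_i) * g + l.

    work     -- local computation w_i for each processor
    messages -- h_i, the most messages processor i sends or receives
    g        -- assumed cost per message (not shown on the slide)
    l        -- barrier synchronization cost
    """
    return max(work) + max(messages) * g + l


# Hypothetical superstep on three processors.
cost = superstep_cost(work=[5, 7, 3], messages=[2, 4, 1], g=2, l=10)
print(cost)
```

Because every processor waits at the barrier, the step is only as fast as its slowest participant, which is why the maxima rather than the averages appear in the cost.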

Page 43:

The future of CS 2630

• Stay in touch!
• Tell others how awesome CS 2630 is!
• Sign up to be an approved tutor! https://cs.uiowa.edu/resources/approved-tutors
• CS 2630 continuing in the TILE classrooms
  • Thank you for being the first to participate in the TILE version of CS 2630 and new lab assignments
  • working on: potentially a future opportunity for including lab assistants
    • help students but do not grade work
    • gain experience with teaching
    • learn the material even better by teaching others