L09 Hardware and State - inst.eecs.berkeley.edu

CS 61C: Great Ideas in Computer Architecture
Introduction to Hardware: Representations and State
Instructors: Krste Asanović and Randy H. Katz
http://inst.eecs.Berkeley.edu/~cs61c/fa17
Fall 2017 -- Lecture #9, 9/21/17


From Last Time: Where Are We Now?

Translation: Compilation to Assembly
1. Replace pseudo with real instructions
2. Resolve local jumps, branches
3. Create Symbol, Relocation Tables

Symbol Table: Labels defined or referenced
Relocation Table: Address locations to be adjusted

Figure 3.1: Steps of translation from C source code to a running program. These are the logical steps, although some steps are combined to accelerate translation. We use the Unix file suffix name convention for each type of file. The equivalent suffixes in MS-DOS are .C, .ASM, .OBJ, .LIB, and .EXE.

Linker (2/3)

[Figure: the Linker combines .o file 1 (text 1, data 1, info 1) and .o file 2 (text 2, data 2, info 2) into a.out (relocated text 1, relocated text 2, relocated data 1, relocated data 2).]

Four Types of Addresses

• PC-Relative Addressing (beq, bne, jal; lla: auipc/addi)
  – Never need to relocate (PIC: position independent code)
• Absolute Function Address (la: auipc/lw + jalr)
  – Always relocate
• External Function Reference (la: auipc/lw + jalr)
  – Always relocate
• Static Data References (lui/addi)
  – Always relocate

Resolving References (1/2)

• Linker assumes first word of first text segment is at address 0x10000 for RV32.
  – (More later when we study "virtual memory")
• Linker knows:
  – Length of each text and data segment
  – Ordering of text and data segments
• Linker calculates:
  – Absolute address of each label to be jumped to (internal or external) and each piece of data being referenced

Resolving References (2/2)

• To resolve references:
  – Search for reference (data or label) in all "user" symbol tables
  – If not found, search library files (e.g., for printf)
  – Once absolute address is determined, fill in the machine code appropriately
• Output of linker: executable file containing text and data (plus header)
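The search order described above can be sketched in C. This is only an illustration: the `Symbol` struct, the array-based tables, and the function names are hypothetical, and a real linker works on object-file symbol tables rather than arrays like these.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One symbol-table entry: a label and its absolute address (hypothetical layout). */
typedef struct {
    const char *name;
    uint32_t    address;
} Symbol;

/* Linear search of one table; returns 1 and writes *out when the name is found. */
static int lookup(const Symbol *table, size_t n, const char *name, uint32_t *out) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *out = table[i].address;
            return 1;
        }
    }
    return 0;
}

/* Search the "user" symbols first, then the library symbols.
 * Returns 1 if the reference was resolved, 0 if it is undefined. */
int resolve_reference(const Symbol *user, size_t n_user,
                      const Symbol *lib, size_t n_lib,
                      const char *name, uint32_t *out) {
    if (lookup(user, n_user, name, out)) {
        return 1;
    }
    return lookup(lib, n_lib, name, out);
}
```

With a user table holding `main` and a library table holding `printf`, a call such as `resolve_reference(user, 1, lib, 1, "printf", &addr)` falls through to the library table, while an unknown name returns 0, which corresponds to an undefined-reference error at link time.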


Where Are We Now?

Loader Basics

• Input: Executable Code (e.g., a.out for RISC-V)
• Output: (program is run)
• Executable files are stored on disk
• When one is run, loader loads it into memory and starts it
• In reality, loader is the operating system (OS)
  – Loading is one of the OS tasks

Loader … What Does It Do?

• Reads executable file's header to determine size of text and data segments
• Creates new address space for program large enough to hold text and data segments, along with a stack segment
• Copies instructions + data from executable file into the new address space
• Copies arguments passed to the program onto the stack
• Initializes machine registers
  – Most registers cleared, but stack pointer assigned address of 1st free stack location
• Jumps to start-up routine that copies program's arguments from stack to registers & sets the PC
  – If main routine returns, start-up routine terminates program with the exit system call

Peer Instruction

At what point in process are all the machine code bits determined for the following assembly instructions:
1) add x6, x7, x8
2) jal x1, fprintf

A: 1) & 2) After compilation
B: 1) After compilation, 2) After assembly
C: 1) After assembly, 2) After linking
D: 1) After assembly, 2) After loading

Example C Program: Hello.c

#include <stdio.h>
int main() {
  printf("Hello, %s\n", "world");
  return 0;
}

Compiled Hello.c: Hello.s

.text                        # Directive: enter text section
.align 2                     # Directive: align code to 2^2 bytes
.globl main                  # Directive: declare global symbol main
main:                        # label for start of main
  addi sp,sp,-16             # allocate stack frame
  sw ra,12(sp)               # save return address
  lui a0,%hi(string1)        # compute address of
  addi a0,a0,%lo(string1)    #   string1
  lui a1,%hi(string2)        # compute address of
  addi a1,a1,%lo(string2)    #   string2
  call printf                # call function printf
  lw ra,12(sp)               # restore return address
  addi sp,sp,16              # deallocate stack frame
  li a0,0                    # load return value 0
  ret                        # return
.section .rodata             # Directive: enter read-only data section
.balign 4                    # Directive: align data section to 4 bytes
string1:                     # label for first string
  .string "Hello, %s!\n"     # Directive: null-terminated string
string2:                     # label for second string
  .string "world"            # Directive: null-terminated string


Assembled Hello.s: Linkable Hello.o

00000000 <main>:
   0: ff010113 addi sp,sp,-16
   4: 00112623 sw ra,12(sp)
   8: 00000537 lui a0,0x0       # addr placeholder
   c: 00050513 addi a0,a0,0     # addr placeholder
  10: 000005b7 lui a1,0x0       # addr placeholder
  14: 00058593 addi a1,a1,0     # addr placeholder
  18: 00000097 auipc ra,0x0     # addr placeholder
  1c: 000080e7 jalr ra          # addr placeholder
  20: 00c12083 lw ra,12(sp)
  24: 01010113 addi sp,sp,16
  28: 00000513 addi a0,zero,0
  2c: 00008067 jalr ra

Linked Hello.o: a.out

000101b0 <main>:
  101b0: ff010113 addi sp,sp,-16
  101b4: 00112623 sw ra,12(sp)
  101b8: 00021537 lui a0,0x21
  101bc: a1050513 addi a0,a0,-1520   # 20a10 <string1>
  101c0: 000215b7 lui a1,0x21
  101c4: a1c58593 addi a1,a1,-1508   # 20a1c <string2>
  101c8: 288000ef jal ra,0x10450     # 10450 <printf>
  101cc: 00c12083 lw ra,12(sp)
  101d0: 01010113 addi sp,sp,16
  101d4: 00000513 addi a0,zero,0
  101d8: 00008067 jalr ra
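As a cross-check on the linked listing, the word 0x288000ef at address 0x101c8 can be decoded by hand. The sketch below assumes the standard RV32 J-type immediate layout; the function names `jal_offset` and `jal_target` are made up for this illustration.

```c
#include <stdint.h>

/* Extract the sign-extended byte offset from a 32-bit RISC-V JAL instruction.
 * J-type layout: imm[20] = bit 31, imm[10:1] = bits 30..21,
 * imm[11] = bit 20, imm[19:12] = bits 19..12. */
int32_t jal_offset(uint32_t insn) {
    uint32_t imm = 0;
    imm |= ((insn >> 31) & 0x1u)   << 20;
    imm |= ((insn >> 21) & 0x3FFu) << 1;
    imm |= ((insn >> 20) & 0x1u)   << 11;
    imm |= ((insn >> 12) & 0xFFu)  << 12;
    /* Sign-extend from bit 20 of the 21-bit offset. */
    if (imm & (1u << 20)) {
        imm |= 0xFFE00000u;
    }
    return (int32_t)imm;
}

/* Target address = address of the JAL itself plus the offset. */
uint32_t jal_target(uint32_t pc, uint32_t insn) {
    return pc + (uint32_t)jal_offset(insn);
}
```

Here `jal_offset(0x288000ef)` is 0x288, and `jal_target(0x101c8, 0x288000ef)` is 0x10450, the address the listing shows for printf. Because the encoded value is a PC-relative offset, it can only be filled in once the linker has fixed the positions of both main and printf.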

LUI/ADDI Address Calculation in RISC-V

Target address of <string1> is 0x00020 A10
Instruction sequence LUI 0x00020, ADDI 0xA10 does not quite work because immediates in RISC-V are sign extended (and 0xA10 has a 1 in the high order bit)!

0x00020 000 + 0xFFFFF A10 = 0x0001F A10 (Off by 0x00001000)

So we get the right address if we calculate it as follows:

(0x00020 000 + 0x00001 000) + 0xFFFFF A10 = 0x00020 A10

What is 0xFFFFF A10?
Twos complement of 0xFFFFF A10 = 0x00000 5EF + 1 = 0x00000 5F0 = 1520ten
So 0xFFFFF A10 = -1520ten

Instruction sequence LUI 0x00021, ADDI -1520 calculates 0x00020 A10
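The adjustment worked out above can be written as a small C sketch. The helper names `split_hi_lo` and `lui_addi` are hypothetical; the "add 0x800 before shifting" step is a common way to compute the compensated upper part and is offered here as one way to reproduce the slide's numbers, not as the toolchain's actual code.

```c
#include <stdint.h>

/* Split a 32-bit address into the LUI immediate (upper 20 bits) and the
 * ADDI immediate (signed 12 bits) so that (hi << 12) + lo == addr. */
void split_hi_lo(uint32_t addr, uint32_t *hi20, int32_t *lo12) {
    /* Adding 0x800 bumps the upper part by one exactly when bit 11 of the
     * address is set, compensating for ADDI's sign extension. */
    *hi20 = ((addr + 0x800u) >> 12) & 0xFFFFFu;
    *lo12 = (int32_t)(addr & 0xFFFu);
    if (*lo12 >= 0x800) {
        *lo12 -= 0x1000;   /* reinterpret the low 12 bits as a signed value */
    }
}

/* Recombine the way LUI followed by ADDI would (32-bit wraparound). */
uint32_t lui_addi(uint32_t hi20, int32_t lo12) {
    return (hi20 << 12) + (uint32_t)lo12;
}
```

For the address 0x20A10 this yields an upper part of 0x21 and a lower part of -1520, matching the LUI 0x00021, ADDI -1520 sequence; for 0x20A1C the lower part is -1508, as in the linked listing for string2.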

Break!


Harness Parallelism & Achieve High Performance

• Parallel Requests: assigned to computer, e.g., search "Katz"
• Parallel Threads: assigned to core, e.g., lookup, ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., add of 4 pairs of words
• Hardware descriptions: all gates @ one time
• Programming Languages

[Figure: software/hardware stack running from Warehouse Scale Computer and Smart Phone, through Computer (Core … Core, Memory (Cache), Input/Output), Core (Instruction Unit(s), Functional Unit(s) computing A3+B3, A2+B2, A1+B1, A0+B0) and Cache Memory, down to Logic Gates. A "Today" label marks the current topic.]

You are Here! Levels of Representation/Interpretation

High Level Language Program (e.g., C)
  temp = v[k];
  v[k] = v[k+1];
  v[k+1] = temp;
    ↓ Compiler
Assembly Language Program (e.g., RISC-V)
  lw $t0, 0($2)
  lw $t1, 4($2)
  sw $t1, 0($2)
  sw $t0, 4($2)
    ↓ Assembler
Machine Language Program (RISC-V)
  0000 1001 1100 0110 1010 1111 0101 1000
  1010 1111 0101 1000 0000 1001 1100 0110
  1100 0110 1010 1111 0101 1000 0000 1001
  0101 1000 0000 1001 1100 0110 1010 1111
    ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
    ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Anything can be represented as a number, i.e., data or instructions


Agenda

• Switching Networks, Transistors
• Gates and Truth Tables for Circuits
• Boolean Algebra
• Logisim
• And in Conclusion, …

Hardware Design

• Next several weeks: how a modern processor is built, starting with basic elements as building blocks
• Why study hardware design?
  – Understand capabilities and limitations of hw in general and processors in particular
  – What processors can do fast and what they can't do fast (avoid slow things if you want your code to run fast!)
  – Background for more in-depth hw courses (CS 152)
  – Hard to know what you will need for the next 30 years
  – There is only so much you can do with standard processors: you may need to design your own custom hw for extra performance
  – Even some commercial processors today have customizable hardware!

Synchronous Digital Systems

Hardware of a processor, such as the RISC-V, is an example of a Synchronous Digital System

Synchronous:
• All operations coordinated by a central clock
  – "Heartbeat" of the system!

Digital:
• Represent all values by two discrete values
• Electrical signals are treated as 1's and 0's
  – 1 and 0 are complements of each other
• High/low voltage for true/false, 1/0

Switches: Basic Element of Physical Implementations

• Implementing a simple circuit (arrow shows action if wire changes to "1" or is asserted):

[Figure: switch A controlling light bulb Z] Z ≡ A

  – Close switch (if A is "1" or asserted) and turn on light bulb (Z)
  – Open switch (if A is "0" or unasserted) and turn off light bulb (Z)

Switches (cont'd)

• Compose switches into more complex ones (Boolean functions):
  – AND: switches A and B in series, Z ≡ A and B
  – OR: switches A and B in parallel, Z ≡ A or B


Historical Note

• Early computer designers built ad hoc circuits from switches
• Began to notice common patterns in their work: ANDs, ORs, …
• Master's thesis (by Claude Shannon) made the link between their work and the 19th-century mathematician George Boole
  – Called it "Boolean" in his honor
• Could apply math to give theory to hardware design, minimization, …

Transistors

• High voltage (Vdd) represents 1, or true
• Low voltage (0 volts or Ground) represents 0, or false
• Let threshold voltage (Vth) decide if a 0 or a 1
• If switches control whether voltages can propagate through a circuit, can build a computer
• Our switches: CMOS transistors

CMOS Transistor Networks

• Modern digital systems designed in CMOS
  – MOS: Metal-Oxide on Semiconductor
  – C for complementary: use pairs of normally-open and normally-closed switches
    • Used to be called COS-MOS for complementary-symmetry-MOS
• CMOS transistors act as voltage-controlled switches
  – Similar, though easier to work with, than relay switches from earlier era
  – Use energy primarily when switching

CMOS Transistors

• Three terminals: source, gate, and drain
  – Switch action: if voltage on gate terminal is (some amount) higher/lower than source terminal then conducting path established between drain and source terminals (switch is closed)

n-channel transistor: open when voltage at Gate is low; closes when voltage(Gate) > voltage(Threshold)
(High resistance when gate voltage Low, Low resistance when gate voltage High)

p-channel transistor: closed when voltage at Gate is low; opens when voltage(Gate) > voltage(Threshold)
(Low resistance when gate voltage Low, High resistance when gate voltage High)

Note circle symbol to indicate "NOT" or "complement"

[Figure: transistor symbols, each with Gate, Source, and Drain terminals]

CMOS Circuit Rules

• Don't pass weak values => Use Complementary Pairs
  – N-type transistors pass weak 1's (Vdd - Vth)
  – N-type transistors pass strong 0's (ground)
  – Use N-type transistors only to pass 0's (N for negative)
  – Converse for P-type transistors: Pass weak 0s, strong 1s
    • Pass weak 0's (Vth), strong 1's (Vdd)
    • Use P-type transistors only to pass 1's (P for positive)
  – Use pairs of N-type and P-type to get strong values
• Never leave a wire undriven
  – Make sure there's always a path to Vdd or gnd
• Never create a path from Vdd to gnd (ground)

Agenda

• Switching Networks, Transistors
• Gates and Truth Tables for Circuits
• Boolean Algebra
• Logisim
• And in Conclusion, …


MOS Networks

[Figure: CMOS network with input X and output Y, connected between 1 V (Vdd) and 0 V (gnd)]

p-channel transistor: closed when voltage at Gate is low; opens when voltage(Gate) > voltage(Threshold)
n-channel transistor: open when voltage at Gate is low; closes when voltage(Gate) > voltage(Threshold)

What is the relationship between x and y?

x             | y
0 volts (gnd) | 1 volt (Vdd)
1 volt (Vdd)  | 0 volts (gnd)

Called an inverter or NOT gate
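The inverter's behavior follows directly from the two switch rules. The sketch below assumes the usual arrangement, which the transcript's figure does not spell out: the p-channel switch sits between Vdd and Y, and the n-channel switch sits between Y and ground. Using a single threshold `v_th` for both transistors is also a simplification, and all function names are illustrative.

```c
#include <stdbool.h>

/* Switch models: true means the switch is closed (conducting). */
bool nmos_closed(double v_gate, double v_th) { return v_gate > v_th; }
bool pmos_closed(double v_gate, double v_th) { return !(v_gate > v_th); }

/* CMOS inverter under the stated assumptions. Exactly one of the two
 * switches is closed for any input, so Y is always driven and there is
 * never a direct path from Vdd to ground. */
double inverter(double x, double vdd, double v_th) {
    if (pmos_closed(x, v_th)) {
        return vdd;   /* pull-up path conducts: Y = Vdd */
    }
    return 0.0;       /* n-channel switch is closed: Y = ground */
}
```

With Vdd = 1 V and an assumed threshold of 0.5 V, `inverter(0.0, 1.0, 0.5)` gives 1.0 and `inverter(1.0, 1.0, 0.5)` gives 0.0, the two rows of the table.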

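Two Input Networks

[Figure: two CMOS networks, each with inputs X and Y and output Z, connected between 1 V and 0 V]

What is the relationship between x, y and z?

x       | y       | z
0 volts | 0 volts |
0 volts | 1 volt  |
1 volt  | 0 volts |
1 volt  | 1 volt  |

(The same table, with the z column to be filled in, accompanies each of the two networks.)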
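The wiring of the two networks is not recoverable from this transcript. Assuming they are the standard two-input CMOS NAND and NOR gates (an assumption, not something the text confirms), the z column can be derived by checking which switch network conducts. Function names are illustrative.

```c
#include <stdbool.h>

/* Idealized switches: true means closed (conducting). */
static bool nmos(bool gate) { return gate; }    /* closed when gate is high */
static bool pmos(bool gate) { return !gate; }   /* closed when gate is low  */

/* Assumed NAND: n-channel switches in series to ground,
 * p-channel switches in parallel to Vdd. */
bool cmos_nand(bool x, bool y) {
    bool pull_down = nmos(x) && nmos(y);
    bool pull_up   = pmos(x) || pmos(y);
    return pull_up && !pull_down;   /* Z is high when only the pull-up conducts */
}

/* Assumed NOR: n-channel switches in parallel to ground,
 * p-channel switches in series to Vdd. */
bool cmos_nor(bool x, bool y) {
    bool pull_down = nmos(x) || nmos(y);
    bool pull_up   = pmos(x) && pmos(y);
    return pull_up && !pull_down;
}
```

Under this assumption the NAND output is low only when both inputs are high, and the NOR output is high only when both inputs are low. In each gate the pull-up and pull-down networks are complementary, which is what the CMOS circuit rules require.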

Truth Tables

List outputs for all possible inputs

[Figure: block F with inputs A, B, C, D and output Y]

Truth Table Example #1: y = F(a,b): 1 iff a ≠ b

a | b | y
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0

$Y = \overline{A}B + A\overline{B}$

$Y = A \oplus B$

XOR
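A truth table like this one can be generated mechanically by evaluating the sum-of-products expression on every input combination. The following is a minimal sketch; the function names are made up for the example.

```c
#include <stdbool.h>

/* Sum-of-products form of "1 iff a != b": (NOT a AND b) OR (a AND NOT b). */
bool xor_sop(bool a, bool b) {
    return (!a && b) || (a && !b);
}

/* Fill out[] with the y column in row order ab = 00, 01, 10, 11. */
void xor_truth_table(int out[4]) {
    for (int row = 0; row < 4; row++) {
        bool a = (row >> 1) & 1;   /* a is the high bit of the row index */
        bool b = row & 1;          /* b is the low bit */
        out[row] = xor_sop(a, b);
    }
}
```

Calling `xor_truth_table` fills the array with 0, 1, 1, 0, the y column of the table.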

Truth Table Example #2: 2-bit Adder

[Figure: adder block "+" with inputs A1 A0 and B1 B0 and outputs C2 C1 C0]

How Many Rows?

Truth Table Example #3: 32-bit Unsigned Adder

How Many Rows?
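A truth table has one row per input combination, so a circuit with n input bits needs 2^n rows. Assuming the adders have no carry-in (the 2-bit figure shows only A1 A0 and B1 B0 as inputs), the 2-bit adder has 4 input bits and the 32-bit adder has 64. The helper below is a sketch with a made-up name.

```c
#include <stdint.h>

/* Number of truth-table rows for a circuit with `input_bits` inputs.
 * Returns 1 and stores 2^input_bits in *rows if the count fits in 64 bits;
 * returns 0 otherwise and leaves *rows untouched. */
int truth_table_rows(unsigned input_bits, uint64_t *rows) {
    if (input_bits >= 64) {
        return 0;   /* 2^64 and above do not fit in a uint64_t */
    }
    *rows = (uint64_t)1 << input_bits;
    return 1;
}
```

For the 2-bit adder, `truth_table_rows(4, &rows)` stores 16. For the 32-bit adder the call with 64 input bits fails, because 2^64 (about 1.8 × 10^19) rows cannot even be counted in a 64-bit integer, which is why large circuits are not designed by writing out their truth tables.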


Truth Table Example #4: 3-input Majority Circuit

$Y = \overline{A}BC + A\overline{B}C + AB\overline{C} + ABC$

This is called Sum of Products form; just another way to represent the TT as a logical expression

$Y = BC + A(\overline{B}C + B\overline{C})$

$Y = BC + A(B + C)$

More simplified forms (fewer gates and wires)
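With only three inputs, the claim that the simplified form equals the sum-of-products form can be checked exhaustively over all 8 rows. The sketch below does that; the function names are illustrative.

```c
#include <stdbool.h>

/* Sum-of-products form: one product term per truth-table row where Y = 1. */
bool majority_sop(bool a, bool b, bool c) {
    return (!a && b && c) || (a && !b && c) || (a && b && !c) || (a && b && c);
}

/* Simplified form: Y = BC + A(B + C). */
bool majority_simplified(bool a, bool b, bool c) {
    return (b && c) || (a && (b || c));
}

/* Number of input rows (out of 8) on which the two forms disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1;
        bool b = (row >> 1) & 1;
        bool c = row & 1;
        if (majority_sop(a, b, c) != majority_simplified(a, b, c)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

`count_mismatches()` returns 0, so both expressions describe the same circuit; the simplified one uses fewer gates.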

Combinational Logic Symbols

• Common combinational logic systems have standard symbols called logic gates
  – Buffer, NOT
  – AND, NAND
  – OR, NOR

Easy to implement with CMOS transistors (the switches we have available and use most)

[Figure: gate symbols with inputs A, B and output Z]

Agenda

• Switching Networks, Transistors
• Gates and Truth Tables for Circuits
• Boolean Algebra
• Logisim if there is time
• And in Conclusion, …

Boolean Algebra

• Use plus for OR
  – "logical sum"
• Use product for AND ($a \cdot b$ or implied via $ab$)
  – "logical product"
• "Hat" to mean complement (NOT)
• Thus $ab + a + \overline{c}$
  $= a \cdot b + a + \overline{c}$
  = (a AND b) OR a OR (NOT c)

Boolean Algebra: Circuit & Algebraic Simplification

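Laws of Boolean Algebra

AND form | OR form | Law
$X \cdot \overline{X} = 0$ | $X + \overline{X} = 1$ | Complementarity
$X \cdot 0 = 0$ | $X + 1 = 1$ | Laws of 0's and 1's
$X \cdot 1 = X$ | $X + 0 = X$ | Identities
$X \cdot X = X$ | $X + X = X$ | Idempotent Laws
$XY = YX$ | $X + Y = Y + X$ | Commutativity
$(XY)Z = X(YZ)$ | $(X + Y) + Z = X + (Y + Z)$ | Associativity
$X(Y + Z) = XY + XZ$ | $X + YZ = (X + Y)(X + Z)$ | Distribution
$XY + X = X$ | $(X + Y)X = X$ | Uniting Theorem
$\overline{X}Y + X = X + Y$ | $(\overline{X} + Y)X = XY$ | Uniting Theorem v. 2
$\overline{XY} = \overline{X} + \overline{Y}$ | $\overline{X + Y} = \overline{X} \cdot \overline{Y}$ | DeMorgan's Law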
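Because each variable takes only the values 0 and 1, any of these laws can be verified by trying every combination. The sketch below checks DeMorgan's Law, distribution of OR over AND, and both uniting theorems; the function name is made up for this illustration.

```c
#include <stdbool.h>

/* Returns 1 if every law checked below holds for all 0/1 values of X, Y, Z,
 * and 0 as soon as any combination violates one of them. */
int boolean_laws_hold(void) {
    for (int x = 0; x <= 1; x++) {
        for (int y = 0; y <= 1; y++) {
            for (int z = 0; z <= 1; z++) {
                bool X = x, Y = y, Z = z;
                /* DeMorgan's Law, both forms */
                if ((!(X && Y)) != (!X || !Y)) return 0;
                if ((!(X || Y)) != (!X && !Y)) return 0;
                /* Distribution of OR over AND: X + YZ = (X + Y)(X + Z) */
                if ((X || (Y && Z)) != ((X || Y) && (X || Z))) return 0;
                /* Uniting Theorem: XY + X = X and (X + Y)X = X */
                if (((X && Y) || X) != X) return 0;
                if (((X || Y) && X) != X) return 0;
                /* Uniting Theorem v. 2 */
                if (((!X && Y) || X) != (X || Y)) return 0;
                if (((!X || Y) && X) != (X && Y)) return 0;
            }
        }
    }
    return 1;
}
```

`boolean_laws_hold()` returns 1. The distribution of OR over AND is the one law here with no counterpart in ordinary arithmetic, so it is the one most worth checking by hand as well.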


Boolean Algebraic Simplification Example

a | b | c | y
0 | 0 | 0 | 0
0 | 0 | 1 | 1
0 | 1 | 0 | 0
0 | 1 | 1 | 1
1 | 0 | 0 | 1
1 | 0 | 1 | 1
1 | 1 | 0 | 1
1 | 1 | 1 | 1
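The slide's own algebraic derivation is not recoverable from this transcript. One expression consistent with the eight rows of the table above is y = a + c, and the sketch below checks that candidate against the table. The array and function names are illustrative.

```c
/* y column of the truth table, indexed by the row number abc (a is the high bit). */
static const int Y_TABLE[8] = {0, 1, 0, 1, 1, 1, 1, 1};

/* Candidate simplified expression: y = a + c (b does not affect the output). */
int y_simplified(int a, int b, int c) {
    (void)b;   /* b is unused by the candidate expression */
    return a | c;
}

/* Returns 1 if the candidate matches every row of the table, 0 otherwise. */
int matches_table(void) {
    for (int row = 0; row < 8; row++) {
        int a = (row >> 2) & 1;
        int b = (row >> 1) & 1;
        int c = row & 1;
        if (y_simplified(a, b, c) != Y_TABLE[row]) {
            return 0;
        }
    }
    return 1;
}
```

`matches_table()` returns 1, so a single OR gate on a and c reproduces the table.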

Design Hierarchy

[Figure: design hierarchy tree. Levels from top to bottom: system; datapath, control; code registers, multiplexer, comparator, state registers, combinational logic; register, logic; switching networks]

A Conceptual RISC-V Datapath

Administrivia

• Midterm #1 Next Tuesday: September 26 IN CLASS!
  – Review session Saturday, September 23: 2-4pm @ VLSB 2050
    • Approximately a fourth of the time dedicated to review
    • Rest of time will be solving exam-level questions
  – Midterm 1 room assignments will be posted on Piazza
  – Students who have an exam conflict: head TA Steven will contact you by email with your testing arrangements.
  – Homework 1 Part 2 due 11:59pm this Friday
  – Guerrilla Session today 7-9pm in Cory 540AB

Break!



Agenda

• Switching Networks, Transistors
• Gates and Truth Tables for Circuits
• Boolean Algebra
• Logisim
• State Machines
• And in Conclusion, …

Logisim

• Free schematic capture/logic simulation program in Java
  – "A graphical tool for designing and simulating logic circuits"
  – Search and download version 2.7.1, online tutorial
  – ozark.hendrix.edu/~burch/logisim/
• Drawing interface based on toolbar
  – Color-coded wires aid in simulating and debugging a circuit
  – Wiring tool draws horizontal and vertical wires, automatically connecting to components and to other wires.
• Circuit layouts used as "subcircuits" of other circuits, allowing hierarchical circuit design
• Included circuit components: inputs and outputs, gates, multiplexers, arithmetic circuits, flip-flops, RAM memory

Logisim Wires

• Blue wires: value at that point is "unknown"
• Gray wires: not connected to anything
• OK when in process of building a circuit
• When finished => wires should not be blue or gray
• If connected, all wires should be green
  – Bright green: a 1
  – Dark green: a 0

Common Mistakes in Logisim

• Connecting wires together
• Using input for output
• Connecting to edge without connecting to actual input
  – Unexpected direction of input

Agenda

• Switching Networks, Transistors
• Gates and Truth Tables for Circuits
• Boolean Algebra
• Logisim
• And in Conclusion, …

And in Conclusion, …

• Linking and Loading
  – Linker combines separate modules, using the symbol and relocation tables to adjust addresses as necessary
  – Loader moves an executable file from disk to memory to allow its execution
• Multiple Hardware Representations
  – Analog voltages quantized to represent logic 0 and logic 1
  – Transistor switches form gates: AND, OR, NOT, NAND, NOR
  – Truth table mapped to gates for combinational logic design
  – Boolean algebra for gate minimization