CS 110 Computer Architecture - ShanghaiTech · 2020. 6. 16.
CS 110 Computer Architecture
Synchronous Digital Systems
Instructor: Sören Schwertfeger
http://shtech.org/courses/ca/
School of Information Science and Technology (SIST)
ShanghaiTech University
Slides based on UC Berkeley's CS61C

Compiling, Assembling, Linking, Loading (CALL) a Program
Compiler
• Input: High-Level Language Code (e.g., foo.c)
• Output: Assembly Language Code (e.g., foo.s for MIPS)
• Note: Output may contain pseudo-instructions
• Pseudo-instructions: instructions that the assembler understands but that are not in the machine language. For example:
– move $s1,$s2 ⇒ add $s1,$s2,$zero
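A minimal sketch of how an assembler might expand the pseudo-instruction from the slide. The function name `expand_pseudo` is hypothetical, and only `move` is handled; a real assembler knows many more pseudo-instructions.

```python
def expand_pseudo(instruction: str) -> str:
    """Rewrite `move rd,rs` as `add rd,rs,$zero`; pass other lines through."""
    parts = instruction.split(None, 1)  # split the mnemonic from its operands
    if len(parts) == 2 and parts[0] == "move":
        rd, rs = [reg.strip() for reg in parts[1].split(",")]
        return f"add {rd},{rs},$zero"
    return instruction


print(expand_pseudo("move $s1,$s2"))
```

Real instructions such as `lw $t0,0($2)` are returned unchanged.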
Assembler
• Input: Assembly Language Code (MAL) (e.g., foo.s for MIPS)
• Output: Object Code, information tables (TAL) (e.g., foo.o for MIPS)
• Reads and uses directives
• Replaces pseudo-instructions
• Produces machine language
• Creates object file
Linker
• Input: Object code files, information tables (e.g., foo.o, libc.o for MIPS)
• Output: Executable code (e.g., a.out for MIPS)
• Combines several object (.o) files into a single executable (“linking”)
• Step 1: Combine text segments from .o files
• Step 2: Combine data segments from .o files
• Step 3: Resolve references
– Go through the Relocation Table; handle each entry => resolve absolute addresses
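The three linker steps can be sketched as a toy model. Everything here is an assumption made for illustration: the object-file layout (dictionaries with `text`, `data`, `symbols`, `relocations`), the `?` placeholder for an unresolved address, the word-sized offsets, and placing data right after text. Real object formats are far richer.

```python
def link(objects, text_base=0, word=4):
    """Toy linker: concatenate text, then data, then patch relocation entries."""
    text, data, symbols, pending = [], [], {}, []

    # Step 1: combine text segments, remembering where each object's text starts.
    text_starts = []
    for obj in objects:
        text_starts.append(len(text))
        text.extend(obj["text"])

    # Step 2: combine data segments (assumed to sit right after all text).
    data_base = text_base + word * len(text)
    for obj, t_start in zip(objects, text_starts):
        d_start = len(data)
        data.extend(obj["data"])
        for name, (segment, offset) in obj["symbols"].items():
            if segment == "text":
                symbols[name] = text_base + word * (t_start + offset)
            else:
                symbols[name] = data_base + word * (d_start + offset)
        for index, name in obj["relocations"]:
            pending.append((t_start + index, name))

    # Step 3: go through the relocation entries and fill in absolute addresses.
    for index, name in pending:
        text[index] = text[index].replace("?", hex(symbols[name]))
    return text, data, symbols


# Hypothetical object files: main.o calls `helper` (defined in lib.o) and uses `x`.
main_o = {"text": ["jal ?", "la $t0,?"], "data": [7],
          "symbols": {"main": ("text", 0), "x": ("data", 0)},
          "relocations": [(0, "helper"), (1, "x")]}
lib_o = {"text": ["jr $ra"], "data": [],
         "symbols": {"helper": ("text", 0)}, "relocations": []}

text, data, symbols = link([main_o, lib_o])
print(text)
```

Relocation entries are collected first and patched last because a symbol such as `helper` may be defined in an object file that comes later in the list.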
Loader Basics
• Input: Executable Code (e.g., a.out for MIPS)
• Output: (program is run)
• Executable files are stored on disk
• When one is run, loader’s job is to load it into memory and start it running
• In reality, loader is the operating system (OS)
– Loading is one of the OS tasks
Static vs Dynamically Linked Libraries
• What we’ve described is the traditional way: the statically-linked approach
– The library is now part of the executable, so if the library updates, we don’t get the fix (have to recompile if we have source)
– It includes the entire library even if not all of it will be used
– Executable is self-contained
• An alternative is dynamically linked libraries (DLL), common on Windows (.dll) & UNIX (.so, shared object) platforms
Dynamically Linked Libraries
• Space/time issues
+ Storing a program requires less disk space
+ Sending a program requires less time
+ Executing two programs requires less memory (if they share a library)
– At runtime, there’s time overhead to do the link
• Upgrades
+ Replacing one file (libXYZ.so) upgrades every program that uses library “XYZ”
– Having the executable isn’t enough anymore
Overall, dynamic linking adds quite a bit of complexity to the compiler, linker, and operating system. However, it provides many benefits that often outweigh these
en.wikipedia.org/wiki/Dynamic_linking
Dynamically Linked Libraries
• The prevailing approach to dynamic linking uses machine code as the “lowest common denominator”
– The linker does not use information about how the program or library was compiled (i.e., what compiler or language)
– This can be described as “linking at the machine code level”
– This isn’t the only way to do it...
In Conclusion…
• Compiler converts a single HLL file into a single assembly language file.
• Assembler removes pseudo-instructions, converts what it can to machine language, and creates a checklist for the linker (relocation table). A .s file becomes a .o file.
– Does 2 passes to resolve addresses, handling internal forward references
• Linker combines several .o files and resolves absolute addresses.
– Enables separate compilation, libraries that need not be compiled, and resolves remaining addresses
• Loader loads executable into memory and begins execution.
Levels of Representation/Interpretation
• High Level Language Program (e.g., C): temp = v[k]; v[k] = v[k+1]; v[k+1] = temp;
↓ Compiler
• Assembly Language Program (e.g., MIPS): lw $t0,0($2); lw $t1,4($2); sw $t1,0($2); sw $t0,4($2)
↓ Assembler
• Machine Language Program (MIPS):
0000 1001 1100 0110 1010 1111 0101 1000
1010 1111 0101 1000 0000 1001 1100 0110
1100 0110 1010 1111 0101 1000 0000 1001
0101 1000 0000 1001 1100 0110 1010 1111
↓ Machine Interpretation
• Hardware Architecture Description (e.g., block diagrams)
↓ Architecture Implementation
• Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions
Harness Parallelism & Achieve High Performance
• Parallel Requests: assigned to computer, e.g., Search “Katz”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words (A3+B3, A2+B2, A1+B1, A0+B0)
• Hardware descriptions: all gates @ one time
• Programming Languages
[Figure: software/hardware levels from Warehouse Scale Computer and Smart Phone through Computer (Cores, Memory (Cache), Input/Output) and Core (Instruction Unit(s), Functional Unit(s), Cache Memory) down to Logic Gates, with a “Today: You are Here!” marker]
Hardware Design
• Next several weeks: how a modern processor is built, starting with basic elements as building blocks
• Why study hardware design?
– Understand capabilities and limitations of HW in general and processors in particular
– What processors can do fast and what they can’t do fast (avoid slow things if you want your code to run fast!)
– Background for more in-depth HW courses
– Hard to know what you’ll need for next 30 years
– There is only so much you can do with standard processors: you may need to design own custom HW for extra performance
– Even some commercial processors today have customizable hardware!
– E.g. Google Tensor Processing Unit (TPU)
Synchronous Digital Systems
Hardware of a processor, such as the MIPS, is an example of a Synchronous Digital System
Synchronous:
• All operations coordinated by a central clock
– “Heartbeat” of the system!
Digital:
• Represent all values by discrete values
• Two binary digits: 1 and 0
• Electrical signals are treated as 1’s and 0’s
• 1 and 0 are complements of each other
• High/low voltage for true/false, 1/0
Switches: Basic Element of Physical Implementations
• Implementing a simple circuit (arrow shows action if wire changes to “1” or is asserted):
[Figure: switch A controlling light bulb Z]
Z ≡ A
• On-switch (if A is “1” or asserted) turns on light bulb (Z)
• Off-switch (if A is “0” or unasserted) turns off light bulb (Z)
Switches (cont’d)
• Compose switches into more complex ones (Boolean functions):
– AND (switches A and B in series): Z ≡ A and B
– OR (switches A and B in parallel): Z ≡ A or B
Historical Note
• Early computer designers built ad hoc circuits from switches
• Began to notice common patterns in their work: ANDs, ORs, …
• Master’s thesis (by Claude Shannon, 1940) made link between work and 19th Century Mathematician George Boole
– Called it “Boolean” in his honor
• Could apply math to give theory to hardware design, minimization, …
Transistors
• High voltage (Vdd) represents 1, or true
– In modern microprocessors, Vdd ~ 1.0 Volt
• Low voltage (0 Volt or Ground) represents 0, or false
• Pick a midpoint voltage to decide if a 0 or a 1
– Voltage greater than midpoint = 1
– Voltage less than midpoint = 0
– This removes noise as signals propagate: a big advantage of digital systems over analog systems
• If one switch can control another switch, we can build a computer!
• Our switches: CMOS transistors
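The midpoint rule can be sketched in a few lines. `VDD = 1.0` follows the slide's "~1.0 Volt"; taking the midpoint as exactly `VDD / 2` and the sample voltages are assumptions for illustration.

```python
VDD = 1.0            # supply voltage from the slide (~1.0 Volt)
MIDPOINT = VDD / 2   # assumed decision threshold


def to_bit(voltage: float) -> int:
    """Interpret a voltage as a logic value using the midpoint rule."""
    return 1 if voltage > MIDPOINT else 0


def restore(voltage: float) -> float:
    """Regenerate a clean level, as each digital stage effectively does."""
    return VDD if to_bit(voltage) else 0.0


noisy = [0.93, 0.08, 1.04, 0.21, 0.71]   # hypothetical noisy samples
bits = [to_bit(v) for v in noisy]
clean = [restore(v) for v in noisy]
print(bits, clean)
```

Because every stage re-decides between 0 and 1, small disturbances do not accumulate the way they would in an analog signal chain.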
CMOS Transistor Networks
• Modern digital systems designed in CMOS
– MOS: Metal-Oxide on Semiconductor
– C for complementary: use pairs of normally-on and normally-off switches
• CMOS transistors act as voltage-controlled switches
– Similar to, though easier to work with than, electro-mechanical relay switches from earlier era
– Use energy primarily when switching
CMOS Transistors
• Three terminals: source, gate, and drain
– Switch action: if voltage on gate terminal is (some amount) higher/lower than source terminal, then a conducting path is established between drain and source terminals (switch is closed)
n-channel transistor
off when voltage at Gate is low
on when: voltage (Gate) > voltage (Threshold)
(High resistance when gate voltage Low, Low resistance when gate voltage High)
p-channel transistor
on when voltage at Gate is low
off when: voltage (Gate) > voltage (Threshold)
(Low resistance when gate voltage Low, High resistance when gate voltage High)
[Figure: n-channel and p-channel transistor symbols with Gate, Source, and Drain terminals]
Note circle symbol to indicate “NOT” or “complement”
[Figure: transistor symbol with Gate, Drain, and Source terminals]
field-effect transistor (FET) => CMOS circuits use a combination of p-type and n-type metal–oxide–semiconductor field-effect transistors => MOSFET
#2: Moore’s Law
Predicts: 2X transistors/chip every 2 years
Gordon Moore, Intel Cofounder
[Figure: # of transistors on an integrated circuit (IC) vs. Year]
Modern microprocessor chips include several billion transistors
Intel 14nm Technology
[Figure: plan view of transistors]
[Figure: side view of wiring layers]
Sense of Scale
1 nm = 1/1,000,000,000 m; wavelength of visible light: 400–700 nm
Source: Mark Bohr, IDF14
CMOS Circuit Rules
• Don’t pass weak values => Use Complementary Pairs
– N-type transistors pass weak 1’s (Vdd - Vth)
– N-type transistors pass strong 0’s (ground)
– Use N-type transistors only to pass 0’s (N for negative)
– Converse for P-type transistors: pass weak 0’s (Vth), strong 1’s (Vdd)
– Use P-type transistors only to pass 1’s (P for positive)
– Use pairs of N-type and P-type to get strong values
• Never leave a wire undriven
– Make sure there’s always a path to Vdd or GND
• Never create a path from Vdd to GND (ground)
– This would short-circuit the power supply!
CMOS Networks
What is the relationship between x and y?
[Figure: p-channel transistor between 1 V (Vdd) and output Y, n-channel transistor between Y and 0 V (GND), both gates driven by X]
X = 1 Volt (Vdd) → Y = 0 Volt (GND)
X = 0 Volt (GND) → Y = 1 Volt (Vdd)
p-channel transistor: on when voltage at Gate is low; off when voltage (Gate) > voltage (Threshold)
n-channel transistor: off when voltage at Gate is low; on when voltage (Gate) > voltage (Threshold)
Called an inverter or NOT gate
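Two-Input Networks
What is the relationship between x, y and z?
[Figure: two-input CMOS network with inputs X and Y and output Z, connected between 1 V and 0 V]
X      | Y      | Z
0 Volt | 0 Volt | 1 Volt
0 Volt | 1 Volt | 1 Volt
1 Volt | 0 Volt | 1 Volt
1 Volt | 1 Volt | 0 Volt
Called a NAND gate (NOT AND)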
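A switch-level sketch of the inverter and the NAND gate, treating each transistor as an ideal switch. The topology is an assumption, since the slide's figure did not survive: it is the standard CMOS arrangement, with p-type transistors in parallel toward Vdd and n-type transistors in series toward ground. The function names are hypothetical.

```python
def n_type(gate: int) -> bool:
    """n-channel switch: conducts when the gate is high."""
    return gate == 1


def p_type(gate: int) -> bool:
    """p-channel switch: conducts when the gate is low."""
    return gate == 0


def drive(pull_up: bool, pull_down: bool) -> int:
    """Apply the CMOS rules: exactly one of the two networks may conduct."""
    assert pull_up != pull_down, "wire undriven or Vdd shorted to GND"
    return 1 if pull_up else 0


def inverter(x: int) -> int:
    return drive(pull_up=p_type(x), pull_down=n_type(x))


def nand(x: int, y: int) -> int:
    pull_up = p_type(x) or p_type(y)      # two p-type switches in parallel
    pull_down = n_type(x) and n_type(y)   # two n-type switches in series
    return drive(pull_up, pull_down)


for x in (0, 1):
    for y in (0, 1):
        print(x, y, nand(x, y))
```

The `drive` check encodes the two circuit rules (never leave a wire undriven, never connect Vdd to ground); complementary pull-up and pull-down networks satisfy it for every input combination.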
Question
[Figure: two-input CMOS network with inputs X and Y and output Z, connected between 1 V and 0 V]
X      | Y
0 Volt | 0 Volt
0 Volt | 1 Volt
1 Volt | 0 Volt
1 Volt | 1 Volt
Z (Volts):
0 0 1
0 1 0
0 1 0 1
1 1 0 0
A B C
Combinational Logic Symbols
• Common combinational logic systems have standard symbols called logic gates
– Buffer, NOT
– AND, NAND
– OR, NOR
[Figure: gate symbols with inputs A, B and output Z]
Inverting versions (NOT, NAND, NOR) easiest to implement with CMOS transistors (the switches we have available and use most)
Remember…
• AND
• OR
[Figure: two CMOS networks with inputs X and Y, each connected between 1 V and 0 V]
Admin
• Midterm I: April 19th!
– Allowed material: 1 double-sided A4 cheat sheet in English, hand-written by you
• Not copied: everything must be original handwriting
• Violations:
– Found before midterm: cheat sheet confiscated
– During/after: 0 pts in midterm
– MIPS green card provided by us!
– No electronic devices, no calculator!
– Content: Number representation, C, MIPS, CALL
– Review session on April 17th.
• Project 1.1 autograder
Boolean Algebra
• Use plus “+” for OR
– “logical sum”: 1+0 = 0+1 = 1 (True); 1+1 = 2 (True); 0+0 = 0 (False)
• Use product for AND ($a \cdot b$ or implied via $ab$)
– “logical product”: 0*0 = 0*1 = 1*0 = 0 (False); 1*1 = 1 (True)
• “Hat” to mean complement (NOT)
• Thus $ab + a + \bar{c} = a \cdot b + a + \bar{c}$ = (a AND b) OR a OR (NOT c)
Truth Tables for Combinational Logic
[Figure: block F with inputs A, B, C, D and output Y]
Exhaustive list of the output value generated for each combination of inputs
How many logic functions can be defined with N inputs?
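One way to answer the question is to count by brute force: a truth table with N inputs has 2^N rows, and each row's output can be 0 or 1 independently, giving 2^(2^N) distinct functions. The sketch below checks this for small N; `count_functions` is a hypothetical helper name.

```python
from itertools import product


def count_functions(n: int):
    """Return (rows in the truth table, number of distinct output columns)."""
    rows = list(product([0, 1], repeat=n))             # every input combination
    columns = set(product([0, 1], repeat=len(rows)))   # every possible output column
    return len(rows), len(columns)


for n in (1, 2, 3):
    print(n, count_functions(n), 2 ** (2 ** n))
```

The enumeration is only practical for small N, because the count grows doubly exponentially.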
Truth Table Example #1: y = F(a,b): 1 iff a ≠ b
a b | y
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0
$Y = \bar{A}B + A\bar{B}$
$Y = A \oplus B$
XOR
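As a quick check, the sum-of-products form of XOR can be evaluated over all four input rows and compared with the "1 iff a ≠ b" definition. This is a sketch using 0/1 integers; the function name is hypothetical.

```python
from itertools import product


def xor_sum_of_products(a: int, b: int) -> int:
    """Y = (NOT a AND b) OR (a AND NOT b), with 0/1 integers."""
    return ((1 - a) & b) | (a & (1 - b))


table = [(a, b, xor_sum_of_products(a, b)) for a, b in product([0, 1], repeat=2)]
for row in table:
    print(*row)
```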
Truth Table Example #2: 2-bit Adder
[Figure: adder (+) with inputs A1 A0 and B1 B0 and outputs C2 C1 C0]
How Many Rows?
Truth Table Example #3: 32-bit Unsigned Adder
How Many Rows?
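The row counts follow from the number of input bits: a truth table needs one row per input combination, so an adder with two n-bit operands has 2^(2n) rows. The sketch below assumes no carry-in, and `adder_truth_table` is a hypothetical helper.

```python
from itertools import product


def adder_truth_table(bits: int):
    """Rows (a, b, a + b) for an unsigned `bits`-bit adder without carry-in."""
    values = range(2 ** bits)
    return [(a, b, a + b) for a, b in product(values, repeat=2)]


two_bit = adder_truth_table(2)
rows_2bit = len(two_bit)      # 2**(2*2) rows
rows_32bit = 2 ** (2 * 32)    # computed, not enumerated: far too many rows to list
print(rows_2bit, rows_32bit)
```

The largest sum of two 2-bit numbers is 3 + 3 = 6, which is why the 2-bit adder needs three output bits (C2 C1 C0).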
Truth Table Example #4: 3-input Majority Circuit
$Y = \bar{A}BC + A\bar{B}C + AB\bar{C} + ABC$
This is called Sum of Products form; just another way to represent the TT as a logical expression
$Y = BC + A(\bar{B}C + B\bar{C})$
$Y = BC + A(B + C)$
More simplified forms (fewer gates and wires)
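A brute-force sketch confirming that the sum-of-products form and the two simplified forms agree with the majority definition (output 1 when at least two inputs are 1). The function names are hypothetical.

```python
from itertools import product


def NOT(x: int) -> int:
    return 1 - x


def sum_of_products(a, b, c):
    return (NOT(a) & b & c) | (a & NOT(b) & c) | (a & b & NOT(c)) | (a & b & c)


def factored(a, b, c):
    return (b & c) | (a & ((NOT(b) & c) | (b & NOT(c))))


def simplified(a, b, c):
    return (b & c) | (a & (b | c))


def majority(a, b, c):
    return int(a + b + c >= 2)


all_agree = all(
    sum_of_products(*bits) == factored(*bits) == simplified(*bits) == majority(*bits)
    for bits in product([0, 1], repeat=3)
)
print(all_agree)
```

With only eight input rows, exhaustive comparison is a complete equivalence proof for these expressions.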
Boolean Algebra: Circuit & Algebraic Simplification
Representations of Combinational Logic (groups of logic gates)
• Truth Table → Boolean Expression: Sum of Products, Product of Sums methods
• Boolean Expression → Truth Table: enumerate inputs
• Gate Diagram → Truth Table: enumerate inputs
• Boolean Expression ↔ Gate Diagram: use equivalency between Boolean operators and gates
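Laws of Boolean Algebra
Complementarity: $X \cdot \bar{X} = 0$; $X + \bar{X} = 1$
Laws of 0’s and 1’s: $X \cdot 0 = 0$; $X + 1 = 1$
Identities: $X \cdot 1 = X$; $X + 0 = X$
Idempotent Laws: $X \cdot X = X$; $X + X = X$
Commutativity: $XY = YX$; $X + Y = Y + X$
Associativity: $(XY)Z = X(YZ)$; $(X + Y) + Z = X + (Y + Z)$
Distribution: $X(Y + Z) = XY + XZ$; $X + YZ = (X + Y)(X + Z)$
Uniting Theorem: $XY + X = X$; $(X + Y)X = X$
Uniting Theorem v.2: $\bar{X}Y + X = X + Y$; $(\bar{X} + Y)X = XY$
DeMorgan’s Law: $\overline{XY} = \bar{X} + \bar{Y}$; $\overline{X + Y} = \bar{X}\,\bar{Y}$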
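Any of these laws can be checked by exhaustive evaluation, since each variable takes only two values. The sketch below tests a selection of them; the `LAWS` table and `holds` helper are hypothetical names, and the variable `z` is simply ignored by laws that do not use it.

```python
from itertools import product


def NOT(x: int) -> int:
    return 1 - x


# Each entry maps a law's name to (left-hand side, right-hand side).
LAWS = {
    "complementarity": (lambda x, y, z: x & NOT(x), lambda x, y, z: 0),
    "associativity (OR)": (lambda x, y, z: (x | y) | z, lambda x, y, z: x | (y | z)),
    "distribution (OR over AND)": (lambda x, y, z: x | (y & z),
                                   lambda x, y, z: (x | y) & (x | z)),
    "uniting theorem": (lambda x, y, z: (x & y) | x, lambda x, y, z: x),
    "uniting theorem v.2": (lambda x, y, z: (NOT(x) & y) | x, lambda x, y, z: x | y),
    "DeMorgan (AND)": (lambda x, y, z: NOT(x & y), lambda x, y, z: NOT(x) | NOT(y)),
    "DeMorgan (OR)": (lambda x, y, z: NOT(x | y), lambda x, y, z: NOT(x) & NOT(y)),
}


def holds(lhs, rhs) -> bool:
    """True if both sides agree on every assignment of 0/1 to x, y, z."""
    return all(lhs(*v) == rhs(*v) for v in product([0, 1], repeat=3))


results = {name: holds(lhs, rhs) for name, (lhs, rhs) in LAWS.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```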
Boolean Algebraic Simplification Example
a b c | y
0 0 0 | 0
0 0 1 | 1
0 1 0 | 0
0 1 1 | 1
1 0 0 | 1
1 0 1 | 1
1 1 0 | 1
1 1 1 | 1
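A sketch of the Sum of Products method applied to this truth table, whose y column reads 0, 1, 0, 1, 1, 1, 1, 1 for inputs 000 through 111. It lists one product term per row where y = 1, then checks that the table is also consistent with the much shorter expression a OR c. The slide's own derivation did not survive extraction, so that simplified form is a result of this check rather than a quotation.

```python
from itertools import product

Y_COLUMN = [0, 1, 0, 1, 1, 1, 1, 1]
TABLE = dict(zip(product([0, 1], repeat=3), Y_COLUMN))


def minterms(table):
    """One AND term per row whose output is 1 (Sum of Products form)."""
    terms = []
    for (a, b, c), y in table.items():
        if y:
            literals = [name if bit else f"NOT {name}"
                        for name, bit in zip("abc", (a, b, c))]
            terms.append(" AND ".join(literals))
    return terms


sop = minterms(TABLE)
print(" OR\n".join(sop))

# The six-term expression collapses to a single OR of two inputs.
matches_a_or_c = all((a | c) == y for (a, b, c), y in TABLE.items())
print(matches_a_or_c)
```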
Question
• Simplify Z = A + BC + A(BC)
• A: Z = 0
• B: Z = A(1 + BC)
• C: Z = (A + BC)
• D: Z = BC
• E: Z = 1
News (2017): Open Compute Project Summit
Google & STMicroelectronics: 48 V to Chip
• Point-of-Load (PoL) Converter
• 48 V to 0.5 V .. 1 V .. up to 12 V; >300 W @ 1 V!
• Efficiency: 230 V AC 89.3%; 48 V DC 92.1%
Signals and Waveforms
[Figure: waveforms of signals a_{n-1} … a_0; annotations: Noisy! Delay!]
Signals and Waveforms: Grouping
Signals and Waveforms: Circuit Delay
[Figure: timing diagram with example signal values]
Sample Debugging Waveform
Type of Circuits
• Synchronous Digital Systems consist of two basic types of circuits:
• Combinational Logic (CL) circuits
– Output is a function of the inputs only, not the history of its execution
– E.g., circuits to add A, B (ALUs)
• Sequential Logic (SL)
– Circuits that “remember” or store information
– aka “State Elements”
– E.g., memories and registers (Registers)
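The distinction can be sketched in software terms: a combinational block behaves like a pure function, while a state element keeps a value between clock ticks. The `Register` class and its `tick` method are hypothetical names, and the accumulator is just one illustrative way to combine the two.

```python
def adder(a: int, b: int) -> int:
    """Combinational: the output depends only on the current inputs."""
    return a + b


class Register:
    """Sequential: holds a value and updates it only on a clock tick."""

    def __init__(self, initial: int = 0):
        self.value = initial

    def tick(self, data_in: int) -> int:
        self.value = data_in
        return self.value


# An accumulator: on each tick, next state = current state + input.
acc = Register()
for x in [3, 4, 5]:
    acc.tick(adder(acc.value, x))
print(acc.value)
```

Calling `adder` twice with the same arguments always gives the same result, whereas the value read from `acc` depends on the history of ticks.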