Post on 03-Jun-2022
CS110 Computer Architecture
(a.k.a. Machine Structures) Lecture 1: Course Introduction
Instructor: Sören Schwertfeger
http://shtech.org/courses/ca/
School of Information Science and Technology SIST
ShanghaiTech University
Slides based on UC Berkeley’s CS61C
Agenda
• Thinking about Machine Structures
• Great Ideas in Computer Architecture
• What you need to know about this class
• Everything is a Number
CS110 is NOT really about C Programming
• It is about the hardware-software interface
– What does the programmer need to know to achieve the highest possible performance
• C is close to the underlying hardware, unlike languages like Scheme, Python, Java!
– Allows us to talk about key hardware features in higher level terms
– Allows programmer to explicitly harness underlying hardware parallelism for high performance
Old School Computer Architecture
New School Computer Architecture (1/3)
Personal Mobile Devices
New School Computer Architecture (2/3)
New School Computer Architecture (3/3)
Old School Machine Structures
[Figure: layers between software and hardware. Software: Application (ex: browser); Compiler, Assembler, Operating System (Mac OSX). Interface: Instruction Set Architecture. Hardware: Processor, Memory, I/O system; Datapath & Control; Digital Design; Circuit Design; transistors]
New-School Machine Structures (It’s a bit more complicated!)
• Parallel Requests: assigned to computer, e.g., Search “cats”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates functioning in parallel at same time
[Figure: “Harness Parallelism & Achieve High Performance”. Software/Hardware stack from Warehouse-Scale Computer and Smart Phone, through Computer (Core, Core, …; Memory (Cache); Input/Output; Main Memory) and Core (Instruction Unit(s); Functional Unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3), down to Logic Gates; levels annotated with Project 1, Project 2, Project 3]
6 Great Ideas in Computer Architecture
1. Abstraction (Layers of Representation/Interpretation)
2. Moore’s Law (Designing through trends)
3. Principle of Locality (Memory Hierarchy)
4. Parallelism
5. Performance Measurement & Improvement
6. Dependability via Redundancy
Great Idea #1: Abstraction (Levels of Representation/Interpretation)
High Level Language Program (e.g., C)
    temp = v[k]; v[k] = v[k+1]; v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  ↓ Assembler
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions
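The three-statement C fragment at the top of the abstraction stack swaps two neighbouring array elements. A minimal sketch of it as a complete function follows; the function name `swap_adjacent` and the `int` element type are assumptions made for illustration, since the slide only shows the three statements.

```c
#include <assert.h>
#include <stddef.h>

/* Swap v[k] and v[k+1].
 * The caller must guarantee that k+1 is a valid index into v. */
void swap_adjacent(int v[], size_t k)
{
    assert(v != NULL);
    int temp = v[k];      /* corresponds to the first lw  */
    v[k] = v[k + 1];      /* second lw, then first sw     */
    v[k + 1] = temp;      /* second sw                    */
}
```

A compiler targeting MIPS could turn the body into two loads followed by two stores, which is the shape of the assembly shown on the slide.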
#2: Moore’s Law
Gordon Moore, Intel Cofounder
Predicts: 2X Transistors/chip every 2 years
Interesting Times
Moore’s Law relied on the cost of transistors scaling down as technology scaled to smaller and smaller feature sizes.
BUT newest, smallest fabrication processes <14nm, might have greater cost/transistor!!!! So, why shrink????
Jim Gray’s Storage Latency Analogy: How Far Away is the Data?
Storage level          Latency (ns)   Analogy       Travel time
Registers              1              My Head       1 min
On Chip Cache          2              This Room
On Board Cache         10             This Campus   10 min
Main Memory            100            Suzhou        1.5 hr
Disk                   10^6           Pluto         2 Years
Tape / Optical Robot   10^9           Andromeda     2,000 Years
Jim Gray, Turing Award
Great Idea #3: Principle of Locality / Memory Hierarchy
2/23/16 17
Great Idea #4: Parallelism
Caveat: Amdahl’s Law
Gene Amdahl, Computer Pioneer
Great Idea #5: Performance Measurement and Improvement
• Tuning application to underlying hardware to exploit:
– Locality
– Parallelism
– Special hardware features, like specialized instructions (e.g., matrix manipulation)
• Latency
– How long to set the problem up
– How much faster does it execute once it gets going
– It is all about time to finish
Coping with Failures
• 4 disks/server, 50,000 servers
• Failure rate of disks: 2% to 10% / year
– Assume 4% annual failure rate
• On average, how often does a disk fail?
a) 1 / month
b) 1 / week
c) 1 / day
d) 1 / hour
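50,000 x 4 = 200,000 disks
200,000 x 4% = 8,000 disks fail per year
365 days x 24 hours = 8,760 hours per year
8,760 hours / 8,000 failures ≈ 1.1 hours between failures, so the answer is d) 1 / hour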
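The arithmetic for the disk-failure question can be checked with a few lines of C. This is only a sketch: the function name `hours_between_failures` and its parameters are made up for illustration, and it assumes failures are spread evenly over the year.

```c
#include <assert.h>

/* Expected hours between disk failures across a fleet.
 * annual_failure_rate is a fraction per disk per year, e.g. 0.04 for 4%. */
double hours_between_failures(long servers, long disks_per_server,
                              double annual_failure_rate)
{
    assert(servers > 0 && disks_per_server > 0 && annual_failure_rate > 0.0);
    double disks = (double)servers * (double)disks_per_server;
    double failures_per_year = disks * annual_failure_rate;
    double hours_per_year = 365.0 * 24.0;
    return hours_per_year / failures_per_year;
}
```

With 50,000 servers, 4 disks each, and a 4% annual failure rate, the function returns roughly 1.1, i.e. about one failed disk per hour.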
NASA Fixing Rover’s Flash Memory
• Opportunity still active on Mars after >10 years
• But flash memory worn out
• New software update to avoid using worn out memory banks
http://www.engadget.com/2014/12/30/nasa-opportunity-rover-flash-fix/
Great Idea #6: Dependability via Redundancy
• Redundancy so that a failing piece doesn’t make the whole system fail
[Figure: three units each compute 1+1; two answer 2 and one answers 1 (FAIL!); 2 of 3 agree, so the result is 1+1=2]
Increasing transistor density reduces the cost of redundancy
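The "2 of 3 agree" scheme can be sketched in C as a bitwise majority vote over three redundant results. The function name `majority3` is an assumption for illustration; the slide does not say how the vote is implemented.

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <stdint.h>

/* Bitwise 2-of-3 majority vote: each result bit takes the value that
 * at least two of the three inputs agree on. */
uint32_t majority3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}
```

If two units report 2 and the faulty one reports 1, `majority3(2, 2, 1)` still evaluates to 2, so a single failing unit does not change the result.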
Great Idea #6: Dependability via Redundancy
• Applies to everything from datacenters to storage to memory to instructors
– Redundant datacenters so that can lose 1 datacenter but Internet service stays online
– Redundant disks so that can lose 1 disk but not lose data (Redundant Arrays of Independent Disks / RAID)
– Redundant memory bits so that can lose 1 bit but no data (Error Correcting Code / ECC Memory)
Weekly Schedule
Lecture      Tuesday, 08:15-09:55, room: H2109
Lecture      Friday, 10:15-11:55, room: H2109
Discussions  Tuesday, 18:40-20:20, Teaching Building 309
Lab 1        Monday, 15:00-16:40, Teaching Building 308
Lab 2        Tuesday, 15:00-16:40, Teaching Building 309
Lab 3        Thursday, 15:00-16:40, Administration Building 405
Course Information
• Course Web: http://shtech.org/course/ca/
• Acknowledgement: Instructors of UC Berkeley’s CS61C: http://www-inst.eecs.berkeley.edu/~cs61c/
• Instructor:
– Sören Schwertfeger
• Teaching Assistants: (see webpage)
• Textbooks: Average 15 pages of reading/week
– Patterson & Hennessy, Computer Organization and Design, 5th Edition (Chinese version is 4th edition: significant differences!)
– Kernighan & Ritchie, The C Programming Language, 2nd Edition
– Barroso & Hölzle, The Datacenter as a Computer, 2nd Edition
• Piazza:
– Every announcement, discussion, clarification happens there
Course Grading
• Projects: 33%
• Homework: 17%
• Lab: 10%
• Exams: 35%
– Midterm 1: 7.5%
– Midterm 2: 7.5%
– Final: 20%
• Participation: 5%
Late Policy … Slip Days!
• Assignments due at 11:59:59 PM
• You have 3 slip day tokens (NOT hour or min)
• Every day your project or homework is late (even by a minute) we deduct a token
• After you’ve used up all tokens, it’s 25% deducted per day.
– No credit if more than 3 days late
– Save your tokens for projects, worth more!!
• No need for sob stories, just use a slip day!
• Gradebot is open till 3 days after due date!
• If you need more time (slip days plus deduction) send the TA and Prof. an email!
Policy on Assignments and Independent Work
• ALL PROJECTS WILL BE DONE WITH A PARTNER
• With the exception of laboratories and assignments that explicitly permit you to work in groups, all homework and projects are to be YOUR work and your work ALONE.
• PARTNER TEAMS MAY NOT WORK WITH OTHER PARTNER TEAMS
• You are encouraged to discuss your assignments with other students, and credit will be assigned to students who help others, particularly by answering questions on Piazza, but we expect that what you hand in is yours.
• It is NOT acceptable to copy solutions from other students.
• It is NOT acceptable to copy (or start your) solutions from the Web.
• It is NOT acceptable to use PUBLIC GitHub archives (giving your answers away)
• We have tools and methods, developed over many years, for detecting this. You WILL be caught, and the penalties WILL be severe.
• At the minimum F in the course, and a letter to your university record documenting the incident of cheating.
• Both Giver and Receiver are equally culpable and suffer equal penalties
Discussion & Labs & HW1
• First discussion today! Tuesday, 18:40-20:20, Teaching Building 309
– Topic: Number representation
– Let us know what topics you’d like to have covered!
– Topic next discussion: C
• Labs: Find a partner for your lab work and the projects, from your lab class!
– Send an email to Xu Qingwen (xuqw)
– Labs start next week
• HW1 will be posted on Friday.
Architecture of a typical Lecture
[Figure: Attention (up to Full) plotted against Time (minutes; ticks at 10, 35, 60, 78, 90), with segments labeled Administrivia, Fun/News, and “And in conclusion…”]
Key Concepts
• Inside computers, everything is a number
• But numbers usually stored with a fixed size
– 8-bit bytes, 16-bit half words, 32-bit words, 64-bit double words, …
• Integer and floating-point operations can lead to results too big/small to store within their representations: overflow/underflow
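To make the fixed-size point concrete, here is a small C sketch using an 8-bit byte. The helper names `add_u8_wrapping` and `add_u8_overflows` are invented for this example; they show what is stored when a sum does not fit, and how to detect that case.

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <stdint.h>

/* Add two 8-bit unsigned values the way an 8-bit register would:
 * the sum is truncated to 8 bits (wraps modulo 256). */
uint8_t add_u8_wrapping(uint8_t a, uint8_t b)
{
    /* a and b are promoted to int for the addition, then truncated. */
    return (uint8_t)(a + b);
}

/* Report whether the true sum does not fit in 8 bits. */
int add_u8_overflows(uint8_t a, uint8_t b)
{
    return (unsigned)a + (unsigned)b > UINT8_MAX;
}
```

For example, 250 + 10 is 260, which needs nine bits; `add_u8_wrapping(250, 10)` keeps only the low eight bits and yields 4, while `add_u8_overflows(250, 10)` reports the overflow.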
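Number Representation
• Value of i-th digit is $d \times \text{Base}^{i}$ where i starts at 0 and increases from right to left:
• $123_{10} = 1_{10} \times 10_{10}^{2} + 2_{10} \times 10_{10}^{1} + 3_{10} \times 10_{10}^{0}$
$= 1 \times 100_{10} + 2 \times 10_{10} + 3 \times 1_{10}$
$= 100_{10} + 20_{10} + 3_{10}$
$= 123_{10}$
• Binary (Base 2), Hexadecimal (Base 16), Decimal (Base 10) different ways to represent an integer
– We use $1_{\text{two}}$, $5_{\text{ten}}$, $10_{\text{hex}}$ to be clearer (vs. $1_{2}$, $4_{8}$, $5_{10}$, $10_{16}$)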
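The positional rule (digit times base to the power of its position) can be sketched in C as follows. The function name `value_in_base` is hypothetical; the loop uses Horner's rule, which gives the same result as summing d × Base^i but avoids computing powers explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a digit string in the given base (2..16).
 * Returns -1 if a character is not a valid digit for that base.
 * No overflow checking: a sketch intended for short inputs only. */
long value_in_base(const char *digits, int base)
{
    assert(digits != NULL && base >= 2 && base <= 16);
    long value = 0;
    for (size_t i = 0; digits[i] != '\0'; i++) {
        char ch = digits[i];
        int d;
        if (ch >= '0' && ch <= '9')      d = ch - '0';
        else if (ch >= 'A' && ch <= 'F') d = ch - 'A' + 10;
        else if (ch >= 'a' && ch <= 'f') d = ch - 'a' + 10;
        else return -1;
        if (d >= base) return -1;
        /* Multiplying by the base moves every earlier digit one place left. */
        value = value * base + d;
    }
    return value;
}
```

Calling `value_in_base("123", 10)` gives 123, and `value_in_base("FFF", 16)` gives 4095.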
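Number Representation
• Hexadecimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
• $\text{FFF}_{\text{hex}} = 15_{\text{ten}} \times 16_{\text{ten}}^{2} + 15_{\text{ten}} \times 16_{\text{ten}}^{1} + 15_{\text{ten}} \times 16_{\text{ten}}^{0}$
$= 3840_{\text{ten}} + 240_{\text{ten}} + 15_{\text{ten}}$
$= 4095_{\text{ten}}$
• $1111\ 1111\ 1111_{\text{two}} = \text{FFF}_{\text{hex}} = 4095_{\text{ten}}$
• May put blanks every group of binary, octal, or hexadecimal digits to make it easier to parse, like commas in decimal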
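As an illustration of grouping digits with blanks, the following C sketch prints a value in binary with a space after every four bits, so each group lines up with one hexadecimal digit. The function name `format_binary_grouped` and the group size of four are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low `bits` bits of value as binary text, with a blank
 * between each group of four digits.
 * `bits` must be a multiple of 4 between 4 and 32.
 * `out` must have room for bits + bits/4 characters
 * (digits, separators, and the terminating '\0'). */
void format_binary_grouped(uint32_t value, int bits, char *out)
{
    assert(out != NULL && bits >= 4 && bits <= 32 && bits % 4 == 0);
    int pos = 0;
    for (int i = bits - 1; i >= 0; i--) {
        out[pos++] = ((value >> i) & 1u) ? '1' : '0';
        if (i % 4 == 0 && i != 0)
            out[pos++] = ' ';   /* separator between 4-bit groups */
    }
    out[pos] = '\0';
}
```

With a 12-bit width, the value 0xFFF is written as `1111 1111 1111`.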
Signed and Unsigned Integers
• C, C++, and Java have signed integers, e.g., 7, -255: int x, y, z;
• C, C++ also have unsigned integers, e.g. for addresses
• 32-bit word can represent $2^{32}$ binary numbers
• Unsigned integers in 32 bit word represent 0 to $2^{32}-1$ (4,294,967,295) (4 Gig)
Unsigned Integers
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = 1ten
0000 0000 0000 0000 0000 0000 0000 0010two = 2ten
... ...
0111 1111 1111 1111 1111 1111 1111 1101two = 2,147,483,645ten
0111 1111 1111 1111 1111 1111 1111 1110two = 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = 2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0000two = 2,147,483,648ten
1000 0000 0000 0000 0000 0000 0000 0001two = 2,147,483,649ten
1000 0000 0000 0000 0000 0000 0000 0010two = 2,147,483,650ten
... ...
1111 1111 1111 1111 1111 1111 1111 1101two = 4,294,967,293ten
1111 1111 1111 1111 1111 1111 1111 1110two = 4,294,967,294ten
1111 1111 1111 1111 1111 1111 1111 1111two = 4,294,967,295ten
Signed Integers and Two’s-Complement Representation
• Signed integers in C; want ½ numbers <0, want ½ numbers >0, and want one 0
• Two’s complement treats 0 as positive, so 32-bit word represents $2^{32}$ integers from $-2^{31}$ (–2,147,483,648) to $2^{31}-1$ (2,147,483,647)
– Note: one negative number with no positive version
– Book lists some other options, all of which are worse
– Every computer uses two’s complement today
• Most-significant bit (leftmost) is the sign bit, since 0 means positive (including 0), 1 means negative
– Bit 31 is most significant, bit 0 is least significant
Two’s-Complement Integers
(The leftmost bit is the sign bit.)
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = 1ten
0000 0000 0000 0000 0000 0000 0000 0010two = 2ten
... ...
0111 1111 1111 1111 1111 1111 1111 1101two = 2,147,483,645ten
0111 1111 1111 1111 1111 1111 1111 1110two = 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = 2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0000two = –2,147,483,648ten
1000 0000 0000 0000 0000 0000 0000 0001two = –2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0010two = –2,147,483,646ten
... ...
1111 1111 1111 1111 1111 1111 1111 1101two = –3ten
1111 1111 1111 1111 1111 1111 1111 1110two = –2ten
1111 1111 1111 1111 1111 1111 1111 1111two = –1ten
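The same 32-bit pattern reads as a different number depending on whether it is treated as unsigned or as two's complement. The C sketch below decodes a pattern by hand rather than relying on a cast; the function names `sign_bit32` and `twos_complement_value32` are assumptions for illustration.

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <stdint.h>

/* Sign bit (bit 31) of a 32-bit pattern:
 * 0 means non-negative, 1 means negative. */
int sign_bit32(uint32_t bits)
{
    return (int)(bits >> 31);
}

/* Two's-complement value of a 32-bit pattern, returned in a wider type.
 * When the sign bit is set, the value is bits - 2^32. */
int64_t twos_complement_value32(uint32_t bits)
{
    int64_t v = (int64_t)bits;
    return sign_bit32(bits) ? v - ((int64_t)1 << 32) : v;
}
```

The all-ones pattern 0xFFFFFFFF is 4,294,967,295 as an unsigned integer, while `twos_complement_value32(0xFFFFFFFFu)` gives -1.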
Ways to Make Two’s Complement
• For N-bit word, complement to $2_{\text{ten}}^{N}$
– For 4 bit number 3ten = 0011two, two’s complement (i.e. -3ten) would be 16ten - 3ten = 13ten or 10000two – 0011two = 1101two
• Here is an easier way:
– Invert all bits and add 1
– Computers actually do it like this, too
    3ten:                 0011two
    Bitwise complement:   1100two
    Add 1:              +    1two
    -3ten:                1101two
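Both ways of forming the two's complement of an N-bit value can be written down in C and compared. This is a sketch with invented names (`low_mask`, `negate_by_subtraction`, `negate_by_inversion`); values are held in a `uint32_t` and masked to N bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set, for 1 <= n <= 32. */
static uint32_t low_mask(int n)
{
    assert(n >= 1 && n <= 32);
    return (n == 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
}

/* Method 1: complement to 2^N, i.e. (2^N - x) kept to N bits. */
uint32_t negate_by_subtraction(uint32_t x, int n)
{
    uint32_t mask = low_mask(n);
    uint64_t two_to_n = (uint64_t)1 << n;
    return (uint32_t)((two_to_n - (x & mask)) & mask);
}

/* Method 2: invert all bits and add 1, kept to N bits. */
uint32_t negate_by_inversion(uint32_t x, int n)
{
    return (~x + 1u) & low_mask(n);
}
```

For the 4-bit example, both `negate_by_subtraction(0x3, 4)` and `negate_by_inversion(0x3, 4)` give 0xD, which is 1101 in binary. Negating 1000 (that is, -8) gives 1000 again, which is the one negative number with no positive version.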
Two’s-Complement Examples
• Assume for simplicity 4 bit width, -8 to +7 represented
    3 + 2 = 5:         0011 + 0010 =   0101
    3 + (-2) = 1:      0011 + 1110 = 1 0001   Carry into MSB = Carry out MSB
    7 + 1 = -8:        0111 + 0001 =   1000   Overflow!
    -3 + (-2) = -5:    1101 + 1110 = 1 1011   Carry into MSB = Carry out MSB
    -8 + (-1) = +7:    1000 + 1111 = 1 0111   Underflow!
Overflow/Underflow when magnitude of result too big/too small to fit into result representation
Carry in = carry from less significant bits
Carry out = carry to more significant bits
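The carry rule for the 4-bit examples (no overflow when the carry into the MSB equals the carry out of the MSB) can be sketched in C. The struct and function names here (`add4_result`, `add4`) are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Result of adding two 4-bit two's-complement patterns. */
struct add4_result {
    unsigned sum;       /* low 4 bits of the sum      */
    int carry_in_msb;   /* carry into bit 3 (the MSB) */
    int carry_out_msb;  /* carry out of bit 3         */
    int overflow;       /* 1 when the two carries differ */
};

struct add4_result add4(unsigned a, unsigned b)
{
    assert(a <= 0xFu && b <= 0xFu);
    struct add4_result r;
    /* Adding only the low three bits shows what is carried into bit 3. */
    r.carry_in_msb = (int)((((a & 0x7u) + (b & 0x7u)) >> 3) & 1u);
    /* Bit 4 of the full sum is the carry out of bit 3. */
    r.carry_out_msb = (int)(((a + b) >> 4) & 1u);
    r.sum = (a + b) & 0xFu;
    r.overflow = (r.carry_in_msb != r.carry_out_msb);
    return r;
}
```

For 0111 + 0001 (7 + 1) the carry into the MSB is 1 and the carry out is 0, so `overflow` is set and the 4-bit sum 1000 reads as -8. For 0011 + 1110 (3 + (-2)) both carries are 1, so the sum 0001 is correct.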
Suppose we had a 5-bit word. What integers can be represented in two’s complement?
☐ 0 to +31
☐ -16 to +15
☐ -32 to +31
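The range of an N-bit two's-complement word follows the same pattern as the 32-bit case: from $-2^{N-1}$ to $2^{N-1}-1$. A small C sketch, with the assumed helper names `twos_min` and `twos_max`, makes it easy to check any width.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value of an N-bit two's-complement word: -2^(N-1).
 * Valid for 1 <= n <= 32. */
int64_t twos_min(int n)
{
    assert(n >= 1 && n <= 32);
    return -((int64_t)1 << (n - 1));
}

/* Largest value of an N-bit two's-complement word: 2^(N-1) - 1. */
int64_t twos_max(int n)
{
    assert(n >= 1 && n <= 32);
    return ((int64_t)1 << (n - 1)) - 1;
}
```

For a 5-bit word, `twos_min(5)` is -16 and `twos_max(5)` is 15, so the representable integers are -16 to +15.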
Summary
• Computer Architecture: Learn 6 great ideas in computer architecture to enable high performance programming via parallelism, not just learn C
1. Abstraction (Layers of Representation/Interpretation)
2. Moore’s Law
3. Principle of Locality/Memory Hierarchy
4. Parallelism
5. Performance Measurement and Improvement
6. Dependability via Redundancy
• Everything is a Number!