CS 110 Computer Architecture (a.k.a. Machine Structures ...

Post on 03-Jun-2022

CS 110 Computer Architecture

(a.k.a. Machine Structures)

Lecture 1: Course Introduction

Instructor: Sören Schwertfeger

http://shtech.org/courses/ca/

School of Information Science and Technology SIST

ShanghaiTech University

Slides based on UC Berkeley's CS61C

Agenda

• Thinking about Machine Structures
• Great Ideas in Computer Architecture
• What you need to know about this class
• Everything is a Number

CS 110 is NOT really about C Programming

• It is about the hardware-software interface
  – What does the programmer need to know to achieve the highest possible performance?
• C is close to the underlying hardware, unlike languages like Scheme, Python, Java!
  – Allows us to talk about key hardware features in higher-level terms
  – Allows programmer to explicitly harness underlying hardware parallelism for high performance

Old School Computer Architecture

[Figure only]

New School Computer Architecture (1/3)

[Figure: Personal Mobile Devices]

New School Computer Architecture (2/3)

[Figure only]

New School Computer Architecture (3/3)

[Figure only]

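Old School Machine Structures

[Figure: layers of a machine, from software at the top to hardware at the bottom]

Software
• Application (ex: browser)
• Compiler, Assembler, Operating System (Mac OSX)

Instruction Set Architecture

Hardware
• Processor, Memory, I/O system
• Datapath & Control
• Digital Design
• Circuit Design
• transistors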

New-School Machine Structures (It’s a bit more complicated!)

Harness Parallelism & Achieve High Performance

• Parallel Requests: assigned to computer, e.g., Search “cats”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates functioning in parallel at same time

[Figure: software/hardware stack from Warehouse-Scale Computer and Smart Phone, through Computer (Core … Core, Memory (Cache), Input/Output, Main Memory) and Core (Instruction Unit(s), Functional Unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3), down to Logic Gates; levels are annotated with Project 1, Project 2, and Project 3]

6 Great Ideas in Computer Architecture

1. Abstraction (Layers of Representation/Interpretation)
2. Moore’s Law (Designing through trends)
3. Principle of Locality (Memory Hierarchy)
4. Parallelism
5. Performance Measurement & Improvement
6. Dependability via Redundancy

Great Idea #1: Abstraction (Levels of Representation/Interpretation)

High Level Language Program (e.g., C)

    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;

  ↓ Compiler

Assembly Language Program (e.g., MIPS)

    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)

  ↓ Assembler

Machine Language Program (MIPS)

    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111

  ↓ Machine Interpretation

Hardware Architecture Description (e.g., block diagrams)

  ↓ Architecture Implementation

Logic Circuit Description (Circuit Schematic Diagrams)

Anything can be represented as a number, i.e., data or instructions
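
To make "instructions are numbers" concrete, here is a minimal sketch that packs the four MIPS instructions of the swap example into 32-bit words. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) and the usual register numbering ($t0 = 8, $t1 = 9). The helper name `encode_itype` is made up for this illustration, and the words it produces are the standard encodings, not the illustrative bit patterns printed on the slide.

```python
# Standard MIPS opcodes and register numbers needed for the swap example.
OPCODE = {"lw": 0x23, "sw": 0x2B}
REG = {"$2": 2, "$t0": 8, "$t1": 9}


def encode_itype(op, rt, offset, rs):
    """Pack `op rt, offset(rs)` into one 32-bit word (hypothetical helper)."""
    return (OPCODE[op] << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (offset & 0xFFFF)


program = [
    ("lw", "$t0", 0, "$2"),
    ("lw", "$t1", 4, "$2"),
    ("sw", "$t1", 0, "$2"),
    ("sw", "$t0", 4, "$2"),
]

words = [encode_itype(*ins) for ins in program]
for (op, rt, offset, rs), word in zip(program, words):
    # Show the same instruction as text, as hex, and as 32 binary digits.
    print(f"{op} {rt}, {offset}({rs})  ->  0x{word:08X}  {word:032b}")
```

Each instruction ends up as a single integer that fits in one 32-bit word, which is all the hardware ever sees.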

#2: Moore’s Law

Predicts: 2X Transistors / chip every 2 years

Gordon Moore, Intel Cofounder
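
The doubling rule can be written as a one-line projection. This is only a sketch of the trend as stated on the slide: the starting count of one million transistors is an arbitrary assumption, and `projected_transistors` is a name chosen for this example.

```python
def projected_transistors(start_count, years, doubling_period=2.0):
    """Transistors per chip after `years`, doubling every `doubling_period` years."""
    return start_count * 2 ** (years / doubling_period)


# Hypothetical starting point: 1,000,000 transistors, projected 10 years ahead.
print(projected_transistors(1_000_000, 10))
```

Ten years means five doublings, so the count grows by a factor of 32.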

Interesting Times

Moore’s Law relied on the cost of transistors scaling down as technology scaled to smaller and smaller feature sizes.

BUT the newest, smallest fabrication processes (<14 nm) might have greater cost/transistor! So, why shrink?

Jim Gray’s Storage Latency Analogy: How Far Away is the Data?

Storage level          Latency (ns)   Analogy       Time
Registers              1              My Head       1 min
On Chip Cache          2              This Room
On Board Cache         10             This Campus   10 min
Main Memory            100            Suzhou        1.5 hr
Disk                   10^6           Pluto         2 Years
Tape / Optical Robot   10^9           Andromeda     2,000 Years

Jim Gray, Turing Award

Great Idea #3: Principle of Locality / Memory Hierarchy

[Figure only]

Great Idea #4: Parallelism

[Figure only]

Caveat: Amdahl’s Law

Gene Amdahl, Computer Pioneer
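
The slide only names the law, so the following is a sketch of its usual textbook form rather than anything taken from the lecture: if a fraction f of the work can be spread over n processors, the overall speedup is 1 / ((1 - f) + f / n). The function name and the sample fractions are assumptions made for illustration.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Overall speedup when only `parallel_fraction` of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Hypothetical program that is 90% parallelizable.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

However many processors are added, the speedup for this example stays below 10, because the serial 10% still has to run on its own.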

Great Idea #5: Performance Measurement and Improvement

• Tuning application to underlying hardware to exploit:
  – Locality
  – Parallelism
  – Special hardware features, like specialized instructions (e.g., matrix manipulation)
• Latency
  – How long to set the problem up
  – How much faster does it execute once it gets going
  – It is all about time to finish

Coping with Failures

• 4 disks/server, 50,000 servers
• Failure rate of disks: 2% to 10% / year
  – Assume 4% annual failure rate
• On average, how often does a disk fail?
  a) 1 / month
  b) 1 / week
  c) 1 / day
  d) 1 / hour

50,000 × 4 = 200,000 disks
200,000 × 4% = 8,000 disks fail per year
365 days × 24 hours = 8,760 hours per year
So on average about one disk fails every hour: answer (d).
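
The same back-of-the-envelope calculation as a short script, using only the numbers given on the slide (the variable names are chosen for this example):

```python
servers = 50_000
disks_per_server = 4
annual_failure_rate = 0.04  # the 4% assumed on the slide

total_disks = servers * disks_per_server
failures_per_year = total_disks * annual_failure_rate
hours_per_year = 365 * 24

# Average time between two disk failures somewhere in the datacenter.
hours_between_failures = hours_per_year / failures_per_year

print(total_disks, failures_per_year, hours_per_year)
print(round(hours_between_failures, 3), "hours between failures")
```

The result is a little over one hour between failures, which matches the "1 / hour" option.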

NASA Fixing Rover’s Flash Memory

• Opportunity still active on Mars after >10 years
• But flash memory worn out
• New software update to avoid using worn out memory banks

http://www.engadget.com/2014/12/30/nasa-opportunity-rover-flash-fix/

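Great Idea #6: Dependability via Redundancy

• Redundancy so that a failing piece doesn’t make the whole system fail

[Figure: three units each compute 1+1; two report 1+1=2 and one reports 1+1=1 (FAIL!); 2 of 3 agree, so the result is 1+1=2]

• Increasing transistor density reduces the cost of redundancy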
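
The "2 of 3 agree" idea can be sketched as a majority vote over the answers of redundant units. This is an illustrative model only; `majority_vote` is a hypothetical helper, and real hardware voters work on signals rather than Python lists.

```python
from collections import Counter


def majority_vote(results):
    """Return the value reported by more than half of the redundant units."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        # No strict majority: the failure cannot be masked.
        raise ValueError("no majority among redundant results")
    return value


# Three units compute 1+1; one of them is faulty and reports 1.
print(majority_vote([2, 2, 1]))
```

With three units a single faulty result is outvoted; if all three disagree, the vote fails and the error is at least detected.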

Great Idea #6: Dependability via Redundancy

• Applies to everything from datacenters to storage to memory to instructors
  – Redundant datacenters so that can lose 1 datacenter but Internet service stays online
  – Redundant disks so that can lose 1 disk but not lose data (Redundant Arrays of Independent Disks/RAID)
  – Redundant memory bits so that can lose 1 bit but no data (Error Correcting Code/ECC Memory)

Weekly Schedule

Lecture       Tuesday, 08:15-09:55     Room H2109
Lecture       Friday, 10:15-11:55      Room H2109
Discussions   Tuesday, 18:40-20:20     Teaching Building 309
Lab 1         Monday, 15:00-16:40      Teaching Building 308
Lab 2         Tuesday, 15:00-16:40     Teaching Building 309
Lab 3         Thursday, 15:00-16:40    Administration Building 405

Course Information

• Course Web: http://shtech.org/course/ca/
• Acknowledgement: Instructors of UC Berkeley’s CS61C: http://www-inst.eecs.berkeley.edu/~cs61c/
• Instructor:
  – Sören Schwertfeger
• Teaching Assistants: (see webpage)
• Textbooks: Average 15 pages of reading/week
  – Patterson & Hennessy, Computer Organization and Design, 5th Edition (Chinese version is 4th edition; significant differences!)
  – Kernighan & Ritchie, The C Programming Language, 2nd Edition
  – Barroso & Hölzle, The Datacenter as a Computer, 2nd Edition
• Piazza:
  – Every announcement, discussion, clarification happens there

Course Grading

• Projects: 33%
• Homework: 17%
• Lab: 10%
• Exams: 35%
  – Midterm 1: 7.5%
  – Midterm 2: 7.5%
  – Final: 20%
• Participation: 5%

Late Policy … Slip Days!

• Assignments due at 11:59:59 PM
• You have 3 slip day tokens (NOT hour or min)
• Every day your project or homework is late (even by a minute) we deduct a token
• After you’ve used up all tokens, it’s 25% deducted per day.
  – No credit if more than 3 days late
  – Save your tokens for projects, worth more!!
• No need for sob stories, just use a slip day!
• Gradebot is open till 3 days after due date!
• If you need more time (slip days plus deduction) send the TA and Prof. an email!

Policy on Assignments and Independent Work

• ALL PROJECTS WILL BE DONE WITH A PARTNER
• With the exception of laboratories and assignments that explicitly permit you to work in groups, all homework and projects are to be YOUR work and your work ALONE.
• PARTNER TEAMS MAY NOT WORK WITH OTHER PARTNER TEAMS
• You are encouraged to discuss your assignments with other students, and credit will be assigned to students who help others, particularly by answering questions on Piazza, but we expect that what you hand in is yours.
• It is NOT acceptable to copy solutions from other students.
• It is NOT acceptable to copy (or start your) solutions from the Web.
• It is NOT acceptable to use PUBLIC GitHub archives (giving your answers away)
• We have tools and methods, developed over many years, for detecting this. You WILL be caught, and the penalties WILL be severe.
• At the minimum F in the course, and a letter to your university record documenting the incidence of cheating.
• Both Giver and Receiver are equally culpable and suffer equal penalties

Discussion & Labs & HW1

• First discussion today! Tuesday, 18:40-20:20, Teaching Building 309
  – Topic: Number representation
  – Let us know what topics you’d like to have covered!
  – Topic next discussion: C
• Labs: Find a partner for your lab-work and the projects, from your lab class!
  – Send an email to Xu Qingwen (xuqw)
  – Labs start next week
• HW1 will be posted on Friday.

Architecture of a typical Lecture

[Figure: attention (up to "Full") plotted against time in minutes, with marks at 10, 35, 60, 78, and 90; segments are labeled Administrivia, Fun/News, and “And in conclusion…”]

Key Concepts

• Inside computers, everything is a number
• But numbers usually stored with a fixed size
  – 8-bit bytes, 16-bit half words, 32-bit words, 64-bit double words, …
• Integer and floating-point operations can lead to results too big/small to store within their representations: overflow/underflow

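Number Representation

• Value of i-th digit is $d \times \text{Base}^{i}$, where $i$ starts at 0 and increases from right to left:
• $123_{10} = 1_{10} \times 10_{10}^{2} + 2_{10} \times 10_{10}^{1} + 3_{10} \times 10_{10}^{0}$
  $= 1 \times 100_{10} + 2 \times 10_{10} + 3 \times 1_{10}$
  $= 100_{10} + 20_{10} + 3_{10}$
  $= 123_{10}$
• Binary (Base 2), Hexadecimal (Base 16), Decimal (Base 10): different ways to represent an integer
  – We use $1_{\text{two}}$, $5_{\text{ten}}$, $10_{\text{hex}}$ to be clearer (vs. $1_{2}$, $4_{8}$, $5_{10}$, $10_{16}$)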
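
The positional rule translates directly into a loop over the digits. A minimal sketch, where `digits_to_int` is a name invented for this example and digits are passed as a list of integers, most significant first:

```python
def digits_to_int(digits, base):
    """Sum d * base**i over all digits, with i = 0 at the rightmost digit."""
    total = 0
    for i, d in enumerate(reversed(digits)):
        total += d * base ** i
    return total


print(digits_to_int([1, 2, 3], 10))     # decimal digits 1, 2, 3
print(digits_to_int([15, 15, 15], 16))  # hexadecimal digits F, F, F
print(digits_to_int([1, 0, 1], 2))      # binary digits 1, 0, 1
```

The same function works for any base, because only the weight of each position changes.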

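Number Representation

• Hexadecimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
• $\text{FFF}_{\text{hex}} = 15_{\text{ten}} \times 16_{\text{ten}}^{2} + 15_{\text{ten}} \times 16_{\text{ten}}^{1} + 15_{\text{ten}} \times 16_{\text{ten}}^{0}$
  $= 3840_{\text{ten}} + 240_{\text{ten}} + 15_{\text{ten}}$
  $= 4095_{\text{ten}}$
• $1111\ 1111\ 1111_{\text{two}} = \text{FFF}_{\text{hex}} = 4095_{\text{ten}}$
• May put blanks every group of binary, octal, or hexadecimal digits to make it easier to parse, like commas in decimal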
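
Many languages offer a similar digit-grouping convenience in source code. As an illustration in Python (not part of the lecture), underscores play the role of the blanks, and the built-in `format` can insert separators when printing:

```python
a = 0b1111_1111_1111  # binary literal, grouped in fours
b = 0xFFF             # hexadecimal literal
c = 4_095             # decimal literal with a digit separator

# All three literals denote the same integer.
print(a == b == c)

# Print the value with grouped binary digits, as hex, and with a decimal comma.
print(format(a, "_b"), format(a, "X"), format(a, ","))
```

The three spellings are just different written forms of one number, which is the point of the slide.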

Signed and Unsigned Integers

• C, C++, and Java have signed integers, e.g., 7, -255:
    int x, y, z;
• C, C++ also have unsigned integers, e.g. for addresses
• 32-bit word can represent $2^{32}$ binary numbers
• Unsigned integers in 32 bit word represent 0 to $2^{32}-1$ (4,294,967,295) (4 Gig)

Unsigned Integers

0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = 1ten
0000 0000 0000 0000 0000 0000 0000 0010two = 2ten
...
0111 1111 1111 1111 1111 1111 1111 1101two = 2,147,483,645ten
0111 1111 1111 1111 1111 1111 1111 1110two = 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = 2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0000two = 2,147,483,648ten
1000 0000 0000 0000 0000 0000 0000 0001two = 2,147,483,649ten
1000 0000 0000 0000 0000 0000 0000 0010two = 2,147,483,650ten
...
1111 1111 1111 1111 1111 1111 1111 1101two = 4,294,967,293ten
1111 1111 1111 1111 1111 1111 1111 1110two = 4,294,967,294ten
1111 1111 1111 1111 1111 1111 1111 1111two = 4,294,967,295ten
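
Python integers are not limited to 32 bits, so a fixed-size word has to be modeled by masking. The sketch below imitates a 32-bit unsigned register under that assumption; `MASK32` and `add_u32` are names made up for this example.

```python
MASK32 = (1 << 32) - 1  # 0xFFFFFFFF, the largest unsigned 32-bit value


def add_u32(a, b):
    """Add like a 32-bit unsigned register: keep only the low 32 bits."""
    return (a + b) & MASK32


print(MASK32)
print(add_u32(2_147_483_647, 1))  # crosses into the upper half of the range
print(add_u32(MASK32, 1))         # result does not fit, so it wraps around
```

Adding 1 to the largest value wraps around to 0, because the carry out of bit 31 has nowhere to go.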

Signed Integers and Two’s-Complement Representation

• Signed integers in C; want ½ numbers <0, want ½ numbers >0, and want one 0
• Two’s complement treats 0 as positive, so 32-bit word represents $2^{32}$ integers from $-2^{31}$ (-2,147,483,648) to $2^{31}-1$ (2,147,483,647)
  – Note: one negative number with no positive version
  – Book lists some other options, all of which are worse
  – Every computer uses two’s complement today
• Most-significant bit (leftmost) is the sign bit, since 0 means positive (including 0), 1 means negative
  – Bit 31 is most significant, bit 0 is least significant

Two’s-Complement Integers

0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = 1ten
0000 0000 0000 0000 0000 0000 0000 0010two = 2ten
...
0111 1111 1111 1111 1111 1111 1111 1101two = 2,147,483,645ten
0111 1111 1111 1111 1111 1111 1111 1110two = 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = 2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0000two = -2,147,483,648ten
1000 0000 0000 0000 0000 0000 0000 0001two = -2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0010two = -2,147,483,646ten
...
1111 1111 1111 1111 1111 1111 1111 1101two = -3ten
1111 1111 1111 1111 1111 1111 1111 1110two = -2ten
1111 1111 1111 1111 1111 1111 1111 1111two = -1ten

Sign bit: the leftmost bit of each pattern
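
The two tables use the same bit patterns and differ only in how the sign bit is read. A minimal sketch of that reinterpretation, with hypothetical helper names `to_signed` and `to_pattern`:

```python
def to_signed(pattern, bits=32):
    """Read an unsigned bit pattern as a two's-complement value."""
    sign_bit = 1 << (bits - 1)
    if pattern & sign_bit:
        # Sign bit set: the pattern stands for pattern - 2**bits.
        return pattern - (1 << bits)
    return pattern


def to_pattern(value, bits=32):
    """Bit pattern (as an unsigned integer) that stores `value` in `bits` bits."""
    return value & ((1 << bits) - 1)


print(to_signed(0xFFFFFFFF), to_signed(0x80000000), to_signed(0x7FFFFFFF))
print(format(to_pattern(-3), "032b"))
```

Patterns with a 0 sign bit mean the same in both tables; patterns with a 1 sign bit are shifted down by $2^{32}$.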

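Ways to Make Two’s Complement

• For N-bit word, complement to $2_{\text{ten}}^{N}$
  – For 4 bit number $3_{\text{ten}} = 0011_{\text{two}}$, two’s complement (i.e. $-3_{\text{ten}}$) would be
    $16_{\text{ten}} - 3_{\text{ten}} = 13_{\text{ten}}$ or $10000_{\text{two}} - 0011_{\text{two}} = 1101_{\text{two}}$
• Here is an easier way:
  – Invert all bits and add 1
  – Computers actually do it like this, too

      0011two    3ten
      1100two    Bitwise complement
    +    1two
      -------
      1101two    -3ten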
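
Both recipes can be checked against each other in a few lines. A sketch for small word sizes, where `negate` is a name chosen for this example and patterns are handled as unsigned integers:

```python
def negate(pattern, bits=4):
    """Two's-complement negation: invert all bits, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return ((pattern ^ mask) + 1) & mask


def negate_by_subtraction(pattern, bits=4):
    """Same result via the definition: complement to 2**bits."""
    return ((1 << bits) - pattern) & ((1 << bits) - 1)


for p in (0b0011, 0b1101, 0b0000, 0b1000):
    print(format(p, "04b"), "->", format(negate(p), "04b"))
```

Note the last line of output: 1000 (that is, -8) negates to itself, which is the "one negative number with no positive version".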

Two’s-Complement Examples

• Assume for simplicity 4 bit width, -8 to +7 represented

     3         0011
    +2       + 0010
    --       ------
     5         0101

     3         0011
   +(-2)     + 1110
    --       ------
     1       1 0001    Carry into MSB = Carry out MSB

     7         0111
    +1       + 0001
    --       ------
    -8         1000    Overflow!

    -3         1101
   +(-2)     + 1110
    --       ------
    -5       1 1011    Carry into MSB = Carry out MSB

    -8         1000
   +(-1)     + 1111
    --       ------
    +7       1 0111    Underflow!

• Overflow/Underflow when magnitude of result too big/too small to fit into result representation
• Carry in = carry from less significant bits
• Carry out = carry to more significant bits
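
The carry rule from these examples can be turned into a small checker: the result is wrong exactly when the carry into the most significant bit differs from the carry out of it. This is a sketch for illustration; `add_nbit` is a hypothetical helper that works on unsigned bit patterns.

```python
def add_nbit(a, b, bits=4):
    """Add two bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << bits) - 1
    low_mask = mask >> 1  # every bit except the MSB
    a &= mask
    b &= mask
    # Carry produced by the lower bits, i.e. the carry INTO the MSB.
    carry_in = ((a & low_mask) + (b & low_mask)) >> (bits - 1)
    # Carry that leaves the word, i.e. the carry OUT of the MSB.
    carry_out = (a + b) >> bits
    return (a + b) & mask, carry_in != carry_out


examples = [(0b0011, 0b0010), (0b0011, 0b1110), (0b0111, 0b0001),
            (0b1101, 0b1110), (0b1000, 0b1111)]
for a, b in examples:
    result, overflow = add_nbit(a, b)
    print(format(a, "04b"), "+", format(b, "04b"), "=", format(result, "04b"),
          "overflow" if overflow else "ok")
```

The flag is set for 7 + 1 and for -8 + (-1), the two cases the slide marks as overflow and underflow.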

Suppose we had a 5-bit word. What integers can be represented in two’s complement?

☐ 0 to +31
☐ -16 to +15
☐ -32 to +31

Answer: -16 to +15. A 5-bit two’s-complement word covers $-2^{4}$ to $2^{4}-1$.
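
The range for any word size follows the same pattern as the 32-bit case. A minimal sketch, with `twos_complement_range` as a name invented for this example:

```python
def twos_complement_range(bits):
    """Smallest and largest integers representable in `bits`-bit two's complement."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


for n in (4, 5, 8, 32):
    print(n, twos_complement_range(n))
```

Half of the $2^{N}$ patterns go to the negative numbers and the other half to zero and the positive numbers, so the negative side always reaches one further.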

Summary

• Computer Architecture: Learn 6 great ideas in computer architecture to enable high performance programming via parallelism, not just learn C
  1. Abstraction (Layers of Representation/Interpretation)
  2. Moore’s Law
  3. Principle of Locality/Memory Hierarchy
  4. Parallelism
  5. Performance Measurement and Improvement
  6. Dependability via Redundancy
• Everything is a Number!