Download - ECEN/CSCI 5593: Advanced Computer …ecee.colorado.edu/~ecen5593/ECEN5593_Syllabus.pdfECEN/CSCI 5593: Advanced Computer Architecture (ACA) Course Syllabus Instructor: Dan Connors E-Mail


ECEN/CSCI 5593: Advanced Computer Architecture (ACA) Course Syllabus

Instructor: Dan Connors
E-Mail: [email protected]

Website: Desire2Learn: https://learn.colorado.edu

I. Course Overview

Advanced Computer Architecture (ACA) covers advanced topics in computer architecture, focusing on multicore, graphics processing unit (GPU), and heterogeneous SoC multiprocessor architectures and their implementation issues from an architect's perspective. A range of levels is explored: deep-submicron CMOS characteristics, microarchitecture, compiler optimization, parallel programming, run-time optimization, performance analysis and tuning, fault tolerance, and power-aware computing techniques. The objective of the course is to provide in-depth coverage of current and emerging trends in computer architecture, focusing on performance and the hardware/software interface. The course emphasizes analyzing fundamental issues in architecture design and their impact on application performance. To enable a better understanding of the concepts, hands-on assignments are used to explore issues in multicore and GPU architecture systems. Students have options to explore their own interests in custom projects and assignments.

New recorded video lectures in Spring 2017

New projects in Spring 2017: Students work in groups of up to two people on projects related to acceleration and performance tuning of machine learning, computer vision, and deep learning. Students taking the course can investigate projects with access to NVIDIA, Xilinx, and Raspberry Pi resources:

NVIDIA Jetson TX1 (http://www.nvidia.com/object/jetson-tx1-module.html) is the world's leading AI computing platform for GPU-accelerated parallel processing in the mobile embedded systems market. Its high-performance, low-energy computing for deep learning and computer vision makes Jetson the ideal solution for compute-intensive embedded projects. Jetson TX1 is a supercomputer on a module that's the size of a credit card. It features the new NVIDIA Maxwell™ architecture: GPU 1 TFLOP/s, 256 cores; CPU 64-bit ARM® A57 CPUs; Memory 4 GB LPDDR4 | 25.6 GB/s

• Project potential: Drones & Unmanned Aerial Vehicles (UAVs), Autonomous Robotic Systems, Mobile Medical Imaging, Intelligent Video Analytics (IVA)


PYNQ - Python Productivity for Xilinx Zynq Programmable Hardware

http://www.pynq.io/

PYNQ is an open-source project from Xilinx that makes it easy to design embedded systems with Zynq All Programmable Systems on Chips (APSoCs). Using the Python language and libraries, designers can exploit the benefits of programmable logic and microprocessors in Zynq to build more capable and exciting embedded systems.

PYNQ users can now create high-performance embedded applications with

• parallel hardware execution
• hardware-accelerated algorithms
• real-time signal processing
• high-bandwidth IO and low-latency control

Zynq-7000 All Programmable SoC Features

• Dual ARM® Cortex™-A9 MPCore™ with CoreSight™
• 32 KB Instruction, 32 KB Data per processor L1 Cache
• 512 KB unified L2 Cache, 256 KB On-Chip Memory, 630 KB of fast block RAM
• 85K logic cells (13300 logic slices, each with four 6-input LUTs and 8 flip-flops)

Raspberry Pi – ARM Linux-based Embedded System (https://www.raspberrypi.org)

• Low Transistor Count
• Low Power Consumption / Heat Production
• Used in most mobile devices: phones and small digital devices
• Raspberry Pi has similar requirements to mobile devices


II. Course Prerequisites

This course requires an understanding of processor design, specifically computer organization and the instruction set architecture (ISA): ECEN 4593 (Computer Organization) or an equivalent first course in computer organization and design. Students should already understand at least one computer instruction set and know how to design a control unit, arithmetic unit, memory (cache and virtual), and various input/output interfaces.

III. Course Outline

1. Introduction to Computer Design and Quantitative Principles of Architecture Performance Analysis
• Technology and computer trends
• Measuring computer system performance
• Benchmarks and metrics

2. Instruction Set Principles and Examples
• Classification of Instruction Set Architectures (ISA) – RISC, CISC, VLIW, EPIC
• Predicated execution and compiler-controlled speculation

3. Advanced Microarchitecture and Instruction-Level Parallelism
• Superscalar and pipeline operation
• Instruction-Level Parallelism (ILP)
• Dynamic instruction scheduling (Tomasulo, scoreboarding, reservation station design)
• Overcoming control hazards - branch prediction (2-bit, two-level)
• Compiler optimization and analysis

4. Memory-Hierarchy Design
• Multi-level cache design issues
• Performance evaluation
• Memory prefetching techniques

5. Thread-Level Parallelism
• Multicore systems
• Thread control models (fine-grained, coarse-grained, hyper-threading)

6. Data-Level Parallelism
• Vector processing
• Graphics Processing Units (GPU)
• NVIDIA architecture models – Fermi, Tesla, Kepler, Maxwell, Pascal
• CUDA/OpenCL programming

7. Performance Tuning and Analysis of Modern Applications
• Run-time optimization
• Binary instrumentation
• Hardware performance monitoring
• Performance tuning

8. Architecture Implementation Issues and Analysis
• Power - Dynamic Voltage Frequency Scaling (DVFS), Energy-Delay Product (EDP)
• Architecture physical layer concepts including device & layout, manufacturing constraints, architectures, defect tolerance, and design variability.
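
As a rough illustration of the DVFS and EDP terms in topic 8, the sketch below uses the standard first-order CMOS dynamic power model, P = a * C * V^2 * f, and defines EDP as energy times delay. The function names and the operating points in the usage section are hypothetical values chosen for illustration; they are not taken from the course materials.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """First-order CMOS dynamic power in watts: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def run_time(cycles, frequency_hz):
    """Execution time in seconds for a fixed cycle count."""
    return cycles / frequency_hz


def energy_delay_product(power_w, delay_s):
    """EDP = energy * delay, where energy = power * delay."""
    energy_j = power_w * delay_s
    return energy_j * delay_s


def edp_at(cycles, capacitance_f, voltage_v, frequency_hz):
    """EDP of a fixed-cycle workload at one (voltage, frequency) point."""
    delay_s = run_time(cycles, frequency_hz)
    power_w = dynamic_power(capacitance_f, voltage_v, frequency_hz)
    return energy_delay_product(power_w, delay_s)


if __name__ == "__main__":
    # Hypothetical DVFS operating points: (volts, hertz).
    cycles = 2_000_000_000
    for volts, hertz in [(1.0, 2.0e9), (0.8, 1.5e9)]:
        print(volts, hertz, edp_at(cycles, 1e-9, volts, hertz))
```

Under this model the energy of a fixed-cycle workload depends on V^2 but not on f, so lowering voltage saves energy while lowering frequency lengthens delay; EDP is one way to weigh the two against each other.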


Course Schedule

WEEK 1 - Introduction, Instruction Set Architecture, and Pipelines

Topics:
• Description of architecture, micro-architecture and instruction set architectures.
• Pipelining Review - basic concept of pipeline and two different types of hazards.
• Pipeline CPI
• Processor Pipeline Hazards
• Computer Architecture & Tech Trends
• Processor Speed, Cost, Power
• Measuring Performance
• Benchmark Standards
• Iron Law of Performance
• Moore's Law
• Amdahl's Law
• Lhadma's Law
• Gustafson's Law
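
Three of the laws in the Week 1 list have compact closed forms, sketched below. This is a minimal illustration assuming the usual textbook statements: the Iron Law as time = instructions x CPI x cycle time, Amdahl's Law for a fixed problem size, and Gustafson's Law for a problem size that scales with the processor count. The function and parameter names are chosen here for illustration.

```python
def iron_law_time(instructions, cpi, clock_hz):
    """Iron Law: time = instructions * cycles-per-instruction / clock rate."""
    return instructions * cpi / clock_hz


def amdahl_speedup(parallel_fraction, n):
    """Amdahl's Law (fixed problem size): 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)


def gustafson_speedup(parallel_fraction, n):
    """Gustafson's Law (scaled problem size): (1 - p) + p * n."""
    return (1.0 - parallel_fraction) + parallel_fraction * n


if __name__ == "__main__":
    # Hypothetical workload: 90% parallel, 10 processors.
    print(amdahl_speedup(0.9, 10))
    print(gustafson_speedup(0.9, 10))
    # Hypothetical program: 1e9 instructions, CPI 1.5, 2 GHz clock.
    print(iron_law_time(1e9, 1.5, 2e9))
```

For the same parallel fraction, Amdahl's Law bounds the speedup by 1 / (1 - p) however many processors are added, while Gustafson's Law grows linearly in n because the parallel part of the work grows with the machine.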

WEEK 2 - Control Hazards

Topics:
• Misprediction Penalties
• Branch Prediction Techniques
• Two-level Correlation Predictors: PAg, GAg
• Hybrid Predictors
• Return Address Stack
• Loop Prediction
• Understanding Code Execution and Coding Practices for Branch Prediction
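
The simplest of the branch prediction techniques, the 2-bit saturating counter named in the course outline, can be sketched as follows. This is an illustrative model only: the class name, the table size, the weakly-not-taken initial state, and the trace format of (pc, taken) pairs are assumptions made here, not a description of the course's assignment framework.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low PC bits.

    Counter values 0-1 predict not taken; 2-3 predict taken.
    """

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        # Assumed initial state: 1 (weakly not taken).
        self.counters = [1] * (1 << table_bits)

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2

    def update(self, pc, taken):
        index = pc & self.mask
        if taken:
            self.counters[index] = min(3, self.counters[index] + 1)
        else:
            self.counters[index] = max(0, self.counters[index] - 1)


def misprediction_count(predictor, trace):
    """Replay (pc, taken) pairs and count wrong predictions."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    # Hypothetical loop branch: taken 9 times, then falls through; run twice.
    loop_trace = ([(0x40, True)] * 9 + [(0x40, False)]) * 2
    print(misprediction_count(TwoBitPredictor(), loop_trace))
```

Because the counter must be wrong twice in a row before its prediction flips, a single loop exit does not disturb the prediction for the next entry into the loop.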

WEEK 3 and WEEK 4 – Base Cache Memory, Dynamic Execution and Superscalar Model

Topics:
• Cache memory characteristics
• Instruction Level Parallelism (ILP)
• Out-of-order execution - common methods used to improve the performance of out-of-order processors including register renaming and memory disambiguation.
• Common issues for superscalar architecture.
• Kinds of architectures for out-of-order processors.

WEEK 5 and WEEK 6 – VLIW, EPIC, and ILP Compiler Optimizations for Architectures

Topics:
• Traditional Compiler Optimization: Peephole, Loop Unrolling, Inter-procedural, and Inlining
• Compiler Optimization for Instruction Level Parallelism (ILP) and Profile-Directed Techniques
• Out-of-order execution - common methods used to improve the performance of out-of-order processors including register renaming and memory disambiguation.


WEEK 7 - Multicore Architectures and Vector/Multimedia Instruction Sets

Topics:
• Simultaneous multithreaded (SMT) architectures
• SMT Architecture Alternatives
• SMT architecture: OS impact and adaptive architectures
• Multi-core Architectures
• Single Instruction Multiple Data (SIMD)
• Intel Architecture Development: MMX, SSE
• Inline Assembly and Assembly Intrinsics

WEEK 8 thru WEEK 13 – Graphics Processing Unit (GPU) Architecture

Topics:
• NVIDIA CUDA/GPU Programming Model
• GPU Hardware and Parallel Communication
• GPU Fundamental Parallel Algorithms
• Optimizing GPU Programs
• The Frontiers and Future of GPU Computing
• OpenCL – Open Computing Language
• Mobile GPU System Architecture Exploration: NVIDIA TX1

WEEK 14 – Runtime Optimization and Compilation

Topics:
• Dynamic compilation and Code Translations

IV. Learning Outcomes

A student who has successfully completed this course should be able to:

1. Analyze various performance characteristics of a computer system.
2. Apply digital design techniques to the microarchitecture construction of a processor.
3. Translate assembly language programs to/from high-level language codes and algorithms.
4. Analyze hardware & software trade-offs to design the instruction set architecture (ISA) interface.
5. Understand advanced issues in design of computer processors, caches, and memory.
6. Analyze performance trade-offs in computer design.
7. Apply knowledge of processor design to improve performance in algorithms and software systems.
8. Acquire experience with tools for statistical analysis of instruction set trade-offs.
9. Gain the ability to develop parallel GPGPU solutions using CUDA and OpenCL.

V. Required Text and Materials

Hennessy and Patterson, Computer Architecture - A Quantitative Approach, 4th or later Edition (ISBN-13: 978-0123704900, ISBN-10: 0123704901, Edition: 4th) - this is the main textbook for the class.

VI. Assessment & Assignments

Assignments: The following programming assignments are scheduled:

• Pin – Binary instrumentation tool to analyze program behaviors
  o Choice of branch prediction or cache design simulation.
• CUDA programming - Vector addition
• CUDA programming - Histogram generation
• CUDA programming - Image filtering


Reading Assignments: There are several technical papers (conference proceedings, journal articles, and technical reports) assigned through the semester. Reading technical papers in the field of computer architecture is imperative to understanding future directions in the field. Assignments will require students to write brief overviews or answer technical questions about the papers assigned. Subject matter from the reading assignments is likely to be covered in exams.

Final Exam: There will be a take-home final exam that covers the concepts of the course. The exam problems are closely related to the lectures, homework assignments, and assigned readings. The final exam will be cumulative, covering all subject topics.

Final Project: There will be a project for you to work on as an individual or in a group of two people. The project will count as 15% of your grade, and will be a significant amount of work. The assignment is to extend the semester project or to analyze some interesting data or new architecture feature. Students may write a survey paper as a second option to the project. The project will be divided into several milestones, one checkpoint being a presentation of work. Details about the project and schedule will be announced later in the semester.

Basis for Final Grade

Students' grades will be assessed based on their completed homework, quizzes, project, in-class exams, and the final exam. Homework assignments are designed to provide active learning for the student by exercising the various topics covered by the course. Exams will be designed to assess the student's ability to master the different topic areas, and their aptitude in each of the learning outcomes. The percentage given to each assessment method is given by Table 1.

Table 1. Grade Assessment

Assessment                   % of Final Grade
Reading Assignments          10%
Assignments & Checkpoints    40%
Project                      20%
Final Exam                   30%
Total                        100%
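
The weighting in Table 1 amounts to a weighted sum of category scores. The sketch below shows that arithmetic, assuming each category is scored on a 0-100 scale; the function name and the sample scores are hypothetical, and the weights are copied from Table 1.

```python
# Weights copied from Table 1 (fractions of the final grade).
WEIGHTS = {
    "Reading Assignments": 0.10,
    "Assignments & Checkpoints": 0.40,
    "Project": 0.20,
    "Final Exam": 0.30,
}


def final_grade(scores):
    """Weighted sum of per-category scores, each assumed to be on 0-100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical student scores, for illustration only.
    sample = {
        "Reading Assignments": 90,
        "Assignments & Checkpoints": 85,
        "Project": 80,
        "Final Exam": 75,
    }
    print(final_grade(sample))
```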

Course Policies

Late Work Policy: Homework assignments must be turned in at the beginning of class; otherwise they will be considered late. A student's score will be reduced by a 20% penalty for work submitted from one second to 24 hours late.

Student Honor Code: Students should be familiar with the College of Engineering and Applied Sciences student honor code. All honor code rules will be adhered to in this class.

Appointments: Students are encouraged to make at least one appointment with the professor during the semester. Appointments can be made by email. Exploring research opportunities, expressing concerns, offering suggestions, and seeking advice are among the welcome topics.