9/27/2016 CS152, Fall 2016
CS152 Computer Architecture and Engineering
Lecture 8 - Address Translation
John Wawrzynek, Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~johnw
http://inst.eecs.berkeley.edu/~cs152

CS152 Administrivia
§ Lab 2 due Friday
§ PS 2 due Tuesday
§ Quiz 2 next Thursday!

Last time in Lecture 7
§ 3 C's of cache misses
  – Compulsory, Capacity, Conflict
§ Write policies
  – Write back, write-through, write-allocate, no write allocate
§ Multi-level cache hierarchies reduce miss penalty
  – 3 levels common in modern systems (some have 4!)
  – Can change design tradeoffs of L1 cache if known to have L2
§ Prefetching: retrieve memory data before CPU request
  – Prefetching can waste bandwidth and cause cache pollution
  – Software vs. hardware prefetching
§ Software memory hierarchy optimizations
  – Loop interchange, loop fusion, cache tiling

Bare Machine
§ In a bare machine, the only kind of address is a physical address
[Figure: pipeline (PC, Inst. Cache, D, Decode, E, +, M, Data Cache, W) connected through the memory controller to main memory (DRAM); every address on every path is a physical address.]

Absolute Addresses
EDSAC, early 50's
§ Only one program ran at a time, with unrestricted access to the entire machine (RAM + I/O devices)
§ Addresses in a program depended upon where the program was to be loaded in memory
§ But it was more convenient for programmers to write location-independent subroutines
How could location independence be achieved?
Linker and/or loader modify addresses of subroutines and callers when building a program memory image

Dynamic Address Translation
§ Motivation
  – In early machines, I/O was slow and each I/O transfer involved the CPU (programmed I/O)
  – Higher throughput possible if CPU and I/O of 2 or more programs were overlapped. How? ⇒ multiprogramming with DMA I/O devices, interrupts
§ Location-independent programs
  – Programming and storage management ease ⇒ need for a base register
§ Protection
  – Independent programs should not affect each other inadvertently ⇒ need for a bound register
§ Multiprogramming drives requirement for resident supervisor software to manage context switches between multiple programs
[Figure: physical memory holding Program 1, Program 2, and the OS.]

Simple Base and Bound Translation
[Figure: Load X in the program address space issues a logical address. The logical address is compared (≥) against the Bound Register (segment length) to detect a bounds violation, and added to the Base Register (base physical address) to form the physical address within the current segment of physical memory.]
Base and bounds registers are visible/accessible only when the processor is running in supervisor mode
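The comparator and adder on this slide can be sketched in a few lines of Python. This is an illustrative model only: the function name `translate`, the exception `BoundsViolation`, and the example register values are assumptions, not part of the lecture.

```python
class BoundsViolation(Exception):
    """Raised when a logical address falls outside the current segment."""


def translate(logical_addr: int, base: int, bound: int) -> int:
    """Map a logical address to a physical address with base-and-bound.

    base  : physical address where the segment starts (base register)
    bound : segment length (bound register)
    """
    # Comparator: a logical address >= the segment length is a violation.
    if logical_addr < 0 or logical_addr >= bound:
        raise BoundsViolation(
            f"logical address {logical_addr:#x} outside segment of length {bound:#x}"
        )
    # Adder: physical address = base + logical address.
    return base + logical_addr


if __name__ == "__main__":
    # Hypothetical segment: starts at 0x4000, 0x100 bytes long.
    print(hex(translate(0x10, base=0x4000, bound=0x100)))
```

Moving a program then only requires the supervisor to copy the segment and change `base`; the logical addresses inside the program stay the same.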

Separate Areas for Program and Data
[Figure: the Program Counter issues a logical address that is compared (≥) against the Program Bound Register to detect a bounds violation and added to the Program Base Register to give a physical address in the program segment of main memory. For Load X, the Mem. Address Register issues a logical address that is compared (≥) against the Data Bound Register and added to the Data Base Register to give a physical address in the data segment.]
What is an advantage of this separation?
(Scheme used on all Cray vector supercomputers prior to X1, 2002)

Base and Bound Machine
[Figure: the bare-machine pipeline (PC, Inst. Cache, D, Decode, E, +, M, Data Cache, W) with translation added. The instruction logical address is checked (≥) against the Program Bound Register and added to the Program Base Register; the data logical address is checked (≥) against the Data Bound Register and added to the Data Base Register. The caches, memory controller, and main memory (DRAM) see physical addresses.]
Can fold addition of base register into (register + immediate) address calculation using a carry-save adder (sums three numbers with only a few gate delays more than adding two numbers)
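To see why a carry-save adder makes the three-input sum cheap, the following Python sketch models one 3:2 compression stage followed by a single ordinary addition. The 32-bit width and the function names are assumptions chosen for illustration.

```python
def carry_save_add(a: int, b: int, c: int, width: int = 32):
    """3:2 compression: reduce three operands to a (sum, carry) pair.

    Each bit position acts as an independent full adder, so no carry
    ripples through this stage.
    """
    mask = (1 << width) - 1
    partial_sum = (a ^ b ^ c) & mask
    # Majority function gives the carry out of each bit, shifted up one place.
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return partial_sum, carry


def effective_physical_address(base: int, reg: int, imm: int, width: int = 32) -> int:
    """Compute base + reg + imm with one carry-save stage and one normal adder."""
    mask = (1 << width) - 1
    s, c = carry_save_add(base & mask, reg & mask, imm & mask, width)
    return (s + c) & mask  # the only carry-propagating addition


if __name__ == "__main__":
    print(hex(effective_physical_address(0x4000, 0x120, 8)))
```

Only the final `s + c` needs a carry-propagate adder, which the (register + immediate) calculation already requires; the compression stage adds roughly one full-adder delay in front of it.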
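
Memory Fragmentation
[Figure: three snapshots of physical memory below the OS space, with region sizes as labeled on the slide.
  Initially: user 1 (16K), user 2 (24K), user 3 (32K), and two free 24K regions.
  After users 4 & 5 arrive: user 4 (16K) and user 5 (24K) occupy the free regions, leaving an 8K hole.
  After users 2 & 5 leave: user 1, user 4, and user 3 remain, and the free space is split into separate holes.]
As users come and go, the storage is “fragmented”. Therefore, at some stage programs have to be moved around to compact the storage.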

Paged Memory Systems
§ Processor-generated address can be split into: Page Number | Offset
• A Page Table contains the physical address at the start of each page
[Figure: pages 0 to 3 of the address space of User-1 map through the page table of User-1 to non-contiguous locations in physical memory.]
Page tables make it possible to store the pages of a program non-contiguously.
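A minimal sketch of the split and lookup, in Python. The 4 KB page size, the function names, and the example page table contents are assumptions made for illustration; the slide does not fix them.

```python
PAGE_SIZE = 4096                          # assumed page size in bytes (power of two)
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 offset bits for 4 KB pages


def split(addr: int):
    """Split a processor-generated address into (page number, offset)."""
    return addr >> OFFSET_BITS, addr & (PAGE_SIZE - 1)


def translate(addr: int, page_table: list) -> int:
    """Look up the page's starting physical address and add the offset."""
    page_number, offset = split(addr)
    page_base = page_table[page_number]   # physical address at the start of the page
    return page_base + offset


# Hypothetical page table for a 4-page address space.
# The physical pages are deliberately not contiguous and not in order.
user1_page_table = [0x5000, 0x2000, 0x9000, 0x3000]

if __name__ == "__main__":
    print(hex(translate(0x1234, user1_page_table)))
```

Because every page is mapped independently, any free physical page can hold any virtual page, which removes the need to compact storage.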

Private Address Space per User
• Each user has a page table
• Page table contains an entry for each user page
[Figure: VA1 of User 1, User 2, and User 3 each goes through that user's own page table to a different page of physical memory, which also holds OS pages and free pages.]

Where Should Page Tables Reside?
§ Space required by the page tables (PT) is proportional to the address space, number of users, ...
  ⇒ Too large to keep in registers
§ Idea: Keep PTs in the main memory
  – Needs one reference to retrieve the page base address and another to access the data word
  ⇒ doubles the number of memory references!
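
Page Tables in Physical Memory
[Figure: physical memory holds PT User 1 and PT User 2 alongside the data pages. VA1 in the User 1 virtual address space and VA1 in the User 2 virtual address space are each translated through that user's own page table.]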

A Problem in the Early Sixties
§ There were many applications whose data could not fit in the main memory, e.g., payroll
  – Paged memory system reduced fragmentation but still required the whole program to be resident in the main memory

Demand Paging in Atlas (1962)
“A page from secondary storage is brought into the primary storage whenever it is (implicitly) demanded by the processor.” Tom Kilburn
Primary memory as a cache for secondary memory
[Figure: Central Memory, with Primary (32 pages, 512 words/page) backed by Secondary (Drum, 32 x 6 pages).]
User sees 32 x 6 x 512 words of storage
The Atlas Computer was a joint development between the University of Manchester, Ferranti, and Plessey.

Hardware Organization of Atlas
48-bit words, 512-word pages
[Figure: the Effective Address goes through Initial Address Decode to one of:
  – 16 ROM pages, 0.4-1 µsec: system code (not swapped)
  – 2 subsidiary pages, 1.4 µsec: system data (not swapped)
  – Main, 32 pages, 1.4 µsec
  – Drum (4), 192 pages
  – 8 tape decks, 88 sec/word]
1 Page Address Register (PAR) per “page frame”; PARs 0 to 31 each hold <effective PN, status>
On memory access, compare the effective page address against all 32 PARs
  match ⇒ normal access
  no match ⇒ page fault: save the state of the partially executed instruction
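The PAR comparison behaves like a small fully associative lookup. The Python sketch below models it under simplifying assumptions: the status bits are omitted, `None` marks an empty page frame, and the names `lookup` and `PageFault` are invented for this example.

```python
from typing import List, Optional

NUM_PAGE_FRAMES = 32   # one PAR per page frame, as on the slide
WORDS_PER_PAGE = 512   # 512-word pages, as on the slide


class PageFault(Exception):
    """No PAR matched the effective page number."""


def lookup(pars: List[Optional[int]], effective_addr: int) -> int:
    """Return the primary-memory word address for an effective (word) address."""
    page_number, word = divmod(effective_addr, WORDS_PER_PAGE)
    # The hardware compares against all 32 PARs at once; the loop models that.
    for frame, resident_page in enumerate(pars):
        if resident_page == page_number:
            return frame * WORDS_PER_PAGE + word   # match: normal access
    raise PageFault(page_number)                   # no match: page fault


if __name__ == "__main__":
    pars: List[Optional[int]] = [None] * NUM_PAGE_FRAMES
    pars[3] = 7  # hypothetical: effective page 7 lives in page frame 3
    print(lookup(pars, 7 * WORDS_PER_PAGE + 5))
```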

Atlas Demand Paging Scheme
On a page fault:
§ Transfer from drum to a free page in primary memory is initiated
§ The Page Address Register (PAR) is updated
§ If no free page is left, a page is selected to be replaced (based on usage)
§ The replaced page is written on the drum
  – To minimize drum latency effect, the first empty page on the drum was selected
§ The page table is updated to point to the new location of the page on the drum

Linear Page Table
§ Page Table Entry (PTE) contains:
  – A bit to indicate if a page exists
  – PPN (physical page number) for a memory-resident page
  – DPN (disk page number) for a page on the disk
  – Status bits for protection and usage
§ OS sets the Page Table Base Register whenever active user process changes
[Figure: the virtual address from the CPU execute stage is split into VPN and offset. The PT Base Register (a supervisor-accessible control register inside the CPU) plus the VPN selects a page table entry holding either a PPN or a DPN; the PPN plus the offset selects the data word within the data pages.]
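The PTE fields listed on this slide can be modeled as a small record. The sketch below is illustrative: the field names, the 4 KB page size, and the two exception types are assumptions, and a Python list stands in for the table that the PT Base Register would point to in memory.

```python
from dataclasses import dataclass
from typing import List

PAGE_BITS = 12  # assumed 4 KB pages


@dataclass
class PTE:
    exists: bool = False     # bit indicating whether the page exists
    resident: bool = False   # True: number is a PPN; False: number is a DPN
    number: int = 0          # PPN (in memory) or DPN (on disk)
    writable: bool = False   # example protection status bit
    used: bool = False       # example usage status bit


class InvalidPage(Exception):
    """The virtual page does not exist."""


class PageFault(Exception):
    """The page exists but is on disk; carries the DPN."""


def translate(page_table: List[PTE], vaddr: int) -> int:
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    pte = page_table[vpn]            # hardware: read memory at PT base + VPN * PTE size
    if not pte.exists:
        raise InvalidPage(vpn)
    if not pte.resident:
        raise PageFault(pte.number)  # the OS would bring this disk page into memory
    pte.used = True                  # record usage for the replacement policy
    return (pte.number << PAGE_BITS) | offset
```

Switching processes in this model means passing a different `page_table`, which corresponds to the OS reloading the Page Table Base Register.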

Size of Linear Page Table
§ With 32-bit addresses, 4-KB pages & 4-byte PTEs:
  ⇒ 2^32 / 2^12 = 2^20 virtual pages per user
  ⇒ 2^20 PTEs, i.e., 4 MB page table per user!
  – 4 GB of swap needed to back up full virtual address space
§ Larger pages help, but:
  – Internal fragmentation (not all memory in page is used)
  – Larger page fault penalty (more time to read from disk)
§ What about 64-bit virtual address space???
  – Even 1 MB pages would require 2^44 8-byte PTEs (2^47 bytes = 128 TB!)
What is the “saving grace”?
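The arithmetic can be checked with a few lines of Python. The helper name `linear_page_table_bytes` is invented here, and MB and TB are read as 2^20 and 2^40 bytes. For the 64-bit case, 2^64 / 2^20 = 2^44 PTEs at 8 bytes each gives 2^47 bytes, which is 128 TB in those units.

```python
KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40


def linear_page_table_bytes(va_bits: int, page_bytes: int, pte_bytes: int) -> int:
    """Size of a linear page table that covers the whole virtual address space."""
    num_pages = 2**va_bits // page_bytes   # one PTE per virtual page
    return num_pages * pte_bytes


size_32 = linear_page_table_bytes(32, 4 * KB, 4)   # 2^20 PTEs * 4 bytes
size_64 = linear_page_table_bytes(64, 1 * MB, 8)   # 2^44 PTEs * 8 bytes

if __name__ == "__main__":
    print(size_32 // MB, "MB per user with 32-bit addresses and 4 KB pages")
    print(size_64 // TB, "TB per user with 64-bit addresses and 1 MB pages")
```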
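
Hierarchical Page Table
Virtual Address from CPU: p1 (bits 31-22, 10-bit L1 index) | p2 (bits 21-12, 10-bit L2 index) | offset (bits 11-0)
[Figure: the root of the current page table (a processor register) points to the Level 1 page table in physical memory. p1 indexes it to find a Level 2 page table, p2 indexes that table to find a data page, and the offset selects the word. Legend: page in primary memory, page in secondary memory, PTE of a nonexistent page.]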
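The 10/10/12 split and the two-step walk can be sketched as follows. This is a simplified model: nested Python dicts stand in for the Level 1 and Level 2 tables, a missing key stands in for a nonexistent or non-resident page, and all names are assumptions for illustration.

```python
L1_BITS, L2_BITS, OFFSET_BITS = 10, 10, 12


class PageFault(Exception):
    """The walk hit a missing Level 2 table or a missing page."""


def split(vaddr: int):
    """Split a 32-bit virtual address into (p1, p2, offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    p2 = (vaddr >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    p1 = (vaddr >> (OFFSET_BITS + L2_BITS)) & ((1 << L1_BITS) - 1)
    return p1, p2, offset


def walk(root: dict, vaddr: int) -> int:
    """root maps p1 -> Level 2 table; a Level 2 table maps p2 -> PPN."""
    p1, p2, offset = split(vaddr)
    l2_table = root.get(p1)       # first memory reference
    if l2_table is None:
        raise PageFault(vaddr)
    ppn = l2_table.get(p2)        # second memory reference
    if ppn is None:
        raise PageFault(vaddr)
    return (ppn << OFFSET_BITS) | offset


if __name__ == "__main__":
    root = {1: {2: 0x55}}         # hypothetical: only one Level 2 table is allocated
    print(hex(walk(root, 0x402345)))
```

Only Level 2 tables for regions that are actually in use need to exist, which is how the hierarchy avoids the full 4 MB linear table for a sparsely used address space.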

Address Translation & Protection
• Every instruction and data access needs address translation and protection checks
[Figure: Virtual Address = Virtual Page No. (VPN) | offset. The VPN goes through Address Translation and, together with Kernel/User Mode and Read/Write, through a Protection Check that may raise an exception. Physical Address = Physical Page No. (PPN) | offset.]
A good VM design needs to be fast (~ one cycle) and space efficient

Translation Lookaside Buffers (TLB)
Address translation is very expensive! In a two-level page table, each reference becomes several memory accesses
Solution: Cache translations in TLB
  TLB hit ⇒ Single-Cycle Translation
  TLB miss ⇒ Page-Table Walk to refill
[Figure: virtual address = VPN | offset. The VPN is compared against the tag of each TLB entry (V, R, W, D, tag, PPN); on a hit, physical address = PPN | offset.]
(VPN = virtual page number)
(PPN = physical page number)

TLB Designs
§ Typically 32-128 entries, usually fully associative
  – Each entry maps a large page, hence less spatial locality across pages ⇒ more likely that two entries conflict
  – Sometimes larger TLBs (256-512 entries) are 4-8 way set-associative
  – Larger systems sometimes have multi-level (L1 and L2) TLBs
§ Random or FIFO replacement policy
§ No process information in TLB?
§ TLB Reach: Size of largest virtual address space that can be simultaneously mapped by TLB
Example: 64 TLB entries, 4 KB pages, one page per entry
TLB Reach = 64 entries * 4 KB = 256 KB (if contiguous)
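A small Python model of a fully associative TLB with FIFO replacement, plus the reach calculation from the example. The class name `TLB` and its methods are hypothetical, and the entry holds only VPN and PPN (no valid, protection, or dirty bits).

```python
from collections import OrderedDict


class TLB:
    """Fully associative TLB with FIFO replacement (illustrative model)."""

    def __init__(self, num_entries: int = 64, page_bytes: int = 4096):
        self.num_entries = num_entries
        self.page_bytes = page_bytes
        self.entries = OrderedDict()   # VPN -> PPN, oldest insertion first

    def reach(self) -> int:
        """Bytes of virtual address space that can be mapped at once."""
        return self.num_entries * self.page_bytes

    def lookup(self, vpn: int):
        """Return the PPN on a hit, or None on a miss."""
        return self.entries.get(vpn)   # a lookup does not change FIFO order

    def refill(self, vpn: int, ppn: int) -> None:
        """Install a translation after a page-table walk."""
        if vpn not in self.entries and len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)   # evict the oldest entry
        self.entries[vpn] = ppn


if __name__ == "__main__":
    print(TLB().reach() // 1024, "KB")
```

FIFO is easy to build in hardware because it needs only a pointer to the next entry to overwrite, at the cost of sometimes evicting a translation that is still in active use.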

Handling a TLB Miss
§ Software (MIPS, DEC/Alpha)
  – TLB miss causes an exception and the operating system walks the page tables and reloads TLB. A privileged “untranslated” addressing mode is used for the walk.
§ Hardware (SPARC v8, x86, PowerPC, RISC-V)
  – A memory management unit (MMU) walks the page tables and reloads the TLB.
  – If a missing (data or PT) page is encountered during the TLB reloading, MMU gives up and signals a Page Fault exception for the original instruction.
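
Hierarchical Page Table Walk: SPARC v8
Virtual Address: Index 1 (bits 31-24) | Index 2 (bits 23-18) | Index 3 (bits 17-12) | Offset (bits 11-0)
Physical Address: PPN (bits 31-12) | Offset (bits 11-0)
[Figure: the Context Table Register and Context Register select a root ptr in the Context Table. The root ptr locates the L1 Table, where Index 1 selects a PTP to the L2 Table; Index 2 selects a PTP to the L3 Table; Index 3 selects the PTE that supplies the PPN.]
MMU does this table walk in hardware on a TLB miss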
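The field boundaries on this slide (31, 23, 17, 11, 0) give an 8/6/6/12-bit split. The Python sketch below extracts the fields and follows the pointers through nested dicts. It assumes every translation goes through all three levels, lets a `KeyError` stand in for a fault, and uses invented function names.

```python
def split_sparc_v8(vaddr: int):
    """Split a 32-bit virtual address into (index1, index2, index3, offset)."""
    offset = vaddr & 0xFFF           # bits 11-0
    index3 = (vaddr >> 12) & 0x3F    # bits 17-12
    index2 = (vaddr >> 18) & 0x3F    # bits 23-18
    index1 = (vaddr >> 24) & 0xFF    # bits 31-24
    return index1, index2, index3, offset


def table_walk(context_table: dict, context: int, vaddr: int) -> int:
    """Follow root ptr -> L1 -> L2 -> L3 and return the physical address."""
    i1, i2, i3, offset = split_sparc_v8(vaddr)
    l1_table = context_table[context]   # root ptr selected by the context register
    l2_table = l1_table[i1]             # PTP found in the L1 table
    l3_table = l2_table[i2]             # PTP found in the L2 table
    ppn = l3_table[i3]                  # PTE found in the L3 table
    return (ppn << 12) | offset


if __name__ == "__main__":
    vaddr = (0xAB << 24) | (0x15 << 18) | (0x2A << 12) | 0x123
    tables = {0: {0xAB: {0x15: {0x2A: 0x77}}}}   # hypothetical mapping for context 0
    print(hex(table_walk(tables, 0, vaddr)))
```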

Page-Based Virtual-Memory Machine (Hardware Page-Table Walk)
§ Assumes page tables held in untranslated physical memory
[Figure: pipeline (PC, Inst. TLB, Inst. Cache, D, Decode, E, +, M, Data TLB, Data Cache, W). Each TLB turns a virtual address into a physical address and can signal a page fault or protection violation. On a TLB miss, the Hardware Page Table Walker, starting from the Page-Table Base Register, reads the page tables with physical addresses through the memory controller and main memory (DRAM).]

Address Translation: putting it all together
[Figure: flowchart, with boxes marked as hardware, hardware or software, or software.
  Virtual Address → TLB Lookup
  TLB Lookup: hit → Protection Check; miss → Page Table Walk
  Page Table Walk: the page is ∈ memory → Update TLB; the page is ∉ memory → Page Fault (OS loads page)
  Protection Check: permitted → Physical Address (to cache); denied → Protection Fault (SEGFAULT)
  Where?]
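The flow on this slide can be traced in code. The sketch below is a simplified model under stated assumptions: a dict plays the TLB, a flat dict plays the page table, only a write-permission bit is checked, and the function and exception names are invented for this example.

```python
PAGE_BITS = 12  # assumed 4 KB pages


class PageFault(Exception):
    """Page is not in memory; the OS would load it and the access would retry."""


class ProtectionFault(Exception):
    """Access denied by the protection check."""


def access(vaddr: int, is_write: bool, tlb: dict, page_table: dict) -> int:
    """tlb: VPN -> (PPN, writable); page_table: VPN -> (PPN or None, writable)."""
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    if vpn not in tlb:                          # TLB lookup: miss
        entry = page_table.get(vpn)             # page table walk
        if entry is None or entry[0] is None:   # the page is not in memory
            raise PageFault(vpn)
        tlb[vpn] = entry                        # update TLB
    ppn, writable = tlb[vpn]                    # TLB lookup: hit
    if is_write and not writable:               # protection check: denied
        raise ProtectionFault(vpn)
    return (ppn << PAGE_BITS) | offset          # physical address (to cache)


if __name__ == "__main__":
    tlb = {}
    page_table = {1: (0x40, False), 2: (None, True)}   # hypothetical contents
    print(hex(access(0x1008, False, tlb, page_table)))
```

The first access to a page takes the miss path and fills the TLB; later accesses to the same page go straight from lookup to the protection check.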

Acknowledgements
§ These slides contain material developed and copyright by:
  – Arvind (MIT)
  – Krste Asanovic (MIT/UCB)
  – Joel Emer (Intel/MIT)
  – James Hoe (CMU)
  – John Kubiatowicz (UCB)
  – David Patterson (UCB)
§ MIT material derived from course 6.823
§ UCB material derived from course CS252