Theory of Memory
W. Paul Saarland University and DFKI
bmb+f project Verisoft-XT
joint work with Ulan Degebaev and Norbert Schirmer
Saarland University

Why might this be important?
• Unites theories of
  – store buffers
  – interlocking
  – caches
  – cache coherence
  – out-of-order execution
  – X64 instruction set
  – address translation
  – optimized compilation
  – structured parallel C semantics
• Explains why a hypervisor might run structured parallel C
• VCC is supposed to mirror structured parallel C semantics
• thus VCC might be(come) sound

Specifying Memory
[Figure: memory M with content M(x) at address x]

Store Buffer
[Figure: memory M, accesses w(i) and r(j), store buffer sbuf(y)]
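
The store buffer picture can be made concrete with a small model. The following is a minimal sketch, assuming a single processor, a FIFO store buffer and read forwarding from the newest buffered write; the class and method names (StoreBufferedMemory, write, read, drain_one) are illustrative and not taken from the talk.

```python
from collections import deque


class StoreBufferedMemory:
    """Toy model: one processor with a FIFO store buffer in front of memory M."""

    def __init__(self):
        self.M = {}           # memory: address -> value (unwritten cells read as 0)
        self.sbuf = deque()   # pending writes, oldest first

    def write(self, addr, value):
        # A write only enters the store buffer; memory M is unchanged for now.
        self.sbuf.append((addr, value))

    def read(self, addr):
        # Forward from the newest buffered write to this address, if any.
        for a, v in reversed(self.sbuf):
            if a == addr:
                return v
        return self.M.get(addr, 0)

    def drain_one(self):
        # Commit the oldest buffered write to memory M.
        if self.sbuf:
            a, v = self.sbuf.popleft()
            self.M[a] = v


cpu = StoreBufferedMemory()
cpu.write(0x10, 7)
local_view = cpu.read(0x10)        # the writer sees its own buffered write
global_view = cpu.M.get(0x10, 0)   # memory M itself still holds the old value
cpu.drain_one()
after_drain = cpu.M.get(0x10, 0)   # now the write is visible in M
```

The gap between local_view and global_view is what makes store buffers visible to other processors, and why the later lemmas need conditions under which they become invisible.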

Caches
[Figure: memory M, cache ca]

Many Caches: Snooping
[Figure: memory M, caches ca(1) ... ca(p)]

Many Caches
[Figure: memory M, caches ca(1) ... ca(p), address x with fields x.la and x.off]
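
The slide gives only the field names x.la and x.off. A minimal sketch, assuming they denote the line address and the offset within a cache line, and assuming a line size of 64 bytes (LINE_BYTES is a hypothetical constant, as are the function names):

```python
LINE_BYTES = 64  # assumed cache line size; the slides do not state one


def split(x):
    """Split byte address x into (line address, offset within the line)."""
    return x // LINE_BYTES, x % LINE_BYTES


def join(la, off):
    """Inverse of split: rebuild the byte address from its two fields."""
    return la * LINE_BYTES + off


la, off = split(0x1234)
```

Under this reading, caches exchange whole lines identified by la, while off only selects data inside a line.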

Many Caches
[Figure: memory M, caches ca(1) ... ca(p), offset x.off]

Overlapping Transactions
[Figure: transactions a, b, c, with public(a)]

Sequentially Consistent Memory (lemma 5)
[Figure: transactions a, b, c, with public(a)]

Tomasulo Schedulers for OOO
[Figure: IF, issue, reservation stations, functional units, CDB, ROB, WB]

Two Memory Units
[Figure: MMU, LS, RS, RS, sbuf, m, functional units, CDB, ROB]

Single Processor OOO correctness (lemma 6)
[Figure: MMU, LS, RS, RS, sbuf, m, functional units, CDB, ROB]

Multi Processor OOO implementation
[Figure: MMU, LS, RS, RS, sbuf, m, functional units, CDB, ROB, data(i,j)]

Multi Processor OOO correctness (lemma 7)
[Figure: MMU, LS, RS, RS, sbuf, m, functional units, CDB, ROB, data(i,j)]

X64 architecture
• CPU core
  – R: user registers
  – SR: system registers
    • CR3
  – acc: access
  – segmentation
• mmu: memory management unit
  – tlb: translation lookaside buffer
• memory system
  – mm: main memory
  – ca: cache
  – sbuf: store buffer
[Figure: core (R, CR3, acc, segmentation), mmu with tlb, acc, sbuf, ca, mm]

segmentation off (lemma 8)
• 1 segment
• large as entire address space
• segmentation invisible
[Figure: core (R, CR3, acc, segmentation), mmu with tlb, acc, sbuf, ca, mm]

Bad news: cache state is visible
• CPU core
  – acc: access
    • acc.adr: address
    • acc.r: rights (user, write, exe)
    • acc.data
    • acc.mmode: memory mode
      – WB: write back
      – WT: write through ...
      – NC: no cache
[Figure: core (R, CR3, acc), mmu with tlb, acc, sbuf, ca, mm or devices]

Good News: no device, no NC mode
• acc.mmode: memory mode
  – WB: write back
  – WT: write through ...
  – NC: no cache (not used)
[Figure: core (R, CR3, acc), mmu with tlb, acc, sbuf, ca, mm]

Sequentially Consistent Physical Memory (lemma 9)
• acc.mmode: memory mode
  – WB: write back
  – WT: write through ...
  – mix on same address
• PM: sequentially consistent physical memory abstraction
  – Proof: MOESI invariants are maintained
[Figure: core (R, CR3, acc), mmu with tlb, acc, sbuf, PM]
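
The slide only names the MOESI invariants. A minimal sketch of what such an invariant can look like for one cache line, assuming the standard MOESI state compatibility rules; the function name moesi_ok and the list-of-states encoding are illustrative and not taken from the talk.

```python
from collections import Counter


def moesi_ok(states):
    """Check the states that all caches hold for ONE line.

    states: one letter per cache, each of 'M', 'O', 'E', 'S', 'I'.
    """
    n = Counter(states)
    exclusive = n["M"] + n["E"]
    # M or E means this cache holds the only valid cached copy.
    if exclusive > 1:
        return False
    if exclusive == 1 and n["O"] + n["S"] > 0:
        return False
    # At most one cache owns the line, i.e. is responsible for supplying it.
    return n["O"] <= 1


legal = [moesi_ok(s) for s in (["M", "I", "I"], ["O", "S", "S"], ["S", "S", "I"])]
illegal = [moesi_ok(s) for s in (["M", "S"], ["E", "E"], ["O", "O"])]
```

A proof along the lines of the slide would show that every protocol step preserves a predicate of this kind for every line, from which a single consistent value per address follows.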

Initialize page tables
• 1 processor
  – sbuf invisible
• operating mode: paging disabled
  – mmu invisible
• set up page table tree in PM
[Figure: core (R, CR3, acc), mmu with tlb, acc, sbuf, PM with page tables]

Translated Linear Memory
• many processors
• operating mode: paging enabled
• keep tlb consistent
[Figure: core (R, CR3, acc), mmu with tlb, acc, sbuf, PM with page tables]
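
Why the tlb has to be kept consistent can be seen in a toy translation model. This is a minimal sketch under simplifying assumptions: the table tree has 2 levels instead of the 4 used on x64, physical memory is a dictionary keyed by (table, index), and the MMU class with its translate and invlpg methods is hypothetical (invlpg is merely named after the x86 instruction).

```python
PAGE_BITS = 12                       # 4 KiB pages, as on x64
INDEX_BITS = 9                       # 512 entries per table, as on x64
LEVELS = 2                           # x64 walks 4 levels; 2 keeps the sketch short
OFFSET_MASK = (1 << PAGE_BITS) - 1
INDEX_MASK = (1 << INDEX_BITS) - 1


def walk(pm, cr3, la):
    """Translate linear address la by walking the table tree rooted at cr3."""
    table = cr3
    for level in reversed(range(LEVELS)):
        index = (la >> (PAGE_BITS + level * INDEX_BITS)) & INDEX_MASK
        entry = pm.get((table, index))
        if entry is None:
            raise LookupError("page fault")
        table = entry                # next table, or the page frame at the last level
    return (table << PAGE_BITS) | (la & OFFSET_MASK)


class MMU:
    def __init__(self, pm, cr3):
        self.pm, self.cr3, self.tlb = pm, cr3, {}

    def translate(self, la):
        page = la >> PAGE_BITS
        if page not in self.tlb:     # tlb miss: walk the tables and cache the result
            self.tlb[page] = walk(self.pm, self.cr3, la) >> PAGE_BITS
        return (self.tlb[page] << PAGE_BITS) | (la & OFFSET_MASK)

    def invlpg(self, la):
        self.tlb.pop(la >> PAGE_BITS, None)


pm = {(0, 1): 5, (5, 2): 0x40}       # root table 0 -> table 5 -> frame 0x40
mmu = MMU(pm, cr3=0)
la = (1 << 21) | (2 << 12) | 0xABC
first = mmu.translate(la)
pm[(5, 2)] = 0x41                    # the page table changes in memory ...
stale = mmu.translate(la)            # ... but the tlb still answers with the old frame
mmu.invlpg(la)
fresh = mmu.translate(la)
```

The stale result is the inconsistency that software must rule out by invalidating tlb entries after it changes page tables.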

Translated Consistent Linear Memory + sbufs (lemma 10)
• many processors
• operating mode: paging enabled
• keep tlb consistent
[Figure: core (R, CR3, acc), sbuf, LM with page tables]

C0: Pascal with C syntax
configurations
• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory
• subvariables
  – (m,i)[17].gpr[3]
• value of pointers: subvariables !
[Figure: memory m, with ba(m,i), size(m,i) and va(c,(m,i))]

Parallel C
• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory
• Share
  – gm
  – hm
• Interleave at steps of the small-step semantics
[Figure: memory m, with ba(m,i), size(m,i) and va(c,(m,i))]

Parallel C
• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory
• Share
  – gm
  – hm
• Interleave at steps of the small-step semantics
• Problem:
  – Processor interleaves instructions of compiled programs code(p)
[Figure: memory m, with ba(m,i), size(m,i) and va(c,(m,i))]
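
The problem named on this slide can be shown with a tiny enumeration. The sketch below assumes two threads that each execute the C statement x = x + 1 on a shared x, and assumes the compiled code for that statement is a load, an add and a store; the operation names and helper functions are illustrative.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute a schedule of (thread, op) steps on shared x and return the final x."""
    x, reg = 0, {}
    for tid, op in schedule:
        if op == "inc":        # one C statement x = x + 1, taken as a single step
            x += 1
        elif op == "load":     # compiled code: register := x
            reg[tid] = x
        elif op == "add":      # compiled code: register := register + 1
            reg[tid] += 1
        elif op == "store":    # compiled code: x := register
            x = reg[tid]
    return x


def outcomes(thread0, thread1):
    return {run(s) for s in interleavings(thread0, thread1)}


c_level = outcomes([(0, "inc")], [(1, "inc")])
machine_level = outcomes(
    [(0, "load"), (0, "add"), (0, "store")],
    [(1, "load"), (1, "add"), (1, "store")],
)
```

Interleaving at C statements yields only the result 2, while interleaving at instructions also yields 1, so the hardware can produce behaviour that the statement-level semantics does not predict.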

simulation relation consis(c, alloc, d)
[Figure: pointer p and variable y in c, allocated at alloc(c,p) and alloc(c,y) in LM]
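
A much simplified reading of the consistency relation can be sketched as follows. The assumptions are substantial: c is reduced to a dictionary of named variables, alloc is a plain dictionary rather than a function of c, the linear memory is passed directly in place of the hardware configuration d, and every value occupies one memory word. The only point illustrated is that a pointer is represented in memory by the allocated address of its target.

```python
def consis(c, alloc, lm):
    """Check a simplified consistency relation between C variables and memory.

    c     : variable name -> value; a pointer's value is ("ptr", target_name)
    alloc : variable name -> allocated address in linear memory
    lm    : address -> stored word
    """
    for name, value in c.items():
        if isinstance(value, tuple) and value[0] == "ptr":
            expected = alloc[value[1]]   # pointers are stored as the target's address
        else:
            expected = value
        if lm.get(alloc[name]) != expected:
            return False
    return True


c = {"y": 42, "p": ("ptr", "y")}
alloc = {"y": 0x1000, "p": 0x1008}
good_lm = {0x1000: 42, 0x1008: 0x1000}
bad_lm = {0x1000: 42, 0x1008: 0x2000}    # p points somewhere other than y

holds = consis(c, alloc, good_lm)
broken = consis(c, alloc, bad_lm)
```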

Non-optimizing compiler: step-by-step simulation

Optimizing compiler: simulation between IO-steps

IO-steps (1): volatile accesses

Volatiles Sequentially Consistent (lemma 11)

Structured Parallel C
• Implement Locks using Volatiles
• IO-steps (2): lock release
• Run Processors alone on locked portions of linear memory
• Lemma 1: sbufs invisible
• Lemma 10: Ordinary C code in linear memory

Summary
• Implement Locks using Volatiles
• IO-steps (2): lock release
• Run Processors alone on locked portions of linear memory
• Lemma 1: sbufs invisible
• Lemma 10: Ordinary C code in linear memory
• Outlined correctness proof for implementation of structured parallel C
  – Initialisation
  – compilation