Release Consistency
Yujia Jin, 2/27/02
Motivations
Place a partial order on memory accesses so that parallel programs behave correctly
Relax that partial order so that memory accesses can overlap
Trade off programmer productivity against processor performance
Basic Assumptions
Uniprocessor control and data dependences are respected
Memory coherence: reads and writes to the same address are serialized
Previous Consistency Models
Sequential Consistency (SC)
  Each processor runs in program order
  Operations of all processors are serialized
Processor Consistency (PC): same as SC except
  A read can bypass an earlier write before that write is performed
  Writes are non-atomic
Weak Consistency (WCsc)
  Memory accesses cannot be reordered past synchronization accesses
  Synchronization accesses are sequentially consistent
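As a rough illustration of the gap between SC and PC, the sketch below enumerates every interleaving of a small two-processor test in which each processor stores to one variable and then loads the other. The test program, the names `P1`, `P2`, `interleavings`, `run`, and `outcomes`, and the decision to model PC only as "a load may move ahead of the earlier store" are assumptions made for this example; non-atomic writes are not modeled.

```python
# Each operation is (kind, variable, register); the register is used only by loads.
P1 = [("store", "x", None), ("load", "y", "r1")]
P2 = [("store", "y", None), ("load", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's internal order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(sequence):
    """Execute one global order against a single memory; return (r1, r2)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, var, reg in sequence:
        if kind == "store":
            memory[var] = 1
        else:
            regs[reg] = memory[var]
    return (regs["r1"], regs["r2"])


def outcomes(allow_read_bypass):
    """Collect all (r1, r2) results.

    allow_read_bypass=False approximates SC (program order kept).
    allow_read_bypass=True is a simplified stand-in for PC: the load may
    also be performed before the earlier store to a different address.
    """
    def orders(thread):
        return [thread, thread[::-1]] if allow_read_bypass else [thread]

    results = set()
    for o1 in orders(P1):
        for o2 in orders(P2):
            for seq in interleavings(o1, o2):
                results.add(run(seq))
    return results
```

Under the SC setting the result (0, 0) never appears, because at least one store precedes both loads in every interleaving; once loads may bypass earlier stores, (0, 0) becomes reachable.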
Access Classification
Shared access
  Competing
    Synchronization
      Acquire
      Release
    Non-synchronization
  Non-competing
Key Observations
Acquire
  Gets permission from other processors for subsequent memory accesses
  Previous memory accesses can overlap with the acquire
Release
  Gives permission to other processors for previous memory accesses
  Subsequent memory accesses can overlap with the release
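In lock-based code, taking a lock plays the role of an acquire and unlocking plays the role of a release. The sketch below shows that pairing with Python's `threading.Lock`; the names `counter`, `worker`, and `run` are made up for this example, and the Python interpreter does not expose hardware memory ordering, so this only illustrates where the acquire and release sit in a program.

```python
import threading

counter = 0
lock = threading.Lock()


def worker(iterations):
    """Increment the shared counter inside an acquire/release pair."""
    global counter
    for _ in range(iterations):
        lock.acquire()   # acquire: gain permission for the accesses that follow
        counter += 1     # ordinary shared accesses inside the critical section
        lock.release()   # release: publish the preceding accesses to others


def run(n_threads=4, iterations=1000):
    """Start the workers, wait for them, and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(iterations,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Accesses placed before `lock.acquire()` or after `lock.release()` are the ones the observations above allow to overlap with the synchronization access.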
Release Consistency
A memory access cannot start before all previous acquires are performed
A release access cannot start before all previous memory accesses are performed
Competing accesses are processor consistent (PC) with respect to one another
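The three conditions can be read as a predicate over two accesses from the same processor. The following is a minimal sketch of that reading, not a definitive statement of the model: the `Access` type, the label strings, and the function name `must_stay_ordered` are assumptions for this example, an acquire is taken to be a load and a release a store, and uniprocessor dependences are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    label: str  # "ordinary", "acquire", "release", or "nsync"
    op: str     # "load" or "store"


def is_special(access):
    """Competing accesses are everything that is not labeled ordinary."""
    return access.label != "ordinary"


def must_stay_ordered(first, second):
    """Return True if `second` may not start until `first` is performed.

    `first` precedes `second` in the program order of one processor.
    """
    if is_special(first) and is_special(second):
        # Competing accesses follow PC among themselves:
        # only a load bypassing an earlier store is allowed.
        return not (first.op == "store" and second.op == "load")
    if first.label == "acquire":
        # Nothing may start before a previous acquire is performed.
        return True
    if second.label == "release":
        # A release waits for all previous accesses.
        return True
    # Ordinary accesses are otherwise free to overlap.
    return False
```

For instance, `must_stay_ordered(Access("ordinary", "load"), Access("acquire", "load"))` is false: an earlier ordinary access does not hold up a later acquire, which is the overlap that weak consistency forbids.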
Comparison
[Figure: ordering constraints on sample access sequences under four models. Panels: SC and PC (sequences of loads and stores); WCsc and RCpc (blocks of load/store accesses between acquire, release, and nsync store accesses)]
Properly-Labeled Program
Add enough syncL labels that an appropriate syncL access separates every possible pair of conflicting memory accesses from different processors in which at least one access is ordinaryL
Competing accesses are labeled specialL; shared accesses are labeled sharedL
A less conservative label gives better performance
Labels are supplied by the compiler, or by the programmer through predefined synchronization constructs
Shared access: sharedL
  Competing: specialL
    Synchronization: syncL
      Acquire: acqL
      Release: relL
    Non-synchronization: nsyncL
  Non-competing: ordinaryL
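One way to picture the labeling requirement is a check over a single recorded execution. The sketch below flags conflicting accesses from different processors, at least one of them ordinaryL, that are not separated by a relL on the first processor followed by an acqL on the same location by the second. The `Access` tuple, the trace format, and the function names are assumptions for this example; the real condition covers every possible execution and chains of synchronization, which this simplified check does not.

```python
from collections import namedtuple

# proc: processor id; op: "load" or "store"; addr: location name;
# label: "ordinaryL", "acqL", "relL", or "nsyncL"
Access = namedtuple("Access", "proc op addr label")


def conflict(a, b):
    """Two accesses conflict if they touch one location and one is a store."""
    return a.addr == b.addr and "store" in (a.op, b.op)


def separated_by_sync(trace, i, j):
    """True if, between positions i and j, trace[i].proc performs a relL that
    trace[j].proc later picks up with an acqL on the same location."""
    first, second = trace[i], trace[j]
    for k in range(i + 1, j):
        rel = trace[k]
        if rel.proc != first.proc or rel.label != "relL":
            continue
        for m in range(k + 1, j):
            acq = trace[m]
            if (acq.proc == second.proc and acq.label == "acqL"
                    and acq.addr == rel.addr):
                return True
    return False


def unlabeled_races(trace):
    """Return index pairs that violate the labeling rule in this trace."""
    races = []
    for i in range(len(trace)):
        for j in range(i + 1, len(trace)):
            a, b = trace[i], trace[j]
            if a.proc == b.proc or not conflict(a, b):
                continue
            if "ordinaryL" not in (a.label, b.label):
                continue  # both accesses are already labeled special
            if not separated_by_sync(trace, i, j):
                races.append((i, j))
    return races
```

A producer that writes `data` as ordinaryL and then writes `flag` as relL, paired with a consumer that reads `flag` as acqL before reading `data`, yields an empty list; dropping the two flag accesses makes the pair of `data` accesses show up as a violation.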
Implementation
Use fence operations to block memory accesses
The block is conditional: it applies only if some relevant memory accesses have not yet performed
Three types of fence: full, write, immediate
Fence operations are flexible and can implement SC, PC, WC, RC, …
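The conditional nature of a fence can be sketched with a toy processor model that tracks accesses that have been issued but not yet performed. Everything here is an assumption for illustration: the `Processor` class and its methods are hypothetical, a full fence is taken to hold back all later accesses, a write fence is taken to hold back only later stores, and the immediate fence is left out because the slides give no detail on it.

```python
class Processor:
    """Toy model of one processor's outstanding memory accesses."""

    def __init__(self):
        self.pending = set()         # issued but not yet performed
        self.fence_kind = None       # "full" or "write" (assumed semantics)
        self.fence_waits_on = set()  # accesses the current fence waits for

    def fence(self, kind):
        """Place a fence; it waits only on accesses outstanding right now."""
        self.fence_kind = kind
        self.fence_waits_on = set(self.pending)

    def can_issue(self, op):
        """A fence blocks only while some earlier access is unperformed."""
        if not self.fence_waits_on:
            return True
        # Assumed: a write fence lets later loads through, a full fence nothing.
        return self.fence_kind == "write" and op == "load"

    def issue(self, name, op):
        """Try to issue an access; return False if a fence stalls it."""
        if not self.can_issue(op):
            return False
        self.pending.add(name)
        return True

    def perform(self, name):
        """Mark an access as performed, possibly unblocking a fence."""
        self.pending.discard(name)
        self.fence_waits_on.discard(name)
```

Under these assumptions, a release could be approximated as a write fence followed by the releasing store, and an acquire as the acquiring load followed by a full fence; a fence placed when nothing is outstanding stalls nothing.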
Discussion
One step closer to message passing?
With the additional complexity, how much improvement do we get?