Increasing the Cache Efficiency by Eliminating Noise
Philip A. Marshall
Date posted: 19-Dec-2015
Outline
- Background
- Motivation for Noise Prediction
- Concepts of Noise Prediction
- Implementation of Noise Prediction
- Related Work
- Prefetching
- Data Profiling
- Conclusion
Background
- Cache fetch
  - On cache miss
  - Prefetch
- Exploiting spatial locality
  - Cache words are fetched in blocks
  - Fetch neighboring block(s) on a cache miss
  - Results in fewer cache misses
  - Fetches words that aren’t needed
Background
- Cache noise
  - Words that are fetched into the cache but never used
- Cache utilization
  - The fraction of words in the cache that are used
  - Represents how efficiently the cache is used
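
As a rough illustration of the two definitions above, utilization can be computed from per-block bitmasks. This is only a sketch: the function name and the mask representation (one bit per word, bit i for word i) are assumptions made for the example, not something taken from the slides.

```python
def cache_utilization(used_masks, fetched_masks):
    """Fraction of fetched words that were actually used.

    Each mask is an int with one bit per word in a block (assumed layout).
    """
    fetched = sum(bin(f).count("1") for f in fetched_masks)
    # Only words that were fetched can count as used.
    used = sum(bin(u & f).count("1") for u, f in zip(used_masks, fetched_masks))
    return used / fetched if fetched else 0.0


# Two 8-word blocks, fully fetched; 4 words used in one, 1 word in the other.
utilization = cache_utilization([0b00001111, 0b00000001], [0xFF, 0xFF])
noise_fraction = 1.0 - utilization  # share of fetched words that are noise
```

Here 5 of the 16 fetched words are used, so the remaining 11 words are cache noise.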
Motivation for Noise Prediction
- Level 1 data cache utilization is ~57% for SPEC2K benchmarks [2]
- Fetching unused words:
  - Increases bandwidth requirements between cache levels
  - Increases hardware and power requirements
  - Wastes valuable cache space
[2] D. Burger et al., "Memory bandwidth limitations of future microprocessors," Proc. ISCA-23, 1996
Motivation for Noise Prediction
- Cache block size
  - Larger blocks
    - Exploit spatial locality better
    - Reduce cache tag overhead
    - Increase bandwidth requirements
  - Smaller blocks
    - Reduced cache noise
- Any block size results in suboptimal performance
Motivation for Noise Prediction
- Sub-blocking
  - Only portions of the cache blocks are fetched
  - Decreases tag overhead by associating one tag with many sub-blocks
  - Words fetched must be in contiguous blocks of fixed size
  - High miss rate and cache noise for non-contiguous access patterns
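
The last point can be made concrete with a small sketch. It assumes an 8-word block split into 4-word sub-blocks and, for simplicity, that the set of needed words is known up front; the function name and sizes are illustrative only.

```python
def subblock_fetch(needed_words, words_per_subblock):
    """Return (fetched word offsets, number of noise words) for one block.

    A sub-block is fetched whole whenever any of its words is needed.
    """
    needed = set(needed_words)
    subblocks = {w // words_per_subblock for w in needed}
    fetched = {
        sb * words_per_subblock + i
        for sb in subblocks
        for i in range(words_per_subblock)
    }
    return fetched, len(fetched - needed)


# Contiguous pattern: one sub-block covers it exactly, no noise.
_, contiguous_noise = subblock_fetch({0, 1, 2, 3}, 4)
# Non-contiguous pattern: two needed words drag in both sub-blocks.
_, scattered_noise = subblock_fetch({0, 5}, 4)
```

With the scattered pattern, 8 words are fetched to serve 2, which is the kind of noise that word-level prediction is meant to avoid.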
Motivation for Noise Prediction
- By predicting which words will actually be used, cache noise can be reduced
- But: fetching fewer words could increase the number of cache misses
Concepts of Noise Prediction
- Selective fetching
  - For each block, fetch only the words that are predicted to be accessed
  - If no prediction is available, fetch the entire block
  - Uses a valid bit for each word and a words usage bit to track which words have been used
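
A minimal sketch of the bookkeeping described above, assuming 8 words per block and bitmask storage for the valid and usage bits. The class name, the block size, and the policy of fetching only the missing word on a miss inside a partially fetched block are assumptions for illustration; the slides do not specify them.

```python
WORDS_PER_BLOCK = 8                      # assumed block size in words
FULL_MASK = (1 << WORDS_PER_BLOCK) - 1


class Block:
    def __init__(self, predicted_mask=None):
        # No prediction available: fall back to fetching the entire block.
        if predicted_mask is None:
            self.valid = FULL_MASK
        else:
            self.valid = predicted_mask & FULL_MASK
        self.used = 0                    # usage bits, all clear after the fill

    def access(self, word):
        """Return True on a hit, False if the word had not been fetched."""
        bit = 1 << word
        hit = bool(self.valid & bit)
        if not hit:
            self.valid |= bit            # assumed policy: fetch the missing word
        self.used |= bit                 # record the word as used
        return hit


block = Block(predicted_mask=0b0101)     # predictor says words 0 and 2
first = block.access(0)                  # hit: word 0 was fetched
second = block.access(1)                 # miss: word 1 was not predicted
```

At eviction, `block.used` is the usage pattern that a predictor could store for the next fill.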
Concepts of Noise Prediction
- Cache noise predictors
  - Phase Context Predictor (PCP)
    - Based on the usage pattern of the most recently evicted block
  - Memory Context Predictor (MCP)
    - Based on the MSBs of the memory address
  - Code Context Predictor (CCP)
    - Based on the MSBs of the PC
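
The three predictors differ mainly in what they use as the lookup context. The sketch below shows one way the MCP and CCP contexts could be formed, assuming 32-bit addresses; the 22-bit and 28-bit widths are the ones mentioned later in this deck, while the helper names and the unbounded dictionary standing in for a finite hardware table are assumptions.

```python
ADDR_BITS = 32  # assumed address/PC width


def msbs(value, n_bits, width=ADDR_BITS):
    """Keep the n_bits most significant bits of a width-bit value."""
    return (value & ((1 << width) - 1)) >> (width - n_bits)


def mcp_context(address, n_bits=22):
    return msbs(address, n_bits)     # Memory Context: MSBs of the address


def ccp_context(pc, n_bits=28):
    return msbs(pc, n_bits)          # Code Context: MSBs of the PC


class NoisePredictor:
    """Maps a context to the word-usage mask last seen for it."""

    def __init__(self):
        self.table = {}              # a real table would have a fixed size

    def predict(self, context):
        return self.table.get(context)   # None means "no prediction"

    def update(self, context, used_mask):
        self.table[context] = used_mask


predictor = NoisePredictor()
ctx = ccp_context(0x00400010)
cold = predictor.predict(ctx)        # nothing learned yet
predictor.update(ctx, 0b00010001)    # learned at eviction time
warm = predictor.predict(ctx)
```

For PCP, the context would instead be the usage pattern of the most recently evicted block, so the same table structure applies with a different key.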
Concepts of Noise Prediction
- Prediction table size
  - Larger tables decrease the probability of “no predictions”
  - Smaller tables use less power
- A prediction is considered successful if all the needed words are fetched
  - If extra words are fetched, still considered a success
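
The success criterion amounts to a subset check on bitmasks. A small sketch follows; the label "misprediction" and the function names are introduced here for illustration and do not come from the slides.

```python
def prediction_outcome(predicted_mask, used_mask):
    """Classify a prediction against the words that were actually used."""
    if predicted_mask is None:
        return "no prediction"
    if used_mask & ~predicted_mask:
        return "misprediction"       # at least one needed word was not fetched
    return "success"                 # every needed word was fetched


def extra_words(predicted_mask, used_mask):
    """Words fetched but never used: still a success, but it is noise."""
    return bin(predicted_mask & ~used_mask).count("1")


outcome = prediction_outcome(0b0111, 0b0101)   # superset of the needed words
wasted = extra_words(0b0111, 0b0101)
```

This separates the two costs: a misprediction adds a miss, while extra words only lower utilization.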
Concepts of Noise Prediction
- Improving prediction
  - Miss Initiator Based History (MIBH)
    - Keep separate histories according to which word in the block caused the miss
    - Improves predictability if relative position of words accessed is fixed
    - Example: looping through a struct and accessing only one field
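
The struct example can be sketched as follows, assuming 8-word blocks and a 3-word struct whose first field is read in a loop. Keying the history by the word that caused the miss is the idea from the slide; the class, the helper, and the `LOAD_PC` context value are hypothetical.

```python
WORDS_PER_BLOCK = 8  # assumed block size in words


class MIBHPredictor:
    """Keeps one usage history per (context, miss-initiating word)."""

    def __init__(self):
        self.table = {}

    def predict(self, context, miss_word):
        return self.table.get((context, miss_word))

    def update(self, context, miss_word, used_mask):
        self.table[(context, miss_word)] = used_mask


def usage_per_block(stride, count):
    """Return [(miss_word, used_mask), ...] for a strided walk, one per block."""
    per_block = {}
    for i in range(count):
        block, offset = divmod(i * stride, WORDS_PER_BLOCK)
        first, mask = per_block.get(block, (offset, 0))
        per_block[block] = (first, mask | (1 << offset))
    return list(per_block.values())


LOAD_PC = 0x400  # hypothetical context: the load that reads the field
predictor = MIBHPredictor()
for miss_word, used_mask in usage_per_block(stride=3, count=8):
    predictor.update(LOAD_PC, miss_word, used_mask)
```

The same load touches words 0, 3, 6 in one block and words 1, 4, 7 in the next, so a single history per context would keep being overwritten, while separate histories per miss-initiating word stay stable.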
Concepts of Noise Prediction
- Improving prediction
  - OR-ing Previous Two Histories (OPTH)
    - Increases predictability by looking at more than the most recent access
    - Reduces cache utilization
    - OR-ing more than two accesses reduces utilization substantially
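
A sketch of the OR-ing step, assuming each table entry keeps the last two usage masks for its context. The class name and the `depth` parameter are illustrative assumptions.

```python
from collections import deque
from functools import reduce


class OPTHPredictor:
    """Predicts the bitwise OR of the most recent usage histories."""

    def __init__(self, depth=2):
        self.depth = depth           # number of histories kept per context
        self.table = {}

    def update(self, context, used_mask):
        history = self.table.setdefault(context, deque(maxlen=self.depth))
        history.append(used_mask)    # the oldest history drops out automatically

    def predict(self, context):
        history = self.table.get(context)
        if not history:
            return None              # no prediction: fetch the whole block
        return reduce(lambda a, b: a | b, history)


predictor = OPTHPredictor()
predictor.update(7, 0b0011)
predictor.update(7, 0b0110)
combined = predictor.predict(7)      # union of the last two patterns
```

The union covers more of the needed words, which helps predictability, but it also fetches words that only one of the two patterns used, which is why utilization drops as `depth` grows.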
Results
- Empirically, CCP provides the best results
- MIBH greatly increases predictability
- OPTH improves predictability only marginally while increasing cache noise
- Cache utilization increased from 57% to 92%
Related Work
- Existing work focuses on reducing cache misses, not on improving utilization
- Sub-blocked caches are used mainly to decrease tag overhead
- Some existing work on prediction of which sub-blocks to load in a sub-blocked cache
- No existing techniques for predicting and fetching non-contiguous words
Prefetching
- Prefetching improves the cache miss rate
- Commonly, prefetching is implemented by also fetching the next block on a cache miss
- Prefetching increases noise and increases bandwidth requirements
Prefetching
- Noise prediction leads to more intelligent prefetching but requires extra hardware
- On average, prefetching with noise prediction leads to less energy consumption
- In the worst case, energy requirements increase
Data Profiling
- For some benchmarks there is a low number of predictions
  - The predictor table is too small to hold all the word usage histories
- Instead of increasing the table size, profile the data
- Profiling increases prediction rate by ~7%
- Gains aren’t as high as expected
Analysis of Noise Prediction
- Pros
  - Small increase in miss rate (0.1%)
  - Decreased power requirements in most cases
  - Decreased bandwidth requirements between cache levels
  - Adapts effective block size to access patterns
  - Dynamic technique but profiling can be used
  - Scalable to different predictor sizes
Analysis of Noise Prediction
- Cons
  - Increased hardware overhead
  - Increases power in the worst case
  - Not all programs benefit
  - Profiling provides limited improvement
Other Thoughts
- How were benchmarks chosen?
  - 6 of 12 integer and 8 of 14 floating point SPEC2K benchmarks were used
- Not all predictors were examined equally
  - 22-bit MCP predictor performed slightly worse than a 28-bit CCP
  - 28-bit MCP?
- How can the efficiency of the prediction table be increased?