Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems
![Page 1: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/1.jpg)
Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems
Tim Harris, Sasa Tomic, Adrian Cristal, Osman Unsal
Microsoft Research, BSC-Microsoft Research Center
![Page 2: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/2.jpg)
Old-to-young References (1)
• Observations
  ▫ Most allocated objects die young
  ▫ Few references from older to younger objects exist
![Page 3: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/3.jpg)
Old-to-young References (2)
• Young Generation
  ▫ Most newly allocated objects are allocated in the young generation
  ▫ The number of objects that survive a minor collection is expected to be low
• Old Generation
  ▫ Objects that are longer-lived
  ▫ Major collections are infrequent
![Page 4: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/4.jpg)
Card Table in Java HotSpot VM (1)
• Eliminating the need to scan the entire old generation
  ▫ The old generation is split into 512-byte chunks called cards
  ▫ The card table is an array with a one-byte entry per card in the heap
  ▫ The containing card is marked dirty when a field is updated
  ▫ Only dirty cards are scanned in a minor collection
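The card-marking scheme above can be sketched in a few lines of C. This is a minimal illustration, not HotSpot's code: the toy heap size, the names `card_index`, `mark_card`, and `scan_dirty_cards`, and the choice of 1 as the "dirty" value are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CARD_SHIFT 9                       /* 2^9 = 512-byte cards */
#define OLD_GEN_BYTES (64 * 1024)          /* toy old-generation size (assumption) */
#define NUM_CARDS (OLD_GEN_BYTES >> CARD_SHIFT)

static uint8_t old_gen[OLD_GEN_BYTES];     /* stand-in for the old generation */
static uint8_t card_table[NUM_CARDS];      /* one byte per card; 1 = dirty here */

/* Index of the card containing addr (addr must lie inside old_gen). */
static size_t card_index(const void *addr) {
    return (size_t)((const uint8_t *)addr - old_gen) >> CARD_SHIFT;
}

/* Called whenever a field at addr is updated. */
static void mark_card(const void *addr) {
    card_table[card_index(addr)] = 1;
}

/* Minor collection: visit only dirty cards, clean them, return how many were visited. */
static size_t scan_dirty_cards(void) {
    size_t scanned = 0;
    for (size_t i = 0; i < NUM_CARDS; i++) {
        if (card_table[i]) {
            /* a real collector would scan the 512 bytes of card i for references here */
            card_table[i] = 0;
            scanned++;
        }
    }
    return scanned;
}
```

Two updates that fall in the same 512-byte card dirty a single entry, so the minor collection visits that card once regardless of how many fields in it were written.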
![Page 5: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/5.jpg)
Card Table in Java HotSpot VM (2)
![Page 6: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/6.jpg)
Card Table in Java HotSpot VM (3)
• A write barrier maintains the card table
  ▫ Executed every time a reference field is updated
  ▫ Slightly slows down program execution
  ▫ Allows for much faster minor collections
![Page 7: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/7.jpg)
Observations
• Many barriers perform checks but no real work
  ▫ Old-to-young references are rare
• Many barriers are self-healing
  ▫ No need to further check a logged old-to-young reference
![Page 8: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/8.jpg)
Approach
• Accelerate barriers by keeping track of a set of inputs that have already been checked
• Extend the instruction set with an operation dyfl
  ▫ Tests whether the barrier’s inputs are already in the set
![Page 9: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/9.jpg)
Original Barrier
• A simple original barrier in pseudo-code
    void writeBarrier(void **addr, void *tgt) {
      if (inOldGen(addr) && inYoungGen(tgt)) { // T1
        log(addr); // L1
      }
    }
• Unnecessary when
  ▫ the (addr, tgt) pair has already passed the full check
  ▫ addr has already been logged
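To make the redundancy concrete, here is a runnable C rendering of the original barrier. It is only a sketch under stated assumptions: the two generations are modelled as fixed static arrays, `log` is renamed `log_slot` to avoid clashing with the C math library, and `storeRef`, `remembered`, and the capacities are hypothetical names and sizes introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GEN_WORDS 1024
#define LOG_CAPACITY 64

static void *young_gen[GEN_WORDS];          /* toy young generation */
static void *old_gen[GEN_WORDS];            /* toy old generation */

static void **remembered[LOG_CAPACITY];     /* log of old-generation slots */
static size_t remembered_count = 0;

/* True if p points into the GEN_WORDS-word region starting at base. */
static bool in_range(const void *p, void *const *base) {
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)base;
    return a >= lo && a < lo + GEN_WORDS * sizeof(void *);
}
static bool inOldGen(const void *p)   { return in_range(p, old_gen); }
static bool inYoungGen(const void *p) { return in_range(p, young_gen); }

static void log_slot(void **addr) {
    assert(remembered_count < LOG_CAPACITY);
    remembered[remembered_count++] = addr;
}

static void writeBarrier(void **addr, void *tgt) {
    if (inOldGen(addr) && inYoungGen(tgt)) {    /* T1 */
        log_slot(addr);                         /* L1 */
    }
}

/* A reference store: run the barrier, then perform the write. */
static void storeRef(void **addr, void *tgt) {
    writeBarrier(addr, tgt);
    *addr = tgt;
}
```

Storing a young-generation reference into the same old-generation slot twice logs that slot twice, and every old-to-old or young-to-young store still pays for the two range checks. These are the two kinds of redundant work the slide lists.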
![Page 10: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/10.jpg)
A Dynamic Filter Way
void writeBarrierDyfl(void **addr, void *tgt) {
  if ((!dyfl_card_pair(addr, tgt, 0x1)) && // A1
      (!dyfl_addr(addr, 0x2))) { // A2
    if (inOldGen(addr) && inYoungGen(tgt)) { // T1
      dyfl_set_addr(addr, 0x2); // S2
      log(addr); // L1
    } else {
      dyfl_set_card_pair(addr, tgt, 0x1); // S1
    }
  }
}
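Annotations:
• dyfl_card_pair (A1): tests whether a full check has already been done on the (addr, tgt) pair, at 512-byte granularity for spatial locality
• 0x1, 0x2: tags that distinguish this use of the filter from other uses
• dyfl_addr (A2): single-address check
• Significant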
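The filtered barrier can be exercised in software by replacing the proposed instructions with a small lossy table. The sketch below is an emulation under assumptions, not the hardware design: the direct-mapped table, its size, the hash in `slot`, the 512-byte alignment of the toy generations, and the counters `full_checks` and `logged` are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GEN_WORDS 1024
#define FILTER_SLOTS 256
#define CARD_MASK (~(uintptr_t)511)          /* drop low 9 bits: 512-byte cards */

static _Alignas(512) void *young_gen[GEN_WORDS];   /* toy generations, card-aligned */
static _Alignas(512) void *old_gen[GEN_WORDS];
static size_t full_checks = 0, logged = 0;

/* Software stand-in for the filter: direct-mapped, so entries may be evicted. */
typedef struct { uintptr_t i1, i2; unsigned tag; bool valid; } Entry;
static Entry filter[FILTER_SLOTS];

static size_t slot(uintptr_t i1, uintptr_t i2, unsigned tag) {
    return (size_t)(((i1 >> 3) ^ (i2 >> 9) ^ tag) % FILTER_SLOTS);
}
static bool filter_test(uintptr_t i1, uintptr_t i2, unsigned tag) {
    const Entry *e = &filter[slot(i1, i2, tag)];
    return e->valid && e->i1 == i1 && e->i2 == i2 && e->tag == tag;
}
static void filter_set(uintptr_t i1, uintptr_t i2, unsigned tag) {
    filter[slot(i1, i2, tag)] = (Entry){ i1, i2, tag, true };
}

static bool dyfl_card_pair(void *a, void *t, unsigned tag) {
    return filter_test((uintptr_t)a & CARD_MASK, (uintptr_t)t & CARD_MASK, tag);
}
static void dyfl_set_card_pair(void *a, void *t, unsigned tag) {
    filter_set((uintptr_t)a & CARD_MASK, (uintptr_t)t & CARD_MASK, tag);
}
static bool dyfl_addr(void *a, unsigned tag)     { return filter_test((uintptr_t)a, 0, tag); }
static void dyfl_set_addr(void *a, unsigned tag) { filter_set((uintptr_t)a, 0, tag); }

static bool in_range(const void *p, void *const *base) {
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)base;
    return a >= lo && a < lo + GEN_WORDS * sizeof(void *);
}

static void writeBarrierDyfl(void **addr, void *tgt) {
    if (!dyfl_card_pair(addr, tgt, 0x1) &&                    /* A1 */
        !dyfl_addr(addr, 0x2)) {                              /* A2 */
        full_checks++;                                        /* slow path reached */
        if (in_range(addr, old_gen) && in_range(tgt, young_gen)) {   /* T1 */
            dyfl_set_addr(addr, 0x2);                         /* S2 */
            logged++;                                         /* L1: stands in for log(addr) */
        } else {
            dyfl_set_card_pair(addr, tgt, 0x1);               /* S1 */
        }
    }
}
```

A miss in this table is always safe, because it only means the full check runs again. A stale hit is not, so a runtime using such a filter would presumably have to clear the relevant tag whenever a collection changes which generation an address belongs to.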
![Page 11: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/11.jpg)
Dynamic Filtering in the ISA
dyfl(i1, i2, mask, tag) Test dynamic filter
dyfl_set(i1, i2, mask, tag) Set dynamic filter
dyfl_clear(i1, i2, mask, tag) Clear specific entry
dyfl_clear(tag) Clear all with tag
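The four operations can be modelled in software to show how they fit together. The slide does not spell out what `mask` does, so this sketch assumes it is ANDed with both inputs before lookup (which would let one instruction serve both the 512-byte card-pair test and the exact single-address test). The table organisation, the hash, and the name `dyfl_clear_tag` (C has no overloading, so the one-argument `dyfl_clear(tag)` needs its own name) are likewise assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_ENTRIES 2048                 /* size chosen for this sketch */

typedef struct { uintptr_t i1, i2; unsigned tag; bool valid; } FilterEntry;
static FilterEntry table[FILTER_ENTRIES];

/* Direct-mapped lookup: each (i1, i2, tag) triple maps to exactly one entry. */
static FilterEntry *lookup(uintptr_t i1, uintptr_t i2, unsigned tag) {
    uintptr_t h = (i1 * 31u) ^ (i2 * 17u) ^ tag;
    return &table[h % FILTER_ENTRIES];
}

static bool matches(const FilterEntry *e, uintptr_t i1, uintptr_t i2, unsigned tag) {
    return e->valid && e->i1 == i1 && e->i2 == i2 && e->tag == tag;
}

/* Test dynamic filter. */
static bool dyfl(uintptr_t i1, uintptr_t i2, uintptr_t mask, unsigned tag) {
    i1 &= mask; i2 &= mask;                 /* assumed meaning of mask */
    return matches(lookup(i1, i2, tag), i1, i2, tag);
}

/* Set dynamic filter (may silently evict whatever occupied the entry). */
static void dyfl_set(uintptr_t i1, uintptr_t i2, uintptr_t mask, unsigned tag) {
    i1 &= mask; i2 &= mask;
    *lookup(i1, i2, tag) = (FilterEntry){ i1, i2, tag, true };
}

/* Clear one specific entry, if present. */
static void dyfl_clear(uintptr_t i1, uintptr_t i2, uintptr_t mask, unsigned tag) {
    i1 &= mask; i2 &= mask;
    FilterEntry *e = lookup(i1, i2, tag);
    if (matches(e, i1, i2, tag)) e->valid = false;
}

/* Clear every entry carrying the given tag. */
static void dyfl_clear_tag(unsigned tag) {
    for (size_t i = 0; i < FILTER_ENTRIES; i++)
        if (table[i].valid && table[i].tag == tag) table[i].valid = false;
}
```

Under this reading, a card-pair test would be `dyfl(addr, tgt, ~511, tag)` and a single-address test would be `dyfl(addr, 0, ~0, tag)`.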
![Page 12: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/12.jpg)
Implementation Sketch
![Page 13: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/13.jpg)
Design Details
• Tag assignment
  ▫ 16 tags available
• Sharing
  ▫ Extend the tag implicitly to distinguish multiple hardware threads on the same core
• Implementation
  ▫ Implemented independently of the caches
![Page 14: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/14.jpg)
Using Dynamic Filtering
• Garbage Collection
• Transactional Memory
• Language-Based Security
![Page 15: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/15.jpg)
Transactional Memory
• STM with eager updates
  ▫ Dynamic filtering can be used to check whether or not a location has already been accessed in the current transaction
• STM with deferred updates
  ▫ Less applicable, since slow-path work is needed in most cases
  ▫ Track locations that have not been written, which can be accessed directly
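For the eager-update case, the idea can be sketched with a single-threaded toy write barrier that records a location's old value in an undo log only on its first write in the transaction. This is an illustration under assumptions, not the STM evaluated in the talk: it ignores ownership and concurrency entirely, the `seen` table stands in for the hardware filter, and `tx_begin`, `tx_write`, `tx_abort`, and the capacities are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UNDO_CAPACITY 128
#define SEEN_SLOTS 64

typedef struct { int *addr; int old_value; } UndoEntry;
static UndoEntry undo_log[UNDO_CAPACITY];
static size_t undo_count = 0;

/* Lossy stand-in for the filter: remembers addresses already logged. */
static int *seen[SEEN_SLOTS];

static size_t seen_slot(const int *addr) {
    return (size_t)(((uintptr_t)addr >> 2) % SEEN_SLOTS);
}

static void clear_seen(void) {              /* plays the role of dyfl_clear(tag) */
    for (size_t i = 0; i < SEEN_SLOTS; i++) seen[i] = NULL;
}

static void tx_begin(void) {
    undo_count = 0;
    clear_seen();
}

static void tx_write(int *addr, int value) {
    size_t s = seen_slot(addr);
    if (seen[s] != addr) {                  /* filter miss: take the slow path */
        assert(undo_count < UNDO_CAPACITY);
        undo_log[undo_count++] = (UndoEntry){ addr, *addr };
        seen[s] = addr;
    }
    *addr = value;                          /* eager, in-place update */
}

/* Roll back in reverse order so the oldest logged value is restored last. */
static void tx_abort(void) {
    while (undo_count > 0) {
        UndoEntry e = undo_log[--undo_count];
        *e.addr = e.old_value;
    }
    clear_seen();
}
```

Because rollback replays the log in reverse, an evicted filter entry only costs a duplicate undo record and never an incorrect restore, which is what makes a lossy filter acceptable for this barrier.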
![Page 16: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/16.jpg)
Language-Based Security
• Control Flow Integrity (CFI)
  ▫ Record target-marker associations
  ▫ Record valid source-target address pairs
• XFI
  ▫ Data access check
![Page 17: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/17.jpg)
Evaluation
• Simulator
  ▫ Based on an x86 simulator
  ▫ A single multi-threaded user-mode process
  ▫ Simple timing model: each instruction takes 1 cycle plus the number of cycles spent on memory accesses
  ▫ dyfl implementation is separate from the caches
  ▫ 2048-entry filter
![Page 18: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/18.jpg)
Simulator vs. Real Hardware
![Page 19: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/19.jpg)
Generational GC Hit Rates (1)
![Page 20: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/20.jpg)
Generational GC Hit Rates (2)
![Page 21: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/21.jpg)
Generational GC Hit Rates (3)
![Page 22: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/22.jpg)
GC Acceleration Performance
![Page 23: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/23.jpg)
GC Acceleration Performance
![Page 24: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/24.jpg)
STM Performance (1)
![Page 25: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/25.jpg)
STM Performance (2)
![Page 26: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/26.jpg)
Sensitivity
Panels: JBBAtomic, GC-STM; JBBAtomic, GC; JBBAtomic, STM
![Page 27: Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems](https://reader033.fdocuments.us/reader033/viewer/2022060111/5565438bd8b42a902d8b49c2/html5/thumbnails/27.jpg)
Conclusion
• Dynamic filtering
  ▫ An abstraction for accelerating read/write barriers used by runtime systems
  ▫ Provides a mechanism for testing whether or not a given runtime check has already been made