Garbage Collection
Memory Management
• Garbage Collection
  – Language requirement
  – VM service
  – Performance issue in time and space
Outline
• Motivation for GC
• Basics of GC
  – How it works
  – Key design choices
  – Basic GCs
• A taxonomy of sorts
• GC Performance
Outline
• Briefly introduce the challenges and key ideas in memory management
  – Explicit vs. automatic
  – Memory organization
    • Contiguous allocation (bump pointer)
    • Free lists (analogous to pages in memory)
  – Reclamation
    • Tracing
    • Reference counting
Memory Management
• Program objects/data occupy memory
• How does the runtime system efficiently create and recycle memory on behalf of the program?
  – What makes this problem important?
  – What makes this problem hard?
  – Why are researchers (e.g. me) still working on it?
Dynamic memory allocation and reclamation
• Heap contains dynamically allocated objects
• Object allocation: malloc, new
• Deallocation:
  – Manual/explicit: free, delete
  – Automatic: garbage collection
Explicit Memory Management Challenges
• More code to maintain
• Correctness
  – Free an object too soon: core dump
  – Free an object too late: wasted space
  – Never free: at best wasted space, at worst failure
• Efficiency can be very high
• Gives programmers “control”
Garbage collection: Automatic memory management
• Reduces programmer burden
• Eliminates sources of errors
• Integral to modern object-oriented languages, e.g., Java, C#, .NET
• Now part of mainstream computing
• Challenge:
  – Performance efficiency
Key Issues
• For both explicit and automatic memory management
  – Fast allocation
  – Fast reclamation
  – Low fragmentation (wasted space)
  – How to organize the memory space
• Garbage Collection
  – Discriminating live objects and garbage
GC: How?
• Automatically collect dead objects
• Liveness is approximated by reachability
  – Root sets
• Unreachable ⇒ non-live ⇒ garbage
  – Compare with ‘dead’ in compiler liveness analysis
• ‘Once garbage, always garbage’
  – (Always safe to collect unreachable objects)
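To make the reachability idea concrete, here is a minimal sketch that computes the set of objects reachable from a root set over a toy object graph. The graph, the object names, and the `reachable` helper are illustrative assumptions, not part of the original slides.

```python
from collections import deque

def reachable(roots, edges):
    """Return the set of object ids reachable from the root set.

    roots: object ids held by registers, stacks, or statics (hypothetical names).
    edges: dict mapping an object id to the ids it points to.
    """
    seen = set(roots)
    work = deque(roots)
    while work:
        obj = work.popleft()
        for child in edges.get(obj, ()):
            if child not in seen:
                seen.add(child)
                work.append(child)
    return seen

# Toy heap: A -> B -> C is rooted; D and E only point at each other.
heap = {"A": ["B"], "B": ["C"], "C": [], "D": ["E"], "E": ["D"]}
live = reachable(["A"], heap)
garbage = set(heap) - live   # unreachable, therefore safe to collect
```

D and E form a cycle but are not reachable from any root, so a tracing collector treats both as garbage.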
Liveness: the GC and VM/Compiler Contract
• GC maps: identify what is live
  – Root set
    • Live registers, walk the stack to enumerate stack variables, globals at any potential GC point
  – JIT/compiler generates a map for every program point where a GC may occur
  – Can constrain optimizations (derived pointers)
  – Required for type-accurate GC
VM/Compiler GC contract
• Write/read barriers for generational & incremental collection
  – JIT must insert barriers in generated code
  – Usually inlines barriers
  – Barriers trade off GC and mutator costs
• Cooperative scheduling
  – In many VMs, all mutator threads must be stopped at GC points. One solution requires JITs to inject GC yieldpoints at regular intervals in the generated code
Basic GC Techniques
• Reference Counting
• Tracing
  – Mark-sweep
  – Mark-compact
  – Copying
Tracing/Reference Counting
• ‘Tracing’ (reachability)
  – Trace reachability from program ‘roots’
    • Registers
    • Stacks
    • Statics
  – Objects not traced are unreachable
• A reference count for each object
  – #incoming pointers
  – Incremental
  – Count = 0 (unreachability)
  – Notice removal of last reference to an object
  – Write barrier for implementation
  – Concerns
    • Cycles are an issue
    • Space for reference count
    • Efficiency (deferred reference counting)
[Figure: object graph with a reference count shown on each object]
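A minimal sketch of reference counting with a write barrier, assuming a toy `Obj` class with named pointer fields; the class, the `write` helper, and the `freed` log are hypothetical names introduced for illustration only. It shows both the immediate reclamation when a count reaches zero and the cycle problem listed under the concerns.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0        # number of incoming pointers
        self.fields = {}   # field name -> Obj or None

freed = []  # names of reclaimed objects, in reclamation order

def inc(obj):
    if obj is not None:
        obj.rc += 1

def dec(obj):
    if obj is None:
        return
    obj.rc -= 1
    if obj.rc == 0:
        # Last reference removed: reclaim the object and release its children.
        freed.append(obj.name)
        for child in obj.fields.values():
            dec(child)
        obj.fields.clear()

def write(src, field, new):
    """Write barrier: adjust counts on every pointer store."""
    old = src.fields.get(field)
    inc(new)                 # increment first so storing the same value is safe
    src.fields[field] = new
    dec(old)

root = Obj("root")
root.rc = 1                  # assumed to be pinned by the root set

# Acyclic chain: dropping the root pointer frees a, then b.
a, b = Obj("a"), Obj("b")
write(root, "x", a)
write(a, "next", b)
write(root, "x", None)

# Cycle: c and d keep each other's counts above zero after the root lets go.
c, d = Obj("c"), Obj("d")
write(root, "y", c)
write(c, "next", d)
write(d, "next", c)
write(root, "y", None)
```

After the last store, `c` and `d` are unreachable yet each still has a count of 1, which is why pure reference counting needs a backup mechanism for cycles.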
Mark-Sweep
• How it works
  – Tracing to “mark” live objects
  – Sweep to reclaim dead objects
    • Dead objects linked to free-lists
• Concerns
  – Fragmentation
  – Cost proportional to heap size
  – Locality
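The two phases can be sketched as follows, assuming a toy heap modeled as a Python list of `Cell` objects; `Cell`, `mark`, and `sweep` are illustrative names rather than any particular VM's API.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing pointers
        self.marked = False
        self.free = False    # True once the cell sits on the free list

def mark(roots):
    """Trace from the roots, setting the mark bit on every reachable cell."""
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if not cell.marked:
            cell.marked = True
            stack.extend(cell.refs)

def sweep(heap, free_list):
    """Visit every cell in the heap, so the cost is proportional to heap size."""
    for cell in heap:
        if cell.free:
            continue
        if cell.marked:
            cell.marked = False      # reset for the next collection
        else:
            cell.refs = []
            cell.free = True
            free_list.append(cell)   # dead cell is linked onto the free list

heap = [Cell(n) for n in "ABCDE"]
A, B, C, D, E = heap
A.refs = [C]
C.refs = [A]
B.refs = [D]
D.refs = [B]

free_list = []
mark([A])
sweep(heap, free_list)
```

Live cells A and C stay where they are, so the freed cells B, D, and E leave holes between them. This is the fragmentation concern noted above.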
Mark-Compact
• How it works
  – Tracing to “mark” live (reachable) objects
  – Sliding compaction of marked objects
• Concerns
  – Fragmentation solved
  – Cost
    • Two or more scans of live objects
  – Locality problem ameliorated
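A rough sketch of sliding compaction, assuming marking has already run and that objects are plain dictionaries listed in address order. The field names (`addr`, `size`, `refs`) and the `slide_compact` function are assumptions made for this example. The separate passes correspond to the extra scans listed under cost.

```python
def slide_compact(objects, roots):
    """Slide marked objects toward address 0, preserving their order.

    objects: list of dicts in address order, each with 'name', 'addr',
             'size', 'marked', and 'refs' (a list of addresses).
    roots:   list of addresses of rooted objects.
    Returns (survivors, new_roots, free_pointer).
    """
    # Pass 1: compute a forwarding address for each marked object.
    forward = {}
    free = 0
    for obj in objects:
        if obj["marked"]:
            forward[obj["addr"]] = free
            free += obj["size"]

    # Pass 2: rewrite roots through the forwarding table.
    new_roots = [forward[r] for r in roots]

    # Pass 3: move survivors and rewrite their reference fields.
    survivors = []
    for obj in objects:
        if obj["marked"]:
            survivors.append({
                "name": obj["name"],
                "addr": forward[obj["addr"]],
                "size": obj["size"],
                "marked": False,
                "refs": [forward[r] for r in obj["refs"]],
            })
    return survivors, new_roots, free

objects = [
    {"name": "A", "addr": 0, "size": 2, "marked": True,  "refs": [6]},
    {"name": "B", "addr": 2, "size": 4, "marked": False, "refs": []},
    {"name": "C", "addr": 6, "size": 3, "marked": True,  "refs": [0]},
    {"name": "D", "addr": 9, "size": 1, "marked": False, "refs": []},
]
survivors, new_roots, free = slide_compact(objects, roots=[0])
```

After compaction the live objects are contiguous, and everything from `free` upward is a single free region suitable for bump allocation.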
Copying GC
• Copy reachable objects to a contiguous area
  – e.g., semispace with Cheney scan
[Figure: fromspace, root set, tospace]
Copying Garbage Collection
[Figure: sequence of heap diagrams in which the ‘from space’ and ‘to space’ labels exchange sides]
Cheney Scan
[Figure: root set and objects A to F; successive tospace snapshots with ‘scan’ and ‘free’ pointers as A, B, C, and D are copied]
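The Cheney scan can be sketched as a breadth-first copy in which tospace itself serves as the work queue. In this sketch `len(tospace)` plays the role of the ‘free’ pointer and `scan` is an index into the same list. The `Obj` class and the edges between A to F are assumptions for illustration; the slide's actual edges are not recoverable from the transcript.

```python
class Obj:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields or []   # references to other Obj instances
        self.forward = None          # forwarding pointer, set once copied

def cheney_collect(roots):
    tospace = []                     # len(tospace) acts as the 'free' pointer

    def copy(obj):
        """Copy obj to tospace on first visit; always return the tospace copy."""
        if obj.forward is None:
            clone = Obj(obj.name, list(obj.fields))  # fields still point at fromspace
            obj.forward = clone
            tospace.append(clone)                    # bump 'free'
        return obj.forward

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(tospace):       # objects between scan and free are unscanned
        obj = tospace[scan]
        obj.fields = [copy(f) for f in obj.fields]   # fix up to tospace addresses
        scan += 1
    return new_roots, tospace

# Hypothetical graph: roots -> A, B; A -> C; B -> D, C; C -> A; E -> F is unreachable.
A, B, C, D, E, F = (Obj(n) for n in "ABCDEF")
A.fields = [C]
B.fields = [D, C]
C.fields = [A]
E.fields = [F]

new_roots, tospace = cheney_collect([A, B])
```

No mark stack is needed: the collection is finished when `scan` catches up with `free`. E and F are never visited, so reclaiming them costs nothing.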
Space Management
• Two broad approaches:
  – Copying
    • Bump allocation & en masse reclamation
      – Fast allocation & reclaim
      – Space overhead, copy cost
  – Non-copying
    • Free-list allocation & reclamation
      – Space efficiency
      – Fragmentation
Allocation Choices
• Bump-Pointer
  – Fast (increment & bounds check)
  – Can't incrementally free & reuse: must free en masse
• Free-List
  – Relatively slow (consult list for fit)
  – Can incrementally free & reuse cells
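The contrast between the two allocators can be sketched over a toy address range. Both classes below are illustrative assumptions: the free list uses a simple first-fit search and does no coalescing.

```python
class BumpAllocator:
    def __init__(self, size):
        self.cursor = 0
        self.limit = size

    def alloc(self, n):
        if self.cursor + n > self.limit:   # bounds check
            return None
        addr = self.cursor
        self.cursor += n                   # increment
        return addr

    def reset(self):
        """Free everything en masse; individual cells cannot be returned."""
        self.cursor = 0


class FreeListAllocator:
    def __init__(self, size):
        self.free = [(0, size)]            # list of (address, length) holes

    def alloc(self, n):
        for i, (addr, length) in enumerate(self.free):   # consult list for a fit
            if length >= n:
                if length == n:
                    del self.free[i]
                else:
                    self.free[i] = (addr + n, length - n)
                return addr
        return None

    def release(self, addr, n):
        """Return one cell to the list so it can be reused immediately."""
        self.free.append((addr, n))


bump = BumpAllocator(16)
b1, b2, b3 = bump.alloc(8), bump.alloc(8), bump.alloc(1)   # third request fails
bump.reset()
b4 = bump.alloc(4)

fl = FreeListAllocator(16)
f1, f2, f3 = fl.alloc(8), fl.alloc(8), fl.alloc(1)         # third request fails
fl.release(f1, 8)                                          # free a single cell
f4 = fl.alloc(4)                                           # reuses part of that hole
```

The bump allocator does constant work per request, while the free-list allocator pays for a search but can hand back individual cells without a full reset.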
Allocation Choices
• Bump pointer
  – ~70 bytes of IA32 instructions, 726MB/s
• Free list
  – ~140 bytes of IA32 instructions, 654MB/s
• Bump pointer 11% faster in tight loop
  – < 1% in practical setting
  – No significant difference (?)
• Second order effects?
  – Locality??
  – Collection mechanism??
Generational GC
• Observation
  – Most objects die young
  – A small percentage of objects are long-lived
• Avoid copying long-lived objects several times
• Generational GC segregates objects by age
  – Older objects are collected less often
Generational GC
• Design issues
  – Advancement policies
    • When to advance a live object into the next generation
  – Heap organization
  – Collection scheduling
  – Intergenerational references
    • Remembered sets
    • Page marking, word marking, card marking, store list
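As one way to picture intergenerational reference tracking, here is a small card-marking sketch. The heap layout (a flat slot array with the nursery at the high addresses), the `CARD_SIZE` value, and all method names are assumptions chosen for the example.

```python
CARD_SIZE = 4  # hypothetical number of slots covered by one card

class Heap:
    def __init__(self, nursery_start, size):
        self.nursery_start = nursery_start   # slots at or above this are young
        self.slots = [None] * size           # each slot holds a slot index or None
        self.cards = [False] * ((size + CARD_SIZE - 1) // CARD_SIZE)

    def is_young(self, addr):
        return addr is not None and addr >= self.nursery_start

    def store(self, slot, target):
        """Write barrier: dirty the card of an old slot that now points young."""
        self.slots[slot] = target
        if not self.is_young(slot) and self.is_young(target):
            self.cards[slot // CARD_SIZE] = True

    def remembered_roots(self):
        """Scan only dirty cards for old-to-young pointers (extra nursery roots)."""
        found = []
        for card, dirty in enumerate(self.cards):
            if dirty:
                start = card * CARD_SIZE
                for slot in range(start, min(start + CARD_SIZE, len(self.slots))):
                    if not self.is_young(slot) and self.is_young(self.slots[slot]):
                        found.append(slot)
        return found

heap = Heap(nursery_start=8, size=16)
heap.store(1, 10)    # old -> young: card 0 becomes dirty
heap.store(5, 2)     # old -> old: ignored
heap.store(9, 12)    # young -> young: ignored
roots_before = heap.remembered_roots()

heap.store(1, 3)     # overwrite with an old target; the card stays dirty
roots_after = heap.remembered_roots()
```

Card marking keeps the barrier cheap at the price of precision: the card stays dirty after the pointer is overwritten, so the collector rescans it and finds nothing.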
Incremental/Concurrent Approaches
• For real-time and interactive applications
  – Guarantee pause time
  – Can generational collection work?
• Techniques
  – Reference counting
  – Tracing
    • Concurrency
Incremental/Concurrent Approaches
• Approaches to coordinate the collector and the mutator
  – Tri-color marking
    • Black, gray, white
  – Mutator cannot install a pointer from black to white
    • Read barrier
      – Color a white object gray before the mutator accesses it
    • Write barrier
      – Trap when a pointer is written into an object
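A minimal sketch of tri-color marking done in bounded steps, with a write barrier that prevents black-to-white pointers by shading the target gray. `Node`, `IncrementalMarker`, and the tiny graph are hypothetical names for illustration; this is one possible barrier, not the only one.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

class Node:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.color = WHITE

class IncrementalMarker:
    def __init__(self, roots):
        self.gray = []               # worklist of gray objects
        for r in roots:
            self.shade(r)

    def shade(self, node):
        if node.color == WHITE:
            node.color = GRAY
            self.gray.append(node)

    def step(self):
        """Do one bounded unit of marking; return False when nothing is gray."""
        if not self.gray:
            return False
        node = self.gray.pop()
        for child in node.refs:
            self.shade(child)
        node.color = BLACK
        return True

    def write(self, src, target):
        """Write barrier: never leave a pointer from a black to a white object."""
        src.refs.append(target)
        if src.color == BLACK:
            self.shade(target)

a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.refs = [b]

marker = IncrementalMarker([a])
marker.step()            # a becomes black, b becomes gray
marker.write(a, c)       # mutator stores a white object into a black one
while marker.step():     # collector finishes in further increments
    pass
```

Without the shading in `write`, `c` would stay white and be reclaimed even though the black object `a` points to it.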
Write Barrier Algorithms
• Snapshot-at-beginning
  – No objects ever become inaccessible to the GC while collection is in progress
    • Yuasa’s: An overwritten value is first saved for later examination
    • Objects are allocated black
• Incremental update
  – Objects that die during GC before being reached by the GC traversal are not traversed and marked
  – Objects are allocated white
  – Iteratively traverse black objects that have had pointers stored into them
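The snapshot-at-beginning idea can be sketched with a deletion barrier that shades the overwritten value before the store takes effect. The classes and names below are assumptions for illustration, not a faithful rendering of any published implementation.

```python
WHITE, GRAY, BLACK = 0, 1, 2

class Node:
    def __init__(self, name, color=WHITE):
        self.name = name
        self.fields = {}     # field name -> Node or None
        self.color = color

class SnapshotMarker:
    def __init__(self, roots):
        self.gray = []
        for r in roots:
            self.shade(r)

    def shade(self, node):
        if node is not None and node.color == WHITE:
            node.color = GRAY
            self.gray.append(node)

    def write(self, src, field, new):
        """Deletion barrier: save the overwritten value for later examination."""
        self.shade(src.fields.get(field))
        src.fields[field] = new

    def allocate(self, name):
        return Node(name, BLACK)     # new objects are allocated black

    def finish(self):
        while self.gray:
            node = self.gray.pop()
            for child in node.fields.values():
                self.shade(child)
            node.color = BLACK

a, b, c = Node("a"), Node("b"), Node("c")
a.fields["x"] = b
b.fields["y"] = c

marker = SnapshotMarker([a])         # snapshot taken: a -> b -> c is reachable
marker.write(b, "y", None)           # mutator cuts the only path to c mid-collection
n = marker.allocate("n")
marker.finish()
```

Here `c` is marked even though it became unreachable during the collection. It survives as floating garbage until the next cycle, which is the price of the snapshot guarantee.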
Baker’s Read Barrier Algorithms
• Incremental copying with Cheney scan
• Any fromspace object accessed by the mutator is copied to tospace
  – Use read barrier
• New objects allocated in tospace
  – Black
Taxonomy of Sorts, or: Key Design Dimensions
• Incrementality
• Composability
• Concurrency
• Parallelism
• Distribution
Incrementality
• ‘Full heap’ tracing:
  – ‘Pause time’ goes up with heap size
• Incremental tracing:
  – Bounded tracing time
  – Conservative assumption:
    • All objects in rest of heap are live
  – Remember pointers from rest of heap
    • Add ‘remembered set’ to roots for tracing
Composability
• Hybrids
  – Copy younger objects
  – Non-copying collection of older objects
• Hierarchies
  – Copying intra-partition (increment)
  – Reference counting inter-partition
Concurrency
• ‘Mutator’ and GC operate concurrently
Parallelism
• Concurrency among multiple GCs
  – Load balancing
  – Race conditions when tracing
  – Synchronization
Distribution
• Typically implies:
  – Incrementality
  – Concurrency
  – Parallelism
  – Composability
• Detecting termination
  – When has a partition become isolated?
GC Performance
• Three key dimensions:
  – Throughput (bandwidth)
  – Responsiveness (latency)
  – Space
• Measurement issues:
  – Selecting benchmarks
  – Understanding the space-time tradeoff