
CS216: Program and Data Representation
University of Virginia Computer Science
Spring 2006, David Evans

Lecture 12: Automating Memory Management

http://www.cs.virginia.edu/cs216

Garbage Collectors, COAX, Seoul, 18 June 2002

Menu
• Stack and Heap
• Mark and Sweep
• Stop and Copy
• Reference Counting
• Java’s Garbage Collector
• Python’s Solution

Stack and Heap Review

public class Strings {
    public static void test () {
        StringBuffer sb = new StringBuffer ("hello");
    }

    static public void main (String args[]) {
        test ();
        test ();
    }
}

[Diagram: Stack | Heap. sb on the stack refers to a java.lang.StringBuffer holding "hello" on the heap. Labels shown on the slide: A, B, 1, 2, 3.]

When do the stack and heap look like this?

Stack and Heap Review

[Same code and diagram as the previous slide, showing only the labels B and 2.]

Garbage Heap

public class Strings {
    public static void test () {
        StringBuffer sb = new StringBuffer ("hello");
    }

    static public void main (String args[]) {
        while (true)
            test ();
    }
}

[Diagram: Stack | Heap. The heap fills up with “hello” objects, one for each call to test ().]

Garbage Collection
• System needs to reclaim storage on the heap used by garbage objects
• How can it identify garbage objects?
• How come we don’t need to garbage collect the stack?


Mark and Sweep

Mark and Sweep
• John McCarthy, 1960 (first LISP implementation)
• Start with a set of root references
• Mark every object you can reach from those references
• Sweep up the unmarked objects

What are the root references?
References on the stack.

public class Phylogeny {
    static public void main (String args[]) {
        SpeciesSet ss = new SpeciesSet ();
        … (open file for reading)
        while (…not end of file…) {
            Species current = new Species (…name from file…, …genome from file…);
            ss.insert (current);
        }
    }
}

public class SpeciesSet {
    private Vector els;

    public void insert (/*@non_null@*/ Species s) {
        if (getIndex (s) < 0) els.add (s);
    }
}

[Diagram: Stack | Heap for the Phylogeny program. Stack, bottom to top: the Phylogeny.main frame (args: String[], ss: SpeciesSet, current: Species), then the SpeciesSet.insert frame (this: SpeciesSet, s: Species). Heap: the string “in.spc”, the SpeciesSet with its els vector, and four Species objects with name and genome fields: “Duck”/“CATAG” (labeled root: Species), “Goat”/“CAGTG”, “Elf”/“CGGTG”, “Frog”/“CGATG”.]

After els.add (s)…

[Diagram: the same stack and heap, with both the Phylogeny.main and SpeciesSet.insert frames still on the stack.]

SpeciesSet.insert returns…

[Diagram: the same stack and heap; the SpeciesSet.insert frame (this, s) is at the top of the stack.]

Finish while loop…

[Diagram: the stack now holds only the Phylogeny.main frame (args, ss, current); the heap objects are unchanged.]

Garbage Collection

[Diagram: the stack holds only the Phylogeny.main frame (args, ss); current is no longer on the stack. The heap still holds “in.spc”, the els vector, and the four Species objects.]

Garbage Collection

[Diagram: the same stack and heap as on the previous slide.]

Initialize Mark and Sweeper:
    active = all objects on stack

Garbage Collection

[Diagram: the same stack and heap, stepped through as the marking loop runs.]

active = all objects on stack
while (!active.isEmpty ())
    newactive = { }
    foreach (Object a in active)
        mark a as reachable (non-garbage)
        foreach (Object o that a points to)
            if o is not marked
                newactive = newactive U { o }
    active = newactive


Garbage Collection

[Diagram: the same stack and heap, after marking has finished.]

active = all objects on stack
while (!active.isEmpty ())
    newactive = { }
    foreach (Object a in active)
        mark a as reachable
        foreach (Object o that a points to)
            if o is not marked
                newactive = newactive U { o }
    active = newactive
sweep ()   // remove unmarked objects on heap

After main returns…

[Diagram: the same stack and heap as on the previous slide.]

Garbage Collection

[Diagram: the stack is now empty; the Phylogeny.main frame and its args and ss entries are gone. The heap still holds “in.spc”, the els vector, and the four Species objects. With no references left on the stack, nothing gets marked, so sweep () removes every object on the heap.]
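The marking loop and sweep () from the slides can be sketched in Java as follows. This is only a toy model, not how a real JVM collector is written: Node, MarkSweepDemo, and the explicit heap list are hypothetical names introduced for illustration, and "reclaiming" an object just means removing it from that list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy heap object: a name, a mark bit, and outgoing references.
class Node {
    final String name;
    boolean marked = false;
    final List<Node> refs = new ArrayList<>();

    Node(String name) { this.name = name; }
}

class MarkSweepDemo {
    // Mark phase: flag every node reachable from the root references.
    static void mark(List<Node> roots) {
        Deque<Node> active = new ArrayDeque<>(roots);
        while (!active.isEmpty()) {
            Node a = active.pop();
            if (a.marked) continue;          // already visited (handles cycles)
            a.marked = true;
            for (Node o : a.refs) {
                if (!o.marked) active.push(o);
            }
        }
    }

    // Sweep phase: drop unmarked nodes and clear the marks on survivors.
    // Returns the number of nodes reclaimed.
    static int sweep(List<Node> heap) {
        int reclaimed = 0;
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;            // reset for the next collection
            } else {
                it.remove();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    static int collect(List<Node> roots, List<Node> heap) {
        mark(roots);
        return sweep(heap);
    }

    public static void main(String[] args) {
        Node ss = new Node("ss");
        Node duck = new Node("Duck");
        Node temp = new Node("temp");        // nothing refers to this one
        ss.refs.add(duck);
        List<Node> heap = new ArrayList<>(Arrays.asList(ss, duck, temp));
        int reclaimed = collect(Arrays.asList(ss), heap);
        System.out.println("reclaimed " + reclaimed + ", live " + heap.size());
    }
}
```

The sketch uses a single work stack rather than the active/newactive pair on the slide; both visit the same set of reachable objects.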

Problems with Mark and Sweep
• Fragmentation: free space and live objects will be mixed
  – Harder to allocate space for new objects
  – Poor locality means bad memory performance
    • Caches make it quick to load nearby memory
• Multiple Threads
  – One stack per thread, one heap shared by all threads
  – All threads must stop for garbage collection

Stop and Copy
• Solves fragmentation problem
• Copy all reachable objects to a new memory area
• After copying, reclaim the whole old heap
• Disadvantages:
  – More complicated: need to change stack and internal object pointers to new heap
  – Need to reserve enough memory to copy into
  – Expensive if most objects are not garbage
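The pointer-updating step is the part that makes stop and copy more complicated, so a small Java sketch may help. It is a toy model under stated assumptions: Cell, StopAndCopyDemo, and the toSpace list are hypothetical, "copying" means creating a new Cell, and a forwarding table records where each old object went so that roots and internal references can be redirected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy heap object with outgoing references.
class Cell {
    final String name;
    final List<Cell> refs = new ArrayList<>();

    Cell(String name) { this.name = name; }
}

class StopAndCopyDemo {
    // Copies everything reachable from the roots into toSpace and returns
    // the relocated roots. Whatever was not copied is garbage, so the old
    // space can be reclaimed as a whole.
    static List<Cell> collect(List<Cell> roots, List<Cell> toSpace) {
        Map<Cell, Cell> forwarding = new IdentityHashMap<>();
        List<Cell> newRoots = new ArrayList<>();
        for (Cell r : roots) {
            newRoots.add(copy(r, forwarding, toSpace));
        }
        // Scan the copies in order; toSpace grows while we walk it.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Cell c = toSpace.get(scan);
            for (int i = 0; i < c.refs.size(); i++) {
                // Redirect each reference from the old object to its copy.
                c.refs.set(i, copy(c.refs.get(i), forwarding, toSpace));
            }
        }
        return newRoots;
    }

    // Copies one object unless it was already moved; returns its new location.
    private static Cell copy(Cell old, Map<Cell, Cell> forwarding, List<Cell> toSpace) {
        Cell moved = forwarding.get(old);
        if (moved == null) {
            moved = new Cell(old.name);
            moved.refs.addAll(old.refs);     // still old pointers; fixed by the scan
            forwarding.put(old, moved);
            toSpace.add(moved);
        }
        return moved;
    }

    public static void main(String[] args) {
        Cell a = new Cell("a");
        Cell b = new Cell("b");
        Cell junk = new Cell("junk");        // unreachable, so never copied
        a.refs.add(b);
        b.refs.add(a);
        List<Cell> toSpace = new ArrayList<>();
        List<Cell> newRoots = collect(Arrays.asList(a), toSpace);
        System.out.println("copied " + toSpace.size() + " objects, root is " + newRoots.get(0).name);
    }
}
```

Note that the work done is proportional to the number of live objects, which is why the approach is expensive when most objects are not garbage.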

Generational Collectors
• Observation:
  – Many objects are short-lived
    • Temporary objects that get garbage collected right away
  – Other objects are long-lived
    • Data that lives for the duration of execution
• Separate storage into regions
  – Short term: collect frequently
  – Long term: collect infrequently
• Stop and copy, but move copies into longer-lived areas
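A rough Java sketch of the generational idea follows. Everything here is an assumption made for illustration: GenObj, GenerationalDemo, and the PROMOTE_AGE threshold are hypothetical, moving an object between two lists stands in for copying it to a longer-lived region, and every old object is treated as a root during a minor collection (real collectors track old-to-young references more cheaply).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy object that remembers how many collections it survived.
class GenObj {
    final String name;
    int age = 0;
    final List<GenObj> refs = new ArrayList<>();

    GenObj(String name) { this.name = name; }
}

class GenerationalDemo {
    static final int PROMOTE_AGE = 2;               // assumed promotion threshold
    List<GenObj> young = new ArrayList<>();         // short term: collected frequently
    final List<GenObj> old = new ArrayList<>();     // long term: collected infrequently (not shown)

    GenObj allocate(String name) {
        GenObj o = new GenObj(name);
        young.add(o);
        return o;
    }

    // Minor collection: reclaim unreachable young objects, age the survivors,
    // and promote those that have survived PROMOTE_AGE collections.
    // Returns the number of young objects reclaimed.
    int minorCollect(List<GenObj> roots) {
        Set<GenObj> live = Collections.newSetFromMap(new IdentityHashMap<GenObj, Boolean>());
        Deque<GenObj> work = new ArrayDeque<>(roots);
        work.addAll(old);                           // simplification: old objects act as roots
        while (!work.isEmpty()) {
            GenObj o = work.pop();
            if (live.add(o)) work.addAll(o.refs);   // first visit: follow its references
        }
        List<GenObj> survivors = new ArrayList<>();
        int reclaimed = 0;
        for (GenObj o : young) {
            if (!live.contains(o)) {
                reclaimed++;
            } else if (++o.age >= PROMOTE_AGE) {
                old.add(o);                         // move into the longer-lived area
            } else {
                survivors.add(o);
            }
        }
        young = survivors;
        return reclaimed;
    }

    public static void main(String[] args) {
        GenerationalDemo gc = new GenerationalDemo();
        GenObj keep = gc.allocate("keep");
        gc.allocate("tmp1");
        int first = gc.minorCollect(Arrays.asList(keep));
        gc.allocate("tmp2");
        int second = gc.minorCollect(Arrays.asList(keep));
        System.out.println("reclaimed " + first + " then " + second
            + "; young=" + gc.young.size() + " old=" + gc.old.size());
    }
}
```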

Java’s Garbage Collector (Sun implementation)
• Mark and Sweep collector
• Before collecting an object, it will call finalize:
      protected void finalize() throws Throwable
• Can call garbage collector directly: System.gc ()

Reference Counting

What if each object kept track of the number of references to it?

If an object has zero references, it is garbage!

Reference Counting

T x = new T ();
    The x object has one reference
y = x;
    The object x references has 1 more ref
    The object y previously referenced (y_pre) has 1 less ref
}   Leave scope where x is declared
    The object x references has 1 less ref
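The three bookkeeping rules above can be sketched in Java. Java does not expose reference counts, so this models them by hand: Counted, Ref, and RefCountDemo are hypothetical names, a Ref stands in for a variable such as x or y, and setting freed = true stands in for reclaiming the storage.

```java
// Hypothetical object header carrying an explicit reference count.
class Counted {
    final String name;
    int refs = 0;
    boolean freed = false;

    Counted(String name) { this.name = name; }
}

// Hypothetical variable slot; every store goes through assign ().
class Ref {
    private Counted target;

    // Models "y = x": the new target gains a reference and the object
    // previously referenced loses one.
    void assign(Counted newTarget) {
        if (newTarget != null) newTarget.refs++;   // increment first so self-assignment is safe
        release(target);
        target = newTarget;
    }

    // Leaving the scope where the variable is declared drops its reference.
    void leaveScope() { assign(null); }

    Counted get() { return target; }

    private static void release(Counted c) {
        if (c != null && --c.refs == 0) {
            c.freed = true;                        // zero references: it is garbage
        }
    }
}

class RefCountDemo {
    public static void main(String[] args) {
        Counted a = new Counted("a");
        Counted b = new Counted("b");
        Ref x = new Ref();
        Ref y = new Ref();
        x.assign(a);           // T x = new T (): a has one reference
        y.assign(b);           // y starts out referring to b
        y.assign(x.get());     // y = x: a has 1 more ref, b has 1 less ref
        x.leaveScope();        // a has 1 less ref, but y still refers to it
        System.out.println(a.name + " refs=" + a.refs + " freed=" + a.freed);
        System.out.println(b.name + " refs=" + b.refs + " freed=" + b.freed);
    }
}
```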

Reference Counting

import java.util.Vector;

class Recycle {
    private String name;
    private Vector pals;

    public Recycle (String name) {
        this.name = name;
        pals = new Vector ();
    }

    public void addPal (Recycle r) {
        pals.addElement (r);
    }
}

public class Garbage {
    static public void main (String args[]) {
        Recycle alice = new Recycle ("alice");
        Recycle bob = new Recycle ("bob");
        bob.addPal (alice);
        alice = new Recycle ("coleen");
        bob = new Recycle ("dave");
    }
}

[Diagram: an “Alice” object and a “Bob” object, each with name, pals, and refs: 1. After bob.addPal (alice), Alice’s refs becomes 2.]

Reference Counting

[Same code as the previous slide. Diagram: after alice = new Recycle ("coleen"), a new “Coleen” object appears with refs: 1. Alice’s count was 2 before this assignment; Bob’s is still 1.]

Reference Counting

[Same code as the previous slides. Diagram: after bob = new Recycle ("dave"), Bob’s refs goes from 1 to 0. A second count on the slide also shows 0.]

Python’s Implementation
• Automatic reference counting to manage objects
• ...but optional additional garbage collector (we’ll see why soon)

app1

static int
app1(PyListObject *self, PyObject *v)
{
    Py_ssize_t n = PyList_GET_SIZE(self);

    assert (v != NULL);
    if (n == INT_MAX) {
        PyErr_SetString(PyExc_OverflowError,
            "cannot add more objects to list");
        return -1;
    }

    if (list_resize(self, n+1) == -1)
        return -1;

    Py_INCREF(v);
    PyList_SET_ITEM(self, n, v);
    return 0;
}

Python Objects

#define _Py_NewReference(op) ( \
    (op)->ob_refcnt = 1)

#define Py_INCREF(op) ( \
    (op)->ob_refcnt++)

#define Py_DECREF(op) \
    if (--(op)->ob_refcnt != 0) \
        _Py_CHECK_REFCNT(op) \
    else \
        _Py_Dealloc((PyObject *)(op))

#define _Py_CHECK_REFCNT(OP) \
    { if ((OP)->ob_refcnt < 0) \
        _Py_NegativeRefcount(__FILE__, __LINE__, \
            (PyObject *)(OP)); }

Note: I have removed some debugging macros from these.

list_dealloc

static void
list_dealloc(PyListObject *op)
{
    Py_ssize_t i;
    PyObject_GC_UnTrack(op);
    Py_TRASHCAN_SAFE_BEGIN(op)
    if (op->ob_item != NULL) {
        /* Do it backwards, for Christian Tismer.
           There's a simple test case where somehow this reduces
           thrashing when a *very* large list is created and
           immediately deleted. */
        i = op->ob_size;
        while (--i >= 0) {
            Py_XDECREF(op->ob_item[i]);
        }
        PyMem_FREE(op->ob_item);
    }
    if (num_free_lists < MAXFREELISTS && PyList_CheckExact(op))
        free_lists[num_free_lists++] = op;
    else
        op->ob_type->tp_free((PyObject *)op);
    Py_TRASHCAN_SAFE_END(op)
}

Can reference counting ever fail to reclaim unreachable storage?

Circular References

(class Recycle as on the earlier Reference Counting slides)

public class Garbage {
    static public void main (String args[]) {
        Recycle alice = new Recycle ("alice");
        Recycle bob = new Recycle ("bob");
        bob.addPal (alice);
        alice.addPal (bob);
        alice = null;
        bob = null;
    }
}

[Diagram: “Alice” and “Bob” objects, each with name, pals, and refs. Each count starts at 1 and rises to 2 once the two objects refer to each other.]
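To see why the counts never reach zero, the example can be replayed with explicit counters. This is a hand-rolled model, since Java itself does not reference count: Pal and CycleLeakDemo are hypothetical names, and the increments and decrements for the local variables alice and bob are written out manually.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Recycle with an explicit reference count.
class Pal {
    final String name;
    int refs = 0;
    final List<Pal> pals = new ArrayList<>();

    Pal(String name) { this.name = name; }

    void addPal(Pal p) {
        pals.add(p);
        p.refs++;              // the pals list now holds a reference to p
    }
}

class CycleLeakDemo {
    // Replays the Garbage.main steps and returns the final counts
    // for alice and bob, in that order.
    static int[] run() {
        Pal alice = new Pal("alice");
        alice.refs++;          // local variable alice refers to it
        Pal bob = new Pal("bob");
        bob.refs++;            // local variable bob refers to it
        bob.addPal(alice);     // alice: 2
        alice.addPal(bob);     // bob: 2
        alice.refs--;          // alice = null
        bob.refs--;            // bob = null
        return new int[] { alice.refs, bob.refs };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("alice refs=" + counts[0] + ", bob refs=" + counts[1]);
    }
}
```

Both objects are unreachable from the stack at the end, yet each still has a count of 1 because of the other's pals list, so a pure reference counter would never reclaim them.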

Reference Counting Summary
• Advantages
  – Can clean up garbage right away when the last reference is lost
  – No need to stop other threads!
• Disadvantages
  – Need to store and maintain reference count
  – Some garbage is left to fester (circular references)
  – Memory fragmentation

Python has both!
• Reference counting:
  – To quickly reclaim most storage
• Garbage collector (optional, but on by default):
  – To collect circular references
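One way a backup collector can find circular garbage on top of reference counts is sketched below in Java. It is loosely modelled on the idea behind CPython's cycle detector (subtract the references that come from inside the tracked set, then keep whatever is still referenced from outside), but it is a simplified illustration, not CPython's code: PyObj, CycleCollectorDemo, and findCyclicGarbage are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical reference-counted object.
class PyObj {
    final String name;
    int refcnt;                                  // references from anywhere
    final List<PyObj> refs = new ArrayList<>();

    PyObj(String name, int externalRefs) {
        this.name = name;
        this.refcnt = externalRefs;              // e.g. references from variables
    }

    void pointTo(PyObj o) {
        refs.add(o);
        o.refcnt++;
    }
}

class CycleCollectorDemo {
    // Returns the tracked objects that cannot be reached from outside the tracked set.
    static List<PyObj> findCyclicGarbage(List<PyObj> tracked) {
        // 1. Start from a copy of each reference count.
        Map<PyObj, Integer> external = new IdentityHashMap<>();
        for (PyObj o : tracked) external.put(o, o.refcnt);
        // 2. Subtract references that originate inside the tracked set.
        for (PyObj o : tracked) {
            for (PyObj r : o.refs) {
                if (external.containsKey(r)) external.put(r, external.get(r) - 1);
            }
        }
        // 3. Anything with a positive remainder is referenced from outside;
        //    it and everything it reaches is live.
        Set<PyObj> live = Collections.newSetFromMap(new IdentityHashMap<PyObj, Boolean>());
        Deque<PyObj> work = new ArrayDeque<>();
        for (PyObj o : tracked) {
            if (external.get(o) > 0) work.push(o);
        }
        while (!work.isEmpty()) {
            PyObj o = work.pop();
            if (!live.add(o)) continue;
            for (PyObj r : o.refs) {
                if (external.containsKey(r)) work.push(r);
            }
        }
        // 4. The rest is garbage that reference counting alone missed.
        List<PyObj> garbage = new ArrayList<>();
        for (PyObj o : tracked) {
            if (!live.contains(o)) garbage.add(o);
        }
        return garbage;
    }

    public static void main(String[] args) {
        PyObj alice = new PyObj("alice", 0);     // no variable refers to it any more
        PyObj bob = new PyObj("bob", 0);
        alice.pointTo(bob);
        bob.pointTo(alice);                      // cycle keeps both counts at 1
        PyObj carol = new PyObj("carol", 1);     // still referenced by a variable
        PyObj dave = new PyObj("dave", 0);
        carol.pointTo(dave);
        for (PyObj g : findCyclicGarbage(Arrays.asList(alice, bob, carol, dave))) {
            System.out.println("garbage: " + g.name);
        }
    }
}
```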

Charge
• In Java: be happy you have a garbage collector to clean up for you
• In Python: be happy you have a reference counter managing your objects (but worry it might miss things!)
• In C:
  – Why is it hard to write a garbage collector for C?
• Return Exam 1