Knowing your Python Garbage Collector

Post on 15-Jun-2015

This talk explains CPython's reference counting and PyPy's incremental minimark (incminimark) algorithm.

Transcript of Knowing your Python Garbage Collector

Knowing your garbage collector

Francisco Fernandez Castano

Rushmore.fm

francisco.fernandez.castano@gmail.com @fcofdezc

September 27, 2014

Francisco Fernandez Castano (@fcofdezc) Python GC September 27, 2014 1 / 30

Overview

1 Introduction: Motivation, Concepts

2 Algorithms: CPython RC, PyPy

Motivation

Managing memory manually is hard.

Who owns the memory?

Should I free these resources?

What happens with double frees?

Dangling pointers

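int *func(void)
{
    int num = 1234;   /* local variable: lives in func's stack frame */
    /* ... */
    return &num;      /* dangling: num no longer exists once func returns */
}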

John McCarthy

Basic concepts

Heap

A data structure in which objects may be allocated or deallocated in any order.

Mutator

The part of a running program which executes application code.

Collector

The part of a running program responsible for garbage collection.

Garbage collection

Definition

Garbage collection is automatic memory management. While the mutator runs, it routinely allocates memory from the heap. If more memory than available is needed, the collector reclaims unused memory and returns it to the heap.

CPython GC

The CPython implementation has garbage collection.

Its main GC algorithm is reference counting, complemented by a cycle detector.

The cycle detector is a generational GC.

Young objects

[elem * 2 for elem in elements]

balance = (a / b / c) * 4

'asdadsasd -xxx'.replace('x', 'y').replace('a', 'b')

foo.bar()
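
Expressions like the ones above create temporaries that die almost immediately. In CPython most of them are reclaimed by reference counting as soon as their count reaches zero; the generational cycle detector only tracks container objects. As a rough sketch, the standard `gc` module exposes the generation thresholds and counters (the variable names are only for illustration):

```python
import gc

# Thresholds that trigger a collection of each generation (youngest first).
thresholds = gc.get_threshold()
# Current counters for each generation.
counts = gc.get_count()

print("thresholds:", thresholds)
print("counts:    ", counts)

# Collect only the youngest generation; the return value is the number
# of unreachable objects the collector found.
found = gc.collect(0)
print("unreachable objects found in generation 0:", found)
```

The exact numbers depend on the interpreter version and on what has already run, so no output is shown here.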

PyObject

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;          /* reference count */
    struct _typeobject *ob_type;   /* pointer to the object's type */
} PyObject;
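
The `ob_refcnt` field can be observed from Python with `sys.getrefcount`. A minimal sketch, assuming CPython; the absolute values vary between versions (the call itself may add a temporary reference), so only the differences are meaningful:

```python
import sys

data = []                          # one name bound to a new list
before = sys.getrefcount(data)

alias = data                       # a second name bound to the same object
after = sys.getrefcount(data)      # one more reference than before

del alias                          # dropping the name decrements the count
restored = sys.getrefcount(data)   # back to the initial value

print(before, after, restored)
```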

PyTypeObject

typedef struct _typeobject {
    PyObject_VAR_HEAD
    const char *tp_name;
    Py_ssize_t tp_basicsize, tp_itemsize;
    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    void *tp_reserved;
    /* ... */
} PyTypeObject;

Reference Counting Algorithm
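
The diagrams for these slides did not survive in the transcript. As a stand-in, here is a toy model of the algorithm in Python. `Cell`, `incref`, `decref` and `add_reference` are hypothetical names for this sketch, not CPython's implementation:

```python
class Cell:
    """A toy heap object with an explicit reference count."""
    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.children = []   # outgoing references to other cells

freed = []  # records the order in which cells are reclaimed

def incref(cell):
    cell.refcount += 1

def decref(cell):
    cell.refcount -= 1
    if cell.refcount == 0:
        # Nothing points to the cell any more: release what it
        # references, then reclaim the cell itself.
        for child in cell.children:
            decref(child)
        cell.children.clear()
        freed.append(cell.name)

def add_reference(parent, child):
    parent.children.append(child)
    incref(child)

a, b = Cell("a"), Cell("b")
incref(a)             # a root (e.g. a local variable) points to a
add_reference(a, b)   # a -> b
decref(a)             # the root goes away: a is freed, which frees b
print(freed)          # ['b', 'a']
```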

Cycles
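
Two objects that reference each other never reach a count of zero, so reference counting alone cannot free them. A small sketch of this on CPython, using a weak reference as a probe (`Node` and `probe` are illustrative names):

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

was_enabled = gc.isenabled()
gc.disable()              # keep the automatic cycle detector out of the demo
gc.collect()              # start from a clean state

a, b = Node(), Node()
a.other, b.other = b, a   # a <-> b form a reference cycle
probe = weakref.ref(a)    # a weak reference does not keep the object alive

del a, b                  # no names left, but each object still holds the other
alive_after_del = probe() is not None

gc.collect()              # the cycle detector finds and frees the pair
alive_after_collect = probe() is not None

if was_enabled:
    gc.enable()

print(alive_after_del, alive_after_collect)   # True False
```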

Demo

Reference counting

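Pros: It is incremental: memory is freed as the program runs, as soon as an object's count drops to zero.

Cons: Reference cycles never reach a count of zero, so detecting them needs extra machinery.

Cons: Size overhead on objects: every object carries a reference count field.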
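
The "incremental" point can be seen directly on CPython: an object is reclaimed at the exact statement that drops its last reference. A minimal sketch, assuming CPython (other implementations, such as PyPy, may reclaim the object later); `Resource` and `events` are illustrative names:

```python
import weakref

class Resource:
    pass

events = []
r = Resource()
# Register a callback that runs when the object is reclaimed.
weakref.finalize(r, events.append, "freed")

events.append("before del")
del r                        # last reference gone: reclaimed right here
events.append("after del")

print(events)   # on CPython: ['before del', 'freed', 'after del']
```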

PyPy

Mark and Sweep Algorithm
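
The diagrams for these slides are also missing from the transcript. The basic algorithm can be sketched in Python as follows. This is a toy model with hypothetical names (`Obj`, `mark`, `sweep`), not PyPy's actual collector:

```python
class Obj:
    """A toy heap object."""
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references
        self.marked = False

def mark(roots):
    # Mark phase: flag everything reachable from the roots.
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

def sweep(heap):
    # Sweep phase: keep marked objects, drop the rest,
    # and reset the marks for the next collection.
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False
    return survivors

a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)          # a -> b, reachable from the root
c.refs.append(d)          # c <-> d is an unreachable cycle
d.refs.append(c)
heap = [a, b, c, d]

mark(roots=[a])
heap = sweep(heap)
print([obj.name for obj in heap])   # ['a', 'b']
```

The unreachable cycle between `c` and `d` is dropped without any special handling, because reachability is computed from the roots rather than from per-object counts.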

Mark and sweep

Pros: Can collect cycles.

Cons: A basic implementation stops the world: the mutator is paused while the collector runs.
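
One common way to shorten the pauses is to mark incrementally, a few objects at a time, with a write barrier so the mutator cannot hide objects from the collector. The following is a generic tri-color sketch under that assumption. It is not taken from the talk and is not PyPy's implementation; `IncrementalMarker`, `shade` and `write_barrier` are hypothetical names:

```python
from collections import deque

class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []
        # white = not yet seen, gray = seen, black = fully scanned
        self.color = "white"

class IncrementalMarker:
    def __init__(self, roots):
        self.gray = deque()
        for root in roots:
            self.shade(root)

    def shade(self, obj):
        if obj.color == "white":
            obj.color = "gray"
            self.gray.append(obj)

    def step(self, budget):
        """Scan at most `budget` gray objects, then hand control back."""
        while budget > 0 and self.gray:
            obj = self.gray.popleft()
            for child in obj.refs:
                self.shade(child)
            obj.color = "black"
            budget -= 1
        return not self.gray   # True once marking is complete

    def write_barrier(self, parent, child):
        """Used by the mutator to store a reference while marking is running."""
        parent.refs.append(child)
        if parent.color == "black":
            self.shade(child)  # invariant: black never points to white

a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)
marker = IncrementalMarker(roots=[a])

done = marker.step(budget=1)   # scans a only; b is now gray
marker.write_barrier(a, c)     # mutator stores a new reference into black a
while not done:
    done = marker.step(budget=1)

print(a.color, b.color, c.color)   # black black black
```

Without the barrier, `c` would stay white after being stored into the already scanned `a`, and a later sweep would wrongly free it.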

Questions?

The End
