
Kernel Synchronization

Examples From the Linux Kernel

Michael E. Locasto

BIG PICTURE: HOW CAN THE KERNEL CORRECTLY SERVICE REQUESTS?

kernel control flow is a complicated, asynchronous interleaving

Main Ideas / Concepts

Atomic operations in x86

Kernel locking / synchronization primitives

Kernel preemption

Read-Copy-Update

The “big kernel lock”

Kernel Preemption

Kernel preemption means the kernel can preempt a running kernel control path, whether that path runs on behalf of a user process or a kernel thread.

Acquiring a spinlock automatically disables kernel preemption (as we will see in the code)

Synchronization Primitives

Atomic operations

Disable interrupts (cli/sti modify IF of eflags)

Lock memory bus (x86 lock prefix)

Spin locks

Semaphores

Sequence Locks

Read-copy-update (RCU) (lock free)

Barriers

Barriers are serializing operations; they “gather” and make operations sequential.

Memory barrier:

x86 in/out on I/O ports

x86 lock prefix

x86 writes to CReg, SReg/eflags, DReg

x86 instr    meaning
lfence       read barrier
sfence       write barrier
mfence       r/w barrier
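As a user-level analogue of these barriers, here is a minimal sketch using C11 fences from <stdatomic.h> rather than the kernel's own barrier macros. The names payload, ready, publish, and consume are hypothetical. The release fence plays roughly the role of a write barrier and the acquire fence that of a read barrier. On x86 the compiler may not need to emit any fence instruction for them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;                 /* plain data, ordered by the flag protocol */
static atomic_bool ready = false;   /* hypothetical "data is published" flag */

/* Writer: store the data, then order that store before raising the flag. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);   /* write-side barrier */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: returns true and fills *out only if the flag has been raised. */
bool consume(int *out)
{
    assert(out);
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);   /* read-side barrier */
    *out = payload;
    return true;
}
```

Without the two fences, a reader on a weakly ordered machine could see the flag set and still read a stale payload.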

Barrier Implementation

Motivating Example: Using Semaphores in the Kernel

what are: down_read, up_read, and mmap_sem

START WITH THE DATA STRUCTURE: MM->MMAP_SEM

Let’s start with the data structure and see where that leads…

current->mm->mmap_sem

struct mm_struct: include/linux/mm_types.h

PRIMITIVE ONE: ATOMIC TYPE AND OPERATIONS

On x86, these operations are atomic

simple asm instructions that involve 0 or 1 aligned memory access

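read-modify-write instructions (e.g., inc, dec), provided no other processor takes the memory bus between the read and the write (always true on a uniprocessor)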

anything prefixed by the IA-32 ‘lock’ prefix

atomic_t: include/linux/types.h
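To see an atomic read-modify-write in action outside the kernel, the sketch below uses C11 atomic_int as a stand-in for atomic_t. It is an assumption-laden user-level analogue, not kernel code. The names counter, worker, and run_atomic_counter are hypothetical. On x86 a compiler will typically implement atomic_fetch_add with a lock-prefixed instruction.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 4
#define NITERS   100000

static atomic_int counter;   /* user-level stand-in for the kernel's atomic_t */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++)
        atomic_fetch_add(&counter, 1);   /* atomic read-modify-write */
    return NULL;
}

/* Run NTHREADS workers concurrently and return the final counter value. */
int run_atomic_counter(void)
{
    pthread_t tid[NTHREADS];

    atomic_store(&counter, 0);
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&counter);
}
```

Because every increment is atomic, no updates are lost and the result is always NTHREADS * NITERS. With a plain int counter the total could come up short.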

Example: Reference Counters

Refcounts: atomic_t; associated with resources, but keeps count of kernel control paths accessing the resource
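A reference counter along these lines can be sketched at user level with C11 atomics. The struct resource type and the resource_init, resource_get, and resource_put helpers are hypothetical names for illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical resource with an embedded reference count. */
struct resource {
    atomic_int refcount;
    bool released;   /* set when the last reference is dropped */
};

void resource_init(struct resource *r)
{
    atomic_init(&r->refcount, 1);   /* the creator holds the first reference */
    r->released = false;
}

/* Take an additional reference on behalf of a new control path. */
void resource_get(struct resource *r)
{
    int old = atomic_fetch_add(&r->refcount, 1);
    assert(old > 0);   /* taking a reference on a dead object is a bug */
    (void)old;
}

/* Drop a reference; returns true if this call dropped the last one. */
bool resource_put(struct resource *r)
{
    if (atomic_fetch_sub(&r->refcount, 1) == 1) {
        r->released = true;   /* a real implementation would free the object here */
        return true;
    }
    return false;
}
```

The decrement and the test for zero happen in one atomic step, so exactly one caller observes the count reaching zero and performs the cleanup.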

PRIMITIVE TWO: SPINLOCKS

/include/linux/spinlock_types.h

typedef struct spinlock {
        struct raw_spinlock rlock;
} spinlock_t;

typedef struct raw_spinlock {
        arch_spinlock_t raw_lock;
} raw_spinlock_t;

arch/x86/include/asm/spinlock_types.h#L10

slock=1 (unlocked), slock=0 (locked)
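The busy-wait idea behind a spinlock can be sketched at user level with a C11 atomic_flag. This is a simple test-and-set lock under stated assumptions, not the kernel's arch_spinlock_t, and its flag encoding (set means locked) differs from the slock values above. The toy_spin_* names are hypothetical. Unlike the kernel version, it does nothing about preemption or interrupts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-level spinlock; not the kernel's arch_spinlock_t. */
typedef struct {
    atomic_flag locked;   /* set: locked, clear: unlocked */
} toy_spinlock_t;

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

void toy_spin_lock(toy_spinlock_t *l)
{
    assert(l);
    /* Atomically set the flag; keep spinning while it was already set. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;   /* busy-wait */
}

/* Returns true if the lock was acquired, false if it was already held. */
bool toy_spin_trylock(toy_spinlock_t *l)
{
    assert(l);
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

void toy_spin_unlock(toy_spinlock_t *l)
{
    assert(l);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The acquire ordering on lock and release ordering on unlock keep accesses inside the critical section from leaking outside it.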

spinlock API (partial)

/include/linux/spinlock.h

/kernel/spinlock.c

include/linux/spinlock_api_smp.h

Linux Tracks Lock Dependencies @ Runtime

PRIMITIVE THREE: SEMAPHORES

Here we mainly consider Read/Write Semaphores

Important Caveats about Kernel Semaphores

Semaphores are *not* like spinlocks: the invoking process is put to sleep rather than busy-waiting.

As a result, kernel semaphores should only be used by functions that can safely sleep (i.e., not interrupt handlers)

might_sleep() leads (eventually) to:

rwsem_wake

__rwsem_do_wake

On our way out, allow a writer at the front of the waiting queue to proceed.

Then allow unlimited numbers of readers to access the critical region.

Advanced Techniques

Sequence Locks

A solution to the multiple readers-writer problem in that a writer is permitted to advance even if readers are in the critical section.

Readers must check both an entry and exit flag to see if data has been modified underneath them.
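The entry and exit check can be sketched with a sequence counter that is odd while a write is in progress. This is a minimal user-level sketch with C11 atomics, assuming a single writer. The struct seq_pair type and the seq_write and seq_read functions are hypothetical names, not the kernel's seqlock API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical seqlock-protected pair; a single writer is assumed. */
struct seq_pair {
    atomic_uint seq;     /* even: stable, odd: write in progress */
    atomic_int  a, b;    /* relaxed atomics keep the sketch free of data races */
};

void seq_write(struct seq_pair *p, int a, int b)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);
    assert((s & 1u) == 0);   /* single writer: no write may already be in flight */
    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);   /* entry: now odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->a, a, memory_order_relaxed);
    atomic_store_explicit(&p->b, b, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);   /* exit: even again */
}

void seq_read(struct seq_pair *p, int *a, int *b)
{
    unsigned before, after;
    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *a = atomic_load_explicit(&p->a, memory_order_relaxed);
        *b = atomic_load_explicit(&p->b, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);   /* retry if a write overlapped */
}
```

The writer never waits for readers. A reader that overlaps a write simply discards what it read and tries again, so readers pay the cost of contention.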

Read-Copy-Update (RCU)

Designed to protect data structures accessed by multiple CPUs; allows many readers and writers.

Basic idea is simple (and in the name). Readers access data structure via a pointer; writers initially act as readers & create a copy to modify. “Writing” is just a matter of updating the pointer.

RCU

Only for kernel control paths; disables preemption.

Used to protect data structures accessed through a pointer

by adding a layer of indirection, we can reduce wholesale writes/updates to a single atomic write/update

Heavy restrictions: RCU tasks cannot sleep

readers do little work

writers act as readers, make a copy, then update copy. Finally, they rewrite the pointer.

cleanup is correspondingly complicated.
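The read, copy, update sequence can be sketched at user level with an atomic pointer. This is only an illustration of the pointer swap, assuming a single writer. It does not implement the grace-period tracking that real RCU uses to decide when the old copy may be freed. The struct config type, current_cfg, read_timeout, and update_timeout are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct config { int timeout; int retries; };   /* hypothetical protected data */

static _Atomic(struct config *) current_cfg;   /* the single pointer readers follow */

/* Reader: one atomic pointer load, then plain reads through it. */
int read_timeout(void)
{
    struct config *c = atomic_load_explicit(&current_cfg, memory_order_acquire);
    return c ? c->timeout : -1;
}

/* Writer: read the current version, copy it, modify the copy, publish it.
 * Returns the old version; the caller must not free it until no reader
 * can still be using it (the grace period, not modelled here). */
struct config *update_timeout(int timeout)
{
    struct config *old = atomic_load_explicit(&current_cfg, memory_order_acquire);
    struct config *fresh = malloc(sizeof *fresh);
    assert(fresh != NULL);
    if (old)
        *fresh = *old;                         /* copy */
    else
        *fresh = (struct config){ 0, 0 };
    fresh->timeout = timeout;                  /* update the private copy */
    return atomic_exchange_explicit(&current_cfg, fresh, memory_order_acq_rel);
}
```

Readers take no lock and never wait. The hard part that this sketch leaves out is the cleanup: knowing when every reader that might hold the old pointer has finished.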

RCU EXAMPLE: GETPPID(2)

http://lxr.linux.no/#linux+v2.6.35.14/kernel/timer.c#L1354

EXERCISE: TIME PERFORMANCE COST OF SYNCHRONIZATION

Does synchronization impose a significant cost? (test at user level)
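One possible starting point for this exercise is sketched below. It times the same number of uncontended increments done three ways: a plain increment, an atomic increment, and an increment under a mutex. The function time_increments and all counter names are hypothetical, and the measured times depend entirely on the machine and compiler flags, so no expected figures are given.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ITERS 1000000L

static volatile long plain_counter;   /* volatile so the loop is not optimized away */
static atomic_long atomic_counter;
static long locked_counter;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static double seconds_now(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Time ITERS uncontended increments of each kind and print the results. */
void time_increments(void)
{
    double t0 = seconds_now();
    for (long i = 0; i < ITERS; i++)
        plain_counter++;
    double t1 = seconds_now();
    for (long i = 0; i < ITERS; i++)
        atomic_fetch_add(&atomic_counter, 1);
    double t2 = seconds_now();
    for (long i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);
        locked_counter++;
        pthread_mutex_unlock(&lock);
    }
    double t3 = seconds_now();

    printf("plain:  %.6f s\n", t1 - t0);
    printf("atomic: %.6f s\n", t2 - t1);
    printf("mutex:  %.6f s\n", t3 - t2);
}
```

This measures only the uncontended case on one thread. Repeating the run with several threads hammering the same counter would show the additional cost of contention.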

CODE: AUTOMATICALLY DRAWING RESOURCE GRAPHS