
Digital UNIX Internals I: Supporting User Threads

Supporting User-Threads: NXM 2-Level Scheduling

Chapter Thirteen

While Compaq believes the information included in this presentation is correct as of the date produced, it is subject to change without notice. All rights reserved.

Compaq, the Compaq logo, DIGITAL and the DIGITAL logo are trademarks of Compaq Computer Corporation. IBM is a registered trademark of International Business Machines. Microsoft is a registered trademark of Microsoft Corporation.

Goals of Many to N (NXM) Scheduling

- Schedule multiple user threads managed by the library on the contexts of a smaller set of kernel threads maintained by the kernel
  - Improved performance
  - Improved scalability
- Library to NXM communication: system calls
- NXM to library communication: upcalls

NXM Terminology

[Diagram labels]
- Physical processors
- Kernel's structures for the process (Mach threads)
- Scheduler threads (virtual processors)
- Library's structures for the process
- User threads (Pthreads)
- Bound threads
- Library
- Application

Thread roles -- I of II

- Manager thread
  - Kernel-only thread
  - Manages the kernel half
- Scheduler thread
  - Kernel thread assigned to user work
- Blocked scheduler thread
  - User thread waiting in the kernel
    - Performing a system call
    - Handling a page fault
  - "Replaced" by another scheduler thread so other user work can be done

Thread roles -- II of II

- Extra threads
  - Kernel thread awaiting user work
  - After blocked threads awaken, the number of kernel threads created for user work may exceed the number of CPUs plus the number of blocked requests
  - Terminated within a few seconds unless new work appears
- Pthreads
  - Created and managed by the user library
  - Multiplexed on scheduler threads

Library Syscalls to Kernel

- Allow library to request operations specific to 2-level scheduling
- Most importantly, to bind user threads to kernel threads

NXM Syscalls

nxm_task_init()        initialize process for two-level scheduling
nxm_thread_create()    create a thread for use of thread library
nxm_idle()             declare this thread to be idle
nxm_wakeup_idle()      release idle thread(s)
nxm_thread_kill()      send signal to a user thread
nxm_thread_block()     block a bound thread
nxm_thread_wakeup()    unblock a bound thread
nxm_set_cancel()       set cancellation state
nxm_get_state()        return registers (for debuggers)
nxm_signal_check()     check for signals (for library)
nxm_thread_suspend()   suspend a bound thread
nxm_thread_resume()    resume a suspended thread
nxm_thread_destroy()   destroy a thread
nxm_resched()          make thread perform resched upcall

This is not a complete list.

Syscalls implementation: Mach System calls

kern/syscall_sw.c

mach_trap_table[]
    ....
    FN(nxm_thread_destroy, 0, 0),   /* 29 */
    FN(nxm_thread_create, 3, 0),    /* 32 */
    FN(nxm_task_init, 2, 0),        /* 33 */
    FN(nxm_idle, 1, 0),             /* 35 */
    FN(nxm_wakeup_idle, 1, 1),      /* 36 */
    FN(nxm_set_pthid, 2, 0),        /* 37 */
    FN(nxm_thread_kill, 2, 0),      /* 38 */
    FN(nxm_thread_block, 5, 0),     /* 39 */
    FN(nxm_thread_wakeup, 1, 0),    /* 40 */
    ....
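The table above pairs each trap number with a handler and an argument count. As a rough sketch of how such a table can be dispatched, the model below uses a simplified two-argument `FN` macro and invented handlers. The `demo_` names, the entry layout, and the error convention are assumptions for illustration, not the kernel's actual `mach_trap_t` definition, and the meaning of the third `FN` argument on the slide is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type: every demo handler takes an argument vector. */
typedef long (*trap_fn_t)(long *args);

/* Hypothetical table entry, loosely modeled on FN(name, nargs, ...). */
struct trap_entry {
    trap_fn_t   fn;     /* handler routine */
    int         nargs;  /* number of arguments the handler expects */
    const char *name;   /* handler name, useful for tracing */
};

/* Invented stand-ins for real system calls. */
static long demo_idle(long *args)   { return args[0]; }
static long demo_wakeup(long *args) { return args[0] + 1; }

#define FN(f, n) { f, n, #f }

static const struct trap_entry demo_trap_table[] = {
    FN(demo_idle,   1),  /* slot 0 */
    FN(demo_wakeup, 1),  /* slot 1 */
};

#define DEMO_NTRAPS (sizeof demo_trap_table / sizeof demo_trap_table[0])

/* Look up a trap by number and call it.
 * Returns -1 for an unknown trap number or a wrong argument count. */
long demo_trap_dispatch(size_t trapno, long *args, int nargs)
{
    if (trapno >= DEMO_NTRAPS)
        return -1;
    const struct trap_entry *e = &demo_trap_table[trapno];
    assert(e->fn != NULL);
    if (nargs != e->nargs)
        return -1;
    return e->fn(args);
}
```

Keeping the argument count next to the handler lets the dispatcher reject a malformed call before any handler code runs.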

Kernel Upcalls to Library

- Notify the library of significant execution events
  - arrival of a signal
  - a user thread blocking
  - quantum expiration
- Allow library to bind a new user thread to a scheduler thread

NXM Upcalls

NXM_QUANTUM_EXPIRE            quantum expired
NXM_SCHED_THREAD_UT_BLOCKED   thread blocked in kernel
NXM_THREAD_UNBLOCK_NO_SID     thread unblocked no sched id
NXM_SIGNAL_BLOCKED            current thread has blocked sig
NXM_SPECULATIVE_EXECUTION     call user spec-exec handler
NXM_THREAD_INTERRUPTED        sched/bound thread interrupted
NXM_VP_RESCHED                vp should reenter scheduler
NXM_GENTRAP_HANDLER           call user gentrap handler
NXM_STACK_OVERFLOW            yellow zone notification
NXM_FB_FIXUP                  fixup FBs (fat binaries)
NXM_VP_UNBOUND                vp became unbound
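On the library side, an upcall routine would typically switch on the upcall code. The sketch below covers a handful of the codes listed above. The upcall names come from the slide, but their numeric values are not given there, so the enum values are placeholders. The mapping from upcall to library action is a hypothetical policy, not the DECthreads implementation.

```c
#include <assert.h>

/* Names from the slide; values are placeholders for illustration only. */
enum demo_upcall {
    NXM_QUANTUM_EXPIRE,
    NXM_SCHED_THREAD_UT_BLOCKED,
    NXM_THREAD_UNBLOCK_NO_SID,
    NXM_SIGNAL_BLOCKED,
    NXM_VP_RESCHED,
    DEMO_UPCALL_COUNT
};

/* Hypothetical actions a user-level scheduler might take. */
enum demo_action {
    DEMO_PICK_NEXT,       /* choose another user thread for this VP */
    DEMO_MARK_RUNNABLE,   /* put the unblocked user thread on a run queue */
    DEMO_DELIVER_SIGNAL,  /* hand the signal to a thread that accepts it */
    DEMO_IGNORE           /* unknown code: do nothing */
};

static int demo_upcall_counts[DEMO_UPCALL_COUNT];

/* Map an upcall code to an action (assumed policy). */
static enum demo_action demo_upcall_action(int code)
{
    switch (code) {
    case NXM_QUANTUM_EXPIRE:
    case NXM_SCHED_THREAD_UT_BLOCKED:
    case NXM_VP_RESCHED:
        return DEMO_PICK_NEXT;
    case NXM_THREAD_UNBLOCK_NO_SID:
        return DEMO_MARK_RUNNABLE;
    case NXM_SIGNAL_BLOCKED:
        return DEMO_DELIVER_SIGNAL;
    default:
        return DEMO_IGNORE;
    }
}

/* Entry point: count the upcall and return the chosen action. */
enum demo_action demo_handle_upcall(int code)
{
    enum demo_action a = demo_upcall_action(code);
    if (a != DEMO_IGNORE) {
        assert(code >= 0 && code < DEMO_UPCALL_COUNT);
        demo_upcall_counts[code]++;
    }
    return a;
}
```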

Upcall Implementation

- Transitions from kernel to user mode
  - Into a thread library routine
  - Initiated by kernel
- nxm_upcall()
  - Made by an idle scheduler thread from nxm_idle()
  - Or from AST (Asynchronous System Trap) context
    thread_ast_set(thread, AST_BLOCKED_UPCALL)
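The AST path can be pictured as a pending-work flag that is set while the thread is in the kernel and checked on the way back to user mode. The following is a minimal model of that idea. The structure, the `demo_` function names, and the bit value chosen for the flag are assumptions. Only the names `thread_ast_set`, `AST_BLOCKED_UPCALL`, and `nxm_upcall` appear on the slide.

```c
#include <assert.h>

/* Placeholder bit; the real AST_BLOCKED_UPCALL value is not shown. */
#define DEMO_AST_BLOCKED_UPCALL 0x1u

/* Hypothetical per-thread state. */
struct demo_thread {
    unsigned ast_pending;   /* ASTs to service before returning to user mode */
    int      upcalls_made;  /* how many upcalls this thread has delivered */
};

/* Model of thread_ast_set(): remember that work is pending. */
void demo_thread_ast_set(struct demo_thread *t, unsigned reason)
{
    assert(t != 0);
    t->ast_pending |= reason;
}

/* Model of the kernel-to-user return path.
 * Returns 1 if an upcall was made instead of a plain return, else 0. */
int demo_return_to_user(struct demo_thread *t)
{
    assert(t != 0);
    if (t->ast_pending & DEMO_AST_BLOCKED_UPCALL) {
        t->ast_pending &= ~DEMO_AST_BLOCKED_UPCALL;
        t->upcalls_made++;  /* stands in for the call to nxm_upcall() */
        return 1;
    }
    return 0;
}
```

Clearing the flag before the upcall keeps one blocking event from producing two upcalls.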

Handling Blocking System Calls

[Diagram: sequence across the App., Library (user space), and Kernel layers]
1. User thread 1, running on kernel thread 1, calls sleep().
2. Kernel thread 1 activates kernel thread 2 and sleeps: thread_block(), nxm_block().
3. Kernel thread 2 makes an NXM...BLOCK upcall: thread_ast_set(), trap(), nxm_upcall().
4. Library maps kernel thread 2 to user thread 2.
5. Kernel thread 1 is awoken and makes an NXM...UNBLOCKED... upcall: thread_ast_set(), trap(), nxm_upcall().

Handling Idle Threads

[Diagram: sequence across the App., Library (user space), and Kernel layers]
1. User thread 1, running on kernel thread 1, blocks in the library. No other threads are runnable.
2. The library calls nxm_idle().
3. Kernel thread 1 waits; if not needed in 20 seconds, it will exit.
4. Kernel thread 2 makes an upcall indicating user thread 2 is runnable.
5. Library records user thread 2 as now runnable and calls nxm_wakeup_idle().
6. Kernel thread 1 is awoken and makes an NXM...UNBLOCKED... upcall: thread_ast_set(), trap(), nxm_upcall().
7. Library maps kernel thread 1 to user thread 2.

Other Conditions

- Signals
  - Kernel must notify library of signal arrival
- Scheduling Attributes
  - Library wants to set user thread scheduling attributes: nxm_resched
  - Kernel must track user thread quantum of a running nxm thread in hardclock() and notify library when it expires: nxm_upcall(), NXM_QUANTUM_EXPIRE
- Thread Cancellation
  - One user thread can cancel another with pthread_cancel()
  - Kernel must test for cancellation in system calls for nxm threads
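The quantum tracking described above amounts to a per-tick countdown. Here is a small sketch of that bookkeeping. The field names `nxm_quantum` and `nxm_set_quantum` match the count-down and reset fields of struct nxm_sched_state, while the function name, the reduced structure, and the choice to reset the counter at expiry are assumptions made for illustration.

```c
#include <assert.h>

/* Reduced stand-in for the two quantum fields of struct nxm_sched_state. */
struct demo_sched_state {
    int nxm_quantum;      /* quantum count-down value */
    int nxm_set_quantum;  /* quantum reset value */
};

/* Called once per clock tick while an NXM thread is running.
 * Returns 1 when the quantum has expired, which is the point at which
 * the kernel would deliver an NXM_QUANTUM_EXPIRE upcall. Otherwise 0. */
int demo_hardclock_tick(struct demo_sched_state *ss)
{
    assert(ss->nxm_set_quantum > 0);
    if (--ss->nxm_quantum > 0)
        return 0;
    ss->nxm_quantum = ss->nxm_set_quantum;  /* assumed: re-arm on expiry */
    return 1;
}
```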

To sleep, perchance to run

- Blocked threads
  - Sleep on the appropriate kernel event
  - But with an AST set to do upcall before returning to user mode
  - If blocked by nxm_thread_block
    - Sleeps on thread's nxm_flags field
    - Manager threads sleep this way
- Extra threads
  - Sleep on nxm_extra_count field of their slot
  - If time expires, terminate thyself
- Idle threads
  - Sleep on address of nxm slot

NXM Structures: The Big Picture

[Diagram: kernel and user-space NXM structures for one process]
- Process's kernel structures
  - struct task: nxm_user (0xAAA), nxm_share, thread_list
  - struct nxm_shared: nxm_callback (0xBBB), nxm_ss[]
  - Four thread entries, each with nxm_sptr, nxm_flags, uu_sptr, uu_share, super_thread
- Process's seg0 (VAS)
  - struct nxm_shared at 0xAAA
  - Upcall routine XXX() { } at 0xBBB

struct nxm_shared

- Mapped into user address space
- Holds information required by kernel and library
- One per NXM process

struct nxm_shared {
    long nxm_callback;                  /* Address of upcall routine */
    unsigned short nxm_version;         /* Version number */
    unsigned short nxm_uniq_offset;     /* Correction factor for TEB */
    int pad1;
    long space[2];                      /* Future growth */
    struct nxm_sched_state nxm_ss[1];   /* Array of shared areas (number of CPUs) */
};
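Because `nxm_ss` is declared with one element but holds one entry per CPU, the region has to be sized at run time. The sketch below shows the usual arithmetic for such a trailing array. `struct demo_sched_state` is a stand-in with invented contents, and `demo_shared_size` is a hypothetical helper; the kernel's own sizing code is not shown in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct nxm_sched_state; the real contents differ. */
struct demo_sched_state {
    long words[16];
};

/* Same field layout as the slide's struct nxm_shared. */
struct demo_shared {
    long nxm_callback;
    unsigned short nxm_version;
    unsigned short nxm_uniq_offset;
    int pad1;
    long space[2];
    struct demo_sched_state nxm_ss[1];  /* really nslots entries */
};

/* Bytes needed for a shared area with nslots scheduler-state slots. */
size_t demo_shared_size(int nslots)
{
    assert(nslots >= 1);
    return offsetof(struct demo_shared, nxm_ss)
         + (size_t)nslots * sizeof(struct demo_sched_state);
}
```

Using `offsetof` rather than `sizeof(struct demo_shared)` avoids counting the first array element twice.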

struct nxm_sched_state

- Field of nxm_shared
- Represents state of an NXM virtual processor (kernel thread)
- Sized one per CPU

struct nxm_sched_state {
    struct ushared_state nxm_u;         /* State owned by user thread */
    nxm_sched_bits_t nxm_bits;
    int nxm_quantum;                    /* Quantum count-down value */
    int nxm_set_quantum;                /* Quantum reset value */
    int nxm_sysevent;                   /* Syscall state */
    struct nxm_upcall *nxm_uc_ret;      /* Stack ptr of null thread */
    void *nxm_tid;                      /* Scheduler's thread id */
    long nxm_va;                        /* Page fault address */
    struct nxm_pth_state *nxm_pthid;    /* ID of null thread */
    . . . .
};

struct ushared_state

- Context-switched shared-state structure
- Field of nxm_sched_state
- Shared directly between the running user thread and the kernel

struct ushared_state {
    /* Context-switched by the user thread library: */
    sigset_t sigmask;                   /* Thread signal mask */
    sigset_t sig;                       /* Thread pending mask */
    struct nxm_pth_state *pth_id;       /* Out-of-line state */
    int flags;                          /* Shared flags */
    int cancel_state;                   /* Thread's cancellation state */
    /* Semi-shared, visible to kernel but never context-switched by library: */
    int nxm_ssig;                       /* Scheduler's synchronous signals */
    int reserved1;
    long nxm_active;                    /* Scheduler active */
    long reserved2;
};

struct nxm_upcall

- Upcall frame structure
- The kernel builds this frame on the stack and passes its address on an upcall
- This same structure can be used for saving/restoring user thread state on library context switches:

struct nxm_upcall {
    int usaved;                             /* u_state is valid */
    int pad;
    struct ushared_state u_state;           /* shared state */
    struct alpha_saved_state reg_state;     /* register state */
};

struct nxm_pth_state

- Out-of-line user thread state. The kernel can access this when needed, but only the pointer to it (ushared_state.nws) lives in the shared state:

struct nxm_pth_state {
    u_long fpregs[32];
    u_long fpcr;                        /* Must follow the fpregs! */
    stack_t altstack;
    struct uuprof prof;
    struct nxm_ieee_state {
        long ieee_fp_control;           /* Floating point state */
        long ieee_set_state_at_signal;
        long ieee_fp_control_at_signal;
        long ieee_fpcr_at_signal;
    } nxm_ieee_state;
    int sigforce;
    int stack_event;
    long pad[9];
};

struct task NXM fields

struct task {
    ...
    struct nxm_task *nxm_state;         /* Pointer to all 2-level state */
    …
};

Where the 2-level scheduling state is:

struct nxm_task {
    cpuset_t _nxm_sched_mask;               /* Mask of blocked sched threads */
    int _nxm_sched_max;                     /* Max scheduler threads */
    int _nxm_sched_cnt;                     /* Active scheduler threads */
    unsigned int _nxm_retry_count;          /* nxm_get_thread() retries */
    unsigned int _nxm_retry_cpu;            /* nxm_get_thread() timeout cpu */
    struct thread *_nxm_manager;            /* Library's manager thread */
    sigset_t _nxm_signal_upcall;            /* Signals needing upcalls */
    long _nxm_task_callback;                /* Library upcall routine */
    unsigned int _nxm_task_version;         /* nxm version */
    unsigned short _nxm_task_uniq_offset;   /* Offset to TEB */
    int _nxm_task_quantum;                  /* Quantum from task attributes */
    vm_offset_t _nxm_user_cfg_addr;         /* User VA of config area */
    vm_size_t _nxm_cfg_size;                /* Size of config area */
    vm_size_t _nxm_share_size;              /* Size of shared area */
    nxm_config_info_t *_nxm_cfg_ptr;        /* Shared config area */
    struct nxm_shared **_nxm_share;         /* Pointer to shared areas */
    struct nxm_vp_info _nxm_vp[1];          /* Per-slot data */
};

struct thread NXM fields

struct thread {
    …
    struct nxm_sched_state *nxm_sptr;   /* Scheduler state (if NXM thread) */
    int nxm_flags;
    …
    int nxm_slot;
    …
}

thread.nxm_flags values:

#define NXM_IDLE            0x1     /* Idle with no scheduler context */
#define NXM_SCHED           0x2     /* Has scheduler context valid */
#define NXM_NOSCHED         0x4     /* No scheduler context valid */
#define NXM_BLOCKED         0x8     /* Blocked NXM thread */
#define NXM_BIND_ME         0x10    /* Special */
#define NXM_BOUND           0x20    /* User thread bound to kernel thread */
#define NXM_EXEC            0x40    /* Exec in progress */
#define NXM_RESCHED         0x80    /* Force resched */
#define NXM_SIGEV           0x200   /* Signal event pending */
#define NXM_MANAGER         0x400   /* Thread is a manager */
#define NXM_WAKEUP          0x800   /* Thread wakeup is pending */
#define NXM_SUSPEND         0x1000  /* Library suspend in effect */
#define NXM_RESUME_POSTED   0x2000
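Each `nxm_flags` value is a distinct bit, so several conditions can be recorded in one `int` and tested independently. The helpers below are a minimal sketch of that usage. The flag values are copied from the list above; the `demo_` helper names are hypothetical and do not come from the kernel source.

```c
#include <assert.h>

/* A subset of the flag values listed on the slide. */
#define NXM_IDLE     0x1
#define NXM_BLOCKED  0x8
#define NXM_BOUND    0x20
#define NXM_MANAGER  0x400

/* True if bit is a single flag (exactly one bit set). */
static int demo_is_single_bit(int bit)
{
    return bit != 0 && (bit & (bit - 1)) == 0;
}

/* Return flags with the given flag turned on. */
int demo_flag_set(int flags, int bit)
{
    assert(demo_is_single_bit(bit));
    return flags | bit;
}

/* Return flags with the given flag turned off. */
int demo_flag_clear(int flags, int bit)
{
    assert(demo_is_single_bit(bit));
    return flags & ~bit;
}

/* Nonzero if the given flag is on. */
int demo_flag_test(int flags, int bit)
{
    assert(demo_is_single_bit(bit));
    return (flags & bit) != 0;
}
```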

NXM struct thread nxm_flags Values

NXM_IDLE       idle with no scheduler context
NXM_SCHED      has valid scheduler context
NXM_NOSCHED    no valid scheduler context
NXM_BLOCKED    blocked NXM thread
NXM_FP_STATE   blocked and has FP state
NXM_BOUND      user thread bound to kernel thread
NXM_EXEC       exec in progress
NXM_RESCHED    force resched

Other Thread Related NXM fields

struct np_uthread {
    …
    /* NXM Information */
    struct ushared_state *uu_sptr;
    struct ushared_state uu_share;
    void *uu_tsd[MAX_TSD_SLOTS];        /* Thread-specific data, currently MAX_TSD_SLOTS is 8 */
};

struct uthread {
    …
    struct nxm_pth_state *uu_proflast;  /* Last nxm thread to use uu_prof */
    …
}

Selected Routines

- nxm_task_init()
- nxm_thread_create()
- nxm_get_thread()
- nxm_idle()
- thread_block()
- nxm_block()

nxm_task_init(int max_sthreads, vm_offset_t sthread_array, nxm_task_attr_p sched_attr)

- System call to set up the caller's task to perform 2-level thread scheduling
- Create a shared memory region with the kernel
  - The caller must provide a page-aligned user address for the scheduler-thread array
  - Wires the memory from the kernel so that the page cannot go away or be deleted by the caller at some later time
- Creates a VP thread if older library
- Makes the calling thread a scheduler thread with attributes based on the sched_attr argument
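The page-alignment requirement on the scheduler-thread array is easy to check before making the call. The helpers below sketch that check for any power-of-two page size. The function names are hypothetical. The 8192-byte page size used in the examples is an assumption for Alpha systems, and real code should query the page size at run time (for example with `getpagesize()` or `sysconf`).

```c
#include <assert.h>
#include <stdint.h>

/* True if page_size is a nonzero power of two. */
static int demo_is_pow2(uintptr_t page_size)
{
    return page_size != 0 && (page_size & (page_size - 1)) == 0;
}

/* Nonzero if addr lies exactly on a page boundary. */
int demo_is_page_aligned(uintptr_t addr, uintptr_t page_size)
{
    assert(demo_is_pow2(page_size));
    return (addr & (page_size - 1)) == 0;
}

/* Smallest page-aligned address that is >= addr. */
uintptr_t demo_round_up_to_page(uintptr_t addr, uintptr_t page_size)
{
    assert(demo_is_pow2(page_size));
    return (addr + page_size - 1) & ~(page_size - 1);
}
```

A caller could over-allocate by one page and pass the rounded-up address, so the array starts on a boundary regardless of what the allocator returned.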

nxm_thread_create(nxm_thread_attr_p attr, long *kid, int thread_index)

- System call to create a thread for use by the threads library. This can be one of: a VP/scheduler thread, a bound (system scope) thread, or a manager thread
- Verify
  - Thread index is less than NXM_THREAD_MAX (64)
  - Thread policy and priority
- Create the new thread:
  - Initialize the policy, priority, the scheduling and execution state (i.e. initialize the PCB for the thread)
  - Deals with CPU binding
- Try to start the thread

nxm_get_thread(boolean_t restart)

- Run exclusively by the NXM manager thread in response to a scheduler thread calling nxm_block()
- Scan the list of blocked scheduler threads and create new threads to replace them after trying to wake up an extra thread to do the work
- On failure, set a timeout to come back here and retry
- Responsible for creating a new scheduler thread when an existing one blocks and there are currently no idle threads
- Creates a new scheduler thread to carry out a block-user-thread upcall to the threads library
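The scan described above can be modeled as a walk over a bitmask of blocked slots, preferring an existing extra thread and creating a new one only when none is available. This is a simplified sketch under several assumptions: the `cpuset_t` mask is modeled as a 64-bit integer, the structure and counters are invented, clearing the mask bit once a slot is serviced is a modeling choice, and the retry-on-failure timeout is left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_SLOTS 64

/* Hypothetical per-task bookkeeping. */
struct demo_task {
    uint64_t sched_mask;   /* bit i set: scheduler thread in slot i is blocked */
    int      extra_count;  /* sleeping extra threads available for reuse */
    int      woken;        /* extras woken as replacements */
    int      created;      /* brand-new replacement threads created */
};

/* Service every blocked slot; returns the number of slots serviced. */
int demo_get_thread(struct demo_task *t)
{
    int serviced = 0;
    assert(t->extra_count >= 0);
    for (int slot = 0; slot < DEMO_MAX_SLOTS; slot++) {
        uint64_t bit = (uint64_t)1 << slot;
        if (!(t->sched_mask & bit))
            continue;               /* slot is not blocked */
        if (t->extra_count > 0) {
            t->extra_count--;       /* reuse a sleeping extra thread */
            t->woken++;
        } else {
            t->created++;           /* no extras left: create a thread */
        }
        t->sched_mask &= ~bit;      /* slot now has a replacement */
        serviced++;
    }
    return serviced;
}
```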

nxm_idle(union extra_arg arg)

- System call to declare this thread idle
- If thread is a scheduler thread
  - Call nxm_sched_idle
  - Sleep until needed
- If thread is not a scheduler thread
  - Call nxm_extra_idle
  - If scheduler thread in this slot is blocked
    - Replace it
  - Else if not too many extra threads
    - Become an extra
    - Go to sleep for a while
    - If awakened with no work to do
      - Terminate
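The branches above reduce to a small decision function. The sketch below encodes them as written, with one addition: the slide does not say what happens when there are already too many extra threads, so terminating in that case is an assumption. All names here are hypothetical.

```c
#include <assert.h>

/* Possible outcomes of declaring a thread idle. */
enum demo_idle_action {
    DEMO_SCHED_IDLE,       /* scheduler thread: sleep until needed */
    DEMO_REPLACE_BLOCKED,  /* take over the blocked scheduler's slot */
    DEMO_BECOME_EXTRA,     /* sleep for a while as an extra thread */
    DEMO_TERMINATE         /* too many extras already (assumed) */
};

/* Decide what an idling kernel thread should do next. */
enum demo_idle_action demo_idle_decide(int is_scheduler, int slot_blocked,
                                       int extra_count, int extra_max)
{
    assert(extra_count >= 0 && extra_max >= 0);
    if (is_scheduler)
        return DEMO_SCHED_IDLE;
    if (slot_blocked)
        return DEMO_REPLACE_BLOCKED;
    if (extra_count < extra_max)
        return DEMO_BECOME_EXTRA;
    return DEMO_TERMINATE;
}
```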

thread_block() and NXM (1)

- thread_block()
  - selects new thread
  - calls nxm_block() to set up for upcall
  - context switches to new thread

thread_block() and NXM (2)

- nxm_block()
  - handle case of thread in non-interruptible sleep
  - handle case of a library thread doing library work in a critical section
  - schedule an AST to perform the upcall when awakened
- Prepare this scheduler thread to pass on its identity to a new scheduler thread
  - This requires copying the current shared area back into the kernel, setting our state to BLOCKED, and getting a replacement thread going.

thread_block() and NXM (3)

- nxm_block() continued
  - ....
  - save off floating point state
  - set thread's NXM state to blocked (NXM_BLOCKED) if the thread is in a scheduling state
  - increment count of blocked NXM threads
  - if extra threads are lying around, wake up a replacement
  - else check if enough threads are around. If not, wake up manager thread to create a replacement.
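The last three steps of nxm_block() can be sketched as a small bookkeeping routine. The slide does not define what "enough threads" means, so this model assumes it means that the kernel threads not currently blocked still cover a target count. The structure, its field names, and that test are all hypothetical.

```c
#include <assert.h>

/* What the blocking thread asks for on its way to sleep. */
enum demo_block_action {
    DEMO_WAKE_EXTRA,    /* an extra thread takes over immediately */
    DEMO_WAKE_MANAGER,  /* manager thread must create a replacement */
    DEMO_NONE           /* enough runnable threads remain */
};

/* Hypothetical per-task counters. */
struct demo_nxm {
    int blocked_count;  /* NXM threads currently blocked in the kernel */
    int extra_count;    /* sleeping extra threads */
    int total_threads;  /* kernel threads created for user work */
    int wanted;         /* threads the task wants available (assumed) */
};

/* Account for one more blocked thread and pick a replacement strategy. */
enum demo_block_action demo_block(struct demo_nxm *n)
{
    assert(n->blocked_count < n->total_threads);
    n->blocked_count++;
    if (n->extra_count > 0) {
        n->extra_count--;
        return DEMO_WAKE_EXTRA;
    }
    if (n->total_threads - n->blocked_count < n->wanted)
        return DEMO_WAKE_MANAGER;
    return DEMO_NONE;
}
```

Preferring an extra thread avoids the cost of thread creation on the blocking path, and the manager is involved only when the pool is exhausted.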

Source References: NXM Scheduling (1)

- kernel/kern/syscall.c, kernel/kern/syscall_subr.c
  - nxm mach system call definitions
- kernel/kern/thread.h
  - struct thread
- kernel/kern/task.h
  - struct task
- kernel/arch/alpha/nxm.h
  - struct nxm_shared
  - struct nxm_sched_state
- kernel/kern/sched_prim.c
  - thread_block()
  - nxm_block()

Source References

- kernel/kern/syscall_subr.c
  - nxm_task_init()
  - nxm_get_thread()
  - nxm_idle()
- kernel/arch/alpha/nxm.h
  - nxm upcall definitions
  - struct ushared_state
- kernel/arch/alpha/trap.c
  - trap()

Compaq Computer Corporation © 1998