Chapter 4: Threads


Silberschatz, Galvin and Gagne ©2005, Operating System Concepts

• Overview
• Multithreading Models
• Threading Issues
• Pthreads
• Windows XP Threads
• Linux Threads
• Java Threads


Thread: Introduction

Each process has:

1. Its own address space

2. A single thread of control

A process model has two concepts:

1. Resource grouping

2. Execution

Sometimes it is useful to separate them.


Unit of Resource Ownership

A process has:
• An address space
• Open files
• Child processes
• Accounting information
• Signal handlers
• Etc.

Grouping these together in the form of a process makes them easier to manage.


Unit of Dispatching

A thread is a path of execution. It has:
• Program counter: keeps track of which instruction to execute next
• Registers: hold the current working variables
• Stack: contains the execution history, with one entry for each procedure called but not yet returned
• State

Processes are used to group resources together. Threads are the entities scheduled for execution on the CPU.

Threads are also called lightweight processes.


It's better to distinguish between the two concepts

[Figure: the items held by a process, split into two groups]

Unit of resource (heavyweight process; shared when a process has multiple threads):
• Address space/global variables
• Open files
• Child processes
• Accounting info
• Signal handlers

Unit of dispatch (lightweight processes):
• Program counter
• Registers
• Stack
• State

Threads allow you to multiplex which resources?

1. CPU

2. Memory

3. PCBs

4. Open files

5. User authentication structures


Thread

A thread is a basic unit of CPU utilization. It consists of:
• A thread ID
• A program counter
• A register set
• A stack

A thread shares the following with its peer threads (all the other threads in the same task):
• Its code section
• Its data section
• Any OS resources available to the task

The first thread starts execution with

int main(int argc, char *argv[])

To the scheduling part of an OS, threads appear just like any other process.

Threads allow multiple execution paths in the same process environment.


Threads vs. Processes

Threads
• A thread has no data segment or heap
• A thread cannot live on its own; it must live within a process
• There can be more than one thread in a process; the first thread calls main and has the process's stack
• Inexpensive creation
• Inexpensive context switching
• If a thread dies, its stack is reclaimed
• Inter-thread communication via memory

Processes
• A process has code/data/heap and other segments
• There must be at least one thread in a process
• Threads within a process share code/data/heap and share I/O, but each has its own stack and registers
• Expensive creation
• Expensive context switching
• If a process dies, its resources are reclaimed and all threads die
• Inter-process communication via OS and data copying


Process vs. Threads

(a) Three threads, each running in a separate address space

(b) Three threads, sharing the same address space


Context switch time for which entity is greater?

1. Process

2. Thread


The Thread Model

Each thread has its own stack


Single and Multithreaded Processes

A traditional heavyweight process is the same as a task with one thread: it has a single thread of control.

If a process is multithreaded, more than one part of the process can be executing at one time.

Multithreading is useful in programs such as web browsers, where you may wish to download a file, view an animation, and print something at the same time.


[Figure: Single and Multithreaded Processes]


Implementing Threads

Processes define an address space; threads share the address space.

Process Control Block (PCB) contains process-specific information:
• Owner, PID, heap pointer, priority, active thread, and pointers to thread information

Thread Control Block (TCB) contains thread-specific information:
• Stack pointer, PC, thread state (running, …), register values, a pointer to PCB, …

[Figure: the process's address space holds code, initialized data, heap, DLLs, mapped segments, and one stack per thread (thread1, thread2); the TCB for each thread holds PC, SP, state, registers, …]


Threads’ Life Cycle

Threads (just like processes) go through a sequence of start, ready, running, waiting, and done states.

[Figure: state diagram with Start, Ready, Running, Waiting, and Done states]


Benefits Of Multi-Threading

Responsiveness

Multithreading increases responsiveness. Because the process consists of more than one thread, if one thread blocks or is busy with a lengthy calculation, another thread of the process can still execute, so the user still gets a response from the process.

Resource Sharing

All threads that belong to one process share the memory and resources of that process. This allows an application to have several different threads within the same address space.


Benefits Of Multi-Threading

Economy

Allocating memory and resources for process creation is costly. All threads of a process share the resources of that process, so it is more economical to create threads and to context switch between them.

Utilization of Multiprocessor Architectures

A multiprocessor (MP) architecture allows parallel processing. A single-threaded process can run on only one CPU, no matter how many processors are available. Multithreading on an MP system increases concurrency: if a process is divided into multiple threads, these threads can execute simultaneously on different processors.


Types of Threads

There are two types of threads:
• Kernel threads
• User threads


User Threads

User-level threads are not seen by the operating system and are very fast: switching from one thread to another within a single process does not require a kernel context switch, since the same process is still executing.

However, if the currently executing thread blocks, the rest of the process may also be blocked. This happens when the OS uses a single kernel thread for the process: the thread the kernel sees is the blocked thread, so the kernel assumes that the whole process is blocked.

Thread management is done by a user-level threads library.

Three primary thread libraries:
• POSIX Pthreads
• Win32 threads
• Java threads


Kernel Threads

Kernel-supported threads are seen by the operating system and must be scheduled by the operating system. One multithreaded process may have multiple kernel threads.

Examples:
• Windows XP/2000
• Solaris
• Linux
• Tru64 UNIX
• Mac OS X


User-Level vs. Kernel Threads

User-Level
• Managed by application
• Kernel not aware of thread
• Context switching cheap
• Create as many as needed
• Must be used with care

Kernel-Level
• Managed by kernel
• Consumes kernel resources
• Context switching expensive
• Number limited by kernel resources
• Simpler to use

Key issue: kernel threads provide virtual processors to user-level threads, but if all of the kernel threads block, then all user-level threads will block even if the program logic allows them to proceed.


Thread Libraries

A thread library provides the programmer with an API for creating and managing threads.

Two primary ways of implementing a thread library:
• Library entirely in user space with no kernel support. All the code and data structures exist in user space, so invoking a function in the library results in a local function call in user space and not a system call.
• Kernel-level library supported by the OS. The code and data structures for the library exist in kernel space, so invoking a function in the library's API typically results in a system call to the kernel.


Thread Libraries

Three main thread libraries are in use today:
• POSIX Pthreads
• Win32
• Java


Multithreading Models

Many-to-One

One-to-One

Many-to-Many


Many-to-One

Many user-level threads are mapped to a single kernel thread.

It is efficient because thread management is done in user space. However, a process using this model blocks entirely if a thread makes a blocking system call.

Only one thread can access the kernel at a time, so multiple threads cannot run in parallel on a multiprocessor.

Examples:
• Solaris Green Threads
• GNU Portable Threads (GNU stands for "GNU's Not Unix"; GNU is an operating system composed of free software)


[Figure: Many-to-One Model]


One-to-One

Each user-level thread maps to a kernel thread.

It provides more concurrency than the many-to-one model because it allows another thread to run when a thread makes a blocking system call.

It facilitates parallelism on multiprocessor systems.

Each user thread requires a kernel thread, and the overhead of creating kernel threads may affect the performance of the system.

Because of this overhead, the number of threads that can be created in this model is restricted.

Examples:
• Windows NT/XP/2000
• Linux
• Solaris 9 and later


[Figure: One-to-One Model]


Many-to-Many Model

Allows many user-level threads to be mapped to many kernel threads.

Allows the operating system to create a sufficient number of kernel threads.

The number of kernel threads may be specific to either a particular application or a particular machine.

The user can create any number of threads, and the corresponding kernel-level threads can run in parallel on a multiprocessor.

When a thread makes a blocking system call, the kernel can schedule another thread for execution.

Examples:
• Solaris prior to version 9
• Windows NT/2000 with the ThreadFiber package


[Figure: Many-to-Many Model]


Multithreading Models: Comparison

The many-to-one model allows the developer to create as many user threads as desired, but true concurrency cannot be achieved because only one kernel thread can be scheduled for execution at a time.

The one-to-one model allows more concurrency, but the developer has to be careful not to create too many threads within an application.

The many-to-many model does not have these disadvantages and limitations: developers can create as many user threads as necessary, and the corresponding kernel threads can run in parallel on a multiprocessor.


Two-level Model

Similar to M:M, except that it also allows a user thread to be bound to a kernel thread.

Examples:
• IRIX
• HP-UX
• Tru64 UNIX
• Solaris 8 and earlier

[Figure: user-level threads mapped onto LWPs]

• Combination of one-to-one + “strict” many-to-many models
• Supports both bound and unbound threads
– Bound threads: permanently mapped to a single, dedicated LWP
– Unbound threads: may move among LWPs in set
• Thread creation, scheduling, synchronization done in user space
• Flexible approach, “best of both worlds”
• Used in Solaris implementation of Pthreads and several other Unix implementations (IRIX, HP-UX)


Pthreads

A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization.

The API specifies the behavior of the thread library; the implementation is up to the developers of the library.

Common in UNIX operating systems (Solaris, Linux, Mac OS X).


Java Threads

Java threads are managed by the JVM.

Java threads may be created by:
• Extending the Thread class
• Implementing the Runnable interface


[Figure: Java Thread States]


Threading Issues

• Semantics of fork() and exec() system calls
• Thread cancellation
• Signal handling
• Thread pools
• Thread specific data
• Scheduler activations


Semantics of fork() and exec()

The fork() system call is used to create a separate, duplicate process.

The semantics of the fork() and exec() system calls change in a multithreaded program.

If one thread in a program calls fork(), does the new process duplicate all the threads, or is the new process single-threaded?


Semantics of fork() and exec()

Does fork() duplicate only the calling thread or all threads?

• Some UNIX systems have two versions of fork(): one that duplicates all threads and another that duplicates only the thread that invoked the fork() system call.

• If a thread invokes the exec() system call, the program specified in the parameter to exec() will replace the entire process, including all threads.

• If exec() is called immediately after forking, then duplicating all threads is unnecessary, as the program specified in the parameters to exec() will replace the process. In this instance, duplicating only the calling thread is appropriate.

• If the separate process does not call exec() after forking, the separate process should duplicate all threads.


Thread Cancellation

Thread cancellation is the task of terminating a thread before it has finished. For example, if multiple threads are concurrently searching through a database and one thread returns the result, the remaining threads might be canceled.

A thread that is to be canceled is often referred to as the target thread.

Two general approaches:
• Asynchronous cancellation terminates the target thread immediately
• Deferred cancellation allows the target thread to periodically check whether it should be canceled, giving it an opportunity to terminate itself in an orderly fashion


Problem in Thread Cancellation

Cancellation is difficult in situations where:
• Resources have been allocated to a canceled thread
• A thread is canceled in the midst of updating data it shares with other threads

This becomes especially difficult with asynchronous cancellation. Often the operating system will reclaim system resources from a canceled thread but will not reclaim all of them. Therefore, canceling a thread asynchronously may not free a necessary system-wide resource.


Signal Handling

A signal is used in a UNIX system to notify a process that a particular event has occurred. A signal may be received either synchronously or asynchronously.

A signal handler is used to process signals. All signals follow the same pattern:

1. Signal is generated by a particular event

2. Signal is delivered to a process

3. Signal is handled



Examples of synchronous signals include illegal memory access and division by 0. If a running program performs either of these operations, a signal is generated. Synchronous signals are delivered to the same process that performed the operation that caused the signal.

When a signal is generated by an event external to a running process, that process receives the signal asynchronously. Examples of such signals include terminating a process with specific keystrokes (such as <control><C>) and having a timer expire.

Typically, an asynchronous signal is sent to another process.


Signal Handling in UNIX

Every signal may be handled by one of two possible handlers:

1. A default signal handler that is run by the kernel when handling that signal.

2. A user-defined signal handler that is called to handle a signal.

Delivering signals is more complicated in multithreaded programs, where a process may have several threads. The following options exist:
• Deliver the signal to the thread to which the signal applies
• Deliver the signal to every thread in the process
• Deliver the signal to certain threads in the process
• Assign a specific thread to receive all signals for the process


Thread Pools

The general idea is to create a number of threads at process startup and place them into a pool, where they sit and wait for work.

Advantages:
• Servicing a request with an existing thread is usually slightly faster than creating a new thread
• The number of threads in the application(s) is bounded by the size of the pool


Thread Specific Data

Allows each thread to have its own copy of data.

Useful when you do not have control over the thread creation process (e.g., when using a thread pool).


Scheduler Activations

A final issue to be considered with multithreaded programs concerns communication between the kernel and the user-level thread library.

Both the M:M and two-level models require communication to maintain the appropriate number of kernel threads allocated to the application.

Many systems implementing either the M:M or the two-level model place an intermediate data structure between the user and kernel threads. This data structure is typically known as a lightweight process (LWP).


Scheduler Activations

One scheme for communication between the user-thread library and the kernel is known as scheduler activation. It works as follows:

• The kernel provides an application with a set of virtual processors (LWPs), and the application can schedule user threads onto an available virtual processor.
• The kernel must inform the application about certain events. This procedure is known as an upcall: a communication mechanism from the kernel to the thread library.
• Upcalls are handled by the thread library with an upcall handler.
• Upcall handlers must run on a virtual processor.

This communication allows an application to maintain the correct number of kernel threads.


Operating System Examples

• Windows XP Threads
• Linux Threads


Windows XP Threads

Implements the one-to-one mapping, kernel-level.

Each thread contains:
• A thread id
• Register set
• Separate user and kernel stacks
• Private data storage area

The register set, stacks, and private storage area are known as the context of the thread.

The primary data structures of a thread include:
• ETHREAD (executive thread block)
• KTHREAD (kernel thread block)
• TEB (thread environment block)


[Figure: Windows XP Threads]


Linux Threads

• Linux refers to them as tasks rather than threads
• Thread creation is done through the clone() system call
• clone() allows a child task to share the address space of the parent task (process)


[Figure: Linux Threads]


Thread Usage

• Less time to create a new thread than a process
– The newly created thread uses the current process's address space
– No resources are attached to it
• Less time to terminate a thread than a process
• Less time to switch between two threads within the same process, because both threads use the same address space
• Less communication overhead
– Threads share everything, the address space in particular, so data produced by one thread is immediately available to all the other threads
• Performance gain
– For applications with substantial computing and substantial input/output
• Useful on systems with multiple processors

End of Chapter 4
