Distributed Systems - Kursused · Distributed Systems Operation System Support slides are adopted...

1/34

Artjom Lind ([email protected])

https://courses.cs.ut.ee/2017/ds/fall

HajussüsteemidMTAT.08.009

DistributedSystemsOperation System Support slides are adopted from:

lecture: “Operating System(OS) support” (years 2016, 2017)

● book: “Distributed Systems: Concepts and Design”, Chapter 7 (Edn5. 2012)

● book: "Operating System Concepts, Ninth Edition ", Chapter 4

●

mailto:[email protected]




2/34



Outline

● Introduction● Operating System Layers● OS Kernel functionality● Virtual Memory● OS Process schedulers● Threads





3/34



Introduction

What is the main task of any operating system ?





4/34



Introduction

● interface to hardware (CPU, memory, disk, network card, monitor)

● core services (file systems, network protocols, ...)

● manage resources– memory management (virtual memory, memory pages)– CPU “management” (processes, threads)– IO resources (sockets, opened files)

● share resources between processes– process is a running program with its set of allocated resources– protect processes from each other





5/34



Introduction

● What are the differences between:– Network Operating System– Distributed Operating System





6/34



Introduction


● Network Operating System– Built-in network capability

● Some resource accessible transparently– Network File systems (NFS, Samba …)

● Remote shell, remote desktop– Ssh, VNC, X11, Rdesktop ...

– Still the concept is:● Many interconnected physical nodes● Many autonomous OS instances





7/34



Introduction


● Distributed Operating System– One Instance of Running OS– Many interconnected physical nodes

Are there any distributed OS nowadays ?





8/34



Introduction



Are there any distributed OS nowadays ?– Why aren't they developed ?





9/34



Introduction




● Software portability ?● Architecture complexity, hence lower performance?● Users in fact want to preserve autonomy for their PCs!





10/34



Operating System Layers

● A code may be running in different Privilege levels:– kernel level (tier 0) allows to access any resource,– user level (tier 3) restricts access to resources

● System calls are kernel “functions” to be called from user level– open() system call

● ...returns a file descriptor, a small, nonnegative integer for use in subsequent system calls (read(2), write(2)...

– http://man7.org/linux/man-pages/man2/syscalls.2.html

● Memory - the resource a program can access without syscall– however, it does not access physical memory but virtual

memory– mapping from virtual to physical is still controlled from the

kernel



http://man7.org/linux/man-pages/man2/syscalls.2.html






11/34




● User-Space– Applications, Services, Libraries …

● Kernel-Space – Process manager– Memory manager, Thread manager– Communication manager– Supervisor





12/34



OS Kernel functionality





13/34




● Process manager: Creation of and operations upon processes. A process is a unit of resource management, including an address space and one or more threads.

● Thread manager: Thread creation, synchronization and scheduling. Threads are scheduled activities attached to processes

● Memory manager: Management of physical and virtual memory.● Supervisor: Dispatching of interrupts, system call traps and other

exceptions; control of memory management unit and hardware caches; processor and floating-point unit register manipulations.

– (hal32.dll , hal.dll?)





14/34



Virtual Memory

● Why is it called “virtual” ?● What in fact happens to the memory when the process is

started ?● What is address space ?● What is fragmentation ?● What is paging ?





15/34



Virtual Memory

● Every process has its address space– e.g. address 42 on processes A and B is

● Address space is split into regions:– text (code), stack, heap– memory mapped files

● e.g. dynamic library (DLL) code– *How by the way DLL is

different from shared library– shared memory regions– try cat /proc/<PID>/maps

● Problem: fragmentation– if P2 exits, can we fit something into the green region place in

memory?– solution: paging





16/34



Memory Pages

● mapping virtual (VM) to physical memory (PM)– pages are 4kB (4MB) chunks– process Page Table contains the mapping– TLB cache speeds up translation

● allocating VM region does nothing to PM– read/write results in page fault– page fault handler decides what to do

● just allocate page in PM and update● process page table

– load data for memory mapped files● in case of low memory some pages are offloaded

– read-only mapped files can be discarded– data pages are written into swap

● VIRT,RES and SHR in top





17/34



Processes and Treads

● Process ?● Thread ?





18/34




● Process – Unit of Resource– Detailed definition ?

● Thread - Unit of Execution and Scheduling– Detailed definition ?

● Kernel thread ?● User-level thread ?

– Green-thread ?– Thread models (User-Level to Kernel Level)





19/34



Process Context

● What does the typical process consist of ?





20/34



Process Context

● Typical process consists of:– currently executing command

● (Program Counter)– Stack Pointer (SP)– general purpose registers,

floating point registers– memory management

registers● point to the

information about process memory regions and page tables





21/34



Context switch

● Any ideas what is it ?● How does it work ?● Why is it said it is an expensive action ?





22/34



Context switch

● Scheduler tries to run M processes on N cores– gives each process time slice to execute

● Switching - can only be done in kernel (privileged) mode– requires a system call

● pre-emptive: as the result of CPU hardware exception● side effect of system call, e.g. in blocking in recv()

– store old process context– find new process to execute (kernel contains list of processes)– load new process context (where it left the last time)– return to user mode

● TLB cache is invalidated (not with threads)● CPU caches are also largely invalidated (unless threads reuse locality)● All in all, it is pretty expensive action





23/34



Creating Process

● New processes are created with two system calls:– fork() system call creates almost exact copy of the current process

● information about memory regions is copied● page tables are copied

– however, physical pages are not copied immediately– copy-on-write

● exeve() system call runs another program in the current process– memory regions are cleared and replaced from the new executable

● new code, data, stack regions, opened files and libraries– page tables are discarded

● new information is filled as the program is loaded into physical memory





24/34



Process Schedulers

● What is responsible for ? – For scheduling eventually :)– But what exactly is it doing ?

● What are different kinds of process schedulers ?– Regarding the number of CPUs or Cores in on CPU?– Regarding the scheduling polices ?





25/34



Process Schedulers

● Responsible for deciding if current process can continue or it should yield to the next process.

● Two kinds of schedulers exist– Preemptive

● The decision is usually made based on events:– Current process is in waiting state (it issued some blocking

I/O request)– Current process terminates– A timer interrupt is invoking scheduler to put current

process from running to ready state– Blocking I/O operation is complete and the process may be

put back into running state– Cooperative

● scheduler cannot take the CPU away from a process – Batch processing systems (run-to-completion)





26/34



Threads

● Traditional ( heavyweight ) processes have a single thread of control - There is one program counter, and one sequence of instructions that can be carried out at any given time.

● Multi-threaded applications have multiple threads within a single process, each having their own program counter, stack and set of registers, but sharing common code, data, and certain structures such as open files.





27/34



Threads





28/34



OS Threading Models

● There are two types of threads to be managed in a modern system: User-space threads and kernel threads.

● User-space threads are supported above the kernel, without kernel support. These are the threads that application programmers would put into their programs.

● Kernel threads are supported within the kernel of the OS itself. All modern OS-es support kernel level threads, allowing the kernel to perform multiple simultaneous tasks and/or to service multiple kernel system calls simultaneously.

● In a specific implementation, the user threads must be mapped to kernel threads, using one of the following strategies.





29/34



Many-To-One Model● In the many-to-one model, many user-level

threads are all mapped onto a single kernel thread.

● Thread management is handled by the thread library in user space, which is very efficient.

● However, if a blocking system call is made, then the entire process blocks, even if the other user threads would otherwise be able to continue.

● Because a single kernel thread can operate only on a single CPU, the many-to-one model does not allow individual processes to be split across multiple CPUs.

● Green threads for Solaris and GNU Portable Threads implement the many-to-one model in the past, but few systems continue to do so today.





30/34



One-To-One Model● The one-to-one model creates a separate kernel

thread to handle each user thread.● One-to-one model overcomes the problems

listed above involving blocking system calls and the splitting of processes across multiple CPUs.

● However the overhead of managing the one-to-one model is more significant, involving more overhead and slowing down the system.

● Most implementations of this model place a limit on how many threads can be created.

● Linux and Windows from 95 to XP implement the one-to-one model for threads





31/34



Many-To-Many Model● The many-to-many model multiplexes any

number of user threads onto an equal or smaller number of kernel threads, combining the best features of the one-to-one and many-to-one models.

● Users have no restrictions on the number of threads created.

● Blocking kernel system calls do not block the entire process.

● Processes can be split across multiple processors.

● Individual processes may be allocated variable numbers of kernel threads, depending on the number of CPUs present and other factors.





32/34



Two-tier model● is the two-tier model, which allows either many-

to-many or one-to-one operation.● IRIX, HP-UX, and Tru64 UNIX use the two-tier

model, as did Solaris prior to Solaris 9.





33/34



Threads Libraries

● POSIX Threads or Pthreads– The POSIX standard ( IEEE 1003.1c ) defines the specification for

pThreads, not the implementation.– pThreads are available on Solaris, Linux, Mac OSX, Tru64, and via

public domain shareware for Windows.– Global variables are shared amongst all threads.

● Windows Threads– Similar to pThreads, just the API is different

● Java Threads– JVM runs on top of a native OS, and that the JVM specification does

not specify what model to use for mapping Java threads to kernel threads. This decision is JVM implementation dependent, and may be one-to-one, many-to-many, or many to one..

● What about Python Threads ? What does it in fact using ? And what model ?





34/34



Read at Home

● Read / work through the Textbook Chapter 8 -- Distributed objects and components

● Some questions to address:– 1. What are the key differences in objects and distributed

objects?– 2. What are the key concepts of CORBA?– 3. What are the issues with object-oriented middle-ware?– 4. What has the concept of software components in offer to

overcome these issues?





1/34



HajussüsteemidMTAT.08.009

DistributedSystemsOperation System Support slides are adopted from:

lecture: “Operating System(OS) support” (years 2016, 2017)

● book: “Distributed Systems: Concepts and Design”, Chapter 7 (Edn5. 2012)

● book: "Operating System Concepts, Ninth Edition ", Chapter 4

●

2/34



Outline

● Introduction● Operating System Layers● OS Kernel functionality● Virtual Memory● OS Process schedulers● Threads

3/34



Introduction

What is the main task of any operating system ?

4/34



Introduction

● interface to hardware (CPU, memory, disk, network card, monitor)

● core services (file systems, network protocols, ...)

● manage resources– memory management (virtual memory, memory pages)– CPU “management” (processes, threads)– IO resources (sockets, opened files)

● share resources between processes– process is a running program with its set of allocated resources– protect processes from each other

5/34



Introduction


6/34



Introduction


● Network Operating System– Built-in network capability

● Some resource accessible transparently– Network File systems (NFS, Samba …)

● Remote shell, remote desktop– Ssh, VNC, X11, Rdesktop ...

– Still the concept is:● Many interconnected physical nodes● Many autonomous OS instances

7/34



Introduction



Are there any distributed OS nowadays ?

8/34



Introduction




9/34



Introduction




● Software portability ?● Architecture complexity, hence lower performance?● Users in fact want to preserve autonomy for their PCs!

10/34




● A code may be running in different Privilege levels:– kernel level (tier 0) allows to access any resource,– user level (tier 3) restricts access to resources

● System calls are kernel “functions” to be called from user level– open() system call

● ...returns a file descriptor, a small, nonnegative integer for use in subsequent system calls (read(2), write(2)...

– http://man7.org/linux/man-pages/man2/syscalls.2.html

● Memory - the resource a program can access without syscall– however, it does not access physical memory but virtual

memory– mapping from virtual to physical is still controlled from the

kernel

11/34




● User-Space– Applications, Services, Libraries …

● Kernel-Space – Process manager– Memory manager, Thread manager– Communication manager– Supervisor

12/34




13/34




● Process manager: Creation of and operations upon processes. A process is a unit of resource management, including an address space and one or more threads.

● Thread manager: Thread creation, synchronization and scheduling. Threads are scheduled activities attached to processes

● Memory manager: Management of physical and virtual memory.● Supervisor: Dispatching of interrupts, system call traps and other

exceptions; control of memory management unit and hardware caches; processor and floating-point unit register manipulations.

– (hal32.dll , hal.dll?)

14/34



Virtual Memory

● Why is it called “virtual” ?● What in fact happens to the memory when the process is

started ?● What is address space ?● What is fragmentation ?● What is paging ?

15/34



Virtual Memory

● Every process has its address space– e.g. address 42 on processes A and B is

● Address space is split into regions:– text (code), stack, heap– memory mapped files

● e.g. dynamic library (DLL) code– *How by the way DLL is

different from shared library– shared memory regions– try cat /proc/<PID>/maps

● Problem: fragmentation– if P2 exits, can we fit something into the green region place in

memory?– solution: paging

16/34



Memory Pages

● mapping virtual (VM) to physical memory (PM)– pages are 4kB (4MB) chunks– process Page Table contains the mapping– TLB cache speeds up translation

● allocating VM region does nothing to PM– read/write results in page fault– page fault handler decides what to do

● just allocate page in PM and update● process page table

– load data for memory mapped files● in case of low memory some pages are offloaded

– read-only mapped files can be discarded– data pages are written into swap

● VIRT,RES and SHR in top

17/34




● Process ?● Thread ?

18/34




● Process – Unit of Resource– Detailed definition ?

● Thread - Unit of Execution and Scheduling– Detailed definition ?

● Kernel thread ?● User-level thread ?

– Green-thread ?– Thread models (User-Level to Kernel Level)

19/34



Process Context

● What does the typical process consist of ?

20/34



Process Context

● Typical process consists of:– currently executing command

● (Program Counter)– Stack Pointer (SP)– general purpose registers,

floating point registers– memory management

registers● point to the

information about process memory regions and page tables

21/34



Context switch

● Any ideas what is it ?● How does it work ?● Why is it said it is an expensive action ?

22/34



Context switch

● Scheduler tries to run M processes on N cores– gives each process time slice to execute

● Switching - can only be done in kernel (privileged) mode– requires a system call

● pre-emptive: as the result of CPU hardware exception● side effect of system call, e.g. in blocking in recv()

– store old process context– find new process to execute (kernel contains list of processes)– load new process context (where it left the last time)– return to user mode

● TLB cache is invalidated (not with threads)● CPU caches are also largely invalidated (unless threads reuse locality)● All in all, it is pretty expensive action

23/34



Creating Process

● New processes are created with two system calls:– fork() system call creates almost exact copy of the current process

● information about memory regions is copied● page tables are copied

– however, physical pages are not copied immediately– copy-on-write

● exeve() system call runs another program in the current process– memory regions are cleared and replaced from the new executable

● new code, data, stack regions, opened files and libraries– page tables are discarded

● new information is filled as the program is loaded into physical memory

24/34



Process Schedulers

● What is responsible for ? – For scheduling eventually :)– But what exactly is it doing ?

● What are different kinds of process schedulers ?– Regarding the number of CPUs or Cores in on CPU?– Regarding the scheduling polices ?

25/34



Process Schedulers

● Responsible for deciding if current process can continue or it should yield to the next process.

● Two kinds of schedulers exist– Preemptive

● The decision is usually made based on events:– Current process is in waiting state (it issued some blocking

I/O request)– Current process terminates– A timer interrupt is invoking scheduler to put current

process from running to ready state– Blocking I/O operation is complete and the process may be

put back into running state– Cooperative

● scheduler cannot take the CPU away from a process – Batch processing systems (run-to-completion)

26/34



Threads

● Traditional ( heavyweight ) processes have a single thread of control - There is one program counter, and one sequence of instructions that can be carried out at any given time.

● Multi-threaded applications have multiple threads within a single process, each having their own program counter, stack and set of registers, but sharing common code, data, and certain structures such as open files.

27/34



Threads

28/34



OS Threading Models

● There are two types of threads to be managed in a modern system: User-space threads and kernel threads.

● User-space threads are supported above the kernel, without kernel support. These are the threads that application programmers would put into their programs.

● Kernel threads are supported within the kernel of the OS itself. All modern OS-es support kernel level threads, allowing the kernel to perform multiple simultaneous tasks and/or to service multiple kernel system calls simultaneously.

● In a specific implementation, the user threads must be mapped to kernel threads, using one of the following strategies.

29/34



Many-To-One Model● In the many-to-one model, many user-level

threads are all mapped onto a single kernel thread.

● Thread management is handled by the thread library in user space, which is very efficient.

● However, if a blocking system call is made, then the entire process blocks, even if the other user threads would otherwise be able to continue.

● Because a single kernel thread can operate only on a single CPU, the many-to-one model does not allow individual processes to be split across multiple CPUs.

● Green threads for Solaris and GNU Portable Threads implement the many-to-one model in the past, but few systems continue to do so today.

30/34



One-To-One Model● The one-to-one model creates a separate kernel

thread to handle each user thread.● One-to-one model overcomes the problems

listed above involving blocking system calls and the splitting of processes across multiple CPUs.

● However the overhead of managing the one-to-one model is more significant, involving more overhead and slowing down the system.

● Most implementations of this model place a limit on how many threads can be created.

● Linux and Windows from 95 to XP implement the one-to-one model for threads

31/34



Many-To-Many Model● The many-to-many model multiplexes any

number of user threads onto an equal or smaller number of kernel threads, combining the best features of the one-to-one and many-to-one models.

● Users have no restrictions on the number of threads created.

● Blocking kernel system calls do not block the entire process.

● Processes can be split across multiple processors.

● Individual processes may be allocated variable numbers of kernel threads, depending on the number of CPUs present and other factors.

32/34



Two-tier model● is the two-tier model, which allows either many-

to-many or one-to-one operation.● IRIX, HP-UX, and Tru64 UNIX use the two-tier

model, as did Solaris prior to Solaris 9.

33/34



Threads Libraries

● POSIX Threads or Pthreads– The POSIX standard ( IEEE 1003.1c ) defines the specification for

pThreads, not the implementation.– pThreads are available on Solaris, Linux, Mac OSX, Tru64, and via

public domain shareware for Windows.– Global variables are shared amongst all threads.

● Windows Threads– Similar to pThreads, just the API is different

● Java Threads– JVM runs on top of a native OS, and that the JVM specification does

not specify what model to use for mapping Java threads to kernel threads. This decision is JVM implementation dependent, and may be one-to-one, many-to-many, or many to one..

● What about Python Threads ? What does it in fact using ? And what model ?

34/34



Read at Home

● Read / work through the Textbook Chapter 8 -- Distributed objects and components

● Some questions to address:– 1. What are the key differences in objects and distributed

objects?– 2. What are the key concepts of CORBA?– 3. What are the issues with object-oriented middle-ware?– 4. What has the concept of software components in offer to

overcome these issues?

Distributed Systems - Kursused · Distributed Systems Operation System Support slides are adopted...

Documents

Transcript of Distributed Systems - Kursused · Distributed Systems Operation System Support slides are adopted...