LectureCA All Slides


Transcript of LectureCA All Slides

Page 1: LectureCA All Slides


Slide 1 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Slide 2 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

This is the collection of lecture slides* of the lecture „Computer Architecture“ taught in the winter semester 06/07 at the University of Duisburg-Essen. I slightly revised the surveys of the subjects and added slide numbers.

* Actually, this is the internet version of the lecture slides. With respect to the slides used in the lectures, animations are removed (errors hopefully as well) and additional text is added.

Stefan Freinatis, March 2007

Slide 3 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Lecture:
Dr.-Ing. Stefan Freinatis
Fachgebiet Verteilte Systeme (Prof. Geisselhardt)
Room BB 1017

Exercises:
Dipl.-Math. Kerstin Luck
Fachgebiet Verteilte Systeme
Room BB 910

Slide 4 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Times & Dates Computer Architecture

1. 25.10.06
(01.11.06 All Saints' Day, public holiday in NRW, no lectures)
2. 08.11.06
3. 15.11.06
4. 22.11.06
5. 29.11.06
6. 06.12.06
7. 13.12.06
8. 20.12.06
9. 10.01.07
10. 17.01.07
11. 24.01.07
12. 31.01.07
13. 07.02.07

Lecture: 08:15 – 09:45
Exercise: 10:00 – 10:45


Slide 5 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Resources Computer Architecture

Homepage „Verteilte Systeme“:
http://www.fb9dv.uni-duisburg.de/vs/de/index.htm

Direct link to the homepage of the lecture „Computer Architecture“:
http://www.fb9dv.uni-duisburg.de/vs/en/education/dv3/index2006.htm

Select ‚English‘ → ‚Lectures‘ → ‚Winter semester 2006/2007‘ → ‚Computer Architecture‘

Slide 6 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Topics

Introduction & History

1. Operating Systems (slide 34)
System layers, batching, multi-programming, time sharing

2. File Systems (slide 65)
Storage media, files & directories, disk scheduling

3. Process Management (slide 151)
Processes, threads, IPC, scheduling, deadlocks

4. Memory Management (slide 351)
Memory, paging, segmentation, virtual memory, caches

Slide 7 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Literature

[HP03] J. Hennessy, D. Patterson: Computer Architecture – A Quantitative Approach, 3rd ed., Elsevier Science, 2003, ISBN 1-55860-724-2.

[HP06] J. Hennessy, D. Patterson: Computer Architecture – A Quantitative Approach, 4th ed., Elsevier Science, 2006, ISBN 0-12-370490-1.

[Ta01] A. Tanenbaum: Modern Operating Systems, 2nd ed., Prentice Hall, 2001, ISBN 0-13-092641-8.

[Sil00] A. Silberschatz: Applied Operating System Concepts, 1st ed., John Wiley & Sons, 2000, ISBN 0-471-36508-4.

Slide 8 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Introduction


Slide 9 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Introduction

Computer Architecture is the conceptual design and fundamental operational structure of a computer system [Wikipedia].

Computer Architecture encompasses [HP03 p.9]:

• Instruction set architecture
stack or accumulator or general purpose register architecture

• Organization
memory system, bus structure, CPU design

• Hardware
machine specifics, logic design, technology

Slide 10 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Application Areas

General purpose desktops
balanced performance for a range of tasks, graphics, video, audio

Scientific desktops and servers
high-performance floating point and graphics

Commercial servers
databases, transaction processing, highly reliable

Embedded computing
low power, small size, safety critical

Slide 11 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer

Introduction

A computer is a person or an apparatus that is capable of processing information by applying calculation rules.

A computer is a machine for manipulating data according to a list of instructions known as a program [Wikipedia].

Generalized, technology-independent definition.

Slide 12 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

~ 5000 BC Basis of calculating is counting.
10 fingers ⇒ decimal system

~ 1000 BC Abacus (Suan Pan, Soroban)

Chinese Suan Pan, Roman Abacus


Slide 13 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

← Book from 1958

See also: http://www.ee.ryerson.ca/~elf/abacus/leeabacus/

Finger technique (from a Japanese book, 1954) ↓

Slide 14 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

300 BC … 1000 AD Roman numeral system
addition system, no zero

Numeral  Value
I            1
V            5
X           10
L           50
C          100
D          500
M         1000

Value ‚19‘: XVIIII or XIX

Not suitable for performing multiplications.

Slide 15 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

~ 500 AD Hindu-Arabic numeral system,
place value system, introduction of 0

Indian (3rd century bc)

Indian (8th century)

West-Arabic (11th century)

European (15th century)

European (16th century)

Today

Forms the basis for the development of calculation on machines.

Slide 16 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

1623 Wilhelm Schickard
Calculation machine

1641 Blaise Pascal
Adding machine

1679 G. W. Leibniz
Dyadic system (binary system)

1808 J. M. Jacquard
Punch card controlled loom


Slide 17 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

1833 Charles Babbage
Analytical Engine

• Data memory, program memory

• Instruction based operation

• Conditional jumps

• I/O unit

Slide 18 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

1847 George Boole
Logic on mathematical statements

1890 H. Hollerith
Punch card based tabulating machine

Digital data logging on punch cards.
First electromechanical data processing.

Slide 19 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

1936 Alan Turing
Philosophy of information, Turing machine
Founder of Computer Science

1941 Konrad Zuse
First electromechanical computer Z3
Binary arithmetic, floating point

Z3 rebuilt in 1961

Slide 20 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

Characteristics of the first 5 operative digital computers

Computer                   Nation   Shown working  Digital  Binary  Electronic  Programmable
Zuse Z3                    Germany  May 1941       Yes      Yes     No          By punched film stock
Atanasoff-Berry Computer   USA      Summer 1941    Yes      Yes     Yes         No
Colossus                   UK       1943           Yes      Yes     Yes         Partially, by rewiring
Harvard Mark I             USA      1944           Yes      No      No          By punched paper tape
ENIAC                      USA      1944           Yes      No      Yes         Partially, by rewiring
ENIAC (1948 modification)  USA      1948           Yes      No      Yes         By function table ROM

Information source: Wikipedia on „Z3“ or on „ENIAC“, English


Slide 21 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

History

Introduction

1945 John v. Neumann
Concept of universal computer systems
Founder of Computer Architecture

The von Neumann model of a universal computer (stored program computer): Input and Output exchange data with Memory; the Control unit fetches instructions from Memory and issues control signals to all units; the ALU exchanges data with Memory.

Slide 22 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

v. Neumann Model

Introduction

A computer consists of 5 units:

• Control Unit
Interpretation of the program. Timing control of the units.

• Memory
Storage for program and data. Addressable storage locations. Read / Write.

• ALU (Arithmetic Logic Unit)
Performs calculations.

• Input Unit
• Output Unit
Communication with the environment

Slide 23 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

v. Neumann Model

Introduction

Microcomputer: the microprocessor (CPU) is connected to memory and to input / output devices (keyboard, monitor, ...) via address, data, and control buses.

Today:
• The input unit and output unit are combined (not necessarily physically!) to form the Input/Output unit (short: I/O unit).
• The control unit and the ALU are combined to form the microprocessor.

The v. Neumann model (or architecture) basically still applies to the majority of modern computer systems.

Slide 24 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Characteristics

von Neumann Model

• Architecture is independent of the problem to be processed
Universal stored program computer, not tailored to a specific problem.

• Random accessible memory locations
Selection of a location by means of an address. All locations have the same capacity.

• Both program and data reside in memory
The state of the machine (control unit) decides whether the content of a memory location is considered data or code.

• Computer is centrally controlled
The CPU has the master role.

• Sequential processing
Execution of a program is done instruction by instruction.


Slide 25 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

v. Neumann Model

Introduction

Steps in executing an instruction:

Instruction phase
1. Fetch the instruction from memory and put it into the instruction register (in the CPU).
2. Evaluate (decode) the instruction.

Data phase
3. When needed for this particular instruction, address the data (the operands) in memory.
4. Fetch the data (usually into CPU internal registers).
5. Perform the operation on the data (usually carried out by the ALU) and write back the results.
6. Adjust the address counter to point to the next instruction.
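The steps above can be sketched as a toy fetch-decode-execute loop; the instruction format and opcodes below are invented for illustration and do not correspond to any real ISA:

```python
# Toy stored-program machine: one dict serves as memory for both code and data.
# Instruction format (invented for illustration): (opcode, operand_address).
def run(memory, pc=0):
    acc = 0  # accumulator register inside the "CPU"
    while True:
        opcode, addr = memory[pc]      # step 1: fetch instruction into the CPU
        # step 2: decoding is the dispatch below
        if opcode == "LOAD":           # steps 3+4: address and fetch the operand
            acc = memory[addr]
        elif opcode == "ADD":          # step 5: the "ALU" performs the operation
            acc = acc + memory[addr]
        elif opcode == "STORE":        # step 5: write back the result
            memory[addr] = acc
        elif opcode == "HALT":
            return memory
        pc += 1                        # step 6: advance the address counter

# Program computing C = A + B, with A, B, C at addresses 10, 11, 12.
mem = {0: ("LOAD", 10), 1: ("ADD", 11), 2: ("STORE", 12), 3: ("HALT", 0),
       10: 2, 11: 3, 12: 0}
run(mem)
print(mem[12])  # 5
```

Note how code and data share the same memory: only the state of the machine decides whether a location is interpreted as an instruction or as an operand.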

Slide 26 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Bus System

v. Neumann Bottleneck

Introduction

Memory accesses in executing C = A + B (A, B, C: data in memory). Over time, the CPU side and the memory side exchange over the address bus and data bus: address of instruction → instruction, address of A → A, address of B → B, address of C → C.

Slide 27 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

v. Neumann Bottleneck

Introduction

The data is processed faster by the CPU than it can be taken from or stored in memory.

The processor↔memory interface is crucial for the overall computation performance.

Reduction of the bottleneck effect through introduction of a hierarchical memory organization.

Register – Cache – Main memory

Slide 28 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Performance

Performance: the work done in a certain amount of time.

Introduction

Performance is defined like power: P = W / t

Work can have the meaning of
• processing an instruction,
• carrying out a floating-point or an integer operation,
• processing a standardized program (benchmark).


Slide 29 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Performance

• Clock rate [Hz]The frequency at which the CPU is clocked.

• MIPSMillion instructions per second

• FLOPSFloating point operations per second

Introduction

Popular performance measures
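How these measures relate can be shown with a small calculation; all figures below are invented for illustration and describe a hypothetical CPU:

```python
# Hypothetical CPU, invented figures: 2 GHz clock, 4 clock cycles per instruction.
clock_rate_hz = 2.0e9
cycles_per_instruction = 4.0

instructions_per_second = clock_rate_hz / cycles_per_instruction
mips = instructions_per_second / 1e6   # million instructions per second
print(mips)  # 500.0

# If, say, one in ten instructions is a floating point operation:
flops = instructions_per_second * 0.1
print(flops)  # 50000000.0, i.e. 50 MFLOPS
```

The same clock rate yields very different MIPS figures depending on the cycles per instruction, which is one reason these measures are of limited expressiveness.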

Slide 30 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Performance

Introduction

Many performance measures are not very expressive, as they do not

• consider the number of instructions being carried out per cycle (parallel execution),

• cover the effective throughput between CPU and memory,

• distinguish between complex instruction set computers (CISC) and reduced instruction set computers (RISC).

Slide 31 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Performance

Introduction

Computer performance compared to a VAX-11/780 from 1978.

Figure from [HP06 p.3]

Slide 32 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Moore‘s Law

Gordon Moore empirically observed in 1965 that the number of transistors on a chip doubles approximately every 12 months.

In 1975 he revised his prediction to the number of transistors on a chip doubling every two years.

See also: www.thocp.net/biographies/papers/moores_law.htm

Moore‘s Law: N(t) ≈ N0 · 10^(0.15 · t), where t is in [years]

Gordon E. Moore

Introduction
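The formula can be checked numerically: since 10^(0.15 · 2) ≈ 2, it encodes a doubling roughly every two years. A short sketch (the starting count N0 is arbitrary):

```python
def transistors(n0, t_years):
    # Moore's Law as stated on the slide: N(t) ≈ N0 * 10**(0.15 * t)
    return n0 * 10 ** (0.15 * t_years)

two_year_factor = transistors(1000, 2) / 1000
print(round(two_year_factor, 3))  # 1.995, i.e. approximately a doubling
```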


Moore‘s Law

Image source: Wikipedia on „Moore‘s Law“, English

Slide 34 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Operating Systems
System layers (36)
Early computer systems (42)
Batch systems (46)
Multi-program systems (50)
Time sharing systems (54)
Modern systems (57)

Slide 35 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Operating Systems

An operating system is a program that acts as an intermediary between a user of a computer and the computer hardware [Sil00 p.3].

Purpose: provision of an environment in which a user can execute programs.

Objectives:

• to make the system convenient to use
Usability, extending the machine beyond low level hardware programming

• to use the hardware in an efficient manner
Resource management, manage the hardware allocation among different programs

Slide 36 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Operating Systems

Computer system layers. Figure from [Sil00 p.4]


Slide 37 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

System Layers

1. Hardware – provides basic computing resources: CPU, memory, I/O, and devices connected to I/O.

2. Operating system – coordinates the use of the hardware among the various application programs for the various users.

3. Application programs – the programs used to solve the computing problems of the users.

4. Users – people, machines, or other computers using the computer system.

Operating systems

Slide 38 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

System Layers

Operating systems

Computer system layers. Figure from lecture CA WS 05/06, original source unknown

Slide 39 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Operating Systems

Usability – the operating system as an Extended Machine

The architecture of most computers at the machine language level is awkward to program, especially for I/O.

The operating system

• shields the programmer from the hardware details,

• provides simple(r) interfaces,

• offers high level abstractions and, in this view, presents the user with the equivalent of an extended machine.

See also [Ta01 p.4]

Slide 40 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Operating Systems

Resource Management – the operating system as a Resource Manager

Computer resources: processor(s), memory, timers, disks, network interfaces, printer, graphics card, ...

The operating system

• keeps track of who is using which resource,

• grants or denies resource requests,

• accounts for the usage of resources.

See also [Ta01 p.5]


Slide 41 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Resource Management

Operating systems

Resource management may be divided into

• time management (e.g. CPU time, printer time),

• and space management (e.g. memory or disk space).

Resource management incorporates

• process management,

• memory management,

• file system management,

• device management.

Before going into these subjects, let‘s have a look at the computer development since 1945.

Slide 42 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Early Computer Systems

Operating systems

First computer generation (1945 – 55)

• Vacuum tubes

• A single group of people did all the work
design, construction, programming, operating, maintenance

• Programming in machine language
plugboard, no programming languages

• Users directly interact with the computer system

• Programs directly interact with the hardware

• No operating system

Slide 43 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Early Computer Systems

Operating systems

First computer generation (1945 – 55)

IBM 407 Accounting Machine
Electromechanical tabulator

Source: http://www.columbia.edu/acis/history/407.html

Wiring panel (plugboard)

Slide 44 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Early Computer Systems

Operating systems

First computer generation (1945 – 55)

IBM 402 plugboard

Source: http://www.columbia.edu/acis/history/plugboard.html


Slide 45 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Early Computer Systems

Operating systems

First computer generation (1945 – 55)

Slide 46 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Batch Systems

Operating systems

Second computer generation (1955 – 65)

• Transistors, mainframe computers

• First high level programming languages
Fortran (Formula Translation), Algol (Algorithmic Language), Lisp (List Processing)

• No direct user interaction with the computer
Everything went via the computer operators.

• Users submit jobs to the operator
job = program + data + control information

• Operator batched jobs
Composition of jobs with similar needs

Slide 47 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Batch Systems

Operating systems

Figure from [Ta01 p.9]

Structure of a typical FMS (Fortran Monitor System) batch job

Second computer generation (1955 – 65)

Slide 48 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Batch Systems

Operating systems

Figure from [Ta01 p.8]

Batch job processing scene [Tanenbaum]

Second computer generation (1955 – 65)


Slide 49 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Batch Systems

Operating systems

Second computer generation (1955 – 65)

• Resident monitor program in memory
The monitor program loads one job after another (from tape). (Memory: monitor program + one job.)

• Sequenced job input
Jobs from tape or from a card reader. The monitor program cannot select jobs on its own.

• One job in memory at a time

• CPU often idle
waiting for slow I/O devices

Slide 50 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multiprogram Systems

Operating systems

Third computer generation (1965 – 80)

• Integrated Circuits

• Disks
Direct access to several jobs on disk. Now the operating system can select jobs (job scheduling).

• Multiprogrammed Batch Systems
Several jobs in memory at the same time (Memory: operating system + jobs 1 ... 4).
The operating system shares CPU time among the jobs (CPU scheduling).
Better CPU utilization

Slide 51 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multiprogram Systems

Operating systems

Assume program A being executed on a single-program computer. The program needs two I/O operations. CPU usage over time: A1, I/O, A2, I/O, A3.

Assume program B being executed on the same computer at some other time. The program needs no I/O. CPU usage over time: B1, B2, B3.

Slide 52 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multiprogram Systems

Operating systems

Now assume programs A and B being executed on a multi-program computer. While A waits for its I/O operations, the CPU executes B. CPU usage over time: A1, B1 (during A's first I/O), A2, B2 (during A's second I/O), A3, B3.

Total execution time on a single-program computer: all CPU bursts and I/O waits in sequence.

Total execution time on a multi-program computer: shorter, since B's CPU bursts overlap with A's I/O waits.
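The gain can be quantified with a small calculation; the durations below are invented for illustration, and the sketch simplifies by assuming B's work can fill A's I/O wait time completely:

```python
# Invented durations in ms, following the slide's structure:
# program A runs as A1, I/O, A2, I/O, A3; program B runs as B1, B2, B3 (no I/O).
a_cpu = [20, 20, 20]   # CPU bursts A1, A2, A3
a_io = [30, 30]        # A's two I/O waits
b_cpu = [25, 25, 25]   # CPU bursts B1, B2, B3

# Single-program computer: A (including its I/O waits) completes, then B runs.
single = sum(a_cpu) + sum(a_io) + sum(b_cpu)

# Multi-program computer: while A waits for I/O, the CPU executes B, so up to
# sum(a_io) milliseconds of B's work cost no extra wall-clock time.
overlap = min(sum(b_cpu), sum(a_io))
multi = single - overlap

print(single, multi)  # 195 135
```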


Slide 53 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multiprogram Systems

Operating systems

Third computer generation (1965 – 80)

• Multiprogram computers were still batch systems

• Desire for quicker response time
It took hours/days until the output was ready. A single misplaced comma could cause a compilation to fail, and the programmer wasted half a day [Ta01 p.11].

• Desire for interactivity
Users wanted to have the machine ‘for themselves’, working online.

⇒ These requests paved the way for time sharing systems (still in the third computer generation)

Slide 54 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Time Sharing Systems

Operating systems

Third computer generation (1965 – 80)

• Direct user interaction
Many users share a computer simultaneously. Terminals ↔ Host.

• Many jobs awaiting execution

• Multiple job execution with high frequency switching
The operating system must provide more sophisticated CPU scheduling.

• Disk as backing store for memory
Virtual memory. The operating system performs swapping, address translation, and memory protection (memory management).

• Disk as input / output storage
Need for the OS to manage user data (file system management)

Slide 55 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Time Sharing Systems

Assume programs A and B as previously. Execution on a time sharing system: the CPU switches between A and B in small time slices; in the figure, program B has finished while A still waits for I/O (CPU idle).

Time sharing is not necessarily faster. Compare to the multiprogramming example.

Small time slices allow for interactivity (quasi-parallel execution).

Slide 56 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Layout

Batch system: memory holds the monitor program and one job.

Multi program system: memory holds the operating system and several jobs (job 1 ... job 4).

Time sharing system: memory holds the operating system and many programs (program 1 ... program n); only a part of them fits into working memory at a time.


Slide 57 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Modern Systems

Operating systems

Fourth computer generation (1980 – present)

• Single-chip CPUs

• Personal Computers

• Real-Time Systems

• Multiprocessor Systems

• Distributed Systems

• Embedded Systems

CP/M

MS-DOS, DR-DOS

Windows 1.0 ... Windows 98 / ME

Windows NT 4.0 ... 2003, XP

XENIX, MINIX, Linux, FreeBSD

Slide 58 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real Time Systems

Modern systems

• Rigid time requirements

• Hard Real Time
Industrial control & robotics
Guaranteed response times
Slimmed OS features (no virtual memory)

• Soft Real Time
Multimedia, virtual reality
Less restrictive time requirements

Slide 59 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multiprocessor Systems

Modern systems

• n processors in the system (n > 1), tightly coupled

• Resource sharing

• Symmetric Multiprocessing
Each CPU runs an identical copy of the OS.
All CPUs are peers (no master-slave).

• Asymmetric Multiprocessing
Each CPU is assigned a specific task.
Task assignment by a master CPU.

Slide 60 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Distributed Systems

Modern systems

• n computers/processors (n > 1), loosely coupled

• Individual computers

• Autonomous operation

• Communication via network
File sharing, message exchange

• Network Operating System


Slide 61 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Embedded Systems

Modern systems

• Dedicated to specific tasks

• Encapsulated in a host device
invisible, usually not repaired when defective

• Small in size, low energy

• Sometimes safety-critical
automotive drive-by-wire, medical apparatus

• Custom(ized) operating system
Little or no file I/O, sometimes multitasking, no fancy OSs.

Slide 62 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Resource Management

Operating systems

• Process management
Creation of processes (programs in execution) and sharing the CPU among them. Control of execution time. Enabling communication between processes.

• Memory management
Assigning memory areas to processes. Organizing virtual memory.

• File system management
Creation and organization of a logical storage location where data (user data, system data, programs) can be persistently stored in terms of files. Assigning rights and managing accesses. Maintenance.

• Device management
Low level administrative work related to the specifics of the I/O devices. Translations, low level stream processing. Usually done by device drivers.

Slide 63 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Operating Systems

An operating system in the wide sense is the software package for making a computer operable.

The operating system in the narrow sense is the one program running all the time on the computer (the kernel). It consists of several tasks and is asked for services through system calls.

Image source: Wikipedia on ‚kernel‘, English

Slide 64 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Operating Systems

Operating system categories

• Single User - Single Tasking: CP/M, MS-DOS

• Single User - Multi Tasking: Windows, MacOS

• Multi User - Single Tasking

• Multi User - Multi Tasking: Unix, VMS


Slide 65 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

File System Management

Slide 66 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)

Slide 67 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Storage Media

Storage hierarchy. Figure from [Sil00 p.31]: from primary storage (low access time) down to secondary storage (high access time).

Slide 68 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Storage Media

Cost versus access time for DRAM and magnetic disks [HP06 p.359]. Flash memory lies in the access-time gap between them (around 1 ms ... 10 ms).


Slide 69 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Storage Media

Requirements for secondary storage:

• Store large amounts of data
Much more data than fits into (virtual) memory

• Persistent store
The information must survive the termination of the process creating or using it.

• Concurrent access to data
Multiple processes should be able to access the data simultaneously.

⇒ Storage of data on secondary storage media in terms of files.

Slide 70 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)

Slide 71 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Magnetic disk drive principle. Figure from [Sil00 p.29]: the host controller in the computer talks to the disk controller in the disk drive.

Slide 72 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Sector: smallest addressable unit on a magnetic disk.
Data size between 32 and 4096 bytes (standard: 512 bytes).

Several sectors may be combined to form a logical block. The composition is usually performed by a device driver. In this way the higher software layers only deal with abstract devices that all have the same block size, independent of the physical sector size. Such a block is also termed a cluster.

A disk sector (512 bytes of data). Figure from [Ta01 p.315]


Slide 73 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Formatted disk capacity:

bytes per sector × sectors per track = capacity of a track
× cylinders (number of tracks on a platter side) = capacity of one platter side
× tracks per cylinder (heads) = capacity of all platter sides = disk capacity

Example: CHS = (7, 2, 9), sector size 512 bytes
C = cylinders, H = heads = tracks per cylinder, S = sectors per track
Capacity = 512 × 9 × 7 × 2 bytes = 64512 bytes = 63 kB
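The calculation can be written out directly; the function below simply restates the formula from this slide:

```python
def formatted_capacity(cylinders, heads, sectors_per_track, bytes_per_sector=512):
    track = bytes_per_sector * sectors_per_track  # capacity of a track
    platter_side = track * cylinders              # capacity of one platter side
    return platter_side * heads                   # all platter sides = disk capacity

# The slide's example: CHS = (7, 2, 9) with 512-byte sectors.
cap = formatted_capacity(cylinders=7, heads=2, sectors_per_track=9)
print(cap, cap // 1024)  # 64512 63, i.e. 63 kB
```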

Slide 74 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Disk parameters for the original IBM PC floppy disk and a Western Digital WD 18300 hard disk [Ta01 p.301] (tracks per cylinder = heads).

Slide 75 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Physical disk geometry

On older disks the number of sectors per track was the same for all cylinders.

The physics of the inner track sectors defined the maximum number of bytes per sector.

From physics, the outer sectors could have stored more bytes than defined,as the areas are bigger.

Waste of space / capacity

Slide 76 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Modern disks are divided into zones with more sectors in the outer zones than in the inner zones (zone bit recording).

Physical geometry (left) and corresponding virtual geometry example (right)
Figure from [Ta01 p.302], modified (annotation: „This must be seen as two sectors“)


Slide 77 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Physical geometry: the true physical disk layout. With modern disks, only the internal electronics knows about it.
CHS (for old disks) or not published any more

Virtual geometry: the disk layout published to the external world (device driver, operating system, user).
CHS (e.g. the WD 18300 example)
LBA (logical block addressing)
Disk sectors are just numbered consecutively without regard to the physical geometry.

A disk is a random access storage device.
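For disks that still publish a CHS geometry, the conventional CHS ↔ LBA mapping numbers sectors consecutively, track by track. A minimal sketch; the geometry values reuse the small CHS = (7, 2, 9) example from the capacity slide:

```python
def chs_to_lba(c, h, s, heads, sectors_per_track):
    # Sectors count from 1 within a track; cylinders and heads count from 0.
    return (c * heads + h) * sectors_per_track + (s - 1)

def lba_to_chs(lba, heads, sectors_per_track):
    c, rest = divmod(lba, heads * sectors_per_track)
    h, s = divmod(rest, sectors_per_track)
    return (c, h, s + 1)

print(chs_to_lba(0, 0, 1, heads=2, sectors_per_track=9))  # 0, the very first sector
print(lba_to_chs(63, heads=2, sectors_per_track=9))       # (3, 1, 1)
```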

Slide 78 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks

Low level formatting: creation of the physical geometry on the disk platters. Defective disk areas are masked out and replaced by spare areas. Done by disk drive internal software.

High level formatting: a partition receives a boot block and an empty file system (free storage administration, root directory). Done by an application program or by an operating system administration tool.

Partitioning: the disk is divided into independent partitions, each logically acting as a separate disk. Definition of a master boot record in the first sector of the disk. Done by an application program.

Slide 79 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Logical Disk Layout

File system

Magnetic disks

Figure from [Ta01 p.400], modified

Slide 80 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)


Slide 81 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Files

A file is a named collection of related information recorded on secondary storage. [Sil00 p.346]

A file is a logical storage unit. It is an abstract data type. [Sil00 p.345, 347]

Files are an abstraction mechanism for storing information and retrieving it back later. [after Ta01 p.380]

Slide 82 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Structure

Logical file structure examples [Ta01 p.382]

Files

Slide 83 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Structure

a) Byte sequence: unstructured. The OS does not know or care what is in the file; the meaning is imposed by the application program. Maximum flexibility. Approach used by Unix and Windows.

b) Sequence of records (fixed length): each record has some internal structure. Background idea: read/write operations from secondary storage have record size.

c) Tree of records: highly structured. Records may be of variable size. Access to a record is through a key (e.g. „Pony"). Lookup/read/write/append are performed by the OS, not by the application program. Approach used on large mainframe computers (commercial data processing systems).

Files

Slide 84 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Access

Figure from [Sil00 p.355], modified

• Sequential Access: simple and most common. Based on the tape model of a file. Data is processed in order (byte after byte, or record after record). Operations: read, write, rewind. Records need not be of the same length (e.g. text files, with each line forming a record; remember Pascal readln, writeln).

Files


Slide 85 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Access

• Direct Access: bytes or fixed-length logical records. Records are numbered; access can be in no particular order, by record number. Based on the disk model of a file. Useful for immediate access to large data records (e.g. a database). Operations: read, write, seek.

Figure from [Sil00 p.355], modified

[Figure: a file as a sequence of numbered records 1-12, with a file pointer; seek by byte or by record]

Files

Slide 86 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Access

• Indexed Access: an index file holds keys; the keys point to records within the relative file. Suited for tree structures.

Example of index file and relative file, figure from [Sil00 p.358]

Files

Slide 87 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Names

• Name assigned by the creating process: andrew, 2day, urgent!, fig_14

• Case sensitivity: Andrew, ANDREW, andrew. Unix: case sensitive. MS-DOS: case insensitive.

• Two-part file names (basename.extension): readme.txt, prog.c.Z, lecture.doc

Extensions are often just conventions, not mandatory by the operating system (although convenient when the OS knows about them).

Files

Slide 88 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Attributes

• Additional information about a file. Which attributes exist depends on the operating system and the file system.

• Assigned by the operating system.

• Stored in the file system.

Some possible file attributes:

File type: regular file, directory file, ...
Text/binary flag: whether the file content is text or binary
Creation date: date of file creation
Hidden flag: whether or not the file name is displayed in listings
Temp flag: if set, the file is a temporary file and is deleted on process exit
Access rights: who can access the file, and in what way

Files


Slide 89 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Types

Unix: regular files, directories, block special files, character special files. Windows: regular files, directories.

Files

Text files (also termed ASCII files): contain bytes (words in Unicode) according to a standardized character set, such as EBCDIC, ASCII or Unicode. The content is directly printable (screen, printer). Data.

Binary files: contents not intended to be printed (at least not directly). The content has meaning only to the programs using the files. Program (binary executable) or data.

Directories: files for maintaining the logical structure of the file system.

Slide 90 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Single-level directory

Figure from [Sil00 p.360]

Directories

A directory is a named logical place to put files in.

• Early operating systems (CP/M, MS-DOS 1.0)

• Still used in tiny embedded systems

• File names are unique

This is the directory entry for the file called records, pointing to the file content on the storage media.

This is the file content of the file records.

Slide 91 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

user1 user2 user3 user4

Directories

Two-level directory

Figure from [Sil00 p.361], modified

• Absolute file names, relative file names, path names. Examples: /user1/test, /user3/test (absolute); test, ../user4/data (relative); /user3 (path)

• Absolute file names are unique

• Hierarchical structure (tree of depth 1)

root directory

sub directories

Slide 92 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directories

Multi-level directory

Figure from [Sil00 p.363]


Slide 93 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Directory

Directories

• Hierarchical structure of arbitrary depth: tree structure, graph structure. Logical organization structure.

• One root directory; arbitrary number of sub (sub-sub, ...) directories.

• Efficient file search: tree/graph traversal routines, much faster than sequential search.

• Logical grouping: system files, user files, shared files, ...

• Most common structure

• Generalization of two-level directory

Slide 94 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Directory

Figure from [Sil00 p.365]

Directories

Acyclic graph directory structure

• Additional directory entries (Links)

• Shared directories

• Shared files

• More than one absolute name for a file (or a directory)

• Dangling link problem


Slide 95 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Directory

Figure from [Sil00 p.365]

Directories

General graph directory structure

Allowing links to point to directories creates the possibility of cycles.

Avoiding cycles:

Forbid any links to directories. No more shared directories then.

Use a cycle detection algorithm.

Slide 96 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System Management

Now turning from the user's view to the implementor's view. Users are concerned with how files are named, what operations are allowed, and what the directories look like.

Implementors are interested in

how files and directories are stored on the disk,

how the disk space is managed,

and how to make everything work efficiently.


Slide 97 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)

Slide 98 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Implementation

• Contiguous Allocation

• Indexed Allocation

• Linked Allocation: chained blocks, chained pointers

The most important issue in implementing files is the way the available disk space is allocated to files.

Slide 99 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Contiguous Allocation

File Implementation

Each file occupies a set of contiguous blocks on the disk. A file is defined by its disk address (first block) and by its length in block units.

Advantage

• Simple implementation: for each file we just need to know its start block and its length.

• Fast access: access in one continuous operation, minimum head seeks.

Disadvantage

• Disk fragmentation: problem of finding space for a new file. The final file size must be known in advance!
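The "finding space" problem can be sketched as a search over a hole list. A minimal first-fit sketch; the hole list and sizes are made up for illustration:

```python
# First-fit search over a hole list, illustrating the allocation problem
# of contiguous allocation. Holes are (start_block, length) pairs.

def first_fit(holes, nblocks):
    """Return the start block for a file of nblocks, carving it out of
    the first hole that is large enough; None if no hole fits."""
    for i, (start, length) in enumerate(holes):
        if length >= nblocks:
            if length == nblocks:
                del holes[i]                       # hole used up entirely
            else:
                holes[i] = (start + nblocks, length - nblocks)
            return start
    return None                                    # no single hole fits

holes = [(10, 3), (20, 8), (40, 5)]
print(first_fit(holes, 5))   # -> 20 (first hole of size >= 5)
print(holes)                 # -> [(10, 3), (25, 3), (40, 5)]
print(first_fit(holes, 6))   # -> None: 11 blocks are free in total,
                             #    but external fragmentation leaves no
                             #    hole big enough
```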

Slide 100 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Contiguous AllocationFile Implementation

(a) Contiguous allocation of disk space for 7 files

(b) State of the disk after files D and E have been removed. Figure from [Ta01 p.401]


Slide 101 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Contiguous AllocationFile Implementation

External fragmentation: free disk space is broken into chunks (holes) which are spread all over the disk. New files are put into available holes, often not filling them entirely and thus leaving smaller holes. A big problem arises when the largest available hole is too small for a new file.

Internal fragmentation: a file usually does not fill up its last block entirely, so the remaining space in the block is left unused.

Slide 102 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Linked Allocation

File Implementation

Each file is a linked list of disk blocks. The blocks may be scattered anywhere on the disk.

Each block has, besides its data, a pointer to the next block. The pointer is a number (a block number).

[Figure: chained blocks on disk; each block holds data and a next pointer, the last pointer being nil]

Chained blocks

Slide 103 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Linked Allocation

File Implementation

Figure from [Sil00 p.380]

Advantage

• Simple implementation: only the first block number is needed.

• No external fragmentation: files consist of blocks scattered over the disk. No more 'useless' blocks on the disk.

The file 'jeep' starts with block 9. It consists of blocks 9, 16, 1, 10, and 25, in this order.

Slide 104 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Linked Allocation

File Implementation

Disadvantage

• Free space management: somehow all the free blocks must be recorded in some free-block pool.

• Higher access time: more seeks to access the whole file, owing to block scattering.

• Space reduction: some bytes of each block are needed for the pointer.

• Reliability: if a pointer is broken, the remainder of the file is inaccessible.

• Not efficient for random access: to get to block k we must walk along the chain.


Slide 105 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Linked Allocation

File Implementation

In particular the last disadvantage of the chained-blocks allocation method, its unsuitability for random access to files, led to the chained-pointers allocation method.

A table contains as many entries as there are disk blocks. The entries are numbered by block number. The block numbers of a file are linked in this table in chain manner (as with chained blocks). This table is called the file allocation table (FAT).

Figure from [Sil00 p.382]

Slide 106 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Linked Allocation

File Implementation

Figures from [Ta01 p.403, 404], modified

Chained block allocation

Chained pointer allocation (FAT)

The FAT is stored on disk and is loaded into memory when the operating system starts up.

Slide 107 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Chained pointers

Advantage

• Simple implementation: one simple table for both file allocation and the free-block pool.

• Whole block available for data: no more pointers taking away data space.

• Suitable for random access: although the principle of getting to block k did not change, the search (counting) is now done on the block numbers, not on the blocks themselves.

Linked Allocation

Disadvantage

• FAT takes up disk space and memory (when cached): one table entry for each disk block; table size proportional to disk size.

• Higher access time (compared to contiguous allocation): it still needs many seeks to collect all the scattered blocks.
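Walking a FAT chain can be sketched in a few lines, using the slide's 'jeep' example (blocks 9, 16, 1, 10, 25). The sentinel values are simplified: here 0 marks a free block and -1 marks end-of-file, where a real FAT uses reserved bit patterns such as 0xFFF:

```python
# Walking a FAT chain. fat[b] holds the number of the block that
# follows block b in the file; -1 ends the chain (toy convention).

def fat_chain(fat, first_block):
    blocks = []
    b = first_block
    while b != -1:
        blocks.append(b)
        b = fat[b]
    return blocks

fat = [0] * 32                      # 0 = free block (toy convention)
fat[9], fat[16], fat[1], fat[10], fat[25] = 16, 1, 10, 25, -1

print(fat_chain(fat, 9))            # -> [9, 16, 1, 10, 25]

# Random access to block k of the file now means k lookups in the
# (cached) table rather than k disk reads:
print(fat_chain(fat, 9)[3])         # block index 3 of 'jeep' -> disk block 10
```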

Slide 108 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Indexed Allocation

File Implementation

Each file is assigned an index block.

The index block is an array of block numbers, listing in order the blocks belonging to the file. To get to block k of a file, one reads the k-th entry of the index block.

[Figure: an index block on disk whose entries point to the data blocks of a file]


Slide 109 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Indexed Allocation

File Implementation

The file 'jeep' is described by index block 19.

The index block has 8 entries, of which 5 are used.

Index blocks are also called index nodes, short i-nodes or inodes.

Figure from [Sil00 p.383]

Slide 110 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Indexed Allocation

Advantage

• Good for random access: fast determination of block k of a file.

• Lower memory occupation: only for those files currently in use (open files) are the corresponding index blocks loaded into memory.

• Lower disk space occupation: only as many index blocks are needed as there are files in the file system.

File Implementation

Disadvantage

• Free block management: a separate free-block pool must be available.

• Index block utilization: unused entries in the index block waste space.

Slide 111 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Indexed Allocation

• Linked index blocks: the last entry in an index block points to another index block (chaining).

• Multilevel index blocks: an entry does not point to the data but to a first-level index block (single indirect block), which then points to the data. Optionally, additional levels are available through second-level and third-level index blocks.

• Combined scheme: most entries point to the data directly. The remaining entries point to first-level, second-level and third-level index blocks. Used by Unix.

File Implementation

What if a file needs more blocks than there are entries available in an index block?

Slide 112 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Indexed Allocation

Combined scheme example (Unix V7), from [Ta01 p.447], modified

File Implementation

Note: the inodes are not disk blocks, but records stored in disk blocks. The single/double/triple indirect blocks are disk blocks.
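Finding block k under the combined scheme is pure index arithmetic. A sketch with assumed parameters (10 direct entries, index blocks holding 256 block numbers each); the real V7 values differ slightly, but the decision logic is the same:

```python
# Which pointer leads to block k of a file under the combined scheme?
# Parameters are assumed example values, not the exact V7 ones.

NDIRECT = 10        # direct entries in the inode (assumed)
PER_BLOCK = 256     # block numbers per index block (assumed)

def locate(k):
    """Return which pointer chain reaches file block k, plus the
    entry indices to follow at each indirection level."""
    if k < NDIRECT:
        return ("direct", k)
    k -= NDIRECT
    if k < PER_BLOCK:
        return ("single indirect", k)
    k -= PER_BLOCK
    if k < PER_BLOCK ** 2:
        return ("double indirect", k // PER_BLOCK, k % PER_BLOCK)
    k -= PER_BLOCK ** 2
    return ("triple indirect", k // PER_BLOCK ** 2,
            (k // PER_BLOCK) % PER_BLOCK, k % PER_BLOCK)

print(locate(3))      # -> ('direct', 3)
print(locate(100))    # -> ('single indirect', 90)
print(locate(5000))   # -> ('double indirect', 18, 126)
```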


Slide 113 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)

Slide 114 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directory Implementation

Before accessing a file, the file must first be opened by the operating system. For that, the OS uses the path name supplied by the user to locate the directory entry.

A directory entry provides

• the name of the file,

• the information needed to find the blocks of the file,

• and information about the file‘s attributes.

Slide 115 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directory Entry

Attribute placement

The attributes may be stored

a) together with the file name in the directory entry (MS-DOS, VMS)

b) or off the directory entry (Unix). Figure from [Ta01 p.406]

Directory Implementation

Slide 116 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directory Entry

MS-DOS directory entry

Directory entry size: 32 bytes.

File attributes are stored in the entry.

The first block number points to the first file block, or rather to the corresponding entry in the FAT (DOS uses chained pointers).

Figure from [Ta01 p.440]

Directory Implementation
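A 32-byte entry like this can be decoded with fixed-offset unpacking. A sketch following the classic FAT-16 on-disk layout (8-byte name, 3-byte extension, attribute byte, 10 reserved bytes, time, date, first block number, 4-byte size); the sample bytes below are made up:

```python
# Decoding a 32-byte MS-DOS (FAT) directory entry with struct.

import struct

def parse_entry(raw):
    name, ext, attr, _res, time, date, first, size = struct.unpack(
        "<8s3sB10sHHHI", raw)        # 8+3+1+10+2+2+2+4 = 32 bytes
    return {
        "name": (name.decode("ascii").rstrip() + "." +
                 ext.decode("ascii").rstrip()).rstrip("."),
        "attributes": attr,
        "first_block": first,
        "size": size,
    }

# Hypothetical entry: file JEEP.TXT, archive flag set, first block 9.
raw = (b"JEEP    " + b"TXT" + bytes([0x20]) + bytes(10) +
       struct.pack("<HHHI", 0, 0, 9, 1200))
e = parse_entry(raw)
print(e["name"], e["first_block"], e["size"])   # -> JEEP.TXT 9 1200
```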

Page 30: LectureCA All Slides

30

Slide 117 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis


Directory Entry

Unix directory entry (Unix V7)

Entry size: 16 bytes. Modern Unix versions allow for longer file names.

File attributes are stored in the inode.

The rest of the inode points to the file blocks.

Directory Implementation

directory entry

Slide 118 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directory Entry

MS-DOS file attributes

A : Archive flag

D: Directory flag

V: Volume label flag

Figure from [Ta01 p.440]

Directory Implementation

A D V S H R

S : System file flag

H: Hidden flag

R: Read-only flag


Slide 119 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directory Implementation

An MS-DOS directory (not the entry) is itself a file (a binary file) with the file type attribute set to directory.

The disk blocks pointed to contain other directory entries (each again 32 bytes in size), which describe either files or further directories (sub directories).

Upon installing an MS-DOS file system, a root directory is created automatically.

The same applies to Unix: when the file type attribute is set to directory, the file blocks contain directory entries.

Windows 2000 and its descendants (NTFS) treat directories as entities different from files.

Slide 120 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Directory Implementation

MS-DOS directory

[Figure: directory entries with the directory attribute set point to disk blocks containing further directory entries; entries with the directory attribute not set (regular files) point to disk blocks containing file data]


Slide 121 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Lookup

How to find a file name in a directory

• Linear search: each directory entry has to be compared against the search name (string compare). Slow for large directories.

• Binary search: needs a directory sorted by name. Entering and deleting files requires moving directory entries around in order to keep them sorted (insertion sort).

• Hash table: in addition to each file name, a hash value (a number) is created and stored. The search is then done on the hash value, not on the name.

• B-tree: file names are nodes and leaves in a balanced tree. NTFS.

Directory Implementation
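The hash-table variant can be sketched with an in-memory bucket table: the lookup costs one hash computation plus a short collision-chain scan, instead of a string compare over every entry. A minimal sketch; real file systems use persistent on-disk hash structures, and the names and inode numbers below are made up:

```python
# Directory lookup via a hash table over file names.

NBUCKETS = 8

def bucket(name):
    return hash(name) % NBUCKETS      # any stable string hash would do

directory = [[] for _ in range(NBUCKETS)]

def add(name, inode):
    directory[bucket(name)].append((name, inode))

def lookup(name):
    for n, inode in directory[bucket(name)]:
        if n == name:                 # string compare only within one bucket
            return inode
    return None

add("mbox", 60); add("minix", 81); add("src", 59)
print(lookup("mbox"))    # -> 60
print(lookup("nope"))    # -> None
```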

Slide 122 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File Lookup

The steps in looking up the file /usr/ast/mbox in classical Unix

Directory Implementation

Figure from [Ta01 p.447]
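The lookup walks the path one component at a time: start at the root inode, and for each component search the current directory for the next inode number. A sketch of that loop; the inode numbers below are made up to mirror the figure:

```python
# Resolving /usr/ast/mbox step by step, one directory search per
# path component, as in the classical Unix lookup.

dirs = {                        # inode number -> directory entries
    1: {"usr": 6},              # inode 1 is the root directory
    6: {"ast": 26, "jim": 45},
    26: {"mbox": 60, "src": 92},
}

def namei(path):
    inode = 1                   # absolute lookup starts at the root
    for part in path.strip("/").split("/"):
        inode = dirs[inode][part]
    return inode

print(namei("/usr/ast/mbox"))   # -> 60
```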

Slide 123 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)

Slide 124 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Free Block ManagementTo keep track of the blocks available for allocation (free blocks), the operating system must somehow maintain a free block pool.

When a file is created, the pool is searched for free blocks. When a file is deleted, the freed blocks are added to the pool.

File systems using a FAT do not need a separate free block pool: free blocks are simply marked in the table by a 0.

Free block pool implementations:

• Linked list
• Free list
• Bit map


Slide 125 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Free Block Management

Linked list: the free blocks form a linked list where each block points to the next one (chained blocks).

• Simple implementation: only the first block number is needed.

• Quick access: new blocks are prepended (LIFO principle).

• Disk I/O: updating the pointers involves I/O.

• Block modification: the modified content hinders an 'undelete' of the block.

Figure from [Sil00 p.388]

Slide 126 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis


Free Block Management

Free list

The free block numbers are listed in a table. The table is stored in disk blocks. The table blocks may be linked together.

• Space: each free block requires 4 bytes in the table.

• Management: adding and deleting block numbers takes time, in particular when a table block is almost full (additional disk I/O required).

Figure from [Ta01 p.413]

Slide 127 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Free Block Management

Bit Map

To each existing block on the disk a bit is assigned. When a block is free, the bit is set; when the block is occupied, the bit is reset (or vice versa). All bits form a bit map.

• Compact: each block is represented by a single bit. Fixed size.

• Logical order: neighboring bits represent neighboring blocks. It is quite easy to find contiguous blocks, or blocks located close together.

• Conversion block number ↔ bit position: from the block number the corresponding bit position must be calculated, and vice versa.

Figure from [Ta01 p.413]

Slide 128 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)


Slide 129 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System Layout

File system

Figure from [Ta01 p.400], modified

Each partition starts with a boot block (first block), which is followed by the file system. The boot block may be modified by the file system.

Slide 130 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Boot sector | FAT | FAT copy | Root dir | Files and directories

Layout of the FAT file system

File System Layout

Information about the file system is stored in the boot sector.

The number of entries in the root directory is limited, except for FAT-32, where the root directory is a cluster chain.

A copy of the FAT is kept for reliability reasons.

Microsoft FAT-32 specification at http://www.microsoft.com/whdc/system/platform/firmware/fatgen.mspx

Slide 131 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System Layout

Possible file system layouts for a UNIX file system:

Super block | Inodes | Root dir | Files and directories

Super block | Free block pool | Inodes | Root dir | Files and directories

The inode for the root directory is located at a fixed place.

Super block: information about the file system (block size, volume label, size of inode list, next free inode, next free block, ...)

Free block pool: bit map free block management

Slide 132 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

MFT | System files | File area

Layout of the NTFS file system

File System Layout

System files: files storing metadata about the file system. Actually, the MFT itself is a system file.

MFT: Master File Table. A linear sequence of 1 kB records; each record describes one file or directory. The MFT is a file and may be located anywhere on the disk.

More about NTFS: http://www.ntfs.com/ntfs_basics.htm

Information about the file system is stored in the boot block.


Slide 133 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)

Slide 134 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cylinder Skew

Cylinder skew example

Assumption: reading from inner tracks towards outer tracks.

Here: skew = 3 sectors.

After the head has moved to the next track, sector 0 arrives just in time; reading can continue right away.

Performance improvement when reading multiple tracks.

Physical disk geometry, figure from [Ta01 p.316]

Disk Performance
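The required skew follows from the drive parameters: while the head performs its track-to-track seek, the platter keeps rotating, so the next track must be offset by roughly seek time divided by the time one sector takes to pass under the head. A sketch; all drive parameters below are assumed example values, not from the figure:

```python
# Estimating the cylinder skew in sectors.

import math

rpm = 7200                     # assumed rotation speed
sectors_per_track = 160        # assumed
track_to_track_seek_ms = 1.0   # assumed

rotation_ms = 60_000 / rpm                    # one full rotation: ~8.33 ms
sector_ms = rotation_ms / sectors_per_track   # time for one sector to pass

# Round up so sector 0 has not yet passed when the seek finishes.
skew = math.ceil(track_to_track_seek_ms / sector_ms)
print(skew)   # -> 20 sectors of skew for these parameters
```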

Slide 135 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Disk SchedulingModern disk drives are addressed as large one-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer. The array of logical blocks is mapped into the sectors of the disk sequentially. Sector 0 is the first sector of the first track on the outermost cylinder. Mapping proceeds in order through that track, then the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost.

However, it is difficult to convert a logical block into CHS:

• The disk may have defective sectors which are replaced by spare sectors from elsewhere on the disk.

• Owing to zone bit recording, the number of sectors per track is not the same for all cylinders. After [Sil00 p.436]

Disk Performance

Slide 136 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Disk Scheduling

• Fast access desired (high disk bandwidth)

Disk bandwidth is the total number of bytes transferred, divided by the total time from the first request for service until completion of the last transfer.

• Bandwidth depends on

– Seek time, the time for the disk to move the heads to the cylinder containing the desired sector.

– Rotational latency, the additional time spent waiting for the disk to rotate the desired sector to the disk head.

• Seek time is roughly proportional to seek distance.

• Scheduling goal: minimize seek time. In earlier days scheduling was done by the OS; nowadays it is done either by the OS (which then has to guess the physical disk geometry) or by the integrated disk drive controller.

Disk Performance


Slide 137 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Disk Scheduling

• First-Come First-Served (FCFS)

• Shortest Seek Time First (SSTF)

• SCAN

• C-SCAN

• C-LOOK

Scheduling Algorithms

For the following examples, assume a single-sided disk with 200 tracks. Read requests are queued in some queue, which currently holds requests for tracks 98, 183, 37, 122, 14, 124, 65 and 67. The head is at track 53.

Disk Performance
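The algorithms on the following slides can be compared by total head movement on exactly this example. A small simulation sketch for FCFS and SSTF:

```python
# Total head movement for the example queue under FCFS and SSTF.

queue = [98, 183, 37, 122, 14, 124, 65, 67]
head = 53

def movement(order, start):
    """Sum of seek distances when servicing tracks in the given order."""
    pos, total = start, 0
    for t in order:
        total += abs(t - pos)
        pos = t
    return total

def sstf(requests, start):
    """Repeatedly pick the pending request closest to the head."""
    pending, pos, order = list(requests), start, []
    while pending:
        nxt = min(pending, key=lambda t: abs(t - pos))
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order

fcfs_total = movement(queue, head)              # service in arrival order
sstf_total = movement(sstf(queue, head), head)  # shortest seek first
print(fcfs_total, sstf_total)                   # -> 640 236
```

SSTF visits 65, 67, 37, 14, 98, 122, 124, 183 and cuts the total head movement from 640 to 236 tracks on this workload.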

Slide 138 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

FCFS

Disk Scheduling

The requests are serviced in the order of their entry (first entry is served first).

Figure from [Sil00 p.437]


Slide 139 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

SSTF

Disk Scheduling

The next request served is the one closest to the current position (shortest seek time).

Figure from [Sil00 p.438]


Slide 140 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

SCAN

Disk Scheduling

Disk arm starts at one end of the disk and sweeps over to the other end, thereby servicing the requests.

At the other end the head reverses direction and servicing continues on the return trip.

Figure from [Sil00 p.439]


Page 36: LectureCA All Slides

36

Slide 141 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

C-SCAN

Disk Scheduling

The disk arm starts at one end of the disk and sweeps over to the other end, servicing requests along the way. At the other end, the head returns to the beginning of the disk without servicing requests on the return trip.

Figure from [Sil00 p.440]


Slide 142 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

C-LOOK

Like SCAN or C-SCAN, but the head moves only as far as the final request in each direction.

Figure from [Sil00 p.441]


Disk Scheduling

Slide 143 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

• SSTF is common and has a natural appeal

• SCAN and C-SCAN perform better for systems that place a heavy load on the disk.

• Performance depends on the number and typesof requests.

• Requests for disk service are influenced by the file allocation method.

• Either SSTF or LOOK is a reasonable choice as default algorithm.

Disk Scheduling

Disk Performance

Slide 144 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

File System ManagementStorage Media (67)

Magnetic Disks (71)

Files and Directories (81, 90)

File Implementation (98)

Directory Implementation (114)

Free Block Management (124)

File System Layout (129)

Cylinder skew, disk scheduling (135)

Floppy Disks (145)


Slide 145 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Floppy Disks

Figure from www.computermuseum.li

• Portable storage media
• 8" floppy in 1969: capacity 80 K ... 1.2 M
• 5.25" floppy in 1978: capacity 360 K ... 1.2 M
• 3.5" floppy in 1987: capacity 720 K, 1.44 MB

Floppy disks are by now almost entirely displaced by flash memory (e.g. USB sticks), except for the purpose of booting computers (bootable floppies).

Slide 146 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Floppy Disks

Two-sided floppy disk

[Figure: two-sided floppy disk showing the direction of rotation, the track numbers (0, 1, 2, ...), the sector numbers, side 0 (front) and side 1 (back), and the start of the tracks. The first sector is addressed as BIOS 0,0,1; the marked example sector is addressed as BIOS 0,2,6 (side, track, sector), i.e. CHS 2,0,6, or as BDOS sector 42]

Slide 147 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Floppy Disks

[Same figure as on the previous slide: the example sector is addressed as BIOS 0,2,6 (side, track, sector), i.e. CHS 2,0,6, or as BDOS sector 42]

BIOS = Basic Input Output System. Stored in (EP)ROM.

Sector access through invoking a software interrupt and addressing a sector by means of CHS.

BDOS = Basic Disk Operating System. Originates from the CP/M operating system. Higher abstraction level than the BIOS.

Sector access through invoking a software interrupt and addressing a sector by means of a logical consecutive sector number (1, 2, ...).

Slide 148 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Floppy Disks

Sector Structure

[Figure: a track divided into sectors 1 ... 9; each sector consists of an address field and a data field, separated by inter-record gaps]

Address field: Sync | IAM | track index | head index | sector index | sector length | CRC
Data field: DAM | 128-1024 data bytes | CRC/ECC

CRC: Cyclic Redundancy Check
ECC: Error Checking/Correction
IAM: Index Address Mark
DAM: Data Address Mark


Slide 149 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Floppy Disks

Starting sector numbers for system and data areas (FAT file system). All numbers are in decimal notation.

Disk    | Boot sector | FAT 1 | FAT 2 | Root dir | Data
360 K   | 1           | 2     | 4     | 6        | 13
720 K   | 1           | 2     | 5     | 8        | 15
1.2 M   | 1           | 2     | 9     | 16       | 30
1.44 M  | 1           | 2     | 11    | 20       | 34
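These starting sectors follow directly from the FAT layout: boot sector, two FATs, root directory, then data. A sketch recomputing them; the per-format FAT and root-directory sizes (in sectors) are inferred from the table itself:

```python
# Recomputing the starting sectors of the FAT file system areas.
# Sector numbers are 1-based; the boot sector occupies sector 1.

def area_starts(fat_sectors, rootdir_sectors):
    fat1 = 2
    fat2 = fat1 + fat_sectors
    rootdir = fat2 + fat_sectors      # two FAT copies of equal size
    data = rootdir + rootdir_sectors
    return fat1, fat2, rootdir, data

print(area_starts(2, 7))    # 360 K  -> (2, 4, 6, 13)
print(area_starts(9, 14))   # 1.44 M -> (2, 11, 20, 34)
```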

Slide 150 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Floppy Disks

[Figure: track 0 of a 360 kB floppy disk. Track 0, side 0 (sectors 1-9): sector 1 holds the bootstrap loader, sectors 2-5 hold FAT(1)-FAT(4), sectors 6-9 hold Dir.(1)-Dir.(4). Track 0, side 1 (sectors 10-18): sectors 10-12 hold Dir.(5)-Dir.(7), sectors 13-18 hold Data(1)-Data(6)]

Dir. = allocated space for the root directory. Track 0 of a 360 kB floppy disk.

Slide 151 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Process Management

Slide 152 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Management

Processes (153)

Threads (178)

Interprocess Communication (IPC) (195)

Scheduling (247)

Real-Time Scheduling (278)

Deadlocks (318)


Slide 153 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Processes

A process is a set of identifiable, repeatable actions which are ordered in some way and contribute to the fulfillment of an objective. (General definition)

A process is a program in execution. (Computer-oriented definition)

• Process: dynamic, active. Acting according to the recipe (cooking) is a process.

• Program: static, passive. A cooking recipe is a program.

Slide 154 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Model

• Several processes work quasi-parallel.

• A process is a unit of work.

• Conceptually, each process has its own virtual CPU. In reality, the real CPU switches back and forth from process to process.

Processes

Sequential view / Process model view

Processes make progress over time

Figure from [Ta01 p.72]

Slide 155 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Processes

a) CPU-bound: spends more time doing computations; few, very long CPU bursts.

b) I/O-bound: spends more time doing I/O than computations; many short CPU bursts.

Figure from [Ta01 p.134]

A process may be described as either (more or less)

Slide 156 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Space

Processes

A process is an executing program, and encompasses the current values of the program counter, of the registers, of the variables and of the stack.

code section (text section or segment)This is the actual program code (the machine instructions).

CS

DS

SS

SP

PC

Memory

CPU

data section (data segment)This segment contains global variables (global to theprocess, not global to the computer system).

stack section (stack segment)The stack contains temporary data (local variables,return addresses, function parameters).


Slide 157 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process States

Figure from [Sil00 p.89]

Note: Only in the running state the process needs CPU cycles, in all other states it is actually ‚frozen‘ (or nonexistent any more).

Slide 158 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process States

• New: The process is created. Resources are allocated.

• Ready: The process is ready to be (re)scheduled.

• Running: The CPU is allocated to the process, that is, the program instructions are being executed.

• Waiting: The process is waiting for some event to occur. Without this event the process cannot continue – even if the CPU were free.

• Terminated: Work is done. The process is taken off the system (off the queues) and its resources are freed.

Slide 159 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Events at which processes are created

• Operating System Start-Up: Most of the system processes are created here. A large portion of them are background processes (daemons).

• Interactive User Request: A user requests an application to start.

• Batch job: Jobs that are scheduled to be carried out when the system has the resources available (e.g. calendar-driven events, low priority jobs).

• Existing process gives birth: An existing process (e.g. a user application or a system process) creates a process to carry out some related (sub)tasks.

Slide 160 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Creation

• Parent process creates a child process, which in turn may create other processes, forming a tree of processes.

• Resource sharing
– Parent and child share all resources.
– Child shares a subset of parent's resources.
– Parent and child share no resources.

• Execution
– Parent and child execute concurrently.
– Parent waits until child terminates.

• Address Space
– Child is a copy of the parent.
– Child has a program loaded into it.

System calls for creating a child process: Unix: fork(), Windows: CreateProcess()


Slide 161 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

fork() example

#include <stdio.h>
#include <unistd.h>

int main() {
    int result;
    printf("Parent, my pid is: %d\n", getpid());
    result = fork();                /* from here on think parallel */
    if (result == 0) {              /* executed by child only */
        printf("Child, my pid is: %d\n", getpid());
        ...
    } else {                        /* executed by parent only */
        printf("Parent, my pid is: %d\n", getpid());
        ...
    }
    return 0;
}

getpid() is a system call that tells a process its pid (process identifier), which is a unique process number within the system.

Slide 162 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

fork() example

Terminal output:

Parent, my pid is: 189
Child, my pid is: 190
Parent, my pid is: 189

The order depends on whether parent or child is scheduled first after fork().

[Figure: before fork(), a single process (pid = 189) with its PC at the fork() call; after fork(), two processes (pid = 189 and pid = 190), each with its own PC continuing right after fork().]

Slide 163 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Creation

Figure from [Sil00 p.96]

A Unix process tree

Slide 164 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Termination

Events at which processes are terminated

• Process asks the OS to delete it: Work is done. Resources are deallocated (memory is freed, open files are closed, used I/O buffers are flushed).

• Parent terminates child
– Child may have exceeded allocated resources.
– Task assigned to child is no longer required.
– Parent's parent is exiting. Some OSs do not allow a child to continue when its parent terminates. Cascading termination (a subtree is deleted).

System calls for self-termination of a process: Unix: exit(), Windows: ExitProcess()
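As a sketch of how these calls fit together, the following combines fork(), exit() and the parent-side wait (waitpid() is the POSIX call; error handling omitted):

```c
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates itself with exit(code);
   the parent blocks until the child terminates and
   returns the child's exit code. */
int spawn_and_wait(int code) {
    pid_t pid = fork();
    if (pid == 0) {
        /* child: do some work, then self-terminate */
        exit(code);
    }
    /* parent: wait for the child to terminate */
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

waitpid() also frees the child's process-table entry; without it the terminated child would linger as a zombie.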


Slide 165 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Control Block

• Operating system maintains a process table
• Each entry represents a process
• Entry often termed PCB (process control block)

A PCB contains all information about a process that must be saved when the process is switched from running into waiting or ready, such that it can later be restarted as if it had never been stopped.

Info regarding process management, memory occupation and open files.

Figure from [Sil00 p.89]: PCB example
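As an illustration, a PCB can be sketched as a plain C structure; the field names below are hypothetical, and real kernels store many more entries:

```c
#include <stdint.h>

/* Hypothetical PCB sketch -- fields vary per operating system. */
enum proc_state { P_NEW, P_READY, P_RUNNING, P_WAITING, P_TERMINATED };

struct pcb {
    int             pid;            /* process identifier            */
    enum proc_state state;          /* scheduling state              */
    uint64_t        pc;             /* saved program counter         */
    uint64_t        regs[16];       /* saved general-purpose regs    */
    void           *page_table;     /* memory-management information */
    int             open_files[16]; /* open file descriptors         */
    int             priority;       /* scheduling priority           */
};

/* Saving a context means copying CPU state into the PCB
   so the process can later resume as if never stopped. */
void save_context(struct pcb *p, uint64_t pc_value) {
    p->pc = pc_value;
    p->state = P_READY;
}
```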

Slide 166 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Control Block

Table from [Ta01 p.80]

Typical fields of a PCB

Slide 167 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Context Switch

The task of switching the CPU from one process to another is termed context switch (sometimes also process switch):

• Saving the state of the old process: saving the current context of the process in its PCB.

• Loading the state of the new process: restoring the former context of the process from its PCB.

Context switching is pure administrative overhead. The duration of a switch lies in the range of 1 ... 1000 µs. The switch time depends on the hardware. Processors with multiple sets of registers are faster at switching. Context switching poses a certain bottleneck, which is one reason for the introduction of threads.

Slide 168 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Context Switch

[Figure from [Sil00 p.90]: two processes alternating on the CPU; each switch-over costs a context switch time.]


Slide 169 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

On a uniprocessor system there is only one process running; all others have to wait until they are scheduled. They are waiting in some scheduling queue:

• Job Queue: holds the future processes of the system.

• Ready Queue (also called CPU queue): holds all processes that reside in memory and are ready to execute.

• Device Queue (also called I/O queue): each device has a queue holding the processes waiting for I/O completion.

• IPC Queue: holds the processes that wait for some IPC (inter-process communication) event to occur.

Slide 170 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

[Figure from [Sil00 p.92], modified: the ready queue and some device queues (tape, Ethernet, disk, terminal). Each queue entry holds a waiting process with its registers; some device queues are empty.]

Slide 171 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

From the job queue a new process is initially put into the ready queue. It waits until it is dispatched (selected for execution). Once the process is allocated the CPU, one of these events may occur:

• Interrupt: The time slice may have expired or some higher-priority process is ready. Hardware error signals (exceptions) may also cause a process to be interrupted.

• I/O request: The process requests I/O and is shifted to a device queue. After the I/O device is ready, the process is put into the ready queue to continue.

• IPC request: The process wants to communicate with another process through some blocking IPC feature. Like I/O, but here the "I/O device" is another process.

A note on the terminology: strictly speaking, a process (in the sense of an active entity) only exists when it is allocated the CPU. In all other cases it is a ‚dead body'.

Slide 172 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

[Queueing diagram of process scheduling: from the job queue (processes are new) into the ready queue (processes are ready), then to the CPU (process is running). An interrupt returns the process to the ready queue; an I/O request moves it to a device queue and an IPC request to the IPC queue (processes are waiting for events), from where it re-enters the ready queue.]


Slide 173 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

The OS selects processes from queues and puts them into other queues. This selection task is done by schedulers.

• Long-term Scheduler: originates from batch systems. Selects jobs (programs) from the pool and loads them into memory. Invoked rather infrequently (seconds ... minutes). Can be slow. Has influence on the degree of multiprogramming (number of processes in memory). Some modern OSs do not have a long-term scheduler any more.

• Short-term Scheduler: selects one process from among the processes that are ready to execute, and allocates the CPU to it. Initiates the context switches. Invoked very frequently (in the range of milliseconds). Must be fast, that is, must not consume much CPU time compared to the processes.

Slide 174 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

[Figure: schedulers and their queues – the long-term scheduler moves jobs from the job queue into the ready queue; the short-term scheduler dispatches processes from the ready queue to the CPU.]

Slide 175 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

Sometimes it may be advantageous to remove processes temporarily from memory in order to reduce the degree of multiprogramming. At some later time the process is reintroduced into memory and can be continued. This scheme is called swapping, performed by a medium-term scheduler.

[Figure: medium-term scheduler – processes are swapped out of the ready queue into a swap queue and later swapped back in.]

Slide 176 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Concept

• Program in execution: several processes may be carried out in parallel.

• Resource grouping: each process is related to a certain task and groups together the required resources (address space, PCB).

Traditional multi-processing systems:

• Each process is executed sequentially. No parallelism inside a process.

• Blocked operations → blocked process: any blocking operation (e.g. I/O, IPC) blocks the process. The process must wait until the operation finishes.

In traditional systems each process has a single thread of control.


Slide 177 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Management

Processes (153)

Threads (178)

Interprocess Communication (IPC) (195)

Scheduling (247)

Real-Time Scheduling (278)

Deadlocks (318)

Slide 178 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

A thread is
• a piece of yarn,
• a screw spire,
• a line of thoughts.

Here: a sequence of instructions that may execute in parallel with others.

A thread is a line of execution within the scope of a process. A single-threaded process has a single line of execution (sequential execution of program code); the process and the thread are the same. In particular, a thread is a basic unit of CPU utilization.

Slide 179 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

Figure from [Ta01 p.82]: three single-threaded processes in parallel vs. a process with three parallel threads.

Slide 180 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

As an example, consider a word processing application:
- Reading from keyboard
- Formatting and displaying pages
- Periodically saving to disk
- ... and lots of other tasks

A single-threaded process would quite quickly result in an unhappy user, since (s)he always has to wait until the current operation is finished.

Multiple processes? Each process would have its own isolated address space.

Multiple threads! The threads operate in the same address space and thus have access to the data.
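That shared address space can be sketched with POSIX threads (assuming a pthreads-capable system): both threads write into the same global array, something separate processes could not do without extra IPC.

```c
#include <pthread.h>

/* Global data: visible to every thread of the process. */
static int results[2];

static void *worker(void *arg) {
    int idx = *(int *)arg;
    results[idx] = idx + 100;   /* write into the shared address space */
    return NULL;
}

/* Start two threads, wait for both, and combine their results. */
int run_two_threads(void) {
    pthread_t t0, t1;
    int a0 = 0, a1 = 1;
    pthread_create(&t0, NULL, worker, &a0);
    pthread_create(&t1, NULL, worker, &a1);
    pthread_join(t0, NULL);     /* joining makes both writes visible */
    pthread_join(t1, NULL);
    return results[0] + results[1];
}
```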


Slide 181 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

Figure from [Ta01 p.86]: three-threaded word processing application – one thread reading the keyboard, one formatting and displaying pages, one saving to disk.

Slide 182 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

• Multiple executions in the same environment: all threads have exactly the same address space (the process address space).

• Each thread has its own registers, stack and state.

Figure from [Sil00 p.116]

Slide 183 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

Table from [Ta01 p.83]: items shared by all threads in a process vs. items private to each thread.

Slide 184 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

User Level Threads

• Take place in user space: the operating system does not know about the application's internal multi-threading.

• Can be used on OSs not supporting threads: only needs some thread library (like pthreads) linked to the application.

• Each process has its own thread table, maintained by the routines of the thread library.

• Customized thread scheduling: the processes use their own thread scheduling algorithm. However, no timer-controlled scheduling is possible since there are no clock interrupts inside a process.

• Blocking system calls do block the process: all threads are stopped because the process is temporarily removed from the CPU.


Slide 185 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

User Level Threads

Thread management is performed by the application.

Examples:
- POSIX Pthreads
- Mach C-threads
- Solaris threads

Figure from [Ta01 p.91]

Slide 186 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Kernel Threads

• Take place in kernel: the operating system manages the threads of each process.

• Available only on multi-threaded OSs: the operating system must support multi-threaded application programs.

• No thread administration inside the process, since this is done by the kernel. Thread creation and management, however, is generally somewhat slower than with user level threads [Sil00 p.118].

• No customized scheduling: the user process cannot use its own customized scheduling algorithm.

• No problem with blocking system calls: a blocking system call causes a thread to pause. The OS activates another thread, either from the same process or from another process.

Slide 187 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Kernel Threads

Thread management is performed by the operating system.

Examples:
- Windows 95/98/NT/2000
- Solaris
- Tru64 UNIX
- BeOS
- Linux

Figure from [Ta01 p.91]

Slide 188 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multithreading Models

Many-to-One Model

• Many user level threads are mapped to a single kernel thread.
• Used on systems that do not support kernel threads.

Figure from [Sil00 p.118]


Slide 189 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multithreading Models

One-to-One Model

Each user level thread is mapped to one kernel thread.

Figure from [Sil00 p.119]

Slide 190 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multithreading Models

Many-to-Many Model

Many user level threads are mapped to many kernel threads.

Figure from [Sil00 p.119]

Slide 191 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multithreading

Solaris 2 multi-threading example

Figure from [Sil00 p.121]

Slide 192 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

Windows 2000:
• Implements one-to-one mapping
• Each thread contains
- a thread id
- a register set
- separate user and kernel stacks
- a private data storage area

Linux:
• One-to-one model (pthreads), many-to-many (NGPT)
• Thread creation is done through the clone() system call. clone() allows a child to share the address space of the parent. This system call is unique to Linux; source code is not portable to other UNIX systems.


Slide 193 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Threads

Java:
• Provides support at language level.
• Thread scheduling in the JVM.

Example: creation of a thread by inheriting from the Thread class:

class Worker extends Thread {
    public void run() {
        System.out.println("I am a worker thread");
    }
}

public class MainThread {
    public static void main(String args[]) {
        Worker worker1 = new Worker();
        worker1.start();   // thread creation and automatic call of run() method
        System.out.println("I am the main thread");
    }
}

Slide 194 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Management

Processes (153)

Threads (178)

Interprocess Communication (IPC) (195)

Scheduling (247)

Real-Time Scheduling (278)

Deadlocks (318)

Slide 195 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

IPC

Purpose of Inter-Process Communication:

• Managing critical activities: making sure that two (or more) processes do not get into each others' way when engaging in critical activities. (Process and thread synchronization)

• Sequencing: making sure that proper sequencing is assured in case of dependencies among processes.

• Passing information: processes are independent of each other and have private address spaces. How can a process pass information (or data) to another process? (Data exchange – less important for threads since they operate in the same environment.)

Slide 196 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Race Conditions

[Figure from [Ta01 p.101]: print spooling example – two processes and a spooler directory with shared variables out and in; in points to the next empty slot.]

Situations where two or more processes access some shared resource, and the final result depends on who runs precisely when, are called race conditions.


Slide 197 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Race Conditions

Processes A and B want to print a file. Both have to enter the file name into a spooler directory. out points to the next file to be printed; this variable is accessed only by the printer daemon, which currently is busy with slot 4. in points to the next empty slot. Each process entering a file name into the empty slot must increment in.

Now consider this situation: Process A reads in (value = 7) into some local variable. Before it can continue, the CPU is switched over to B. Process B reads in (value = 7) and stores its value locally. Then the file name is entered into slot 7 and the local variable is incremented by 1. Finally the local variable is copied to in (value = 8). Process A is running again. According to its local variable, the file name is entered into slot 7 – erasing the file name put there by B. Finally in is incremented. User B is waiting in the printer room for years ...

Slide 198 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Race Conditions

Another example at machine instruction level. Shared variable x (initially 0); each process increments x via a register:

Scenario 1 (sequential):
Process 1: R1 ← x; R1 = R1+1; R1 → x    (x = 1)
Process 2: R3 ← x; R3 = R3+1; R3 → x    (x = 2)

Scenario 2 (interleaved):
Process 1: R1 ← x                        (x = 0)
Process 2: R3 ← x; R3 = R3+1; R3 → x    (x = 1)
Process 1: R1 = R1+1; R1 → x            (x = 1)

Depending on the interleaving, the final value of x is 2 or 1.
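The interleaving above can be provoked (non-deterministically) with two unsynchronized threads; a sketch assuming pthreads:

```c
#include <pthread.h>

#define ITERS 100000

static volatile int x = 0;          /* shared variable, unprotected */

static void *incrementer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++)
        x = x + 1;                  /* load, add, store: not atomic */
    return NULL;
}

/* Run two incrementing threads concurrently.
   Lost updates (Scenario 2) make the result at most 2*ITERS,
   and often less. */
int race_demo(void) {
    pthread_t t1, t2;
    x = 0;
    pthread_create(&t1, NULL, incrementer, NULL);
    pthread_create(&t2, NULL, incrementer, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return x;
}
```

The outcome varies from run to run, which is exactly what makes race conditions hard to debug.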

Slide 199 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Critical Regions

How to avoid race conditions? Find some way to prohibit more than one process from manipulating the shared data at the same time → "mutual exclusion".

Part of the time a process is doing some internal computations and other things that do not lead to race conditions. Sometimes, however, a process needs to access shared resources or does other critical things that may lead to race conditions. These parts of a program are called critical regions (or critical sections).

[Figure: timeline of process A alternating between non-critical code and its critical region.]

Slide 200 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Critical Regions

Four conditions to provide correctly working mutual exclusion:

1. No two processes simultaneously in their critical regions, which would otherwise controvert the concept of mutuality.

2. No assumptions about process speeds: no predictions on process timings or priorities. Must work with all processes.

3. No process outside its critical regions must block other processes, simply because there is no reason to hinder others entering their critical region.

4. No process must wait forever to enter a critical region, for reasons of fairness and to avoid deadlocks.


Slide 201 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Critical Regions

Figure from [Ta01 p.103]: mutual exclusion using critical regions

Slide 202 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

Proposals for achieving mutual exclusion:

• Disabling interrupts: the process disables all interrupts and thus cannot be taken away from the CPU. Not appropriate – unwise to give a user process full control over the computer.

• Lock variables: a process reads a shared lock variable. If the lock is not set, the process sets the variable (locking) and uses the resource. In the period between evaluating and setting the variable the process may be interrupted. Same problem as with the printer spooling example.

Slide 203 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

Proposals for achieving mutual exclusion (continued):

• Strict Alternation: the shared variable turn keeps track of whose turn it is. Both processes alternate in accessing their critical regions.

/* Process 0 */                  /* Process 1 */
while (1) {                      while (1) {
    while (turn != 0) ;              while (turn != 1) ;
    critical_region();               critical_region();
    turn = 1;                        turn = 0;
    noncritical_region();            noncritical_region();
}                                }

Slide 204 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

• Strict Alternation (continued)

Busy waiting wastes CPU time. No good idea when one process is much slower than the other: a process can be blocked by a process that is currently outside its critical region – a violation of condition 3.

[Figure: timeline with process 0 busy waiting for turn = 0 while process 1 is still in its non-critical region.]


Slide 205 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

Proposals for achieving mutual exclusion (continued):

• Peterson Algorithm (two processes, process number is either 0 or 1)

/* shared variables */
int turn;
bool interested[2];

void enter_region(int process) {
    int other = 1 - process;
    interested[process] = TRUE;
    turn = process;
    while (turn == process && interested[other] == TRUE) ;
}

void leave_region(int process) {
    interested[process] = FALSE;
}

Slide 206 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

Peterson Algorithm (continued)

Assume process 0 and process 1 both simultaneously call enter_region():

Process 0: other = 1; interested[0] = true; turn = 0
Process 1: other = 0; interested[1] = true; turn = 1

Both are manipulating turn at the same time. Whichever store is last is the one that counts. Assume process 1 was slightly later, thus turn = 1.

Process 0 waits in: while (turn == 0 && interested[1] == TRUE);
Process 1 waits in: while (turn == 1 && interested[0] == TRUE);

Process 0 passes its while statement, whereas process 1 keeps busy waiting therein. Later, when process 0 calls leave_region(), process 1 is released from the loop.

A correctly working algorithm, but it uses busy waiting.

Slide 207 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

Proposals for achieving mutual exclusion (continued):

• Test and Set Lock (TSL): an atomic (indivisible) operation at machine level; it cannot be interrupted. TSL reads the content of the memory word lock into register R and then stores a nonzero value at the memory address lock. The memory bus is locked, so no other process(or) can access lock. The CPU must support TSL. Still busy waiting.

Pseudo assembler listing providing the functions enter_region() and leave_region():

enter_region:
    TSL R, lock         ; read lock into R and set lock (atomic)
    CMP R, #0           ; was lock zero (free)?
    JNZ enter_region    ; if not: loop (busy waiting)
    RET

leave_region:
    MOV lock, #0        ; release the lock
    RET
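The same test-and-set idea is available in portable C through C11 atomics; a minimal spinlock sketch, assuming <stdatomic.h> support (atomic_flag_test_and_set plays the role of TSL):

```c
#include <stdatomic.h>

/* The flag plays the role of the memory word `lock`. */
static atomic_flag region_lock = ATOMIC_FLAG_INIT;
static int counter = 0;

static void enter_region(void) {
    /* atomic_flag_test_and_set sets the flag and returns its
       previous value in one indivisible step (the TSL). */
    while (atomic_flag_test_and_set(&region_lock))
        ;                               /* busy waiting, as on the slide */
}

static void leave_region(void) {
    atomic_flag_clear(&region_lock);    /* MOV lock, #0 */
}

/* Increment a shared counter inside the critical region. */
int locked_increment(void) {
    enter_region();
    counter++;
    int v = counter;
    leave_region();
    return v;
}
```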

Slide 208 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutual Exclusion

Intermediate Summary

• Disabling Interrupts – not recommended for multi-user systems.
• Lock Variables – problem remains the same.
• Strict Alternation – violation of condition 3; busy waiting.
• Peterson Algorithm – busy waiting.
• TSL instruction – solves the problem through an atomic operation; should be used without busy waiting.

In essence, what the last three solutions do is this: a process checks whether entry to its critical region is allowed. If it is not, the process just sits in a tight loop waiting until it is. This has unexpected side effects, such as the priority inversion problem.


Slide 209 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Priority Inversion Problem

Consider a computer with two processes:
Process H with high priority,
Process L with low priority.

The scheduling rules are such that H is run whenever it is in the ready state. At a certain moment, with L in its critical region, H becomes ready and is scheduled. H now begins busy waiting, but since L is never scheduled while H is running, L never has the chance to leave its critical region. H loops forever. This is sometimes referred to as the priority inversion problem.

Solution: blocking a process instead of wasting CPU time.

Slide 210 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Sleep and wake up

sleep(): a system call that causes the caller to block, that is, the process voluntarily goes from the running state into the waiting state. The scheduler switches over to another process.

wakeup(process): a system call that causes process to awake from its sleep() and to continue execution. If process is not asleep at that moment, the wakeup signal is lost.

Note: these two calls are fictitious representatives of real system calls whose names and parameters depend on the particular operating system.

Slide 211 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Producer – Consumer Problem

• Shared buffer with limited size: the buffer allows for a maximum of N entries (it is bounded). The problem is also known as the bounded buffer problem.

• Producer puts information into the buffer. When the buffer is full, the producer must wait until at least one item has been consumed.

• Consumer removes information from the buffer. When the buffer is empty, the consumer must wait until at least one new item has been entered.

Producer – Consumer implementation example (this implementation suffers from race conditions):

const int N = 100;          /* buffer capacity            */
int count = 0;              /* number of items in buffer  */

void producer() {
    while (TRUE) {                          /* constantly producing */
        int item = produce_item();
        if (count == N) sleep();            /* sleep when buffer is full */
        insert_item(item);
        count++;
        if (count == 1) wakeup(consumer);   /* buffer was empty beforehand:
                                               wake up waiting consumer(s) */
    }
}

void consumer() {
    while (TRUE) {                          /* constantly consuming */
        if (count == 0) sleep();            /* sleep when buffer is empty (A) */
        item = remove_item();
        count--;
        if (count == N-1) wakeup(producer); /* buffer was full beforehand:
                                               wake up waiting producer(s) */
        consume_item(item);
    }
}


Slide 213 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Producer – Consumer Problem

A race condition may occur in this case:

The buffer is empty and the consumer has just read count to see if it is 0. At that instant (see A in the listing) the scheduler decides to switch over to the producer.

The producer inserts an item in the buffer, increments count and notices that count is now 1. Reasoning that count was just 0 and thus the consumer must be sleeping, the producer calls wakeup() to wake the consumer up.

However, the consumer was not yet asleep; it was taken away from the CPU shortly before it could enter sleep(). The wakeup signal is lost.

When the consumer is rescheduled and resumes at A, it will go to sleep. Sooner or later the producer has filled up the buffer and goes to sleep as well.

Both processes will sleep forever.

Slide 214 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Producer – Consumer Problem

Reasons for the race condition:

• The variable count is unconstrained: any process has access at any time.

• Evaluating count and going to sleep is a non-atomic operation: the prerequisite(s) that led to sleep() may have changed by the time sleep() is reached.

Workaround:

• Add a wakeup waiting bit: when the bit is set, sleep() resets that bit and the process stays awake.

• Each process must have a wakeup bit assigned. Although this is possible, the principal problem is not solved.

What is needed is something that tests a variable and goes to sleep – dependent on that variable – in a single non-interruptible manner.

Slide 215 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Semaphores

• Introduced by Dijkstra (1965).

• Counting the number of wakeups: an integer variable counts the number of wakeups for future use.

• Two operations: down and up. down is a generalization of sleep; up is a generalization of wakeup. Both operations are carried out as a single, indivisible operation (usually in the kernel). Once a semaphore operation is started, no other process can access the semaphore.

/* principle of the down-operation */
down(int *sem) {
    if (*sem < 1) sleep();
    (*sem)--;
}

/* principle of the up-operation */
up(int *sem) {
    (*sem)++;
    if (*sem == 1) /* wakeup a process */ ;
}

Slide 216 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Semaphores

• up and down are system calls, in order to make sure that the operating system briefly disables all interrupts while carrying out the few machine instructions implementing up and down.

• Semaphores should be lock-protected, at least in multi-processor systems, to prevent another CPU from simultaneously accessing a semaphore. The TSL instruction helps out.

Producer – Consumer problem using semaphores (next page). Definition of variables:

const int N = 10;
typedef int semaphore;      /* a semaphore is an integer         */
semaphore empty = N;        /* counting empty slots              */
semaphore full = 0;         /* counting full slots               */
semaphore mutex = 1;        /* mutual exclusion on buffer access */


Producer – Consumer implementation example (this implementation does not suffer from race conditions):

void producer() {
    while (TRUE) {
        int item = produce_item();
        down(&empty);       /* possibly sleep; decrement empty counter     */
        down(&mutex);       /* possibly sleep; claim mutex (set it to 0)   */
        insert_item(item);
        up(&mutex);         /* release mutex, wake up other process        */
        up(&full);          /* increment full counter, possibly wake other */
    }
}

void consumer() {
    while (TRUE) {
        down(&full);        /* possibly sleep; decrement full counter       */
        down(&mutex);       /* possibly sleep; claim mutex (set it to 0)    */
        item = remove_item();
        up(&mutex);         /* release mutex, wake up other process         */
        up(&empty);         /* increment empty counter, possibly wake other */
        consume_item(item);
    }
}
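The listing maps almost one-to-one onto POSIX semaphores, where sem_wait() and sem_post() are the real down and up; a sketch assuming pthreads and a Linux-style libc (sem_init() is deprecated on some platforms):

```c
#include <pthread.h>
#include <semaphore.h>

#define N 4
#define ITEMS 20

static int buffer[N], in_pos = 0, out_pos = 0, consumed_sum = 0;
static sem_t empty_slots, full_slots, buf_mutex;

static void *producer(void *arg) {
    (void)arg;
    for (int item = 1; item <= ITEMS; item++) {
        sem_wait(&empty_slots);         /* down(&empty) */
        sem_wait(&buf_mutex);           /* down(&mutex) */
        buffer[in_pos] = item;
        in_pos = (in_pos + 1) % N;
        sem_post(&buf_mutex);           /* up(&mutex)   */
        sem_post(&full_slots);          /* up(&full)    */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full_slots);          /* down(&full)  */
        sem_wait(&buf_mutex);           /* down(&mutex) */
        consumed_sum += buffer[out_pos];
        out_pos = (out_pos + 1) % N;
        sem_post(&buf_mutex);           /* up(&mutex)   */
        sem_post(&empty_slots);         /* up(&empty)   */
    }
    return NULL;
}

/* Run one producer and one consumer to completion;
   returns the sum of all consumed items (1+2+...+ITEMS). */
int run_bounded_buffer(void) {
    sem_init(&empty_slots, 0, N);       /* counting empty slots */
    sem_init(&full_slots, 0, 0);        /* counting full slots  */
    sem_init(&buf_mutex, 0, 1);         /* mutual exclusion     */
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return consumed_sum;
}
```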

Slide 218 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Semaphores

Assume N = 5.

Scenario: producer is working, no consumer present. Initial condition: empty = 5, full = 0. With each produced item, empty decreases and full increases (4/1, 3/2, 2/3, 1/4, 0/5). When empty reaches 0, the producer goes to sleep in down(&empty).

Scenario: consumer is working, no producer present. Initial condition: empty = 0, full = 5. With each consumed item, full decreases and empty increases. When full reaches 0, the consumer goes to sleep in down(&full).

Slide 219 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Semaphores

Scenario: consumer waking up producer. Assume N = 5, initial condition: empty = 1, full = 4. The producer fills the last slot (empty = 0, full = 5) and goes to sleep on its next down(&empty). When a consumer removes an item (empty = 1, full = 4), its up(&empty) wakes up the producer.

Slide 220 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Semaphores

Scenario: producer waking up consumer. Assume N = 5, initial condition: empty = 4, full = 1. The consumer empties the buffer (empty = 5, full = 0) and goes to sleep on its next down(&full). When the producer inserts an item (empty = 4, full = 1), its up(&full) wakes up the consumer.


Slide 221 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Semaphores

Assume N = 5. Initial condition: empty = 3, full = 2.

[Timing diagram: producer and consumer run overlapping; their down() and up() operations on empty and full interleave.]

If processes overlap, then temporarily it may be that empty + full ≠ N.

Note that consumer and producer may almost concurrently change the same semaphore legally.

Mutual Exclusion

Slide 222 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Mutex

• Simplified semaphore
Used when counting is not needed.

• Two states
Locked or unlocked. Used for managing mutual exclusion (hence the name).

Pseudo assembler listing implementing mutex_lock() and mutex_unlock():

mutex_lock:    TSL R, mutex         | get and set mutex
               CMP R, #0            | was it unlocked?
               JZ ok                | if yes: jump to ok
               CALL thread_yield    | if no: sleep
               JMP mutex_lock       | try again acquiring mutex
ok:            RET

mutex_unlock:  MOV mutex, #0        | unlock mutex
               RET
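The same busy-wait-and-yield lock can be sketched at the C level with C11 atomics; atomic_flag_test_and_set is the analogue of the TSL instruction. An illustrative sketch of ours, not part of the slides:

```c
#include <stdatomic.h>
#include <sched.h>

static atomic_flag mutex_flag = ATOMIC_FLAG_INIT;   /* clear = unlocked */

void mutex_lock(void) {
    /* test-and-set returns the previous value: nonzero means it was locked */
    while (atomic_flag_test_and_set(&mutex_flag))
        sched_yield();                    /* CALL thread_yield: give up the CPU */
}                                         /* loop back: JMP mutex_lock */

void mutex_unlock(void) {
    atomic_flag_clear(&mutex_flag);       /* MOV mutex, #0 */
}
```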

Mutual Exclusion

Slide 223 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Monitors

• High-level synchronization primitive
at programming language level. Direct support by some programming languages.

• A collection of procedures, variables and data structures grouped together in a module

Mutual Exclusion

A monitor has multiple entry points.
Only one process can be in the monitor at a time.
Enforces mutual exclusion – fewer chances for programming errors.

• Monitor implementation
Compiler handles the implementation, or library functions using semaphores.

Slide 224 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Monitors
Mutual Exclusion

monitor example;
    integer i;
    condition c;

    procedure producer();
    ...
    end;

    procedure consumer();
    ...
    end;
end monitor;

A monitor in Pidgin Pascal, from [Ta01 p.115]

Variables are not accessible from outside the monitor's own methods (encapsulation).

Functions (methods) are publicly accessible to all processes; however, only one process at a time may call a monitor function.

If the buffer is full, the producer must wait. If the buffer is empty, the consumer must wait.


Slide 225 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Monitors
Mutual Exclusion

• How can a process wait inside a monitor?
It cannot simply be put to sleep, because then no other process could enter the monitor meanwhile.

• Use a condition variable!
A condition variable supports two operations.

wait(): suspend this process until it is signaled. The suspended process is not considered inside the monitor any more. Another process is allowed to enter the monitor.

signal(): wake up one process waiting on the condition variable. No effect if nobody is waiting. The signaling process automatically leaves the monitor (Hoare monitor).

Condition variables usable only inside a monitor.
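POSIX threads offer condition variables outside a language-level monitor; a mutex plays the part of the monitor lock. Note that pthreads uses Mesa-style (signal-and-continue) semantics rather than the Hoare semantics described above, which is why the wait sits in a while loop. A minimal sketch of ours, not from the slides:

```c
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;  /* the 'monitor' lock */
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;   /* condition variable */
static int ready = 0;
static int observed = 0;

static void *waiter(void *unused) {
    pthread_mutex_lock(&m);            /* enter the monitor */
    while (!ready)                     /* Mesa semantics: re-check after wakeup */
        pthread_cond_wait(&c, &m);     /* wait(): release m, sleep, re-acquire m */
    observed = ready;
    pthread_mutex_unlock(&m);          /* leave the monitor */
    return NULL;
}

static void *signaler(void *unused) {
    pthread_mutex_lock(&m);
    ready = 1;
    pthread_cond_signal(&c);           /* signal(): wake one waiting process */
    pthread_mutex_unlock(&m);          /* signaler leaves explicitly (no Hoare handoff) */
    return NULL;
}

int run_demo(void) {
    pthread_t w, s;
    pthread_create(&w, NULL, waiter, NULL);
    pthread_create(&s, NULL, signaler, NULL);
    pthread_join(w, NULL);
    pthread_join(s, NULL);
    return observed;                   /* 1 once the waiter has seen the signal */
}
```

The while loop makes the demo correct in either interleaving: if the signaler runs first, the waiter finds ready already set and never sleeps.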

Slide 226 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Monitors
Mutual Exclusion

Producer-Consumer problem with monitors, from [Ta01 p.117]

Slide 227 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Processes approaching barrier Waiting for C to arrive All processes continuing

Barriers
IPC

• Group synchronization
Intended for groups of processes rather than for two processes.

• Processes wait at a barrier for the others
according to the all-or-none principle

• After all have arrived, all can proceed

Figure from [Ta01 p.124]

Slide 228 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Barriers
Application example

Process 2 working on these elements

Process 3 working onthese elements

... and so on for theremaining elements

An array (e.g. an image) is updated frequently by some process 0 (producer). Many processes are working in parallel on certain array elements (consumers). All consumers must wait until the array has been updated and can then start working again on the updated input.

Process 0

IPC

Process 1 working on these elements


Slide 229 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

IPC
Intermediate Summary (II)

• Semaphores
Counting variable, used in a non-interruptible manner. down() may put the caller to sleep, up() may wake up another process.

• Mutexes
Simplified semaphore with two states. Used for mutual exclusion.

• Monitors
High-level construct for achieving mutual exclusion at programming language level.

• Barriers
Used for synchronizing a group of processes.

These mechanisms all serve for process synchronization.
For data exchange among processes something else is needed: messages.

Slide 230 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Messages

• Kernel-supported mechanism for data exchange

Eliminates the need for ‚self-made‘ (user-programmed) communication via shared resources such as shared files or shared memory.

IPC

[Figure: process 1 calls send(); the data (a message) is copied from user space into system buffers in the OS (kernel space); process 2 calls receive() and the data is copied from kernel space to its user space.]

• Two basic operations, provided by the kernel (system calls):
send(): send data
receive(): receive data

Slide 231 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Direct Communication

• Processes must name each other explicitly

Messages

- send(P, message): send data to process P
- receive(Q, message): receive data from process Q

• Communication link properties
- One process pair has exactly one link
- The link may be unidirectional or bidirectional

• Both processes must exist
As the name direct implies, you cannot send a message to a future process.

Symmetry in addressing. Both processes need to know each other by some identifier. This is no problem if both were fork()ed off the same parent beforehand, but is a problem when they are ‚strangers‘ to each other.

Slide 232 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Indirect Communication
Messages

- Each mailbox has a unique identifier
- Processes communicate when they access the same mailbox

• Communication link properties
- Link is established when processes share a mailbox
- A link may be associated with many processes (broadcast)
- Unidirectional or bidirectional communication

• Messages are sent to / received from mailboxes
The mailbox must exist, but not necessarily the receiving process yet.

• Primitives
- send(A, message): send message to mailbox A
- receive(A, message): receive message from mailbox A


Slide 233 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Synchronous Communication
Messages

• Also called blocking send / receive

• Sender waits for receiver to receive the data
The send() system call blocks until the receiver has received the message.

[Figure: process 1 send() → kernel buffer → process 2 receive(); an acknowledgement from the receiver unblocks the sender. A single buffer (for the pair) is sufficient.]

• Receiver waits for sender to send dataThe receive() system call blocks until a message is arriving.

Slide 234 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Asynchronous Communication
Messages

• Also called non-blocking send / receive

• Sender drops message and passes on
The send() system call returns to the caller as soon as the kernel has the message.

• Receiver peeks for messages
The receive() system call does not block, but rather returns an error code telling whether there is a message or not. The receiver must do polling to check for messages.

[Figure: process 1 send() → kernel buffers → process 2 receive(). Multiple buffers (for each pair) are needed.]

Slide 235 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Messages

• Send by copy
The message is copied to a kernel buffer at send time. At receive time the message is copied to the receiver. Copying takes time.

• Send by reference
A reference (a memory address or a handle) is copied to the receiver, which uses the reference to access the data. The data usually resides in a kernel buffer (it is copied there beforehand). Fast read access.

• Fixed-size messages
The kernel buffers are of fixed size – as are the messages. Straightforward system-level implementation. Big messages must be constructed from many small messages, which makes user-level programming somewhat more difficult.

• Variable-size messages
Sender and receiver must communicate about the message size. Best use of kernel buffer space; however, buffers must not grow indefinitely.

IPC

Slide 236 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

UNIX IPC Mechanisms

• Pipes
Simple(st) communication link between two processes. Applies the first-in first-out principle. Works like an invisible file, but is no file.
Operations: read(), write().

• FIFOs
Also called named pipes. Works like a file. May exist in modern Unices just in the kernel (and not in the file system). There can be more than one writer or reader on a FIFO.
Operations: open(), close(), read(), write().

• Messages
Allow for message transfer. Messages can have types. A process may read all messages or only those of a particular type. Message communication works according to the first-in first-out principle.
Operations: msgget(), msgsnd(), msgrcv(), msgctl().
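The four message operations can be sketched in a few lines. This is our own demonstration (not from the lecture) in which a process sends a message to itself through a freshly created private queue:

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <string.h>

struct msgbuf { long mtype; char mtext[32]; };  /* mtype > 0 selects a message type */

int msq_demo(char *out) {
    int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);  /* create a private queue */
    if (id < 0) return -1;
    struct msgbuf m = { 1, "Hallo!" };
    msgsnd(id, &m, sizeof m.mtext, 0);               /* enqueue (FIFO per type) */
    struct msgbuf r;
    msgrcv(id, &r, sizeof r.mtext, 1, 0);            /* receive next message of type 1 */
    msgctl(id, IPC_RMID, NULL);                      /* remove the queue again */
    strcpy(out, r.mtext);
    return 0;
}
```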

IPC


Slide 237 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

UNIX IPC Mechanisms

• Semaphores
Creation and manipulation of sets of semaphores.
Operations: semget(), semop(), semctl().

IPC

For an introduction into the UNIX IPC mechanisms (with examples) see

Stefan Freinatis: Interprozeßkommunikation unter Unix - eine Einführung, Technischer Bericht, Fachgebiet Datenverarbeitung, Universität Duisburg, 1994.http://www.fb9dv.uni-duisburg.de/vs/members/fr/ipc.pdf

• Shared memory
A selectable part of the address space of process P1 is mapped into the address space of another process P2 (or others). The processes have simultaneous access.
Operations: shmget(), shmat(), shmdt(), shmctl().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define FIXSIZE 80                     // fixed message size

int main() {
    int fd[2];                         // file descriptors for pipe
    pipe(fd);                          // create pipe
    int result = fork();               // duplicate process
    if (result == 0) {                 // start child's code
        printf("This is the child, my pid is: %d\n", getpid());
        close(fd[1]);                  // we do not need writing
        char buf[256];                 // a buffer
        read(fd[0], buf, FIXSIZE);     // wait for message from parent
        printf("Child: received message was: %s\n", buf);
        exit(0);                       // good bye
    }                                  // end child, start parent
    close(fd[0]);                      // we do not need reading
    printf("This is the parent, my pid is: %d\n", getpid());
    char msg[FIXSIZE] = "Hallo!";      // message padded to fixed size
    write(fd[1], msg, FIXSIZE);        // write message to child
    return 0;
}

Simple pipe example. Parent is writing, child is reading.

Slide 239 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Classical IPC Problems
IPC

Figure from [Ta01 p.125]

• Five philosophers sitting at a table
The problem can be generalized to more than five philosophers, of course.

• Each either eats or thinks
• Five forks available
• Eating needs 2 forks
Slippery spaghetti, one needs two forks!

• Pick one fork at a time
Either first the right fork and then the left one, or vice versa.

The dining philosophers
An artificial synchronization problem posed and solved by Edsger Dijkstra in 1965.

Slide 240 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Dining philosophers
Classical IPC problems

The life of these philosophers consists of alternate periods of eating and thinking. When a philosopher becomes hungry, she tries to acquire her left and right fork, one at a time, in either order. If successful in acquiring two forks, she eats for a while, then puts down the forks and continues to think.

Can you write a program that

makes the philosophers eating and thinking (thus creation of 5 threads or processes, one for each philosopher),

allows maximum utilization (parallelism), that is, two philosophers may eat at a time (no simple solution with just one philosopher eating at a time),

is not centrally controlled by somebody instructing the philosophers,

and that never gets stuck?

Text from [Ta01 p.125]


Slide 241 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Dining philosophers
Classical IPC problems

const int N=5;

void philosopher(int i) { // N philosophers in parallel

while(TRUE){ // for the whole life

think();

take_fork(i); // take left fork

take_fork((i+1)%N); // take right fork

eat();

put_fork(i); // put left fork

put_fork((i+1)%N); // put right fork

}

}

A nonsolution to the dining philosophers problem

If all philosophers take their left fork simultaneously, none will be able to take the right fork. All philosophers get stuck. Deadlock situation.
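A deadlock-free approach (following the idea of guarding fork acquisition with a mutex and one semaphore per philosopher, as in [Ta01]) lets a philosopher start eating only when neither neighbour eats. The sketch below translates that idea to POSIX semaphores; the helper names and the test-only structure are ours:

```c
#include <semaphore.h>

#define N 5
#define LEFT(i)  (((i) + N - 1) % N)
#define RIGHT(i) (((i) + 1) % N)
enum { THINKING, HUNGRY, EATING };

static int state[N];          /* what each philosopher is doing */
static sem_t mutex;           /* protects state[] */
static sem_t s[N];            /* one semaphore per philosopher */

static void test_phil(int i) {
    /* philosopher i may eat if hungry and neither neighbour is eating */
    if (state[i] == HUNGRY &&
        state[LEFT(i)] != EATING && state[RIGHT(i)] != EATING) {
        state[i] = EATING;
        sem_post(&s[i]);      /* up(&s[i]): grant both forks at once */
    }
}

void take_forks(int i) {
    sem_wait(&mutex);         /* enter critical region */
    state[i] = HUNGRY;
    test_phil(i);             /* try to acquire both forks */
    sem_post(&mutex);
    sem_wait(&s[i]);          /* block if the forks were not granted */
}

void put_forks(int i) {
    sem_wait(&mutex);
    state[i] = THINKING;
    test_phil(LEFT(i));       /* a hungry neighbour may now eat */
    test_phil(RIGHT(i));
    sem_post(&mutex);
}

void init_table(void) {
    sem_init(&mutex, 0, 1);
    for (int i = 0; i < N; i++) sem_init(&s[i], 0, 0);
}
```

Because both forks are granted atomically under the mutex, the all-take-the-left-fork deadlock cannot occur, and two non-adjacent philosophers may eat in parallel.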

Slide 242 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Classical IPC Problems
IPC

• Database system
such as an airline reservation system.

• Many competing processes wish to read and write
Many reading processes are not the problem, but if one process wants to write, no other process may have access – not even readers.

The Readers and Writers Problem
An artificial shared database access problem by Courtois et al., 1971.

How to program the readers and writers?

Writer waits until all readers are gone
Not good. Usually there are always readers present. Indefinite wait.

Writer blocks new readers
A solution. The writer waits until the old readers are gone and meanwhile blocks new readers.

Slide 243 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Classical IPC Problems
IPC

The sleeping barber problem

An artificial queuing situation problem

Figure from [Ta01 p.130]

customer chairs

barber sleeps when no customers are present

Slide 244 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Sleeping Barber
IPC

How to program the barber and the customers without getting into race conditions?

The barber shop has one barber, one barber chair, and n chairs for customers, if any, to sit on. If there are no customers present, the barber sits down in the barber chair and falls asleep. When a customer arrives, he has to wake up the sleeping barber. If additional customers arrive while the barber is cutting a customer’s hair, they either sit down (if there are empty chairs) or leave the shop (if all chairs are full). Text from [Ta01 p.129]

const int CHAIRS=5; // number of chairs

typedef int semaphore;

semaphore customers = 0; // number of customers waiting

semaphore barbers = 0; // number of barbers waiting

semaphore mutex = 1; // for mutual exclusion

int waiting = 0;


Slide 245 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

IPC

void barber() {              // barber process
    while (TRUE) {           // for the whole life
        down(&customers);    // sleep if no customers
        down(&mutex);        // acquire access to 'waiting'
        waiting--;
        up(&barbers);        // one barber ready to cut
        up(&mutex);          // release 'waiting'
        cut_hair();          // cut hair (non critical)
    }
}

void customer() {            // customer process
    down(&mutex);            // enter critical region
    if (waiting < CHAIRS) {  // when seats available
        waiting++;           // one more waiting
        up(&customers);      // tell barber if first customer
        up(&mutex);          // release 'waiting'
        down(&barbers);      // sleep if no barber available
        get_haircut();       // get serviced
    } else up(&mutex);       // shop is full, leave
}

A solution to the sleeping barber problem [Ta01 p.131]

Slide 246 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

Processes (153)

Threads (178)

Interprocess Communication (IPC) (195)

Scheduling (247)

Real-Time Scheduling (278)

Deadlocks (318)

Slide 247 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

• Better CPU utilization through multiprogramming

• Scheduling: switching CPU among processes

• Productivity depends on CPU bursts

Figure from [Ta01 p.134]

Slide 248 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Short-Term Scheduler
Scheduling

[Figure: job queue → ready queue → CPU, with the short-term scheduler selecting from the ready queue.]

Also called CPU scheduler. Selects one process from among the ready processes in memory and dispatches it. The dispatcher is a module that finally gives CPU control to the selected process (switching context, switching from kernel mode to user mode, loading the PC).


Slide 249 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling decisions
Scheduling

CPU scheduling decisions may take place when a process

1. switches from running to waiting,
2. switches from running to ready,
3. switches from waiting to ready,
4. or terminates.

Figure from [Sil00 p.89]

Slide 250 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Preemptive(ness)
Scheduling

Preemptiveness determines the way of multitasking.

With non-preemptive scheduling (cooperative scheduling), a running process loses the CPU only because
the process became blocked,
it completed,
or it voluntarily gave up the CPU.

With preemptive scheduling the operating system can additionally force a context switch at any time to satisfy the priority policies. This allows the system to more reliably guarantee each process a regular "slice" of operating time.

Slide 251 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Preemptive(ness)
Scheduling

Cooperative (non-preemptive) scheduling:

• CPU occupation depends on processin particular on the CPU burst distribution.

• Applicable on any hardware platform• Lesser problems with shared resources

at least the elementary parts of shared data structures are not inconsistent

Preemptive scheduling:

• Scheduler can interrupt• Special timer hardware required

for the timer-controlled interrupts of the scheduler.

• Synchronization of shared resourcesAn interrupted process may leave shared data inconsistent.

Slide 252 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling Criteria

• CPU utilization
Keeping the CPU as busy as possible. The utilization usually ranges from 40% (lightly loaded system) to 90% (heavily loaded).

• Throughput
The number of processes that are completed per time unit. For long processes the throughput rate may be one process per hour, for short ones it may be 10 per second.

• Turnaround time
The interval from the time of submission to the time of completion of a process. Includes the time to get into memory, time spent in the ready queue, execution time on the CPU and I/O time.
[With real-time scheduling this time period is called reaction time.]

Scheduling
The scheduling policy depends on what criteria are emphasized [Sil00 p.140]


Slide 253 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling Criteria

• Waiting time
The scheduling algorithm does not affect the time a process executes or spends doing I/O. It only affects the amount of time a process spends waiting in the ready queue. The waiting time is the sum of the time spent waiting in the ready queue.

• Response time
Irrespective of the turnaround time, some processes produce an output fairly early and continue computing new results while previous results are output to the user. The response time is the time from the submission of a request until the first response is produced.
[Remark: In the exercises the response time is defined as the time from submission until the process starts (that is, until the first machine instruction is executing).]

Scheduling

Different systems (batch systems, interactive computers, control systems) may put focus on different scheduling criteria. See next slide.

Slide 254 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling

Criteria importance by system [Ta01 p.137]

Slide 255 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Optimization
Scheduling

• Maximize(average(CPU utilization))

• Maximize(average(throughput))

• Minimize(average(turnaround time))

• Minimize(average(waiting time))

• Minimize(average(response time))

Sometimes it is desirable to optimize the minimum or maximum values rather than the average. For example, to guarantee that all users receive a good service in terms of responsiveness, we may want to minimize the maximum response time. [Note: we do not delve into ‘optimization’ any further].

Common criteria:

Slide 256 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Static / Dynamic Scheduling
Scheduling

With static scheduling all decisions are made before the system starts running. This only works when there is perfect information available in advance about the work needed to be done and the deadlines that have to be met. Static scheduling – if applied – is used in real-time systems that operate in a deterministic environment.

With dynamic scheduling all decisions are made at run time. Little needs to be known in advance. Dynamic scheduling is required when the number and type of requests is not known beforehand (non-deterministic environment). Interactive computer systems like personal computers use dynamic scheduling. The scheduling algorithm is carried out as a (hopefully short) system process in-between the other processes.


Slide 257 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Scheduling Algorithms

• First Come – First Served

• Shortest Job First

• Priority Scheduling

• Round Robin

• Multilevel Queueing

Scheduling

These algorithms typically are dynamic scheduling algorithms.

Slide 258 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

First Come - First Served
Scheduling

The process that entered the ready queue first will be the first one scheduled. The ready queue is a FIFO queue. Cooperative scheduling (no preemption).

Process    Burst time
P1         24 ms
P2          3 ms
P3          3 ms

Let the processes arrive in the order P1, P2, P3. The Gantt chart for the schedule is:

| P1 (0–24) | P2 (24–27) | P3 (27–30) |    t [ms]

Waiting time for P1 = 0 ms, for P2 = 24 ms, for P3 = 27 ms.

Average waiting time: (0 ms + 24 ms + 27 ms) / 3 = 17 ms.

Slide 259 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

First Come - First Served
Scheduling

Let the processes now arrive in the order P2, P3, P1. The Gantt chart for the schedule is:

| P2 (0–3) | P3 (3–6) | P1 (6–30) |    t [ms]

Waiting time for P1 = 6 ms, for P2 = 0 ms, for P3 = 3 ms.

Average waiting time: (6 ms + 0 ms + 3 ms) / 3 = 3 ms.

• Much better average waiting time than in the previous case.
• With FCFS the waiting time generally is not minimal.
• No preemption.
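The two averages are easy to check mechanically: under FCFS each process waits exactly the sum of the bursts queued in front of it. A small helper of our own (not from the slides):

```c
/* average waiting time under First Come - First Served,
   all processes arriving at t = 0 in the given order */
double fcfs_avg_wait(const int burst[], int n) {
    int t = 0;
    long total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += t;   /* process i has waited until now */
        t += burst[i];     /* then it occupies the CPU for its burst */
    }
    return (double)total_wait / n;
}
```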

Slide 260 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shortest Job First (SJF)
Scheduling

Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time.

Two schemes:

• Non-preemptive SJF
Once the CPU is given to the process, it cannot be preempted until the CPU burst is completed.

• Preemptive SJF
When a new process arrives with a CPU burst length less than the remaining burst time of the current process, the CPU is given to the new process. This scheme is known as Shortest Remaining Time First (SRTF).

With respect to the waiting time, SJF is provably optimal: it gives the minimum average waiting time for a given set of processes. Processes with long bursts may suffer from starvation.


Slide 261 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shortest Job First
Scheduling

Process    Arrival time    Burst time
P1          0 ms            7 ms
P2          2 ms            4 ms
P3          4 ms            1 ms
P4          5 ms            4 ms

For non-preemptive scheduling the Gantt chart is:

| P1 (0–7) | P3 (7–8) | P2 (8–12) | P4 (12–16) |    t [ms]

Waiting time for P1 = 0 ms, for P2 = 6 ms, for P3 = 3 ms, for P4 = 7 ms.

Average waiting time: (0 ms + 6 ms + 3 ms + 7 ms) / 4 = 4 ms.

Slide 262 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shortest Job First
Scheduling

Process    Arrival time    Burst time
P1          0 ms            7 ms
P2          2 ms            4 ms
P3          4 ms            1 ms
P4          5 ms            4 ms

For preemptive scheduling (SRTF) the Gantt chart is:

| P1 (0–2) | P2 (2–4) | P3 (4–5) | P2 (5–7) | P4 (7–11) | P1 (11–16) |    t [ms]

Waiting time for P1 = 9 ms, for P2 = 1 ms, for P3 = 0 ms, for P4 = 2 ms.

Average waiting time: (9 ms + 1 ms + 0 ms + 2 ms) / 4 = 3 ms.
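SRTF schedules are easy to get wrong by hand; a millisecond-step simulation reproduces the waiting times of the example. This is a sketch of our own (not from the slides), with wait computed as turnaround minus burst:

```c
/* millisecond-step simulation of Shortest Remaining Time First
   for up to 16 processes; wait[i] = completion - arrival - burst */
void srtf_wait(const int arrive[], const int burst[], int n, int wait[]) {
    int rem[16], done = 0, t = 0;
    for (int i = 0; i < n; i++) rem[i] = burst[i];
    while (done < n) {
        int best = -1;
        for (int i = 0; i < n; i++)        /* pick shortest remaining time */
            if (arrive[i] <= t && rem[i] > 0 &&
                (best < 0 || rem[i] < rem[best]))
                best = i;
        if (best < 0) { t++; continue; }   /* CPU idle until next arrival */
        rem[best]--;                       /* run the chosen process 1 ms */
        t++;
        if (rem[best] == 0) {              /* process completed at time t */
            wait[best] = t - arrive[best] - burst[best];
            done++;
        }
    }
}
```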

Slide 263 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shortest Job First
Scheduling

Predicting the CPU burst time

The next CPU burst is predicted as the exponential average of the measured lengths of previous bursts:

τn+1 = α · tn + (1 − α) · τn

tn = actual length of the nth burst
τn+1 = predicted length of the next burst
α (0 ≤ α ≤ 1) controls the relative contributions of the recent and the past history

Slide 264 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shortest Job First
Scheduling

Figure from [Sil00 p.144]

Exponential average for α = ½ and τ0 = 10


Slide 265 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shortest Job First
Scheduling
Exponential average for α = ½ and τ0 = 10

τ1 = ½ · 6 + ½ · 10 = 8

τ2 = ½ · 4 + ½ · 8 = 6

τ3 = ½ · 6 + ½ · 6 = 6

τ4 = ½ · 4 + ½ · 6 = 5

τ5 = ½ · 13 + ½ · 5 = 9

τ6 = ½ · 13 + ½ · 9 = 11

τ7 = ½ · 13 + ½ · 11 = 12

τn+1 = α · tn + (1 − α) · τn
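The prediction formula is a one-liner in code; the test below reproduces exactly the τ values of the worked example (τ0 = 10, α = ½, bursts 6, 4, 6, 4, 13, 13, 13). The helper name is ours:

```c
/* exponential average: tau_{n+1} = alpha * t_n + (1 - alpha) * tau_n */
double next_tau(double alpha, double t_n, double tau_n) {
    return alpha * t_n + (1.0 - alpha) * tau_n;
}
```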

Slide 266 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Priority Scheduling
Scheduling

Each process is assigned a priority. The process with the highest priority is allocated the CPU.

Two schemes:

• Non-preemptive
• Preemptive
When a new process arrives with a priority higher than that of the running process, the CPU is given to the new process.

• SJF scheduling is a special case of priority scheduling in which the ‚priority' is the inverse of the CPU burst length.

• Solution to starvation problem: The priority of a process increases as the waiting time increases (aging technique).

Slide 267 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Priority Scheduling
Scheduling

Assume low numbers representing high priorities.

Process    Burst time    Priority
P1         10 ms         3
P2          1 ms         1
P3          2 ms         4
P4          1 ms         5
P5          5 ms         2

All processes arrive at time 0. For non-preemptive scheduling the Gantt chart is:

| P2 (0–1) | P5 (1–6) | P1 (6–16) | P3 (16–18) | P4 (18–19) |    t [ms]
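With all processes ready at t = 0, non-preemptive priority scheduling is a sort by priority followed by an FCFS pass. A small helper of our own (not from the slides) that reproduces the waiting times of this example:

```c
/* non-preemptive priority scheduling, all processes ready at t = 0;
   a low priority number means a high priority */
void prio_wait(const int burst[], const int prio[], int n, int wait[]) {
    int done[16] = {0}, t = 0;
    for (int k = 0; k < n; k++) {
        int best = -1;
        for (int i = 0; i < n; i++)   /* pick highest-priority remaining process */
            if (!done[i] && (best < 0 || prio[i] < prio[best]))
                best = i;
        wait[best] = t;               /* it waited until now */
        t += burst[best];             /* and then runs to completion */
        done[best] = 1;
    }
}
```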

Slide 268 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Priority Scheduling
Scheduling

Processes sorted by priority:

Process    Arrival time    Burst time    Priority
P1          0 ms           10 ms         3
P2          2 ms            1 ms         1
P3          2 ms            2 ms         4
P4          6 ms            1 ms         5
P5         12 ms            5 ms         2

Here: preemptive scheduling.

[Timing diagram over 0–20 ms showing for each process when it is running and when it is ready.]


Slide 269 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Each process gets a small unit of CPU time (time quantum), usually 10–100 milliseconds. After the quantum has elapsed, the process is preempted and added to the end of the ready queue.

Burst ≤ quantum
When the current CPU burst is smaller than the time quantum, the process itself will release the CPU (changing state into waiting).

Burst > quantum
The process is interrupted and another process is dispatched.

If the time quantum is very large compared to the processes‘ burst times, the scheduling policy is the same as FCFS.

If the time quantum is very small, the round robin policy turns into processor sharing (seems as if each process has its own processor).

Slide 270 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Process    Burst time
P1         53 ms
P2         17 ms
P3         68 ms
P4         24 ms

Suppose a time quantum of 20 ms. The Gantt chart for the schedule is:

| P1 (0–20) | P2 (20–37) | P3 (37–57) | P4 (57–77) | P1 (77–97) | P3 (97–117) | P4 (117–121) | P1 (121–134) | P3 (134–154) | P3 (154–162) |    t [ms]

Waiting time for P1 = 0 + 57 + 24 = 81 ms, for P2 = 20 ms, for P3 = 37 + 40 + 17 = 94 ms, for P4 = 57 + 40 = 97 ms.

Average waiting time: (81 + 20 + 94 + 97) / 4 = 73 ms.
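The per-round bookkeeping can be automated with a small FIFO simulation (our own sketch, not from the slides); it reproduces the waiting times 81, 20, 94 and 97 ms of the example:

```c
/* round robin with all processes arriving at t = 0;
   wait[i] collects the total time process i spends in the ready queue */
void rr_wait(const int burst[], int n, int quantum, int wait[]) {
    enum { MAXQ = 64 };                   /* generous bound on requeue events */
    int rem[16], ready_since[16], queue[MAXQ], head = 0, tail = 0, t = 0;
    for (int i = 0; i < n; i++) {
        rem[i] = burst[i];
        wait[i] = 0;
        ready_since[i] = 0;
        queue[tail++] = i;                /* initial FIFO order */
    }
    while (head < tail) {
        int p = queue[head++];
        wait[p] += t - ready_since[p];    /* time spent waiting this round */
        int run = rem[p] < quantum ? rem[p] : quantum;
        t += run;
        rem[p] -= run;
        if (rem[p] > 0) {                 /* quantum expired: back of the queue */
            ready_since[p] = t;
            queue[tail++] = p;
        }
    }
}
```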

Slide 271 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Round Robin typically has higher average turnaround times than SJF, but better response times.

Context switch and performance
The smaller the time quantum, the more the context switches affect performance. Below, a process with a 10 ms burst is shown with time quanta of 12, 6 and 1 ms.

Figure from [Sil00 p.148]

Context switches cause overhead

Slide 272 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Turnaround time depends on time quantum

Figure from [Sil00 p.149]

All processes arrive at the same time. Ready queue order: P1, P2, P3, P4.

[Figure: turnaround time as a function of the time quantum, with the processes' burst times given.]


Slide 273 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Average turnaround time for time quantum = 1 ms

[Gantt chart: P1–P4 interleaved in 1 ms slices over 0–20 ms]

Turnaround (P1) = 15 ms

Turnaround (P2) = 9 ms

Turnaround (P3) = 3 ms

Turnaround (P4) = 17 ms

Average turnaround: (15 + 9 + 3 + 17) ms / 4 = 11 ms

Slide 274 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Average turnaround time for time quantum = 2 ms

[Gantt chart: P1–P4 interleaved in 2 ms slices over 0–20 ms]

Turnaround (P1) = 14 ms

Turnaround (P2) = 10 ms

Turnaround (P3) = 5 ms

Turnaround (P4) = 17 ms

Average turnaround: (14 + 10 + 5 + 17) ms / 4 = 11.5 ms

Slide 275 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Round Robin
Scheduling

Average turnaround time for time quantum = 6 ms

Turnaround (P1) = 6 ms

Turnaround (P2) = 9 ms

Turnaround (P3) = 10 ms

Turnaround (P4) = 17 ms

Average turnaround: (6 + 9 + 10 + 17) ms / 4 = 10.5 ms

[Gantt chart: schedule with time quantum = 6 ms over 0–20 ms]

Side note: the policy now is like FCFS

Slide 276 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multilevel Queue
Scheduling

The ready queue is partitioned into separate queues. Each queue has its own CPU scheduling algorithm. There is also scheduling between the queues (inter-queue scheduling).

Figure from [Sil00 p.150]

Inter-queue scheduling: fixed priority or time slicing


Slide 277 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Process Management

Processes (153)

Threads (178)

Interprocess Communication (IPC) (195)

Scheduling (247)

Real-Time Scheduling (278)

Deadlocks (318)

Slide 278 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling
Scheduling

[Timing diagram: the technical process issues requests to the RT system with distance Tdist. For each request at time r, the corresponding process waits (ready) for Tw, is context-switched in (Tcs), starts at s, executes (inclusive output) for ∆e and completes at c. TR and TRmax are marked on the time axis; d is the deadline.]

Real-time condition: TR ≤ TRmax, otherwise real-time violation.

Slide 279 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling
Scheduling

A technical process generates events (periodically or not). A real-time computing system is requested to respond to the events. The response must be delivered within the period TRmax.

The technical system requests computation by raising an interrupt at time r at the real-time system. The time from the occurrence of the request (interrupt) until the context switch of the corresponding computer process is the waiting time Tw. Switching the context takes the time TCS. The point in time at which execution starts is the start time s. The execution time ∆e is the net CPU time needed for execution (even if the process is interrupted). The process finishes at completion time c.

Slide 280 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling
Scheduling

The reaction time (also called response time) TR is the time interval between the request (the interrupt) and the end of the process: TR = Tw + TCS + ∆e. This is the time interval the technical system has to wait until the response is received.

Starting from the request, the maximum response time TRmax defines the deadline d (a point in time) at which the real-time system must have responded.

Note: For all following considerations, the context switch time TCS is neglected, that is, we assume TCS = 0 µs, in accordance with D. Zöbel, W. Albrecht: Echtzeitsysteme, page 24, ISBN 3-8266-0150-5.

A hard real-time system must not violate the real-time conditions.


Slide 281 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Violation

Two technical processes TP1 and TP2 on some machine require response from a real-time system. The corresponding computer processes are P1 and P2. The technical processes generate events as follows:

TP2

TP1

0 5 10

0 5 10

t [ms]

t [ms]a b c d

a b cTRmax2

The execution time of P1 is 1ms, the execution time of P2 is 4 ms, and the scheduling algorithm is preemptive priority scheduling. The context switch time is considered negligible (0 µs).

RT-SchedulingExample RT.1

Response must be given at the latest just before the next event (thus within Tdist).


Slide 282 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Violation

Case 1: P1 low priority, P2 high priority.

Process P1 (serving TP1): priority LOW, execution time 1 ms, TRmax1 = 4 ms.
Process P2 (serving TP2): priority HIGH, execution time 4 ms, TRmax2 = 6 ms.

[Timing diagram: on each event the machine runs P2 first (4 ms) and P1 only afterwards, so P1's response time TR exceeds TRmax1.]

Real-time violation, response to TP1 is too late!

Slide 283 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Violation

Case 2: P1 high priority, P2 low priority.

Process P1 (serving TP1): priority HIGH, execution time 1 ms, TRmax1 = 4 ms.
Process P2 (serving TP2): priority LOW, execution time 4 ms, TRmax2 = 6 ms.

[Timing diagram: on each event the machine runs P1 first (1 ms) and P2 afterwards; both response times stay within TRmax1 and TRmax2.]

No real-time violation. Fine!

Slide 284 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling

Theorem

For a system with n processors (n ≥ 2) there is no optimal scheduling algorithm for a set of processes P1 ... Pm unless all starting times s1, ..., sm, all execution times ∆e1, ..., ∆em, and all completion times c1, ..., cm are known (deterministic systems).

An algorithm is optimal when it finds an effective solution if such exists.

Often, technical processes (or natural processes) are non-deterministic, at least in part.


Slide 285 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

Find a schedule by searching all combinations of processes.

Of each process (non-preemptive!) the following must be known in advance:
• the request time (interrupt arrival time) r — known in case of periodical technical processes
• the response time TR — known from analysis or worst-case measurements
• the deadline d — given by the technical system

Example:

Process  request time ri  deadline di  execution time ∆e
P1       0 ms             30 ms        20 ms
P2       0 ms             90 ms        50 ms
P3       0 ms             100 ms       30 ms

Slide 286 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

Search tree for the example (root Ø, one level per scheduling position):

Ø
  P1 -> P1,P2 -> P1,P2,P3
        P1,P3 -> P1,P3,P2
  P2 -> P2,P1 -> P2,P1,P3
        P2,P3 -> P2,P3,P1
  P3 -> P3,P1 -> P3,P1,P2
        P3,P2 -> P3,P2,P1

For n processes: tree depth (number of levels) = n, number of combinations = n!

Slide 287 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

Sequence P1, P2, P3: [Gantt chart: P1 runs 0–20, P2 runs 20–70, P3 runs 70–100; the deadlines d1 = 30, d2 = 90, d3 = 100 are all met.]

Sequence P1, P3, P2: [Gantt chart: P1 runs 0–20, P3 runs 20–50, P2 runs 50–100 and misses d2 = 90.] Real-time violation.

Slide 288 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

Sequence P2, P1, P3: [Gantt chart: P2 runs 0–50, P1 runs 50–70 and misses d1 = 30, P3 runs 70–100.] Real-time violation.

Sequence P2, P3, P1: [Gantt chart: P2 runs 0–50, P3 runs 50–80, P1 runs 80–100 and misses d1 = 30.] Real-time violation.


Slide 289 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

Sequence P3, P1, P2: [Gantt chart: P3 runs 0–30, P1 runs 30–50 and misses d1 = 30, P2 runs 50–100 and misses d2 = 90.] Real-time violation.

Sequence P3, P2, P1: [Gantt chart: P3 runs 0–30, P2 runs 30–80, P1 runs 80–100 and misses d1 = 30.] Real-time violation.

Slide 290 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

[Search tree for the example, as on slide 286; all sequences except P1, P2, P3 lead to real-time violations.]

The only solution: P1 must be first, P2 must be second.

Slide 291 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Branch-and-Bound Scheduling

[Search tree for the example, as on slide 286.]

For small n one may directly investigate the n! combinations at the leaves. For larger n it is recommended to start from the root and investigate the nodes level by level. When a node violates the real-time condition, the corresponding subtree can be disregarded.
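The pruned search described above can be sketched in a few lines of Python (an illustrative sketch, not lecture code); the process data are taken from the example on slide 285, and the early return plays the role of discarding a subtree as soon as a node violates the real-time condition:

```python
from itertools import permutations

# Example from slide 285: all requests arrive at t = 0, non-preemptive.
# name: (execution time, deadline) in ms
procs = {"P1": (20, 30), "P2": (50, 90), "P3": (30, 100)}

def feasible(order):
    """Run the sequence back to back and check every deadline."""
    t = 0
    for name in order:
        e, d = procs[name]
        t += e          # the process runs to completion
        if t > d:       # deadline missed: prune this branch
            return False
    return True

solutions = [order for order in permutations(procs) if feasible(order)]
print(solutions)  # [('P1', 'P2', 'P3')]
```

As on slide 290, only the sequence P1, P2, P3 survives.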

Slide 292 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadline Scheduling

Priority scheduling: the process with the closest deadline has highest priority. When processes have the same deadline, selection is done arbitrarily or according to FCFS.

The deadline scheduling algorithm is also known as earliest deadline first (EDF). The algorithm is optimal for the one-processor case. If there is a solution, it is found. If none is found, then there is no solution.

• Non-preemptive: The algorithm is carried out after a running process finishes. Intermediate requests are saved (interrupt flip-flops) meanwhile.
• Preemptive: The algorithm is carried out when a request arrives (interrupt routine) or after a process finishes.


Slide 293 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadline Scheduling

Example RT.2: Non-preemptive scheduling

Process  request time ri  deadline di  execution time ∆e
P1       0 ms             5 ms         4 ms
P2       0 ms             7 ms         1 ms
P3       0 ms             7 ms         2 ms
P4       0 ms             13 ms        5 ms

[Gantt chart: P1 runs 0–4, P2 runs 4–5, P3 runs 5–7, P4 runs 7–12; the deadlines d1 = 5, d2 = d3 = 7, d4 = 13 are all met.]

The deadline of P2 and P3 is the same, so the choice is arbitrary. Could be sequence P3, P2 as well.

Slide 294 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadline Scheduling

Example RT.3: Preemptive scheduling

Process  request time ri  deadline di  execution time ∆e
P1       0 ms             4 ms         2 ms
P2       3 ms             14 ms        3 ms
P3       6 ms             12 ms        3 ms
P4       5 ms             10 ms        4 ms

[Gantt chart: P1 runs 0–2; P2 runs 3–5 and is preempted by P4 (5–9); P3 runs 9–12; P2 finishes 12–13. Deadlines d1, d4, d3, d2 are all met.]

Remember, context switch time is neglected.

Slide 295 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadline Scheduling

Continuation of example RT.3

t = 0 ms: Request for P1 arrives. Since there is no other process, P1 is scheduled.

t = 2 ms: P1 finishes. Since there are no requests, the scheduler has nothing to do.

t = 3 ms: Request for P2 arrives. Since there is no other process, P2 is scheduled.

t = 5 ms: Request for P4 arrives. The deadline d4 is closer than the deadline of the running process P2. P4 has higher priority and is scheduled.

t = 6 ms: Request for P3 arrives. Deadline d3 is more distant than any other, so nothing changes. P4 continues.

t = 9 ms: P4 finishes. The closest deadline now is d3, so P3 is scheduled.

t = 12 ms: P3 finishes. The closest deadline now is d2, so P2 is scheduled again.

t = 13 ms: P2 finishes. There are no processes ready. Nothing to schedule.
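The walk-through above can be reproduced with a small 1 ms-step simulation of preemptive deadline scheduling (a sketch under the slides' assumption TCS = 0, not lecture code):

```python
# Preemptive EDF simulation of example RT.3.
# name: (request time r, execution time delta_e, deadline d) in ms
procs = {"P1": (0, 2, 4), "P2": (3, 3, 14), "P3": (6, 3, 12), "P4": (5, 4, 10)}

remaining = {name: e for name, (r, e, d) in procs.items()}
finish = {}
t = 0
while len(finish) < len(procs):
    ready = [n for n, (r, e, d) in procs.items() if r <= t and n not in finish]
    if not ready:
        t += 1                                   # CPU idle (here: 2 ms .. 3 ms)
        continue
    run = min(ready, key=lambda n: procs[n][2])  # earliest deadline first
    remaining[run] -= 1                          # run for 1 ms, TCS = 0
    t += 1
    if remaining[run] == 0:
        finish[run] = t

print(finish)  # {'P1': 2, 'P4': 9, 'P3': 12, 'P2': 13}
```

The completion times match the narrative: P1 at 2 ms, P4 at 9 ms, P3 at 12 ms, P2 at 13 ms; every deadline is met.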

Slide 296 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadline Scheduling

For multi-processor systems, the algorithm is not optimal.

Example RT.4: Three processes and two processors. Non-preemptive scheduling.

Process  request time ri  deadline di  execution time ∆e
P1       0 ms             10 ms        8 ms
P2       0 ms             9 ms         5 ms
P3       0 ms             9 ms         4 ms

[Gantt chart: P2 and P3, having the closest deadlines, are scheduled first on the two processors; P1 can only start after one of them finishes and overruns its deadline d1 = 10.]

Real-time violation for P1.


Slide 297 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling

„Schedulability Test“

When there are n processes that are
• periodic,
• independent of each other,
• preemptable,
and the response is to be delivered latest at the end of each period (that is, TRmax = Tdist),

then the processes can be scheduled on a single processor without real-time violation if

    ∑ (i = 1 … n)  ∆ei / Tdist,i  ≤  1

[Diagram: within each period Tdist the response must arrive by TRmax = Tdist.]
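The test is a one-line utilization check; the sketch below (not lecture code) applies it to the task sets of examples RT.5 and RT.6 from the following slides:

```python
def schedulable(tasks):
    """Schedulability test: sum of delta_e_i / T_dist_i must not exceed 1."""
    return sum(e / t_dist for e, t_dist in tasks) <= 1

# Example RT.5: (execution time, period) in ms; utilization ~ 0.93 -> schedulable
print(schedulable([(15, 30), (25, 70), (15, 200)]))  # True
# Example RT.6: utilization ~ 1.13 -> overutilization, not schedulable
print(schedulable([(2, 4), (3, 14), (5, 12)]))       # False
```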

Slide 298 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling

Example RT.5:

Process  execution time ∆e  deadline di
P1       15 ms              k · 30 ms
P2       25 ms              k · 70 ms
P3       15 ms              k · 200 ms

∑ ∆ei / Tdist,i = 15/30 + 25/70 + 15/200 = 0.5 + 0.36 + 0.075 = 0.935 ≤ 1

The processes can be scheduled. Deadline scheduling would yield:

[Gantt chart over 0–200 ms: P1 runs at the start of each 30 ms period; P2 and P3 fill the gaps, executing in several segments.]

Slide 299 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling

Continuation of example RT.5

t = 0 ms: Requests for P1, P2, P3 arrive. P1 has the closest deadline and is scheduled.

t = 15 ms: P1 finishes. The deadline of P2 is closer than the deadline of P3. P2 is scheduled.

t = 30 ms: Request for P1 arrives. Reevaluation of the deadlines yields that P1 has highest priority. P1 is scheduled.

t = 45 ms: P1 finishes. The deadline of P2 still is closer than the deadline of P3. P2 is scheduled.

t = 55 ms: P2 finishes. The only waiting process is P3. P3 thus is scheduled.

t = 60 ms: Request for P1 arrives. Reevaluation of the deadlines yields that P1 has highest priority. P1 is scheduled.

t = 70 ms: Request for P2 arrives. Deadline of P1 is closest, P1 continues. ...

Slide 300 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling

Example RT.6:

Process  execution time ∆e  deadlines di
P1       2 ms               k · 4 ms
P2       3 ms               k · 14 ms
P3       5 ms               k · 12 ms

∑ ∆ei / Tdist,i = 2/4 + 3/14 + 5/12 = 0.5 + 0.215 + 0.42 = 1.135 > 1

This means an overutilization of the microprocessor. The processor would have to execute more than one process at a time (which is impossible). Therefore there is no schedule that would not violate the real-time condition sooner or later (on a single-processor system). The schedulability test failed.


Slide 301 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Laxity Scheduling

Priority scheduling: the process with the least laxity has highest priority. For equal laxities the selection policy is arbitrary or FCFS.

Laxity: ∆lax = (d − now) − ∆e

The laxity is the period of time left in which a process can be started without violating its deadline. At the latest when the laxity is 0 the process must be started, otherwise it will not finish in time. The execution time ∆e of the process must be known, of course.

[Diagram: from now, the laxity ∆lax precedes the execution time ∆e, which ends at the deadline d.]

now is the point in time at which the laxity is (re)calculated. Usually this is the point in time at which a new request arrives (preemptive scheduling) or at which a process finishes.
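The definition translates directly into a small helper (a sketch, not lecture code; the sample values are the t = 0 laxities of example RT.8 on a later slide):

```python
def laxity(d, now, delta_e):
    """delta_lax = (d - now) - delta_e: time left before the process
    must be started so that it can still meet its deadline."""
    return (d - now) - delta_e

# Laxities of P1..P4 from example RT.8 at now = 0 ms, as (deadline, exec time):
print([laxity(d, 0, e) for d, e in [(1, 1), (6, 5), (5, 3), (8, 5)]])
# [0, 1, 2, 3]
```

A negative result means the deadline can no longer be met, as happens to P4 in example RT.8.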

Slide 302 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Laxity Scheduling

Example RT.7: Three processes and two processors. Non-preemptive scheduling. Same processes as in example RT.4.

Process  request time ri  deadline di  execution time ∆e
P1       0 ms             10 ms        8 ms
P2       0 ms             9 ms         5 ms
P3       0 ms             9 ms         4 ms

Deadline scheduling focuses on the deadline, but does not take into account the execution time ∆e of a process. Laxity scheduling does; it sometimes finds a solution that deadline scheduling does not find.

The processes now undergo laxity scheduling (see next slide).

Slide 303 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Laxity Scheduling

Continuation of example RT.7

t = 0 ms: Requests for P1, P2, P3 arrive. The laxities are: ∆lax1 = 2 ms, ∆lax2 = 4 ms, ∆lax3 = 5 ms. Least laxity is ∆lax1, so P1 is scheduled on processor 1. Processor 2 is not yet assigned, so P2 is chosen (∆lax2 < ∆lax3).

t = 5 ms: P2 finishes. The only process waiting is P3, so it is scheduled.

t = 8 ms: P1 finishes. No new processes to schedule.

[Gantt chart: Processor 1 runs P1 0–8; Processor 2 runs P2 0–5 and P3 5–9.]

No real-time violation, as opposed to the deadline scheduling example RT.4.

Slide 304 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Laxity Scheduling

Example RT.8: Four processes and two processors. Non-preemptive scheduling.

Process  request time ri  deadline di  execution time ∆e
P1       0 ms             1 ms         1 ms
P2       0 ms             6 ms         5 ms
P3       0 ms             5 ms         3 ms
P4       0 ms             8 ms         5 ms

Laxity scheduling, like deadline scheduling, is generally not optimal for multi-processors. That is, it does not always find a solution.

Continuation on next slide.


Slide 305 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Laxity Scheduling

Continuation of example RT.8

t = 0 ms: Requests for P1, P2, P3, P4 arrive. The laxities are: ∆lax1 = 0 ms, ∆lax2 = 1 ms, ∆lax3 = 2 ms, ∆lax4 = 3 ms. Least laxity is ∆lax1, so P1 is scheduled on processor 1. Second least laxity is ∆lax2, so P2 is chosen for processor 2.

t = 1 ms: P1 finishes. Least laxity is ∆lax3 (now 1 ms), so P3 is scheduled on processor 1.

t = 4 ms: P3 finishes. Least laxity is ∆lax4 (now −1 ms), so P4 is scheduled on processor 1 ... but it is already too late (negative laxity).

[Gantt chart: Processor 1 runs P1 0–1, P3 1–4, P4 4–9; Processor 2 runs P2 0–5. P4 misses its deadline d4 = 8.]

Real-time violation for P4.

Slide 306 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Laxity Scheduling

Continuation of example RT.8

However, there exists a schedule that works well:

[Gantt chart of a non-violating schedule found through deadline scheduling: Processor 1 runs P1 0–1 and P2 1–6; Processor 2 runs P3 0–3 and P4 3–8, meeting d4 = 8.]

Scheduling non-preemptive processes in a multi-processor system is a complex problem. This is even the case in a two-processor system when all request times ri are the same and all deadlines di are the same.

Slide 307 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Priority scheduling for periodical preemptive processes where the deadlines are equal to the periods. The process with the highest frequency (repetition rate) has highest priority. Static scheduling.

[Diagram: technical process 1 with a long period Tdist; technical process 2 with a shorter period Tdist.]

Computer process P2 has higher priority than process P1 since its rate is higher.

Although the algorithm is not optimal, it is often used in real-time applications because it is fast and simple (at run time!). Note: static scheduling!

Slide 308 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

The classic static real-time scheduling algorithm for preemptable, periodic processes is RMS (Rate Monotonic Scheduling). It can be used for processes that meet the following conditions:

• Each periodic process must complete within its period.
• No process is dependent on any other process.
• Each process needs the same amount of CPU time on each burst.
• Any non-periodic processes have no deadlines.
• Process preemption occurs instantaneously and with no overhead.

RMS works by assigning each process a fixed priority equal to the frequency of occurrence of its triggering event. For example, a process that must run every 30 ms (≈ 33 Hz) receives priority 33, and a process that must run every 40 ms (= 25 Hz) receives priority 25. The priorities are linear with the rate, which is why it is called rate monotonic.

A more thorough explanation can be found in [Ta01 p.472].
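The fixed-priority assignment described above can be sketched like this (illustrative, not lecture code):

```python
# RMS: static priority = event rate in Hz (higher rate -> higher priority).
def rms_priority(period_ms):
    return 1000 / period_ms

periods = {"A": 30, "B": 40, "C": 50}   # periods in ms (as in example RT.9)
prio = {name: round(rms_priority(p)) for name, p in periods.items()}
print(prio)  # {'A': 33, 'B': 25, 'C': 20}
```

Because the priorities are fixed, the scheduler only ever has to pick the ready process with the largest stored number.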


Slide 309 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Example RT.9: Three periodic processes [Ta01 p.471]

Process  request time ri  deadline di     execution time ∆e
A        k · 30 ms        (k+1) · 30 ms   10 ms
B        k · 40 ms        (k+1) · 40 ms   15 ms
C        k · 50 ms        (k+1) · 50 ms   5 ms

[Figure from [Ta01 p.471]]

Slide 310 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Continuation of example RT.9

The processes A, B, C scheduled with
• Rate Monotonic Scheduling (RMS),
• Deadline scheduling (EDF).

[Figure from [Ta01 p.473]]

Slide 311 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Continuation of example RT.9

Up to t = 90 the choices of EDF and RMS are the same. At t = 90 process A is requested again. The RMS scheduler votes for A (process A4 in the figure) since its priority is higher than the priority of B; thus B is interrupted. The deadline scheduler, in contrast, has a choice because the deadline of A is the same as the deadline of B (dA = dB = 120). In practice, preempting B has some nonzero cost associated with it, therefore it is better to let B continue.

See the next example (example RT.10) to dispel the idea that RMS and EDF would always give the same results.

Slide 312 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Example RT.10: Like RT.9, but process A now has 15 ms execution time.

Process  request time ri  deadline di     execution time ∆e
A        k · 30 ms        (k+1) · 30 ms   15 ms
B        k · 40 ms        (k+1) · 40 ms   15 ms
C        k · 50 ms        (k+1) · 50 ms   5 ms

∑ ∆ei / Tdist,i = 15/30 + 15/40 + 5/50 = 0.5 + 0.375 + 0.1 = 0.975 ≤ 1

The schedulability test yields that the processes are schedulable. Nevertheless, RMS fails in this example while EDF does not. →


Slide 313 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Continuation of example RT.10

RMS leads to a real-time violation. Process C is missing its deadline dC = 50.

Figure from [Ta01 p.474]

Slide 314 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

Why did RMS fail?

Using static priorities only works if the CPU utilization is not too high. It was proved* that RMS is guaranteed to work for any system of periodic processes if

    ∑ (i = 1 … n)  ∆ei / Tdist,i  ≤  n · (2^(1/n) − 1)

For n = 2 processes, RMS will work for sure if the CPU utilization is below 0.828.
For n = 3 processes, RMS will work for sure if the CPU utilization is below 0.780.
For n → ∞ processes, RMS will work for sure if the CPU utilization is below ln 2 (≈ 0.693).

* C.L. Liu, James Layland: Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment, Journal of the ACM, 1973, http://citeseer.ist.psu.edu/liu73scheduling.html
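The bound is easy to evaluate numerically (a sketch, not lecture code):

```python
import math

def rms_bound(n):
    """Liu & Layland utilization bound for RMS: n * (2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)

print(round(rms_bound(2), 3))  # 0.828
print(round(rms_bound(3), 2))  # 0.78
print(round(math.log(2), 3))   # limit for n -> infinity: 0.693
```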

Slide 315 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Rate Monotonic Scheduling

In example RT.9 the utilization was 0.808 (thus higher than 0.780). Why did it work?

We were just lucky. With different periods and execution times, a utilization of 0.808 might fail. In example RT.10 the utilization was so high that there was little hope RMS could work.

In contrast to RMS, deadline scheduling always works for any schedulable set of processes (single-processor system). Deadline scheduling can achieve 100% CPU utilization. The price paid is a more complex algorithm [Ta01 p.475]. Because RMS is static, all priorities are known at run time; selecting the next process is a matter of just a few machine instructions.

Slide 316 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Real-Time Scheduling

Overview of real-time scheduling algorithms:

• Branch and Bound („Planen durch Suchen“): Try all permutations of processes. Preferably used in static scheduling.
• Deadline (EDF) („Planen nach Fristen“): Earliest deadline has highest priority. Execution time is not taken into account. Preferably used in dynamic scheduling.
• Laxity („Planen nach Spielräumen“): Least laxity has highest priority. Execution time is taken into account. Preferably used in dynamic scheduling.
• RMS („Planen nach monotonen Raten“): Highest repetition rate (frequency) has highest priority. Execution time is not taken into account. Preferably used in static scheduling.


Slide 317 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Processes (153)

Threads (178)

Interprocess Communication (IPC) (195)

Scheduling (247)

Real-Time Scheduling (278)

Deadlocks (318)

Slide 318 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Consider two processes requiring exclusive access to some shared resources (e.g. file, tape drive, printer, CD writer).

Process 1:
{
    request(resource1);
    request(resource2);
    ...
    release(resource1);
    release(resource2);
}

Process 2:
{
    request(resource2);
    request(resource1);
    ...
    release(resource2);
    release(resource1);
}

request() is a fictitious system call for requesting exclusive access to a resource. When access cannot be granted, the call blocks until the resource is available.

Slide 319 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

When the two processes are executed sequentially (one after the other), no problem arises.

[Timeline: Process 1 executes its complete request/release sequence first; Process 2 runs afterwards.]

Slide 320 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

When process 1 has acquired the resources before process 2 starts trying the same, no problem arises. Process 2 just has to wait.

[Timeline: Process 1 holds both resources; Process 2 blocks on request(resource2) until Process 1 releases it.]


Slide 321 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

[Timeline: in parallel, Process 1 executes request(resource1) and Process 2 executes request(resource2); then each blocks requesting the other's resource.]

Occasionally, when both processes are carried out in parallel as depicted above, both their attempts to acquire the missing resource cause the processes to block. Since each process holds a resource that the other one needs, and since neither process can release its resource, both processes wait forever (deadlock).

Slide 322 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

A set of processes is deadlocked when each process in the set is waiting for an event that only another process in the set can cause.

Waiting for an event:
• Waiting for the availability of a resource
• Waiting for some input
• Waiting for a message (IPC) or a signal
• ... or any other type of event that a process is waiting for in order to continue

Slide 323 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Classical deadlock problem from the non-computer world:

[Figure: four cars at a crossing, each labeled "Yields to car at right".]

Every car ought to give way to the car on the right. None will proceed.

Figure from lecture slides „Computer Architecture" WS 05/06 (Basermann / Jungmaier)

Slide 324 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Resources
• Anything a process / thread needs to continue. Examples: I/O devices like printer, tape, CD-ROM, files, but also internal resources such as the process table, thread table, file allocation table, or semaphores / mutexes.
• Exclusive access: Only one process at a time can use the resource (e.g. printer, or writing to a shared file).
• Non-exclusive access: More than one process can use the resource at the same time (e.g. reading from a shared file).
• Preemptable resources: The resource can (with some non-zero cost) be temporarily taken away from a process and given to another process (e.g. memory swapping).
• Non-preemptable resources: The resource cannot be temporarily assigned to another process (e.g. printer, CD writer) without leading to garbage.


Slide 325 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

The following four conditions must be present for a deadlock to occur:

• Mutual Exclusion: Each resource is either currently assigned to exactly one process or is available.
• Hold and Wait: A process currently holding a resource can request new resources.
• Non-preemptable resources: Resources previously granted cannot be forcibly taken away from a process.
• Circular Wait: There must be a circular chain of processes, each of which is waiting for a resource held by another process in the chain.

If one of these conditions is absent, no deadlock is possible.

Slide 326 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlock Modeling

Resource allocation graphs:

a) Holding a resource (process A holds resource R).
b) Requesting a resource (process B requests resource S).
c) Deadlock situation: process D requests U, which is held by process C; process C requests T, which is held by D.

Figure from [Ta01 p.165]

Slide 327 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlock Modeling

Resource allocation order leading to a deadlock (processes A, B, C, over time).

Figure from [Ta01 p.166]

Slide 328 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlock Modeling

Example of a resource allocation order not resulting in a deadlock (steps (o), (p), (q), over time).

Figure from [Ta01 p.166]


Slide 329 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Strategies for dealing with deadlocks:

1. Ignore the problem. Sounds silly, but in fact many operating systems do exactly this, assuming that deadlocks occur rarely.
2. Detection & Recovery. The OS tries to detect deadlocks and then takes some recovery action.
3. Avoidance. Resources are granted in such a way that deadlocks cannot occur.
4. Prevention. Trying to break at least one of the four conditions such that no deadlock can happen.

Slide 330 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Strategy 1 (Ignoring the problem)

Most operating systems, including UNIX and Windows, just ignore the problem on the assumption that most users would prefer an occasional deadlock to a rule restricting all users to one process, one open file, and one of everything.

If deadlocks could be eliminated for free, there would not be much discussion. But the price is high. If deadlocks occur on average once a year, but system crashes owing to hardware failures and software errors occur once a week, nobody would be willing to pay a large penalty in performance or convenience to eliminate deadlocks (after [Ta01 p.167]).

For these reasons, the deadlock problem often is disregarded.

Slide 331 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Strategy 2 (Detection & Recovery)

The operating system tries to detect deadlocks and to recover.

Process A holds R and wants S

Process B holds nothing and wants T

Process C holds nothing and wants S

Process D holds U and wants S and T

Process E holds T and wants V

Process F holds W and wants S

Process G holds V and wants U.

Example DL.1 : Consider the following system state:

Is the system deadlocked, and if so, which processes are involved?
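One way to answer this is to derive the wait-for graph from the lists above and search it for a cycle (a sketch of the idea, not the lecture's graphical method, which follows on the next slide):

```python
# Hold/want lists of example DL.1.
holds = {"A": {"R"}, "D": {"U"}, "E": {"T"}, "F": {"W"}, "G": {"V"}}
wants = {"A": {"S"}, "B": {"T"}, "C": {"S"}, "D": {"S", "T"}, "E": {"V"},
         "F": {"S"}, "G": {"U"}}

# P waits for Q if P wants a resource that Q currently holds.
owner = {res: p for p, rs in holds.items() for res in rs}
waits_for = {p: {owner[r] for r in rs if r in owner} for p, rs in wants.items()}

def on_cycle(start):
    """DFS: is `start` reachable from itself via wait-for edges?"""
    seen, stack = set(), list(waits_for.get(start, ()))
    while stack:
        p = stack.pop()
        if p == start:
            return True
        if p not in seen:
            seen.add(p)
            stack.extend(waits_for.get(p, ()))
    return False

print(sorted(p for p in wants if on_cycle(p)))  # ['D', 'E', 'G']
```

Processes D, E and G wait for one another in a cycle and are therefore deadlocked.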

Slide 332 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Continuation of example DL.1 (deadlock detection)

Constructing the resource allocation graph (a): the extracted cycle (b) shows the processes and resources involved in a deadlock.

Figure from [Ta01 p.169]


Slide 333 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Deadlock detection with multiple instances of a resource type

We have (respectively we define):
• n processes P1 ... Pn and m resource classes.
• Ei = the number of existing resource instances of resource class i, 1 ≤ i ≤ m.
• E is the existing resource vector, E = (E1 ... Em).
• A is the available resource vector. Each Ai in A gives the number of currently available resource instances. A = (A1 ... Am).
• The relation X ≤ Y is defined to be true if each Xi ≤ Yi.

Slide 334 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Deadlock detection with multiple instances of a resource type

Definition of the current allocation matrix C (row i holds the resources currently allocated to Pi) and the request matrix R (row i holds the resources Pi still requests).

Figure from [Ta01 p.171]

Slide 335 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Deadlock detection with multiple instances of a resource type

Deadlock detection algorithm:

1. All processes are initially unmarked.
2. Look for an unmarked process Pi for which row Ri ≤ A. Here the algorithm is looking for a process that can be run to completion (the resource demands of the process can be satisfied immediately).
3. If such a Pi is found, add row Ci to A and mark Pi. Go to step 2. After Pi is (or would have) finished, its resources are given back to the pool. The process is marked (in the sense of 'successful completion').
4. If no such process exists, terminate.

All unmarked processes, if any, are deadlocked!
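The four steps map directly onto a few lines of Python (a sketch, not lecture code). The matrices below are an assumption reconstructed from the DL.2 walk-through two slides further on (resource classes: tape drives, plotters, scanners, CD-ROMs), since the original figure is not reproduced in this text:

```python
def detect_deadlock(C, R, A):
    """Steps 1-4: repeatedly mark a process whose request row Ri <= A,
    return its allocation Ci to the pool, and report the unmarked rest."""
    A = list(A)
    unmarked = set(range(len(C)))
    progress = True
    while progress:
        progress = False
        for i in sorted(unmarked):
            if all(r <= a for r, a in zip(R[i], A)):   # step 2: Ri <= A
                A = [a + c for a, c in zip(A, C[i])]   # step 3: add Ci to A
                unmarked.discard(i)                    # mark Pi
                progress = True
    return unmarked                                    # deadlocked processes

# Assumed DL.2 state (rows P1, P2, P3):
C = [[0, 0, 1, 0], [2, 0, 0, 1], [0, 1, 2, 0]]   # current allocation matrix
R = [[2, 0, 0, 1], [1, 0, 1, 0], [2, 1, 0, 0]]   # request matrix
print(detect_deadlock(C, R, [2, 1, 0, 0]))       # set(): no deadlock

# DL.3: P2 additionally holds a plotter, so A = (2 0 0 0):
C2 = [[0, 0, 1, 0], [2, 1, 0, 1], [0, 1, 2, 0]]
print(detect_deadlock(C2, R, [2, 0, 0, 0]))      # {0, 1, 2}: all deadlocked
```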

Slide 336 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Example DL.2 (deadlock detection algorithm):

Consider the following system state (figure from [Ta01 p.173]):

Is there (or will there be) a deadlock in the system?


Slide 337 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Continuation of example DL.2 (deadlock detection algorithm)

Checking P1: R1 is not ≤ A (CD-ROM is missing). P1 cannot run and is not marked.
Checking P2: R2 is not ≤ A (scanner is missing). P2 cannot run and is not marked.
Checking P3: R3 is ≤ A, thus P3 can run and is marked. Its resources are given back to the pool: A = (2 2 2 0).
Checking P1: R1 still is not ≤ A (CD-ROM still not available).
Checking P2: R2 now is ≤ A, thus P2 can run and is marked. Its resources are given back to the pool: A = (4 2 2 1).
Checking P1: R1 now is ≤ A. P1 can run and is marked. Its resources are given back to the pool: A = (4 2 3 1) = E.
No more unmarked processes: termination.

No deadlocks.

Slide 338 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2

Example DL.3 (deadlock detection algorithm):

Same as DL.2, but now C2 = (2 1 0 1) and thus A = (2 0 0 0).

Checking P1: R1 is not ≤ A (CD-ROM is missing). P1 cannot run and is not marked.
Checking P2: R2 is not ≤ A (scanner is missing). P2 cannot run and is not marked.
Checking P3: R3 is not ≤ A (plotter is missing). P3 cannot run and is not marked.
All processes checked. Nothing will change: termination.

The entire system is deadlocked!

Slide 339 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 2 (Detection & Recovery)

Recovery options:

• Resource Preemption: Forcibly taking away a resource from a process. May have ill side effects. Difficult or even impossible in many cases.
• Process Rollback: A process periodically writes its complete state to a file (checkpointing). In case of a deadlock, the process is rolled back to an earlier state in which it occupied fewer resources. Program(ming) overhead!
• Killing Processes: Crudest but simplest method. One or more processes from the chain are terminated and must be started all over again at some later point in time. May also cause ill effects; consider a process updating a database twice instead of once.

Slide 340 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks

Strategy 3 (Avoidance)

Do not allow system states that may result in a deadlock.

A state is said to be safe when it is not deadlocked and there exists some scheduling order in which every process can run to completion even if all of them request their maximum number of resources. An unsafe state may result in a deadlock, but does not have to.

[Table: for each process, the number of resource instances currently held (allocation) and the maximum number of resource instances needed (requests).]

Assume there is a total number of 10 instances available. Then the state is a safe state, since there is a way to run all processes.


Slide 341 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 3

a) Starting situation (question: is this a safe state?). There are 3 resources left in the pool.
b) B is granted 2 additional resources.
c) B has finished. Now 5 resources are free.
d) C is granted another 5 resources.
e) C has finished. Now 7 resources are free. Process A can be run without problems. Thus (a) is a safe state.

Figure from [Ta01 p.177]

Slide 342 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 3

a) Starting situation as before (this is a safe state).
b) A is granted one additional resource.
c) B is granted the remaining 2 resources.
d) B has finished. A and C cannot run because each of them needs 5 resources to complete. Deadlock.

Any other sequence starting from (b) also ends up in a deadlock. Therefore state (b) is an unsafe state. The move from (a) to (b) brought the system from a safe state to an unsafe state.

Figure from [Ta01 p.177]

Slide 343 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 3DeadlocksBanker’s Algorithm (Dijkstra 1965)

Think of a small-town banker who deals with a group of customers to whom he has granted lines of credit. If granting a request leads to an unsafe state, the request is denied. If a request leads to a safe state, the request is granted.

Knowing that not all customers need their credit line immediately, the banker has reserved 10 money units instead of 22 to service them.

Initial state

There are four customers (processes) demanding a total of 22 money units (resources).

The banker (operating system) has provided 10 money units in total.

Slide 344 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks
Strategy 3
Banker's Algorithm (continued)

The banker’s algorithm considers each request as it occurs. A request is granted when the state remains safe, otherwise the request is postponed until later.

a) Initial state (safe)

b) Safe state: C’s maximum request can be satisfied. When C has paid back the 4 money units, B’s request (or D’s) can be satisfied. ...

c) Unsafe state: If any of the customers requests the maximum, the banker would be stuck (deadlock).

Figure from [Ta01 p.178]

(a) (b) (c)


Slide 345 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks
Strategy 3
Banker's Algorithm for multiple resource instances

Current allocation matrix C Request matrix R

Existing

Possessed (allocated)

Available

Figure from [Ta01 p.179]

Slide 346 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks
Strategy 3
Banker's Algorithm for multiple resource instances

1. Look for a row Ri whose unmet requirements are smaller than or equal to A. If no such row exists, the system will deadlock sooner or later since no process can run to completion.

2. Assume the process of the row chosen requests its maximum resources (which is guaranteed to be possible) and finishes. Mark the process as terminated and add its resources to the pool A.

3. Repeat steps 1 and 2 until either all processes are marked (in which case the initial state was safe), or until a deadlock occurs (in which case the initial state was unsafe).
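The three steps can be sketched directly in code. This is an illustrative Python sketch, not lecture material; the allocation and request matrices are reconstructed from the pool arithmetic of the worked example on the following slide (pool (1 0 2 0), completion order D, A, B, C, E).

```python
def bankers_safe(available, allocation, request):
    """Steps 1-3 above: repeatedly pick a process whose unmet request
    fits into the pool A, let it finish, and reclaim its resources."""
    pool = list(available)
    pending = list(allocation)            # process names in table order
    order = []
    while pending:
        p = next((q for q in pending
                  if all(r <= a for r, a in zip(request[q], pool))), None)
        if p is None:
            return None                   # step 1 fails: state is unsafe
        # Granting the request and reclaiming everything afterwards is a
        # net gain of the process's current allocation.
        pool = [a + c for a, c in zip(pool, allocation[p])]
        pending.remove(p)
        order.append(p)
    return order                          # all processes marked: safe

# Matrices reconstructed from the worked example's pool arithmetic.
allocation = {"A": (3, 0, 1, 1), "B": (0, 1, 0, 0), "C": (1, 1, 1, 0),
              "D": (1, 1, 0, 1), "E": (0, 0, 0, 0)}
request = {"A": (1, 1, 0, 0), "B": (0, 1, 1, 2), "C": (3, 1, 0, 0),
           "D": (0, 0, 1, 0), "E": (2, 1, 1, 0)}
print(bankers_safe((1, 0, 2, 0), allocation, request))
# ['D', 'A', 'B', 'C', 'E']
```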

Slide 347 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks
Strategy 3
Banker's Algorithm for multiple resource instances

The pool is A = (1 0 2 0).

Process D can be scheduled next because (0 0 1 0) ≤ (1 0 2 0).
When finished, the pool is A = (1 0 1 0) + (1 1 1 1) = (2 1 2 1).

Process A can be scheduled because (1 1 0 0) ≤ (2 1 2 1).
When finished, the pool is A = (1 0 2 1) + (4 1 1 1) = (5 1 3 2).

Process B can be scheduled because (0 1 1 2) ≤ (5 1 3 2).
When finished, the pool is A = (5 0 2 0) + (0 2 1 2) = (5 2 3 2).

Process C can be scheduled because (3 1 0 0) ≤ (5 2 3 2).
When finished, the pool is A = (2 1 3 2) + (4 2 1 0) = (6 3 4 2).

Process E can be scheduled because (2 1 1 0) ≤ (6 3 4 2).
When finished, the pool is A = (4 2 3 2) + (2 1 1 0) = (6 3 4 2).

Slide 348 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks
Strategy 3
Banker's Algorithm for multiple resource instances

The state shown is a safe state since we have found at least one way to complete all processes. Other sequences are possible.

No more processes. All processes have successfully completed.

In practice the banker's algorithm is of minor use, because

• processes rarely know in advance the maximum number of resources needed,

• the number of processes is not constant over time as users log in and out (or other events require computational attention).


Slide 349 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Deadlocks
Strategy 4 (Deadlock Prevention)
Break (at least) one of the four conditions for a deadlock.

• Avoiding mutual exclusion
Sometimes possible. Instead of using a printer exclusively, the processes write into a print-spooler directory. This way several processes can use the printer at the same time. However, an internal system table (e.g. the process table) cannot be spooled. The same applies to a CD writer.

• Breaking "hold and wait"
Processes request all their resources at once ("either all or none"). However, not all processes know their demand from the beginning. Moreover, the resources are then not optimally used (degradation in multiprogramming).

Variation: each time an additional resource is needed, the process first releases all its resources and then tries to acquire all of them at once. This way a process does not occupy resources while waiting for a new one.

Slide 350 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Strategy 4

• Attacking the "no preemption" condition
Forcibly removing a resource from a process is hardly practicable.

• Breaking "circular wait"
Provide a global numbering of all resources (ranking). Resource requests must be made in ascending order. This way a resource-allocation graph can have no cycles. In the figure, B cannot request the scanner, even if it were available.

Deadlocks

1. Imagesetter
2. Scanner
3. Plotter
4. Tape drive
5. CD-ROM drive

A               B
Scanner         Plotter

However, not all resources allow for a reasonable order. How to order table slots, disk spooling space, locked database records?
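Ordered resource acquisition can be enforced in code by always taking locks in ascending rank. The following is a minimal Python sketch (the helper name and the use of thread locks are illustrative, not from the lecture; the ranking follows the slide's numbering).

```python
import threading

# Global resource ranking as on the slide; requests must ascend.
RANK = {"imagesetter": 1, "scanner": 2, "plotter": 3,
        "tape drive": 4, "cd-rom drive": 5}
LOCKS = {name: threading.Lock() for name in RANK}

def with_resources(names, action):
    """Acquire the locks in ascending rank, run the action, release in
    reverse order. Since every thread locks in the same global order,
    a circular wait (and hence a deadlock) cannot arise."""
    ordered = sorted(names, key=RANK.__getitem__)
    for n in ordered:
        LOCKS[n].acquire()
    try:
        return action()
    finally:
        for n in reversed(ordered):
            LOCKS[n].release()

print(with_resources(["plotter", "scanner"], lambda: "job done"))  # job done
```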

Slide 351 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Memory Management

Slide 352 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Management

Memory (353)

Paging (381)

Segmentation (400)

Paged Segments (412)

Virtual Memory (419)

Caches (471)


Slide 353 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory
Core Memory

Image source: http://www.psych.usyd.edu.au/pdp-11/core.html

• Period: 1950 ... 1975

• Non-volatile

• Matrix of magnetic cores

• Storing a bit by changing the magnetic polarity of a core

• Access time 3 µs ... 300 ns

• Destructive read
After reading a core, the content is lost. A read cycle must be followed by a write cycle in order to restore the content.

Slide 354 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory
Semiconductor Memory (≈1970 ...)

• Storing a bit by charging a capacitor (sometimes just the self-capacitance of a transistor)

• One transistor per bit
High density / capacity per area unit

• Volatile

• Destructive read

• Self-discharging
Periodic refresh needed

Dynamic memory (DRAM)

Image source: http://www.research.ibm.com/journal/rd/391/adler.html

Memory Management

Slide 355 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory
Semiconductor Memory (≈1970 ...)

• Storing a bit in a flip-flop
Setting / resetting the flip-flop

• 6 transistors per bit
More chip area than with DRAM

• Volatile

• Non-destructive read

• No self-discharge

• Fast!

Static memory (SRAM)

Image source: Wikipedia on „SRAM“ (English)

Memory Management

Slide 356 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Hierarchy

Memory hierarchy levels in typical desktop / server computers, figure from [HP06 p.288]

Program(mer)s want unlimited amounts of fast memory. Economical solution: memory hierarchy.

Memory Management


Slide 357 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Main Memory

• Central to computer system

• Large array of words / bytes

• Many programs at a time
for multiprogramming / multitasking to be effective

Memory layout of a time-sharing system: the operating system plus programs 1 ... n reside together in working memory.

Slide 358 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Binding
• Program = binary executable file
• Code/data accessible via addresses

...
i = i + 1;
check(i);
...

Addresses in the source code are symbolic,here: i (a variable) and check (a function).

The compiler typically binds the symbolic addresses to relocatable addresses, such as "i is 14 bytes from the beginning of the module". The compiler may also be instructed to produce absolute addresses (non-relocatable code).

The loader finally binds the relocatable addresses to absolute addresses, such as "i is at 74014", when loading the code into memory.

Memory Management

Slide 359 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Binding Schemes

The binding of code and data to logical memory addresses can be done at three stages:

• Compile time (program creation)
The resulting code is absolute code. All addresses are absolute. The program must be loaded exactly at a particular logical address in memory.

• Load time
The code must be relocatable, that is, all addresses are given as an offset from some starting address (relative addresses). The loader calculates and fills in the resulting absolute addresses at load time (before execution starts).

• Execution time
The relocatable code is executed. Address translation from relative to absolute addresses takes place at execution time (for every single memory access). Special hardware is needed (MMU).

Memory Management

Slide 360 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Logical / Physical Addresses

Logical address
The address generated by the CPU, also termed virtual address. All logical addresses form the logical (virtual) address space.

Physical address
The address seen by the memory. All physical addresses form the physical address space.

In compile-time and load-time address-binding schemes the logical and the physical addresses are the same.

In execution-time address-binding the logical and physical addresses differ.

Memory Management


Slide 361 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Management Unit
Hardware device that maps logical addresses to physical addresses (MMU).

A program (a process) deals with logical addresses; it never sees the real physical addresses.

Figure from [Sil00 p.258]

Memory Management

Slide 362 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Protection

• Protecting the kernel against user processes
No user process may read, modify or even destroy kernel data (or kernel code). Access to kernel data (system tables) only through system calls.

• Protecting user processes from one another
No user process may read or modify other processes' data or code. Any data exchange between processes only via IPC.

MMU equipped with a limit register:

• Loaded with the highest allowed logical address
This is done by the dispatcher as part of the context switch.

• Any address beyond the limit causes an error

• Assumption: contiguous physical memory per process

Memory Management

Slide 363 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Protection

Figure from [Sil00 p.266]

Limit register for protecting process spaces against each other

Memory Management

Slide 364 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Occupation

Obtaining better memory-space utilization. Initially the entire program plus its data (variables) needed to be in memory.

• Dynamic Loading: "Load what is needed when it is needed."

• Overlays: "Replace code by other code."

• Dynamic Linking (Shared Libraries): "Use shared code rather than back-pack everything."

• Swapping: "Temporarily kick out a process from memory."

Memory Management


Slide 365 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Dynamic Loading

• Routines are kept on disk
The main program is loaded into memory.

• Routine loaded when needed
Upon each call it is checked whether the routine is in memory. If not, the routine is loaded into memory.

• Unused routines are never loaded
Although the total program size may be large, the portion that is actually executed can be much smaller.

• No special OS support required
Dynamic loading is implemented by the user. System libraries (and corresponding system calls) may help the programmer.

Memory Occupation

Slide 366 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Overlays

• Existing code is replaced by new code
Similar to dynamic loading, but instead of adding new routines to the memory, existing code is replaced by the loaded code.

• No special OS support required
Overlay technique implemented by the user.

Example: consider a two-pass assembler:

Pass 1: 70 kB
Pass 2: 80 kB
Symbol table: 20 kB
Common routines: 30 kB

Pass 1 and pass 2 do not need to be in memory at the same time → Overlay

Loading everything at once would require 200 kB.

Memory Occupation

Slide 367 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory

Overlays

Figure from [Sil00 p.262]

Pass 1, when finished, is overlaid by pass 2. An additional overlay driver is needed (10 kB), but the total memory requirement now is 140 kB instead of 200 kB.

Memory Occupation

Slide 368 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Dynamic Linking

• Different processes use the same code
This is especially true for shared system libraries (e.g. reading from keyboard, graphical output on screen, networking, printing, disk access).

• Single copy of shared code in memory
Rather than linking the libraries statically to each program (which increases the size of each binary executable), the libraries (or individual routines) are linked dynamically during execution time. Each library resides only once in physical memory.

• "Stub"
A piece of program code initially located at the library references in the program. When first called, it loads the library (if not yet loaded) and replaces itself with the address of the library routine.

• OS support required
since a user process cannot look beyond its own address space to see whether (and where) the library code is located in physical memory (protection!).

Memory Occupation


Slide 369 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Swapping

• A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution.

• Backing store: fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images.

• Roll out, roll in: swapping variant used for priority-based scheduling algorithms; a lower-priority process is swapped out so a higher-priority process can be loaded and executed.

• Major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped.

Memory Occupation

Slide 370 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Swapping

Figure from [Sil00 p.263]

Figure: Process P1 is swapped out, and process P2 is swapped in.

Memory Occupation

Slide 371 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Allocation

Allocation of physical memory to a process:

• Contiguous
The physical memory space is contiguous (linear) for each process.
Fixed-sized partitions
Variable-sized partitions
Placement schemes: first fit, best fit, worst fit

• Non-Contiguous
The physical memory space per process is fragmented (has holes).
Paging
Segmentation
Combination of Paging and Segmentation

Memory Management

Slide 372 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Figure: memory divided into fixed-sized partitions holding the operating system, processes 1 ... 4, and one free partition.

Contiguous Memory Allocation

• Fixed-sized partitions
Memory is divided into fixed-sized partitions. Originally used by IBM OS/360, no longer in use today.

The physical memory allocated to a process is contiguous (no holes).

Simple to implement

Degree of multiprogramming is bound by the number of partitions

Internal fragmentation


Slide 373 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Contiguous Memory Allocation
The physical memory allocated to a process is contiguous (no holes).

• Variable-sized partitions
Partitions are of variable size.

OS must keep a free list listing free memory (holes)

OS must provide a placement scheme

Degree of multiprogramming only limited by available memory

No (or very little) internal fragmentation

External fragmentation
The holes may be too small for a new process

Figure: variable-sized partitions holding the operating system and processes 1 ... 4.

Slide 374 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Compaction
Reducing external fragmentation (for variable-sized partitions)

Figure: before and after compaction. The occupied partitions (operating system, processes 1 ... 4) are moved together so that the free memory forms one contiguous block. The copy operation is expensive.

Slide 375 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Placement Schemes
Satisfying a request of size n from a list of free holes.

• First fit
Find the first hole that is large enough. Fastest method.

• Best fit
Find the smallest hole that is large enough. The entire list must be searched (unless it is sorted by hole size). This strategy produces the smallest leftover hole.

• Worst fit
Find the largest hole. Search the entire list (unless sorted). This strategy produces the largest leftover hole, which may be more useful than the smallest leftover hole from the best-fit approach.

Common to all schemes: find a large enough hole, allocate the portion needed, and return the remainder (leftover hole) to the free list.
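The three schemes can be sketched as one small function over a list of hole sizes. This is an illustrative Python sketch; the request size and hole sizes below are made up for demonstration.

```python
def place(holes, n, scheme):
    """Return (hole index, leftover size) for a request of size n,
    or None if no hole is large enough."""
    fitting = [(i, size) for i, size in enumerate(holes) if size >= n]
    if not fitting:
        return None
    if scheme == "first":
        i, size = fitting[0]
    elif scheme == "best":
        i, size = min(fitting, key=lambda t: t[1])
    else:                                  # "worst"
        i, size = max(fitting, key=lambda t: t[1])
    return i, size - n

holes = [100, 500, 200, 300, 600]          # illustrative free-hole list
print(place(holes, 212, "first"))          # (1, 288)
print(place(holes, 212, "best"))           # (3, 88)
print(place(holes, 212, "worst"))          # (4, 388)
```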

Slide 376 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

First Fit
Example: we need a given amount of memory. The search starts at the bottom.

Figure: the search stops at the first hole that is large enough; the request is placed there, leaving a leftover hole.


Slide 377 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Best Fit
Example: we need a given amount of memory. The search starts at the bottom.

Figure: all holes must be examined; here the top hole fits best. The request is placed there, leaving a leftover hole.

This scheme creates the smallest leftover hole among the three schemes.

Slide 378 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Worst Fit
Example: we need a given amount of memory. The search starts at the bottom.

Figure: all holes must be examined; here the bottom hole is found to be the largest. The request is placed there, leaving a leftover hole.

This scheme creates the largest leftover hole among the three schemes.

Slide 379 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Allocation

• Contiguous
The physical memory space is contiguous (linear) for each process.
Fixed-sized partitions
Variable-sized partitions
Placement schemes: first fit, best fit, worst fit

• Non-Contiguous
The physical memory space of a process is fragmented (has holes).
Paging
Segmentation
Combination of Paging and Segmentation

Allocation of physical memory to a process

Slide 380 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Management

Memory (353)

Paging (381)

Segmentation (400)

Paged Segments (412)

Virtual Memory (419)

Caches (471)


Slide 381 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging

• Physical address space of a process can be non-contiguous

• Physical memory divided into fixed-sized frames
Frame size is a power of 2, between 512 bytes and 8192 bytes

• Logical memory divided into pages
Page size is identical to frame size.

• OS keeps track of all free frames (free-frame list)

• Running a program of size n pages requires finding n free frames

• Page table translates logical to physical addresses.

• Internal fragmentation, no external fragmentation.

Slide 382 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Translation
Address generated by the CPU is divided into:

Page number p: used as an index into a page table which contains the base address f of the corresponding frame in physical memory.

Page offset d: the offset from the frame start; physical memory address = f + d.

logical address: | p (m - n bits) | d (n bits) |

The logical address is m bits wide. Page size = frame size = 2^n.

Paging

Slide 383 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging

low memory

high memory

Figure from [Sil00 p.270]

Physical address = f + d
f = PageTable[p]
p = the m - n most significant bits of the logical address
d = the n least significant bits

Slide 384 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging

Figure from [Sil00 p.271]

Paging model: logical address space is contiguous, whereasthe corresponding physical address space is not.


Slide 385 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging
Figure from [Sil00 p.272]

Example: n = 2 (page size is 4 bytes), m = 4 (logical address space is 16 bytes).

What is the physical address of k?

k is located at logical address 10D = 1010B, so p = 2 (the upper m - n bits) and d = 2 (the lower n bits).

Page table (page number → frame address): 0 → 20, 1 → 24, 2 → 4, 3 → 8.

f = PageTable[2] = 4
Physical address = f + d = 4 + 2 = 6
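The translation can be sketched in a few lines of Python (hypothetical helper, with the page table storing frame addresses as on the slide).

```python
def translate(logical, n, page_table):
    """Split the logical address into page number p (upper bits) and
    offset d (lower n bits); the table yields the frame address f."""
    p = logical >> n                 # page number
    d = logical & ((1 << n) - 1)     # offset within the page
    f = page_table[p]                # frame *address*, as on the slide
    return f + d

# Slide's numbers: n = 2, page table 0 -> 20, 1 -> 24, 2 -> 4, 3 -> 8,
# and k at logical address 10.
print(translate(10, 2, [20, 24, 4, 8]))   # 6
```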

Slide 386 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Free-Frame List

Before allocation, the free-frame list contains frames 14, 13, 18, 20 and 15. A new process with four pages (0 ... 3) is to be loaded.

After allocation, the page table of the new process maps page 0 → frame 14, page 1 → frame 13, page 2 → frame 18, page 3 → frame 20. Only frame 15 remains in the free-frame list.

The OS must maintain a table of free frames (free-frame list).

Paging

Slide 387 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page-Table

Where to locate the page table?

• Dedicated registers within the CPU
Only suitable for small memory. Used e.g. in the PDP-11 (8 page registers, each page 8 kB, 64 kB main memory in total). Fast access (high-speed registers).

• Table in main memory
A dedicated CPU register, the page-table base register (PTBR), points to the table in memory (the table currently in use). With each context switch the PTBR is reloaded (then pointing to another page table in memory). The actual size of the page table is given by a second register, the page-table length register (PTLR).

With the latter scheme we need two memory accesses, one for the page table and one for accessing the memory location itself. Slowdown! Solution: a special hardware cache, the translation look-aside buffer (TLB).

Paging

Slide 388 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Translation Look-Aside Buffer

A translation look-aside buffer (TLB) is a small fast-lookup associative memory.

The associative registers contain page ↔ frame entries (key | value). When a page number is presented to the TLB, all keys are checked simultaneously. If the desired page number is not in the TLB, the translation must be fetched from the page table in memory.

Paging

Example TLB contents (key = page number, value = frame address or frame number):
5 → 12, 0 → 14, 1 → 13, 4 → 4, 2 → 18, 6 → 15, 9 → 17, 3 → 20.
Presenting page number 2 yields frame 18.


Slide 389 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Translation Look-Aside Buffer

Paging hardware with TLB. Figure from [Sil00 p.276]

Paging

Slide 390 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Access Time
Assume: memory access time = 100 ns, TLB access time = 20 ns.

When page number is in TLB (hit):total access time = 20ns + 100ns = 120ns

When page number is not in TLB (miss):total access time = 20ns + 100ns + 100ns = 220ns

With 80% hit ratio:average access time = 0.8 · 120ns + 0.2 · 220ns = 140ns

With 98% hit ratio:average access time = 0.98 · 120ns + 0.02 · 220ns = 122ns

Paging
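The calculations above can be reproduced with a small helper. This is an illustrative sketch; the `levels` parameter is an assumption that generalizes the one-level case here to the four-level figure computed on slide 398.

```python
def effective_access(hit_ratio, tlb=20, mem=100, levels=1):
    """Average access time in ns: a TLB hit costs one memory access,
    a miss additionally walks a `levels`-deep page table."""
    hit = tlb + mem
    miss = tlb + levels * mem + mem
    return hit_ratio * hit + (1 - hit_ratio) * miss

print(round(effective_access(0.80), 1))            # 140.0
print(round(effective_access(0.98), 1))            # 122.0
print(round(effective_access(0.98, levels=4), 1))  # 128.0 (cf. slide 398)
```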

Slide 391 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Protection
With paging, the processes' memory spaces are automatically protected against each other since each process is assigned its own set of frames.

If a process tries to access a page that is not in the page table (or is marked invalid, see next slide), the OS traps the process.

Figure from [Sil00 p.272]

Page table (page number → frame address): 0 → 20, 1 → 24, 2 → 4, 3 → 8.

Valid physical addresses: 20 ... 23, 24 ... 27, 04 ... 07, 08 ... 11.

Paging

Slide 392 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Frame Attributes
Each frame may be characterized by additional bits in the page table.

Figure from [Sil00 p.277]

Valid / invalid
Whether the frame is currently allocated to the process

Read-only
Frame is read-only

Execute-only
Frame contains code

Shared
Frame is accessible to other processes as well.

Paging


Slide 393 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shared Pages
Implementation of shared memory through paging is rather easy.

A shared page is a page whose frame is allocated to other processes as well. Several processes share a page in that each of the shared pages is mapped to the same frame in physical memory.

Shared code must be non-self-modifying code (reentrant code).

Figure on the next slide:

Three processes are using an editor. The editor needs 3 pages for its code. Rather than loading the code three times into memory, the code is shared. It is loaded only once into memory, but is visible to each process as if it were their private code.

The data (the text edited), of course, is private to each process. Each process thus has its own data frame.

Paging

Slide 394 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shared Pages

Figure from [Sil00 p.283]

Note: free memory is shown in gray, occupied memory in white.

Figure: each of the three processes has pages 0 ... 3; the editor-code pages map to shared frames, while the private data page maps to a frame of its own.

Pages 0,1,2 of each process are mapped to physical frames 3,4,6.

Slide 395 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging
Logical address space of modern CPUs: 2^32 ... 2^64

Assume: 32-bit CPU, frame size = 4 kB
⇒ 2^32 / 2^12 = 2^20 page table entries (per process)

Each entry: 20 bit + 20 bit = 5 byte
(20 bit for the page number, 20 bit for the frame number, which is less than the 32 bit required for the frame address)

⇒ 2^20 × 5 byte = 5 MB per page table!

Page table entry: page number (20 bit) | frame number (20 bit)
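The arithmetic can be restated as a tiny sketch:

```python
# The slide's arithmetic for a 32-bit CPU with 4 kB frames, restated.
entries = 2 ** 32 // 2 ** 12       # pages per process: 2^20
entry_bytes = (20 + 20) // 8       # 20-bit page number + 20-bit frame number
table_mb = entries * entry_bytes // 2 ** 20
print(entries, table_mb)           # 1048576 5
```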

Slide 396 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Two-Level Paging
Often a process will not use all of its logical address space. Rather than allocating the page table contiguously in main memory (for the worst case), the page table is divided into small pieces and is paged itself.

outer page table: its output points to a frame containing page table entries (the inner page table)

inner page table: its output points to the final destination frame

Paging


Slide 397 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Two-Level Paging

logical address: | p1 (10 bit) | p2 (10 bit) | d (12 bit) |
(page numbers p1, p2; page offset d)

Numbers are for the 32-bit, 4 kB frame example.

Figure from [Sil00 p.279]

The outer page table has max 2^10 entries; each page of the inner table has 2^10 entries. The last step of the lookup yields the final destination frame in memory.

Paging
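The 10/10/12 split can be sketched with plain bit operations (the sample address below is arbitrary, chosen only for illustration).

```python
def split_two_level(addr):
    """Split a 32-bit logical address into the 10/10/12 fields:
    outer index p1, inner index p2, offset d within the 4 kB frame."""
    p1 = (addr >> 22) & 0x3FF      # top 10 bits
    p2 = (addr >> 12) & 0x3FF      # next 10 bits
    d = addr & 0xFFF               # low 12 bits
    return p1, p2, d

p1, p2, d = split_two_level(0x00403ABC)    # arbitrary sample address
print(p1, p2, d)                           # 1 3 2748
assert (p1 << 22) | (p2 << 12) | d == 0x00403ABC   # fields reassemble
```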

Slide 398 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Paging

• Tree-structure principle
Each outer page entry defines the root node of a tree.

• Two- / three- / four-level paging
SPARC (32 bit): three-level paging. Motorola 68030 (32 bit): four-level paging.

• Better memory utilization
than using a contiguous (and possibly maximum-sized) page table.

• Increase in access time
since we hop several times until the final memory location is reached. Caching (TLB), however, helps out a lot.

Four-level paging with 98% hit rate:
Effective access time = 0.98 · 120 ns + 0.02 · 520 ns = 128 ns

Paging

Slide 399 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Memory (353)

Paging (381)

Segmentation (400)

Paged Segments (412)

Virtual Memory (419)

Caches (471)

Slide 400 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation
User views of logical memory:

Linear array of bytes
Reflected by the 'Paging' memory scheme.

A collection of variable-sized entities
The user thinks in terms of "subroutines", "stack", "symbol table", "main program", which are somehow located somewhere in memory.

Figure from [Sil00 p.285]

Segmentation supports this user view. The logical address space is a collection of segments.


Slide 401 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation

Segmentation model: the user space (logical address space) consists of a collection of segments which are mapped through the segmentation architecture onto the physical memory.

Figure: segments 1 ... 4 of the user space appear at (differently ordered) places in physical memory.

Slide 402 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation

• Physical address space of a process can be non-contiguous, as with paging

• Logical address consists of a tuple <segment number, offset>

• Segment table maps logical addresses onto physical addresses
base: physical address of the segment
limit: length of the segment

• Segment table can hold additional segment attributes
like the frame attributes (see paging).

• Shared segments
Shared segments are mapped to the same segment in physical memory.

Slide 403 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation

Figure from [Sil00 p.286]

s selects the entry from the table. Offset d is checked against the maximum size of the segment (limit). Final physical address = base + d.
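The lookup with its limit check can be sketched as follows. The base/limit values are illustrative, in the style of the [Sil00 p.287] example referenced below.

```python
def seg_translate(s, d, segment_table):
    """segment_table[s] = (base, limit); trap when the offset d is
    not smaller than the segment's limit."""
    base, limit = segment_table[s]
    if d >= limit:
        raise MemoryError("segmentation violation: offset beyond limit")
    return base + d

# Illustrative base/limit values for a five-segment process.
table = {0: (1400, 1000), 1: (6300, 400), 2: (4300, 400),
         3: (3200, 1100), 4: (4700, 1000)}
print(seg_translate(2, 53, table))   # 4353
```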

Slide 404 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation

• Segments are variable-sized
Dynamic memory allocation required (first fit, best fit, worst fit).

• External fragmentation
In the worst case the largest hole may not be large enough to fit in a new segment. Note that paging has no external fragmentation problem.

• Each process has its own segment table
like with paging, where each process has its own page table. The size of the segment table is determined by the number of segments, whereas the size of the page table depends on the total amount of memory occupied.

• Segment table located in main memory
as is the page table with paging

• Segment table base register (STBR)
points to the current segment table in memory

• Segment table length register (STLR)
indicates the number of segments


Slide 405 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation
Example:

A program is being assembled. The compiler determines the sizes of the individual components (segments) as follows:

Segment          Size
stack            1100 byte
subroutine       1000 byte
function sqrt()   400 byte
symbol table     1000 byte
main program      400 byte

Slide 406 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation
Example (continued):

The process is assigned 5 segments in memory as well as a segment table.

Figure from [Sil00 p.287]

Slide 407 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Shared Segments

Processes P1 and P2 share the editor code. Segment 0 of each process is mapped onto the same physical segment at address 43062.

Figure from [Sil00 p.288]

The data segments are private to each process, so segment 1 of each process is mapped to its own segment in physical memory.

Segmentation

Slide 408 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging versus Segmentation

• With paging, physical memory is divided into fixed-size frames. When memory space is needed, as many free frames are occupied as necessary. These frames can be located anywhere in memory; the user process always sees a logically contiguous address space.

• With segmentation, the memory is not systematically divided. When a program needs k segments (usually of different sizes), the OS tries to place these segments in the available memory holes. The segments can be scattered around memory. The user process does not see one contiguous address space, but a collection of segments (of course each individual segment is contiguous, as is each page or frame).


Slide 409 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging versus Segmentation

Figure: paging allocates fixed-size frames (13 ... 20); segmentation places variable-size segments seg1 ... seg4 into memory holes.

Paging is based on fixed-size units of memory (frames); the unused rest of a frame is internal fragmentation.

Segmentation is based on variable-size units of memory (segments); free memory between segments can be allocated.

Slide 410 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paging versus Segmentation

Paging:
• Each process is assigned its page table.
• Page table size proportional to allocated memory
• Often large page tables and/or multi-level paging
• Internal fragmentation
• Free memory is quickly allocated to a process
• The Motorola 68000 line is based on a flat address space

Segmentation:
• Each process is assigned a segment table
• Segment table size proportional to the number of segments
• Usually small segment tables
• External fragmentation
• Lengthy search times when allocating memory to a process
• The Intel 80x86 family is based on segmentation

Slide 411 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Management

Memory (353)

Paging (381)

Segmentation (400)

Paged Segments (412)

Virtual Memory (419)

Caches (471)

Slide 412 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paged Segments

[Figure: segments seg1–seg4, shown first as contiguous memory areas (segmentation), then sliced into pages scattered over frames 13–20 (paged segments)]

Combining segmentation with paging yields paged segments.

With segmentation, each segment is a contiguous space in physical memory. With paged segments, each segment is sliced into pages. The pages can be scattered in memory.


Slide 413 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paged Segments

[Figure: logical process space with segments seg1–seg4; each segment has its own page table mapping its pages to frame numbers 13–20 in physical memory; unused memory within the last frame of a segment is internal fragmentation]

Each segment has its own page table.

Slide 414 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paged Segments

The MULTICS operating system (a predecessor of UNIX) solved the problems of external fragmentation and lengthy search times by paging the segments.

This solution differs from pure segmentation in that each segment table entry does not contain the base address of the segment, but rather the base address of a page table for this segment.

In contrast to pure paging, where each process is assigned a page table, here each segment is assigned a page table.

The processes still see just segments – not knowing that the segments themselves are paged.

With paged segments no more time is spent on optimal segment placement; however, some internal fragmentation is introduced.

Slide 415 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paged Segments

The logical address is a tuple <segment number s, offset d>. The segment number is added to the STBR (segment table base register) and thus points to a segment table entry. The segment table is located in main memory. From the entry the page table base is derived, which points to the beginning of the corresponding page table in memory. The first part p of the offset d determines the entry in the page table. The output of the page table is the frame address f (or alternatively a frame number). Finally f + d´ is the physical memory address.

Steps in resolving the final physical address:

PageTable = SegmentTable[s].base;
f = PageTable[p];
final address = f + d´

The logical address is thus split into the fields s | p | d´ (the offset d consists of p and d´).

Explanation of next slide (principle of paged segments)
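The translation steps above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code; the page size, segment numbers and frame numbers are made-up values, and the segment table is modelled as a plain dictionary of page tables.

```python
# Sketch of paged-segment address translation (sizes and table contents
# are illustrative assumptions, not from the lecture).
PAGE_SIZE = 256  # bytes per page; the offset d´ is d mod PAGE_SIZE

# Segment table: segment number s -> page table (a list of frame numbers).
segment_table = {
    0: [5, 9, 2],   # segment 0 occupies frames 5, 9, 2
    1: [7, 1],      # segment 1 occupies frames 7, 1
}

def translate(s, d):
    """Resolve logical address <segment s, offset d> to a physical address."""
    page_table = segment_table[s]        # segment table entry -> page table
    p, d_prime = divmod(d, PAGE_SIZE)    # split offset d into <p, d´>
    f = page_table[p]                    # page table entry -> frame number
    return f * PAGE_SIZE + d_prime       # physical address = f + d´

print(translate(0, 300))  # page 1 of segment 0 -> frame 9 -> 9*256 + 44 = 2348
```

The three statements in `translate` correspond one-to-one to the pseudocode steps on the slide.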

Slide 416 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paged Segments

Principle of paged segments


Slide 417 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Paged Segments

• Combination of segmentation and paging
User view is segmentation, memory allocation scheme is paging

• Used by modern processors / architectures

Example: Intel 80386

The CPU has 6 segment registers which act as a quick 6-entry segment table.

Up to 16384 segments per process are possible, in which case the segment table resides in main memory.

The maximum segment size is 4 GB. Within each segment we have a flat address scheme of 2^32 byte addresses.

The page size is 4 kB. A two-level paging scheme is used.

Slide 418 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Memory (353)

Paging (381)

Segmentation (400)

Paged Segments (412)

Virtual Memory (419)

Caches (471)

Slide 419 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

What if the physical memory is smaller than required by a process?

Dynamic Loading / Overlays
Require special precautions and extra work by the programmer.

It would be much easier if we did not have to worry about the memory size and could leave the problem of fitting a larger program into smaller memory to the operating system.

„Virtual Memory“

Memory is abstracted into an extremely large uniform array of storage, independent of the amount of physical memory available.

Slide 420 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

• Based on locality assumption
No process can access all its code and data at the same time; therefore the entire process space does not need to be in memory at all times.

• Only parts of the process space are in memory
The remaining ones are on disk and are loaded when demanded.

• Logical address space can be much larger than physical address space
A program larger than physical memory can be executed.

More programs can (partially) reside in memory, which increases the degree of multiprogramming!


Slide 421 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

[Figure: a program larger than physical memory; the part currently needed resides in memory (next to the OS and free memory), the rest on the backing store (usually a disk)]

Virtual memory concept (one program)

Slide 422 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

[Figure: programs A, B and C share physical memory; the pieces A´, A´´, B´, B´´, C´, C´´ reside on the backing store]

Virtual memory concept (three programs)

Slide 423 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

Virtual memory can be implemented by means of

• Demand Segmentation
Used in early Burroughs computer systems and in IBM OS/2. Complex segment-replacement algorithms.

• Demand Paging
Commonly used today. Physical memory is divided into frames (paging principle). Demand paging applies to both paging systems and paged segment systems.

Figure next slide: Virtual memory usually is much larger than physical memory (e.g. modern 64-bit processors). The pages currently needed by a process are in memory, the other pages reside on disk. The page table records whether a page is in memory or on disk.

Slide 424 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

page table

Virtual Memory

Figure from [Sil00 p.299]

disk

Virtual memory consists of more pages than there are frames in physical memory


Slide 425 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Demand Paging

A page is brought from disk into memory when it is needed (when it is demanded by the process).

• Less I/O
than loading the entire program (at least for the moment)

• Less memory needed
since a (hopefully) great part of the program remains on disk

• Faster response
The process can start earlier since loading is quicker

• More processes in memory
The memory saved can be given to other processes

Loading a page on demand is done by the pager (a part of the operating system – usually a daemon process).

Virtual Memory

Slide 426 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Demand Paging

Q: How does the OS know that a page is demanded by a process?

A: When the process tries to access a page that is not in memory! A process does not know whether or not a page is in memory; only the OS knows.

Each page table entry has a validity bit (v):
• If v = 1 ⇒ the page is in memory
• If v = 0 ⇒ the page is on disk
• The validity bit is also termed valid-invalid bit

During address translation, when the validity bit is found to be 0, the hardware causes a page fault trap to the operating system.

Virtual Memory

Slide 427 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Fault

A page fault is the event that a non-memory-resident page was accessed by some process.

Steps in demand paging:

1. A reference to some page is made.

2. The page is not listed in the table (or is marked invalid), which causes a page fault trap (a hardware interrupt) to the operating system.

3. An internal table is checked (usually kept with the process control block) to determine whether the reference was a valid or an invalid memory access. If the reference was valid, a free frame has to be found.

Virtual Memory

Slide 428 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Fault

4. A disk operation is scheduled to read the desired page into the free frame.

5. When disk read is complete, the internal tables are updated to reflect that the page now is in memory.

6. The process is restarted at the instruction that caused the page fault trap. The process can now access the page.

These steps are symbolized in the next figure →

Virtual Memory


Slide 429 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

Figure from [Sil00 p.301]

Page table – indicating that pages 0, 2 and 5 are currently in memory, while pages1, 3, 4, 6, 7 are not.

Slide 430 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Fault

Figure from [Sil00 p.302]

Steps in handling a page fault

Virtual Memory

Slide 431 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Performance of Demand Paging

Page fault rate 0 ≤ p ≤ 1
Average probability that a memory reference will cause a page fault:
if p = 0 ⇒ no page faults at all
if p = 1 ⇒ every reference causes a page fault

Memory access time t_ma
Time to access physical memory (usually in the range of 10 ns ... 150 ns)

Effective access time t_eff
Average effective memory access time. This time finally counts for system performance:

t_eff = (1 – p) · t_ma + p · page fault time

Virtual Memory

Slide 432 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Performance of Demand Paging

Page fault time
The time from the failed memory reference until the machine instruction continues:

Trap to the OS
Context switch
Check validity
Find a free frame
Schedule disk read
Context switch to another process (optional)
Place page in frame
Adjust tables
Context switch and restart process

Assuming a disk system with an average latency of 8 ms, an average seek time of 15 ms and a transfer time of 1 ms (and neglecting that the disk queue may hold other processes waiting for disk I/O), and assuming the execution time of the page fault handling instructions to be 1 ms, the page fault time is 25 ms.

Virtual Memory


Slide 433 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Performance of Demand Paging

Effective access time (with t_ma = 100 ns and a page fault time of 25 ms):

t_eff = (1 – p) · 100 ns + p · 25 ms
      = 100 ns + p · 24,999,900 ns

When each memory reference causes a page fault (p = 1), the system is slowed down by a factor of 250,000.

When one out of 1000 references causes a page fault (p = 0.001), the system is slowed down by a factor of about 250.

For less than a 10% degradation, the page fault rate p must be less than 0.0000004 (1 page fault in 2.5 million references).

Virtual Memory
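The slide's numbers are easy to reproduce. The following is a small sketch (not from the lecture) using the slide's values of t_ma = 100 ns and a page fault time of 25 ms:

```python
# Effective access time for demand paging, using the slide's numbers:
# t_ma = 100 ns, page fault time = 25 ms. All times are in nanoseconds.
def t_eff(p, t_ma=100, t_fault=25_000_000):
    return (1 - p) * t_ma + p * t_fault

print(t_eff(0))      # 100 ns: no page faults
print(t_eff(0.001))  # ~25100 ns: one fault per 1000 refs, ~250x slowdown
print(t_eff(1))      # 25,000,000 ns = 25 ms: every reference faults
```

Setting t_eff ≤ 110 ns and solving for p reproduces the bound p < 0.0000004 for less than 10% degradation.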

Slide 434 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Performance of Demand Paging

Some possibilities for lowering the page fault rate:

• Increase page size
With larger pages the likelihood of crossing page boundaries is lower.

• Use a „good“ page replacement scheme
Preferably one that minimizes page faults.

• Assign „sufficient“ frames
The system constantly monitors memory accesses, creates page-usage statistics and adjusts the number of allocated frames on the fly. Costly, but used in some systems (so-called working-set model).

• Enforce program locality
Programs can contribute to locality by minimizing cross-page accesses. This applies to the implemented algorithms as well as to the addressing modes of the individual machine instructions.

Virtual Memory

Slide 435 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Size

What should be the page (= frame) size?

Small pages: little internal fragmentation, large page tables, slower disk I/O, more page faults.

Large pages: internal fragmentation, smaller page tables, faster disk I/O, fewer page faults.

Intel 80386: 4 kB
Intel Pentium II: 4 kB or 4 MB
Sun UltraSparc: 8 kB, 64 kB, 512 kB, 4 MB

The trend goes toward larger pages. Page faults are more costly today because the gap between CPU speed and disk speed has increased.

Virtual Memory

Slide 436 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Attributes

In addition to the validity bit v, each page may be equipped with the following attribute bits in the page table entry:

• Reference bit r
Upon any reference to the page (read / write) the bit is set. Once the bit is set it remains set until cleared by the OS.

• Modify bit m
Each time the page is modified (write access), the bit is set. The bit remains set until cleared by the OS.

A page that has been modified is also called dirty; the modify bit is also termed dirty bit. When the page is not modified it is clean.

Virtual Memory


Slide 437 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Finding Free Frames

What options does the OS have when needing free frames?

• Terminate another process
Not acceptable. The process may already have done some work (e.g. changed a database) which may mistakenly be repeated when the process is started again.

• Swap out a process
An option only in rare circumstances (e.g. thrashing).

• Hold some frames in spare
Sooner or later the spare frames are used up. Memory utilization is lower since the spare memory is not used productively.

• Borrow frames
Yes! Take an allocated frame, use it, and give it (or another one) back to the owner later.

⇒ Page Replacement

Virtual Memory

Slide 438 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement

Page replacement scheme:

If there is a free frame, use it;

otherwise use a page-replacement algorithm to select a victim frame.

Save the victim page to disk and adjust the tables of the owner process.

Read in the desired page and adjust the tables.

This requires two page transfers (one out, one in).

Improvement
Preferably use a victim page that is clean (not modified, m = 0). Clean pages do not need to be saved to disk.

Virtual Memory

Slide 439 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement

Figure from [Sil00 p.309]

Need for page replacement: user process 1 wants to access module M (page 3), but all memory is occupied. Now a victim frame needs to be determined.

Virtual Memory

Slide 440 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement

Figure from [Sil00 p.310]

Page replacement: the victim is saved to disk (1) and the page table is adjusted (2). The desired page is read in (3) and the table is adjusted again. In this figure the victim used to be a page of the same process (or of the same segment, in case of paged segments).

Virtual Memory


Slide 441 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement

Global page replacement
The victim frame can be taken from the set of all frames, that is, one process can take a frame from another. Processes can affect each other's page fault rate, though.

Local page replacement
The victim frame may only be taken from the process's own set of frames, that is, the number of allocated frames per process does not change. No impact on other processes.

The figure on the previous slide shows a local page replacement strategy.

Virtual Memory

Slide 442 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement

Page replacement algorithms:

First-in first-out (FIFO)
and its variations second-chance and clock

Optimal page replacement (OPT)

Least Recently Used (LRU)

LRU Approximations

Desired: lowest page-fault rate!

The algorithms are evaluated by applying them to memory reference strings.

Virtual Memory

Slide 443 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Reference Strings

Assume the following address sequence (e.g. recorded by tracing the memory accesses of a process):

0100, 0432, 0101, 0612, 0102, 0103, 0104, 0101, 0611, 0102, 0103, 0104, 0101, 0610, 0102, 0103, 0104, 0101, 0609, 0102, 0105

Assuming a page size of 100 bytes, the sequence can be reduced to

1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1

This memory reference string lists the pages accessed over time (at the time steps at which the accessed page changes).

If there is only 1 frame available, the sequence causes 11 page faults. If there are 3 frames available, the sequence causes 3 page faults.

Virtual Memory
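The reduction from address trace to reference string can be sketched mechanically. This is an illustrative snippet (not from the lecture), using the slide's trace and its page size of 100 bytes:

```python
# Derive the memory reference string of the slide from the address trace,
# assuming a page size of 100 bytes (slide 443).
trace = [100, 432, 101, 612, 102, 103, 104, 101, 611, 102, 103,
         104, 101, 610, 102, 103, 104, 101, 609, 102, 105]

pages = [a // 100 for a in trace]          # page number of each access
# Collapse consecutive repeats: only page *changes* matter.
ref_string = [p for i, p in enumerate(pages) if i == 0 or p != pages[i - 1]]
print(ref_string)  # [1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1]
```

With a single frame, every entry of this string is a page fault (11 faults), matching the slide.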

Slide 444 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page faults versus number of frames

Memory Reference Strings

Figure from [Sil00 p.312]

In general, the more frames are available, the lower the expected number of page faults.


Slide 445 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

FIFO Page Replacement

Example VM.1

Principle: Replace the oldest page (old = swap-in time).

Memory reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Number of frames: 3

[Figure from [Sil00 p.313]: frame contents over time]

Total: 15 page faults.

Virtual Memory

Slide 446 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

FIFO Page Replacement

Example VM.2

Memory reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Number of frames: 3

Frame contents after each reference (oldest page first; * marks a page fault):

ref:  1  2  3  4  1  2  5  1  2  3  4  5
      1  1  1  2  3  4  1  1  1  2  5  5
         2  2  3  4  1  2  2  2  5  3  3
            3  4  1  2  5  5  5  3  4  4
      *  *  *  *  *  *  *        *  *

9 page faults

Virtual Memory

Slide 447 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

FIFO Page Replacement

Example VM.3

Memory reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (as in VM.2)
Number of frames: 4

Frame contents after each reference (oldest page first; * marks a page fault):

ref:  1  2  3  4  1  2  5  1  2  3  4  5
      1  1  1  1  1  1  2  3  4  5  1  2
         2  2  2  2  2  3  4  5  1  2  3
            3  3  3  3  4  5  1  2  3  4
               4  4  4  5  1  2  3  4  5
      *  *  *  *        *  *  *  *  *  *

10 page faults

Although we have more frames available than previously, the page fault rate did not decrease!

Virtual Memory

Slide 448 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

FIFO Page Replacement

From examples VM.2 and VM.3 it can be noticed that the number of page faults for 4 frames is greater than for 3 frames.

This unexpected result is known as Belady's Anomaly¹:

For some page-replacement algorithms the page-fault rate may increase as the number of allocated frames increases.

¹ Laszlo Belady, R. Nelson, G. Shedler: An anomaly in space-time characteristics of certain programs running in a paging machine, Communications of the ACM, Volume 12, Issue 6, June 1969, Pages 349–353, ISSN 0001-0782, also available online as PDF from the ACM.

Virtual Memory
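Belady's anomaly is easy to reproduce with a small FIFO simulator. The following sketch (not from the lecture) counts page faults for the reference string of examples VM.2 and VM.3:

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults for FIFO page replacement."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:         # must evict the oldest page
                frames.discard(queue.popleft())
            frames.add(page)
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10 <- more frames, yet more faults (Belady)
```

The same function can reproduce the curve on the next slide by sweeping `nframes` from 1 to 7.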


Slide 449 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Belady‘s Anomaly

Figure from [Sil00 p.314]

Page faults versus number of frames for the string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.

Virtual Memory

Slide 450 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Second-Chance Algorithm

This algorithm is a derivative of the FIFO algorithm:

Start with the oldest page.

Inspect the page.

If r = 0: replace the page. Done.

If r = 1: give the page a second chance by clearing r and moving the page to the top of the FIFO.

Proceed to the next oldest page.

When a page is used often enough to keep the r bit set, it will never be replaced. This avoids the problem of throwing out a heavily used page (as may happen with strict FIFO). If all pages have r = 1, however, the algorithm degenerates to FIFO.

Virtual Memory

Slide 451 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Second-Chance Algorithm

Figure from [Ta01 p.218]

Example: page A is the oldest in the FIFO (see a) and has r = 1. With pure FIFO it would have been replaced. However, as r = 1 it is given a second chance and is moved to the top of the FIFO (see b). The algorithm continues with page B.

Virtual Memory

Slide 452 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Clock Algorithm

Second chance constantly moves pages within the FIFO (overhead)! When the FIFO is arranged as a circular list, the overhead is less.

Initially the hand (a pointer) points to the oldest page. The algorithm applied is then second chance.

Figure from [Ta01 p.219]

Virtual Memory
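The circular arrangement can be sketched as follows. This is a minimal illustration (not from the lecture): pages sit in a circular buffer, and the hand sweeps over them, clearing r bits until it finds a page with r = 0.

```python
# Minimal sketch of the clock algorithm (second chance on a circular list).
class Clock:
    def __init__(self, nframes):
        self.nframes = nframes
        self.pages = []        # circular list of [page, r-bit] entries
        self.hand = 0          # points at the candidate victim

    def access(self, page):
        """Reference a page; return True if this access caused a fault."""
        for entry in self.pages:
            if entry[0] == page:       # hit: set the reference bit
                entry[1] = 1
                return False
        if len(self.pages) < self.nframes:    # free frame available
            self.pages.append([page, 1])
            return True
        while self.pages[self.hand][1] == 1:  # give second chances
            self.pages[self.hand][1] = 0
            self.hand = (self.hand + 1) % self.nframes
        self.pages[self.hand] = [page, 1]     # replace the victim
        self.hand = (self.hand + 1) % self.nframes
        return True

c = Clock(3)
faults = sum(c.access(p) for p in [1, 2, 3, 2, 4, 1])
print(faults)  # 5 page faults for this trace
```

Note that no page is ever moved within the list; only the hand advances, which is the overhead saving over plain second chance.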


Slide 453 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Optimal Page Replacement

Example VM.4

Principle: Replace the page that will not be used for the longest time.

Memory reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Number of frames: 3

[Figure from [Sil00 p.315]: frame contents over time]

Total: 9 page faults.

Slide 454 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

LRU Page Replacement

Example VM.5

Principle: Replace the page that has not been used for the longest time.

Memory reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Number of frames: 3

[Figure from [Sil00 p.315]: frame contents over time]

Virtual Memory

Slide 455 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

LRU Page Replacement

Possible LRU implementations:

• Counter implementation
Every page table entry has a counter field. The system hardware must have a logical counter. With each page access the counter value is copied to the entry.
– Update on each page access required
– Searching the table for finding the LRU page
– Must account for counter overflow

• Stack implementation
Keep a stack containing all page numbers. Each time a page is referenced, its number is searched and moved to the top. The top holds the MRU page, the bottom holds the LRU page.
– Update on page access required
– Searching the stack for the current page number

Virtual Memory
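The stack implementation maps naturally onto an ordered dictionary. The following sketch (not from the lecture; names are illustrative) moves each referenced page to the MRU end and evicts from the LRU end:

```python
from collections import OrderedDict

# Sketch of the stack implementation of LRU: most recently used page at
# one end, least recently used page at the other.
class LRU:
    def __init__(self, nframes):
        self.nframes = nframes
        self.stack = OrderedDict()   # first item = LRU page, last = MRU

    def access(self, page):
        """Reference a page; return True if this access caused a fault."""
        fault = page not in self.stack
        if fault and len(self.stack) == self.nframes:
            self.stack.popitem(last=False)   # evict least recently used
        self.stack.pop(page, None)
        self.stack[page] = True              # move page to the top (MRU)
        return fault

lru = LRU(3)
refs = [1, 2, 3, 1, 4]       # '2' is least recently used when 4 arrives
faults = [lru.access(p) for p in refs]
print(faults)            # [True, True, True, False, True]
print(list(lru.stack))   # [3, 1, 4] -> 2 was evicted
```

The move-to-top on every access is exactly the update cost the slide mentions as the drawback of this scheme.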

Slide 456 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

LRU Page Replacement

Example of the stack implementation principle

bottom of stack

Figure from [Sil00 p.317]

Virtual Memory


Slide 457 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

LRU Approximation

Not many systems provide sufficient hardware support for true LRU page replacement. ⇒ Approximate LRU!

• Use reference bit
When looking for the LRU page, take a page with r = 0.
No ordering among the pages (only used and unused).

• History field
Each page table entry has a history field h (e.g. a byte).
When the page is accessed, the most significant bit (e.g. bit 7) is set.
Periodically (e.g. every 100 ms) the bits are shifted right.
When looking for the LRU page, take the page with the smallest unsigned_int(h).
Better ordering among the pages (256 history values).

Virtual Memory

Slide 458 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

LRU Approximation

History field examples:

00000000 = not used in the last 8 time periods
11111111 = used in each of the past 8 periods
01001000 = used in the last period and in the fifth-last period

[Table: example history fields with their values as unsigned integers; the page with the smallest value is chosen as victim]

Virtual Memory
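The history-field (aging) scheme can be sketched in a few lines. This is an illustrative snippet (not from the lecture): each period, every history byte is shifted right and the reference bit is OR-ed into bit 7.

```python
# Sketch of the aging (history-field) LRU approximation for one page.
def age(history, referenced):
    """One period: shift the byte right, OR the reference bit into bit 7."""
    history >>= 1
    if referenced:
        history |= 0b1000_0000
    return history & 0xFF      # keep it an 8-bit field

h = 0
for r in [1, 1, 0, 1]:         # page referenced in periods 1, 2 and 4
    h = age(h, r)
print(f"{h:08b}")  # 10110000 -> larger value = more recently used
```

Comparing these bytes as unsigned integers gives the ordering the slide describes: the page with the smallest value is the victim.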

Slide 459 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement

[Plot: page faults per 1000 references versus number of frames allocated (6 ... 14), for FIFO, Clock, LRU and OPT]

Figure from lecture slides WS 05/06

Exemplary page fault rates. Differences are noticeable only for smaller numbers of frames.

Virtual Memory

Slide 460 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Page Replacement – Algorithms Summary

First-in first-out (FIFO)
Simplest algorithm, easy to implement, but has the worst performance. The clock version is somewhat better as it does not replace busy pages.

Optimal page replacement (OPT)
Not of practical use as one must know the future! Used for comparisons only. Lowest page fault rate of all algorithms.

Least Recently Used (LRU)
The best usable algorithm, but requires much execution time or highly sophisticated hardware.

LRU Approximations
Slightly worse than LRU, but faster. Applicable in practice.

Virtual Memory


Slide 461 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Thrashing

Figure from [Sil00 p.326]

When the number of allocated frames falls below a certain number of pages actively used by a process, the process will cause page fault after page fault.

This high paging activity is called thrashing.

A too high degree of multiprogramming results in thrashing, because each process does not have „enough“ frames.

Virtual Memory

Slide 462 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Thrashing – Countermeasures

• Switch to local page replacement
A thrashing process cannot steal frames from others. The page device queue (used by all) however is still full of requests – lowering overall system performance.

• Swap out
The thrashing process or some other process can be swapped out for a while. The choice depends on process priorities.

• Assign „sufficient“ frames
How many frames are sufficient?

Working-set model: All page references are monitored (online memory reference string creation). The pages recently accessed form the working set. Its size is used as the number of ‚sufficient‘ frames.

Virtual Memory

Slide 463 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Working-Set

The working-set model uses a parameter ∆ to define the working-set window. The set of pages referenced within the last ∆ references defines the working set WS. The OS allocates to the process enough frames to maintain the size of the working set. Keeping track of the working set requires the observation of memory accesses (constantly or in time intervals).

Figure from [Sil00 p.328]

Virtual Memory
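The working set at a given time is simply the set of distinct pages in the last ∆ references. A minimal sketch (not from the lecture; the reference string is made up):

```python
# Sketch of the working-set model: the working set at time t is the set
# of distinct pages referenced within the last DELTA references.
def working_set(refs, t, delta):
    return set(refs[max(0, t - delta + 1): t + 1])

refs = [1, 2, 1, 5, 7, 7, 7, 7, 5, 1]
print(working_set(refs, 9, 4))  # pages in the last 4 references: {7, 5, 1}
```

The OS would allocate `len(working_set(...))` frames to this process; if the sum of working-set sizes exceeds the number of physical frames, some process must be swapped out to prevent thrashing.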

Slide 464 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Program Locality

Demand paging is transparent to the user program. A program however can enforce locality (at least for data).

Assume a page size of 128 words and consider the following program (Program A), which clears the elements of a 128 x 128 matrix column-wise:

int A[][] = new int[128][128];

for (int j = 0; j < 128; j++)      // column
    for (int i = 0; i < 128; i++)  // row
        A[i][j] = 0;

Virtual Memory


Slide 465 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Program Locality

In row-major storage, a multidimensional array is laid out in linear memory such that rows are stored one after the other. It is the approach used by C, Java, and many other languages, with the notable exception of Fortran.

For example, the matrix

1 2 3
4 5 6

is defined in C as

int A[2][3] = { {1,2,3}, {4,5,6} };

and is stored in memory row-wise: 1, 2, 3 (row 1), then 4, 5, 6 (row 2), from low to high addresses.

Virtual Memory

Slide 466 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Program Locality

Thus, each row of the 128 x 128 matrix occupies one page.

If the operating system allocates only one frame (for the data) to process A, the process will cause 128 x 128 = 16384 page faults!

This is because the process clears one word in each page (word j of rows i, i+1, i+2, ...), then the next word, thus „jumping“ from page to page in the inner loop:

for (int j = 0; j < 128; j++)
    for (int i = 0; i < 128; i++)
        A[i][j] = 0;

Virtual Memory

Slide 467 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Program Locality

By changing the loop order, the process first finishes one page before going to the next (Program B):

int A[][] = new int[128][128];

for (int i = 0; i < 128; i++)
    for (int j = 0; j < 128; j++)
        A[i][j] = 0;

Now, if the operating system allocates only one frame (for the data) to process B, the process will cause only 128 page faults!

Virtual Memory
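The fault counts of the two programs can be verified with a small simulation. This sketch (not from the lecture) assumes, as the slides do, a page size of one matrix row (128 words), row-major storage, and a single allocated data frame:

```python
# Page faults for the two loop orders of the matrix example, assuming
# one matrix row per page (row-major storage) and one allocated frame.
def count_faults(order, n=128):
    current_page, faults = None, 0
    for i, j in order(n):
        page = i                 # A[i][j] lies in page i (row-major)
        if page != current_page: # with one frame, any page change faults
            faults += 1
            current_page = page
    return faults

col_major = lambda n: ((i, j) for j in range(n) for i in range(n))  # Program A
row_major = lambda n: ((i, j) for i in range(n) for j in range(n))  # Program B

print(count_faults(col_major))  # 16384 = 128 * 128
print(count_faults(row_major))  # 128
```

The only difference between the two generators is the nesting order of the loops, exactly as on the slides.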

Slide 468 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Program Locality

Locality is also influenced by the addressing modes at machine instruction level.

Consider a three-address instruction, such as ADD A,B,C, which performs C := A + B. In the worst case the operands A, B, C are located in 3 different pages.

Another example is the PDP-11 instruction MOV @(R1)+,@(R2)+ (addressing mode 3 for the source operand), which in the worst case straddles 6 pages.

Virtual Memory


Slide 469 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Virtual Memory

• Separation of logical and physical memory
The user / programmer can think of an extremely large virtual address space.

• Pure paging / paged segments
Virtual memory can be implemented upon both memory allocation schemes.

• Execution of large programs
which do not fit into physical memory in their entirety.

• Better multiprogramming
as there can be more programs in memory.

• Not suitable for hard real-time systems!
Virtual memory is the antithesis of hard real-time computing. This is because the response times cannot be guaranteed, owing to the fact that processes may influence each other (page device queue, thrashing, ...).

Slide 470 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture

Memory (353)

Paging (381)

Segmentation (400)

Paged Segments (412)

Virtual Memory (419)

Caches (471)

Slide 471 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Memory Hierarchy

Memory hierarchy levels in typical desktop / server computers, figure from [HP06 p.288]

The farther away from the CPU, the larger and slower the memory. The hierarchy is a consequence of locality.

Caches

Slide 472 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Locality Principle

Programs tend to reuse data and instructions.

Rule of thumb:

A program spends 90% of its execution time in only 10% of the code.

Temporal locality: recently accessed items are likely to be accessed in near future.

Spatial locality: items whose addresses are near one another tend to be referenced close together in time.

[HP06 p.38]

Caches


Slide 473 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Locality Principle

Example of a memory-access trace of a process

Figure from [Sil00 p.327]

Slide 474 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches

Cache: a safe place for hiding or storing things
Webster's Dictionary [HP06 p. C-1]

Here: fast memory that stores copies of data from the most frequently used main memory locations. Used by the CPU to reduce the average time to access memory locations.

Effect: instructions (in execution) can proceed quicker.
Instruction fetch is quicker; memory operands are accessed quicker.

Result: faster program execution (from the CPU's point of view) ⇒ improved system performance

Slide 475 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cached Memory Access

• CPU requests content from a memory location

• Cache is checked for this datum

• When present, deliver datum from cache

• When not, transfer datum from main memory to cache

• Then deliver from cache to CPU

Steps in accessing memory (here: reading from memory), simplified.

Caches
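The read steps above can be sketched as follows. This is an illustrative model (not from the lecture): main memory is a byte array, the cache is a dictionary keyed by block address, and the block size is an assumed value.

```python
# Sketch of a cached memory read: check the cache, fill the block from
# memory on a miss, then deliver the byte from the cache.
BLOCK = 16                       # illustrative block size in bytes

memory = bytes(range(256))       # stand-in for main memory
cache = {}                       # block address -> cached block (data area)

def read(addr):
    block_addr, offset = divmod(addr, BLOCK)
    if block_addr not in cache:                          # cache miss
        start = block_addr * BLOCK
        cache[block_addr] = memory[start:start + BLOCK]  # fill from memory
    return cache[block_addr][offset]                     # deliver from cache

print(read(35))   # address 35 -> block 2, offset 3 -> memory[35] = 35
```

A second `read(35)` would hit in the cache and skip the memory transfer, which is the whole point of the scheme.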

Slide 476 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches

To take advantage of spatial locality, a cache contains blocks of data rather than individual bytes. A block is a contiguous line of processor words; it is also called a cache line.

Common block sizes: 8 ... 128 bytes

[Figure: block transfer between main memory and cache, word transfer between cache and CPU]

Cache components:
• Data area
• Tag area
• Attribute area


Slide 477 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Data Area

All blocks in the cache make up the data area.

[Figure: blocks 0, 1, 2, ..., B–1, each N bytes]

Cache capacity = B · N bytes

Caches

Slide 478 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Tag Area

The block addresses of the cached blocks make up the tags of the cache lines¹. All tags form the tag area.

[Figure: data area (blocks 0 ... B–1, N bytes per block) with a tag per block]

¹ This statement is slightly simplified. In real caches, often just a fraction of the block address is used as tag.

Caches

Slide 479 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Attribute Area

The attribute area contains attribute bits (V, D) for each cache line.

[Figure: data area, tag area, and attribute bits V and D per block 0 ... B–1]

• Validity bit V
indicates whether the cache line holds valid data:
V = 1 ⇒ data is valid; V = 0 ⇒ data is invalid

• Dirty bit D
indicates whether the cache line data is modified with respect to main memory:
D = 1 ⇒ data is modified; D = 0 ⇒ data is not modified

Caches

Slide 480 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches. Each cache line plus its tag plus its attributes forms a slot.

[Figure: a cache slot comprises the attribute bits (V, D), the tag, and the cache line (N bytes); slots 0, 1, ..., B–1]


Slide 481 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches. How to find a certain byte in the cache?

• Block address is compared against all tags simultaneously

• In case of a match (cache hit), the offset selects the byte

The address generated by the CPU is divided into two fields.

• High order bits make up the block address
• Low order bits determine the offset within that block

[Address layout (m bits total): block address (m − n bits) | offset (n bits)]

Remark: CPU address space = 2m, Cache line size (block size) = 2n

Caches

Slide 482 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Block Address. Memory can be considered as an array of blocks.

[Figure: memory as an array of 4-byte blocks. Memory addresses 0, 4, 8, 12, ..., 36 (decimal) are the start addresses of the blocks with block addresses 0, 1, 2, ..., 9.]

The block address should not be confused with the memory address at which the block starts. The block address is a block number:

block address = memory address DIV block size
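The DIV/MOD arithmetic can be sketched in a few lines of Python (block size 4 bytes, matching the figure; the function names are ours):

```python
def block_address(mem_addr, block_size=4):
    # Block number of the block containing the byte at mem_addr.
    return mem_addr // block_size

def block_offset(mem_addr, block_size=4):
    # Position of the byte within its block.
    return mem_addr % block_size

# The byte at memory address 13 lies in block 3 (bytes 12..15), at offset 1.
```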

Slide 483 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches

[Figure: cache mechanism. The CPU memory address is split into block address and offset. A comparator matches the block address against all tags (with their V and D bits), yielding hit/miss; on a hit, the offset selects the requested data from the cache line (data out).]

Slide 484 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Hit Rate

Cache capacity is smaller than the capacity of main memory.

Consequently, not all memory locations can be mirrored in the cache. When a required datum is found in the cache, we have a cache hit, otherwise a cache miss.

The miss rate is the fraction of cache accesses that result in a miss.

The hit rate is the fraction of cache accesses that result in a hit.

Hit rate = (number of hits) / (number of memory accesses)


Slide 485 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Amdahl's Law

Used to find the maximum expected improvement to an overall system when a part of the system is improved. The law is a general law, not restricted to caches or computers.

I = 1 / ((1 − P) + P/S)

P: proportion of the system improved, 0 ≤ P ≤ 1
S: speedup of that proportion, S > 0, usually S > 1
I: maximum expected improvement, I > 0 (usually I > 1)

Slide 486 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Amdahl's Law. Example: „30% of the computations can be made twice as fast."

⇒ P = 0.3, S = 2

Improvement I = 1 / ((1 − 0.3) + 0.3/2) = 1 / (0.7 + 0.15) ≈ 1.177

Amdahl's Law in the special case of parallelization:

I = 1 / (F + (1 − F)/N)

F: proportion of sequential calculations (no speedup possible), 0 ≤ F ≤ 1
N: grade of parallelism (e.g. N processors), N > 0

See lecture „AdvancedComputer Architecture“

Caches
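Both formulas can be checked with a small Python sketch (function names are ours):

```python
def amdahl(P, S):
    # Amdahl's law: overall improvement I when a proportion P
    # of the system is sped up by factor S.
    return 1.0 / ((1.0 - P) + P / S)

def amdahl_parallel(F, N):
    # Special case: proportion F is sequential, the rest runs on N units.
    return 1.0 / (F + (1.0 - F) / N)

# 30% of the computations twice as fast: amdahl(0.3, 2) = 1/0.85, about 1.18
```

Note that amdahl_parallel(F, N) is just amdahl(P, S) with P = 1 − F and S = N.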

Slide 487 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches

Example: Assume cache access = 1 ns, main memory access = 100 ns, 90% hit rate. What is the overall improvement?

P = 0.9, S = 100 ns / 1 ns = 100

I = 1 / ((1 − 0.9) + 0.9/100) = 1 / 0.109 ≈ 9.17

Typical memory hierarchy:

               CPU (registers)  Cache (SRAM)  Main memory (DRAM)  I/O devices (disks)
Access time    250 ps           1 ns          100 ns              10 ms
Memory space   500 Byte         64 kB         1 GB                1 TB

Memory accesses (as seen by the CPU) now are more than 9 times as fast as without a cache.

Slide 488 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Read Access: reading from memory (improvement).

• CPU requests datum

• Search cache while fetching block from memory

• Cache hit: deliver datum, discard fetched block

• Cache miss: put block in cache and deliver datum

In case of a hit, the datum is available quickly. In case of a miss there is no benefit from the cache, but also no harm.

Things are not that easy when writing into memory. Let's look at the cases of a write hit and a write miss.

Caches


Slide 489 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Write Hit Policy. Assume a write hit. How to keep cache and main memory consistent on write-accesses?

[Figure: CPU → Cache → Write Buffer → Memory]

• Write through: The datum is written to both the block in the cache and the block in memory.
  Cache always clean (no dirty bit required). CPU write stall (problem reduced through a write buffer). Main memory always has the most current copy (cache coherency in multi-processor systems).

• Write back: The datum is only written to the cache (dirty bit is set). The modified block is written to main memory once it is evicted from the cache.
  Write speed = cache speed. Multiple writes to the same block still result in only one write to memory. Less memory bandwidth needed.

Caches

Slide 490 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Write Miss Policy. Assume a write miss. What to do?

• Write allocate: The block containing the referenced datum is transferred from main memory to the cache. Then one of the write hit policies is applied. Normally used with write-back caches.

• No-write allocate: Write misses do not affect the cache. Instead the datum is modified only in main memory. Write hits however do affect the cache. Normally used with write-through caches.

Caches

Slide 491 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Write Miss Policy. Assume an empty cache and the following sequence of memory operations. What are the numbers of hits and misses when using no-write allocate versus write allocate?

Operation        No-write allocate   Write allocate
WriteMem[100]    miss                miss
WriteMem[100]    miss                hit
ReadMem[200]     miss                miss
WriteMem[200]    hit                 hit
WriteMem[100]    miss                hit

Caches
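The hit/miss sequence for both policies can be reproduced with a toy simulation (our own sketch; it models the cache as a plain set of block addresses and ignores eviction):

```python
def simulate(ops, write_allocate):
    # ops: list of ("R"/"W", block address); returns "hit"/"miss" per op.
    cached = set()      # blocks currently in the cache (no eviction modelled)
    outcome = []
    for op, block in ops:
        if block in cached:
            outcome.append("hit")
        else:
            outcome.append("miss")
            # Reads always allocate; writes allocate only under write allocate.
            if op == "R" or write_allocate:
                cached.add(block)
    return outcome

ops = [("W", 100), ("W", 100), ("R", 200), ("W", 200), ("W", 100)]
```

simulate(ops, write_allocate=False) yields miss, miss, miss, hit, miss; with write_allocate=True it yields miss, hit, miss, hit, hit, matching the table.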

Slide 492 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Caches

[Figure: main memory blocks mapped into a much smaller cache]

• Where exactly are the blocks placed in the cache? ⇒ Cache Organization
• What if the cache is full? ⇒ Replacement Strategies


Slide 493 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Organization. Where can a block be placed in the cache?

• Direct Mapped: With this mapping scheme a memory block can be placed in only one particular slot. The slot number is calculated from
  ((memory address) DIV (block size)) MOD (slots in cache)

• Fully Associative: The block can be placed in any slot.

• Set Associative: The block can be placed in a restricted set of slots. A set is a group of slots. The block is first mapped onto the set and can then be placed anywhere within the set. The set number is calculated from
  ((memory address) DIV (block size)) MOD (number of sets in cache)

Caches

Slide 494 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Direct Mapped

Each memory block is mapped to exactly one slot in the cache (many-to-one mapping). If the slot is occupied (V = 1), the cache line is evicted.

[Figure: memory addresses 0, 4, 8, ..., 28 (4-byte blocks) mapped onto cache slots 0–3. Block size = 4 byte, cache capacity = 4 · 4 = 16 byte.]

Caches

Slide 495 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Direct Mapped

Examples. Slot = ((memory address) DIV (block size)) MOD (slots in cache)

• In which slot goes the block located at address 12D?
  12 DIV 4 = 3, 3 MOD 4 = 3 (slot 3)

• In which slot goes the block located at address 20D?
  20 DIV 4 = 5, 5 MOD 4 = 1 (slot 1)

• Where goes the byte located at address 23D?
  23 DIV 4 = 5, 5 MOD 4 = 1, 23 MOD 4 = 3
  The byte goes in cache line (slot) 1 at offset 3

offset within slot = (memory address) MOD (block size)

Caches
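The same arithmetic as a small Python sketch (4-byte blocks and 4 slots as in the examples; the function name is ours):

```python
def slot_and_offset(addr, block_size=4, num_slots=4):
    # Direct mapping: slot = (addr DIV block size) MOD slots in cache,
    # offset = addr MOD block size.
    slot = (addr // block_size) % num_slots
    offset = addr % block_size
    return slot, offset

# Address 23 -> slot 1, offset 3 (binary 10111: slot bits 01, offset bits 11)
```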

Slide 496 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Direct Mapped: extracting slot number and offset directly from the memory address.

Example: Where goes the byte located at address 23D?

23D = 10111B ⇒ offset bits 11B = 3, slot bits 01B = 1 ⇒ slot 1, offset 3

The lower bits of the block address select the slot. The size of the slot field depends on the number of slots (size = ld(number of slots)). ld = logarithmus dualis (base 2).

[Address layout (m bits total): tag bits | slot | offset (n bits); tag and slot together form the block address (m − n bits)]

Caches


Slide 497 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Direct Mapped

[Figure: 64 kByte cache using four-word (16 Byte) blocks; figure from lecture CA WS05/06. The 32-bit address is divided into byte offset (bits 0–1), word offset (bits 2–3), slot (bits 4–15, 4 K lines) and tag (bits 16–31). The stored 16-bit tag and the valid bit are checked against the address tag (hit); a multiplexer selects one of the four 32-bit words from the 128-bit cache line.]

Caches

Slide 498 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Direct Mapped: explanations for the previous slide.

Logical address space of CPU: 2^32 byte.

Number of cache slots: 64 kB / 16 Byte = 4K = 4096 slots.

Bits 0–1 determine the position of the selected byte in a word. However, as the CPU uses 4-byte words as smallest entity, the byte offset is not used.

Bits 2–3 determine the position of the word within a cache line.

Bits 4 to 15 (12 bits) determine the slot: 2^12 = 4K = number of slots.

Bits 16 to 31 are compared against the tags to see whether or not the block is in the cache.

Caches

Slide 499 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Fully Associative

A memory block can go in any cache slot (many-to-many mapping).

[Figure: memory blocks at addresses 0, 4, 8, ..., 36 can each go into any of the 4 cache slots (4 choices).]

Slot selection:
• check all tags (preferably simultaneously)
• take a slot with V = 0 (a free slot)
• otherwise select a slot according to some replacement strategy

Caches

Slide 500 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Set Associative

A memory block goes into a set, and can be placed anywhere within the set (many-to-some mapping).

[Figure: 2-way set associative cache. Slots 0 and 1 form set 0; slots 2 and 3 form set 1. Memory blocks at addresses 0, 4, 8, ..., 36 map onto the sets.]

Slot selection:
• determine the set from the block address
• in this set, take a free slot ...
• ... or evict a slot according to some replacement strategy

Caches


Slide 501 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Set Associative

Set = ((memory address) DIV (block size)) MOD (sets in cache)

Example: In which set goes the block located at address 12D?
12 DIV 4 = 3 (block address), 3 MOD 2 = 1 (set 1). In which slot the block finally goes depends on occupation and replacement strategy.

[Address layout: tag bits | set | offset]

Similar to direct mapping, the low order bits of the block address determine the destination set.

Caches
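A one-line sketch of the set calculation (2 sets and 4-byte blocks as in the example; the function name is ours):

```python
def target_set(addr, block_size=4, num_sets=2):
    # The low-order bits of the block address select the set.
    return (addr // block_size) % num_sets

# Address 12 -> block address 3 -> set 1
```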

Slide 502 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Set Associative: N-way set associative cache.

N = number of slots per set, not the number of sets. N is a power of 2; common values are 2, 4, 8.

Extremes:
• N = 1: There is only one slot per set, that is, each slot is a set. The set number (thus the slot) is drawn from the block address. ⇒ Direct Mapped
• N = B: There is only one set containing all slots (B = number of blocks in cache = number of slots). ⇒ Fully Associative

Caches

Set Associative

AMD Opteron Cache: two-way set associative. Figure from [HP06 p. C-13].

Caches

Slide 504 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Set Associative: the Opteron cache.

The physical address is 40 bits. It is divided into a 34-bit block address (subdivided into 25 tag bits and 9 index bits) and a 6-bit byte offset.

Cache capacity: 64 kB in 64-byte blocks (1024 slots).

The cache is two-way set associative: ⇒ 512 sets of 2 cache lines each.

Hardware: two arrays of 512 cache lines each, that is, each set has one cache line in array 1 and one in array 2.

Figure: The index selects the set (2^9 = 512). The two tags of the set are compared against the tag bits; the valid bit must be set for a hit. On a hit, the corresponding data is delivered using the winning input of a 2:1 multiplexer. The data goes to „Data in" of the CPU. The victim buffer is needed when a cache line has to be written back to main memory (replacement).

Caches


Slide 505 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Organization

[Figure: where can block 12 of main memory be placed in the cache, depending on the organization? Figure from [HP06 p. C-7]]

Caches

Slide 506 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Organization. For the previous figure, assume block 12 and block 20 being used very often. What is the problem?

Fully associative: No special problem. Both blocks can be stored in the cache at the same time.

Direct mapped: Problem! Only one of them can be stored at a time, since both map to the same slot: 12 mod 8 = 20 mod 8 = 4.

Set associative: No special problem. Both blocks can be stored in the same set at the same time.

Caches

Slide 507 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Organization

Direct Mapped: hard allocation (no choice). Simple & inexpensive. No replacement strategy required. If a process uses 2 blocks mapping to the same slot, cache misses are high.

Fully Associative: full choice. Expensive (hardware) searching for a free slot. Replacement strategy required.

Set Associative: compromise between direct mapped and fully associative. Some choice. Replacement strategy required.

Caches

Slide 508 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Replacement Strategies: strategies for selecting a slot to evict (when necessary).

• Random: Victim cache lines are selected randomly. A hardware pseudo-random number generator generates slot numbers.

• Least-Recently Used (LRU): Relies on the temporal locality principle. The least recently used block is hoped to have the smallest likelihood of (re)usage. Expensive hardware.

• First-in, First-out (FIFO): Approximation of LRU by selecting the oldest block (oldest = load time).
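LRU bookkeeping can be illustrated with a toy model of a single cache set (our own sketch, not hardware-accurate):

```python
from collections import OrderedDict

class LRUSet:
    # Toy model of one N-way cache set with LRU replacement;
    # it stores block tags only, no data.
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()   # least recently used tag comes first

    def access(self, tag):
        # Returns "hit" or "miss"; on a miss the least recently used
        # block is evicted when the set is full.
        if tag in self.blocks:
            self.blocks.move_to_end(tag)   # mark as most recently used
            return "hit"
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)   # evict the LRU block
        self.blocks[tag] = True
        return "miss"
```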


Slide 509 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Replacement Strategies

Table: Data cache misses per 1000 instructions. Data from [HP06 p. C-10], collected for the Alpha architecture, block size = 64 byte.

            Two-way                Four-way               Eight-way
Capacity    LRU    Rand   FIFO    LRU    Rand   FIFO    LRU    Rand   FIFO
16 kB       114.1  117.3  115.5   111.7  115.1  113.3   109.0  111.8  110.4
64 kB       103.4  104.3  103.9   102.4  102.3  103.1    99.7  100.5  100.3
256 kB       92.2   92.1   92.5    92.1   92.1   92.5    92.1   92.1   92.5

• LRU is best for small caches
• little difference between all strategies for large caches

Caches

Slide 510 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Miss Categories

• Compulsory Misses: The very first access to a block cannot be in the cache, so the block must be loaded. Also called cold-start misses.

• Capacity Misses: Owing to the limited capacity of the cache, capacity misses will occur in addition to compulsory misses.

• Conflict Misses: In set associative or direct mapped caches too many blocks may map to the same set (or slot). Also called collision misses.

• Coherency Misses: owing to cache flushes to keep multiple caches coherent in a multiprocessor. Not considered in this lecture (see lecture „Advanced Computer Architecture").

Caches

Slide 511 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Optimization

Average memory access time = hit time + miss rate · miss penalty

Reducing miss rate:
• larger block size
• larger cache capacity
• higher associativity

Reducing miss penalty:
• multilevel caches
• read over write

Reducing hit time:
• avoiding address translation

Caches

Slide 512 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Block Size

The data area gets larger cache lines (but fewer lines); the overall cache capacity remains the same. Common block sizes are 32 ... 128 bytes.

• reduced miss rate, taking advantage of spatial locality: more accesses will likely go to the same block
• increased miss penalty: more bytes have to be fetched from main memory
• increased conflict misses: the cache has fewer slots (per set)
• increased capacity misses: only for small caches. In case of high locality (e.g. repeated access to only one byte in a block) the remaining bytes are unused and waste cache capacity.

Caches


Slide 513 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Block Size

[Figure: miss rate versus block size, for several cache capacities; from HP06 p. C-26]

Caches

Slide 514 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Block Size

Average memory access time = hit time + miss rate · miss penalty

For the previous figure, assume the memory system takes 80 clock cycles of overhead and then delivers 16 bytes every 2 clock cycles. Assume the hit time to be 1 clock cycle independent of block size. Which block size has the smallest average memory access time?

4K cache, 16 byte block:

Average memory access time = 1 + (8.57 % · 82) = 8.027 clock cycles

4K cache, 32 byte block:

Average memory access time = 1 + (7.24 % · 84) = 7.082 clock cycles

... and so on for all cache sizes and block sizes.

Caches
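The calculation can be replayed for any block size (a sketch under the stated assumptions: 80 cycles overhead, then 16 bytes every 2 cycles, hit time 1 cycle):

```python
def miss_penalty(block_size):
    # 80 clock cycles overhead, then 16 bytes every 2 clock cycles.
    return 80 + 2 * (block_size // 16)

def amat(hit_time, miss_rate, penalty):
    # Average memory access time = hit time + miss rate * miss penalty.
    return hit_time + miss_rate * penalty

# 4K cache, 16-byte blocks, 8.57% miss rate: 1 + 0.0857 * 82 = 8.027 cycles
```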

Slide 515 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Block Size

Average memory access time (in clock cycles) versus block size, for 4 different cache capacities. The best (smallest) access time per column, thus per cache, is marked green in the original slide.

Block size   Miss penalty      4K      16K     64K     256K
16           82             8.027   4.231   2.673   1.894
32           84             7.082   3.411   2.134   1.588
64           88             7.160   3.323   1.933   1.449
128          96             8.469   3.659   1.979   1.470
256          112           11.651   4.685   2.288   1.549

Caches

Slide 516 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Capacity

The cache is enlarged by adding more cache slots.

• reduced miss rate, owing to fewer capacity misses
• potentially increased hit time, owing to increased complexity
• increased hardware & power consumption

Miss rates for block size 64 bytes:

Cache capacity    4K      16K     64K     256K
Miss rate        7.00%   2.64%   1.06%   0.51%

(each quadrupling of the capacity leaves 38 %, 40 % and 48 % of the previous miss rate)

Caches


Slide 517 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Associativity

The higher the associativity, the more slots per set. Common associativities are 1 (direct mapped), 2, 4, 8.

• reduced miss rate, primarily owing to fewer conflict misses
• increased hit time: time needed for finding a free slot in the set

Rules of Thumb:
• Eight-way set associative is almost as effective as fully associative.
• A direct mapped cache with capacity N has about the same miss rate as a two-way set associative cache of capacity N/2.

Caches

Slide 518 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Associativity

Miss rate [%] versus degree of associativity, for six cache capacities. Data from [HP06 p. C-23].

Degree    4K     8K     16K    64K    128K   512K
1-way     9.8    6.8    4.9    3.7    2.1    0.8
2-way     7.6    4.9    4.1    3.1    1.9    0.7
4-way     7.1    4.4    4.1    3.0    1.9    0.6
8-way     7.1    4.4    4.1    2.9    1.9    0.6

Caches

Slide 519 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Associativity

[Figure: hit time versus associativity]

Caches

Slide 520 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Caches: building a cache hierarchy.

[Figure: CPU → L1 → L2 → L3 → Main Memory]

• First-level cache (L1): small high speed cache, usually located in the CPU
• Second-level cache (L2): fast and bigger cache located close to the CPU (chip set)
• Third-level cache (L3), optional: separate memory chip between L2 and main memory

Caches


Slide 521 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Caches. Multi-level caches reduce the average miss penalty because on a miss the block can be fetched from the next cache level instead of from main memory.

Distinction between local and global cache considerations (local misses versus global references):

Local miss rate = (number of cache misses) / (number of cache accesses)   — local to a cache (e.g. L1, L2, ...)

Global miss rate = (number of cache misses) / (number of memory references by CPU)

Caches

Slide 522 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Caches. Example CH.1: Suppose that in 1000 memory references there are 40 misses in L1 and 20 misses in L2. What are the various miss rates?

[Figure: CPU → L1 → L2 → Main Memory]

Local miss rate L1 = global miss rate L1 = 40 / 1000 = 4 %   (these 4 % go from L1 to L2)

Local miss rate L2 = 20 / 40 = 50 %

Global miss rate L2 = 20 / 1000 = 2 %   (these 2 % go from L2 to main memory)

The local miss rate of L2 is large because L1 skims the cream of the memory accesses.

Caches

Slide 523 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Caches

[Figure: CPU → L1 → L2 → Main Memory]

Average memory access time = hit time L1 + miss rate L1 · miss penalty L1

miss penalty L1 = hit time L2 + local miss rate L2 · miss penalty L2

⇒ Average memory access time
  = hit time L1 + miss rate L1 · (hit time L2 + local miss rate L2 · miss penalty L2)

Caches

Slide 524 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Multi-Level Caches. Using the miss rates from example CH.1, and the following data:

hit time L1 = 1 clock cycle,
hit time L2 = 10 clock cycles,
miss penalty L2 = 200 clock cycles,

the average memory access time is

hit time L1 + miss rate L1 · (hit time L2 + local miss rate L2 · miss penalty L2)
= 1 + 0.04 · (10 + 0.5 · 200) = 5.4 clock cycles

Caches
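The two-level formula as a sketch, replaying example CH.1 (L1 miss rate 4%, L2 local miss rate 50%; with hit time L1 = 1, hit time L2 = 10 and miss penalty L2 = 200 this gives 1 + 0.04 · (10 + 0.5 · 200) = 5.4 clock cycles):

```python
def two_level_amat(hit_l1, miss_rate_l1, hit_l2, local_miss_l2, penalty_l2):
    # An L1 miss costs the L2 hit time plus, on an L2 miss in turn,
    # the main-memory miss penalty.
    return hit_l1 + miss_rate_l1 * (hit_l2 + local_miss_l2 * penalty_l2)
```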


Slide 525 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Read over Write

Assume a direct mapped write-through cache with 512 slots, and a four-word write buffer that is not checked on a read miss.

[Figure: CPU → Cache → Write Buffer → Memory]

SW R3, 512(R0)    ; mem[512] := R3   (cache slot 0)   store word
LW R1, 1024(R0)   ; R1 := mem[1024]  (cache slot 0)   load word
LW R2, 512(R0)    ; R2 := mem[512]   (cache slot 0)

Read-after-write hazard: The data in R3 is placed in the write buffer. The first load causes a read miss; the cache line is discarded. The second load again causes a read miss. If the write buffer has not completed writing R3 into memory, that load will read an incorrect value from mem[512].

Caches

Slide 526 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Read over Write: solutions to the previous problem („giving reads priority over writes").

• Read misses wait until the write buffer is empty; thereafter the required memory block is fetched into the cache.

• Check the contents of the write buffer: if the referenced data is not in the buffer, let the read access continue fetching the block into the cache. The write buffer is flushed later when the memory system is available.

Also applicable to write-back caches: the dirty block is put into a write buffer that allows inspection in case of a read miss. Read misses check the buffer before going directly to memory.

Caches

Slide 527 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Translation. What addresses are cached, virtual or physical addresses? A fully virtual cache uses logical addresses only; a fully physical cache uses physical addresses only.

[Figure 1: CPU → (virtual address) → virtual cache → (virtual address) → translation (segment tables / page tables / TLB) → (physical address) → memory]

[Figure 2: CPU → (virtual address) → translation → (physical address) → physical cache → (physical address) → memory]

Caches

Slide 528 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Translation

Fully virtual cache:
• No address translation time on a hit
• Cache must have copies of protection information (protection info must be fetched from page/segment tables)
• Cache flush on process switch (individual virtual addresses usually refer to different physical addresses)
• Shared memory: different virtual addresses refer to the same physical address, so copies of the same data can sit in the cache

Fully physical cache:
• Works very well with shared memory accesses
• Always address translation (time): hits are of no advantage regarding address translation

Caches


Slide 529 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Translation. Solution: get the best from both virtual and physical caches.

Two issues in accessing a cache:
• Indexing the cache, that is, calculating the target set (or slot with direct mapping)
• Comparing tags: comparing the tag field with (parts of) the block address

The page offset (the part that is identical in both virtual and physical address space) is used to index the cache. In parallel, the virtual part of the address is translated into the physical address and used for tag comparison. Improved hit time.

⇒ „virtually indexed, physically tagged cache"

Caches

Slide 530 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Address Translation: virtually indexed, physically tagged cache.

[Figure: the virtual address is split into page number and page offset. The page offset (plus word offset) indexes the cache directly, while the page number is translated (TLB, page table) into the frame address of the physical address. The translated frame address is compared against the cache tags; on a hit the data is delivered to the CPU, otherwise the next memory level is accessed.]

Caches

Slide 531 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Cache Optimization

Summary of basic cache optimizations. Data from [HP06 p. C-39]. + = improves a factor, – = hurts a factor, blank = no impact.

Technique                      Hit time  Miss penalty  Miss rate  Complexity  Comment
Larger block size                            –            +           0       Trivial
Larger cache capacity             –                       +           1       Widely used for L2
Higher associativity              –                       +           1       Widely used
Multi-level caches                           +                        2       Costly hardware; harder if L1 block size ≠ L2 block size
Read over write                              +                        1       Widely used
Avoiding address translation      +                                   1       Widely used

Caches

Slide 532 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Exam Computer Architecture

Date: 09.03.2007 (March 9th, 2007)
Time: 8:30 hrs
Location: ST 025/118, Duisburg - Ruhrort!


Slide 533 Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Computer Architecture