CS519: Lecture 7 – Uniprocessor and Multiprocessor Scheduling



Page 2

CS 519 – Operating Systems Theory, DCS, Rutgers University

What and Why?

What is processor scheduling? Why?

Originally, to share an expensive resource – multiprogramming

Now, to perform concurrent tasks because the processor is so powerful

The future looks like the past plus the present:

Rent-a-computer approach – large data/processing centers use multiprogramming to maximize resource utilization

Systems still powerful enough for each user to run multiple concurrent tasks

Page 3

Assumptions

Pool of jobs contending for the CPU

The CPU is a scarce resource

Jobs are independent and compete for resources (this assumption does not always hold)

The scheduler mediates between jobs to optimize some performance criterion

Page 4

Types of Scheduling

We’re mostly concerned with short-term scheduling

Page 5

What Do We Optimize?

System-oriented metrics:

Processor utilization: percentage of time the processor is busy

Throughput: number of processes completed per unit of time

User-oriented metrics:

Turnaround time: interval between submission and termination (including any waiting time). Appropriate for batch jobs

Response time: for interactive jobs, time from the submission of a request until the response begins to arrive

Deadlines: when process completion deadlines are specified, the percentage of deadlines met should be maximized

Page 6

Design Space

Two dimensions:

Selection function: which of the ready jobs should be run next?

Preemption:

Preemptive: the currently running job may be interrupted and moved to the Ready state

Non-preemptive: once a process is in the Running state, it continues to execute until it terminates or blocks for I/O or a system service

Page 7

Job Behavior

I/O-bound jobs: jobs that perform lots of I/O; tend to have short CPU bursts

CPU-bound jobs: jobs that perform very little I/O; tend to have very long CPU bursts

The distribution of CPU burst lengths tends to be hyper-exponential: a very large number of very short CPU bursts and a small number of very long CPU bursts

[Figure: a job alternating between CPU bursts and disk I/O]

Page 8

Scheduling Algorithms

FIFO: non-preemptive

Round-Robin: preemptive

Shortest Job Next (SJN): non-preemptive

Shortest Remaining Time (SRT): preemptive at arrival

Highest Response Ratio Next (HRRN, response ratio = turnaround/service time): non-preemptive

Priority with feedback: preemptive at time quantum

Page 9

Example Job Set

Process   Arrival Time   Service Time
1         0              3
2         2              6
3         4              4
4         6              5
5         8              2
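These policies are easiest to compare on a concrete workload. A minimal sketch (Python; the `fifo` helper is illustrative, not from the lecture) that runs FIFO on the job set above and reports per-process turnaround times:

```python
# Simulate FIFO (non-preemptive) on the example job set.
# Each job: (id, arrival_time, service_time); jobs are assumed
# already sorted by arrival time, as in the table above.
jobs = [(1, 0, 3), (2, 2, 6), (3, 4, 4), (4, 6, 5), (5, 8, 2)]

def fifo(jobs):
    """Return {job_id: turnaround_time} under FIFO scheduling."""
    clock = 0
    turnaround = {}
    for jid, arrival, service in jobs:
        clock = max(clock, arrival)      # idle until the job arrives
        clock += service                 # run the job to completion
        turnaround[jid] = clock - arrival
    return turnaround

t = fifo(jobs)
print(t)                                 # {1: 3, 2: 7, 3: 9, 4: 12, 5: 12}
print(sum(t.values()) / len(t))          # mean turnaround = 8.6
```

Note how the short job 5 waits behind every earlier arrival, which is exactly the FIFO weakness discussed on a later slide.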

Page 10

Behavior of Scheduling Policies

Page 11

Behavior of Scheduling Policies

Page 12

Priority with Feedback Scheduling

After each preemption, the process moves to a lower-priority queue
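A minimal sketch of this discipline (Python; the class shape, number of levels, and method names are my own assumptions): the scheduler always serves the highest non-empty queue, and a job that exhausts its quantum drops one level.

```python
from collections import deque

class FeedbackScheduler:
    """Priority with feedback: a job preempted at the end of its quantum
    is demoted one priority level; jobs that block or finish early are not."""
    def __init__(self, levels=3):
        self.queues = [deque() for _ in range(levels)]

    def add(self, job, level=0):
        self.queues[level].append(job)          # new jobs enter at the top

    def pick(self):
        """Return (job, level) from the highest-priority non-empty queue."""
        for level, q in enumerate(self.queues):
            if q:
                return q.popleft(), level
        return None, None

    def preempt(self, job, level):
        """Quantum expired: demote one level (bottom queue is the floor)."""
        self.queues[min(level + 1, len(self.queues) - 1)].append(job)

sched = FeedbackScheduler()
sched.add("A"); sched.add("B")
job, lvl = sched.pick()          # "A" from level 0
sched.preempt(job, lvl)          # "A" drops to level 1
print(sched.pick()[0])           # "B" (level 0 outranks demoted "A")
```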

Page 13

Scheduling Algorithms

FIFO is simple but leads to poor average response times: short processes are delayed by long processes that arrive before them

RR eliminates this problem, but favors CPU-bound jobs, which have longer CPU bursts than I/O-bound jobs

SJN, SRT, and HRRN alleviate the problem with FIFO, but require information on the length of each process. This information is not always available (although it can sometimes be approximated from past history or user input)

Feedback is a way of alleviating the problem with FIFO without requiring information on process length
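HRRN's selection function fits in a few lines: the response ratio R = (waiting time + service time) / service time grows the longer a job waits, and grows fastest for short jobs. A sketch (Python; assumes estimated service times are available, and the helper names are hypothetical):

```python
def response_ratio(wait, service):
    # HRRN: R = (w + s) / s; rises with waiting time, favors short jobs
    return (wait + service) / service

def hrrn_pick(ready, now):
    """ready: list of (job_id, arrival_time, service_time).
    Pick the job with the highest response ratio at time `now`."""
    return max(ready, key=lambda j: response_ratio(now - j[1], j[2]))

# Hypothetical ready queue at time 5: a long early job vs. a short late one.
ready = [(1, 0, 6), (2, 3, 2)]
print(hrrn_pick(ready, 5)[0])    # 2 (ratio 2.0 beats job 1's 11/6)
```

This is how HRRN avoids starving long jobs: their ratio keeps climbing while they wait, so they eventually win.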

Page 14

It’s a Changing World

The assumption of a bi-modal workload no longer holds: interactive continuous-media applications are sometimes processor-bound but require good response times

The new computing model requires more flexibility:

How to match priorities of cooperative jobs, such as client/server jobs?

How to balance execution among the multiple threads of a single process?

Page 15

Lottery Scheduling

Randomized resource allocation mechanism

Resource rights are represented by lottery tickets

Scheduling happens in rounds of a lottery: in each round, the winning ticket (and therefore the winner) is chosen at random

Your chance of winning depends directly on the number of tickets you hold: P[winning] = t/T, where t = your number of tickets and T = the total number of tickets

Page 16

Lottery Scheduling

After n rounds, your expected number of wins is E[wins] = n × P[winning]

The expected number of lotteries a client must wait before its first win is E[wait] = 1/P[winning]

Lottery scheduling implements proportional-share resource management

Ticket currencies allow isolation between users, processes, and threads

OK, so how do we actually schedule the processor using lottery scheduling?
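A sketch of the core draw (Python; this shows only the basic mechanism, not the currency machinery from the paper): draw a number in [0, T) and walk the client list to find the holder of the winning ticket.

```python
import random

def lottery_pick(clients, rng=random):
    """clients: dict of client -> ticket count.
    Each client wins with probability tickets/total, i.e. P[winning] = t/T."""
    total = sum(clients.values())
    winner_ticket = rng.randrange(total)       # winning ticket in [0, T)
    for client, tickets in clients.items():
        if winner_ticket < tickets:
            return client
        winner_ticket -= tickets               # skip this client's tickets

clients = {"A": 75, "B": 25}                   # 3:1 ticket ratio
wins = {"A": 0, "B": 0}
for _ in range(10_000):                        # 10,000 lottery rounds
    wins[lottery_pick(clients)] += 1
print(wins["A"] / wins["B"])                   # should be close to 3
```

With E[wait] = 1/P[winning], client B here waits on average 100/25 = 4 rounds for its first win.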

Page 17

Implementation

Page 18

Performance

Allocated and observed execution ratios between two tasks running the Dhrystone benchmark. With the exception of the 10:1 allocation ratio, all observed ratios are close to their allocations.

Page 19

Short-term Allocation Ratio

Page 20

Isolation

Five tasks running the Dhrystone benchmark. Let amount.currency denote a ticket allocation of amount denominated in currency. Tasks A1 and A2 have allocations 100.A and 200.A, respectively. Tasks B1 and B2 have allocations 100.B and 200.B, respectively. Halfway through the experiment, B3 is started with allocation 300.B. This inflates the number of tickets in B from 300 to 600. There is no effect on tasks in currency A or on the aggregate iteration ratio of A tasks to B tasks. Tasks B1 and B2 slow to half their original rates, corresponding to the factor-of-2 inflation caused by B3.
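The arithmetic behind this isolation can be sketched as follows (a simplified funding model, assumed for illustration; the paper's currency mechanism is more general): a task's base share is its fraction of the tickets issued in its currency, scaled by the currency's fixed funding.

```python
def base_share(task_tickets, currency_issued, currency_funding):
    """Base-currency share of a task: inflation inside one currency
    dilutes only that currency's tasks, not other currencies."""
    return task_tickets / currency_issued * currency_funding

# Before B3 starts: B1 holds 100.B of the 300 tickets issued in B.
before = base_share(100, 300, 1)
# After B3 starts with 300.B, B's issue inflates from 300 to 600.
after = base_share(100, 600, 1)
print(before, after)     # B1's share halves; currency A is unaffected
```

This matches the figure: B1 and B2 slow to half their rates while the A tasks and the A:B aggregate ratio are untouched.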

Page 21

Thread Scheduling for Cache Locality

Traditionally, each resource (CPU, memory, I/O) has been managed separately

Resources are not independent, however: the policy for one resource can affect how another resource is used. For instance, the order in which threads are scheduled can affect the performance of the memory subsystem

Neat paper that uses a very simple scheduling idea to enhance memory performance

Page 22

Main Idea

When working with a large array, we want to tile (block) for efficient use of the cache

What is tiling? Restructuring loops for data re-use

Tiling by hand is a pain and is error-prone

Compilers can tile automatically, but not always: for instance, when the program contains dynamically allocated or indirectly accessed data

So, use threads and hints to improve cache utilization
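For concreteness, here is what tiling looks like by hand on a simple traversal (a Python sketch of the loop restructuring only; the tile size B is an assumption that would be tuned to the cache in practice):

```python
# Sum each column of an n x n row-major matrix.
# The naive column-order loop strides through memory; the tiled loop
# revisits a narrow band of B columns across all rows, so data fetched
# for one row segment is reused before it is evicted from the cache.
def column_sums_naive(m):
    n = len(m)
    return [sum(m[i][j] for i in range(n)) for j in range(n)]

def column_sums_tiled(m, B=64):
    n = len(m)
    sums = [0] * n
    for jj in range(0, n, B):                    # loop over column tiles
        for i in range(n):                       # all rows for this tile
            for j in range(jj, min(jj + B, n)):  # B columns at a time
                sums[j] += m[i][j]
    return sums

m = [[i * 5 + j for j in range(5)] for i in range(5)]
assert column_sums_tiled(m, B=2) == column_sums_naive(m)
```

Getting the tile bounds (`min(jj + B, n)`) right for every loop nest is exactly the error-prone bookkeeping the slide refers to.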

Page 23

Example

Thread ti is denoted by ti(ai1, …, aik), where aij is the address of the jth piece of data referenced by thread ti. Simplify by using just 2 or 3 addresses or hints. Hints might be elements of rows or columns of a matrix, for example.

Page 24

Algorithm

The hash algorithm should assign threads to bins so that threads with similar hints fall in the same bin

Threads in each bin are scheduled together

With 2 hints, the bins form a 2-D plane

Key insight: the sum of the two dimensions of a bin is less than the cache size C

Easy to extend to k hints
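A sketch of the binning idea (Python; the hash function and bin width are my assumptions, and the paper's exact scheme may differ): map each thread's two hint addresses to a 2-D bin, then run the threads of a bin back to back so their data overlaps in the cache.

```python
from collections import defaultdict

def bin_threads(threads, bin_width):
    """threads: list of (thread_id, hint1, hint2), where hints are data
    addresses. Threads whose hints fall in the same (hint1, hint2) region
    share a bin and are scheduled consecutively."""
    bins = defaultdict(list)
    for tid, h1, h2 in threads:
        bins[(h1 // bin_width, h2 // bin_width)].append(tid)
    return bins

# Hypothetical threads: 1 and 2 touch nearby addresses, 3 does not.
threads = [(1, 0x1000, 0x8000), (2, 0x1040, 0x8040), (3, 0x9000, 0x2000)]
for key, tids in bin_threads(threads, bin_width=0x1000).items():
    print(key, tids)    # threads 1 and 2 land in the same bin
```

Extending to k hints just means using a k-tuple of quantized hints as the bin key.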

Page 25

Performance

Fork = create and schedule a null thread

Run = execute and terminate a null thread

Page 26

More Complex Examples

Partial differential equation solver

Page 27

Multiprocessor Scheduling

Load sharing: single ready queue; an idle processor dequeues the thread at the front of the queue; preempted threads are placed at the end of the queue

Gang scheduling: all threads belonging to an application run at the same time

Hardware partitions: a chunk of the machine is dedicated to each application

Advantages and disadvantages?

Page 28

Multiprocessor Scheduling

Load sharing: poor locality; poor synchronization behavior; simple; good processor utilization. Affinity or per-processor queues can improve locality.

Gang scheduling: central control; fragmentation – unnecessary processor idle times (e.g., two applications with P/2+1 threads); good synchronization behavior; if careful, good locality

Hardware partitions: poor utilization for I/O-intensive applications; fragmentation – unnecessary processor idle times when the partitions left are small; good locality and synchronization behavior
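The P/2+1 fragmentation example can be made concrete (a sketch; the helper is hypothetical): on P processors, two gangs of P/2+1 threads cannot be co-scheduled, so each time slice runs only one gang and leaves P/2-1 processors idle.

```python
def gang_utilization(P, gang_size):
    """Processor utilization per time slice when only one gang of
    `gang_size` threads fits on the P processors at a time."""
    return gang_size / P

# Two applications with P/2 + 1 threads each on a 16-processor machine:
# neither pair of gangs fits together, so utilization caps at 9/16.
P = 16
print(gang_utilization(P, P // 2 + 1))   # 9/16 = 0.5625
```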