Energy Efficient Scheduling Techniques For Real-Time Embedded Systems. Rabi Mahapatra. 9/23/2015.

Post on 16-Jan-2016


04/21/23

Outline

• Introduction
• Motivation
• Related Work
• Single Processor Systems
• Distributed Multiprocessor Systems
• Experiments & Results
• Summary

Introduction

Figure: Sample embedded systems (PDA, audio/video entertainment devices, robots, handheld computer, mobile phone, network camera, wireless presentation gateway, Cerfcube).

Application Specification for Embedded Systems

• Periodic task graphs
  • Each task characterized by:
    • Period
    • Execution time
    • Deadline
• Sporadic tasks
  • Invoked at any time
  • Hard deadline
• Soft aperiodic tasks
  • Invoked at any time
  • No deadline

Figure: Typical input specification of embedded systems. Tasks t1 to t5; a periodic task graph with Period = 90, Deadline = 90, and a sporadic task with Deadline = 30.

Why Low Power?

• High power dissipation causes:
  • Chip failures
  • Expensive cooling & packaging overheads
  • High manufacturing costs
• Portable systems: user convenience limited by
  • Battery size
  • Recharging interval

Power Management

Processor power dissipation is a function of $\alpha \cdot C_l \cdot V_{dd}^2 \cdot f$, where $\alpha$ is the switching activity, $C_l$ the load capacitance, $V_{dd}$ the supply voltage, and $f$ the clock frequency.

Various low-power techniques:
• System-level
• Architecture-level
• Circuit-level

System-level power reduction techniques:
• Dynamic Voltage Scaling
• Dynamic Power Management
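To make the power expression concrete, here is a minimal sketch of why scaling voltage and frequency together pays off. The function names and the numeric values are illustrative assumptions, not figures from the slides; the only relation taken from the slide is that power is proportional to α · C_l · V_dd² · f.

```python
def dynamic_power(alpha, c_load, v_dd, freq):
    """Dynamic power, proportional to alpha * C_l * V_dd^2 * f."""
    return alpha * c_load * v_dd ** 2 * freq


def energy_for_cycles(alpha, c_load, v_dd, freq, cycles):
    """Energy to execute a fixed number of cycles: power * (cycles / freq)."""
    return dynamic_power(alpha, c_load, v_dd, freq) * cycles / freq


# Hypothetical operating points: full speed versus half voltage and half frequency.
full_power = dynamic_power(1, 1, 2, 200)
half_power = dynamic_power(1, 1, 1, 100)
full_energy = energy_for_cycles(1, 1, 2, 200, 1000)
half_energy = energy_for_cycles(1, 1, 1, 100, 1000)
```

With these assumed values, halving both voltage and frequency cuts power by a factor of 8, but the same work takes twice as long, so the energy for a fixed number of cycles drops by a factor of 4. This is the trade that Dynamic Voltage Scaling exploits when deadlines leave room to run slower.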

System Level Power Management Taxonomy

Figure: taxonomy tree. SLPM splits into DPM and LPS (DVS). Each branch splits into fixed and variable task sets, and each of those into single processor and multiprocessors. Further levels: D≤P (contd.) and No Restrictions; Tolerance DL and Hard Realtime.

SLPM: System Level Power Management
DPM: Dynamic Power Management
LPS: Low Power Scheduling

System Level Power Management Taxonomy (contd …)

Figure: taxonomy tree, continued. D≤P splits into Tolerance DL and Hard Real time; the next level into No Precedence and With Precedence; each of those into Periodic and Periodic + Sporadic.

Our Objective

Given an embedded system and its application task graphs with library functions (i.e., period, execution time, deadline, etc.), our goal is to reduce the system-wide power consumption while guaranteeing the deadlines.

Related Work

Multi-Processor:

J. Luo and N. K. Jha, 2001, "Battery-Aware static scheduling"
• Global shifting scheme & local schedule transformations
• More suitable to small scale systems

R. Mishra, N. Rastogi, and D. Zhu, 2003, "Energy aware scheduling for distributed"
• Greedy and gap-filling dynamic power management techniques
• Limited to task graphs with equal deadline

D. Zhu, R. Melhem, and B. Childers, 2003, "Scheduling with Dynamic Voltage/Speed"
• Slack sharing among processors, global queue
• Limited to homogeneous systems with shared memory

Single Processor:

G. Quan and X. Hu, 2001
• Minimum constant voltage for each interval
• Assumes deadline less than or equal to period

V. Swaminathan and K. Chakrabarty, 2000
• Low-energy earliest deadline first heuristic
• No guarantee on required maximum processor speed

Contributions

Provides a framework for single processor systems that considers tasks:
• whose response time is greater than the period
• with precedence constraints

Introduces a chain of task sets based execution approach to model low power in distributed embedded systems.


Energy Efficient Scheduling Techniques for Single Processor


Proposed Approach

“A 3-step approach to reduce power in single processor embedded systems with arbitrary response times and precedence constraints.”

Step 1: Task priority assignment that guarantees precedence constraints.

Step 2: Determination of task speed that
• guarantees deadlines
• reduces power consumption

Step 3: Dynamic power management that exploits
• idle intervals
• run-time variations in task execution time

Task Modeling

Periodic task graphs
• Scheduled according to their priorities

Sporadic task
• Invoked at any time
• Hard deadline
• Execution slot is needed
• With WCET the worst-case execution time and d the deadline of the sporadic task, execution slots are defined with
  • Period: d - WCET
  • Deadline: d - WCET

STEP 1 : Priority Assignment

Flowchart:
1. Arrange the task graphs & execution slots in increasing order of their period.
2. Remove the task graph with the smallest period.
3. Remove the node with no predecessor and least slack time.
4. Assign the node the next highest priority.
5. If not all nodes in the graph are assigned priorities, go to 3.
6. If the list is not empty, go to 2; otherwise END.
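The priority-assignment flowchart can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout (`name`, `period`, `slack`, `preds`) is hypothetical, the per-node slack times are taken as given inputs because the slides do not say how they are computed, and ties are broken by node name.

```python
def assign_priorities(graphs):
    """Return (graph name, node) pairs, highest priority first.

    Each graph is a dict with hypothetical keys:
      'name'   - label of the task graph or execution slot
      'period' - its period
      'slack'  - {node: slack time}
      'preds'  - {node: set of predecessor nodes}
    """
    order = []
    # Take task graphs and execution slots in increasing order of period.
    for g in sorted(graphs, key=lambda g: g["period"]):
        remaining = set(g["slack"])
        while remaining:
            # A node is eligible once none of its predecessors is still waiting.
            ready = [n for n in remaining
                     if not (g["preds"].get(n, set()) & remaining)]
            # Among eligible nodes, pick the one with the least slack.
            node = min(ready, key=lambda n: (g["slack"][n], n))
            order.append((g["name"], node))
            remaining.remove(node)
    return order


example = [
    {"name": "G1", "period": 90,
     "slack": {"t1": 10, "t2": 5, "t3": 20},
     "preds": {"t2": {"t1"}, "t3": {"t1"}}},
    {"name": "S1", "period": 30, "slack": {"s": 0}, "preds": {}},
]
priorities = assign_priorities(example)
```

In this example the execution slot `S1` has the smaller period, so its node gets the highest priority; within `G1`, `t1` must precede `t2` and `t3`, and `t2` is chosen before `t3` because it has less slack.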

STEP 2 : Task Speed Determination

Flowchart:
1. Arrange tasks in decreasing order of priority.
2. For each task in the list, determine the speed at which the task and all high priority tasks in the list can be run.
3. Find the task with largest speed, 's'.
4. Mark the speed for this task and all other high priority tasks as 's'.
5. Remove all these tasks from the list.
6. If the list is not empty, go to 2; otherwise END.
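One possible reading of the speed-determination flowchart is sketched below. It is an interpretation, not the authors' implementation: it assumes a discrete set of normalized speed levels (`levels`, hypothetical), execution time scaling as e / s, and the simpler deadline-not-greater-than-period case for the feasibility test. `meets_deadline` and `assign_speeds` are illustrative names.

```python
from fractions import Fraction
from math import ceil


def meets_deadline(tasks, speeds, i):
    """Fixed-point response-time test for task i (assumes D_i <= P_i).

    tasks: list of (P, e, D) in decreasing priority; speeds[k] scales e_k.
    """
    _, e_i, d_i = tasks[i]
    own = Fraction(e_i) / speeds[i]
    higher = [(p, Fraction(e) / s)
              for (p, e, _), s in zip(tasks[:i], speeds[:i])]
    t = own + sum(c for _, c in higher)
    while t <= d_i:
        nxt = own + sum(ceil(t / p) * c for p, c in higher)
        if nxt == t:
            return True
        t = nxt
    return False


def assign_speeds(tasks, levels):
    """Assign each task the speed chosen by the iterative procedure."""
    n = len(tasks)
    speeds = [None] * n
    start = 0  # tasks[:start] already have their final speed
    while start < n:
        required = []
        for i in range(start, n):
            for s in sorted(levels):
                # Unassigned tasks up to i all run at the trial speed s.
                trial = speeds[:start] + [s] * (i - start + 1)
                if meets_deadline(tasks, trial, i):
                    required.append((s, i))
                    break
            else:
                raise ValueError(f"task {i} is infeasible at every level")
        s, m = max(required)  # task needing the largest speed
        for i in range(start, m + 1):
            speeds[i] = s
        start = m + 1
    return speeds


levels = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)]
speeds = assign_speeds([(10, 5, 5), (100, 10, 100)], levels)
```

Here the first task has a deadline equal to its execution time, so it must run at full speed; once it is fixed, the second task can be slowed to a quarter speed and still finish at time 80, before its deadline of 100. Exact fractions are used so that the ceiling terms are not disturbed by floating-point rounding.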

Task Schedulability

Let the task set {T1, T2, …, TN} be arranged in decreasing order of priorities.

Characteristics of Ti: {Pi, ei, Di}. A task set is feasible if the deadlines of all tasks are always met.

Critical Instant Theorem (Liu and Layland, 1973, "Scheduling algorithms for multiprogramming"): if a task meets its deadline whenever the task is requested simultaneously with all the high priority tasks, then the deadline will always be met for all task phasings.

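Task Schedulability (Contd …)

In other words, the task set {T1, T2, …, TN} is schedulable if and only if $t_i \le D_i \;\forall\, i = 1, \ldots, N$, where

$$\sum_{k=1}^{i-1} \left\lceil \frac{t_i}{P_k} \right\rceil e_k + e_i = t_i \quad \text{if } P_i \ge D_i \qquad (1)$$

otherwise $t_{i,j} \le D_i \;\forall\, i = 1, \ldots, N$ and all instances $j$ of $t_i$, where $t_{i,j} = R(t_{i,j} + (j-1)P_i) - (j-1)P_i$ and

$$R(t_{i,j}) = \sum_{k=1}^{i-1} \left\lceil \frac{t_{i,j}}{P_k} \right\rceil e_k + j \cdot e_i \qquad (2)$$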
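The schedulability conditions can be checked numerically by fixed-point iteration over successive instances of a task, which also covers response times longer than the period. The sketch below is an illustration with assumed names (`worst_response`, `cdiv`); it takes integer task parameters (P, e, D) listed in decreasing priority order and stops once an instance completes before the next release.

```python
from fractions import Fraction


def cdiv(a, b):
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def worst_response(tasks, i):
    """Worst-case response time of tasks[i]; tasks are (P, e, D) tuples."""
    p_i, e_i, _ = tasks[i]
    higher = tasks[:i]
    if sum(Fraction(e, p) for p, e, _ in tasks[:i + 1]) > 1:
        raise ValueError("utilization above 1: the iteration would not end")
    worst, j = 0, 1
    while True:
        # Completion time x of the j-th instance, measured from the
        # critical instant: x = j*e_i + sum over k of ceil(x / P_k) * e_k.
        x = j * e_i + sum(e for _, e, _ in higher)
        while True:
            nxt = j * e_i + sum(cdiv(x, p) * e for p, e, _ in higher)
            if nxt == x:
                break
            x = nxt
        # Response time of instance j, released at (j - 1) * P_i.
        worst = max(worst, x - (j - 1) * p_i)
        if x <= j * p_i:  # finished before the next release
            return worst
        j += 1


tasks = [(70, 26, 70), (100, 62, 120)]
r_high = worst_response(tasks, 0)
r_low = worst_response(tasks, 1)
schedulable = all(worst_response(tasks, i) <= tasks[i][2]
                  for i in range(len(tasks)))
```

In this two-task example the first instance of the low-priority task responds in 114 time units, but the fifth instance takes 118, so checking only the first instance would underestimate the worst case when the deadline exceeds the period.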


Step 3: Dynamic Power Management

During system operation, idle intervals arise when the actual task execution time is less than the worst-case execution time (that is assumed at the time of fixed priority scheduling).

Since these idle intervals cannot be exploited by off-line methods, an on-line method that adapts the clock speed to take advantage of idle intervals is needed.

DPM (Contd ..)

Schedule the tasks according to their pre-determined speeds in a preemptive manner.

If the current task has finished and the queue of ready tasks is empty, then:
• Determine the length of idle interval
• If feasible, put the processor in the power down mode.
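The run-time rule above can be sketched as a small decision function. The feasibility test used here is an assumption: the slides do not define "feasible", so the sketch powers down only when the idle interval exceeds both a wake-up latency and a break-even time (the shortest sleep that saves energy). The names `plan_idle_interval`, `wakeup_latency`, and `break_even_time` are hypothetical, and the next release time is assumed to be known.

```python
def plan_idle_interval(now, next_release, wakeup_latency, break_even_time):
    """Decide what to do when a task finishes and the ready queue is empty.

    Returns (action, time of the next processor event).
    """
    idle = next_release - now  # length of the idle interval
    threshold = max(wakeup_latency, break_even_time)
    if idle > threshold:
        # Sleep, and start waking early enough to be ready at the release.
        return ("power_down", next_release - wakeup_latency)
    # Interval too short to pay for the mode transition.
    return ("stay_idle", next_release)


long_gap = plan_idle_interval(now=10, next_release=50,
                              wakeup_latency=2, break_even_time=5)
short_gap = plan_idle_interval(now=10, next_release=13,
                               wakeup_latency=2, break_even_time=5)
```

A 40-unit gap is long enough to power down and wake at time 48, while a 3-unit gap is not, so the processor stays idle until the release at 13.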

Experimental Setup

• Event driven simulator
• Intel StrongARM SA-1100 embedded processor specifications
• Real-world test cases (CNC controller, INS, avionics, …)

Benchmarks

Test cases      # Periodic task graphs   # Sporadic tasks   # Tasks with D > P   Utilization
Synthetic I     3                        1                  2                    0.52
Synthetic II    5                        3                  4                    0.61
Synthetic III   10                       5                  8                    0.737
CNC [1]         8                        ---                ---                  0.488
INS [2]         6                        ---                ---                  0.72
Avionics [3]    14                       1                  ---                  0.692

Characteristics of various test cases

Figure: Comparison of % energy savings with various low power techniques. Bar chart of % energy savings (axis 0 to 100) for VLPS [5] and the proposed technique, on the CNC and INS test cases.

Figure: % Energy savings with the proposed technique on various test cases. Bar chart of % energy savings (axis 0% to 100%) for Synthetic I, Synthetic II, Synthetic III, CNC, INS, and Avionics.


Energy Efficient Scheduling Techniques for Multi-Processor Embedded Systems

Overview

• Preliminaries
• System model
• Slack distribution heuristic
• Periodical determination of service rate
• Experiments & Results

Preliminaries

Command and control systems that comprise hard real-time applications in a distributed environment.

An application comprises:
• Chain(s) of tasks or task sets
• Hard deadlines
• Exchange of messages during execution

Admitting a task set (connection establishment), key issues:
• Traffic descriptor [6]
• Worst-case delay analysis

Power reduction approaches:
• Slack distribution
• Clock speed adaptation during system run-time

System Model

A task set is described by a vector triplet $(P_i, C_i, D_i)$, where

$$P_i \equiv \{P_i^1, \ldots, P_i^n\}, \qquad C_i \equiv \{C_i^1, \ldots, C_i^n\}, \qquad D_i \equiv \{D_i^1, \ldots, D_i^n\}$$

Figure: A distributed system with 3 nodes & 2 task sets (labels PE1, PE2, PE3, M1, M2).

Admission of Task Set

Task set admission, key phases:
• Setting up task set
• Reply task set

Setting up task set, key issues:
• local worst-case delay < local deadline
• end-to-end worst-case delay < end-to-end deadline

Reply task set, key issues:
• Slack distribution
• Service rate < 1 (periodic service rate determination)

Observations

Processing of messages at a node can be extended up to their delay bounds.

This slack can be utilized to increase the worst-case delay tolerable at the computational nodes involved in processing the task set.

The actual processing time demanded by the messages of a task set during run-time varies and is less than the worst-case specification.

A technique to adapt the clock speed periodically is introduced to take advantage of run-time variations.

Slack Distribution

The slack in a task set is the difference between the end-to-end deadline and the sum of the worst-case delays suffered at each node.

This slack can be distributed among the nodes serving the task set to reduce the system energy consumption.

The slack is distributed among the nodes according to the service rate of the nodes.

Service Rate Determination

Key issues:
• Monitoring the traffic pattern
• Feedback incorporation while determining service rate
• Periodical service rate determination
  • guarantees processing of messages of outstanding intervals by their delay bounds
  • guarantees processing of messages of upcoming interval by their delay bounds

Scheduling policies considered: FCFS & WRR

FCFS Scheduling Policy

The new service rate at the beginning of every interval is determined according to

$$S_t = \sum_{j=1}^{k} S_t^{j}$$

and the corresponding queue is determined according to

$$Q_t = \sum_{j=1}^{k} Q_t^{j}$$

where $k$ is the number of outstanding intervals.

The service rate should be such that it must process the outstanding messages that arrived during the interval (t-j, t-(j-1)) by their remaining delay bound, i.e., $(d_{FCFS} - j)$:

$$S_t^{j} = \frac{Q_t^{j}}{d_{FCFS} - j}$$
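The FCFS rule stated above (serve the backlog of each outstanding interval within its remaining delay bound) can be sketched as follows. This is a hedged illustration: time is assumed to be measured in units of one monitoring interval, backlog is assumed to be expressed as processing time at full speed, and `fcfs_service_rate` is a hypothetical name.

```python
def fcfs_service_rate(backlogs, delay_bound):
    """Service rate needed to clear all outstanding backlog in time.

    backlogs[j - 1] is the processing time still owed to messages that
    arrived j intervals ago; they must finish within (delay_bound - j).
    """
    rate = 0.0
    for j, backlog in enumerate(backlogs, start=1):
        if backlog == 0:
            continue
        remaining = delay_bound - j
        if remaining <= 0:
            raise ValueError("backlog is already past its delay bound")
        rate += backlog / remaining
    return rate


rate = fcfs_service_rate(backlogs=[0.6, 0.4], delay_bound=5)
```

With a delay bound of 5 intervals, backlog of 0.6 from one interval ago has 4 intervals left and backlog of 0.4 from two intervals ago has 3 left, giving a required rate of about 0.28, which stays below the admission limit of 1.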

WRR Scheduling Policy

The new service rate at the beginning of every interval is determined according to

$$S_t^{i} = \sum_{j=1}^{k} S_t^{i,j}$$

and the corresponding queue is determined according to

$$Q_t^{i} = \sum_{j=1}^{k} Q_t^{i,j}$$

The service rate and the corresponding processing time demanded by the outstanding messages that arrived during the interval (t-j, t-(j-1)) are given by $S_t^{i,j}$ and $Q_t^{i,j}$, with

$$S_t^{i,j} = \frac{Q_t^{i,j}}{d^{i} - j}$$

Experimental Setup

• Event driven simulator
• Socket interface for communication
• Intel PXA250 XScale embedded processor
• Real-life test cases (DSP, Multimedia, …)

Benchmarks

Test cases      Number of nodes   Number of connections   Number of modes
Synthetic I     3                 10                      2
Synthetic II    5                 20                      2
Synthetic III   10                30                      3
Multimedia      4                 4                       3
DSP [4]         16                31                      1

Characteristics of various test cases

Benchmarks (Contd …)

Test cases           Mode 1 (nodes, connections)   Mode 2 (nodes, connections)   Mode 3 (nodes, connections)
Synthetic (10,30)    (9,20)                        (9,25)                        (10,30)
Multimedia (4,4)     (3,2)                         (3,3)                         (4,4)

Mode configurations for Multimedia and Synthetic test cases

Energy Saving versus Slack Distribution

Figure: two bar charts, FCFS (axis 0 to 50) and WRR (axis 0 to 80), of system energy savings (%) for the slack distribution schemes Srate, Equal, Wcet, and Greedy. Series: (3,10), (5,20), (10,30), MM(4,4), DSP(16,31).

Energy Saving at Different Modes

Figure: bar chart of system energy savings (%) (axis 0 to 50) versus normalised peak power, for Synthetic 1, Synthetic 0.8, Multimedia 1, and Multimedia 0.8. Series: Mode 1, Mode 2, Mode 3.

Service Rate at Intervals

Figure: normalised service rate (axis 0 to 1.2) over intervals 1 to 12, for (10,30) at one node. Legend: 0.4, 0.6, 0.8, 0.9, 1.

Service Rate vs MI

Figure: normalised service rate (axis 0.35 to 0.6) versus MI (monitoring interval) from 1 to 5, for (3,10) at one node. Legend: 1, 0.9, 0.8.

Overhead Due to Number of Task Sets on Service

Figure: overhead in microseconds (axis 100 to 300) versus number of connections (1 to 30), for (10,30) at one node.

Summary

Energy efficient scheduling technique for single processor that:
• handles sporadic and periodic task graphs with precedence constraints
• takes into account tasks with arbitrary response times
• determines minimum speed for each task
• adapts clock speed to take advantage of idle intervals

A connection based task execution approach for distributed embedded systems that:
• effectively distributes the slack available in the connection to reduce system wide power consumption
• periodically adjusts the clock speed to take advantage of run-time variations

Experimental results indicate that the proposed techniques yield significant energy savings.

References

1. N. Kim, M. Ryu, S. Hong, M. Saksena, C. Choi, and H. Shin, "Visual assessment of a real time system design: A case study on a CNC controller," in Proc. IEEE Real-Time Systems Symposium, Dec. 1996.

2. A. Burns, K. Tindell, and A. Wellings, "Effective analysis for engineering real-time fixed priority schedulers," IEEE Trans. on Software Eng., vol. 21, no. 5, pp. 475-480, May 1995.

3. C. Locke, D. Vogel, and T. Mesler, "Building a predictable avionics platform in Ada: A case study," in Proc. IEEE Real-Time Systems Symposium, Dec. 1991.

4. C. M. Woodside and G. G. Monforton, "Fast allocation of processes in distributed and parallel systems," IEEE Trans. Parallel & Distr. Systems, vol. 4, no. 2, pp. 164-174, Feb. 1993.

5. G. Quan and X. Hu, "Energy efficient fixed priority scheduling for real-time systems on variable voltage processors," in Proc. Design Automation Conference, June 2001.

6. A. Raha, N. Malcom, and W. Zhao, "Guaranteeing end-to-end deadlines in ATM networks," in Proc. International Conference on Distributed Computing Systems, May 1995.


THANK YOU