A Methodology for Evaluating Runtime Support in Network Processors


Page 1: A Methodology for Evaluating Runtime Support in Network Processors

Department of Electrical and Computer Engineering

University of Massachusetts, Amherst Xin Huang and Tilman Wolf

{xhuang,wolf}@ecs.umass.edu

A Methodology for Evaluating Runtime Support in Network Processors

Page 2: A Methodology for Evaluating Runtime Support in Network Processors

Runtime Support in Network Processors

Network processor (NP)
• Multi-core system-on-chip
• Programmability & high packet processing rate

Heterogeneous resources
• Control processors
• Multiple packet processors
• Co-processors
• Memory hierarchy
• Interconnection

Runtime support
• Dynamic task allocation

[Figure: IXP 2800 block diagram showing the receive and transmit units, scratchpad, hash unit, 16 microengines (μE), SRAM and DRAM interfaces, and the XScale control processor]

Page 3: A Methodology for Evaluating Runtime Support in Network Processors

General Operation of Runtime Support in NP

Input
• Hardware resources
• Workload

Mapping method

Output
• Task allocation

Dynamic adaptation
• Different runtime support systems
• Difficult to compare

[Figure: runtime mapping takes the NP hardware resources (microengines, XScale control processor, scratchpad, hash unit, SRAM, SDRAM, Flash, memory-mapped I/O) and the workload (AP1, AP2, AP3) and produces the task allocation on the processors]

Page 4: A Methodology for Evaluating Runtime Support in Network Processors

Contributions

Evaluation methodology
• Traffic representation
• Analytical system model based on queuing networks
• Results

Specific: three example runtime support systems

I. Ideal Allocation

II. Full Processor Allocation
• R. Kokku, T. Riche, A. Kunze, J. Mudigonda, J. Jason, and H. Vin. A case for run-time adaptation in packet processing systems. In Proc. of the 2nd Workshop on Hot Topics in Networks (HotNets-II), Cambridge, MA, Nov. 2003

III. Partitioned Application Allocation
• T. Wolf, N. Weng, and C.-H. Tai. Design considerations for network processor operating systems. In Proc. of ACM/IEEE Symposium on Architectures for Networking and Communication Systems (ANCS), pages 71-80, Princeton, NJ, Oct. 2005

Page 5: A Methodology for Evaluating Runtime Support in Network Processors

Outline

Introduction

Evaluation Methodology
• Dynamic Workload Model
• Runtime System Model

Result Summary

Page 6: A Methodology for Evaluating Runtime Support in Network Processors


Workload

NP workload is characterized by applications and traffic

How to represent workload?

Page 7: A Methodology for Evaluating Runtime Support in Network Processors

Dynamic Workload Model

Workload graph: $W = (T, U)$
• Application/Task: $T$
• Traffic: $U^t$, $R^t$
• Processing requirement: $D(t_i)$

Example:

Processing requirement:
• R. Ramaswamy and T. Wolf. PacketBench: A tool for workload characterization of network processing. In Proc. of IEEE 6th Annual Workshop on Workload Characterization (WWC-6), pages 42-50, Austin, TX, Oct. 2003
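The workload graph on this slide can be sketched as a small data structure. This is only an illustration: the class name, the field names, and the units (packets per second on edges, instructions per packet for the processing requirement) are assumptions, not definitions taken from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Workload:
    # Processing requirement per task, in instructions per packet (assumed unit).
    demand: Dict[str, float]
    # Traffic on each edge (source task, destination task) in packets per second
    # (assumed unit). A source of None marks traffic entering from outside.
    traffic: Dict[Tuple[Optional[str], str], float]

    def arrival_rate(self, task: str) -> float:
        """Total packet rate flowing into a task over all incoming edges."""
        return sum(rate for (_, dst), rate in self.traffic.items() if dst == task)

    def required_mips(self) -> float:
        """Aggregate processing demand of the whole workload in MIPS."""
        total = sum(self.arrival_rate(t) * d for t, d in self.demand.items())
        return total / 1e6


# Hypothetical two-task workload: all packets visit T1, half continue to T2.
example = Workload(
    demand={"T1": 1000.0, "T2": 500.0},
    traffic={(None, "T1"): 100000.0, ("T1", "T2"): 50000.0},
)
```

Because the traffic changes over time, one such object would describe the workload at a single instant; a dynamic workload is then a sequence of these snapshots.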

Page 8: A Methodology for Evaluating Runtime Support in Network Processors

Outline

Introduction

Evaluation Methodology
• Dynamic Workload Model
• Runtime System Model

Result Summary

Page 9: A Methodology for Evaluating Runtime Support in Network Processors

Runtime System Model

Unified approach for all runtime systems
• Queuing networks
• Specific solution for each runtime system

• Runtime mapping: $M^t$, from tasks $T$ to processors $P$ at time $t$
• Graph: $S = (P, Q)$
• Packet arrival rate: $\lambda^t_{i,j}$
• Service time: $D(t_i, p_j)$

Metrics for all runtime systems
• Processor utilization: $\rho$
• Average number of packets in the system: $\bar{K}$

Page 10: A Methodology for Evaluating Runtime Support in Network Processors

Three Example Runtime Support Systems

System I: Ideal Allocation
System II: Full Processor Allocation
System III: Partitioned Application Allocation

[Figure: a workload of tasks T1 and T2 and its mapping under the three schemes: every processor runs T1 & T2 (Ideal Allocation); separate processors run either T1 or T2 (Full Processor Allocation); subtasks T1_1 to T1_4 and T2_1 to T2_4 are spread across the processors (Partitioned Application Allocation)]

Page 11: A Methodology for Evaluating Runtime Support in Network Processors

Example Evaluation Model – System I

Ideal Allocation
• All processors can process all packets completely
• Unrealistic, but provides a baseline

M/G/m FCFS single station

Page 12: A Methodology for Evaluating Runtime Support in Network Processors

M/G/m Single Station Queuing System

Cosmetatos approximation

\[
\bar{W}_{M/G/m} \approx c_B^2\,\bar{W}_{M/M/m} + (1 - c_B^2)\,\bar{W}_{M/D/m},
\]

where

\[
\bar{W}_{M/M/m} = \frac{P_m}{m\mu(1-\rho)}; \quad
P_m = \frac{(m\rho)^m}{m!\,(1-\rho)}\,\pi_0; \quad
\pi_0 = \left[\sum_{k=0}^{m-1}\frac{(m\rho)^k}{k!} + \frac{(m\rho)^m}{m!}\cdot\frac{1}{1-\rho}\right]^{-1},
\]

and

\[
\bar{W}_{M/D/m} = \frac{1}{2}\cdot\frac{1}{nc_{Dm}}\,\bar{W}_{M/M/m}; \quad
nc_{Dm} = \left(1 + (1-\rho)(m-1)\,\frac{\sqrt{4+5m}-2}{16\rho m}\right)^{-1}
\]

Evaluation metrics

\[
\rho = \frac{\lambda}{m\mu}; \quad \bar{K} = \lambda\,\bar{W}_{M/G/m} + m\rho
\]

G. Cosmetatos. Some Approximate Equilibrium Results for the Multiserver Queue (M/G/r). Operational Research Quarterly, pages 615-620, 1976

G. Bolch, S. Greiner, H. de Meer, and K. S. Trivedi. Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications. John Wiley & Sons, Inc., New York, NY, August 1998
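As a rough numerical sketch of the approximation on this slide, the Python below evaluates the mean waiting time, the utilization, and the mean number of packets for an M/G/m station. The function names and the argument `cb2` (the squared coefficient of variation of the service time, $c_B^2$) are illustrative choices rather than anything defined in the slides, and the formulas follow the standard textbook form of the Cosmetatos approximation.

```python
from math import factorial, sqrt


def erlang_c(m, rho):
    """Probability P_m that an arriving packet has to wait in an M/M/m queue."""
    a = m * rho
    tail = a**m / (factorial(m) * (1.0 - rho))
    pi0 = 1.0 / (sum(a**k / factorial(k) for k in range(m)) + tail)
    return tail * pi0


def mgm_metrics(lam, mu, m, cb2):
    """Return (rho, mean waiting time, mean number in system) for M/G/m.

    lam: packet arrival rate, mu: service rate of one processor,
    m: number of processors, cb2: squared coefficient of variation
    of the service time (1.0 gives M/M/m, 0.0 gives M/D/m).
    """
    rho = lam / (m * mu)
    if not 0.0 < rho < 1.0:
        raise ValueError("utilization must lie strictly between 0 and 1")
    w_mmm = erlang_c(m, rho) / (m * mu * (1.0 - rho))
    nc_dm = 1.0 / (1.0 + (1.0 - rho) * (m - 1) * (sqrt(4 + 5 * m) - 2) / (16 * rho * m))
    w_mdm = 0.5 * w_mmm / nc_dm
    w_mgm = cb2 * w_mmm + (1.0 - cb2) * w_mdm
    k_mean = lam * w_mgm + m * rho
    return rho, w_mgm, k_mean
```

With `m = 1` the approximation reduces to the exact M/M/1 result for `cb2 = 1.0` and to the exact M/D/1 result for `cb2 = 0.0`, which makes a convenient sanity check.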

Page 13: A Methodology for Evaluating Runtime Support in Network Processors

Example Evaluation Model – System II

Full Processor Allocation
• Allocate entire tasks to subsets of processors
• Allocate as few processors as possible to save power
• Each processor runs one type of task
• Reallocation is triggered by queue length

BCMP M/M/1-FCFS model (Jackson network)
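The "as few processors as possible" rule can be illustrated with a small Python sketch. It is a simplification under stated assumptions: it sizes each task from its offered load and a utilization cap (`max_util`, a hypothetical parameter), whereas the system described on the slide triggers reallocation from queue length. The function names and the packet-rate and instructions-per-packet inputs are likewise illustrative; the 100 MIPS and 16-processor defaults match the setup used later in the talk.

```python
from math import ceil


def min_processors(arrival_rate, instr_per_packet, mips=100.0, max_util=0.9):
    """Smallest processor count keeping per-processor utilization <= max_util.

    arrival_rate is in packets per second, instr_per_packet in instructions,
    mips is the speed of one processor. max_util is an assumed threshold.
    """
    demand_mips = arrival_rate * instr_per_packet / 1e6
    return ceil(demand_mips / (mips * max_util))


def allocate(tasks, total_processors=16, **kwargs):
    """Give every task its own processors; tasks maps name -> (rate, instructions)."""
    allocation = {
        name: min_processors(rate, instr, **kwargs)
        for name, (rate, instr) in tasks.items()
    }
    if sum(allocation.values()) > total_processors:
        raise ValueError("workload exceeds the available processors")
    return allocation
```

Because whole processors are the allocation unit, a small rise in traffic can push a task across a boundary and add a processor, which is one way to picture the coarse granularity of this scheme.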

Page 14: A Methodology for Evaluating Runtime Support in Network Processors

BCMP Network

BCMP: Baskett, Chandy, Muntz, and Palacios

Characteristics: open, closed, and mixed queuing networks; several job classes; four types of nodes: M/M/m-FCFS (class-independent service time), M/G/1-PS, M/G/∞-IS, and M/G/1-LCFS PR

Product-form steady-state solution:

\[
\pi(s_1,\ldots,s_N) = \frac{1}{G(K)}\,d(s)\prod_{i=1}^{N} f_i(s_i)
\]

Open M/M/1-FCFS BCMP queuing network:

\[
\pi(k_1,\ldots,k_N) = \prod_{i=1}^{N}\pi_i(k_i), \quad \pi_i(k_i) = (1-\rho_i)\,\rho_i^{k_i}
\]

• Evaluation metrics:

\[
\rho_i = \sum_{r=1}^{C}\rho_{ir}, \quad \rho_{ir} = \frac{\lambda_r e_{ir}}{\mu_i}; \quad
\bar{K}_i = \sum_{r=1}^{C}\bar{K}_{ir} = \frac{\rho_i}{1-\rho_i}
\]

F. Baskett, K. Chandy, R. Muntz, and F. Palacios. Open, Closed, and Mixed Networks of Queues with Different Classes of Customers. Journal of the ACM, 22(2): 248-260, April 1975
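For the open M/M/1-FCFS case, the product form makes each node behave like an independent M/M/1 queue, so the metrics are easy to compute. The Python sketch below assumes the per-node arrival rates have already been obtained (for example from the traffic equations); the function names and argument layout are illustrative, not taken from the slides.

```python
def jackson_metrics(arrival_rates, service_rates):
    """Per-node (utilization, mean number of packets) for open M/M/1 nodes."""
    metrics = []
    for lam, mu in zip(arrival_rates, service_rates):
        rho = lam / mu
        if rho >= 1.0:
            raise ValueError("node is overloaded (utilization >= 1)")
        metrics.append((rho, rho / (1.0 - rho)))
    return metrics


def state_probability(ks, rhos):
    """Steady-state probability of seeing ks[i] packets at node i."""
    p = 1.0
    for k, rho in zip(ks, rhos):
        p *= (1.0 - rho) * rho**k
    return p


# Hypothetical two-processor example.
nodes = jackson_metrics([0.5, 0.25], [1.0, 1.0])
total_packets = sum(k for _, k in nodes)
```

Summing the per-node means gives the average number of packets in the whole system, the second metric used to compare the runtime systems.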

Page 15: A Methodology for Evaluating Runtime Support in Network Processors

Example Evaluation Model – System III

Partitioned Application Allocation
• Tasks are partitioned across multiple processors
• Synchronized pipelines
• Allocate tasks equally across all processors to maximize throughput
• Reallocate at fixed time intervals

Equations for evaluation metrics are the same as for System II.

BCMP M/M/1-FCFS model (Jackson network)

Page 16: A Methodology for Evaluating Runtime Support in Network Processors

Outline

Introduction

Evaluation Methodology
• Dynamic Workload Model
• Runtime System Model

Result Summary

Page 17: A Methodology for Evaluating Runtime Support in Network Processors

Setup

System
• 16 processing engines, 100 MIPS each
• Queue lengths are infinite

Workload

Other assumptions
• Applications are partitioned into 7-15 subtasks

Page 18: A Methodology for Evaluating Runtime Support in Network Processors

Processor Allocation Over Time

Ideal:
• 16 processors

Full Processor:
• Changes with traffic

Partitioned Application:
• 16 processors

[Figure: processor allocation over time for the full processor allocation system]

Page 19: A Methodology for Evaluating Runtime Support in Network Processors

Processor Utilization Over Time

Ideal:
• Lowest processor utilization

Full Processor:
• Highest processor utilization because it uses fewer processors

Partitioned Application:
• Low processor utilization
• Not equal to the ideal case due to unbalanced task allocation and pipeline overhead

Page 20: A Methodology for Evaluating Runtime Support in Network Processors

Packets in System Over Time

Ideal:
• Fewest packets

Full Processor:
• Packets queue up due to its high processor utilization

Partitioned Application:
• Most packets, due to unbalanced task allocation and pipeline overhead
• More stable performance because of finer processor allocation granularity

Page 21: A Methodology for Evaluating Runtime Support in Network Processors

Performance for Different Data Rates

Ideal:
• Smooth increase

Full Processor:
• Periodic peaks

Partitioned Application:
• Smooth increase

The maximum data rate supported by the systems
• Ideal: 100%
• Full Processor: 79.6%
• Partitioned Application: 75.1%

Page 22: A Methodology for Evaluating Runtime Support in Network Processors

Implication of the Results

Ideal Allocation
• Provides a baseline

Full Processor Allocation
• Allocates as few processors as possible to save power
• Uses an entire processor as the allocation granularity
• Good: High processor utilization
• Bad: High performance variance

Partitioned Application Allocation
• Distributes tasks equally across all processors
• Finer processor allocation granularity
• Good: Stable performance
• Bad: Difficult to get an optimized solution => pipeline synchronization overhead

Page 23: A Methodology for Evaluating Runtime Support in Network Processors

Summary

Analytical methodology for evaluating different runtime support systems for NPs

Dynamic workload model and runtime system model

Results: 3 example runtime support systems
• Quantitative metrics
• Tradeoffs

Page 24: A Methodology for Evaluating Runtime Support in Network Processors

Questions?