Chapter 4: Types of Workloads
M. Keshtgary, Spring 91
Slide 2
Types of Workloads
- Test workload: any workload used in a performance study
- Real workload: one observed on a system while it is being used
  - Cannot be repeated (easily)
  - May not even exist (proposed system)
  - Generally not suitable for use as a test workload
- Synthetic workload: has characteristics similar to a real workload
  - Can be applied in a repeated manner
  - Relatively easy to port
  - Relatively easy to modify without affecting operation
  - No large real-world data files; no sensitive data
  - May have built-in measurement capabilities
- Benchmark = workload
- Benchmarking is the process of comparing two or more systems using workloads
Slide 3
Test Workloads for Computer Systems
- Addition instructions
- Instruction mixes
- Kernels
- Synthetic programs
- Application benchmarks
Slide 4
Addition Instructions
- Early computers had the CPU as the most expensive component
- System performance = processor performance
- CPUs supported few operations; the most frequent one was addition
- A computer with a faster addition instruction performed better
- Run many addition operations as the test workload
- Problem
  - Real programs use more operations, not only addition
  - Some operations are more complicated than others
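A minimal sketch of an addition-only test workload, assuming a simple timed loop is an acceptable stand-in for the historical measurements. The function name `time_additions` and the loop count are illustrative choices, not part of the original slides. In an interpreted language the loop overhead dominates the cost of the addition itself, which illustrates why this workload says little about a modern system.

```python
import time


def time_additions(n: int) -> float:
    """Return the elapsed seconds for n integer additions in a simple loop."""
    total = 0
    start = time.perf_counter()
    for i in range(n):
        total += i  # the only "instruction" this workload exercises
    elapsed = time.perf_counter() - start
    # Use the result so the loop cannot be considered dead code.
    assert total == n * (n - 1) // 2
    return elapsed


if __name__ == "__main__":
    n = 1_000_000  # hypothetical workload size
    elapsed = time_additions(n)
    print(f"{n} additions in {elapsed:.4f} s")
```

Running the script on two machines and comparing the elapsed times reproduces the comparison the slide describes, with the limitations listed under "Problem".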
Slide 5
Instruction Mixes
- Number and complexity of instructions increased
- Additions were no longer sufficient
- Could measure instructions individually, but they are used in different amounts
  => Measure the relative frequencies of various instructions on real systems
- Use the frequencies as weighting factors to get an average instruction time
- Instruction mix: a specification of various instructions coupled with their usage frequencies
- Use the average instruction time to compare different processors
- Often use the inverse of the average instruction time
  - MIPS: Millions of Instructions Per Second
  - MFLOPS: Millions of Floating-Point Operations Per Second
- Gibson mix: developed by Jack C. Gibson in 1959 for IBM 704 systems
Slide 6
Example: Gibson Instruction Mix

 No.  Instruction type                  Percent
  1.  Load and Store                       31.2
  2.  Fixed-Point Add/Sub                   6.1
  3.  Compares                              3.8
  4.  Branches                             16.6
  5.  Float Add/Sub                         6.9
  6.  Float Multiply                        3.8
  7.  Float Divide                          1.5
  8.  Fixed-Point Multiply                  0.6
  9.  Fixed-Point Divide                    0.2
 10.  Shifting                              4.4
 11.  Logical And/Or                        1.6
 12.  Instructions not using registers      5.3
 13.  Indexing                             18.0
      Total                               100.0

1959; IBM 650, IBM 704
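The weighting described for instruction mixes can be sketched as follows. The weights are the Gibson percentages; Load and Store is taken as 31.2, the value for which the thirteen weights sum to 100. The per-instruction times are hypothetical placeholders, not measurements of any real processor, and the names `average_instruction_time` and `mips` are illustrative.

```python
# Gibson mix weights in percent. Load/Store is taken as 31.2, the value
# for which the thirteen weights sum to 100.
GIBSON_MIX = {
    "load_store": 31.2,
    "fixed_add_sub": 6.1,
    "compare": 3.8,
    "branch": 16.6,
    "float_add_sub": 6.9,
    "float_multiply": 3.8,
    "float_divide": 1.5,
    "fixed_multiply": 0.6,
    "fixed_divide": 0.2,
    "shift": 4.4,
    "logical": 1.6,
    "no_register": 5.3,
    "indexing": 18.0,
}


def average_instruction_time(mix, times_us):
    """Weighted average instruction time in microseconds."""
    total_weight = sum(mix.values())
    return sum(mix[name] * times_us[name] for name in mix) / total_weight


def mips(avg_time_us):
    """Millions of instructions per second: inverse of the average time in microseconds."""
    return 1.0 / avg_time_us


if __name__ == "__main__":
    # HYPOTHETICAL instruction times in microseconds, for illustration only.
    times_us = {name: 1.0 for name in GIBSON_MIX}
    times_us.update({"float_multiply": 4.0, "float_divide": 10.0, "fixed_divide": 8.0})
    avg = average_instruction_time(GIBSON_MIX, times_us)
    print(f"average instruction time: {avg:.3f} us, rating: {mips(avg):.3f} MIPS")
```

Two processors can then be compared by supplying each one's instruction times and comparing the resulting averages or their inverses.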
Slide 7
Problems with Instruction Mixes
- In modern systems, instruction time is variable, depending upon
  - Addressing modes, cache hit rates, pipelining
  - Interference with other devices during processor-memory access
  - Distribution of zeros in the multiplier
  - Number of times a conditional branch is taken
- Only represents the speed of the processor
  - The bottleneck may be in other parts of the system
Slide 8
Kernels
- Pipelining, caching, and address translation made computer instruction times highly variable
  - Therefore we cannot use individual instructions in isolation
- Instead, it became more appropriate to consider a set of instructions that constitutes a higher-level function, a service provided by the processor
- Since most of the initial kernels did not use input/output (I/O) devices and concentrated solely on processor performance, this class of kernels could be called processing kernels
- Kernel = the most frequent function
- Commonly used kernels: Tree Searching, Matrix Inversion, and Sorting
- Disadvantages
  - Do not make use of I/O devices
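As an illustration of a processing kernel, the sketch below times a sorting kernel on in-memory data. It assumes Python's built-in sort is an acceptable stand-in for whatever sorting routine a real kernel benchmark would use. The names `sorting_kernel` and `run_kernel`, the input size, and the repetition count are all hypothetical.

```python
import random
import time


def sorting_kernel(n: int, seed: int = 0) -> float:
    """Sort n pseudo-random floats in memory and return the elapsed seconds."""
    rng = random.Random(seed)  # fixed seed keeps the workload repeatable
    data = [rng.random() for _ in range(n)]
    start = time.perf_counter()
    data.sort()  # the kernel: a higher-level function, with no I/O involved
    elapsed = time.perf_counter() - start
    # Verify the result outside the timed region.
    assert all(data[i] <= data[i + 1] for i in range(n - 1))
    return elapsed


def run_kernel(n: int, repetitions: int = 5) -> float:
    """Return the best (minimum) time over several repetitions of the kernel."""
    return min(sorting_kernel(n, seed=r) for r in range(repetitions))


if __name__ == "__main__":
    print(f"sorting kernel, best of 5: {run_kernel(100_000):.4f} s")
```

Because the data never leaves memory, the measurement reflects processor and memory behavior only, which is the disadvantage the slide notes.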
Slide 9
Synthetic Programs
- Computer systems proliferated, operating systems emerged, and applications changed
- No more processing-only applications; I/O became important too
- Use simple exerciser loops
  - Make a number of service calls or I/O requests
  - Compute the average CPU time and elapsed time for each service call
  - Easy to port and distribute (Fortran, Pascal)
- First exerciser loop by Buchholz (1969)
  - He called it a synthetic program
- May have built-in measurement capabilities
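A minimal sketch of an exerciser loop with built-in measurement, assuming file writes and reads in a temporary directory are a reasonable example of I/O requests. This is not Buchholz's original program; the function name `exerciser_loop`, the request count, and the block size are hypothetical parameters chosen for illustration.

```python
import os
import tempfile
import time


def exerciser_loop(num_requests: int = 200, block_size: int = 4096):
    """Issue write/read request pairs and return the average
    (CPU seconds, elapsed seconds) per request pair."""
    block = b"x" * block_size
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "exerciser.dat")
        cpu_start = time.process_time()    # CPU time used by this process
        wall_start = time.perf_counter()   # elapsed (wall-clock) time
        for _ in range(num_requests):
            with open(path, "wb") as f:
                f.write(block)
                f.flush()
                os.fsync(f.fileno())       # ask the OS to push the data to the device
            with open(path, "rb") as f:
                f.read()
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
    return cpu / num_requests, wall / num_requests


if __name__ == "__main__":
    cpu_avg, wall_avg = exerciser_loop()
    print(f"average CPU time per request pair:     {cpu_avg * 1e6:.1f} us")
    print(f"average elapsed time per request pair: {wall_avg * 1e6:.1f} us")
```

The gap between the elapsed time and the CPU time per request gives a rough indication of time spent waiting on I/O. Repeated reads of the same small file are likely served from the operating system's cache, so the disk reference pattern is not representative of a real application.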
Slide 10
Synthetic Programs
- Advantages
  - Quickly developed and given to different vendors
  - No real data files
  - Easily modified and ported to different systems
  - Have built-in measurement capabilities
  - Measurement process is automated
  - Repeated easily on successive versions of the operating system
- Disadvantages
  - Too small
  - Do not make representative memory or disk references
  - Mechanisms for page faults and disk cache may not be adequately exercised
  - CPU-I/O overlap may not be representative
  - Not suitable for multi-user environments because loops may create synchronizations, which may result in better or worse performance
Slide 11
Application Workloads
- For special-purpose systems, may be able to run representative applications as a measure of performance
  - E.g., airline reservation
  - E.g., banking
- Make use of the entire system (I/O, etc.)
- Issues may be
  - Input parameters
  - Multi-user operation
- Only applicable when specific applications are targeted
- For a particular industry: Debit-Credit for banks
Slide 12
Benchmarks
- Benchmark = workload
- Kernels, synthetic programs, and application-level workloads are all called benchmarks
- Instruction mixes are not called benchmarks
- Some authors try to restrict the term benchmark to a set of programs taken from real workloads
- Benchmarking is the process of performance comparison of two or more systems by measurements
- Workloads used in the measurements are called benchmarks
Slide 13
SPEC
- Systems Performance Evaluation Cooperative (SPEC), now the Standard Performance Evaluation Corporation (http://www.spec.org)
- Non-profit, founded in 1988 by leading HW and SW vendors
- Aim: ensure that the marketplace has a fair and useful set of metrics to differentiate candidate systems
- Product: fair, impartial, and meaningful benchmarks for computers
- Initially, focus on CPUs: SPEC89, SPEC92, SPEC95, SPEC CPU 2000, SPEC CPU 2006
- Now, many suites are available
- Results are published on the SPEC web site
Slide 14
SPEC (contd)
- Benchmarks aim to test "real-life" situations
- E.g., SPECweb2005 tests web server performance by performing various types of parallel HTTP requests
- E.g., SPEC CPU tests CPU performance by measuring the run time of several programs, such as the compiler gcc