Benchmarking in IT Brand names Fiduciary Forum 2008 Washington DC, March 2008.


Agenda

Background
Benchmarks
Benchmarking personal computers
Benchmarking servers
Examples
Conclusions

Background

Brand naming (WB GL 2.20, Guidance Note No. 4)

AMD complaints

EU opened an inquiry (October 2004) into whether France, the Netherlands, Sweden, and Finland illegally favor the processor maker Intel

The U.S. Office of Management and Budget reinforced (April 2005) to all federal purchasers that brand-name specifications are forbidden and that specifications should not be associated with a single manufacturer

Background

Five EU member states have issued guidelines suggesting the use of benchmarks instead of trademarks and technical features such as clock rate; more guidelines are being developed

The U.S. OMB recommended that, rather than issuing brand-name specifications for microprocessors, agencies should either: 1) articulate a benchmark for performance; or 2) specify the requirements for applications and interoperability. Benchmarks for microprocessors can be specific to functions such as Internet content creation, office applications, or mail servers. Benchmarks may also measure the overall performance of computers.

Typical processor and main-board specifications

Frequency – 3.0 GHz
Cache – 1 MB
FSB – 400 MHz
Chipset – Intel 915

Processor frequency issue: BAPCo SYSmark 2004 SE results

Intel Pentium 4, 3.2 GHz / 512 kB L2 / 800: 186
AMD Athlon 64, 2.2 GHz / 1 MB L2 / HT: 185
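The two results above show why clock rate is a poor proxy for performance. A minimal sketch in Python, using only the two scores and clock rates quoted on the slide; the dictionary layout and the function name score_per_ghz are illustrative assumptions, not part of any BAPCo tool:

```python
# Scores and clock rates as quoted on the slide (BAPCo SYSmark 2004 SE).
results = {
    "Intel Pentium 4": {"clock_ghz": 3.2, "score": 186},
    "AMD Athlon 64": {"clock_ghz": 2.2, "score": 185},
}


def score_per_ghz(entry):
    """Benchmark points delivered per GHz of clock rate (illustrative metric)."""
    return entry["score"] / entry["clock_ghz"]


for name, entry in results.items():
    print(f"{name}: {entry['score']} points at {entry['clock_ghz']} GHz "
          f"({score_per_ghz(entry):.1f} points per GHz)")
```

The scores differ by less than 1% while the clock rates differ by a full gigahertz, so a specification that fixes a minimum frequency would exclude one of two processors that perform almost identically on this benchmark.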

Typical PC specs

CPU, main board, memory, disk, graphics adapter, modem/LAN adapter, free slots, interfaces (serial, parallel, USB, etc.), monitor, keyboard/mouse, OS, case, power supply

Benchmarks

Each benchmark tries to answer the question: “What computer should I buy?”

Clearly, the answer to the question is “The system that does the job with the lowest cost-of-ownership”.

Cost-of-ownership includes project risks, programming costs, operations costs, hardware costs, and software costs. It’s difficult to quantify project risks, programming costs, and operations costs. In contrast, computer performance can be quantified and compared.

Benchmarks

Domain specific
No single metric possible
The more general the benchmark, the less useful it is for anything in particular.
A benchmark is a distillation of the essential attributes of a workload

Benchmarking process

BAPCo Consortium

BAPCo (Business Applications Performance Corporation) is a non-profit consortium whose charter is to develop and distribute a set of objective performance benchmarks based on popular computer applications and industry-standard operating systems.

BAPCo members as of September 29, 2005

Source: www.bapco.com on September 29, 2005

SYSmark 2004 SE concept

Identification of business usage categories of Personal Computers, followed by determination of the types and characteristics of the output created by users in those categories.

These interactions are converted into instructions (or “scripts”) and integrated into BAPCo’s automated benchmarking environment resulting in candidate workloads for final placement in the benchmark suite in order to arrive at a balanced workload.

A key participant in the development process is the application expert provided by member companies. These application experts have at least five years of professional experience working with the applications.

Identifying Usage Categories

For SYSmark 2004 SE, BAPCo identified two distinct business usage categories:

Internet Content Creation: Tasks for creation of content for a website with an enhanced user experience: web pages with text, images, video and animations.

Office Productivity: Tasks common to business users: processing email, preparing documents and presentations, data management and data analysis.

User Output

For SYSmark 2004 SE, the following types of output were identified:

Internet Content Creation: Digital Images, Digital Video, Animation, Encoded Media, Web pages and 3-D Rendered Images.

Office Productivity: Text Documents, Spreadsheets, Presentations, Emails, Databases, Transcribed documents, Virus Free documents, Compressed Files, Browsed Files and Portable Documents.

SYSmark 2004 SE

The fundamental performance unit in SYSmark 2004 SE is “Response Time”. Response time, in the context of SYSmark 2004 SE, is defined as the time it takes the computer to complete a task that has been initiated by the automated script.

SYSmark 2004 SE adds the individual response times of all operations within a group (e.g. 2D creation) and uses the total response time to compare the respective groups on two systems (using the calibration system as the base).
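The comparison described above can be sketched in a few lines of Python. This is an illustration only: the response times are hypothetical, and the scaling used here (the calibration system is assigned a rating of 100) is an assumption made for the example rather than BAPCo's published formula.

```python
def group_total(response_times):
    """Sum the individual response times (in seconds) of all operations in a group."""
    return sum(response_times)


def relative_rating(test_times, calibration_times, base=100.0):
    """Compare a group's total response time on a test system with the
    calibration system. Assumption: the calibration system is rated `base`,
    and a shorter total response time gives a proportionally higher rating."""
    return base * group_total(calibration_times) / group_total(test_times)


# Hypothetical response times in seconds for one group (e.g. 2D creation).
calibration = [12.0, 8.0, 20.0]  # total 40 s
test_system = [6.0, 4.0, 10.0]   # total 20 s

print(relative_rating(test_system, calibration))  # 200.0
```

Because only the group totals enter the comparison, a system that is slow on one operation can still rate well if it is fast enough on the others.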

Benchmarking examples

1. Before: Intel Pentium IV, 3.0 GHz, 800 MHz FSB, 1 MB cache

After: x86 microprocessor with a performance giving a minimum score of 193 under the benchmark SYSmark 2004 rating

2. Before: Intel Pentium 4, 3 GHz or equivalent

After: x86 microprocessor with the following performance scores:

between 165 and 205 under the SYSmark 2004 overall office productivity benchmark

between 200 and 235 under the SYSmark 2004 overall internet content creation benchmark

between 180 and 220 under the SYSmark 2004 rating
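A specification written as score ranges, as in example 2 above, can be checked mechanically against the scores a bidder reports. The sketch below assumes the three ranges from that example; the key names, the function failed_requirements and the two offers are hypothetical and serve only to illustrate the check.

```python
# Score ranges from example 2; the key names are illustrative.
SPEC_RANGES = {
    "office_productivity": (165, 205),
    "internet_content_creation": (200, 235),
    "overall_rating": (180, 220),
}


def failed_requirements(offered_scores, ranges=SPEC_RANGES):
    """Return the names of the requirements that an offer does not satisfy.
    A missing score counts as a failure."""
    failures = []
    for name, (low, high) in ranges.items():
        score = offered_scores.get(name)
        if score is None or not (low <= score <= high):
            failures.append(name)
    return failures


# Hypothetical offers; the scores are made up for illustration.
offer_a = {"office_productivity": 180, "internet_content_creation": 210,
           "overall_rating": 195}
offer_b = {"office_productivity": 150, "internet_content_creation": 210,
           "overall_rating": 178}

print(failed_requirements(offer_a))  # []
print(failed_requirements(offer_b))  # ['office_productivity', 'overall_rating']
```

The check never refers to a manufacturer or a clock rate, which is the point of replacing a brand-name specification with a benchmark-based one.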

Client Benchmark Summary

Choose benchmarks that measure relevant usage models
Desktop: SYSmark 2004 SE (productivity & content creation), SYSmark 2007 Preview
Mobile: MobileMark 2005 (mobile performance & battery life), MobileMark 2007
Consider cost of implementation
Use benchmarks from industry consortia
Use of benchmarks doesn’t necessarily solve all problems

Benchmarking servers

More complex issue
Selection of benchmarking entities: SPEC, TPC, SAP, Oracle

SPEC

The Standard Performance Evaluation Corporation (SPEC) is a non-profit corporation formed to establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers. SPEC develops benchmark suites and also reviews and publishes submitted results from its member organizations and other benchmark licensees.

http://www.spec.org/

SPEC members and associates

SPEC Members: 3DLabs * Acer Inc. * Advanced Micro Devices * Apple Computer, Inc. * ATI Research * Azul Systems, Inc. * BEA Systems * Borland * Bull S.A. * CommuniGate Systems * Dell * EMC * Exanet * Fabric7 Systems, Inc. * Freescale Semiconductor, Inc. * Fujitsu Limited * Fujitsu Siemens * Hewlett-Packard * Hitachi Data Systems * Hitachi Ltd. * IBM * Intel * ION Computer Systems * JBoss * Microsoft * Mirapoint * NEC - Japan * Network Appliance * Novell * NVIDIA * Openwave Systems * Oracle * P.A. Semi * Panasas * PathScale * The Portland Group * S3 Graphics Co., Ltd. * SAP AG * SGI * Sun Microsystems * Super Micro Computer, Inc. * Sybase * Symantec Corporation * Unisys * Verisign * Zeus Technology

SPEC Associates: California Institute of Technology * Center for Scientific Computing (CSC) * Defence Science and Technology Organisation - Stirling * Duke University * JAIST * Kyushu University * Leibniz Rechenzentrum - Germany * National University of Singapore * New South Wales Department of Education and Training * Purdue University * Queen's University * Rightmark * Stanford University * Technical University of Darmstadt * Texas A&M University * Tsinghua University * University of Aizu - Japan * University of California - Berkeley * University of Central Florida * University of Illinois - NCSA * University of Maryland * University of Modena * University of Nebraska, Lincoln * University of New Mexico * University of Pavia * University of Stuttgart * University of Texas at Austin * University of Texas at El Paso * University of Tsukuba * University of Waterloo * VA Austin Automation Center

SPEC Supporting Members: EP Network Storage Performance Lab * SuSE Linux AG

SPEC benchmarks

CPU
Graphics/Applications
HPC/OMP
Java Client/Server
Mail Servers
Network File System
Web Servers

SPEC CPU benchmark

SPEC CPU2000 V1.3

Technology evolves at a breakneck pace. SPEC CPU2000 is the next-generation industry-standardized CPU-intensive benchmark suite. SPEC designed CPU2000 to provide a comparative measure of compute-intensive performance across the widest practical range of hardware.

The implementation resulted in source code benchmarks developed from real user applications.

These benchmarks measure the performance of the processor, memory and compiler on the tested system.

Typical excuses from PIUs

We do not know what it is
It is too complex to use
It is technically unjustified - our experts have different views
We use the clause “or equivalent performance” (… but then reject anything but Intel)
Intel-based computers are more stable (whatever that means)

And finally: Where is it written in Bank documents that benchmarks must be used?

Bank activities

AMD presentation (June 2005)
Intel presentation (September 2005)
White Paper, November 2007: Technical Specifications in the Public Procurement of Computers
IT thematic group discussions
Suppliers’ comments
WB Technical guidance (?)
