Fast Benchmark
Michele Michelotto – INFN Padova
Manfred Alef – GridKa Karlsruhe
Fast Benchmark
Request mainly from the WLCG community, via the machine/job task force, to recommend a fast benchmark to estimate the performance of the provided job slots, since some sites don't disclose performance scores or hardware details.
Requirements that are clear: open source; easy to run; fast (a few minutes); small, no download (apart from the first download).
Requirements that are not clear: reproducible? Reliable? Single core or multicore?
Use cases: run every time we land on a queue/VM/cloud machine? Run to sample the resources available? Run to cross-check whether the declared HS06 scores are reliable?
An example with Geant4
Thanks to G. Cosmo and A. Dotti. Based on Geant4.
Runs on Linux x86-64 and ARM.
Realistic description of the detector geometry; footprint 1/3 to 1/4 of a real experiment.
No digitization, no analysis. CPU bound, no I/O.
Download a bootstrap.sh script from CERN.
Running the script downloads the rest of the program and compiles it (5 to 10 minutes).
Then run: ./run.sh <numThreads> <numEvents>
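A minimal sketch of how the run step could be timed from outside, assuming run.sh sits in the current directory once the bootstrap has finished. The helper names (benchmark_command, time_command) are illustrative and not part of the benchmark; only the ./run.sh <numThreads> <numEvents> invocation comes from the slide.

```python
import subprocess
import time


def benchmark_command(num_threads, num_events, script="./run.sh"):
    """Build the argument list for: ./run.sh <numThreads> <numEvents>."""
    return [script, str(num_threads), str(num_events)]


def time_command(cmd, num_events):
    """Run cmd and return (wall_seconds, seconds_per_event).

    Wall clock time is measured around the whole process, so it includes
    start-up and initialisation as well as the event loop.
    """
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises if the command fails
    wall = time.perf_counter() - start
    return wall, wall / num_events
```

For example, time_command(benchmark_command(4, 1000), 1000) would time a hypothetical 4-thread, 1000-event run. Because start-up is included, short runs will overstate the seconds/event figure.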
Single core
[Figures: Events/second; Seconds/Event; wall clock Seconds/Event]
Multicore
We have 32 logical CPUs.
I'm forced to use the wall clock time from the shell instead of the computed Real Time.
Now it takes more time to reach a steady number.
[Figure: Seconds/event vs Wall Clock Time (minutes); series: single thread, 4 threads, 16 threads, 32 threads]
Variance
Xeon E5-2660, 16 cores / 32 logical CPUs.
16 threads in parallel, 10K events, about 20 minutes.
Average wall clock time: 1077 seconds (Stdev.S = 58).
Average user time: 16498 seconds (Stdev.S = 976).
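To put the two spreads on a common scale, the quoted averages and standard deviations can be turned into relative spreads (standard deviation divided by mean). This is a small sketch using only the slide's figures; it assumes the wall clock average is in seconds and that Stdev.S is the spreadsheet's sample standard deviation.

```python
def relative_spread(mean, stdev):
    """Coefficient of variation: standard deviation divided by the mean."""
    return stdev / mean


# Figures quoted on the slide (16 threads, 10K events).
wall_mean, wall_stdev = 1077.0, 58.0    # wall clock time, assumed seconds
user_mean, user_stdev = 16498.0, 976.0  # user time, seconds

wall_cv = relative_spread(wall_mean, wall_stdev)
user_cv = relative_spread(user_mean, user_stdev)

print(f"wall clock spread: {wall_cv:.1%}")  # 5.4%
print(f"user time spread:  {user_cv:.1%}")  # 5.9%
```

Both come out at roughly 5 to 6 percent of the mean.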
[Figures: Wall Clock Time; User Time]
User time / Wall Clock time < 16
[Figure: User/Wall Clock]
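As a rough cross-check of the "< 16" statement, the ratio can be computed from the averages quoted on the Variance slide (user time 16498 s, wall clock time 1077 s, 16 threads). This is only a sketch: the ratio of the averages is not the same as the per-run ratios shown in the plot, and the function name is illustrative.

```python
def effective_parallelism(user_seconds, wall_seconds):
    """User CPU time per second of wall clock time.

    For a CPU-bound job this approximates how many threads
    were kept busy on average.
    """
    return user_seconds / wall_seconds


threads = 16
ratio = effective_parallelism(16498.0, 1077.0)
efficiency = ratio / threads

print(f"user/wall = {ratio:.2f}")                   # 15.32
print(f"share of {threads} threads busy = {efficiency:.1%}")  # 95.7%
```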
LHCb fast benchmark
New contact with P. Charpentier (LHCb), provided by Manfred Alef.
Manfred is investigating this tool.
HS14 update from Manfred Alef
HS06 is based on the widely used, industry-standard SPEC CPU2006.
SPEC ships well-tested tools on several architectures, professionally maintained.
Very stable: 3 minor versions in 8 years.
Hardware vendors and the technical press are familiar with it.
Widely adopted in Grid, WLCG and also other scientific communities.
Next version coming soon
Benchmark tests need to be revised to reflect improvements in hardware.
SPEC is working on the next revision of its CPU-intensive benchmark suite, currently designated CPUv6: after the original specmark, SPEC92, SPEC CPU95, SPEC CPU2000 and SPEC CPU2006, this will be the 6th version.
KIT is a SPEC OSG associate and has CPUv6 in beta (closed source, no permission to redistribute).
GridKa will provide a config file to run the benchmark on SL and GNU.
CPUv6 runs with the SL6 default compiler gcc-4.4.7, but not all the tests compile; however, SL7 is coming with gcc-4.8.2.
Using gcc-4.9.0, all the tests compile.