Why are the biggest supercomputers so challenging?
Balancing Software Engineering
Katherine Riley August 14, 2013
Define a supercomputer
“A supercomputer is a computer at the frontline of contemporary processing capacity – particularly speed of calculation which can happen at speeds of nanoseconds” - Wikipedia
¤ But not all supercomputers are as ‘frontline’ as others
¤ This makes the ecosystem … challenging.
Top 500 Comment
[Figure: TOP500 statistics chart from http://www.top500.org/statistics/efficiency-power-cores/ (captured 8/14/14), June 2014 release. Rmax in GFlop/s, axis ticks from 100,000 to 100,000,000, plotted over the range 0 to 500, categorized by cores per socket (4, 6, 8, 10, 12, 16).]
[Figure: TOP500 statistics chart, June 2014 release. Rmax in GFlop/s, axis ticks from 100,000 to 100,000,000, plotted over the range 0 to 50, categorized by cores per socket (4, 6, 8, 10, 12, 16).]
Top few vs Top 500
¤ There are not many huge supercomputers
¤ The biggest supercomputers are bleeding edge
  ¥ Hardware is sort of hardened in install
  ¥ Software is fleshed out over the first year or two
  ¥ What I consider normal, many might consider HOSTILE
¤ Why is that?
  ¥ Reputation
  ¥ Move science forward
    ¡ Machines are targeted to a small number of users
  ¥ Help steer technology
How is a big DOE supercomputer purchased?
[Diagram: procurement timeline. Labels, in order of appearance:
Science $; Call for Designs; Choice (Science, Facility, Budget), with Operations, Politics, Science, and $ as inputs;
Earliest Hardware Delivery (Science, Ops, Vendor); Shake it out at scale (Science, Ops, Vendor);
Collaboration (Science, Ops, Vendor) around Usability, Balance, Capability, Future;
Here be Early science]
¤ All over 5 years
¤ We start the next one before the current one is fully “on the floor”
Collaboration does not yield perfection
¤ Even with all these iterations with the vendor, the machine will not be perfect
[Diagram: four competing factors: Applications, Budget, Technology, Future Path]
(Not perfectly weighted)
What does this mean for your SW ecosystem?
¤ Bleeding edge systems are their own fun
¤ The system is available for production REALLY near to its first construction
  ¥ Early science buffers this, but not completely
  ¥ It takes time to move software to a new architecture at a new scale
  ¥ Even earliest access won’t really resolve this
  ¥ Vendor does not have a significant leg up
¤ As SW on a machine stabilizes, the machine reaches end of life
¤ The available software will be bare bones
¤ Complicated requirements can slow you down
¤ Facilities try very, very hard to expand the available software as fast as possible
Why Engineering Principles are Crucial
¤ Testing of your software is crucial
  ¥ You might have a better process than library developers & vendors
  ¥ Defense against other people’s bugs
¤ Versioning
¤ Abstraction of functionality
  ¥ To address the speed of change
  ¥ Help debug problems
  ¥ Help with performance
¤ Document to help us help you
¤ Running on the biggest systems is a competitive advantage
  ¥ Own all of your code. Even libraries.
¤ Let the compiler do the work, within limits
¤ Use high-level languages with care
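One way to read the testing and abstraction points together is sketched below in Python. This is an illustration only, not something shown in the talk: every name here (`reference_dot`, `fast_dot`, `set_backend`, `dot`, `check_backend`) is hypothetical, and `fast_dot` merely stands in for a tuned vendor or library routine. Application code calls `dot` through a thin layer, and a slow but obviously correct reference serves as the oracle for checking whichever backend is plugged in.

```python
import math
from typing import Callable, Sequence

# Hypothetical abstraction layer: application code calls `dot`, never a
# backend directly, so a library routine can be swapped in one place.
Backend = Callable[[Sequence[float], Sequence[float]], float]


def reference_dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Slow, obviously correct implementation used as the test oracle."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.fsum(a * b for a, b in zip(x, y))


def fast_dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Stand-in for a tuned third-party routine (an assumption of this sketch)."""
    return sum(a * b for a, b in zip(x, y))


_backend: Backend = fast_dot


def set_backend(backend: Backend) -> None:
    """Select the implementation that `dot` forwards to."""
    global _backend
    _backend = backend


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """The only entry point the rest of the application uses."""
    return _backend(x, y)


def check_backend(backend: Backend, rel_tol: float = 1e-12) -> bool:
    """Compare a backend against the reference on small known cases."""
    cases = [
        ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
        ([0.5, -0.5], [2.0, 2.0]),
        ([], []),
    ]
    return all(
        math.isclose(backend(x, y), reference_dot(x, y),
                     rel_tol=rel_tol, abs_tol=1e-12)
        for x, y in cases
    )
```

Running `check_backend` on a new machine, or after a library upgrade, before any production run is one cheap defense against other people's bugs; if the check fails, `set_backend(reference_dot)` keeps the application working, slowly, while the problem is reported.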
Questions