High-Performance Computing in Germany: Structures, Strictures, Strategies
F. Hossfeld
John von Neumann Institute for Computing (NIC)
Central Institute for Applied Mathematics (ZAM)
Research Centre Juelich, Juelich, Germany


  • Slide 1
  • High-Performance Computing in Germany: Structures, Strictures, Strategies F. Hossfeld John von Neumann Institute for Computing (NIC) Central Institute for Applied Mathematics (ZAM) Research Centre Juelich Juelich - Germany
  • Slide 2
  • Foundation of the First Supercomputer Centre HLRZ
    - 1979: ZAM Proposal of an HPC Centre in Juelich
    - 1983: Supercomputing in Juelich: CRAY X-MP/24
    - 1985: Physics Initiative for a German HPC Centre
    - 1985: Commission installed by the BMBF
    - 1986: Recommendation to found the HPC Centre HLRZ in Juelich (with a GMD Branch: Test Lab)
    - 1986: CRAY X-MP/416 at ZAM
    - 1987: Cooperation Contract signed
    - Partners: DESY Hamburg, GMD Birlinghoven, Research Centre Juelich
    - Central Institute for Applied Mathematics: HPC Centre
    - Three Competence Groups: Many-Particle Systems (Research Centre Juelich), QCD (DESY), Visualization / later: BioInformatics (GMD)
  • Slide 3
  • Slide 4
  • Foundation of the John von Neumann Institute for Computing (NIC)
    - 1998: GMD left HLRZ in March 1998 (GMD was then negotiating, under some tension, the conditions of its merger with the Fraunhofer Society)
    - 1998: Research Centre Juelich and DESY restructured HLRZ to found NIC
    - 1998: Cooperation Contract signed in July 1998
    - Central Institute for Applied Mathematics (as the main HPC Centre within NIC)
    - Centre for Parallel Computing of DESY-Zeuthen: APE100 (Quadrics) Support for QCD
    - Two (Three) Competence Groups in Juelich
  • Slide 5
  • Slide 6
  • Mission & Responsibilities of ZAM n Planning, Enhancement, and Operation of the Central Computing Facilities for Res. Centre Juelich & NIC n Planning, Enhancement, and Operation of Campus-wide Communication Networks and Connections to WANs n Research & Development in Mathematics, Computational Science & Engineering, Computer Science, and Information Technology n Education and Consultance in Mathematics, Computer Science and Data Processing, and Communication Technology and Networking n Education of Mathematical-Technical Assistants (Chamber of Industry & Commerce Certificate) & Techno- Mathematicians (Fachhochschule Aachen/Juelich)
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • John von Neumann (1946): "... The advance of analysis is, at this moment, stagnant along the entire front of nonlinear problems. ... Mathematicians had nearly exhausted analytic methods which apply mainly to linear differential equations and special geometries. ..."
    Source: H. H. Goldstine and J. von Neumann, On the Principles of Large Scale Computing Machines, Report to the Mathematical Computing Advisory Panel, Office of Research and Inventions, Navy Department, Washington, May 1946; in: J. von Neumann, Collected Works, Vol. V, pp. 1-32
  • Slide 13
  • Simulation: The Third Category of Scientific Exploration
    [Diagram: Theory, Experiment, and Simulation as the three complementary approaches to a Problem]
  • Slide 14
  • Visualization: A MUST in Simulation
    "The Purpose of Computing is Insight, not Numbers!" (Hamming)
  • Slide 15
  • Scientific Computing: Strategic Key Technology
    - Computational Science & Engineering (CS&E)
    - CS&E as an Inter-Discipline: new Curricula
    - Modelling, Simulation and Virtual Reality
    - Design Optimization as an Economic Factor
    - CS&E Competence as a Productivity Factor
    - Supercomputing and Communication
    - New (parallel) Algorithms
  • Slide 16
  • Slide 17
  • Strategic Analogy
    - What Particle Accelerators are to Experimental Physicists, Supercomputers are to Computational Scientists & Engineers.
    - Supercomputers are the Accelerators of Theory!
  • Slide 18
  • Necessity of Supercomputers
    - Solution of Complex Problems in Science and Research, Technology, Engineering, and Economy by Innovative Methods
    - Realistic Modelling and Interactive Design Optimization in Industry
    - Method and System Development for the Acceleration of Industrial Product Cycles
    - Method, Software and Tool Development for the Desktop Computers of Tomorrow
  • Slide 19
  • Development of Peak Performance (TOP500): Rpeak Performance Distance between Supercomputers and PCs
    [Chart: time spans for PC technology to reach the Rpeak of TOP-#k systems: 14 years to #1, 10 years to #50, 8 years to #200, 5.5 years to #500; x-axis: time (years)]
    Source: Ch. Bischof, RWTH Aachen (supplement: F. Hossfeld, NIC-ZAM, Juelich)
  • Slide 20
  • Development of Linpack Performance (TOP500): Rmax Performance Distance between Supercomputers and PCs
    [Chart: time spans for PC technology to reach the Rmax of TOP-#k systems: 14 years to #1, 9 years to #50, 7 years to #200, 5.5 years to #500; x-axis: time (years)]
    Source: Ch. Bischof, RWTH Aachen (supplement: F. Hossfeld, NIC-ZAM, Juelich)
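    The two charts above quantify how long PC technology needs to catch up with TOP500-ranked machines. As a rough illustration of the arithmetic behind such "performance distance" figures, the sketch below assumes exponential growth of PC peak performance; the growth factor and GFlops values are placeholder assumptions, not data from the slides.

```python
import math

# Illustrative only: the growth factor and performance figures below are
# assumptions, not TOP500 data from the slides.
def years_to_reach(target_gflops: float, pc_gflops_today: float,
                   annual_growth_factor: float) -> float:
    """Years until exponentially growing PC performance reaches the target level."""
    return math.log(target_gflops / pc_gflops_today) / math.log(annual_growth_factor)

if __name__ == "__main__":
    pc_today = 1.0      # hypothetical ~1 GFlops PC around 2000
    growth = 1.6        # assumed ~60 % performance growth per year
    for rank, rmax_gflops in [("#1", 2400.0), ("#50", 100.0), ("#500", 30.0)]:
        print(f"TOP-{rank} (~{rmax_gflops:g} GFlops): "
              f"{years_to_reach(rmax_gflops, pc_today, growth):.1f} years")
```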
  • Slide 21
  • Supercomputer Centres: Indispensable Structural Creators
    - Surrogate Function for the missing German Computer Hardware Industry
    - Motors of technical Innovation in Computing (Challenge & Response)
    - Providers of Supercomputing and Information Technology Infrastructure
    - Crystallization Kernels and Attractors of the technological and scientific Competence in Computational Science & Engineering
  • Slide 22
  • Slide 23
  • Local & Wide Area Networks Client-Server Structure Medium-Scale Supercomputers Tera 3 The Structural Pyramid of CS & E German Science Foundation (DFG) 1991 / NSF Blue Ribbon Panel 1993
  • Slide 24
  • Slide 25
  • Open Supercomputer Centres in Germany with Partners
    [Map: ZAM, ZPR/DESY, RUS, URZ Karlsruhe, RZIK; NIC: ZAM+ZPR, HLRS: RUS+RZKA, HLRB: LRZ, (HLRN: ZIB+RZIK)]
  • Slide 26
  • Open Supercomputer Centres (chronological, de facto)
    - John von Neumann Institute for Computing (NIC) (ZAM, Research Centre Juelich; ZPR, DESY-Zeuthen); Funding: FZJ & DESY Budgets
    - Computer Centre of the University of Stuttgart: RUS/HLRS; hww (debis, Porsche); HBFG
    - Leibniz Computer Centre of the Bavarian Academy of Sciences, Munich: LRZ (HLRB); HBFG
    - Konrad Zuse Centre for Information Technology Berlin & Regional Centre for Information Processing and Communication Technique: ZIB & RZIK (HLRN); HBFG
  • Slide 27
  • Open Supercomputer Centres with Partners, and Topical Supercomputer Centres
    [Map: open centres NIC: ZAM+ZPR, HLRS: RUS+RZKA, HLRB: LRZ, (HLRN: ZIB+RZIK); sites ZAM, ZPR/DESY, RUS, RZKA Karlsruhe, RZIK; topical centres DKRZ Hamburg, RZG Garching, DWD Offenbach]
  • Slide 28
  • Topical/Closed Supercomputer Centres
    - German Climate-Research Computer Centre, Hamburg: DKRZ
      Funding: AWI, GKSS, MPG and University of Hamburg shares (regular funds from the carriers of DKRZ: operational costs) plus special BMBF funding (investments)
    - Computer Centre Garching of the Max Planck Society, Garching/Munich: RZG
      Funding: group budget shares of the MPG institutes
    - German Weather Service, Offenbach: DWD
      Funding: Federal Ministry of Transport
  • Slide 29
  • Slide 30
  • NIC-ZAM/ZPR Systems
    - ZAM Juelich:
      - Cray T3E-600 / 544 PE / 64 GB / 307 GFlops
      - Cray T3E-1200 / 544 PE / 262 GB / 614 GFlops
      - Cray T90 / 10(+2) CPU / 8 GB / 18 GFlops
      - Cray J90 / 16 CPU / 8 GB / 4 GFlops
    - ZPR Zeuthen:
      - Diverse Quadrics (APE100/SIMD) / ~50 GFlops
  • Slide 31
  • Univ. Stuttgart: RUS/HLRS Systems
    - Cray T3E-900 / 540 PE / 486 GFlops
    - NEC SX-4 / 40 CPU / 80 GFlops
  • Univ. Karlsruhe: RZKA System
    - IBM SP2 RS-6000 / 256 PE / 130 GB / 110 GFlops
  • Slide 32
  • Leibniz Computer Centre: LRZ Systems
    - Cray T90 / 4 CPU / 1 GB / 7.2 GFlops
    - Fujitsu VPP700 / 52 CPU / 104 GB / 114 GFlops
    - IBM SP2 RS-6000 / 77 PE / 16.7 GB / 20.7 GFlops
    - Hitachi SR8000-F1 / 112 x 8 PE / 928 GB / 1.3 TFlops
  • Slide 33
  • Zuse Centre Berlin: ZIB Systems
    - Cray T3E-900 / 408 PE / 72 GB / 364 GFlops
    - Cray J90 / 16 CPU / 8 GB / 4 GFlops
  • Slide 34
  • German Climate Research: DKRZ Systems
    - Cray C90 / 16 CPU / 2 GB / 16 GFlops
    - Cray J90 / 8 CPU / 2 GB / 1.6 GFlops
    - Cray T3D / 128 PE / 8 GB / 19.2 GFlops
  • Slide 35
  • Max Planck Society: RZG Systems
    - Cray T3E-600 / 812 PE / 98 GB / 487 GFlops
    - NEC SX-5 / 3 CPU / 12 GB / 6 GFlops
    - Cray J90 / 16 CPU / 4 GB / 4 GFlops
  • Slide 36
  • German Weather Service: DWD System
    - Cray T3E-1200 / 812 PE / 98 GB / 974 GFlops
  • Slide 37
  • Slide 38
  • Selection of HPC Competence Centres
    - Computer Centre (and diverse Chairs) of the Aachen University of Technology (RWTH Aachen)
    - Centre for Interdisciplinary Scientific Computing (IWR), University of Heidelberg
    - Computer Centre (and diverse Chairs) of the University of Karlsruhe
    - John von Neumann Institute for Computing (NIC): ZAM Juelich, CS&E Groups Juelich, ZPR Zeuthen
    - Paderborn Centre for Parallel Computing (PC²), Paderborn
    - Institute for Scientific Computing and Algorithms (SCAI), Research Centre for Information Technology (GMD), St. Augustin
    - Centre for High Performance Computing (ZHR), University of Dresden
  • Slide 39
  • Selected Essential Competence Centres in Germany: DKRZ-MPIM, GMD-SCAI, IWR Heidelberg, NIC-ZAM Juelich, PC² Paderborn, RWTH Aachen, RZG-MPG, Konwihr/Bavarian Universities, Uni Erlangen/IMMD, Uni Karlsruhe/IAM/RZ, Uni Stuttgart/RUS, ZHR Dresden, ZIB Berlin
    [Map of sites; funding categories: 50:50 federal:local (HBFG); HGF-Centre / MPG budget; Bavarian Government]
  • Slide 40
  • German Research Network
    - B-WiN: 34 & 155 Mbps
    - Gigabit Network G-WiN: 2.4 Gbps (from April 2000)
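    To give a feel for what the bandwidth step from B-WiN to G-WiN means for coupling supercomputer centres, here is a back-of-the-envelope sketch; the 100 GB dataset size is an illustrative assumption, and protocol overhead and contention are ignored.

```python
# Idealized transfer times over the research-network links named on the slide.
def transfer_hours(dataset_gbytes: float, link_mbps: float) -> float:
    """Ideal transfer time in hours: dataset size in bits divided by link rate."""
    bits = dataset_gbytes * 8.0e9              # gigabytes -> bits
    return bits / (link_mbps * 1.0e6) / 3600.0

for name, mbps in [("B-WiN 34 Mbps", 34), ("B-WiN 155 Mbps", 155), ("G-WiN 2.4 Gbps", 2400)]:
    print(f"{name}: {transfer_hours(100.0, mbps):.2f} h for a 100 GB dataset")
```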
  • Slide 41
  • Slide 42
  • BMBF Feasibility Study Recommendations (VESUZ, 1997)
    - Working Group of the Directors of the open Supercomputer Centres for strategic and tactical coordination of HPC resource planning, innovation steps, profiles, usage, etc.
    - Assembly of the Chairmen of the Steering Committees and Scientific Councils of the SCCs for information exchange and coordination of the processes in their responsibility, such as reviewing, assessment, resource distribution, etc.
    - BMBF funding of the development of a Seamless Interface for the interconnection of the SCCs, integrating competence centres (UNICORE / UNICORE Plus Projects, 1997-1999 and 2000-2002)
    - BMBF funding of a broadband network with guarantee of service for high-speed connection of the SCCs (within the DFN WiN)
    - BMBF funding of R&D in HPC to explore new applications
    - BMBF funding of a mobility, guest & education programme
  • Slide 43
  • What UNICORE will support
    - Transaction model (batch) to hide latency
    - Asynchronous metacomputing; multiple distributed job steps
    - Transparent data assembly preliminary to the computational task
    - Interoperation with local operational environments
    - Uniform user authentication & security mechanisms
    - Uniform (GUI) interface for job preparation and submission
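    To make the feature list above concrete, here is a small hypothetical Python sketch of a uniform, batch-oriented interface for preparing and submitting a multi-step job to different centres. This is not the actual UNICORE API; every class, method, and site name below is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    site: str                                           # target centre (hypothetical name)
    command: str                                        # batch work to run there asynchronously
    stage_in: List[str] = field(default_factory=list)   # data assembled before the task starts

@dataclass
class Job:
    user_certificate: str                               # one uniform authentication for all sites
    steps: List[Step] = field(default_factory=list)     # multiple distributed job steps

def submit(job: Job) -> str:
    """Hand the whole job over as one transaction; results are collected later (latency hiding)."""
    for step in job.steps:
        print(f"queueing on {step.site}: {step.command} (inputs: {step.stage_in})")
    return "job-0001"                                   # opaque handle for later status queries

if __name__ == "__main__":
    job = Job(user_certificate="cn=alice, o=example-centre",
              steps=[Step("CENTRE-A", "run_simulation --steps 1000", ["input.dat"]),
                     Step("CENTRE-B", "postprocess results/", ["results/"])])
    print("submitted:", submit(job))
```

    A graphical client would generate such a job description behind the uniform GUI mentioned on the slide.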
  • Slide 44
  • Upgrading HPC Centres: Innovation Helix
    - Procurement Strategy: turning the Innovation Helix
    - Open HPC Centres: 4
    - HPC Centre Profiles: Science vs. Engineering
    - Mutual Exchange of convertible HPC Capacity Shares
    - Investment/Installation: 70 million DM (two phases)
    - Amortization Period: 6 years
    - HPC Innovation Cycle: 1.5 years
    - Diversity of Architectures: MPP, SMP-MPP, PVP, ...
    - Performance Increase: factor 20 within 6 years (TOP500), i.e. factor 5 every Innovation Cycle
    - Installation Rhythm: one installation every 1.5 years
  • Slide 45
  • VESUZ-recommended HPC Centres Innovation Helix: Performance Increase & Rates
    [Diagram: 4 HPC centres (1-4); installation in 2 phases over 6 years (amortization); performance factor 20 within 6 years (4 innovation cycles), factor 5 per innovation cycle (1.5 years)]
  • Slide 46
  • Recommendations of the HPC Working Group to the German Science Council 2000
    - High-performance computing is indispensable for top research in the global competition.
    - The demand for HPC capacity tends to become unlimited due to increasing problem complexity.
    - Competition between HPC centres must be increased by user demands on service and consulting quality.
    - Optimal usage and coordinated procurement require functioning control mechanisms:
      - Transparency of HPC usage costs must be improved!
      - Tariff-based funding is unsuitable for HPC control!!!
  • Slide 47
  • Recommendations of the HPC Working Group to the German Science Council 2000 (contd.)
    - Efforts in HPC software development need to be strengthened.
    - Continuous investments are necessary on all levels of the pyramid.
    - Networks of competence must support the efficient use of supercomputers.
    - HPC education and training must be enhanced:
      - HPC is as yet insufficiently integrated into curricula,
      - Education must not be limited to the use of PCs and workstations.
    - Strategic coordination of investments and procurements requires a National Coordination Committee.
  • Slide 48
  • National Coordination Committee: Tasks & Responsibilities
    - Decisions on Investments
      - Prospective exploration of HPC demands
      - Strategic advice for federal and Laender HPC decisions
      - Recommendations for upgrades in infrastructure & staff
    - Orientational Support
      - Position papers and hearings on HPC issues
      - Advice to the Science Council's HBFG Committee
    - Evolution of Control and Steering Models
      - Development and testing of demand-driven self-control mechanisms
      - Investigation of differing accounting models for suitable user profiles
    - Development of a Nation-wide Concept for HPC Provision
      - including all centres independent of funding & institutional type
      - keeping the list of centres open for change and innovation (criteria!)
  • Slide 49
  • Umberto Eco: "Every complex problem has a simple solution. And this is false!"
  • Slide 50
  • WWW URLs:
    - www.fz-juelich.de/nic
    - www.uni-stuttgart.de/rus
    - www.uni-karlsruhe.de/uni/rz
    - www.zib.de/rz
    - www.lrz-muenchen.de
    - www.rzg.mpg.de/rzg
    - www.dkrz.de
    - www.dwd.de
  • Slide 51