TACC Lonestar Cluster Upgrade to 300 Teraflops
Tommy Minyard
SC10
November 16, 2010
TACC Mission & Strategy
The mission of the Texas Advanced Computing Center is to enable discoveries that advance science and society through the application of advanced computing technologies.
To accomplish this mission, TACC:
– Evaluates, acquires & operates advanced computing systems
– Provides training, consulting, and documentation to users
– Collaborates with researchers to apply advanced computing techniques
– Conducts research & development to produce new computational technologies
Resources & Services
Research & Development
TACC Staff Expertise
• Operating as an Advanced Computing Center since 1986
• More than 80 employees at TACC
– 20 Ph.D.-level research staff
– Graduate and undergraduate students
• Currently support thousands of users on production systems
TACC Resources are Comprehensive and Balanced
• HPC systems to enable larger simulations and analyses with faster turnaround times
• Scientific visualization resources to enable large data analysis and knowledge discovery
• Data & information systems to store large datasets from simulations, analyses, digital collections, instruments, and sensors
• Distributed/grid computing servers & software to integrate all resources into computational grids
• Network equipment for high-bandwidth data movements and transfers between systems
TACC’s Migration Towards HPC Clusters
• 1986: TACC founded
– Historically operated large Cray systems
• 2000: First experimental cluster
– 16 AMD workstations
• 2001: First production clusters
– 64-processor Pentium III Linux cluster
– 20-processor Itanium Linux cluster
• 2003: First terascale cluster, Lonestar
– 1,028-processor Dell Xeon Linux cluster
• 2006: Largest US academic cluster deployed
– 5,840-core 64-bit Xeon Linux cluster
Current Dell Production Systems
• Lonestar – 1,460-node, dual-core InfiniBand HPC production system, 62 Teraflops
• Longhorn – 256-node, quad-core Nehalem visualization and GPGPU computing cluster
• Colt – 10-node high-end visualization system with 3x3 tiled-wall display
• Stallion – 23-node, large-scale vis system with 15x5 tiled-wall display (more than 300M pixels)
• Discovery – 90-node benchmark system with a variety of processors, InfiniBand DDR & QDR
TACC Lonestar System
Dell dual-core 64-bit Xeon Linux cluster
5,840 CPU cores (62.1 Tflops)
10+ TB memory, 100+ TB disk
Galerkin wave propagation
• Lucas Wilcox, Institute for Computational Engineering and Sciences, UT-Austin
• Seismic wave propagation, PI Omar Ghattas; part of research recently featured on the cover of Science and a finalist for the Gordon Bell Prize at SC10
Molecular Dynamics
• David LeBard, Institute for Computational Molecular Science, Temple University
• Pretty Fast Analysis: A software suite for analyzing large scale simulations on supercomputers and GPU clusters
• Presented at the American Chemical Society, August 2010
PFA example: E(r) around lysozyme
[Figure: electric field E(r) mapped around lysozyme; E_OH(r) = r_OH · E(r), used to calculate the distribution P(E_OH)]
Lonestar Upgrade
• Current Lonestar has been in operation for 4+ years
• Replacement needed to support UT and TeraGrid users along with several other large projects
• Submitted proposal to NSF with matching UT funds, along with funds from UT-ICES, Texas A&M and Texas Tech
New Lonestar Summary
• Compute power – 301.7 Teraflops
– 1,888 Dell M610 two-socket blades
– Intel X5680 3.33 GHz six-core “Westmere” processors
– 22,656 total processing cores
• Memory – 44 Terabytes
– 2 GB/core, 24 GB/node
– 132 TB/s aggregate memory bandwidth
• Disk subsystem – 1.2 Petabytes
– Two DDN SFA10000 controllers, 300 2 TB drives each
– ~20 GB/sec total aggregate I/O bandwidth
– 2 MDS, 16 OSS nodes
• Interconnect – InfiniBand QDR
– Four Mellanox 648-port InfiniBand switches
– Full non-blocking fabric
– Mellanox ConnectX-2 InfiniBand cards
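The headline numbers above can be cross-checked with simple arithmetic. One assumption not stated on the slides: a Westmere core retires 4 double-precision flops per cycle (128-bit SSE, 2 adds + 2 multiplies), which is how the peak-Tflops figure is conventionally computed.

```python
# Back-of-the-envelope check of the New Lonestar summary numbers.
# Assumption (not on the slide): 4 DP flops/cycle/core on Westmere.
FLOPS_PER_CYCLE = 4

cores = 22_656                    # 1,888 blades x 2 sockets x 6 cores
clock_hz = 3.33e9                 # Intel X5680
peak_tflops = cores * clock_hz * FLOPS_PER_CYCLE / 1e12
print(f"peak: {peak_tflops:.2f} Tflops")   # close to the slide's 301.7

nodes = 1_888
mem_tb = nodes * 24 / 1024        # 24 GB/node, in binary terabytes
print(f"memory: {mem_tb:.0f} TB")          # ~44 TB

disk_pb = 2 * 300 * 2 / 1000      # two controllers x 300 drives x 2 TB each
print(f"disk: {disk_pb:.1f} PB")           # 1.2 PB
```

Each figure lands on the value quoted in the summary, so the slide's totals are internally consistent.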
System design challenges
• Limited by power and cooling
• X5680 processor: 130 Watts per socket!
– Fully populated M1000e chassis draws ~7 kW of power
• Three M1000e chassis per rack – 21 kW per rack
– Six 208V, 30-amp circuits per rack
• Forty total compute racks, four switch racks
– Planning mix of underfloor and overhead cabling
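The per-rack electrical budget implied by those bullets can be sketched as follows. The 80% continuous-load derating is my assumption (standard US branch-circuit practice), not something stated in the deck.

```python
# Rack power budget sketch for the figures on the slide.
# Assumption: circuits are derated to 80% for continuous load (NEC rule).
chassis_kw = 7.0                  # fully populated M1000e, per the slide
rack_kw = 3 * chassis_kw          # three chassis per rack
print(f"rack load: {rack_kw:.0f} kW")        # 21 kW, as quoted

circuits, volts, amps = 6, 208, 30
capacity_kva = circuits * volts * amps / 1000
derated_kva = capacity_kva * 0.8
print(f"circuit capacity: {capacity_kva:.1f} kVA "
      f"({derated_kva:.1f} kVA at 80% derating)")
```

Six 208 V, 30 A circuits give roughly 37 kVA raw (about 30 kVA derated), which leaves headroom above the 21 kW compute load for fans, switches, and power-supply losses.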
Software Stack
• Reevaluating current cluster management kits and resource managers/schedulers
– Platform PCM/LSF
– Univa UD
– Bright Cluster Manager
– SLURM, MOAB, PBS, Torque
• Current plan:
– TACC custom cluster install and administration scripts
– SGE 6.2U5
– Lustre 1.8.4
– Intel Compilers
– MPI libraries: MVAPICH, MVAPICH2, OpenMPI
Questions?