Performance of CMAQ on a Mac OS X System
Tracey Holloway, John Bachan, Scott Spak
Center for Sustainability and the Global Environment
University of Wisconsin-Madison
A presentation to the 3rd annual CMAS Models-3 conference
October 19, 2004
Thinking different.
• Motivation
• Methods
• Performance
• Hardware
• Release
• Ongoing Improvements
Motivations.
• Simplified operation
• Easier development
• Easy clustering
• Improved performance
Motivation: Operation.
• Single platform for all research and academic computing
• User-friendly interface
• UNIX OS
• Open source software, hardware support
• Today’s cluster node = tomorrow’s desktop
Motivation: Development.
Better Developer Tools
• Xcode (Interface Builder)
• CHUD performance & debugging suite
Distribution Tools
• standardized profiles
• PackageMaker
• FAT binaries
• automated installation
Operation & Development.
Motivation: Performance.
Unique Hardware Advantages
• powerful PPC 970 vector chip
• auto-vectorizing compilers
• 2000 NASA Langley report
Populist Parallelization
• mix dedicated cluster nodes with free cycles on personal & lab machines
• off-the-shelf solutions
• simple GUI and command-line tools
Methods.
IBM XL Fortran v8.1 compiler
• auto-vectorization
• equivalent to AIX
Modifications
• flag conversion
• build settings
• array passing
> 400 man-hours
Performance.
2 Test Machines
• dual 2 GHz G5, 5 GB RAM, 1 GHz bus
• stock dual 1 GHz G4, 1.5 GB RAM, 133 MHz bus
• Mac OS X 10.3.5
1 Test Run
• First day of CMAQ 4.3 tutorial
• 1 day, 32 km x 32 km, 38 x 38, 6 layers
• default EBI CB4 chemistry
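As a rough sense of the problem size, the test domain above can be tallied in a few lines of Python. This is only a sketch: it assumes "32 km x 32 km" is the horizontal cell size and "38 x 38" is the number of columns and rows, and the function and constant names are illustrative, not taken from CMAQ.

```python
# Tutorial domain as listed on the slide. The interpretation is an assumption:
# 32 km square cells on a 38 x 38 horizontal grid with 6 vertical layers.
CELL_SIZE_KM = 32
N_COLS = 38
N_ROWS = 38
N_LAYERS = 6


def domain_summary(cell_km, n_cols, n_rows, n_layers):
    """Return (cells per layer, total cells, east-west extent in km)."""
    cells_per_layer = n_cols * n_rows
    total_cells = cells_per_layer * n_layers
    extent_km = n_cols * cell_km
    return cells_per_layer, total_cells, extent_km


per_layer, total, extent = domain_summary(CELL_SIZE_KM, N_COLS, N_ROWS, N_LAYERS)
print(f"{per_layer} cells per layer, {total} cells total, {extent} km across")
```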
Benchmarks.
Tutorial Runtime by Hardware and Compiler (seconds)
• IFC = Intel Fortran Compiler 7.1
• PGF = Portland Group Compiler 4.0-2
• Intel machines running CMAQ 4.2.2 on 2 processors with mpich parallelization. Source: Gail Tonnesen, “Benchmarks for CPUs and Compilers for the CMAQ 4.2.2 release.”
• Macs running CMAQ 4.3 on 1 processor (XLF) or 2 processors (XLF SMP) with OpenMP parallelization
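A comparison of 1-processor and 2-processor runtimes like the one above is usually summarized as speedup and parallel efficiency. The sketch below shows that arithmetic in Python. The runtimes in it are hypothetical placeholders, not values from the benchmark chart, and the function names are illustrative.

```python
def speedup(serial_seconds, parallel_seconds):
    """Ratio of single-processor runtime to multi-processor runtime."""
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds, parallel_seconds, n_processors):
    """Speedup divided by processor count (1.0 would be ideal scaling)."""
    return speedup(serial_seconds, parallel_seconds) / n_processors


# Hypothetical runtimes in seconds, for illustration only.
# These are NOT measurements from the presentation.
example_serial = 1000.0
example_smp = 625.0

s = speedup(example_serial, example_smp)
e = parallel_efficiency(example_serial, example_smp, n_processors=2)
print(f"speedup {s:.2f}x, efficiency {e:.0%}")
```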
Chemistry.
Species       Mean absolute difference   Max absolute difference from reference
              from reference             (% of cells > 1 ppb)
O3            0.1282 ppb                 4.52 ppb (0.43)
NO            0.0050 ppb                 0.72 ppb (0)
NO2           0.0262 ppb                 2.05 ppb (0.02)
NH3           0.0126 ppb                 1.67 ppb (0.0002)
SO4 (I + J)   0.0284 µg/m3               1.52 µg/m3
Source: ACONC.nc output from Day 1 of CMAQ 4.3 tutorial
Dual 2 GHz G5 running CMAQ 4.3 on 1 processor
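The three statistics reported above (mean absolute difference, maximum absolute difference, and percentage of cells differing by more than 1 ppb) can be computed as in the following Python sketch. It assumes the two concentration fields have already been read from the output files and flattened into equal-length lists; the function name and the sample values are made up for illustration and are not the authors' analysis code or data.

```python
def difference_stats(test, reference, threshold=1.0):
    """Return (mean |diff|, max |diff|, percent of cells with |diff| > threshold)."""
    if len(test) != len(reference) or not test:
        raise ValueError("inputs must be non-empty and the same length")
    abs_diffs = [abs(t - r) for t, r in zip(test, reference)]
    mean_abs = sum(abs_diffs) / len(abs_diffs)
    max_abs = max(abs_diffs)
    pct_over = 100.0 * sum(d > threshold for d in abs_diffs) / len(abs_diffs)
    return mean_abs, max_abs, pct_over


# Made-up O3 concentrations in ppb, for illustration only.
reference_o3 = [40.0, 42.5, 38.0, 55.0]
mac_o3 = [40.1, 42.5, 39.5, 54.9]

mean_abs, max_abs, pct_over = difference_stats(mac_o3, reference_o3)
print(f"mean {mean_abs:.4f} ppb, max {max_abs:.2f} ppb, {pct_over:.2f}% of cells > 1 ppb")
```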
Good Chemistry.
Small difference from reference set
• greater than difference among Intel machines and compilers
Noise, floating point calculations, initialization
• greatest at surface level, early in run
• ambient concentrations only
• random distribution
  • no bias
  • does not propagate in time or space
  • not correlated to high or low concentrations
Consistent
• G4/G5
• chemistry modules
• compiler flags
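Claims such as "no bias" and "not correlated to high or low concentrations" are typically checked with the mean signed difference and the correlation between the differences and the reference values. The Python sketch below illustrates one way to do that. It is an assumed approach, not the authors' stated method, and the sample numbers are invented.

```python
import math


def mean_bias(test, reference):
    """Mean signed difference (test - reference); near zero means no bias."""
    diffs = [t - r for t, r in zip(test, reference)]
    return sum(diffs) / len(diffs)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Invented concentrations (ppb), for illustration only.
reference = [10.0, 20.0, 30.0, 40.0]
test = [10.5, 19.5, 29.5, 40.5]

diffs = [t - r for t, r in zip(test, reference)]
print("bias:", mean_bias(test, reference))
print("correlation of difference with concentration:", pearson_r(diffs, reference))
```

A bias near zero together with a correlation near zero is consistent with the differences being random noise rather than a systematic error.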
Better Chemistry.
Tutorial Runtime by Chemistry Module (seconds)
Dual 2 GHz G5 running CMAQ 4.3 on 1 processor
Models-3 on Mac, 10/04.
Core Platform
• MM5 (Fovell)
• MCIP v2.2
• SMOKE v2.1
• CMAQ v4.3
Libraries & Add-Ons
• netCDF v3.5.1
• mpich v1.2.2-6
• I/O API v2.2
• MCPL
Currently no PAVE, but Vis5d, VisAd, GrADS, NCL, and
Hardware.
Dedicated Cluster
• Xserve G5 Dual 2 GHz, 2 GB RAM
• Xserve RAID 3.5 TB
• 8 Power Mac G5 Dual 2 GHz, 5 GB RAM
Distributed Capacity
• student lab eMacs
• personal G4 desktops
60 processor vector cluster
• 18 G5 processors
• 42 G4 processors
0 Full-time Sys-admins
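The processor totals above can be checked against the node list with a short tally. In this Python sketch the count of one Xserve is an assumption (the slide gives no count, but one dual-processor Xserve plus eight dual-processor Power Macs yields the stated 18 G5 processors), and the names are illustrative.

```python
# (number of machines, processors per machine) for the dedicated G5 nodes.
DEDICATED_G5_NODES = {
    "Xserve G5, dual 2 GHz": (1, 2),      # count of 1 is an assumption
    "Power Mac G5, dual 2 GHz": (8, 2),   # 8 machines, as listed on the slide
}

# G4 processors contributed by lab eMacs and personal desktops (from the slide).
DISTRIBUTED_G4_PROCESSORS = 42


def count_processors(nodes):
    """Sum machines x processors-per-machine over all node types."""
    return sum(count * cpus for count, cpus in nodes.values())


g5_total = count_processors(DEDICATED_G5_NODES)
cluster_total = g5_total + DISTRIBUTED_G4_PROCESSORS
print(f"{g5_total} G5 + {DISTRIBUTED_G4_PROCESSORS} G4 = {cluster_total} processors")
```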
Cost Competitive.
Apple
• Xserve Dual G5 2 GHz < $3500
• RAID storage at $3 per GB
• G5 Desktop $2000 - 4000
Compare to
• Dell PowerVault RAID at $5 per GB
• Dell Precision dual Xeon 2.8 GHz, $1200 - 4200
• Sysadmin costs
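To put the per-GB storage prices in context, the sketch below applies them to a 3.5 TB array, the capacity of the Xserve RAID in the dedicated cluster. It assumes 1 TB = 1000 GB and that the quoted per-GB rates scale linearly; the resulting totals are illustrative arithmetic, not prices quoted in the presentation.

```python
# Per-GB prices as quoted on the slide (US dollars).
APPLE_RAID_PER_GB = 3
DELL_RAID_PER_GB = 5

# 3.5 TB expressed in GB, assuming 1 TB = 1000 GB.
CAPACITY_GB = 3500


def storage_cost(capacity_gb, dollars_per_gb):
    """Total cost of an array at a flat per-GB rate."""
    return capacity_gb * dollars_per_gb


apple_cost = storage_cost(CAPACITY_GB, APPLE_RAID_PER_GB)
dell_cost = storage_cost(CAPACITY_GB, DELL_RAID_PER_GB)
print(f"Apple: ${apple_cost}, Dell: ${dell_cost}, difference: ${dell_cost - apple_cost}")
```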
Release.
Following input from the CMAS Center
• alpha code to CMAS by November, 2004
• CMAS testing
• potential support
Following CMAS testing
• preliminary code, scripts, binaries, instructions available for download at www.sage.wisc.edu/cmaq
• Scott Spak will answer questions for early users: [email protected]
Ongoing improvements.
Our planned activities
• g95 - GNU compilation
• parallel implementations
  • Condor
  • Xgrid
  • Pooch/Appleseed
• further optimization
• Dual 2.5 GHz benchmarks
• CMAQ MADRID
A community effort?
• CMAQ Unified MIMS
• PAVE
Acknowledgements.
Mary Sternitzky, UW
Seth Price, UW
Hans Vahlenkamp and NOAA GFDL
Zac Adelman and the CMAS Help Desk
Dr. Gail Tonnesen and Glen Kaukola, UCR
Models-3 Listserv
All funding provided by the University of Wisconsin-Madison.