BU SciDAC Meeting
Transcript of BU SciDAC Meeting
BU SciDAC Meeting
Balint Joo, Jefferson Lab
Anisotropic Clover
Why do it?
Anisotropy -> fine temporal lattice spacing at moderate cost
Combine with group-theoretical baryon operators -> access to excited states
Nice preliminary results – with just Wilson
Excited states
States with spin 5/2+
http://arxiv.org/pdf/hep-lat/0609052
http://arxiv.org/pdf/hep-lat/0601029
Anisotropic Clover
Why do it?
Part of the JLab three-prong lattice QCD programme:
Prong 1: Dynamical anisotropic Clover
Prong 2: DWF on a staggered sea (MILC configs)
Prong 3: Large-scale dynamical DWF
This programme was specially commended by the DOE at our recent Science and Technology Review
Anisotropic Clover is a major part of the INCITE proposal (for the XT3 and BG/? machines)
Anisotropic Clover
Level 2 Clover term and inverse & force term wired into Chroma -> provides HMC/RHMC
Our choice of gauge action: plaquette + rectangle + adjoint term
Fermion action: anisotropic Clover + stout smearing
Stout force recursion
Usual barrage of DF techniques:
Hasenbusch + chronology for the 2 flavours
RHMC for the +1 flavour
Multi-timescale integrators
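The multi-timescale idea – put the cheap force on a fine inner timestep and the expensive force on a coarse outer step – can be sketched as a two-level nested leapfrog. This is a toy Python example: the force split and step counts are illustrative, not Chroma's actual integrator setup.

```python
def nested_leapfrog(q, p, force_cheap, force_costly, dt, n_outer, n_inner):
    """Two-level leapfrog: the costly force is applied on the coarse
    outer step, the cheap force is integrated on a finer inner step."""
    for _ in range(n_outer):
        p += 0.5 * dt * force_costly(q)      # half kick, expensive force
        h = dt / n_inner
        for _ in range(n_inner):             # inner leapfrog, cheap force
            p += 0.5 * h * force_cheap(q)
            q += h * p
            p += 0.5 * h * force_cheap(q)
        p += 0.5 * dt * force_costly(q)      # closing half kick
    return q, p
```

In an HMC context the cheap force would typically be the gauge force and the costly one the fermion force; the nested scheme remains reversible and area-preserving, which the Metropolis step requires.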
CG Inverter Performance
We only got 7.3 Tflops on 8K CPUs :( – but we didn't put much work at all into optimization
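For reference, the CG algorithm being timed here is short – the cost per iteration is dominated by the matrix apply (the Dslash in our case). A generic NumPy sketch of plain CG, not the optimized lattice kernel:

```python
import numpy as np

def cg(A, b, tol=1e-8, max_iter=1000):
    """Conjugate gradient for Ax = b with A symmetric positive definite.
    Each iteration costs one matrix apply plus a handful of vector ops."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol ** 2:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```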
Clover Work Under SciDAC 2
Performance is OK, but we want better... Optimizations:
Clover SSE optimizations for clusters & XT3
BAGEL terms for BG/???
Multi-mass inverter, trace terms
Would like to optimize the actual bottleneck:
The CG inverter is not the current bottleneck
Help from our friends at RENCI in identifying the exact hotspots? (Right now we rely on gprof)
Algorithmic: temporal preconditioning (later)
Thoughts at the back of my mind
Are we actually going to get any time at ORNL? We asked for a lot –
I think 20M CPU hours just for the Clover stuff
The INCITE proposal was extremely hurried; we had to respond very quickly
Many small groups did not stand a chance
How much effort should we be investing? Should we be focusing on BlueGene/? and clusters more?
CRE and ILDG
Progress on CRE has been slow. Why?
Manpower reasons in SciDAC 1?
People are happily running production already without it? In which case, is it just LOW VALUE? Where are the 'armies of new users' who need it?
What are the issues? Intimately tied to the infrastructure at each site:
Site infrastructure leverages off experiments – different everywhere
High maintenance: PBS, LoadLeveler, NFS? dcache anyone? Upgrades of mvapich, OpenMPI, the IB fabric, etc.
Inherently non-portable (what about ANL/ORNL?)
CRE and ILDG
If it has low value, no user demand, is high maintenance, and won't work outside our sites... is it worth doing? Can we just drop it? PLEASE?
Anyway, common environments are so passé and 90s. Nowadays we should think about 'interoperable grid environments' – they're IN!
ILDG
Middleware progressed, but still on eXist MDC
Dumb RC (just remap the LFN to a FNAL dcache name)
Issues:
Where is all the markup? Eventually need a more sophisticated RC?
Markup is NOT anisotropy-aware (future fights in the MDWG – will take time)
Working towards interoperability:
Meeting at JLab Dec 11-13. Can folks from BNL and FNAL come?
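The 'dumb RC' is a replica catalogue reduced to a deterministic remap from logical file name to storage URL – no lookup, no database. A minimal sketch; the prefix below is a made-up placeholder, the real FNAL dcache path is site-specific:

```python
def lfn_to_storage_url(lfn, prefix="srm://dcache.example.org/lqcd"):
    """Map an ILDG logical file name onto a storage URL by
    prepending a fixed site prefix (pure string rewriting)."""
    return prefix.rstrip("/") + "/" + lfn.lstrip("/")
```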
Testing and Release
Unit testing vs. end-to-end testing: too much existing code
We intermix QMP, QDP++, QIO, XpathReader, LIME, Chroma, Wilson Dslash or BAGEL Dslash, possibly BAGEL linear algebra, level 3 CG-DWF
Unit testing all of these is difficult
End-to-end tests: compare the final result, e.g. correlation functions
Lots of output – selective diffs?
QDP++ uses XML; selective diffs through XMLDiff
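A selective diff over XML output can be as simple as comparing only named numeric leaves to a tolerance and ignoring everything else (timestamps, hostnames, timings). This is a minimal stand-in for what XMLDiff does; the element paths and tolerance are illustrative:

```python
import xml.etree.ElementTree as ET

def selective_diff(xml_a, xml_b, paths, rtol=1e-6):
    """Compare only the numeric elements selected by `paths`
    (ElementTree path expressions); return the mismatching triples."""
    ra, rb = ET.fromstring(xml_a), ET.fromstring(xml_b)
    mismatches = []
    for path in paths:
        for ea, eb in zip(ra.iterfind(path), rb.iterfind(path)):
            va, vb = float(ea.text), float(eb.text)
            if abs(va - vb) > rtol * max(abs(va), abs(vb), 1.0):
                mismatches.append((path, va, vb))
    return mismatches
```

The metric file mentioned below would then just be the list of paths plus a tolerance per path.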
Structure: a test consists of
An executable, input XML, expected output XML
A metric file to decide which bits of the output we need to check
A runner – abstracts away running:
Trivial runner (just re-echoes your commands)
MPIRUN runner (runs on 2 JLab IB nodes)
Prototype YOD runner (for the XT3)
LoadLeveler runner (for BG/L) – yucky
Driver scripts: run interactively (e.g. scalar targets) & check, or submit jobs to a queue and check later (for queues)
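The runner abstraction above – same test, different launch mechanism per machine – might look like this in Python. The mpirun invocation is an assumption for illustration; the real YOD and LoadLeveler runners would wrap their own launchers the same way:

```python
import subprocess

class Runner:
    """Abstracts away how a test executable gets launched."""
    def run(self, exe, args):
        raise NotImplementedError

class TrivialRunner(Runner):
    """Runs the executable directly on the local host (scalar builds)."""
    def run(self, exe, args):
        return subprocess.run([exe, *args], capture_output=True,
                              text=True, check=True).stdout

class MPIRunner(Runner):
    """Wraps the executable in mpirun for parallel builds."""
    def __init__(self, nprocs=2):
        self.nprocs = nprocs
    def run(self, exe, args):
        cmd = ["mpirun", "-np", str(self.nprocs), exe, *args]
        return subprocess.run(cmd, capture_output=True,
                              text=True, check=True).stdout
```

The driver script only talks to `Runner.run`, so supporting a new machine means adding one subclass rather than touching every test.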
What has testing taught us?
We run through this regression framework nightly: gcc3, gcc4, scalar, parscalar-ib
What runs fine with gcc 3.x on RHEL won't necessarily run fine with gcc 4.x on FC5
Maintenance: keep up with compilers – identify problems:
ICC – 'catastrophic error: can't allocate register' (SSE inline)
VACPP (XLC) – 'Internal compiler error: Please contact IBM representative' on templates
PGI: no inline assembler? Intrinsics?
We really MUST focus on this issue, or will it be GCC 3.4.x forever? (Seems most stable so far)
SciDAC Release Pages?
What's the actual problem here?
The JLab page has releases that live in the JLab CVS release directory, plus previous versions (by vox populi)
We strive to keep the pages up to date
Not everyone uses JLab CVS. Why? Do you prefer to run your own repository? Do you want to use Subversion? Do you think only sissies use version control?
Centralizing release management is bad – imagine if I had to be responsible for the release of a code that I myself could only pick up from a web page
Is it only John Kogut who is unhappy?
A possible solution ...
... to a problem which may or may not exist
A SourceForge-like setup (GForge) provides, per project:
Web space, release tarball space
Source code management modules (CVS & SVN)
May be able to 'proxy' for your own repo
Mailing lists, bug tracker, news feeds, yadda yadda
Wiki-like authentication
Our new sysadmins are installing this at JLAB
But all the effort is wasted if folks don't use it...