High Performance Computing: Concepts, Methods & Means Scientific Components and Frameworks
High Performance Computing: Concepts, Methods & Means
Scientific Components and Frameworks
Prof. Daniel S. Katz
Department of Electrical and Computer Engineering
Louisiana State University
April 24th, 2007
CENTER FOR COMPUTATION & TECHNOLOGY AT LOUISIANA STATE UNIVERSITY
Opening Remarks
• Context: high performance computing
• May have multiple types of physics
• May have multiple spatial scales
• May have multiple time scales
• May use multiple solvers
• May need multiple I/O libraries
• May need multiple visualization interfaces
• Need to build real, working, complex applications
• How?
Topics
• Meat Grinder Introduction (slides from Gary Kumfert, part of CCA tutorial)
• Common Component Architecture (CCA)
• Cactus
• Earth System Modeling Framework (ESMF)
• Summary
A Pictorial Introduction to Components in Scientific Computing
Once upon a time...
[Figure: a single Program with Input and Output]
As Scientific Computing grew...
Tried to ease the bottleneck
SPMD was born.
[Figure: four numbered pieces, 1 to 4]
SPMD worked.
[Figure: four numbered pieces, 1 to 4]
But it isn’t easy!!!
Meanwhile, corporate computing was growing in a different way
[Figure labels: Input, Program, Output; browser, spreadsheet, editor, graphics, database, multimedia, email client, Unicode]
This created a whole new set of problems
Complexity:
• Interoperability across multiple languages
• Interoperability across multiple platforms
• Incremental evolution of large legacy systems (esp. w/ multiple 3rd party software)
Component Technology addresses these problems
So what’s a component?
Implementation: No Direct Access
Interface Access: Generated by Tools
Matching Connector: Assigned by Framework, Hidden from User
1. Interoperability across multiple languages
[Figure: components written in C, C++, F77, Java, and Python]
Language & platform independent interfaces
Automatically generated bindings to working code
2. Interoperability Across Multiple Platforms
Imagine a company migrates to a new system, OS, etc.
What if the source to this one part is lost?
Transparent Distributed Computing
[Figure: components connected across the internet]
These wires are very, very smart!
3. Incremental Evolution With Multiple 3rd Party Software
[Figure: an application built from parts labeled v 1.0, v 2.0, and v 3.0]
Now suppose you find this bug...
Good news: an upgrade is available
Bad news: there’s a dependency
[Figure: version labels 2.1 and 2.0 alongside v 1.0, v 2.0, and v 3.0]
Great News: Solvable with Components
[Figure: versions 2.0 and 2.1 shown together with v 1.0 and v 3.0]
Why Components for Scientific Computing
Complexity:
• Interoperability across multiple languages
• Interoperability across multiple platforms
• Incremental evolution of large legacy systems (esp. w/ multiple 3rd party software)
[Figure labels: Sapphire, SAMRAI, Ardra, Scientific Viz, DataFoundry, Overture, linear solvers, hypre, nonlinear solvers, ALPS, JEEP]
The Model for Scientific Component Programming
[Figure labels: Science, Industry, ?, CCA]
Topics
• Meat Grinder Introduction
• Common Component Architecture (CCA)
• Cactus
• Earth System Modeling Framework (ESMF)
• Summary
CCA Motivation: Modern Scientific Software Engineering Challenges
• Productivity
  – Time to first solution (prototyping)
  – Time to solution (“production”)
  – Software infrastructure requirements (“other stuff needed”)
• Complexity
  – Increasingly sophisticated models
  – Model coupling – multi-scale, multi-physics, etc.
  – “Interdisciplinarity”
• Performance
  – Increasingly complex algorithms
  – Increasingly complex computers
  – Increasingly demanding applications
Motivation: For Library Developers
• People want to use your software, but need wrappers in languages you don’t support
  – Many component models provide language interoperability
• Discussions about standardizing interfaces are often sidetracked into implementation issues
  – Components separate interfaces from implementation
• You want users to stick to your published interface and prevent them from stumbling (prying) into the implementation details
  – Most component models actively enforce the separation
Motivation: For Application Developers and Users
• You have difficulty managing multiple third-party libraries in your code
• You (want to) use more than two languages in your application
• Your code is long-lived and different pieces evolve at different rates
• You want to be able to swap competing implementations of the same idea and test without modifying any of your code
• You want to compose your application with some other(s) that weren’t originally designed to be combined
Some Observations About Software…
• “The complexity of software is an essential property, not an accidental one.” [Brooks]
  – We can’t get rid of complexity
• “Our failure to master the complexity of software results in projects that are late, over budget, and deficient in their stated requirements.” [Booch]
  – We must find ways to manage it
• “A complex system that works is invariably found to have evolved from a simple system that worked… A complex system designed from scratch never works and cannot be patched up to make it work.” [Gall]
  – Build up from simpler pieces
• “The best software is code you don’t have to write” [Jobs]
  – Reuse code wherever possible
Component-Based Software Engineering
• CBSE methodology is emerging, especially from business and internet areas
• Software productivity
  – Provides a “plug and play” application development environment
  – Many components available “off the shelf”
  – Abstract interfaces facilitate reuse and interoperability of software
• Software complexity
  – Components encapsulate much complexity into “black boxes”
  – Plug and play approach simplifies applications
  – Model coupling is natural in component-based approach
• Software performance (indirect)
  – Plug and play approach and rich “off the shelf” component library simplify changes to accommodate different platforms
A Simple Example: Numerical Integration Components
[Figure: components and their ports]
• Driver: provides GoPort, uses IntegratorPort
• MidpointIntegrator: provides IntegratorPort, uses FunctionPort
• MonteCarloIntegrator: provides IntegratorPort, uses FunctionPort and RandomGeneratorPort
• NonlinearFunction, LinearFunction, PiFunction: each provides FunctionPort
• RandomGenerator: provides RandomGeneratorPort
Interoperable components (provide same interfaces)
Many Applications are Possible…
[Figure: the same components and ports as on the previous slide; dashed lines indicate alternate connections]
Create different applications in "plug-and-play" fashion
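The plug-and-play idea on this slide can be sketched in a few lines of Python. This is an illustration only, not CCA or Babel code: the port and component names come from the slide, while the method names (`evaluate`, `integrate`), the constructor-based wiring, and the sample functions are assumptions made for the example.

```python
import math
import random
from abc import ABC, abstractmethod


class FunctionPort(ABC):
    """Port for evaluating a scalar function (method name is assumed)."""

    @abstractmethod
    def evaluate(self, x: float) -> float: ...


class IntegratorPort(ABC):
    """Port for integrating over [lo, hi] (method name is assumed)."""

    @abstractmethod
    def integrate(self, lo: float, hi: float, count: int) -> float: ...


class LinearFunction(FunctionPort):
    def evaluate(self, x: float) -> float:
        return 2.0 * x + 1.0  # arbitrary example function


class PiFunction(FunctionPort):
    def evaluate(self, x: float) -> float:
        return 4.0 / (1.0 + x * x)  # integrates to pi over [0, 1]


class MidpointIntegrator(IntegratorPort):
    """Provides IntegratorPort, uses FunctionPort."""

    def __init__(self, function: FunctionPort):
        self.function = function

    def integrate(self, lo, hi, count):
        h = (hi - lo) / count
        return h * sum(self.function.evaluate(lo + (i + 0.5) * h) for i in range(count))


class MonteCarloIntegrator(IntegratorPort):
    """Provides IntegratorPort, uses FunctionPort and a random generator."""

    def __init__(self, function: FunctionPort, rng: random.Random):
        self.function = function
        self.rng = rng

    def integrate(self, lo, hi, count):
        total = sum(self.function.evaluate(self.rng.uniform(lo, hi)) for _ in range(count))
        return (hi - lo) * total / count


def driver(integrator: IntegratorPort) -> float:
    """Plays the role of the Driver component: it only sees IntegratorPort."""
    return integrator.integrate(0.0, 1.0, 10_000)


# Three different "applications" built from the same parts.
midpoint_pi = driver(MidpointIntegrator(PiFunction()))
monte_carlo_pi = driver(MonteCarloIntegrator(PiFunction(), random.Random(0)))
midpoint_linear = driver(MidpointIntegrator(LinearFunction()))
```

Because `driver` depends only on `IntegratorPort`, and each integrator depends only on `FunctionPort`, swapping an integrator or a function requires no change to any other piece.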
The “Sociology” of Components
• Components need to be shared to be truly useful
  – Sharing can be at several levels
    • Source, binaries, remote service
  – Various models possible for intellectual property/licensing
    • Components with different IP constraints can be mixed in a single application
• Peer component models facilitate collaboration of groups on software development
  – Group decides overall architecture and interfaces
  – Individuals/sub-groups create individual components
Who Writes Components?
• “Everyone” involved in creating an application can/should create components
  – Domain scientists as well as computer scientists and applied mathematicians
  – Most will also use components written by other groups
• Allows developers to focus on their interest/specialty
  – Get other capabilities via reuse of others’ components
• Sharing components within scientific domain allows everyone to be more productive
  – Reuse instead of reinvention
• As a unit of publication, a well-written and tested component is like a high-quality library
  – Often a more appropriate unit of publication/recognition than an entire application code
CCA Concepts: Components
• Components are a unit of software composition
  – Composition is based on interfaces (ports)
• Components provide/use one or more ports
  – A component with no ports isn’t very interesting
  – Components interact via ports; implementation is opaque to the outside world
• Components include some code which interacts with the CCA framework
• The granularity of components is dictated by the application architecture and by performance considerations
• Components are peers
  – Application architecture determines relationships
[Figure: NonlinearFunction provides FunctionPort; MidpointIntegrator uses FunctionPort and provides IntegratorPort]
What is a Component Architecture?
• A set of standards that allows:
  – Multiple groups to write units of software (components)…
  – And have confidence that their components will work with other components written in the same architecture
• These standards define…
  – The rights and responsibilities of a component
  – How components express their interfaces
  – The environment in which components are composed to form an application and executed (framework)
  – The rights and responsibilities of the framework
CCA Concepts: Frameworks
• The framework provides the means to “hold” components and compose them into applications
  – The framework is often the application’s “main” or “program”
• Frameworks allow exchange of ports among components without exposing implementation details
• Frameworks provide a small set of standard services to components
  – BuilderService allows programs to compose CCA apps
• Frameworks may make themselves appear as components in order to connect to components in other frameworks
• Currently: specific frameworks support specific computing models (parallel, distributed, etc.). Future: full flexibility through integration or interoperation
CCA Concepts: Ports
• Components interact through well-defined interfaces, or ports
  – In OO languages, a port is a class or interface
  – In Fortran, a port is a bunch of subroutines or a module
• Components may provide ports – implement the class or subroutines of the port
• Components may use ports – call methods or subroutines in the port
• Links denote a procedural (caller/callee) relationship, not dataflow!
  – e.g., FunctionPort could contain: evaluate(in Arg, out Result)
[Figure: MidpointIntegrator’s “Uses” FunctionPort connected to NonlinearFunction’s “Provides” FunctionPort; MidpointIntegrator also provides IntegratorPort]
Interfaces, Interoperability, and Reuse
• Interfaces define how components interact…
  – Therefore interfaces are key to interoperability and reuse of components
• In many cases, “any old interface” will do, but…
  – General plug and play interoperability requires multiple implementations providing the same interface
• Reuse of components occurs when they provide interfaces (functionality) needed in multiple applications
Designing for Reuse, Implications
• Designing for interoperability and reuse requires “standard” interfaces
  – Typically domain-specific
  – “Standard” need not imply a formal process, may mean “widely used”
• Generally means collaborating with others
• Higher initial development cost (amortized over multiple uses)
• Reuse implies longer-lived code
  – thoroughly tested
  – highly optimized
  – improved support for multiple platforms
Relationships: Components, Objects, and Libraries
• Components are typically discussed as objects or collections of objects
  – Interfaces generally designed in OO terms, but…
  – Component internals need not be OO
  – OO languages are not required
• Component environments can enforce the use of published interfaces (prevent access to internals)
  – Libraries cannot
• It is possible to load several instances (versions) of a component in a single application
  – Impossible with libraries
• Components must include some code to interface with the framework/component environment
  – Libraries and objects do not
Domain-Specific Frameworks vs Generic Component Architectures
Domain-Specific
• Often known as “frameworks”
• Provide a significant software infrastructure to support applications in a given domain
  – Often attempts to generalize an existing large application
• Often hard to adapt to use outside the original domain
  – Tend to assume a particular structure/workflow for application
• Relatively common
Generic
• Provide the infrastructure to hook components together
  – Domain-specific infrastructure can be built as components
• Usable in many domains
  – Few assumptions about application
  – More opportunities for reuse
• Better supports model coupling across traditional domain boundaries
• Relatively rare at present
  – Commodity component models often not so useful in HPC scientific context
Special Needs of Scientific HPC
• Support for legacy software
  – How much change required for component environment?
• Performance is important
  – What overheads are imposed by the component environment?
• Both parallel and distributed computing are important
  – What approaches does the component model support?
  – What constraints are imposed?
  – What are the performance costs?
• Support for languages, data types, and platforms
  – Fortran?
  – Complex numbers? Arrays? (as first-class objects)
  – Is it available on my parallel computer?
What is the CCA? (User View)
• A component model specifically designed for high-performance scientific computing
• Supports both parallel and distributed applications
• Designed to be implementable without sacrificing performance
• Minimalist approach makes it easier to componentize existing software
What is the CCA? (2)
• Components are peers
• Not just a dataflow model
• A tool to enhance the productivity of scientific programmers
  – Make the hard things easier, make some intractable things tractable
  – Support & promote reuse & interoperability
  – Not a magic bullet
Importance of Provides/Uses Pattern for Ports
• Fences between components
  – Components must declare both what they provide and what they use
  – Components cannot interact until ports are connected
  – No mechanism to call anything not part of a port
• Ports preserve high performance direct connection semantics…
• …While also allowing distributed computing
[Figure: Direct Connection: Component 1 and Component 2 joined through a Provides/Uses port. Network Connection: Component 1’s Uses port linked to Component 2’s Provides port over the network]
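A toy framework makes the "fences" concrete. The sketch below is not the CCA API: the `Framework` class and its methods (`add_provides_port`, `connect`, `get_port`) are hypothetical names chosen for illustration. It shows that a component cannot reach another until the framework connects a uses port to a provides port, and that once connected, the call is an ordinary direct method call.

```python
class Framework:
    """Toy framework: records provided ports and wires them to uses ports."""

    def __init__(self):
        self._provided = {}     # (component, port name) -> object implementing the port
        self._connections = {}  # (user component, port name) -> provided object

    def add_provides_port(self, component, port_name, implementation):
        self._provided[(component, port_name)] = implementation

    def connect(self, user, port_name, provider):
        self._connections[(user, port_name)] = self._provided[(provider, port_name)]

    def get_port(self, user, port_name):
        try:
            return self._connections[(user, port_name)]
        except KeyError:
            raise RuntimeError(f"{user}: uses port {port_name!r} is not connected") from None


class NonlinearFunction:
    """Provides FunctionPort."""

    def __init__(self, framework):
        framework.add_provides_port("NonlinearFunction", "FunctionPort", self)

    def evaluate(self, x):
        return x * x


class MidpointIntegrator:
    """Provides IntegratorPort, uses FunctionPort."""

    def __init__(self, framework):
        self.framework = framework
        framework.add_provides_port("MidpointIntegrator", "IntegratorPort", self)

    def integrate(self, lo, hi, count):
        # The only way to reach another component is through a connected port.
        function = self.framework.get_port("MidpointIntegrator", "FunctionPort")
        h = (hi - lo) / count
        return h * sum(function.evaluate(lo + (i + 0.5) * h) for i in range(count))


framework = Framework()
function = NonlinearFunction(framework)
integrator = MidpointIntegrator(framework)

try:
    integrator.integrate(0.0, 1.0, 100)
    failed_before_connect = False
except RuntimeError:
    failed_before_connect = True  # ports were not connected yet

framework.connect("MidpointIntegrator", "FunctionPort", "NonlinearFunction")
result = integrator.integrate(0.0, 1.0, 100)  # approximates 1/3
```

In this sketch `get_port` hands back the providing object itself, which mirrors the direct-connection case; a distributed framework could instead return a proxy that forwards the call over the network without changing the component code.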
CCA Concepts: Framework Stays “Out of the Way” of Component Parallelism
• Single component multiple data (SCMD) model is component analog of widely used SPMD model
[Figure: processes P0 to P3, each holding the same components (blue, green, red) inside the framework (gray)]
MCMD/MPMD also supported
• Different components in same process “talk to each other” via ports and the framework
• Same component in different processes talk to each other through their favorite communications layer (e.g., MPI, PVM, GA)
• Each process loaded with the same set of components wired the same way
Other component models ignore parallelism entirely
CCA Concepts: MxN Parallel Data Redistribution
• Share Data Among Coupled Parallel Models
  – Disparate Parallel Topologies (M processes vs. N)
  – e.g. Ocean & Atmosphere, Solver & Optimizer…
  – e.g. Visualization (Mx1, increasingly, MxN)
Research area -- tools under development
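To make the MxN problem concrete, here is a minimal sketch for the simplest case: a one-dimensional array that is block-distributed over M source processes and must be re-blocked over N destination processes. The function names and the block distribution are assumptions for illustration; real MxN tools handle far more general decompositions and perform the actual communication.

```python
def block_ranges(n_items, n_procs):
    """Split indices 0..n_items-1 into contiguous blocks, one per process."""
    base, extra = divmod(n_items, n_procs)
    ranges, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def redistribution_schedule(n_items, m, n):
    """List (source rank, destination rank, start, stop) transfers from M to N blocks."""
    schedule = []
    for src, (s_lo, s_hi) in enumerate(block_ranges(n_items, m)):
        for dst, (d_lo, d_hi) in enumerate(block_ranges(n_items, n)):
            lo, hi = max(s_lo, d_lo), min(s_hi, d_hi)
            if lo < hi:  # the two blocks overlap, so data must move
                schedule.append((src, dst, lo, hi))
    return schedule


# 12 items moving from M = 3 processes to N = 2 processes.
schedule = redistribution_schedule(12, 3, 2)
# [(0, 0, 0, 4), (1, 0, 4, 6), (1, 1, 6, 8), (2, 1, 8, 12)]
```

Source process 1 has to split its block between both destinations, which is the kind of bookkeeping that grows quickly once the two sides use different multidimensional topologies.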
CCA Concepts: Language Interoperability
• Existing language interoperability approaches are “point-to-point” solutions
• Babel provides a unified approach in which all languages are considered peers
• Babel used primarily at interfaces
[Figure: C, C++, f77, f90, Python, and Java linked point-to-point, contrasted with the same languages linked through Babel]
Few other component models support all languages and data types important for scientific computing
What the CCA isn’t…
• CCA doesn’t specify who owns “main”
  – CCA components are peers
  – Up to application to define component relationships
    • “Driver component” is a common design pattern
• CCA doesn’t specify a parallel programming environment
  – Choose your favorite
  – Mix multiple tools in a single application
• CCA doesn’t specify I/O
  – But it gives you the infrastructure to create I/O components
• CCA doesn’t specify interfaces
  – But it gives you the infrastructure to define and enforce them
  – CCA Forum supports & promotes “standard” interface efforts
• CCA doesn’t require (but does support) separation of algorithms/physics from data
What the CCA is…
• CCA is a specification for a component environment
  – Fundamentally, a design pattern
  – Multiple “reference” implementations exist
  – Being used by applications
• CCA increases productivity
  – Supports and promotes software interoperability and reuse
  – Provides “plug-and-play” paradigm for scientific software
• CCA offers the flexibility to architect your application as you think best
  – Doesn’t dictate component relationships, programming models, etc.
  – Minimal performance overhead
  – Minimal cost for incorporation of existing software
• CCA provides an environment in which domain-specific application frameworks can be built
  – While retaining opportunities for software reuse at multiple levels
Review of CCA Terms & Concepts
• Ports
  – Interfaces between components
  – Uses/provides model
• Framework
  – Allows assembly of components into applications
• Direct Connection
  – Maintain performance of local inter-component calls
• Parallelism
  – Framework stays out of the way of parallel components
• MxN Parallel Data Redistribution
  – Model coupling, visualization, etc.
• Language Interoperability
  – Babel, Scientific Interface Definition Language (SIDL)
CCA Summary
• Components are a software engineering tool to help address software productivity and complexity
• Important concepts: components, interfaces, frameworks, composability, reuse
• Scientific component environments come in “domain specific” and “generic” flavors
• Scientific HPC imposes special demands on component environments
  – which commodity tools may have trouble with
• The Common Component Architecture is specially designed for the needs of HPC
• CCA is a research project, intended to be quite general, and not yet heavily used in production
Topics
• Meat Grinder Introduction
• Common Component Architecture (CCA)
• Cactus (slides from Tom Goodale)
• Earth System Modeling Framework (ESMF)
• Summary
What Is Cactus?
• Cactus is a framework for developing portable, modular applications, in particular, although not exclusively, high-performance simulation codes.
• Cactus is designed to allow experts in different fields to develop modules based upon their expertise and to leverage off modules developed by experts in other fields to perform their work, with minimal knowledge of the internals or operation of the other modules.
• This enables it to be used in large, geographically dispersed collaborations.
• Cactus and the Cactus Computational Toolkit are Open Source and freely available.
Cactus Goals
• Portable
• Modular
  – People can write modules that interact through standard interfaces with other modules without knowing internals of the other modules
  – Modules with same functionality are interchangeable
• Support legacy codes
• Make use of existing technologies and tools where appropriate
• Future proof
  – Not tied to any particular paradigm
  – Parallelism is independent but compatible with MPI or PVM
  – I/O system is independent but compatible with HDF or others
• Easy to use
• Maintainable
Cactus History
• First developed in 1997 by Paul Walker, Joan Masso, and others, as a continuation of a long line of numerical relativity codes, such as the NCSA G-code and Paul's Framework
• In its first years, Cactus became progressively more modular, allowing modules for different formulations of Einstein's equations and different physical systems
• Although in principle Cactus was modular, its history and evolution had left many dependencies between modules and between the core and the modules
• Cactus 4.0 (current) is a complete redesign of the core: everything possible was moved out into modules, and structures were put in place to enable modules to be far more independent
Current Cactus Users
• Numerical Relativity
  – Used by many groups including: AEI (Germany), UNAM (Mexico), Tuebingen (Germany), Southampton (UK), Sissa (Italy), Valencia (Spain), U. of Thessaloniki (Greece), MPA (Germany), RIKEN (Japan), TAT (Denmark), Penn State, U. of Texas at Austin, U. of Texas at Brownsville, LSU (USA), Wash, U. of Pittsburgh, U. of Arizona, Washburn, UIB (Spain), U. of Maryland, Monash (Australia)
• Quantum Gravity
• Coastal and Climate Modeling
• CFD
  – KISTI
  – DLR looking at flow in turbines
• Lattice Boltzmann
Over 150 Science Papers
Over 30 Student Theses
Cactus Structure
• Cactus source code consists of a core part, the Flesh, and a set of modules, the Thorns
• Flesh
  – Independent of all thorns
  – After initialization, acts as a utility and service library that the thorns call to get information or ask for some action to happen
• Thorns
  – Separate libraries that encapsulate some functionality
  – In order to keep a distinction between functionality and implementation of the functionality, each thorn declares that it provides a certain “implementation”
  – Different thorns can provide the same “implementation”, and thorn dependencies are expressed in terms of “implementations” rather than explicit references to thorns, thus allowing the different thorns providing the same “implementation” to be interchangeable
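The interchangeability described here can be sketched as a small dependency check. This is not Cactus code or CCL syntax: the function, the implementation names (`driver`, `wave`), and the `WaveEvolver` thorn are hypothetical, and the rule that only one active thorn may provide a given implementation is an assumption of this sketch. PUGH and Carpet are driver thorns named elsewhere in these slides.

```python
def resolve_implementations(active_thorns, provides, requires):
    """Map each implementation name to the active thorn that provides it.

    Raises ValueError if two active thorns provide the same implementation
    (an assumption of this sketch) or if a required implementation is missing.
    """
    providers = {}
    for thorn in active_thorns:
        implementation = provides[thorn]
        if implementation in providers:
            raise ValueError(
                f"{thorn} and {providers[implementation]} both provide {implementation!r}"
            )
        providers[implementation] = thorn
    for thorn in active_thorns:
        for needed in requires.get(thorn, ()):
            if needed not in providers:
                raise ValueError(f"{thorn} requires {needed!r}, which no active thorn provides")
    return providers


# Hypothetical declarations: dependencies name implementations, never thorns.
PROVIDES = {"PUGH": "driver", "Carpet": "driver", "WaveEvolver": "wave"}
REQUIRES = {"WaveEvolver": ["driver"]}

with_pugh = resolve_implementations(["PUGH", "WaveEvolver"], PROVIDES, REQUIRES)
with_carpet = resolve_implementations(["Carpet", "WaveEvolver"], PROVIDES, REQUIRES)
```

`WaveEvolver` is satisfied by either thorn list because it asks only for the `driver` implementation, so swapping one driver thorn for the other needs no change to the thorn that depends on it.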
Structure
[Figure: Cactus structure]
Core “Flesh” (ANSI C): parameters, grid variables, error handling, scheduling, extensible APIs, make system
Plug-In “Thorns” (modules; Fortran/C/C++): driver, input/output, interpolation, SOR solver, coordinates, boundary conditions, black holes, equations of state, remote steering, wave evolvers, multigrid
Cactus Flesh
• Make System
  – Organizes builds as configurations which hold everything needed to build with a particular set of options on a particular architecture
• API
  – Functions which must be there for thorns to operate
• Scheduling
  – Sophisticated scheduler which calls thorn-provided functions as and when needed
• CCL
  – Configuration language which tells the flesh all it needs to know about the thorns
Thorn Specification
• The Flesh finds out about thorns by configuration files in each thorn
• These files are converted at compile time into a set of routines the Flesh can call to find out about thorns
• There are three such files
  – Scheduling directives
    • The flesh incorporates a scheduler which is used to call defined routines from different thorns in a particular order
  – Interface definitions
    • All variables which are passed between scheduled routines need to be declared
    • Any thorn-provided functions which other thorns call should be declared
  – Parameter definitions
    • The flesh and thorns are controlled by a parameter file; parameters must be declared along with their allowed values
Scheduling
• Thorns specify which functions are to be called at which time, and in which order
• Rule-based scheduling system
• Routines are either before or after other routines (or don't care)
• Routines can be grouped, and whole group scheduled
• Functions or groups can be scheduled while some condition is true
• Flesh sorts all rules and flags an error for inconsistent schedule requests
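Sorting before/after rules and rejecting inconsistent ones amounts to a topological sort with cycle detection. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to show the idea; it is not how the Cactus flesh is implemented, the routine names are made up, and groups and conditional ("while") scheduling are left out.

```python
from graphlib import CycleError, TopologicalSorter


def build_schedule(routines, before_rules):
    """Order routines so that for every (first, second) rule, first runs before second."""
    sorter = TopologicalSorter({name: set() for name in routines})
    for first, second in before_rules:
        sorter.add(second, first)  # `second` must wait for `first`
    try:
        return list(sorter.static_order())
    except CycleError as error:
        # error.args[1] holds the routines that form the contradictory cycle
        raise ValueError(f"inconsistent schedule request: {error.args[1]}") from error


# Hypothetical routines and rules, as thorns might declare them.
routines = ["evolve", "boundary", "output", "initial_data"]
rules = [("initial_data", "evolve"), ("evolve", "boundary"), ("boundary", "output")]

order = build_schedule(routines, rules)
# ['initial_data', 'evolve', 'boundary', 'output']
```

Adding a rule such as `("output", "initial_data")` makes the requests contradictory, and `build_schedule` then raises an error instead of returning an order. Routines with no rules between them ("don't care") may appear in any relative order.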
The Driver Layer
• In principle, drivers are the only thorns which know anything about parallelism
• Other thorns access parallelism via an API provided by the flesh
• Underlying parallel layer could be anything from a TCP socket to Java RMI; it should be transparent to application thorns
• Could even be a combination of things
• Can even run with no parallel layer at all
• Can pick actual driver to use at runtime, so there is no need to recompile code to test differences between parallel layers
• Can take one executable and use whatever the best layer for any particular environment happens to be
Current Drivers
• There are several drivers available at the moment, developed both by the Cactus team and by the community.
• PUGH
  – a parallel uni-grid driver, which comes as part of the computational toolkit
• PAGH
  – a parallel AMR driver which uses the GrACE library for grid hierarchy management
• Carpet
  – a parallel fixed mesh refinement driver
• SimpleDriver
  – a simple demonstration driver which illustrates driver development
Cactus Computational Toolkit
• Core thorns which provide many basic utilities, such as:
  – Boundary conditions
  – I/O methods
  – Reduction and Interpolation operations
  – Coordinate Symmetries
  – Parallel drivers
  – Elliptic solvers
  – Web-based interaction and monitoring interface
  – ...
Current Capabilities: Methods
• Almost all codes in Cactus are explicit finite difference codes on structured meshes
• In principle, finite volume or finite element on structured meshes is possible
• There is now a generic method-of-lines thorn which makes developing thorns using such methods very quick and easy
• Interface for elliptic solvers and support for generic elliptic solver packages such as PETSc as well as a numerical-relativity-specific multigrid solver written by Bernd Bruegmann
  – However, interface is not as generic as it could be, and it may not be too useful as it stands for solving general implicit problems
Current Capabilities: Interaction
• HTTPD thorn provides interface that allows web browser to connect to running simulation
• Allows a user to examine state of running simulation and change parameters, such as frequency of I/O or variables to be output, or any other parameter that thorn author declared may be changed during the simulation
• These capabilities may be extended by any other thorn
  – E.g. the HTTPDExtra thorn allows the user to download any file output by the I/O thorns in the Computational toolkit, and even to view two-dimensional slices as jpegs
  – Also, there is a helper script for web browsers that allows an appropriate visualization tool to be launched when a user requests a file
Cactus Summary
• Used to build applications in numerical relativity, and starting to be used in other fields
• Flesh/Thorns distinction
  – Flesh is like CCA Framework + some general components
  – Thorns are like CCA components
• Production code for certain domains, well-used and well-tested
Topics
• Meat Grinder Introduction
• Common Component Architecture (CCA)
• Cactus
• Earth System Modeling Framework (ESMF) (slides from ESMF tutorial)
• Summary
ESMF Motivation and Context
In climate research and NWP... increased emphasis on detailed representation of individual physical processes; requires many teams of specialists to contribute components to an overall modeling system
In computing technology... increase in hardware and software complexity in high-performance computing, as we shift toward the use of scalable computing architectures
In software... development of first-generation frameworks, such as FMS, GEMS, CCA and WRF, that encourage software reuse and interoperability
What is ESMF?
• ESMF provides tools for turning model codes into components with standard interfaces and standard drivers.
• ESMF provides data structures and common utilities that components use for routine services such as data communications, regridding, time management and message logging.
[Figure: ESMF layers]
ESMF Infrastructure
  Data Classes: Bundle, Field, Grid, Array
  Utility Classes: Clock, LogErr, DELayout, Machine
ESMF Superstructure
  AppDriver
  Component Classes: GridComp, CplComp, State
User Code
ESMF GOALS
1. Increase scientific productivity by making model components much easier to build, combine, and exchange, and by enabling modelers to take full advantage of high-end computers.
2. Promote new scientific opportunities and services through community building and increased interoperability of codes (impacts in collaboration, code validation and tuning, teaching, migration from research to operations).
Application Example: GEOS-5 AGCM
[Figure: GEOS-5 component hierarchy with couplers between components. Labels: surface, fvcore, gravity_wave_drag, history, agcm, dynamics, physics, chemistry, moist_processes, radiation, turbulence, infrared, solar, lake, land_ice, data_ocean, land, vegetation, catchment]
• Each box is an ESMF component
• Every component has a standard interface so that it is swappable
• Data in and out of components are packaged as state types with user-defined fields
• New components can be added to the system
• Each ESMF application is also a Gridded Component
• Entire ESMF applications can be nested within larger applications
• This strategy can be used to systematically compose very large, multi-component codes.
• Coupling tools include regridding and redistribution methods
Design Strategies
• Modularity
  ◦ Gridded Components don’t have access to the internals of other Gridded Components, and don’t store any coupling information
  ◦ Gridded Components pass their States to other components through their argument list.
  ◦ Components can be used standalone or coupled with others into a larger application.
• Flexibility
  ◦ Users write their own drivers as well as their own Gridded Components and Coupler Components, so users decide on their own control flow
• Communication
  ◦ All communication handled within components. If an atmosphere is coupled to an ocean, the Coupler Component is defined on both atmosphere and ocean processors.
  ◦ The same programming interface is used for shared memory, distributed memory, and combinations thereof. This buffers the user from variations and changes in the underlying platforms.
Elements of Parallelism: Serial vs. Parallel
• Computing platforms can have multiple processors, some or all of which may share the same memory pools
• Can be multiple Persistent Execution Threads (PETs)
• Can be multiple PETs per processor
• Software like MPI and OpenMP commonly used for parallelization
• Programs can run in a serial fashion, with one PET, or in parallel, using multiple PETs
• Often, a PET can be thought of as a processor
• Sets of PETs are represented by Virtual Machines (VMs)
Elements of Parallelism: Sequential vs. Concurrent
In sequential mode components run one after the other on the same set of PETs.
[Figure: sequential mode. The AppDriver (“Main”) calls Run in a loop; GridComp “Hurricane Model”, GridComp “Atmosphere”, GridComp “Ocean” and CplComp “Atm-Ocean Coupler” each run in turn across PETs 1-9. Axes: PETs, Time.]
In concurrent mode components run at the same time on different sets of PETs
[Figure: concurrent mode. The AppDriver (“Main”) calls Run in a loop; GridComp “Hurricane Model”, GridComp “Atmosphere”, GridComp “Ocean” and CplComp “Atm-Ocean Coupler” are shown on PETs 1-9, with two Run calls side by side at the same time. Axes: PETs, Time.]
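The difference between the two modes comes down to which PETs each component is given. The following is a minimal Python sketch, assuming nine PETs as in the figures. The function names and the 5/4 split between atmosphere and ocean are hypothetical and are not part of the ESMF API; the coupler is placed on all PETs of the components it couples.

```python
def sequential_layout(gridded, coupler, pets):
    """Sequential mode: every component runs on the same full set of PETs."""
    layout = {name: list(pets) for name in gridded}
    layout[coupler] = list(pets)
    return layout


def concurrent_layout(gridded, coupler, pets):
    """Concurrent mode: gridded components get disjoint, contiguous PET sets."""
    share, extra = divmod(len(pets), len(gridded))
    layout, start = {}, 0
    for i, name in enumerate(gridded):
        size = share + (1 if i < extra else 0)  # spread any remainder over the first components
        layout[name] = list(pets[start:start + size])
        start += size
    # Assumption: the coupler spans the PETs of all components it couples.
    layout[coupler] = list(pets)
    return layout


pets = list(range(1, 10))  # PETs 1..9
gridded = ["Atmosphere", "Ocean"]
sequential = sequential_layout(gridded, "Atm-Ocean Coupler", pets)
concurrent = concurrent_layout(gridded, "Atm-Ocean Coupler", pets)
```

In the sequential layout each component sees all nine PETs, so only one can run at a time; in the concurrent layout the two gridded components hold disjoint PET sets and can run simultaneously.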
Elements of Parallelism: DEs
[Figure: a VM with 9 PETs (1-9), a 1 x 9 DELayout with DEs 1-9, and a 4 x 9 Temperature Field T with elements:]
T1 T10 T19 T28
T2 T11 T20 T29
T3 T12 T21 T30
T4 T13 T22 T31
T5 T14 T23 T32
T6 T15 T24 T33
T7 T16 T25 T34
T8 T17 T26 T35
T9 T18 T27 T36
• Data decomposition represented as set of Decomposition Elements (DEs)
• Sets of DEs are represented by the DELayout class
• DELayouts define how data is mapped to PETs
• In many applications there is one DE per PET
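The mapping from field elements to DEs to PETs can be sketched in a few lines of Python. This is only an illustration: it assumes that each of the 9 DEs holds one row of the 4 x 9 field (so DE 1 holds T1, T10, T19, T28), which the slide does not state explicitly, and the function names are hypothetical rather than ESMF calls.

```python
N_DES = 9    # 1 x 9 DELayout
N_COLS = 4   # 4 x 9 field, elements numbered T1..T36 down each column


def decompose(n_des, n_cols):
    """Return {de: [labels]}, assuming one row of the field per DE."""
    return {
        de: [f"T{col * n_des + de}" for col in range(n_cols)]
        for de in range(1, n_des + 1)
    }


def map_des_to_pets(des, pets):
    """Assign DEs to PETs round-robin; with equal counts this is one DE per PET."""
    return {de: pets[i % len(pets)] for i, de in enumerate(des)}


field_by_de = decompose(N_DES, N_COLS)
de_to_pet = map_des_to_pets(sorted(field_by_de), list(range(1, 10)))
```

With nine DEs and nine PETs the round-robin mapping reduces to the common one-DE-per-PET case; with fewer PETs, several DEs would share a PET.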
Modes of Parallelism: Single vs. Multiple Executable
• In Single Program Multiple Datastream (SPMD) mode the same program runs across all PETs in the application - components may run sequentially or concurrently.
• In Multiple Program Multiple Datastream (MPMD) mode the application consists of separate programs launched as separate executables - components may run concurrently or sequentially, but in this mode almost always run concurrently
Classes and Objects in ESMF
• The ESMF Application Programming Interface (API) is based on the object-oriented programming notion of a class
– A class is a software construct used for grouping a set of related variables together with the subroutines and functions that operate on them
– Classes help to organize the code, and often make it easier to maintain and understand.
• A particular instance of a class is an object
– For example, Field is an ESMF class
– An actual Field called temperature is an object
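The class/object distinction can be shown with a small Python sketch. The `Field` class below is an illustrative stand-in, not the ESMF Field interface; its attributes and the `mean` method are assumptions chosen only to show variables grouped with the functions that operate on them.

```python
class Field:
    """A class: related variables plus the functions that operate on them."""

    def __init__(self, name, units, values):
        self.name = name
        self.units = units
        self.values = list(values)

    def mean(self):
        """Average of the stored values."""
        return sum(self.values) / len(self.values)


# An object: one particular instance of the class.
temperature = Field("temperature", "K", [280.0, 285.0, 290.0])
```

Here `Field` is the class and `temperature` is an object; a second object such as a pressure field would reuse the same class with different data.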
ESMF Class Structure
[Figure: ESMF class structure. Layer and language labels in the figure: Superstructure, Infrastructure, F90, C++, Data, Communications.]
Superstructure
GridComp: Land, ocean, atm, … model
CplComp: Xfers between GridComps
State: Data imported or exported
Infrastructure
Bundle: Collection of fields
Field: Physical field, e.g. pressure
Grid: LogRect, Unstruct, etc.
PhysGrid: Math description
DistGrid: Grid decomposition
Regrid: Computes interp weights
Array: Hybrid F90/C++ arrays
Route: Stores comm paths
DELayout: Communications
Utilities: Virtual Machine, TimeMgr, LogErr, IO, ConfigAttr, Base etc.
ESMF Superstructure Classes
• Gridded Component
– Models, data assimilation systems - “real code”
• Coupler Component
– Data transformations and transfers between Gridded Components
• State
– Packages of data sent between Components
– Can be Bundles, Fields, Arrays, States, or name-placeholders
• Application Driver
– Generic driver
ESMF Components
• A Component has two parts
– One supplied by ESMF - an ESMF derived type that is either a Gridded Component or a Coupler Component
– One supplied by the user
• Gridded Component typically represents a physical domain in which data is associated with one or more grids - for example, a sea ice model
• Coupler Component arranges and executes data transformations and transfers between one or more Gridded Components.
• Gridded Components and Coupler Components have standard methods, which include initialize, run, and finalize
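The standard initialize/run/finalize methods and the passing of States through argument lists can be sketched as follows. This is a toy Python model, not ESMF code: the class names, the use of plain dictionaries as States, and the "sst" field are all assumptions made for illustration.

```python
class Component:
    """Standard methods shared by every component in this sketch."""

    def initialize(self, import_state, export_state):
        pass

    def run(self, import_state, export_state):
        raise NotImplementedError

    def finalize(self, import_state, export_state):
        pass


class Ocean(Component):
    """Plays the role of a Gridded Component."""

    def initialize(self, import_state, export_state):
        export_state["sst"] = 290.0

    def run(self, import_state, export_state):
        export_state["sst"] += 0.5  # toy "physics" step


class Atmosphere(Component):
    """Plays the role of a second Gridded Component."""

    def initialize(self, import_state, export_state):
        self.seen = []

    def run(self, import_state, export_state):
        self.seen.append(import_state["sst"])


class Coupler(Component):
    """Plays the role of a Coupler Component: moves data between States."""

    def run(self, import_state, export_state):
        export_state["sst"] = import_state["sst"]


def driver(steps):
    """User-written driver: the user decides the control flow."""
    ocean, atm, cpl = Ocean(), Atmosphere(), Coupler()
    ocean_export, atm_import = {}, {}
    ocean.initialize({}, ocean_export)
    atm.initialize(atm_import, {})
    for _ in range(steps):
        ocean.run({}, ocean_export)
        cpl.run(ocean_export, atm_import)
        atm.run(atm_import, {})
    for comp in (ocean, cpl, atm):
        comp.finalize({}, {})
    return atm.seen


history = driver(3)
```

Neither gridded component touches the other's internals: the ocean only writes its export State, the atmosphere only reads its import State, and the coupler is the only piece that knows about both.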
ESMF Infrastructure Data Classes
• Model data is contained in a hierarchy of multi-use classes
• The user can reference a Fortran array to an Array or Field, or retrieve a Fortran array out of an Array or Field.
• Array – holds a Fortran array (with other info, such as halo size)
• Field – holds an Array, an associated Grid, and metadata
• Bundle – collection of Fields on the same Grid bundled together for convenience, data locality, latency reduction during communications
Supporting these data classes is the Grid class, which represents a numerical grid
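The containment hierarchy described above (a Bundle holds Fields, a Field holds an Array plus a Grid and metadata) can be sketched with Python dataclasses. The attribute names and the same-Grid check in `add` are assumptions for illustration; they mirror the slide's description rather than the actual ESMF interfaces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grid:
    """Stand-in for a numerical grid."""
    name: str
    shape: tuple


@dataclass
class Array:
    """Raw data plus extra information such as halo size."""
    data: list
    halo_width: int = 0


@dataclass
class Field:
    """An Array, its associated Grid, and metadata."""
    name: str
    array: Array
    grid: Grid
    metadata: dict = field(default_factory=dict)


@dataclass
class Bundle:
    """A collection of Fields that all live on the same Grid."""
    grid: Grid
    fields: list = field(default_factory=list)

    def add(self, new_field):
        if new_field.grid != self.grid:
            raise ValueError("all Fields in a Bundle must share one Grid")
        self.fields.append(new_field)


grid = Grid("lat_lon", (4, 9))
temperature = Field("temperature", Array([0.0] * 36, halo_width=1), grid, {"units": "K"})
pressure = Field("pressure", Array([0.0] * 36), grid, {"units": "Pa"})
bundle = Bundle(grid)
bundle.add(temperature)
bundle.add(pressure)
```

Grouping Fields that share a Grid is what lets a Bundle be communicated as one unit instead of field by field.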
Application Driver
• Small, generic program that contains the “main” for an ESMF application.
ESMF Communications
• Halo
– Updates edge data for consistency between partitions
• Redistribution
– No interpolation, only changes how the data is decomposed
• Regrid
– Based on SCRIP package from Los Alamos
– Methods include bilinear, conservative
• Bundle, Field, Array-level interfaces
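Halo update and redistribution can be illustrated on a 1-D field in plain Python. This is a serial sketch under simplifying assumptions (equal-sized partitions, one halo cell per side, non-periodic boundaries); the function names are hypothetical and no actual message passing or ESMF call is involved.

```python
def split_with_halo(values, n_parts):
    """Split a 1-D field into equal partitions, each padded with one halo cell per side."""
    size = len(values) // n_parts  # assumes len(values) is divisible by n_parts
    return [[None] + values[i * size:(i + 1) * size] + [None] for i in range(n_parts)]


def halo_update(parts):
    """Copy each neighbour's edge value into the adjacent halo cell."""
    for i, part in enumerate(parts):
        if i > 0:
            part[0] = parts[i - 1][-2]   # left halo <- left neighbour's last interior cell
        if i < len(parts) - 1:
            part[-1] = parts[i + 1][1]   # right halo <- right neighbour's first interior cell
    # Outer halos stay None: boundaries are treated as non-periodic here.


def redistribute(parts, n_parts):
    """Re-split the same interior data over a different number of partitions (no interpolation)."""
    interior = [v for part in parts for v in part[1:-1]]
    return split_with_halo(interior, n_parts)


parts = split_with_halo([10, 11, 12, 13, 14, 15], 3)
halo_update(parts)
two_parts = redistribute(parts, 2)
```

After redistribution the halo cells are empty again, so a fresh halo update is needed before the new partitions are consistent with their neighbours.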
ESMF Utilities
• Time Manager
• Configuration Attributes (replaces namelists)
• Message logging
• Communication libraries
• Regridding library (parallelized, on-line SCRIP)
• I/O (barely implemented)
• Performance profiling (not implemented yet, may simply use Tau)
ESMF Summary
• Developed for and by climate community
• Sandwich model
– ESMF provides superstructure and infrastructure, user provides filling
• Used for some applications, and increasingly, apps are written using it
• Mostly Fortran-based (user community requirement), and CCA compatible
Summary – Material for the Test
• CCA Motivations: slides 25-27
• Component based Software Engineering: slide 29
• CCA Concepts: slides 34-50
• What is Cactus: slides 54,55,57
• Cactus Architecture: slides 58-65
• Cactus, current capabilities: slides 66,67
• What is ESMF: slides 70,71
• Design concepts in ESMF: slides 73-77
• ESMF Architectural Components: slides 78-85
URLs
• Common Component Architecture (CCA)
– http://www.cca-forum.org/
• Cactus
– http://www.cactuscode.org/
• Earth System Modeling Framework (ESMF)
– http://www.esmf.ucar.edu/