SC03, Nov 2003Y.He1 MPH: a Library for Coupling Climate Component Models on Distributed Memory...

21
SC03, Nov 2003 Y.He 1 MPH: a Library for Coupling Climate Component Models on Distributed Memory Architectures Chris Ding and Yun (Helen) He ([email protected] , [email protected] ) Lawrence Berkeley National Laboratory

Transcript of SC03, Nov 2003Y.He1 MPH: a Library for Coupling Climate Component Models on Distributed Memory...

SC03, Nov 2003 Y.He 1

MPH: a Library for Coupling Climate Component Models on Distributed Memory Architectures

Chris Ding and Yun (Helen) He

([email protected], [email protected])

Lawrence Berkeley National Laboratory

SC03, Nov 2003 Y.He 2

SC03, Nov 2003 Y.He 3

Motivation

Application problems grow in scale & complexity

Effective organization of simulation software system a major issue

Software lasts much longer than a computer!

SC03, Nov 2003 Y.He 4

Multi-Component Approach

Build from (semi-)independent programs Coupled Climate System = Atmosphere + Ocean + Sea-

Ice + Land-Surface + Flux-Coupler Components developed by different groups at different

institutions Maximum flexibility and independence Algorithm, implementation depends on individual groups,

practicality, time-to-completion, etc.

Components communicate through well-defined interface data structure.

SC03, Nov 2003 Y.He 5

Distributed Components on HPC Systems

Use MPI for high performance MPH: establish a multi-component environment

MPI Communicator for each components Component name registration Resource allocation for each component Support different job execution modes Stand-out / stand-in redirect Complete flexibility

SC03, Nov 2003 Y.He 6

A climate simulation system consists of many independently-developed componentson distributed memory multi-processor computer

Single-component executable: Each component is a stand-alone executable

Multi-component executable: Several components compiled into an executable

Different model integration modes: Single-Component executable Multi-Executable system (SCME) Multi-Component executable Single-Executable system (MCSE) Multi-Component executable Multi-Executable system (MCME) Multi-Instance executable Multi-Executable system (MIME)

SC03, Nov 2003 Y.He 7

Component Integration / Job Execution Modes

Multi-Component exec. Single-Executable system (MCSE): Each component is a module All components compiled into a single executable Many issues: name conflict, static allocations, etc. Stand-alone component Easy to understand and coordinate

SC03, Nov 2003 Y.He 8

Component Integration / Job Execution Modes

Single-Component exec. Multi-Executable system (SCME): Each component is an independent executable image Components run on separate subsets of SMP nodes Max flexibility in language, data structures, etc. Industry standard approach

SC03, Nov 2003 Y.He 9

Component Integration / Job Execution Modes

Multi-Component exec. Multi-executable system (MCME): Several components compiled into one executable Multiple executables form a single system Different executables run on different processors Different components within same executable could

run on separate subsets of processors Maximum flexibility Includes MCSE and SCME as special cases

SC03, Nov 2003 Y.He 10

Component Integration / Job Execution Modes

Multi-Instance exec. Multi-executable system (MIME): Same executable replicated multiple times on

different processor subsets Run multiple ensembles simultaneously as a single

job Ensemble statistics able to run on the fly Efficient usage of computer resource

SC03, Nov 2003 Y.He 11

Multi-Component Single-Executable (MCSE)

master.F:master.F: PCMPCM mph_exe_world = mph_exe_world = MPH_components_setupMPH_components_setup (name1=‘ (name1=‘atmosphereatmosphere’, ’, name2=‘name2=‘oceanocean’, name3=‘’, name3=‘couplercoupler’) ’)

if (PE_in_component (‘if (PE_in_component (‘oceanocean’, ’, commcomm)) call )) call ocean_v1ocean_v1 ( (commcomm)) if (PE_in_component (‘if (PE_in_component (‘atmosphereatmosphere’, ’, commcomm)) call )) call atmosphereatmosphere ( (commcomm)) if (PE_in_component (‘if (PE_in_component (‘couplercoupler’, ’, commcomm)) call )) call coupler_v2coupler_v2 ( (commcomm))

Component registration file:Component registration file: BEGINBEGIN Multi_Comp_StartMulti_Comp_Start atmosphereatmosphere 0 7 0 7 ocean ocean 8 13 8 13 couplercoupler 14 15 14 15 Multi_Comp_EndMulti_Comp_End ENDEND

SC03, Nov 2003 Y.He 12

Single-Component Multi-Executable (SCME)

CCSMCCSM Coupled System = Atmosphere + Ocean + Flux-Coupler

atm.F: atm.F: atm_world = MPH_components_setup(“atm_world = MPH_components_setup(“atmosphereatmosphere”)”) ocean.F: ocean.F: ocn_world = MPH_components_setup (“ocn_world = MPH_components_setup (“oceanocean”)”) coupler.F: coupler.F: cpl_world =cpl_world = MPH_components_setup (“MPH_components_setup (“couplercoupler”)”)

Component Registration File:Component Registration File: BEGINBEGIN

atmosphereatmosphere ocean ocean couplercoupler ENDEND

SC03, Nov 2003 Y.He 13

Multi-Component Multi-Executable (MCME)

Most FlexibleMost Flexible exe_world = exe_world = MPH_componentsMPH_components (name1=‘ (name1=‘oceanocean’, name2=‘’, name2=‘iceice’,…) ’,…) Component Registration File:Component Registration File: BEGINBEGIN coupler ! a single-component executablecoupler ! a single-component executable Multi_Comp_Start ! first multi-component executable Multi_Comp_Start ! first multi-component executable ocean 0 3ocean 0 3 ice 4 10ice 4 10 Multi_Comp_EndMulti_Comp_End Multi_Comp_Start ! second multi-component executableMulti_Comp_Start ! second multi-component executable atmosphere 0 10 atmosphere 0 10 land 11 13land 11 13 chemistry 14 25chemistry 14 25 Multi_Comp_EndMulti_Comp_End ENDEND

SC03, Nov 2003 Y.He 14

Multi-Instance Multi-Executable (MIME)

Ensemble SimulationsEnsemble Simulations exe_world = exe_world = MPH_componentsMPH_components (name1=‘ (name1=‘oceanocean’, name2=‘’, name2=‘iceice’,…) ’,…) Component Registration File:Component Registration File: BEGINBEGIN Multi_Instance_Start ! a multi-instance executable Multi_Instance_Start ! a multi-instance executable ocean1 0 15 ocean1 0 15 infile_1 outfile_1 logfile_1 alpha=3 debug=offinfile_1 outfile_1 logfile_1 alpha=3 debug=off ocean2 16 31 ocean2 16 31 infile_2 outfile_2 beta=4.5 debug=oninfile_2 outfile_2 beta=4.5 debug=on ocean3 32 47 ocean3 32 47 infile_3 dynamics=finite_volumeinfile_3 dynamics=finite_volume Multi_Instance_EndMulti_Instance_End statistics ! a single-component executablestatistics ! a single-component executable ENDEND Up to 5 strings in each line could be appended for passing parameters:Up to 5 strings in each line could be appended for passing parameters: call MPH_get_argument (“alpha”, alpha)call MPH_get_argument (“alpha”, alpha) call MPH_get_argument(field_num=2, field_val=output_file)call MPH_get_argument(field_num=2, field_val=output_file)

SC03, Nov 2003 Y.He 15

Joining two components

MPH_comm_join (“atmosphere”, “ocean”, comm_new) comm_new contains all nodes in “atmosphere”, “ocean”. “atmosphere” nodes rank 0~7, “ocean” nodes rank 8~11

MPH_comm_join (“ocean”, “atmosphere”, comm_new) “ocean” nodes rank 0~3, “atmosphere” nodes rank 4~11

Afterwards, data remapping with “comm_new”

Direct communication between two components MPI_send (…, MPH_global_id (“ocean”, 3), …)

for query / probe tasks

SC03, Nov 2003 Y.He 16

MPH Inquiry Functions

MPH_global_id() MPH_local_id() MPH_comp_name() MPH_total_components() MPH_exe_world() MPH_num_ensembles MPH_get_strings() MPH_get_argument() …

SC03, Nov 2003 Y.He 17

Status

Completed MPH1, MPH2, MPH3, MPH4 Software available free online: http://www.nersc.

gov/research/SCG/acpi/MPH Complete users manual

MPH runs on IBM SP SGI Origin HP Compaq clusters PC Linux clusters

SC03, Nov 2003 Y.He 18

MPH Users

MPH users NCAR CCSM CSU Icosahedra Grid Coupled Climate Model NCAR/WRF, for coupled models

People expressed clear interests in using MPH SGI/NASA, Irene Carpenter / Jim Laft, on SGI for coupled

models UK ECMWF, for ensemble simulations Germany, Johannes Diemer, for coupled model on HP clusters NOAA, for coupling models over grids

SC03, Nov 2003 Y.He 19

Related Work

MPH is similar to PVM, but much smaller, simpler, scalable, and support more execution/integration modes.

Software industry trend (CORBA, DCE, DCOM) Common Component Architecture (DOE project) Earth System Model Framework (ESMF)

SC03, Nov 2003 Y.He 20

Future Work

Flexible way to handle SMP nodes for MPI tasks Dynamic component model processor allocation or

migration Extension to do model integration over grids A C/C++ version

SC03, Nov 2003 Y.He 21

Summary Multi-Component Approach for large & complex

application software MPH glues together distributed components Main Functionality:

Name registration resource allocation inter-component communication query multi-component environment

Single-Component Multi-Executable (SCME) Multi-Component Single-Executable (MCSE) Multi-Component Multi-Executable (MCME) Multi-Instance Multi-Executable (MIME) Easily switch between different modes