“High Performance Computing and Simulation Symposium 2008” Ottawa, Canada, April 14-16, 2008

Solution of the Implicit Formulation of High Order Diffusion for the Canadian Atmospheric GEM Model
“High Performance Computing and Simulation Symposium 2008”, Ottawa, Canada, April 14-16, 2008
Abdessamad Qaddouri & Vivian Lee, Atmospheric Science & Technology




Outline

• Introduction of the GEM model
• High-order diffusion equation and solution
• Parallelization of the solution
• Numerical performance tests
• Conclusion


Numerical Weather Prediction (NWP)

• Physics
• Applied mathematics
• Real-time applications
• Computers at the Canadian Meteorological Centre (CMC)

[Figure: performance (MFLOPS, logarithmic axis from 1 to 10,000,000) of CMC computers by year, 1974 to 2006. Machines shown: CDC 7600, CDC 176, Cray 1S, Cray XMP 28, Cray XMP 416, NEC SX-3/44, NEC SX-3/44R, NEC SX-4/16, NEC SX-4/80M3, NEC SX-5/32M2, NEC SX-6/80M10, IBM P4, IBM P5+.]

[Figure: forecast systems arranged by forecast lead time (days, axis marked from 0 to 365), covering deterministic, probabilistic, statistical and empirical forecasts. Configurations shown: 2.5 km resolution (once per day); 15 km resolution (twice per day); 35 km resolution (once per day); 100 km resolution (once per day); 250 km resolution (twice per month); 250-400 km resolution (4 times per year); statistical (4 times per year).]

[Figure: grid configurations: variable, uniform, rotated, limited area.]

Grid sizes:

• 15 km: 574 x 641 x 58
• 35 km: 800 x 600 x 58
• 2.5 km: 672 x 494 x 58


Hydrostatic Model

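• Horizontal motion (momentum)

$$\frac{d\mathbf{V}_H}{dt} + R_d T_v \nabla \ln p + \nabla \Phi + f\,\mathbf{k} \times \mathbf{V}_H = \mathbf{F}_H$$

• Thermodynamics, hydrostatic and state

$$\frac{d \ln T}{dt} - \kappa \frac{d \ln p}{dt} = F^{T}; \qquad \frac{\partial \Phi}{\partial p} = -\frac{1}{\rho}, \quad \Phi = gh; \qquad p = \rho R T$$

• Continuity and boundary conditions

$$\frac{d}{dt} \ln \left| \frac{\partial p}{\partial Z} \right| + D + \frac{\partial \dot{Z}}{\partial Z} = 0; \qquad \dot{Z} = 0 \ \text{at} \ Z = Z_{bottom},\ Z_{top}$$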


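Schematic for the semi-Lagrangian implicit method used for the integration of the GEM model

[Flow diagram. Recoverable elements: governing equations $\frac{dX}{dt} + H(X) = 0$ with $X = (\mathbf{V}, T, p, \ldots)$; discretization along trajectories obtained from $\frac{d\mathbf{r}}{dt} = \mathbf{V}(\mathbf{r}, t)$, giving a problem for $X$ with right-hand side $R$; nonlinear iterations from iterate $X^{(k)}$ to $X^{(k+1)}$ involving the operators $L$ and $N$; diffusion on specific fields.]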


Horizontal High order Diffusion

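• Horizontal prognostic field $\psi$

$$\frac{\partial \psi}{\partial t} = (-1)^{m/2+1}\, \nu\, \nabla^{m} \psi; \qquad m = 2, 4, 6, 8$$

• Damping rate

[Figure: damping rate as a function of wavelength.]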


Horizontal High order Diffusion…

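• Horizontal prognostic field $\psi$

$$\frac{\partial \psi}{\partial t} = (-1)^{m/2+1}\, \nu\, \nabla^{m} \psi; \qquad m = 2, 4, 6, 8$$

• Implicit discretization

$$\frac{\psi^{n+1} - \psi^{n}}{\Delta t} = (-1)^{m/2+1}\, \nu\, \nabla^{m} \psi^{n+1}$$

$$\left[ 1 - (-1)^{m/2+1}\, \nu\, \Delta t\, \nabla^{m} \right] \psi^{n+1} = \psi^{n} \equiv R$$

with

$$\nabla^{2} = \frac{1}{a^{2}} \left[ \frac{1}{\cos^{2}\theta} \frac{\partial^{2}}{\partial \lambda^{2}} + \frac{1}{\cos\theta} \frac{\partial}{\partial \theta} \left( \cos\theta\, \frac{\partial}{\partial \theta} \right) \right]$$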
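As a rough illustration of why the higher orders are more scale-selective, the sketch below evaluates the per-step response of the implicit scheme for a single Fourier mode. It assumes a one-dimensional Cartesian wave exp(ikx) in place of the spherical operator, so that the del-m operator contributes (-1)^(m/2) k^m and the implicit update reduces to psi_new = psi_old / (1 + nu dt k^m). The function names, the grid spacing, the time step and the choice of tuning nu so that the shortest (2 dx) wave is halved at every step are illustrative assumptions, not values from the talk.

```python
import numpy as np

def implicit_response(wavelength, nu, dt, m):
    """Per-step amplification of one Fourier mode for implicit del-m diffusion.

    Assumes a 1-D Cartesian mode exp(i k x); the response is 1 / (1 + nu dt k^m).
    """
    k = 2.0 * np.pi / np.asarray(wavelength, dtype=float)
    return 1.0 / (1.0 + nu * dt * k**m)

def nu_for_two_dx(dx, dt, m, response=0.5):
    """Coefficient nu that gives the 2*dx wave the requested per-step response."""
    k_max = np.pi / dx  # wavenumber of the 2*dx wave
    return (1.0 / response - 1.0) / (dt * k_max**m)

# Hypothetical settings: 35 km spacing and a 900 s time step.
dx, dt = 35.0e3, 900.0
wavelengths = dx * np.array([2.0, 4.0, 10.0, 40.0])
for m in (2, 4, 6, 8):
    nu = nu_for_two_dx(dx, dt, m)
    print(m, np.round(implicit_response(wavelengths, nu, dt, m), 4))
```

With nu tuned this way every order damps the 2 dx wave equally, while the longer waves are left closer to untouched as m increases.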


Horizontal High order Diffusion …

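• Del 4 horizontal diffusion

$$\left( 1 + \nu\, \Delta t\, \nabla^{4} \right) \psi = R \quad \Longleftrightarrow \quad \nabla^{2} \psi - \phi = 0, \qquad \psi + \nu\, \Delta t\, \nabla^{2} \phi = R$$

• Spatial discretization

[Equations not recoverable: the discrete system couples $\psi$ and $\phi$ through a matrix $A$ and projection operators $P$, with a right-hand side $r$ built from $R$.]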


Spatial discretization

[Equations not recoverable: definitions of the one-dimensional discrete operators in longitude (index i = 1, ..., Ni) and latitude (index j = 1, ..., Nj), including the operators P, with entries involving sines and cosines of latitude at neighbouring grid points.]


Horizontal High order Diffusion …

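• Fast direct solution

$$\psi_{ij} = \sum_{I=1}^{Ni} Z_i^{I}\, \psi_j^{I}; \qquad \phi_{ij} = \sum_{I=1}^{Ni} Z_i^{I}\, \phi_j^{I}$$

with $Z^{I}$ the eigenvectors of the discrete operator in longitude.

• Projection

[Equations not recoverable: projecting the discrete system onto each eigenvector $Z^{I}$ gives, for every $I$, a coupled one-dimensional problem in latitude for $\psi^{I}$ and $\phi^{I}$ with right-hand side $r^{I}$.]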


Horizontal High order Diffusion …

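• Direct solution

For each longitudinal mode $I$, the equations at latitude row $j$ couple rows $j-1$, $j$ and $j+1$ through 2 x 2 blocks built from the coefficients $A^{I}_{j,j-1}$, $A^{I}_{j,j}$ and $A^{I}_{j,j+1}$, with right-hand side $(r_j^{I}, 0)^{T}$. The unknowns $X_j$, $j = 1, \ldots, Nj$, collect $\psi_j^{I}$ and $\phi_j^{I}$.

• Matrix form

$$M X = B$$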


Horizontal High order Diffusion …

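• Block tri-diagonal problem solution

$$M = \begin{pmatrix} D_1 & E_1 & & & \\ F_2 & D_2 & E_2 & & \\ & F_3 & \ddots & \ddots & \\ & & \ddots & D_{Nj-1} & E_{Nj-1} \\ & & & F_{Nj} & D_{Nj} \end{pmatrix} = L\,U$$

with

$$\tilde{D}_1 = D_1; \qquad \tilde{D}_i = D_i - F_i\, \tilde{D}_{i-1}^{-1}\, E_{i-1}, \quad i = 2, \ldots, Nj$$

• Solution

$$L\,Y = B; \qquad U\,X = Y$$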
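The block factorization and the two sweeps can be sketched in a few lines of NumPy. This is a minimal illustration of a block LU (block Thomas) solve without pivoting, assuming well-conditioned diagonal blocks; the function name `block_tridiag_solve` and the array layout are hypothetical and are not taken from the GEM code.

```python
import numpy as np

def block_tridiag_solve(D, E, F, B):
    """Solve M X = B for a block tri-diagonal M by block LU without pivoting.

    D: (n, b, b) diagonal blocks D_1..D_n
    E: (n-1, b, b) super-diagonal blocks E_1..E_{n-1}
    F: (n-1, b, b) sub-diagonal blocks F_2..F_n (F[i-1] sits in block row i)
    B: (n, b) right-hand side, one length-b vector per block row
    """
    n = D.shape[0]
    Dt = np.empty_like(D, dtype=float)  # modified diagonal blocks of U
    Y = np.empty_like(B, dtype=float)
    Dt[0], Y[0] = D[0], B[0]
    # Forward sweep: L Y = B, building the modified diagonal blocks on the way.
    for i in range(1, n):
        # G = F_i * inv(Dt_{i-1}), computed from the transposed linear system.
        G = np.linalg.solve(Dt[i - 1].T, F[i - 1].T).T
        Dt[i] = D[i] - G @ E[i - 1]
        Y[i] = B[i] - G @ Y[i - 1]
    # Backward sweep: U X = Y.
    X = np.empty_like(Y)
    X[-1] = np.linalg.solve(Dt[-1], Y[-1])
    for i in range(n - 2, -1, -1):
        X[i] = np.linalg.solve(Dt[i], Y[i] - E[i] @ X[i + 1])
    return X
```

For the diffusion problem the blocks would be 2 x 2 and n would equal Nj, so each system costs a number of operations proportional to Nj.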


Summary of the algorithm

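• Analysis of the right-hand side (FFT or matrix multiplication, MXM)

$$r_j^{I} = \sum_{i=1}^{Ni} Z_i^{I}\, r_{i,j}$$

• Solution of (Nk*Ni) block tri-diagonal problems

$$M X = B$$

• Synthesis of the solution (FFT or matrix multiplication, MXM)

$$\psi_{i,j} = \sum_{I=1}^{Ni} Z_i^{I}\, \psi_j^{I}$$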
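The three steps can be tried on a small model problem. The sketch below is an illustration under simplifying assumptions, not the GEM implementation: it uses a Cartesian grid with unit spacing, a periodic second difference in the first index (standing in for longitude), zero values beyond both ends in the second index (standing in for latitude), and `c` in place of the product of the diffusion coefficient and the time step. Analysis and synthesis use matrix multiplication with the eigenvectors from `numpy.linalg.eigh`, and each per-mode problem is solved with a dense solver instead of the block tri-diagonal factorization. All function names are hypothetical.

```python
import numpy as np

def second_difference(n, periodic):
    """Three-point second-difference matrix with unit grid spacing."""
    L = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    if periodic:
        L[0, -1] = L[-1, 0] = 1.0
    return L

def solve_del4_implicit(R, c):
    """Solve (I + c * Lap^2) psi = R, where Lap = Lx (x) I + I (x) Ly."""
    ni, nj = R.shape
    Lx = second_difference(ni, periodic=True)   # first index: periodic
    Ly = second_difference(nj, periodic=False)  # second index: bounded
    lam, Z = np.linalg.eigh(Lx)                 # Lx = Z diag(lam) Z^T
    # 1. Analysis: r_hat[I, j] = sum_i Z[i, I] * R[i, j]
    r_hat = Z.T @ R
    # 2. One independent problem along j for every mode I
    eye = np.eye(nj)
    psi_hat = np.empty_like(r_hat)
    for I in range(ni):
        H = lam[I] * eye + Ly                   # Lap restricted to mode I
        psi_hat[I] = np.linalg.solve(eye + c * (H @ H), r_hat[I])
    # 3. Synthesis: psi[i, j] = sum_I Z[i, I] * psi_hat[I, j]
    return Z @ psi_hat

rng = np.random.default_rng(1)
R = rng.normal(size=(8, 6))
psi = solve_del4_implicit(R, c=0.1)
```

The per-mode solves in step 2 are independent of one another, which is the property a parallel version can exploit.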


A Parallel algorithm

• Global transposition (Ni/P,Nj/Q,Nk) → (Nj/Q,Nk/P,Ni)
• Analysis of the right-hand side
• Global transposition (Nj/Q,Nk/P,Ni) → (Nk/P,Ni/Q,Nj)
• Solution of the block tri-diagonal problems
• Global transposition (Nk/P,Ni/Q,Nj) → (Nj/Q,Nk/P,Ni)
• Synthesis of the solution
• Global transposition (Nj/Q,Nk/P,Ni) → (Ni/P,Nj/Q,Nk)
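To make the sequence of layouts concrete, the sketch below computes the largest local array shape held by one process at each stage for a P x Q process grid. It is a bookkeeping illustration only: no data is moved, the helper names are hypothetical, and rounding the block sizes up when a dimension does not divide evenly is an assumption about the decomposition. The 800 x 600 x 58 grid is the 35 km size quoted earlier in the talk, and 58 is assumed here to be Nk.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)

def transposition_layouts(ni, nj, nk, p, q):
    """Largest per-process block at each stage of the parallel algorithm.

    The last dimension of every layout is undivided, so the operation that
    runs along it needs no communication once the data is in that layout.
    """
    horizontal = (ceil_div(ni, p), ceil_div(nj, q), nk)
    along_i = (ceil_div(nj, q), ceil_div(nk, p), ni)
    along_j = (ceil_div(nk, p), ceil_div(ni, q), nj)
    return [
        ("start", horizontal),
        ("analysis", along_i),
        ("block tri-diagonal solve", along_j),
        ("synthesis", along_i),
        ("end", horizontal),
    ]

for stage, shape in transposition_layouts(800, 600, 58, p=2, q=24):
    print(f"{stage:26s} {shape}")
```

In the analysis and synthesis layouts the full Ni extent is local, and in the solve layout the full Nj extent is local, which is what each of those steps operates along.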

35 km mesoglobal run at 72 hr forecast

[Figure panels: U component without diffusion; U component with DEL 6 diffusion.]


Table 1. Breakdown of timings in the major components of the Canadian 35 km mesoglobal operational model for a 72-hour integration on 12 nodes (2 x 24 x 4).

Component    Time (sec)    Percentage
Rhs               14.08          1.48
Adv              247.71         26.01
Prep              14.24          1.49
Nli               33.11          3.48
Sol               71.06          7.46
Bac               13.40          1.41
Phy              435.19         45.70
Hzd               82.86          8.70
vspng             20.37          2.14
output            10.38          1.09
Others             9.91          1.04
Total            952.31        100.00


Table 2. MPI test runs for the 35 km mesoglobal configuration (OpenMP=1); the diffusion is called 964 times.

Setup (P x Q)    Number of PEs    Nodes    Diffusion time (sec)    Relative ideal speedup    Relative speedup
1x16                        16        1                  596.46                         1                   1
2x16                        32        2                  320.46                         2                1.86
2x24                        48        3                  222.34                         3                2.68
4x16                        64        4                  170.12                         4                3.51


Table 3. MPI test runs for the 17 km mesoglobal configuration (OpenMP=1); the diffusion is called 964 times.

Setup (P x Q)    Number of PEs    Nodes    Diffusion time (sec)    Relative ideal speedup    Relative speedup
2x16                        32        2                 1769.48                         1                   1
2x24                        48        3                 1206.01                       1.5                1.47
4x16                        64        4                  915.83                         2                1.93
4x20                        80        5                  764.13                       2.5                2.32
4x24                        96        6                  646.64                         3                2.74
7x16                       112        7                  620.98                       3.5                2.85
8x16                       128        8                  595.77                         4                2.97
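The speedup columns follow directly from the timings: the ideal value is the ratio of PE counts to the smallest run, and the measured value is the ratio of diffusion times. The short sketch below reproduces the 17 km figures and adds a parallel efficiency (measured divided by ideal), which is a derived quantity not shown in the tables; the function name is hypothetical.

```python
def relative_speedup(pes, times):
    """Return (PEs, ideal speedup, measured speedup, efficiency) per run.

    The first entry is the baseline: ideal = PEs / base PEs and
    measured = base time / time.
    """
    base_pes, base_time = pes[0], times[0]
    rows = []
    for n, t in zip(pes, times):
        ideal = n / base_pes
        measured = base_time / t
        rows.append((n, ideal, measured, measured / ideal))
    return rows

# Diffusion timings (sec) of the 17 km mesoglobal MPI runs listed above.
pes_17km = [32, 48, 64, 80, 96, 112, 128]
times_17km = [1769.48, 1206.01, 915.83, 764.13, 646.64, 620.98, 595.77]

for n, ideal, measured, eff in relative_speedup(pes_17km, times_17km):
    print(f"{n:4d} PEs  ideal {ideal:4.2f}  measured {measured:4.2f}  efficiency {eff:6.1%}")
```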


MPI relative speedup

[Figure panels: 35 km mesoglobal, FFT; 17 km mesoglobal, FFT.]


Table 4. OpenMP test runs for the 35 km mesoglobal configuration (1 x 16 x OpenMP) using FFT; the diffusion is called 964 times.

OpenMP    Nodes    Diffusion time (sec)    Relative ideal speedup    Relative speedup
1             1                  596.46                         1                   1
4             4                  186.41                         4                 3.2
8             8                  132.27                         8                4.51


Table 5. OpenMP test runs for the 35 km mesoglobal configuration (1 x 16 x OpenMP) using matrix multiplication; the diffusion is called 1084 times.

OpenMP    Nodes    Diffusion time (sec)    Relative ideal speedup    Relative speedup
1             1                 2129.93                         1                   1
4             4                  588.08                         4                3.62
8             8                  348.44                         8                6.11
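Because the FFT and matrix-multiplication runs call the diffusion a different number of times (964 and 1084), the totals are easier to compare after dividing by the call count. The sketch below does that normalization with the timings from the two OpenMP tables; the per-call cost is a derived quantity that does not appear in the talk, and the dictionary layout and function name are illustrative.

```python
# Diffusion time (sec) by OpenMP setting, with the number of diffusion calls.
runs = {
    "FFT": {"calls": 964, "times": {1: 596.46, 4: 186.41, 8: 132.27}},
    "MXM": {"calls": 1084, "times": {1: 2129.93, 4: 588.08, 8: 348.44}},
}

def time_per_call(method, threads):
    """Average diffusion cost in seconds per call for one configuration."""
    run = runs[method]
    return run["times"][threads] / run["calls"]

for threads in (1, 4, 8):
    fft = time_per_call("FFT", threads)
    mxm = time_per_call("MXM", threads)
    print(f"OpenMP={threads}: FFT {fft:.3f} s/call, MXM {mxm:.3f} s/call, ratio {mxm / fft:.2f}")
```

On these numbers the matrix-multiplication variant scales better with OpenMP but remains slower per call than the FFT variant at every setting shown.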


OpenMP relative speedup

[Figure panels: 35 km mesoglobal, FFT; 35 km mesoglobal, MXM.]


Conclusion

• An efficient implementation of the parallel fast direct solution for the implicit formulation of the horizontal diffusion problem

• Comparison with iterative methods like preconditioned Krylov methods.


Thank You!

Merci!