1
Performance analysis tools applied to a finite adaptive mesh free boundary seepage parallel algorithm
S. Boeriu (1) and J.C. Bruch, Jr. (2)
(1) Center for Computational Science and Engineering
(2) Department of Mechanical and Environmental Engineering and Department of Mathematics
University of California, Santa Barbara
http://www.engineering.ucsb.edu/~hpscicom
2
Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant #0086262. This research was supported in part by NSF cooperative agreement ACI-9619020 through computing resources provided by the National Partnership for Advanced Computational Infrastructure at the San Diego Supercomputer Center.
http://www.npaci.edu/Horizon/guide_linked/bh_tools_txt.html
3
Outline of Presentation
1. Introduction (Physical problem)
2. Problem formulation
3. Fixed domain formulation
4. Numerical algorithm
5. Test case
6. Performance tools and considerations
   a. VAMPIR
   b. PARAVER
7. Diagnostic example
8. Conclusions
4
Figure 1. Seepage through a rectangular dam.
Physical problem
5
Simplifying assumptions
i. The soil in the flowfield is homogeneous and isotropic
ii. Capillary and evaporation effects are neglected
iii. The flow obeys Darcy’s Law
iv. The flow is two-dimensional
v. The flow is steady state
6
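Mathematical formulation
Darcy’s Law: $q = -K \,\mathrm{grad}\, h = -K \,\mathrm{grad}\,(p/(\rho g) + y)$
Potential Function: $\phi = -K \,(p/(\rho g) + y)$
Velocity Components: $u = \phi_x, \quad v = \phi_y$
Continuity Equation: $u_x + v_y = 0$
Irrotationality Condition: $u_y - v_x = 0$
Cauchy-Riemann Equations: $\phi_x = \psi_y, \quad \phi_y = -\psi_x$
Laplace’s Equations: $\nabla^2 \phi = 0, \quad \nabla^2 \psi = 0$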
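The step from the continuity and irrotationality conditions to Laplace's equations can be written out as a short worked derivation. It assumes the sign conventions $u = \phi_x$, $v = \phi_y$ and $\phi_x = \psi_y$, $\phi_y = -\psi_x$; these signs are an assumption, chosen to agree with the standard velocity-potential form of Darcy's Law.

```latex
% Continuity with u = \phi_x, v = \phi_y gives Laplace's equation for \phi:
\begin{align*}
  u_x + v_y &= 0
  &&\Longrightarrow&
  \phi_{xx} + \phi_{yy} &= 0 .
\end{align*}
% The Cauchy-Riemann equations give u = \psi_y and v = -\psi_x, so
% irrotationality gives Laplace's equation for \psi:
\begin{align*}
  u_y - v_x &= 0
  &&\Longrightarrow&
  \psi_{yy} + \psi_{xx} &= 0 .
\end{align*}
```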
7
Figure 2. Mathematical formulation of physical problem.
Problem formulation
8
Extension of solution domain
The solution domain $\Omega$ is extended to the known region $D$.
Then $\phi$ is extended continuously to be defined on $\bar{D}$ by setting
' ' '
1 1( , ) : 0 ,0 ( ), ,0 FF F C
D x y x x y x x x x y y
D
$\tilde{\phi}(x, y) = \phi(x, y)$ in $\bar{\Omega}$, $\quad \tilde{\phi}(x, y) = y$ in $\bar{D} - \bar{\Omega}$
9
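This yields
$\Delta \tilde{\phi} = \dfrac{\partial}{\partial y} \chi_{D - \bar{\Omega}}$ in $D$
in the sense of distributions, where
$\chi_{D - \bar{\Omega}} = 1$ in $D - \bar{\Omega}$ and $\chi_{D - \bar{\Omega}} = 0$ in $\Omega$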
10
Fixed domain formulation
Figure 3. Fixed domain mathematical formulation.
11
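Numerical Algorithm
A minimization problem can be formulated in terms of the functional
$J(\omega) = a(\omega, \omega) - 2\,(f, \omega), \quad \omega \in K,$
where $a$ is a bilinear form, continuous, symmetric, positive definite on $\mathbb{R}^N$ and $f \in \mathbb{R}^N$, i.e.,
$a(\omega, \zeta) = \iint_D \nabla \omega \cdot \nabla \zeta \; dx\,dy$
$(f, \omega) = \iint_D f\,\omega \; dx\,dy$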
12
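The functional $J$ has one and only one minimum on a closed convex set. The minimum is found using the following algorithm:
$\omega_i^{(n+1/2)} = \dfrac{1}{a_{ii}} \left( f_i - \sum_{j=1}^{i-1} a_{ij}\, \omega_j^{(n+1)} - \sum_{j=i+1}^{N} a_{ij}\, \omega_j^{(n)} \right)$
$\omega_i^{(n+1)} = P_i\!\left( \omega_i^{(n)} + \lambda \left( \omega_i^{(n+1/2)} - \omega_i^{(n)} \right) \right) = \max\!\left( 0,\; \omega_i^{(n)} + \lambda \left( \omega_i^{(n+1/2)} - \omega_i^{(n)} \right) \right)$
where $a_{ij} = a(N_i, N_j)$, $f_i = (f, N_i)$, $N_i$ is the canonical basis of $\mathbb{R}^N$, $P_i$ is the projection on the convex set, $i = 1, \ldots, N$, $\lambda$ is the relaxation parameter, and $N$ is the number of nodes.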
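The iteration described on this slide (a Gauss-Seidel half step, relaxation, then projection with max(0, ·)) can be sketched in a few lines of serial Python. This is a minimal illustration under stated assumptions, not the authors' parallel code: the function name `projected_sor`, the dense list-of-lists matrix, the default `relax` and `tol` values, and the stopping test on the largest nodal change are all hypothetical choices made for the example.

```python
def projected_sor(a, f, relax=1.85, tol=1e-4, max_iter=10_000):
    """Minimize J(w) = a(w, w) - 2 (f, w) over w >= 0 by projected SOR.

    a: symmetric positive definite matrix as a list of rows
    f: right-hand side vector
    Returns (nodal values, number of sweeps used).
    """
    n = len(f)
    w = [0.0] * n
    for sweep in range(1, max_iter + 1):
        largest_change = 0.0
        for i in range(n):
            # Half step: w is updated in place, so entries j < i are already
            # at level n+1 and entries j > i are still at level n.
            s = sum(a[i][j] * w[j] for j in range(n) if j != i)
            half = (f[i] - s) / a[i][i]
            # Relaxation followed by projection onto the convex set w_i >= 0.
            new = max(0.0, w[i] + relax * (half - w[i]))
            largest_change = max(largest_change, abs(new - w[i]))
            w[i] = new
        # Assumed stopping test: largest nodal change in one sweep.
        if largest_change < tol:
            return w, sweep
    raise RuntimeError("projected SOR did not converge")


# Small made-up example: the unconstrained minimum has negative entries,
# so the projection is active at the middle node.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
F = [1.0, -3.0, 1.0]
solution, sweeps = projected_sor(A, F)
print(solution, sweeps)
```

For this example the constrained minimum is (0.5, 0, 0.5), which satisfies the complementarity conditions of the minimization problem: the middle node sits on the constraint and the other two satisfy the linear equations exactly.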
13
Finite Element Error Analysis
Adaptive Mesh Finite Element Analysis (FEA)
General Equation for FEA:
$K u = f$
14
Error Analysis
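Error Definition: $e_q = q - \hat{q}$
where $\hat{q}$, the approximation of the exact solution $q$, is the calculated $q$ of an element (constant);
$N$ is the shape function;
and $S = \left[ \partial/\partial x, \; \partial/\partial y \right]^T$ is the gradient operator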
15
Averaging Technique: $q^* = N \bar{q}^*$
Error Estimate in an Element: $e_q \approx \hat{e}_q = q^* - \hat{q}$
16
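Error Norm of the Whole Computation Domain:
$\|\hat{e}_q\|_{L_2} = \left( \int_R \hat{e}_q^T \, \hat{e}_q \; dR \right)^{1/2}, \quad \|\hat{e}_q\|_{L_2}^2 = \sum_{i=1}^{N_e} \|\hat{e}_q\|_i^2$
Percentage Error:
$\eta = \dfrac{\|\hat{e}_q\|_{L_2}}{\|q\|_{L_2}} \times 100\%$, where $\|q\|_{L_2}^2 = \int_R q^T q \; dR = \sum_{i=1}^{N_e} \|q\|_i^2$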
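As an illustration of how the element error norms and the percentage error fit together, the sketch below evaluates them for a toy mesh. It is only a sketch under assumptions that are not in the slides: each element carries a constant computed gradient `q_hat`, the averaged gradient `q_star` is sampled once per element (one-point quadrature), and the unknown exact gradient in the denominator is replaced by the computed one. The function names and the sample numbers are hypothetical.

```python
import math


def element_error_norms(areas, q_hat, q_star):
    """Per-element L2 norm of the estimated error q_star - q_hat.

    areas:  element areas
    q_hat:  (qx, qy) computed gradient per element (constant on the element)
    q_star: (qx, qy) averaged gradient sampled once per element
    """
    norms = []
    for area, (gx, gy), (sx, sy) in zip(areas, q_hat, q_star):
        norms.append(math.sqrt(area * ((sx - gx) ** 2 + (sy - gy) ** 2)))
    return norms


def percentage_error(areas, q_hat, q_star):
    """Global error norm divided by the gradient norm, as a percentage."""
    err_sq = sum(e * e for e in element_error_norms(areas, q_hat, q_star))
    # Assumption: the computed gradient stands in for the exact one here.
    q_sq = sum(a * (gx * gx + gy * gy) for a, (gx, gy) in zip(areas, q_hat))
    return 100.0 * math.sqrt(err_sq / q_sq)


# Two made-up elements with unit total area.
areas = [0.5, 0.5]
q_hat = [(1.0, 0.0), (1.0, 0.0)]
q_star = [(1.1, 0.0), (0.9, 0.0)]
eta = percentage_error(areas, q_hat, q_star)
print(f"percentage error: {eta:.2f}%")
```

In this toy case each element contributes 0.5 × 0.1² to the squared error norm, so the global error norm is 0.1 against a gradient norm of 1, giving a percentage error of 10%.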
17
Local Mesh Refinement
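Desired Criteria: $\eta \le \eta_{\max}$, where $\eta_{\max}$ is the desired error
Desired Local Error Criteria: $\|\hat{e}_q\|_i \le \bar{e}_{\max}, \quad \bar{e}_{\max} = \eta_{\max} \left( \dfrac{\|q\|^2}{N_e} \right)^{1/2}$, where $\bar{e}_{\max}$ is the allowable element error
Error Ratio: $\xi_i = \dfrac{\|\hat{e}_q\|_i}{\bar{e}_{\max}}$; if $\xi_i > 1$, refine the element
New Element Size: $(A_i)_{\mathrm{new}} = \dfrac{A_i}{\xi_i}$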
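The refinement rule can be sketched as a small function that flags elements and proposes a new size for each. The sketch assumes the allowable element error is the desired error times the gradient norm divided by the square root of the number of elements, that the error ratio is the element error over that allowance, and that the new area is the old area divided by the ratio. It also assumes `eta_max` is given as a fraction (0.05 for 5%). The function name and the sample values are hypothetical.

```python
import math


def refinement_plan(element_errors, q_norm, areas, eta_max):
    """Return {element index: new area} for elements whose error ratio exceeds 1.

    element_errors: per-element error norms
    q_norm:         global norm of the gradient
    areas:          current element areas
    eta_max:        desired relative error as a fraction (assumption)
    """
    n_elem = len(element_errors)
    allowable = eta_max * q_norm / math.sqrt(n_elem)
    plan = {}
    for i, (err, area) in enumerate(zip(element_errors, areas)):
        ratio = err / allowable
        if ratio > 1.0:
            # Shrink the element in proportion to how far it exceeds the allowance.
            plan[i] = area / ratio
    return plan


# Made-up data: the second element exceeds the allowable error.
plan = refinement_plan(
    element_errors=[0.02, 0.2], q_norm=2.0, areas=[1.0, 1.0], eta_max=0.05
)
print(plan)
```

Only elements with a ratio above 1 appear in the plan, so elements that already meet the local criterion keep their current size.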
18
Mesh Refinement
19
Test case
$x_1 = 40$
$y_1 = 10$
$y_2 = 3$
Relaxation parameter: 1.85
Stopping error criterion: 0.0001
20
Results
21
22
Figure 4. Domain decomposition for Pass 4 of Case 1.
23
Figure 5. Speedup for Case 1.
24
Performance tools and considerations
The parallel program is monitored while
it is executed. Monitoring produces
performance data that is interpreted in
order to reveal areas of poor performance.
The program is then altered and the
process is repeated until an acceptable
level of performance is reached.
25
VAMPIR (Visualization and Analysis of MPI Resources – 2.0)
VAMPIR 2.0 is a post-mortem trace visualization tool from Pallas GmbH
http://www.pallas.com
It uses the profiling extensions to MPI and permits analysis of the message events in which data is transmitted between processors during execution of a parallel program. It has a convenient user interface and excellent zooming and filtering capabilities. Global displays show all selected processes.
26
• Global Timeline: detailed application execution over time axis
• Activity Chart: presents per-process profiling information
• Summaric Chart: aggregated profiling information
• Communication Statistics: message statistics for each process pair
• Global Communication Statistics: collective operations statistics
• I/O Statistics: MPI I/O operation statistics
• Calling Tree: global dynamic calling tree
27
28
29
30
31
32
33
34
35
36
PARAVER (Parallel Program Visualization and Analysis Tool)
PARAVER is a flexible parallel program visualization and analysis tool based on an easy-to-use Motif GUI (graphical user interface).
PARAVER was developed to respond to the basic need to have a qualitative perception of the application behavior by visual inspection and then to be able to focus on the detailed quantitative analysis of the problems.
37
Paraver (Parallel Program Visualization and Analysis Tool)
Powerful, flexible parallel program visualization tool based on an easy-to-use Motif GUI (graphical user interface)
Developed by: European Center for Parallelism of Barcelona (CEPBA), Universitat Politecnica de Catalunya
http://www.cepba.upc.es/
38
Paraver is designed to visualize and analyze
- Communication and load balance
- Combining OpenMP and MPI
- Hardware performance and counters
Usage
- Compile programs with special libraries
- Run programs to produce trace files
- View and analyze traces
- Designed to help in program understanding and optimization
39
40
41
42
43
44
45
46
47
Inefficient programming example
Load imbalance (inefficient memory use)
TLB (translation lookaside buffer) misses
48
Figure 6. Stage 1 – Processor 0 – Mesh Map
49
Figure 7. Stage 1 – Processor 3 – Mesh Map
50
Figure 8. Stage 1 – VAMPIR – Activity Chart
51
Figure 9. Stage 1 - PARAVER – Global Display
52
Figure 10. Stage 4 - VAMPIR – Activity Chart
53
Figure 11. Stage 4 - VAMPIR – Display Chart
54
Table 8. TLB misses.
Stage    TLB misses, Proc. 0    TLB misses, Proc. 3
1        9,464                  7,870
4        12,210                 208,341
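The counts in Table 8 are easier to compare as ratios. The short sketch below uses only the four numbers from the table; the variable names are arbitrary. It shows that the two processors are comparable in stage 1, while in stage 4 processor 3 records roughly 17 times as many TLB misses as processor 0.

```python
# TLB miss counts copied from Table 8, keyed by stage and then by processor.
tlb_misses = {
    1: {0: 9_464, 3: 7_870},
    4: {0: 12_210, 3: 208_341},
}

# Ratio of processor 3 to processor 0 within each stage.
across_processors = {
    stage: counts[3] / counts[0] for stage, counts in tlb_misses.items()
}

# Growth from stage 1 to stage 4 on each processor.
growth = {proc: tlb_misses[4][proc] / tlb_misses[1][proc] for proc in (0, 3)}

for stage, ratio in across_processors.items():
    print(f"stage {stage}: proc 3 / proc 0 = {ratio:.2f}")
for proc, ratio in growth.items():
    print(f"proc {proc}: stage 4 / stage 1 = {ratio:.2f}")
```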
55
Figure 12. Stage 4 – Processor 0 – Mesh Map
56
Figure 13. Stage 4 – Processor 3 – Mesh Map
57
Table 9. Stage 4 timing of the SOR module.
Processor    Time spent in SOR
0            0.3671
1            0.4068
2            0.6940
3            0.8393
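One way to summarize the timings in Table 9 is an imbalance factor, taken here as the slowest processor's time divided by the mean time (1.0 would mean a perfectly even distribution). That definition is an assumption made for this sketch, not a metric reported in the slides; the numbers are copied from the table.

```python
# Time spent in the SOR module per processor, copied from Table 9
# (the slide does not state the unit).
sor_time = {0: 0.3671, 1: 0.4068, 2: 0.6940, 3: 0.8393}

mean_time = sum(sor_time.values()) / len(sor_time)
slowest = max(sor_time, key=sor_time.get)
fastest = min(sor_time, key=sor_time.get)

# Imbalance factor: slowest time relative to the mean.
imbalance = sor_time[slowest] / mean_time
# Spread: slowest time relative to the fastest.
spread = sor_time[slowest] / sor_time[fastest]

print(f"mean time: {mean_time:.4f}")
print(f"slowest processor: {slowest}, imbalance factor: {imbalance:.2f}")
print(f"slowest / fastest: {spread:.2f}")
```

With these values processor 3 spends about 1.46 times the mean and more than twice as long in SOR as processor 0.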
58
Figure 14. Stage 4 – VAMPIR – Activity Chart
59
Figure 15. Stage 4 – VAMPIR – Display Chart
60
Figure 16. Stage 4 – PARAVER – Global Display
61
Conclusions
A significant factor that affects the performance of a parallel application is the balance between communication and workload. The challenge of the message passing model is in reducing message traffic over the interconnection network. To fully understand the
performance behavior of such applications, analysis and
visualization tools are needed. Two such tools, VAMPIR
and PARAVER, were used to analyze the performance of
the seepage application. It was seen that optimization of
the parallel code can be carried out in an iterative process
involving these tools to investigate performance issues.
62
Web Sites
Project site: http://www.engineering.ucsb.edu/~hpscicom
San Diego Supercomputer Center: http://www.npaci.edu/Horizon/guide_linked/bh_tools_txt.html
VAMPIR: http://www.pallas.com
PARAVER: http://www.cepba.upc.es/