
Final Presentation

Project: Dirac Equation in (1+1)-D on a staggered grid with perfectly matched boundary

High Performance Computing 1 Seminar

Winter semester 2016/17

Short Recapitulation

● Solution of the Dirac Equation on a staggered grid for spinors u and v to avoid the Fermion-doubling problem, see e.g. [Gattringer/Lang: Quantum Chromodynamics on the Lattice, Springer 2010, p. 110f]

Perfectly Matched Layer Method

● Wave solutions are unphysically reflected at the boundaries

● Perfectly Matched Boundary Method absorbs wave functions as they approach the boundary

● Method: Analytic continuation of the spatial coordinate into the complex plane:

● Exponential decay of wave function:
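
The formulas for these two bullets are not preserved in this transcript. As a hedged reminder, the textbook form of the PML coordinate stretching is sketched below for a plane wave with the convention exp(i(kx - ωt)); the absorption profile σ(x) ≥ 0, the interface position x_0, and the notation are assumptions and may differ from what the slide showed.

```latex
% Complex continuation of the spatial coordinate inside the layer (x > x_0)
\tilde{x}(x) = x + \frac{i}{\omega} \int_{x_0}^{x} \sigma(x')\, dx' ,
\qquad
\frac{\partial}{\partial x} \;\longrightarrow\;
\frac{1}{1 + i\,\sigma(x)/\omega}\, \frac{\partial}{\partial x}

% A right-moving plane wave then acquires an exponentially decaying envelope
e^{i k \tilde{x}} = e^{i k x}\,
\exp\!\left( -\frac{k}{\omega} \int_{x_0}^{x} \sigma(x')\, dx' \right)
```

For σ = 0 the stretched coordinate reduces to x, so the solution in the interior of the grid is unchanged.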

Dirac Equation with PML

Discretized Dirac Equation on the staggered grid

Numerical Solution for initial Gaussian wave packet

Algorithm

● 4 equations for 2 spinors u, v, and 2 auxiliary fields psi_u, psi_v

● Corresponds to 4 for-loops (temporal integration) over the 4 quadratures

● Temporal quadrature for u and psi_u as well as v and psi_v can be combined: loop merging
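
The loop merging described above can be illustrated with a minimal Python sketch. The update rules below are placeholders chosen only to have the right dependency pattern (psi_u at index i needs the freshly updated u at the same index i); they are not the discretized Dirac equation from the slides, and the names `step_separate`, `step_merged`, and `sigma` are assumptions.

```python
import numpy as np

def step_separate(u, psi_u, v, sigma, dt, dx):
    """Two loops: first update all u values, then all psi_u values."""
    u_new, psi_new = u.copy(), psi_u.copy()
    for i in range(1, len(u) - 1):
        # placeholder rule: u depends on old v and old psi_u
        u_new[i] = u[i] - dt / dx * (v[i] - v[i - 1]) + dt * psi_u[i]
    for i in range(1, len(u) - 1):
        # placeholder rule: psi_u depends on the new u at the same index
        psi_new[i] = psi_u[i] - dt * sigma[i] * (psi_u[i] + u_new[i])
    return u_new, psi_new

def step_merged(u, psi_u, v, sigma, dt, dx):
    """One loop: both updates per index, valid because psi_new[i]
    only needs u_new[i], which was computed in the same iteration."""
    u_new, psi_new = u.copy(), psi_u.copy()
    for i in range(1, len(u) - 1):
        u_new[i] = u[i] - dt / dx * (v[i] - v[i - 1]) + dt * psi_u[i]
        psi_new[i] = psi_u[i] - dt * sigma[i] * (psi_u[i] + u_new[i])
    return u_new, psi_new

# Small random test fields (complex, like spinor components)
rng = np.random.default_rng(0)
n_x = 16
u = rng.normal(size=n_x) + 1j * rng.normal(size=n_x)
v = rng.normal(size=n_x) + 1j * rng.normal(size=n_x)
psi_u = np.zeros(n_x, dtype=complex)
sigma = np.linspace(0.0, 1.0, n_x)

sep = step_separate(u, psi_u, v, sigma, dt=0.01, dx=0.1)
mer = step_merged(u, psi_u, v, sigma, dt=0.01, dx=0.1)
same = np.allclose(sep[0], mer[0]) and np.allclose(sep[1], mer[1])
print("merged loop reproduces separate loops:", same)
```

Merging would not be valid if the psi_u update needed neighbouring values of the new u (for example index i + 1), since those would not yet be computed inside the merged loop.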

Resolving Dependencies

[Figure: dependencies in the first for-loop and in the second for-loop]

Code optimization

● Version 1.0 original pure Python code

● Version 1.1 loop-merging

● Version 2.0 transfer of quadrature to Fortran subroutine (Python-Fortran wrapper code)

● Version 2.1 Implementation of OpenMP

[Figure: git version history]

Loop merging

Interfacing Python with Fortran subroutine

● Program control (parameters such as grid size, initial conditions, boundary conditions) with Python

● Quadrature is transferred to a Fortran subroutine called from Python (f2py generates a shared object file, which can be imported like a Python module)

● Result is returned to the Python main script for postprocessing (plotting, movie, ...)

● Extreme speedup by transfer of the quadrature to Fortran

Compilation of Fortran source code with f2py:

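#!/usr/bin/env python
import os

# Remove a stale shared object so the module is rebuilt (-f: no error if absent).
cmd0 = "rm -f solve_DiracEqn_on_the_lattice.so"
# Compile the Fortran source into a Python extension module with OpenMP enabled.
cmd1 = ("f2py -c --fcompiler=gnu95 --f90flags='-fopenmp' -lgomp -m "
        "solve_DiracEqn_on_the_lattice solve_DiracEqn_on_the_lattice.f95")
# Time a run of the main script that calls the compiled module.
cmd2 = "time ./DiracEqn1p1D_fortran_interface.py"

os.system(cmd0)
os.system(cmd1)
os.system(cmd2)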

Calling the Fortran subroutine from Python
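
The call itself is not preserved in this transcript, so the following is only a hedged sketch of the surrounding steps: importing the compiled module and preparing arrays. Only the module name `solve_DiracEqn_on_the_lattice` comes from the compile command; the helper names `load_solver` and `prepare_field` and the field length of 512 are assumptions, and the subroutine name and its argument list are deliberately left out because they are not given in the source.

```python
import importlib
import numpy as np

def load_solver(module_name="solve_DiracEqn_on_the_lattice"):
    """Return the compiled f2py module, or None if it has not been built."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None

def prepare_field(n_x):
    """Allocate a complex field; complex128 matches Fortran complex(kind=8)."""
    return np.zeros(n_x, dtype=np.complex128, order="F")

solver = load_solver()
u = prepare_field(512)
v = prepare_field(512)

if solver is None:
    print("module not built; run the f2py compile step first")
else:
    # f2py writes a docstring listing the wrapped routines and their
    # signatures, which is the quickest way to check how to call them.
    print(solver.__doc__)
```

Passing arrays whose dtype and memory layout already match the Fortran declaration avoids a hidden copy on every call, which matters when the subroutine is called many times.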

Computation time of pure Python code vs Python-Fortran interface code

Code Parallelization with OpenMP

● Implementation of OpenMP in Fortran

● Only the quadrature (2 temporal do-loops) is parallelized

● Spatial grid size N_x = 512 chosen to be a multiple of 1, 2, 4, 8, 16 to allow for equal chunk sizes of spatial sections (figure on the right)

● Computation time measured only for this Fortran subroutine, not including the serial part

● Code run on a 12-core machine (143.50.47.128) 10 times for each setting of grid size and number of cores; the minimum time is taken for determining the speedup
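
The measurement procedure above (repeat each setting, keep the minimum, derive the speedup) can be sketched as follows. The helper names `min_runtime` and `speedup_table` are assumptions, and the workload timed here is a trivial stand-in, not the Fortran subroutine.

```python
import time

def min_runtime(func, repeats=10):
    """Run func `repeats` times and return the smallest wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best

def speedup_table(min_times):
    """Map thread count -> (speedup, parallel efficiency).

    min_times maps the number of threads to the minimum runtime in
    seconds and must contain an entry for 1 thread as the baseline.
    """
    t1 = min_times[1]
    return {p: (t1 / t, t1 / (t * p)) for p, t in sorted(min_times.items())}

# Stand-in workload, used only to show how min_runtime is called
t_dummy = min_runtime(lambda: sum(range(10000)))
print("minimum of 10 runs: %.6f s" % t_dummy)
```

Taking the minimum rather than the mean suppresses disturbances from other processes on the machine, which only ever make a run slower.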

Scheduled static do loops

● Spinors and auxiliary fields are shared between threads, only loop index is private

● Threads process chunks of size (N_x – 2)/num_cores

● Forcing ordered sequence not strictly needed

● omp do ordered (commented out in the final version) does not lead to a speedup, while omp do schedule(static) does
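
To make the chunk size (N_x – 2)/num_cores concrete, the sketch below splits the interior grid indices into one contiguous chunk per thread. The function name `static_chunks` is an assumption, and the way the remainder is spread over the first threads is only one plausible distribution: the OpenMP specification requires chunks of approximately equal size for schedule(static) but leaves the exact assignment to the implementation.

```python
def static_chunks(n_x, num_threads):
    """Split interior indices 1 .. n_x-2 into one contiguous chunk per thread.

    Returns half-open (start, stop) index pairs, one per thread.
    """
    n_iter = n_x - 2                      # boundary points are excluded
    base, extra = divmod(n_iter, num_threads)
    chunks = []
    start = 1
    for t in range(num_threads):
        size = base + (1 if t < extra else 0)  # first `extra` threads get one more
        chunks.append((start, start + size))
        start += size
    return chunks

for p in (1, 2, 4, 8, 16):
    sizes = [stop - start for start, stop in static_chunks(512, p)]
    print(p, "threads: chunk sizes between", min(sizes), "and", max(sizes))
```

With N_x = 512 there are 510 interior points, so the chunks differ by at most one grid point whenever 510 is not divisible by the number of threads.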

Computation time vs. number of threads

Comparison of original pure Python code with OpenMP-Fortran version
