Implementation of Density Functional Theory based Electronic Structure Codes on Advanced Computing...
-
Upload
marybeth-oliver -
Category
Documents
-
view
217 -
download
3
Transcript of Implementation of Density Functional Theory based Electronic Structure Codes on Advanced Computing...
Implementation of Density Functional Theory based Electronic Structure Codes on Advanced Computing
Architectures
W.A. Shelton, E. Aprá, G.I. Fann and R.J. Harrison
Cray-X1E• 1024 Multi-streaming vector processor (MSP)
– Each MSP has 2 MB of cache and a peak computation rate of 12.8 GF– 4 single-streaming processors (SSPs) form a node with 16 Gbytes of
shared memory– Memory is physically distributed on individual modules– all memory is directly addressable to and accessible by any MSP in the
system through the use of load and store instructions
QuickTime™ and aTIFF (Uncompressed) decompressor
are needed to see this picture.
Cray XT3
• 5294 nodes• 4-processors per node connected through
hypertransport 2.4-GHz AMD Opteron processor and 2 GB of memory
• SDRAM memory controller and function of Northbridge is pulled onto the Opteron die. Memory latency reduced to <60 ns
• Interface off the chip is an open standard (HyperTransport)
Six Network LinksEach >3 GB/s x 2
(7.6 GB/sec Peak for each link)
6.5 GB/secSustained
2.17 GB/secSustained
Applications• Porphyrin Functionalized
Nanotube • 1532 atoms• STO-3G basis set = 6380 basis
functions• Do the covalently attached
porphyrins undergo facile absorption of visible light and transferelectrons to the nanotube
• What type of efficiencies does one obtain
• Time for LDA energy + gradient using 300 processors = less than 1 hour
LSDA &Multiple Scattering Theory (MST)
1
2{( )1 } 1 ( ')2
Im ( ) Tr
Im (
( , '; )
( , '; )
( , '
( )
r )) ;( ) T1
eff G
G
Vm
dn f
d f G
π
ε δ
ε ε μ
μ σ
ε
ε εε
ε
π
+ ∇ − = −
= −
= −
∫
∫
r
m
r r
r r
r r
r
r
r) )) )
)
)
h
%
No
Yes
Mix
in & out
Initial guess
nin(r) , min(r)
Calculate
Veff[n,m]in
Solve Schrodinger
Equation
Recalculate
nout(r) , mout(r)
nin(r) = nout(r) ?
min(r) =mout(r) ?
Calculate
Total Energy
Multiple Scattering Theory (MST)J. Korringa, Physica 13, 392, (1947)W. Kohn, N. Rostoker, PR, 94, 1111,(1954)
MST Green function methodsB. Gyorffy, and M. J. Stott, “Band Structure
Spectroscopy of Metals and Alloys”, Ed. D.J. Fabian and L. M. Watson (Academic 1972)
S.J. Faulkner and G.M. Stocks, PR B 21, 3222, (1980)
Algorithm Design for future generation architectures
• More accurate• Spectral or pseudo-spectral accuracy
• Wider range of applicability
• Sparse representation• Memory requirements grow linearly
• Each processor can treat thousands of atoms• Make use of large number of processors
• Message-Passing• Each atom/node local message-passing is independent of the size of the system
• Time consuming step of model• Sparse linear solver
•Direct or preconditioned iterative approach
Multiple Scattering Theory
nR
n′R
nr
nr
r
′r
n
n′
decay slowly with increasing distancecontain free-electron singularities
Generalization of t-matrix. Converts incoming wave at site n into outgoing wave at site m in the presence of all the other sites
€
M(ε) =
t0−1 −G01(ε) −G02 (ε) −G03 (ε)
−G10 (ε) t1−1 −G12 (ε) −G1m(ε)
−G20 (ε) −G21(ε) t2−1 −G2m(ε)
−G30 (ε) −G31(ε) −G32 (ε) tm−1
⎡
⎣
⎢ ⎢ ⎢ ⎢
⎤
⎦
⎥ ⎥ ⎥ ⎥
Multiple scattering theory• Green function
• Scattering path matrix
€
τnm(ε) =tn(ε)δnm+ tnn≠m
∑ (ε)G( r Rn −
r Rm)τ
nm(ε)
τnm(ε) =(Mnm(ε))−1
ML ′ Lnm(ε) =tL ′ L
−1δL ′ L δnm−GL ′ L ( r Rn −
r Rm)
€
Gnm( r r, r′ r ;ε) = ZL
n
L ′ L
∑ ( r rn;ε)τL ′ L
nm(ε)Z ′ Lm(
r′ rm;ε)−ZL
n( r r<;ε)J ′ L
m( r′ r>;ε)δL ′ L
€
Gnm(ε) =−14π
eiε 1/ 2
r Rn−
r Rm
r Rn −
r Rm
Complex Energy Plane
The scattering properties at complex energy can be used to develop highly efficient real-space and k-space methods
Real ε
Im ε
εf
Semi-conductors and insulators could work well since they have no states at εf
Real ε
Im ε
εf
Scattering is local since there are no states near the bottom of the energy contourScattering is local since a large Im ε is equivalent to rising temperature which smears out the states
Near εf scattering is non-local (metal)
εf is the highest occupied electronic state in energy
Multiple Scattering Theory
€
τ−1L ′ L (ε) =t
L , ′ L ,0
−1 (ε) - GL ′ L(R
ij,ε)
Depends on constituent V(r)Independent of lattice
Depends on ε and Underlying lattice structure
Representation is ideal for disorder systems since t-matrixis site diagonal !!Coherent Potential Approximation (CPA)Non-local CPA (based Dynamical Cluster Approximation toDynamical mean-field theory
t-matrix• Solve for t-matrix inside
Voroni polyhedron– Using Calergo method
€
R l (r) =jl (κr)−κ s∫ 2V(s)R l (s) nl (κs) jl (κr)−jl (κs)nl (κr)[ ]ds
R l (r) =cl (r) jl (κr)−sl nl (κr)dcl (r)
dr=−κr2V(r)nl (κr) cl (r) jl (κr)−sl nl (κr)[ ]
dsl (r)dr
=−κr2V(r) jl (κr) cl (r) jl (κr)−sl nl (κr)[ ]
Equations solved for both regular and irregular solutionsSince each atom and is independent can solve in shared memory. Similar for the structure constants.
€
l
Parallel Implementation
• Green’s function
• Scattering path matrix: real space
t : scattering from single site
G: structure constant matrix
M=[t-1()-G(Rnm,]
τ=M-1
• Once M is fixed increasing N does not
affect the local calculation of M-1
€
Gnm( r r, r′ r ;ε) = ZL
n
L ′ L
∑ ( r rn;ε)τL ′ L
nm(ε)Z ′ Lm(
r′ rm;ε)−ZL
n( r r<;ε)J ′ L
m( r′ r>;ε)δL ′ L
Tight-Binding MST Representation
Tight Binding Multiple Scattering Theory
Embed a constant repulsive potential
Shifts the energy zero allowing for calculations at negative energy
Rapidly decaying interactions
Free electron singularities are not a problem
Sparse representation
€
G ,nms(ε)∝e−(Vs−ε)1/ 2 |
r R n−
r R m|
€
Gnm(ε)∝e−ε1/ 2
r R n−
r R m
r Rn −
r Rm
€
G =G0 +G0tG0 +G0tG0tG0 =G0 (I −tG0 )−1
G−1 =G0
−1 −tG r =G0 +G0t
rG0 +G0trG0t
rG0 =G0 (I −trG0 )−1
(G r )−1 =G0
−1 −tr → G0
−1 =(G r )−1 +tr
G−1 =(G r )−1 +tr −t → (G r )−1 −Δt
}
2 Ryd.
0Vr
Constant inside a sphere
€
r R ≤
r Rmt
€
r R >
r Rmt
Screened Structure Constants
€
Gs (ε) =[I −tsGfree(ε)]−1
€
G( r k,ε) = Gm,s (ε)ei
r k•
r Rm
m
∑
• Screened Structure Constants Gs
on the left unscreened on the right– Screened structure constants
rapidly go to zero, whereas the free space structure constants have hardly changed
• Linear solve using m atom cluster that is less than the n atom system
• Easy to perform Fourier transform–K-space method
Screened MST Methods
• Formulation produces a sparse matrix representation– 2-D case has tridiagonal structure with a few distant elements due to
periodicity– 3-D case has scattered elements
• Mainly due to mapping 3-D structure to a matrix (2-D)• A few elements due to periodic boundary conditions
• Require block diagonals of the inverse of τε matrix– Block diagonals represent the site τε matrix and are needed to
calculate the Green’s function for each atomic site• Sparse direct and preconditioned iterative methods are used to calculate
τii(ε)– SuperLU– Transpose free Quasi-Minimal Residual Method (TFQMR)
Screened KKR Accuracy
fcc Cubcc Cubcc Mohcp Co
Timing and Scaling of Scr-K KKR-CPA
Conclusion
• Initial benchmarking of the Screened KKR method– SuperLU N1.8 for finding the inverse of the upper left block of τ– TFQMR with block Jacobi preconditioner N1.06 for finding the inverse of
the upper left block of τ• Extremely high sparsity (97%-99% zeros increases with increasing
system size)• Large number of atoms on a single processor• Real-space/Scr-KKR hybrid may provide the most efficient parallel
approach for new generation architectres• Single code contains
– LSMS, KKR-CPA, Scr-LSMS and Scr-KKR-CPA
Local Poisson EquationDiscontinuous Galerkin approach
€
σh ⋅τK∫ dx=- uh∇⋅τ
K∫ dx+ ˆ uKK
∫ ⋅nKτ ,ds ∀τ ∈ ( )K∑
σh ⋅∇νK∫ dx=- fνK∫ dx+ ˆ σ KK∫ ⋅nKν ,ds ∀ν ∈ P(K )
€
−Δu=f inΩ
€
∇u=σ∇⋅σ =f
€
τ =tensor basis in d-dimension
ν =basis
uh =is ,the representation of u on an element
same forσh
√ uh =