High-Performance Simulations of Coherent Synchrotron ...

High-Performance Simulations of Coherent Synchrotron Radiation on Multicore GPU and CPU Platforms Balša Terzić, PhD Department of Physics, Old Dominion University Center for Accelerator Studies (CAS), Old Dominion University 2015 IPAC, Richmond, 4 May 2015 May 4, 2015 CSR Simulations on Multicore Platforms 1

Transcript of High-Performance Simulations of Coherent Synchrotron ...

Page 1: High-Performance Simulations of Coherent Synchrotron ...

High-Performance Simulations of Coherent Synchrotron Radiation on Multicore GPU and CPU Platforms
Balša Terzić, PhD
Department of Physics, Old Dominion University
Center for Accelerator Studies (CAS), Old Dominion University
2015 IPAC, Richmond, 4 May 2015

Page 2: High-Performance Simulations of Coherent Synchrotron ...

Collaborators
Center for Accelerator Science (CAS) at Old Dominion University (ODU):
Professors:
• Physics: Alexander Godunov
• Computer Science: Mohammad Zubair, Desh Ranjan
PhD student:
• Computer Science: Kamesh Arumugam
Early advances on this project benefited from my collaboration with Rui Li (Jefferson Lab)

Page 3: High-Performance Simulations of Coherent Synchrotron ...

Outline
• Coherent Synchrotron Radiation (CSR)
• Physical problem
• Computational challenges
• New 2D Particle-In-Cell CSR Code
• Outline of the new algorithm
• Parallel implementation on CPU/GPU clusters
• Benchmarking against analytical results
• Still to Come
• Summary

Page 4: High-Performance Simulations of Coherent Synchrotron ...

CSR: Physical Problem
Beam's self-interaction due to CSR can lead to a host of adverse effects:
• Increase in energy spread
• Emittance degradation
• Longitudinal instability (micro-bunching)
Being able to quantitatively simulate CSR is the first step toward mitigating its adverse effects.
It is vitally important to have a trustworthy 2D CSR code.

Page 5: High-Performance Simulations of Coherent Synchrotron ...

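CSR: Computational Challenges
• Dynamics governed by the Lorentz force:
$\frac{d}{dt}(\gamma m_e \vec{v}) = e\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{B}\right)$, with $\vec{E} = \vec{E}^{ext} + \vec{E}^{self}$ and $\vec{B} = \vec{B}^{ext} + \vec{B}^{self}$
• $\vec{E}^{ext}, \vec{B}^{ext}$: external EM fields
• $\vec{E}^{self}, \vec{B}^{self}$: self-interaction (CSR)
Self-fields: $\vec{E}^{self} = -\nabla\phi - \frac{1}{c}\frac{\partial \vec{A}}{\partial t}$, $\vec{B}^{self} = \nabla \times \vec{A}$
Retarded potentials: $\phi(\vec{r},t) = \int \frac{\rho(\vec{r}',t')}{|\vec{r}-\vec{r}'|}\, d\vec{r}'$, $\vec{A}(\vec{r},t) = \int \frac{\vec{J}(\vec{r}',t')}{|\vec{r}-\vec{r}'|}\, d\vec{r}'$
Retarded time: $t' = t - \frac{|\vec{r}-\vec{r}'|}{c}$
Need to track the entire history of the bunch.
Beam distribution function (DF): $f(\vec{r},\vec{v},t)$
Charge density: $\rho(\vec{r},t) = \int f(\vec{r},\vec{v},t)\, d\vec{v}$
Current density: $\vec{J}(\vec{r},t) = \int \vec{v}\, f(\vec{r},\vec{v},t)\, d\vec{v}$
Challenges highlighted on the slide:
• LARGE CANCELLATION
• NUMERICAL NOISE DUE TO GRADIENTS
• ENORMOUS COMPUTATIONAL AND MEMORY LOAD
• ACCURATE 2D INTEGRATION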
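The retarded-time relation t' = t - |r - r'|/c can be illustrated with a few lines of Python. This is only a sketch: the function name `retarded_time`, the 2D tuple representation of points, and the use of SI units are assumptions made for the example, not details of the code presented in the talk.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI units assumed for this sketch)

def retarded_time(t, r, r_src):
    """Return t' = t - |r - r'| / c for 2D points r and r_src given in metres."""
    distance = math.hypot(r[0] - r_src[0], r[1] - r_src[1])
    return t - distance / C

# A source one light-second away influences the observer with a one-second delay.
t_prime = retarded_time(1.0, (0.0, 0.0), (C, 0.0))
```

Because t' depends on both the observation point and the source point, every evaluation of the potentials needs the charge and current densities at earlier times, which is why the bunch history has to be stored.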

Page 6: High-Performance Simulations of Coherent Synchrotron ...

CSR: Computational Challenges
Our new code solves the main computational challenges associated with the numerical simulation of CSR effects:
• Enormous computational and memory load (storing and integration over beam's history) ⇒ Parallel implementation on GPU/CPU platforms
• Large cancellation in the Lorentz force ⇒ Developed high-accuracy, adaptive multidimensional integrator for GPUs
• Scaling of the beam self-interaction ⇒ Particle-in-Cell (PIC) code
• Self-interaction in PIC codes scales as grid resolution squared (Point-to-point codes: scales as number of macroparticles squared)
• Numerical noise ⇒ Noise removal using wavelets

Page 7: High-Performance Simulations of Coherent Synchrotron ...

New Code: The Big Picture
Flow chart of one time step:
N point-particles (macroparticles) at t = t_k
→ Bin particles on N_x × N_y grid
→ Store distribution on N_x × N_y grid
→ Integrate over grid histories to compute retarded potentials and corresponding forces on the N_x × N_y grid
→ Interpolate to obtain forces on each particle
→ Advance particles by ∆t
→ System at t = t_k + ∆t
Callout on the slide: NON-STANDARD FOR PIC CODES
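The sequence of stages in one time step can be sketched as a small driver function. This is a structural illustration only: `pic_step` and its five callable arguments are hypothetical names introduced here, and the physics of each stage is deliberately left to the caller rather than guessed at.

```python
def pic_step(particles, history, dt, deposit, grid_forces, interpolate, push):
    """Advance the system by one step of size dt.

    All stage functions are supplied by the caller (hypothetical interface):
      deposit(particles)            -> gridded distribution at the current time
      grid_forces(history)          -> forces on the grid from the stored history
      interpolate(forces, particles)-> force acting on each particle
      push(particles, forces, dt)   -> particles advanced by dt
    """
    gridded = deposit(particles)          # bin particles on the grid
    history.append(gridded)               # store the distribution for later steps
    forces = grid_forces(history)         # integrate over grid histories
    per_particle = interpolate(forces, particles)
    return push(particles, per_particle, dt)
```

The only stage that reads more than the current state is `grid_forces`, which receives the whole stored history; that is the step the slide flags as non-standard for PIC codes.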

Page 8: High-Performance Simulations of Coherent Synchrotron ...

New Code: Computing Retarded Potentials
• Carry out integration over history:
• Determine limits of integration in lab frame: compute $R_{max}$ and $(\theta_{min}^{i}, \theta_{max}^{i})$
For each gridpoint, independently, do the same integration over beam's history.
Obvious candidate for parallel computation.
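Because each gridpoint carries out the same integration independently of the others, the work maps directly onto a parallel "map". The sketch below assumes a toy 1D kernel, 1/(1 + (x - r)^2) integrated over r in [0, 1], in place of the real retarded-potential integrand, and uses a Python thread pool purely to show the independence of the tasks (Python threads do not speed up CPU-bound work; the talk's implementation targets GPUs).

```python
from concurrent.futures import ThreadPoolExecutor

def integrate_at_gridpoint(x, n=200):
    """Midpoint-rule integral of the toy kernel 1/(1+(x-r)^2) over r in [0, 1]."""
    h = 1.0 / n
    return h * sum(1.0 / (1.0 + (x - (j + 0.5) * h) ** 2) for j in range(n))

def integrate_all(gridpoints, workers=4):
    """Apply the same integration to every gridpoint; no task needs another's result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(integrate_at_gridpoint, gridpoints))

gridpoints = [i / 8 for i in range(9)]
values = integrate_all(gridpoints)
```

The parallel result is identical to a serial loop over the gridpoints, since there is no communication between tasks.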

Page 9: High-Performance Simulations of Coherent Synchrotron ...

Parallel Computation on GPUs
• Parallel computation on GPUs
• Ideally suited for algorithms with high arithmetic operation/memory access ratio
• Single Instruction, Multiple Data (SIMD)
• Several types of memories with varying access times (global, shared, registers)
• Uses extension to existing programming languages to handle new architecture
• GPUs have many smaller cores (~400-500) designed for parallel execution
• Avoid branching and communication between computational threads
Figure (CPU vs. GPU): More space for ALU, less for cache and flow control
GPU: grid, blocks, threads
Example: NVIDIA GeForce GTX 480 GPU has 480 cores

Page 10: High-Performance Simulations of Coherent Synchrotron ...

Parallel Computation on GPUs
Computing the retarded potentials requires integrating over the entire bunch history: very slow! Must parallelize.
Integration over a grid is ideally suited for GPUs:
• No need for communication between gridpoints
• Same kernel executed for all
• Can remove all branches from the algorithm
We designed a new adaptive multidimensional integration algorithm optimized for GPUs [Arumugam, Godunov, Ranjan, Terzić & Zubair 2013a,b]
• NVIDIA's CUDA framework (extension to C++)
• About 2 orders of magnitude speedup over a serial implementation
• Useful beyond this project

Page 11: High-Performance Simulations of Coherent Synchrotron ...

Performance Comparison: CPU vs. GPU
Comparison: 1 CPU vs. 1 GPU; 8 CPUs vs. 4 GPUs (one compute node)
• 1 GPU over 50x faster than 1 CPU
• Both linearly scale with multicores: 4 GPUs 25x faster than 8 CPUs
• Hybrid CPU/GPU implementation marginally better than GPUs alone
• Execution time reduces as the number of point-particles grows
• More particles, less numerical noise, fewer function evaluations needed for high-accuracy integration
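The two quoted figures are consistent with each other under ideal linear scaling, which a one-line calculation confirms. The helper name `relative_speedup` is introduced here only for illustration.

```python
def relative_speedup(gpu_vs_cpu, n_gpus, n_cpus):
    """Speedup of n_gpus over n_cpus, assuming both scale linearly with device count."""
    return gpu_vs_cpu * n_gpus / n_cpus

# 1 GPU is about 50x faster than 1 CPU, so 4 GPUs vs. 8 CPUs gives 50 * 4 / 8.
print(relative_speedup(50, 4, 8))  # 25.0
```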

Page 12: High-Performance Simulations of Coherent Synchrotron ...

GPU Cluster Implementation
The higher the resolution, the larger the fraction of time spent on computing integrals (and therefore the speedup).
We expect the scaling at larger resolutions to be nearly linear.
1 step of the simulation on a 128x128 grid and 32 GPUs: ~10 s
Figure: Speedup vs. number of GPUs (1, 2, 4, 8, 16, 32) for grid resolutions 128 x 128 and 64 x 64; N = 102400

Page 13: High-Performance Simulations of Coherent Synchrotron ...

Benchmarking Against Analytic 1D Results
• Analytic steady state solution available for a rigid line Gaussian bunch [Derbenev & Shiltsev 1996, SLAC-Pub 7181]
• Excellent agreement between analytic and computed solutions provides a proof of concept for the new code
N = 512000, N_x = N_y = 64
Figure (LONGITUDINAL): Effective Longitudinal CSR Force [keV/m] vs. $s/\sigma_s$, analytic and computed
Figure (TRANSVERSE): Effective Transverse CSR Force [keV/m] vs. $s/\sigma_s$, analytic and computed

Page 14: High-Performance Simulations of Coherent Synchrotron ...

Large Cancellation in the Lorentz Force
• Traditionally difficult to track large quantities which mostly cancel out:
Effective Longitudinal Force: $\frac{\partial \phi}{\partial s} - \beta_s \frac{\partial A_s}{\partial s}$
• High accuracy of the implementation able to track accurately these cancellations over 5 orders of magnitude
Figure annotations: $4 \times 10^{7}$, $6 \times 10^{2}$; N = 128000, N_x = N_y = 32

Page 15: High-Performance Simulations of Coherent Synchrotron ...

Efforts Currently Underway
• Compare to 2D semi-analytical results (chirped bunch) [Li 2008, PR STAB 11, 024401]
• Compare to other 2D codes (for instance Bassi et al. 2009)
• Simulate a test chicane
Further Afield:
• Various boundary conditions
• Shielding
• Use wavelets to remove numerical noise (increase efficiency and accuracy)
• Explore the need and feasibility of generalizing the code from 2D to 3D

Page 16: High-Performance Simulations of Coherent Synchrotron ...

Summary
Presented the new 2D PIC code:
• Resolves traditional computational difficulties by optimizing our algorithm on a GPU platform
• Proof of concept: excellent agreement with analytical 1D results
Outlined outstanding issues that will soon be implemented
Closing in on our goal:
• Accurate and efficient code which faithfully simulates CSR effects

Page 17: High-Performance Simulations of Coherent Synchrotron ...

Backup Slides

Page 18: High-Performance Simulations of Coherent Synchrotron ...

Importance of Numerical Noise
• Signal-to-noise ratio in PIC simulations scales as $N_{ppc}^{1/2}$ [Terzić, Pogorelov & Bohn 2007, PR STAB 10, 034201]
• Then the numerical noise scales as $N_{ppc}^{-1/2}$ ($N_{ppc}$: avg. # of particles per cell)
Figure: 128 x 128 grid
Less numerical noise = more accurate and faster simulations [Terzić, Pogorelov & Bohn 2007, PR STAB 10, 034201; Terzić & Bassi 2011, PR STAB 14, 070701]
Execution time for integral evaluation also scales as $N_{ppc}^{-1/2}$
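The N_ppc^(-1/2) noise scaling can be checked with a small Monte Carlo experiment. The sketch below assumes a uniform distribution deposited with nearest-cell binning on a 1D array of cells, which is a much simpler setting than the talk's 2D simulations; the function name `relative_noise` and its parameters are introduced here for illustration.

```python
import math
import random

def relative_noise(n_ppc, n_cells=256, seed=0):
    """Relative cell-to-cell fluctuation of a uniformly sampled density."""
    rng = random.Random(seed)
    counts = [0] * n_cells
    for _ in range(n_ppc * n_cells):
        counts[rng.randrange(n_cells)] += 1   # drop each particle into a random cell
    variance = sum((c - n_ppc) ** 2 for c in counts) / n_cells
    return math.sqrt(variance) / n_ppc

# Quadrupling the particles per cell should roughly halve the relative noise.
ratio = relative_noise(100) / relative_noise(400)
```

With these settings the ratio comes out close to 2, as expected from the square-root scaling, up to statistical scatter.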

Page 19: High-Performance Simulations of Coherent Synchrotron ...

Wavelet Denoising and Compression
When the signal is known, one can compute the Signal-to-Noise Ratio (SNR):
$\mathrm{SNR} = \sqrt{\frac{\sum_{i=1}^{N_{grid}} \bar{q}_i^{2}}{\sum_{i=1}^{N_{grid}} (q_i - \bar{q}_i)^{2}}}$, with $\bar{q}_i$ exact and $q_i$ gridded; $\mathrm{SNR} \sim \sqrt{N_{ppc}}$
$N_{ppc}$: avg. # of particles per cell, $N_{ppc} = N/N_{cells}$
Example: 2D superimposed Gaussians on 256×256 grid
Wavelet denoising yields a representation which is:
- Appreciably more accurate than non-denoised representation
- Sparse (if clever, we can translate this sparsity into computational efficiency)
Figure panels:
• ANALYTICAL
• N_ppc = 3, SNR = 2.02
• N_ppc = 205, SNR = 16.89
• WAVELET THRESHOLDING, DENOISED: N_ppc = 3, SNR = 16.83; COMPACT: only 0.12% of coeffs
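A direct way to evaluate such a signal-to-noise ratio is sketched below, taking SNR as the square root of the summed squared exact values divided by the summed squared differences between gridded and exact values. That reading of the slide's formula, and the function name `snr`, are assumptions of this example.

```python
import math

def snr(exact, gridded):
    """sqrt( sum(exact_i^2) / sum((gridded_i - exact_i)^2) ) over all gridpoints."""
    signal = sum(q * q for q in exact)
    noise = sum((g - q) ** 2 for q, g in zip(exact, gridded))
    return math.sqrt(signal / noise)

# Two gridpoints, one of them off by 1: signal = 9 + 16, noise = 1.
print(snr([3.0, 4.0], [3.0, 5.0]))  # 5.0
```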

Page 20: High-Performance Simulations of Coherent Synchrotron ...

Performance Comparison: GPU vs. Hybrid CPU/GPU
Comparison: 1 CPU vs. 1 GPU; 8 CPUs vs. 4 GPUs (one compute node)
Hybrid CPU/GPU implementation marginally better than GPUs alone

Page 21: High-Performance Simulations of Coherent Synchrotron ...

Breakdown of Computations

Page 22: High-Performance Simulations of Coherent Synchrotron ...

New Code: Computation of CSR Effects
3 coordinate frames for easier computation
Computing retarded potentials: Major computational bottleneck

Page 23: High-Performance Simulations of Coherent Synchrotron ...

New Code: Particle-In-Cell
• Grid resolution is specified a priori (fixed grid)
• N_X: # of gridpoints in X
• N_Y: # of gridpoints in Y
• N_grid = N_X × N_Y total gridpts
• Grid: $\{(\tilde{X}_{ij}, \tilde{Y}_{ij})\}$, $i = 1, \ldots, N_x$; $j = 1, \ldots, N_y$
• Inclination angle α
• Point-particles deposited on the grid via deposition scheme
• Grid is determined so as to tightly envelop all particles
Minimizing number of empty cells ⇒ optimizing spatial resolution

Page 24: High-Performance Simulations of Coherent Synchrotron ...

New Code: Frames of Reference
• Choosing a correct coordinate system is of crucial importance
• To simplify calculations use 3 frames of reference:
• Frenet frame (s, x)
s: along design orbit
x: deviation normal to direction of motion
- Particle push
• Lab frame (X, Y)
- Integration range
- Integration of retarded potentials
• Grid frame (X~, Y~)
Scaled & rotated lab frame, always [-0.5, 0.5] × [-0.5, 0.5]
- Particle deposition
- Grid interpolation
- History of the beam
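A "scaled and rotated lab frame" that always spans [-0.5, 0.5] × [-0.5, 0.5] can be sketched as a rotation followed by a rescaling. The exact transformation used in the code is not given on the slide, so the function `lab_to_grid`, its parameters (`center`, `alpha`, `width`, `height`), and the sign convention of the rotation are all assumptions of this example.

```python
import math

def lab_to_grid(X, Y, center, alpha, width, height):
    """Map lab coordinates to a grid frame covering [-0.5, 0.5] x [-0.5, 0.5].

    center: lab-frame position of the grid centre (hypothetical parameter)
    alpha:  inclination angle of the grid with respect to the lab X axis
    width, height: physical extent of the grid along its own two axes
    """
    dx, dy = X - center[0], Y - center[1]
    # Rotate by -alpha so the grid axes line up with the coordinate axes.
    xr = dx * math.cos(alpha) + dy * math.sin(alpha)
    yr = -dx * math.sin(alpha) + dy * math.cos(alpha)
    # Scale so the full extent of the grid maps onto a unit square.
    return xr / width, yr / height

# A point half a grid width from the centre, along the inclined axis, maps to (0.5, 0).
a = math.pi / 6
edge = lab_to_grid(1.0 + math.cos(a), 2.0 + math.sin(a), (1.0, 2.0), a, 2.0, 1.0)
```

Working in such a normalized frame means deposition and interpolation routines never need to know the physical size or orientation of the bunch.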

Page 25: High-Performance Simulations of Coherent Synchrotron ...

Semi-Analytic 2D Results: 1D Model Breaks Down
• Analytic steady state solution is justified for $\frac{\sigma_x}{(R \sigma_z^{2})^{1/3}} \ll 1$ [Derbenev & Shiltsev 1996]
• Li, Legg, Terzić, Bisognano & Bosch 2011:
Model bunch compressor (chicane):
E = 70 MeV
σ_z0 = 0.5 mm
u = -10.56 m^-1 (energy chirp)
L_b = 0.3 m
L_B = 0.6 m
L_d = 0.4 m
1D & 2D disagree in:
• Magnitude of CSR force
• Location of maximum force
⇒ 1D CSR model is inadequate
Preliminary simulations show good agreement between 2D semi-analytic results and results obtained with our code

Page 26: High-Performance Simulations of Coherent Synchrotron ...

Wavelets
• Orthogonal basis of functions composed of scaled and translated versions of the same localized mother wavelet ψ(x) and the scaling function ϕ(x):
$\psi_i^k(x) = 2^{k/2}\, \psi(2^k x - i), \quad k, i \in \mathbb{Z}$
$f(x) = s_0^0\, \phi_0^0(x) + \sum_i \sum_k d_i^k\, \psi_i^k(x)$
• Each new resolution level k is orthogonal to the previous levels
• Compact support: finite domain over which nonzero
• In order to attain orthogonality of different scales, their shapes are strange
- Suitable to represent irregularly shaped functions
• For discrete signals (gridded quantities), fast Discrete Wavelet Transform (DWT) is an O(MN) operation, M size of the wavelet filter, N signal size
Figure: Daubechies 4th order wavelet

Page 27: High-Performance Simulations of Coherent Synchrotron ...

Advantages of Wavelet Formulation
• Wavelet basis functions have compact support ⇒ signal localized in space
Wavelet basis functions have increasing resolution levels ⇒ signal localized in frequency
⇒ Simultaneous localization in space and frequency (FFT only frequency)
• Wavelet basis functions correlate well with various signal types (including signals with singularities, cusps and other irregularities)
⇒ Compact and accurate representation of data (compression)
• Wavelet transform preserves hierarchy of scales
• In wavelet space, discretized operators (Laplacian) are also sparse and have an efficient preconditioner
⇒ Solving some PDEs is faster and more accurate
• Provide a natural setting for numerical noise removal ⇒ Wavelet denoising
Wavelet thresholding: If $|w_{ij}| < T$, set $w_{ij} = 0$.
[Terzić, Pogorelov & Bohn 2007, PR STAB 10, 034201]
[Terzić & Bassi 2011, PR STAB 14, 070701]
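Wavelet thresholding (set every coefficient with |w| < T to zero, then transform back) can be sketched in one dimension. To keep the example short it uses the orthonormal Haar wavelet rather than the Daubechies wavelets discussed in the talk; the function names and the threshold value are choices made for this illustration.

```python
import math

S = math.sqrt(2.0)

def haar_forward(signal):
    """Full orthonormal Haar transform; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / S for i in range(half)]
        det = [(coeffs[2 * i] - coeffs[2 * i + 1]) / S for i in range(half)]
        coeffs[:n] = avg + det      # averages first, details after them
        n = half                    # recurse on the averages only
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / S, (a - d) / S]
        out[:2 * n] = merged
        n *= 2
    return out

def threshold(coeffs, T):
    """Wavelet thresholding: zero every coefficient with magnitude below T."""
    return [0.0 if abs(w) < T else w for w in coeffs]

noisy = [1.0, 1.02, 0.98, 1.0, 5.0, 5.01, 4.99, 5.0]   # a step plus small noise
kept = threshold(haar_forward(noisy), T=0.1)
denoised = haar_inverse(kept)
```

In this toy case only two of the eight coefficients survive the threshold, and the reconstruction is the clean step (four values near 1 followed by four near 5), which illustrates both the compression and the denoising aspects.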

Page 28: High-Performance Simulations of Coherent Synchrotron ...

Wavelet Compression
Modulated flat-top particle distribution
Fraction of non-zero coefficients retained after wavelet thresholding: 1%, 0.1%
[From Terzić & Bassi 2011, PR STAB 14, 070701]

Page 29: High-Performance Simulations of Coherent Synchrotron ...

CSR: Point-to-Point Approach
• Point-to-Point approach (2D): [Li 1998]
• Charge density is sampled with N Gaussian-shaped 2D macroparticles (2D distribution without vertical spread)
• Each macroparticle interacts with each macroparticle throughout history
• Expensive: computation of retarded potentials and self fields ~ O(N^2)
⇒ small number N ⇒ poor spatial resolution ⇒ difficult to see small-scale structure
• While useful in obtaining low-order moments of the beam, Point-to-Point approach is not optimal for studying CSR
DF: $f(\vec{r},\vec{v},t) = q \sum_{i=1}^{N} n_m\big(\vec{r} - \vec{r}_0^{(i)}(t)\big)\, \delta\big(\vec{v} - \vec{v}_0^{(i)}(t)\big)$
Charge density: $\rho(\vec{r},t) = q \sum_{i=1}^{N} n_m\big(\vec{r} - \vec{r}_0^{(i)}(t)\big)$
Current density: $\vec{J}(\vec{r},t) = q \sum_{i=1}^{N} \vec{v}_0^{(i)}(t)\, n_m\big(\vec{r} - \vec{r}_0^{(i)}(t)\big)$
Gaussian macroparticle: $n_m\big(\vec{r} - \vec{r}_0(t)\big) = \frac{1}{2\pi\sigma_m^{2}} \exp\left[-\frac{(x - x_0(t))^{2} + (y - y_0(t))^{2}}{2\sigma_m^{2}}\right]$

Page 30: High-Performance Simulations of Coherent Synchrotron ...

CSR: Particle-In-Cell Approach
• Particle-In-Cell approach with retarded potentials (2D):
• Charge and current densities are sampled with N point-charges (δ-functions) and deposited on a finite grid using a deposition scheme
• Two main deposition schemes
- Nearest Grid Point (NGP) (constant: deposits to $1^D$ points)
- Cloud-In-Cell (CIC) (linear: deposits to $2^D$ points)
There exist higher-order schemes
• Particles do not directly interact with each other, but only through a mean-field of the gridded representation
DF (Klimontovich): $f(\vec{r},\vec{v},t) = q \sum_{i=1}^{N} \delta\big(\vec{r} - \vec{r}_0^{(i)}(t)\big)\, \delta\big(\vec{v} - \vec{v}_0^{(i)}(t)\big)$
Charge density: $\rho(\vec{x}_k,t) = q \sum_{i=1}^{N} \int_{-h}^{h} \delta\big(\vec{x}_k - \vec{x}_0^{(i)}(t) - \vec{X}\big)\, p(\vec{X})\, d\vec{X}$
Current density: $\vec{J}(\vec{x}_k,t) = q \sum_{i=1}^{N} \vec{v}_0^{(i)}(t) \int_{-h}^{h} \delta\big(\vec{x}_k - \vec{x}_0^{(i)}(t) - \vec{X}\big)\, p(\vec{X})\, d\vec{X}$
Figure: deposition kernel p(x) for NGP and CIC ($\vec{x}_k$: gridpoint location; $\vec{x}$: macroparticle location)
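The two deposition schemes can be compared in one dimension, where NGP deposits each particle to a single gridpoint and CIC splits it linearly between the two neighbouring gridpoints. The sketch assumes gridpoints at x_k = k·h and particle positions in 0 ≤ x < (n_grid - 1)·h; the function names are introduced here for illustration.

```python
import math

def deposit_ngp(positions, n_grid, h=1.0, q=1.0):
    """Nearest Grid Point: all of a particle's charge goes to the closest gridpoint."""
    rho = [0.0] * n_grid
    for x in positions:
        k = int(math.floor(x / h + 0.5))
        rho[k] += q / h
    return rho

def deposit_cic(positions, n_grid, h=1.0, q=1.0):
    """Cloud-In-Cell: charge is shared linearly between the two enclosing gridpoints."""
    rho = [0.0] * n_grid
    for x in positions:
        k = int(math.floor(x / h))
        w = x / h - k               # fractional distance past gridpoint k
        rho[k] += q * (1.0 - w) / h
        rho[k + 1] += q * w / h
    return rho

print(deposit_ngp([1.25], 4))  # [0.0, 1.0, 0.0, 0.0]
print(deposit_cic([1.25], 4))  # [0.0, 0.75, 0.25, 0.0]
```

Both schemes conserve the total charge; CIC varies continuously as a particle moves across a cell, which is why it yields a smoother gridded density than NGP.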

Page 31: High-Performance Simulations of Coherent Synchrotron ...

CSR: P2P vs. PIC
• Computational cost for P2P: Total cost ~ O(N^2)
• Integration over history (yields self-forces): O(N^2) operation
• Computational cost for PIC: Total cost ~ O(N_grid^2)
• Particle deposition (yields gridded charge & current densities): O(N) operation
• Integration over history (yields retarded potentials): O(N_grid^2) operation
• Finite difference (yields self-forces on the grid): O(N_grid) operation
• Interpolation (yields self-forces acting on each of N particles): O(N) operation
• Overall ~ O(N_grid^2) + O(N) operations
• But in realistic simulations: N_grid^2 >> N, so the total cost is ~ O(N_grid^2)
• Favorable scaling allows for larger N, and reasonable grid resolution
⇒ Improved spatial resolution
• Fair comparison: P2P with N macroparticles and PIC with N_grid = N
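The difference between the two scalings can be made concrete with a rough operation count. The sketch below keeps only the leading terms quoted above and sets every constant prefactor to one, so the numbers indicate orders of magnitude only; the example values N = 512000 and a 64 × 64 grid are the ones quoted for the 1D benchmark.

```python
def p2p_cost(n):
    """Leading-order operation count for point-to-point: O(N^2)."""
    return n ** 2

def pic_cost(n, n_grid):
    """Leading-order operation count for PIC: O(N_grid^2) + O(N)."""
    return n_grid ** 2 + n

n = 512_000
n_grid = 64 * 64
ratio = p2p_cost(n) / pic_cost(n, n_grid)   # roughly four orders of magnitude
```

Here N_grid^2 (about 1.7e7) already dominates N, so the PIC cost is set almost entirely by the grid resolution rather than by the number of particles.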

Page 32: High-Performance Simulations of Coherent Synchrotron ...

CSR: P2P vs. PIC
• Difference in spatial resolution: An illustrative example
• Analytical distribution sampled with N = N_X N_Y macroparticles (as in P2P)
• On an N_X × N_Y grid (as in PIC)
• 2D grid: N_X = N_Y = 32
Signal-to-Noise Ratio: $\mathrm{SNR} = \sqrt{\frac{\sum_{i=1}^{N_{grid}} \bar{q}_i^{2}}{\sum_{i=1}^{N_{grid}} (q_i - \bar{q}_i)^{2}}}$, with $\bar{q}_i$ exact and $q_i$ gridded
Figure panels:
• EXACT
• P2P: N = 32^2, SNR = 2.53
• PIC: N = 50 x 32^2, SNR = 13.89
• PIC approach provides superior spatial resolution to P2P approach
• This motivates us to use a PIC code

Page 33: High-Performance Simulations of Coherent Synchrotron ...

Outline of the P2P Algorithm
Flow chart of one time step:
N macroparticles at t < t_k
N macroparticles at t = t_k
→ Integrate over particle histories to compute retarded potentials and corresponding forces on each macroparticle
→ Advance particles by ∆t
→ System at t = t_k + ∆t