On the Design of Nonconforming High-Resolution Finite ... · July 30 - August 1, 2012. Motivation...

On the Design of Nonconforming High-Resolution Finite Element Schemes for Transport Problems

Matthias Möller

Institute of Applied Mathematics (LS3), TU Dortmund, Germany

Modeling and Simulation of Transport Phenomena, Moselle Valley, Germany, July 30 - August 1, 2012

Motivation

Objective: apply AFC schemes to nonconforming finite elements

■ Do the algebraic design criteria apply to nonconforming elements?
■ Is the accuracy of solutions comparable to P1/Q1 approximations?
■ How to implement essential boundary conditions?
■ Is there any benefit from using nonconforming elements?

Algebraic Flux Correction (AFC): a family of high-resolution schemes for convection-dominated transport and anisotropic diffusion problems.

■ universal stabilization approach based on algebraic design criteria
■ well established for conforming (multi-)linear finite element schemes
■ found difficult to extend to higher-order finite elements

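Model problem

■ Convection-diffusion equation

$\int_\Omega w \, [\dot u + \nabla \cdot f(u)] \, dx = 0, \quad f(u) = v u - d \nabla u$

■ Fletcher's group representation

$u(x,t) \approx \sum_j \varphi_j(x) \, u_j(t), \quad f(u) \approx \sum_j \varphi_j(x) \, f(u_j)$

■ Semi-discrete high-order scheme

$\sum_j m_{ij} \dot u_j = \sum_j k_{ij} u_j + \sum_j s_{ij} u_j$

$k_{ij} = -v_j \cdot \int_\Omega \varphi_i \nabla \varphi_j \, dx, \quad s_{ij} = -d \int_\Omega \nabla \varphi_i \cdot \nabla \varphi_j \, dx$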

Model problem

■ Semi-discrete high-order scheme in matrix form

$M_C \dot u = K u$

■ Equivalent row-wise form

$\sum_j m_{ij} \dot u_j = \sum_{j \ne i} k_{ij} (u_j - u_i) + \delta_i u_i, \quad \delta_i = \sum_j k_{ij}$
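The row-wise form is just a regrouping of the matrix-vector product K u: subtracting and adding u_i times the row sum leaves the result unchanged. A minimal sketch that checks this identity numerically, assuming a dense NumPy matrix and a hypothetical helper name `row_form` (neither is taken from the talk):

```python
import numpy as np

def row_form(K, u):
    """Evaluate (K u)_i as sum_{j != i} k_ij (u_j - u_i) + delta_i u_i."""
    delta = K.sum(axis=1)              # delta_i = sum_j k_ij (row sums of K)
    diff = u[None, :] - u[:, None]     # diff[i, j] = u_j - u_i, zero on the diagonal
    return (K * diff).sum(axis=1) + delta * u

rng = np.random.default_rng(0)
K = rng.standard_normal((5, 5))        # arbitrary test matrix, not a real FE operator
u = rng.standard_normal(5)
same = np.allclose(row_form(K, u), K @ u)
```

Because the diagonal of `diff` vanishes, the term j = i drops out of the sum automatically, which is why no explicit masking is needed.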

Family of AFC schemes (Kuzmin et al.)

$M_L \dot u = [K + D] u + F(\dot u, u)$

■ $M_L$: lumped mass matrix
■ $D$: artificial diffusion operator
■ $F$: antidiffusive correction

■ Nonlinear flux correction: NL-GP, NL-FCT, NL-TVD
■ Linearized flux correction: Lin-FCT, $M_L u = M_L u^L + \bar F(\dot u^L, u^L)$
■ Low-order scheme: $F \equiv 0$
■ AFC-LPT, ...

Antidiffusive corrections: $F(u) = -D u$ and $F(\dot u, u) = [M_L - M_C] \dot u - D u$

Review of algebraic design principles

■ Jameson's Local Extremum Diminishing criterion

IF $m_i \dot u_i = \sum_{j \ne i} \sigma_{ij} (u_j - u_i)$ with $m_i$ positive and $\sigma_{ij}$ not negative,
THEN local solution maxima/minima do not increase/decrease

■ Semi-discrete high-resolution AFC scheme

$m_i \dot u_i = \sum_{j \ne i} (k_{ij} + d_{ij})(u_j - u_i) + \delta_i u_i + \sum_{j \ne i} \alpha_{ij} f_{ij}, \quad m_i = \sum_j m_{ij}$

■ $k_{ij} + d_{ij}$: not negative by construction
■ $\alpha_{ij} f_{ij}$: controlled by flux limiter

Need to check the positivity of mass matrix coefficients for each finite element by hand!
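The slides do not spell out how the artificial diffusion coefficients d_ij are chosen. A minimal sketch, assuming the construction commonly used in the AFC literature, d_ij = max(-k_ij, 0, -k_ji) with zero row sums, together with row-sum mass lumping; the function names are hypothetical:

```python
import numpy as np

def artificial_diffusion(K):
    """Symmetric D with zero row sums so that K + D has no negative off-diagonal entries.

    Assumed construction (not stated on the slides): d_ij = max(-k_ij, 0, -k_ji).
    """
    D = np.maximum(0.0, np.maximum(-K, -K.T))
    np.fill_diagonal(D, 0.0)
    np.fill_diagonal(D, -D.sum(axis=1))     # d_ii = -sum_{j != i} d_ij
    return D

def lumped_mass(MC):
    """Row-sum lumping: m_i = sum_j m_ij."""
    return MC.sum(axis=1)

def satisfies_led(m, L):
    """Check the two algebraic conditions: m_i > 0 and l_ij >= 0 for j != i."""
    off = L - np.diag(np.diag(L))
    return bool(np.all(m > 0) and np.all(off >= 0))

# Illustrative 1D central-difference convection operator on three nodes
K = 0.5 * np.array([[0.0, -1.0, 0.0], [1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
D = artificial_diffusion(K)
m = lumped_mass(np.array([[2.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 2.0]]) / 6.0)
```

For this small example K + D reduces to the first-order upwind operator, which is the usual interpretation of the low-order scheme.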

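Finite element spaces

■ Parametric quadrilateral finite elements

$Q(T) = \{ q = \hat q \circ \psi_T^{-1}, \ \hat q \in \hat Q(\hat T) \}$

■ Bilinear one-to-one mapping

$\psi_T : \hat T := [-1,1] \times [-1,1] \mapsto T \in \mathcal{T}_h$

■ $Q_1(T)$: $\hat Q_1(\hat T) = \mathrm{span}\langle 1, \hat x, \hat y, \hat x \hat y \rangle$
■ $Q_1^{rot}(T)$ (Rannacher & Turek '92): $\hat Q_1^{rot}(\hat T) = \mathrm{span}\langle 1, \hat x, \hat y, \hat x^2 - \hat y^2 \rangle$
■ Nonparametric variant: $Q^{rot}_{1,npar}(T) = \mathrm{span}\langle 1, \xi, \eta, \xi^2 - \eta^2 \rangle$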

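Nonconforming shape functions

$\hat\varphi_1(\hat{\mathbf x}) = \tfrac14 - \tfrac12 \hat y - \beta (\hat x^2 - \hat y^2), \quad \hat\varphi_3(\hat{\mathbf x}) = \tfrac14 + \tfrac12 \hat y - \beta (\hat x^2 - \hat y^2)$

$\hat\varphi_2(\hat{\mathbf x}) = \tfrac14 + \tfrac12 \hat x + \beta (\hat x^2 - \hat y^2), \quad \hat\varphi_4(\hat{\mathbf x}) = \tfrac14 - \tfrac12 \hat x + \beta (\hat x^2 - \hat y^2)$

■ Midpoint based variant, $\beta = \tfrac14$: $\hat\varphi_i(\hat m_j) = \delta_{ij}$

■ Mean value based variant, $\beta = \tfrac38$: $|\hat\Gamma_j|^{-1} \int_{\hat\Gamma_j} \hat\varphi_i(\hat{\mathbf x}) \, d\sigma = \delta_{ij}$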
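Both interpolation conditions can be checked numerically on the reference square. A minimal sketch, assuming the free parameter of the shape functions is called `beta` and that the four edges are ordered bottom, right, top, left (an ordering inferred from the formulas, not stated in the talk):

```python
import numpy as np

def phi_hat(x, y, beta):
    """Four rotated bilinear shape functions on [-1, 1]^2, ordered bottom, right, top, left."""
    q = x**2 - y**2
    return np.array([0.25 - 0.5 * y - beta * q,
                     0.25 + 0.5 * x + beta * q,
                     0.25 + 0.5 * y - beta * q,
                     0.25 - 0.5 * x + beta * q])

def midpoint_values(beta):
    """Matrix with entries phi_i(m_j) for the four edge midpoints m_j."""
    midpoints = [(0.0, -1.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
    return np.array([phi_hat(x, y, beta) for x, y in midpoints]).T

def edge_means(beta):
    """Matrix with entries |Gamma_j|^{-1} * integral of phi_i over edge Gamma_j."""
    s, w = np.polynomial.legendre.leggauss(2)   # exact for the quadratic edge traces
    one = np.ones_like(s)
    edges = [(s, -one), (one, s), (s, one), (-one, s)]
    return np.array([phi_hat(x, y, beta) @ w / 2.0 for x, y in edges]).T

midpoint_ok = np.allclose(midpoint_values(0.25), np.eye(4))
mean_ok = np.allclose(edge_means(0.375), np.eye(4))
```

Swapping the two parameter values breaks the respective Kronecker-delta property, which is a quick way to confirm that each value belongs to exactly one variant.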

Matrix analysis - part 1

■ Consistent mass matrix on reference element

$\hat M = \begin{pmatrix} a & b & c & b \\ b & a & b & c \\ c & b & a & b \\ b & c & b & a \end{pmatrix}, \quad a = \tfrac{16}{45}\beta^2 + \tfrac{7}{24}, \quad b = -\tfrac{16}{45}\beta^2 + \tfrac18, \quad c = \tfrac{16}{45}\beta^2 - \tfrac{1}{24}$

■ Positivity criterion

$0.3423 \approx \tfrac{1}{16}\sqrt{30} < |\beta| < \tfrac{3}{16}\sqrt{10} \approx 0.5929$

■ midpoint based variant ($\beta = \tfrac14$) has negative matrix coefficients
■ mean value based variant ($\beta = \tfrac38$) has positive matrix coefficients
■ lower bound is invariant to shape of quadrilateral element
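The positivity bounds can be reproduced by assembling the reference mass matrix with Gauss quadrature. A minimal sketch, assuming the shape-function parameter is called `beta`; it integrates over [-1, 1]^2 without further normalization, so its entries may differ from the tabulated ones by a constant positive factor, which does not affect the signs:

```python
import numpy as np

def reference_mass_matrix(beta, n=3):
    """Entries int phi_i phi_j over [-1, 1]^2; n = 3 Gauss points are exact here."""
    s, w = np.polynomial.legendre.leggauss(n)
    X, Y = np.meshgrid(s, s, indexing="ij")
    W = np.outer(w, w)                           # tensor-product quadrature weights
    q = X**2 - Y**2
    phi = np.stack([0.25 - 0.5 * Y - beta * q,
                    0.25 + 0.5 * X + beta * q,
                    0.25 + 0.5 * Y - beta * q,
                    0.25 - 0.5 * X + beta * q])
    return np.einsum("ikl,jkl,kl->ij", phi, phi, W)

def all_positive(beta):
    return bool(reference_mass_matrix(beta).min() > 0.0)

lower = np.sqrt(30.0) / 16.0        # about 0.3423
upper = 3.0 * np.sqrt(10.0) / 16.0  # about 0.5929
```

Evaluating `all_positive` slightly inside and outside the interval (`lower`, `upper`) shows the sign change of the entries coupling opposite edges (lower bound) and adjacent edges (upper bound).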


[Figure: matrix sparsity patterns. Q1 FE: nz = 369; Q1rot FE: nz = 530.]

[Figure: number of non-zero entries per matrix row (axis range 3 to 12) versus matrix row number (0 to 80) for Q1 and Q1rot.]

Storage format: CRS, CCS

Storage format: ELLPACK
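ELLPACK stores every row padded to the same width, so it is most economical when all rows have (nearly) the same number of non-zeros. A minimal sketch of the format with hypothetical helper names, converting from a dense NumPy array purely for illustration:

```python
import numpy as np

def to_ellpack(A):
    """Return (vals, cols): each row padded with zeros up to the longest row."""
    n = A.shape[0]
    width = int((A != 0).sum(axis=1).max())
    vals = np.zeros((n, width))
    cols = np.zeros((n, width), dtype=int)   # padding refers to column 0 with value 0
    for i in range(n):
        idx = np.flatnonzero(A[i])
        vals[i, :idx.size] = A[i, idx]
        cols[i, :idx.size] = idx
    return vals, cols

def ellpack_matvec(vals, cols, x):
    """y = A x using only the padded arrays."""
    return (vals * x[cols]).sum(axis=1)

def padding_ratio(A):
    """Fraction of stored ELLPACK entries that are padding."""
    vals, _ = to_ellpack(A)
    return 1.0 - np.count_nonzero(A) / vals.size

A = np.array([[4.0, 1.0, 0.0, 0.0],
              [1.0, 4.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 1.0],
              [0.0, 0.0, 1.0, 4.0]])
vals, cols = to_ellpack(A)
x = np.arange(1.0, 5.0)
y = ellpack_matvec(vals, cols, x)
```

The rectangular layout gives every row the same amount of work, which is the property that makes a regular sparsity pattern attractive on GPUs; row-compressed formats such as CRS avoid the padding but have variable row lengths.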

Solid body rotation

$\dot u + \nabla \cdot (v u) = 0$ in $(0,1)^2$, $\quad u = 0$ on $\Gamma_{inflow}$

■ Velocity field: $v(x,y) = (0.5 - y, \ x - 0.5)$

■ Grid size: $h = 1/2^l, \ l = 5, 6, \dots$

■ Stochastic grid disturbance: 0%, 1%, 5%

■ Time step in Crank-Nicolson: $\Delta t = 1.28 \cdot h$

■ Initial = exact solution at $t = 2\pi k, \ k \in \mathbb{N}$
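The velocity field is a rigid rotation about (0.5, 0.5) with unit angular velocity, which is why the initial profile is recovered at multiples of 2π. A minimal sketch of the test parameters, with hypothetical function names:

```python
import math

CENTER = (0.5, 0.5)

def velocity(x, y):
    """Velocity field of the solid body rotation test."""
    return (0.5 - y, x - 0.5)

def rotate(x, y, t):
    """Position at time t of a particle starting at (x, y) and moving with the velocity field."""
    dx, dy = x - CENTER[0], y - CENTER[1]
    c, s = math.cos(t), math.sin(t)
    return (CENTER[0] + c * dx - s * dy, CENTER[1] + s * dx + c * dy)

def grid_parameters(level):
    """Mesh width h = 1/2^level and time step dt = 1.28 h."""
    h = 1.0 / 2**level
    return h, 1.28 * h

h5, dt5 = grid_parameters(5)
back_home = rotate(0.75, 0.5, 2.0 * math.pi)
```

A finite-difference derivative of `rotate` at t = 0 reproduces `velocity`, so the characteristics of the transport equation are the circles traced by `rotate`.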

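SBR: low-order solutions (0% mesh perturbation)

[Figure: solution panels for $Q_1$, $Q^{rot,MVal}_{1,npar}$, $Q^{rot,MVal}_1$, $Q^{rot,MPt}_1$; panel annotations: 54%, 63%, 64%, 64%.]

SBR: low-order solutions (5% mesh perturbation)

[Figure: solution panels for $Q_1$, $Q^{rot,MVal}_{1,npar}$, $Q^{rot,MVal}_1$, $Q^{rot,MPt}_1$; panel annotations: 54%, 64%, 64%, 64%.]

SBR: NL-FCT solutions (0% mesh perturbation)

[Figure: solution panels for $Q_1$, $Q^{rot,MVal}_{1,npar}$, $Q^{rot,MVal}_1$, $Q^{rot,MPt}_1$.]

SBR: NL-FCT solutions (5% mesh perturbation)

[Figure: solution panels for $Q_1$, $Q^{rot,MVal}_{1,npar}$, $Q^{rot,MVal}_1$, $Q^{rot,MPt}_1$.]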

SBR: L2-error (5% mesh perturbation)

[Figure: L2-error (logarithmic axis, 0.01 to 1) versus refinement level (5 to 8) for Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD; two panels: conforming FE and nonconforming FE.]

Rotation of a Gaussian hill

$\dot u + \nabla \cdot (v u - d \nabla u) = 0$ in $(-1,1)^2$, $\quad v(x,y) = (-y, x), \quad d = 0.001$

[Figure: $\tilde Q_1$ FE, NL-FCT, time t = 5/2π]

RGH: L2-error (5% mesh perturbation)

[Figure: L2-error (logarithmic axis, 0.001 to 10) versus refinement level (5 to 9); two panels: conforming FE and nonconforming FE.]

RGH: dispersion-error (5% mesh perturbation)

[Figure: dispersion error (axis range -0.5 to 5.5) versus refinement level (5 to 9); two panels: conforming FE and nonconforming FE.]

Taxonomy of finite elements

                       conforming FE    nonconforming FE
accuracy               good             good
numerical diffusion    small            small(er)
#DOFs, #edges          smaller          larger
sparsity pattern       irregular        regular

Boundary conditions

Convection-diffusion equation:

$\nabla \cdot (v u - d \nabla u) = f$ in $\Omega$, $\quad u = u_D$ on $\Gamma_D$, $\quad (d \nabla u) \cdot n = g$ on $\Gamma_N$

Hyperbolic limit ($d \to 0$):

$\nabla \cdot (v u) = f$ in $\Omega$, $\quad (v u) \cdot n = h$ on $\Gamma_{in}$

■ Y. Bazilevs, T. Hughes, Weak imposition of Dirichlet boundary conditions in fluid mechanics, Computers & Fluids 36 (1) 2007, 12-26
   ■ consistent, adjoint-consistent: $\gamma = 1$
   ■ consistent, adjoint-inconsistent: $\gamma = -1$
   ■ penalty parameter: $\tau_b = C d / h_b$

■ E. Burman, A penalty free non-symmetric Nitsche type method for the weak imposition of boundary conditions, eprint arXiv:1106.5612v2 (Nov 2011)
   ■ $\gamma = -1, \ \tau \equiv 0$

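Weak imposition of boundary conditions

$$
\begin{aligned}
& \int_\Omega -\nabla w_h \cdot (v u_h - d \nabla u_h) \, dx + \int_\Gamma w_h (v u_h) \cdot n \, ds \\
& - \int_{\Gamma_D} w_h (d \nabla u_h) \cdot n \, ds - \int_{\Gamma_D} (\gamma d \nabla w_h) \cdot n \, u_h \, ds \\
& - \int_{\Gamma_D \cap \Gamma_{in}} (v w_h) \cdot n \, u_h \, ds + \sum_{b=1}^{N_{eb}} \int_{\Gamma_D \cap \Gamma_b} \tau_b w_h u_h \, ds \\
={} & \int_\Omega w_h f \, dx + \int_{\Gamma_N} w_h g \, ds - \int_{\Gamma_D} \gamma (d \nabla w_h) \cdot n \, u_D \, ds \\
& - \int_{\Gamma_D \cap \Gamma_{in}} (v w_h) \cdot n \, u_D \, ds + \sum_{b=1}^{N_{eb}} \int_{\Gamma_D \cap \Gamma_b} \tau_b w_h u_D \, ds
\end{aligned}
$$

$\gamma = \pm 1, \quad \tau_b = \dfrac{C d}{h_b}$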
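To make the individual boundary terms concrete, the weak form can be written out for a 1D problem with Dirichlet data at both ends. A minimal sketch under loud assumptions: plain P1 Galerkin elements on a uniform mesh without any AFC stabilization, a moderate diffusion coefficient d = 0.1 so that the unstabilized scheme stays well behaved, gamma and tau = C d / h as named in Bazilevs and Hughes, and hypothetical function names throughout:

```python
import numpy as np

def solve_nitsche_1d(n_el=200, v=1.0, d=0.1, gamma=1.0, C=4.0, u_left=1.0, u_right=0.0):
    """Solve v u' - d u'' = 0 on (0, 1) with weakly imposed Dirichlet values."""
    h = 1.0 / n_el
    n = n_el + 1
    A = np.zeros((n, n))
    b = np.zeros(n)
    grad = np.array([-1.0, 1.0]) / h      # derivatives of the two local hat functions
    # element matrix of  int -w' (v u - d u') dx
    Ke = d * h * np.outer(grad, grad) - v * h * np.outer(grad, [0.5, 0.5])
    for e in range(n_el):
        A[e:e + 2, e:e + 2] += Ke
    tau = C * d / h
    ends = [(0, [0, 1], -1.0, u_left), (n - 1, [n - 2, n - 1], 1.0, u_right)]
    for node, nodes, normal, uD in ends:
        dn = grad * normal                # normal derivatives of the local hats
        A[node, node] += v * normal       # + w (v u) . n
        A[node, nodes] -= d * dn          # - w (d u') . n
        A[nodes, node] -= gamma * d * dn  # - (gamma d w') . n u
        b[nodes] -= gamma * d * dn * uD   # - gamma (d w') . n u_D
        if v * normal < 0.0:              # inflow part of the Dirichlet boundary
            A[node, node] -= v * normal   # - (v w) . n u
            b[node] -= v * normal * uD    # - (v w) . n u_D
        A[node, node] += tau              # + tau w u
        b[node] += tau * uD               # + tau w u_D
    return np.linspace(0.0, 1.0, n), np.linalg.solve(A, b)

def exact(x, d=0.1):
    """Exact solution for u(0) = 1, u(1) = 0 and v = 1."""
    return np.expm1((x - 1.0) / d) / np.expm1(-1.0 / d)

x, u_sym = solve_nitsche_1d(gamma=1.0, C=4.0)     # adjoint-consistent, with penalty
_, u_free = solve_nitsche_1d(gamma=-1.0, C=0.0)   # non-symmetric, penalty free
err_sym = np.abs(u_sym - exact(x)).max()
err_free = np.abs(u_free - exact(x)).max()
```

Each line in the boundary loop corresponds to one boundary integral of the weak form, so switching `gamma` and `C` reproduces the three parameter combinations discussed in the talk for this simplified setting.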

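Weak imposition of boundary conditions

$$
\begin{aligned}
& \int_\Omega -\nabla w_h \cdot (v u_h - d \nabla u_h) \, dx + \int_\Gamma w_h (v u_h) \cdot n \, ds - \int_{\Gamma_D \cap \Gamma_{in}} (v w_h) \cdot n \, u_h \, ds \\
={} & \int_\Omega w_h f \, dx - \int_{\Gamma_D \cap \Gamma_{in}} (v w_h) \cdot n \, u_D \, ds
\end{aligned}
$$

Boundary term on the outflow part: $\int_{\Gamma_{out}} w_h \, v \cdot n \, u_h \, ds$

[Figure: FEM-TVD solutions with $Q_1^{rot}$ elements for three parameter choices: $\gamma = 1, \ \tau_b = 4 \cdot 0.001 / h_b$; $\gamma = -1, \ \tau_b = 4 \cdot 0.001 / h_b$; $\gamma = -1, \ \tau_b \equiv 0$.]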

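1D convection-diffusion equation

$1 \, u_x - 0.001 \, u_{xx} = 0, \quad u(0) = 1, \quad u(1) = 0$

[Figure: FEM-TVD solutions with $Q_1$ elements for three parameter choices: $\gamma = 1, \ \tau_b = 4 \cdot 0.001 / h$; $\gamma = -1, \ \tau_b = 4 \cdot 0.001 / h$; $\gamma = -1, \ \tau_b \equiv 0$.]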
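For reference, this two-point boundary value problem has the closed-form solution u(x) = (1 - exp((x - 1)/d)) / (1 - exp(-1/d)) with d = 0.001, which is essentially 1 except in a boundary layer of width O(d) at x = 1. A minimal sketch for evaluating it (the function name is hypothetical):

```python
import math

def exact_solution(x, d=0.001):
    """Exact solution of u_x - d u_xx = 0 with u(0) = 1 and u(1) = 0."""
    # expm1 keeps the evaluation accurate inside the thin boundary layer
    return math.expm1((x - 1.0) / d) / math.expm1(-1.0 / d)

u_mid = exact_solution(0.5)             # far from the layer
u_layer = exact_solution(1.0 - 0.001)   # one layer width away from the outflow boundary
```

The steep drop near x = 1 is what makes the way the outflow Dirichlet value is imposed visible in the numerical solutions.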

Summary

■ Do the algebraic design criteria apply to nonconforming elements? Yes, for the integral mean value based variant of nonconforming rotated bilinear finite elements.

■ Is the accuracy of solutions comparable to P1/Q1 approximations? Yes, accuracy and numerical dissipation are comparable.

■ How to implement essential boundary conditions? Apply consistent, adjoint-consistent Nitsche-type method.

■ Is there any benefit from using nonconforming elements? Regular sparsity structure beneficial for parallel HPC.

Outlook

■ Extend AFC schemes to other nonconforming finite elements
   ■ NC-Quad (Z. Cai, J. Douglas, Jr., X. Ye)
   ■ Composite Crouzeix-Raviart (F. Schieweck)

■ Generalize consistent, adjoint-consistent approach towards the implementation of Dirichlet boundary conditions to periodic ones

■ Improve (parallel) efficiency by exploiting the regular sparsity structure
   ■ use variant of ELLPACK matrix storage format
   ■ reduce communication costs due to weaker coupling

Acknowledgment

AFC schemes

■ D. Kuzmin (University of Erlangen-Nuremberg)

Featflow2

■ M. Köster, P. Zajac (TU Dortmund)

Source code freely available at:

http://www.featflow.de/en/software/featflow2.html

Edge-based solvers for the compressible Euler equations on multicores and GPUs

Example: edge-based flux assembly with Q1 FE

[Figure: bandwidth (GB/s, 0 to 50) versus #edges per color group (0K to 1,000K) for 1, 2, 4, 8, and 12 OMP on Xeon X5680 (2x6) and CUDA on C2070.]

■ best GPU performance for large problem sizes
■ best CPU performance if data fits into cache

Example: edge-based flux assembly with Q1 FE

[Figure: accumulated time (ms): OMP 6 132 ms, CUDA 23 ms (w/ transf. "55 ms"); speed-up 5.7x.]

[Figure: bandwidth (GB/s, 0 to 50) and #edges per color group (0K to 500K) over color groups 1-14; OMP 6 on Core i7 EP, CUDA on C2070; annotation: 6-7 GB/s.]

$F_i = \sum_{k=1}^{N_{colors}} \sum_{ij \in CG_k} \Phi_{ij} (U_j - U_i)$
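Grouping edges by color means that no two edges in the same group touch the same node, so all edges of a group can scatter into the nodal vector in parallel without write conflicts. A minimal sketch, assuming a simple greedy coloring, scalar nodal values, and a coefficient array called `phi`; the talk's actual coloring strategy and data layout are not shown on the slides:

```python
import numpy as np

def color_edges(edges):
    """Greedy grouping: within one group every node belongs to at most one edge."""
    groups = []                                    # list of (edge list, set of used nodes)
    for i, j in edges:
        for group, used in groups:
            if i not in used and j not in used:
                group.append((i, j))
                used.update((i, j))
                break
        else:                                      # no existing group can take this edge
            groups.append(([(i, j)], {i, j}))
    return [np.array(group) for group, _ in groups]

def assemble(groups, phi, U):
    """F_i = sum over color groups of sum over edges ij of phi_ij (U_j - U_i)."""
    F = np.zeros_like(U)
    for g in groups:
        i, j = g[:, 0], g[:, 1]
        F[i] += phi[i, j] * (U[j] - U[i])          # indices are unique within a group,
        F[j] += phi[j, i] * (U[i] - U[j])          # so these updates cannot collide
    return F

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
rng = np.random.default_rng(1)
phi = rng.random((4, 4))
U = rng.random(4)
groups = color_edges(edges)
F = assemble(groups, phi, U)
```

The outer loop over groups is sequential, while the vectorized updates inside one group stand in for the parallel loop that would run across OpenMP threads or CUDA threads.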

Example: edge-based flux assembly with Q1rot FE

[Figure: accumulated time (ms): OMP 6 191 ms, CUDA 32 ms (w/ transf. "95 ms"); speed-up 5.9x.]

[Figure: bandwidth (GB/s, 0 to 60) and #edges per color group (0K to 1,500K) over color groups 1-9; OMP 6 on Core i7 EP, CUDA on C2070; annotation: 6-7 GB/s.]
