On the Design of Nonconforming High-Resolution Finite ... · July 30 - August 1, 2012. Motivation...

37
On the Design of Nonconforming High-Resolution Finite Element Schemes for Transport Problems Matthias Möller Institute of Applied Mathematics (LS3) TU Dortmund, Germany Modeling and Simulation of Transport Phenomena Moselle Valley, Germany July 30 - August 1, 2012

Transcript of On the Design of Nonconforming High-Resolution Finite ... · July 30 - August 1, 2012. Motivation...

  • On the Design of NonconformingHigh-Resolution Finite Element

    Schemes for Transport ProblemsMatthias Möller

    Institute of Applied Mathematics (LS3)TU Dortmund, Germany

    Modeling and Simulation of Transport PhenomenaMoselle Valley, GermanyJuly 30 - August 1, 2012

  • Motivation

    Objective: apply AFC schemes to nonconforming finite elements

    ■ Do the algebraic design criteria apply to nonconforming elements?■ Is the accuracy of solutions comparable to P1/Q1 approximations?■ How to implement essential boundary conditions?■ Is there any benefit from using nonconforming elements?

    Algebraic Flux Correction, abbr. AFC, family of high-resolution schemesfor convection-dominated transport and anisotropic diffusion problems.

    ■ universal stabilization approach based on algebraic design criteria■ approved for conforming (multi-)linear finite element schemes■ found complicated to extend to higher-order finite elements

  • kij = �vj ·Z

    ⌦'ir'j dx, sij = �d

    Z

    ⌦r'i ·r'j dx

    Model problem

    ■ Convection-diffusion equation

    ■ Fletcher‘s group representation

    ■ Semi-discrete high-order scheme

    u(x, t) ⇡X

    j'j(x)uj(t), f(u) ⇡

    Xj'j(x)f(uj)

    Z

    ⌦w [u̇+r · f(u)] dx = 0, f(u) = vu� dru

    Xjmij u̇j =

    Xjkijuj +

    Xjsijuj

  • kij = �vj ·Z

    ⌦'ir'j dx,

    Model problem

    ■ Convection-diffusion equation

    ■ Fletcher‘s group representation

    ■ Semi-discrete high-order scheme

    u(x, t) ⇡X

    j'j(x)uj(t), f(u) ⇡

    Xj'j(x)f(uj)

    Z

    ⌦w [u̇+r · f(u)] dx = 0, f(u) = vu� dru

    MCu̇ = KuXjmij u̇j =

    Xj 6=i

    kij(uj � ui) + �iui

    �i =X

    jkij

  • Family of AFC schemes Kuzmin et al.

    MLu̇ = [K+D]u+ F(u̇,u)

    artificial diffusion operator

    antidiffusive correctionlumped mass matrix

  • linearized flux-correction

    nonlinear flux-correction

    Family of AFC schemes Kuzmin et al.

    NL-GP NL-FCT NL-TVD Low-order

    MLu = MLuL + F̄ (u̇L, uL)Lin-FCT

    F (u) = �DuF (u̇, u) = [ML �MC ]u̇�Du F ⌘ 0

    MLu̇ = [K+D]u+ F(u̇,u)

    AFC-LPT. . .

  • Review of algebraic design principles

    ■ Jameson‘s Local Extremum Diminishing criterionIF

    THEN local solution maxima/minima do not increase/decrease

    ■ Semi-discrete high-resolution AFC scheme

    positive not negative

    miu̇i =X

    j 6=i�ij(uj � ui)

    miu̇i =X

    j 6=i(kij + dij)(uj � ui) + �iui +

    X

    j 6=i↵ijfij

    not negative by construction

    controlled byflux limiter

    mi =X

    jmij

    need to check the positivity of mass matrix coefficients for each finite element by hand!

  • Finite element spaces

    ■ Parametric quadrilateral finite elements

    ■ Bilinear one-to-one mappingQ(T ) = {q = q̂ � �1T , q̂ 2 Q̂(T̂ )}

    T : T̂ := [�1, 1]⇥ [�1, 1] 7! T 2 Th

    Q̂1(T̂ ) = spanh1, x̂, ŷ, x̂ŷi

    Rannacher& Turek ‘92

    Qrot1,npar(T ) = spanh1, ⇠, ⌘, ⇠2 � ⌘2i

    rot

    1

    (T̂ ) = spanh1, x̂, ŷ, x̂2 � ŷ2i

    Q1(T )

    Qrot1

    (T )

  • Nonconforming shape functions

    ■ Midpoint based variant

    '̂1(x̂) =1

    4� 1

    2ŷ � �(x̂2 � ŷ2), '̂3(x̂) =

    1

    4+

    1

    2ŷ � �(x̂2 � ŷ2)

    '̂2(x̂) =1

    4+

    1

    2x̂+ �(x̂2 � ŷ2), '̂4(x̂) =

    1

    4� 1

    2x̂+ �(x̂2 � ŷ2)

    ■ Mean value based variant� =

    1

    4: '̂i(m̂j) = �ij � =

    3

    8: |�̂j |�1

    Z

    �̂j

    '̂i(x̂) d� = �ij

  • Matrix analysis - part 1

    ■ Consistent mass matrix on reference element

    ■ Positivity criterion

    ■ midpoint based variant has negative matrix coefficients■ mean value based variant has positive matrix coefficients■ lower bound is invariant to shape of quadrilateral element

    M̂ =

    0

    BBBB@

    1645�

    2 + 724 �1645�

    2 + 181645�

    2 � 124 �1624�

    2 + 18� 1624�

    2 + 181645�

    2 + 724 �1645�

    2 + 181645�

    2 � 1241645�

    2 � 124 �1624�

    2 + 181645�

    2 + 724 �1645�

    2 + 18� 1645�

    2 + 181645�

    2 � 124 �1624�

    2 + 181645�

    2 + 724

    1

    CCCCA

    0.3423 ⇡ 116

    p30 < |�| < 3

    16

    p10 ⇡ 0.5929

    (� = 14 )

    (� = 38 )

  • Matrix analysis - part 2

    0 10 20 30 40 50

    0

    5

    10

    15

    20

    25

    30

    35

    40

    45

    50

    nz = 369

    Q1 FE

  • Matrix analysis - part 2

    0 10 20 30 40 50

    0

    5

    10

    15

    20

    25

    30

    35

    40

    45

    50

    nz = 369

    Q1 FE

    0 10 20 30 40 50 60 70 80

    0

    10

    20

    30

    40

    50

    60

    70

    80

    nz = 530

    Q1rot FE

  • Matrix analysis - part 2

    0 10 20 30 40 50

    0

    5

    10

    15

    20

    25

    30

    35

    40

    45

    50

    nz = 369

    Q1 FE

    0 10 20 30 40 50 60 70 80

    0

    10

    20

    30

    40

    50

    60

    70

    80

    nz = 530

    Q1rot FE

    3456789

    101112

    0 10 20 30 40 50 60 70 80

    Number of non-zero entries per matrix row

    Number of matrix row

    Q1Q1rot

    Storage format:■ CRS, CCS

    Storage format:■ ELLPACK

  • Solid body rotation

    ■ Velocity field

    ■ Grid size

    ■ Stochastic grid disturbance

    ■ Time step in Crank-Nicolson

    ■ Initial = exact solution at

    v(x, y) = (0.5� y, x� 0.5)

    �t = 1.28 · h

    t = 2⇡k, k 2 N

    � 2 {0%, 1%, 5%}

    h = 1/2l, l = 5, 6, . . .

    u̇+r · (vu) = 0 in (0, 1)2

    u = 0 on �inflow

  • SBR: low-order solutions (0% mesh perturbation)

    Q1

    Qrot,MVal1,npar

    Qrot,MVal1

    Qrot,MPt1

    54%

    63% 64%

    64%

  • SBR: low-order solutions (5% mesh perturbation)

    Q1

    Qrot,MVal1,npar

    Qrot,MVal1

    Qrot,MPt1

    54%

    64%

    64%

    64%

  • SBR: NL-FCT solutions (0% mesh perturbation)

    Q1

    Qrot,MVal1,npar

    Qrot,MVal1

    Qrot,MPt1

  • SBR: NL-FCT solutions (5% mesh perturbation)

    Q1

    Qrot,MVal1,npar

    Qrot,MVal1

    Qrot,MPt1

  • SBR: L2-error (5% mesh perturbation)

    0,01

    0,1

    1

    5 6 7 8

    conforming FE

    Refinement level

    Low-order Lin-FCT NL-FCT NL-GP NL-TVD

    0,01

    0,1

    1

    5 6 7 8

    nonconforming FE

    Refinement level

  • Rotation of a Gaussian hill

    ~Q1 FE NL-FCTtime t=5/2π

    u̇+r · (vu� dru) = 0 in (�1, 1)2, v(x, y) = (�y, x), d = 0.001

  • RGH: L2-error (5% mesh perturbation)

    0,001

    0,01

    0,1

    1

    10

    5 6 7 8 9

    conforming FE

    Refinement level

    Low-order Lin-FCT NL-FCT NL-GP NL-TVD

    0,001

    0,01

    0,1

    1

    10

    5 6 7 8 9

    nonconforming FE

    Refinement level

  • RGH: dispersion-error (5% mesh perturbation)

    -0,5

    0,5

    1,5

    2,5

    3,5

    4,5

    5,5

    5 6 7 8 9

    conforming FE

    Refinement level

    Low-order Lin-FCT NL-FCT NL-GP NL-TVD

    -0,5

    0,5

    1,5

    2,5

    3,5

    4,5

    5,5

    5 6 7 8 9

    nonconforming FE

    Refinement level

  • conforming FE nonconforming FE

    good accuracy good

    small numerical diffusion small(er)

    smaller #DOFs, #edges larger

    irregular sparsity pattern regular

    Taxonomy of finite elements

  • Boundary conditions

    ■ Y. Basilevs, T. Hughes, Weak imposition of Dirichlet boundary conditionsin fluid mechanics, Computers & Fluids 32 (1) 2007, 12-26

    ■ consistent, adjoint-consistent■ consistent, adjoint-inconsistent

    ■ E. Burman, A penalty free non-symmetric Nitsche type method for the weak imposition of boundary conditions, eprint arXiv:1106.5612v2 (Nov 2011)

    Convection-diffusion equation hyperbolic limit

    r · (vu� dru) = f in ⌦u = uD on �D

    (dru) · n = g on �N

    r · (vu) = f in ⌦(vu) · n = h on �in

    d ! 0

    � = 1

    � = �1�b =

    Cd

    hb

    � = �1, � ⌘ 0

  • Weak imposition of boundary conditions

    Z

    ⌦�rwh · (vuh � druh)dx+

    Z

    �wh(vuh) · nds

    �Z

    �D

    wh(druh) · nds�Z

    �D

    (�drwh) · nuhds

    �Z

    �D\�in(vwh) · nuhds+

    NebX

    b=1

    Z

    �D\�b�bwhuhds

    =

    Z

    ⌦whfdx+

    Z

    �N

    whgds�Z

    �D

    �(drwh) · nuDds

    �Z

    �D\�in(vwh) · nuDds+

    NebX

    b=1

    Z

    �D\�b�bwhuDds

    � = ±1, �b =Cd

    hb

  • Weak imposition of boundary conditions

    Z

    ⌦�rwh · (vuh � druh)dx+

    Z

    �wh(vuh) · nds

    �Z

    �D\�in(vwh) · nuhds

    =

    Z

    ⌦whfdx

    �Z

    �D\�in(vwh) · nuDds

    Z

    �out

    whv · nuhds

  • � = 1, �b =4 · 0.001

    hbFEM-TVDQrot

    1

  • � = �1, �b =4 · 0.001

    hbFEM-TVDQrot

    1

  • � = �1, �b ⌘ 0FEM-TVDQrot1

  • 1D convection-diffusion equation

    � = 1

    �b =4 · 0.001

    h

    � = �1

    �b =4 · 0.001

    h

    � = �1�b ⌘ 0

    1ux

    � 0.001uxx

    = 0

    u(0) = 1, u(1) = 0

    FEM-TVDQ1

  • Summary

    ■ Do the algebraic design criteria apply to nonconforming elements? Yes, for the integral mean value based variant of nonconforming rotated bilinear finite elements.

    ■ Is the accuracy of solutions comparable to P1/Q1 approximations? Yes, accuracy and numerical dissipation are comparable.

    ■ How to implement essential boundary conditions? Apply consistent, adjoint-consistent Nitsche-type method.

    ■ Is there any benefit from using nonconforming elements? Regular sparsity structure beneficial for parallel HPC.

  • Outlook

    ■ Extend AFC schemes to other nonconforming finite elements■ NC-Quad (Z. Cai, J. Douglas, Jr., X. Ye)■ Composite Crouzeix-Raviart (F. Schieweck)

    ■ Generalize consistent, adjoint-consistent approach towards the implementation of Dirichlet boundary conditions to periodic ones

    ■ Improve (parallel) efficiency by exploring the regular sparsity structure■ use variant of ELLPACK matrix storage format■ reduce communication costs due to weaker coupling

  • Acknowledgment

    AFC schemes

    ■ D. Kuzmin (University of Erlangen-Nuremberg)Featflow2

    ■ M. Köster, P. Zajac (TU Dortmund)

    Source code freely available at:

    http://www.featflow.de/en/software/featflow2.html

    http://www.featflow.de/en/software/featflow2.htmlhttp://www.featflow.de/en/software/featflow2.html

  • Edge-based solvers for the compressible Euler equations on multicores and GPUs

  • Example: edge-based flux-assembly with Q1 FE

    0

    10

    20

    30

    40

    50

    0K 1K 10K 100K 1.000K

    Band

    wid

    th (

    GB/

    s)

    #edges per color group

    1 OMP 2 OMP 4 OMP8 OMP 12 OMP Xeon X5680(2x6) CUDA C2070

    best GPU performancefor large problem sizes

    best CPU performanceif data fits into cache

  • Example: edge-based flux assembly with Q1 FE

    0

    25

    50

    75

    100

    125

    150

    CUDA OMP 6

    Acc

    umul

    ated

    tim

    e (m

    s)

    w/ transf.„55 ms“

    132 ms

    23 ms

    5,7x

    0

    10

    20

    30

    40

    50

    0K

    125K

    250K

    375K

    500K

    Band

    wid

    th (

    GB/

    s)

    Color groups 1-14

    OMP 6 on Core i7 EPCUDA on C2070#edges per color group

    6-7 GB/s

    Fi =N

    colorsX

    k=1

    X

    ij2CGk

    �ij(Uj � Ui)

  • 0

    25

    50

    75

    100

    125

    150

    175

    200

    CUDA OMP 6

    Acc

    umul

    ated

    tim

    e (m

    s)

    w/ transf.„95 ms“

    191 ms

    32 ms

    5,9x

    Example: edge-based flux assembly with Q1rot FE

    0

    10

    20

    30

    40

    50

    60

    0K

    375K

    750K

    1.125K

    1.500K

    Band

    wid

    th (

    GB/

    s)

    Color groups 1-9

    OMP 6 on Core i7 EPCUDA on C2070#edges per color group

    6-7 GB/s