
Math 321 Lecture 5

An introduction to finite elements.

Consider the one-dimensional second order elliptic equation

$$-\frac{d^2\phi}{dx^2} + q(x)\,\phi(x) = f(x)$$

subject to the boundary conditions

$$\phi(0) = a, \qquad \frac{d\phi}{dx}\bigg|_{x=1} = b$$

This is a two point boundary value problem. The boundary condition at $x = 0$ is a condition on the function, and is called a Dirichlet condition, while the boundary condition at $x = 1$ is a condition on the derivative, and is called a Neumann condition.

Many methods of solving this equation involve writing φ(x) as some sort of expansion:

$$\phi(x) = \sum_{j=1}^{m} c_j\,\phi_j(x)$$

One common approximation is a Fourier series

$$\phi(x) = \tfrac{1}{2} c_0 + c_1 \cos(2\pi x) + c_2 \cos(4\pi x) + \dots + d_1 \sin(2\pi x) + d_2 \sin(4\pi x) + \dots$$

where the functions $\phi_j(x)$ are the trigonometric polynomials.

In these lectures we want to represent φ(x) by a straight line approximation on intervals.


It might not be obvious in this representation what the functions $\phi_j$ are. They are the so-called hat functions.

$\phi_2(x)$ is 0 except on the intervals $(x_1, x_2)$ and $(x_2, x_3)$.

$$\phi_2(x_1) = \phi_2(x_3) = 0, \qquad \phi_2(x_2) = 1$$

If you look at the interval $(x_2, x_3)$, the only functions which are non-zero on that interval are $\phi_2(x)$ and $\phi_3(x)$. If

$$\phi(x_2) = c_2, \qquad \phi(x_3) = c_3$$

then the function $c_2\,\phi_2(x) + c_3\,\phi_3(x)$ is an approximation to $\phi(x)$ on the interval $(x_2, x_3)$ which is linear and matches the end point values.
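The hat functions can be made concrete with a short sketch. The names `hat` and `interpolant`, the node list and the sample nodal values are assumptions chosen for illustration; they do not appear in the lecture.

```python
def hat(j, x, nodes):
    """Value at x of the hat function centred on nodes[j]."""
    xj = nodes[j]
    # Rising part on (x_{j-1}, x_j); absent for the first node.
    if j > 0 and nodes[j - 1] <= x <= xj:
        return (x - nodes[j - 1]) / (xj - nodes[j - 1])
    # Falling part on (x_j, x_{j+1}); absent for the last node.
    if j < len(nodes) - 1 and xj <= x <= nodes[j + 1]:
        return (nodes[j + 1] - x) / (nodes[j + 1] - xj)
    return 0.0


def interpolant(coeffs, x, nodes):
    """Piecewise linear function sum_j coeffs[j] * hat_j(x)."""
    return sum(c * hat(j, x, nodes) for j, c in enumerate(coeffs))


nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
coeffs = [x * x for x in nodes]   # sample nodal values, here those of x^2
print(hat(2, 0.5, nodes))         # the hat function at its own node
print(interpolant(coeffs, 0.625, nodes))
```

Because each hat function is 1 at its own node and 0 at every other node, the coefficient $c_j$ is simply the value of the approximation at $x_j$.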

There are two basic approaches to the solution of the differential equation.

The first we will consider is the Rayleigh-Ritz approach, where we assert that solving the differential system

$$-\frac{d^2\phi}{dx^2} + q(x)\,\phi(x) = f(x)$$

subject to the boundary conditions

$$\phi(0) = a, \qquad \frac{d\phi}{dx}\bigg|_{x=1} = b$$

is equivalent to minimising the functional

$$I(\phi) = \int_0^1 dx \left[ \left(\frac{d\phi}{dx}\right)^2 + q(x)\,\phi^2(x) - 2 f(x)\,\phi(x) \right] - 2\,b\,\phi(1)$$

for functions which satisfy $\phi(0) = a$.

If you consider

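$$\begin{aligned}
I(\phi + \varepsilon v) = {} & \int_0^1 dx\, \bigl(\phi'(x) + \varepsilon v'(x)\bigr)^2 + \int_0^1 dx\, q(x)\,\bigl(\phi(x) + \varepsilon v(x)\bigr)^2 \\
& - 2 \int_0^1 dx\, f(x)\,\bigl(\phi(x) + \varepsilon v(x)\bigr) - 2\,b\,\bigl(\phi(1) + \varepsilon v(1)\bigr)
\end{aligned}$$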

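this functional can be minimised by differentiating with respect to $\varepsilon$ and setting the result to 0 at $\varepsilon = 0$:

$$\begin{aligned}
\frac{dI}{d\varepsilon} = {} & 2 \int_0^1 dx\, v'(x)\,\bigl(\phi'(x) + \varepsilon v'(x)\bigr) + 2 \int_0^1 dx\, q(x)\,v(x)\,\bigl(\phi(x) + \varepsilon v(x)\bigr) \\
& - 2 \int_0^1 dx\, f(x)\,v(x) - 2\,b\,v(1)
\end{aligned}$$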

Setting the result to 0 at $\varepsilon = 0$ yields

$$\int_0^1 dx \left[ v'(x)\,\phi'(x) + v(x)\,q(x)\,\phi(x) - v(x)\,f(x) \right] - b\,v(1) = 0$$

Integrating the first term by parts gives us

$$\int_0^1 dx\, v'(x)\,\phi'(x) = \Bigl[ v(x)\,\phi'(x) \Bigr]_0^1 - \int_0^1 dx\, v(x)\,\phi''(x)$$

so the functional $I(\phi)$ is minimised by $\phi(x)$ if

$$\int_0^1 dx\, v(x) \left[ -\phi''(x) + q(x)\,\phi(x) - f(x) \right] = 0$$

and

$$v(1) \left[ \phi'(1) - b \right] = 0$$

which is to say that both the differential equation and the Neumann boundary condition at $x = 1$ are satisfied. The Dirichlet boundary condition is treated differently, in that we impose that condition on any of the functions that we choose to minimise the functional $I(\phi)$, i.e.

$$\phi(0) = a, \quad \text{and} \quad v(0) = 0$$

We have replaced the original problem of solving a differential equation by a problem of minimising an integral. In an important sense this is a step forward because only $\phi'(x)$ occurs in the integral, whereas $\phi''(x)$ occurs in the differential equation, so the space of trial functions from which we try to minimise the integral need not be as smooth as we would need when considering the differential equation. In particular, we can consider piecewise linear trial functions composed from hat functions in minimising the integral, even though such trial functions have discontinuities in the first derivative, which means that the second derivative does not exist, at the points $x_1, x_2, \dots$

If we say

$$\phi(x) = a\,\phi_0(x) + c_1\,\phi_1(x) + c_2\,\phi_2(x) + c_3\,\phi_3(x) + c_4\,\phi_4(x)$$

the functional

$$I(\phi) = \int_0^1 dx \left[ \left(\frac{d\phi}{dx}\right)^2 + q(x)\,\phi^2(x) - 2 f(x)\,\phi(x) \right] - 2\,b\,\phi(1)$$

becomes just a function of the unknown coefficients $c_1, \dots, c_4$:

$$I(c) = c^T M c - 2 F^T c + Q(a)$$

where the matrix $M$, called the mass matrix, is given by

$$M_{ij} = \int_0^1 dx \left[ \phi_i'(x)\,\phi_j'(x) + q(x)\,\phi_i(x)\,\phi_j(x) \right]$$

and

$$F_i = \int_0^1 dx\, \phi_i(x)\,f(x)$$


If we look closely at the elements of $M$ and $F$ for the basis functions we have chosen, viz. the hat functions, we see that the element $M_{ij}$ is non-zero only if $j = i-1$, or $j = i$, or $j = i+1$. The integrals can be evaluated one interval $(x_k, x_{k+1})$ at a time, and the contributions to $M$ and $F$ assembled. (This is called the assembly phase of a finite element calculation.) Looking at the first two integrals,

$$\begin{aligned}
& \int_0^{x_1} dx \left[ \bigl(a\,\phi_0'(x) + c_1\,\phi_1'(x)\bigr)^2 + q(x)\,\bigl(a\,\phi_0(x) + c_1\,\phi_1(x)\bigr)^2 - 2 f(x)\,\bigl(a\,\phi_0(x) + c_1\,\phi_1(x)\bigr) \right] \\
& + \int_{x_1}^{x_2} dx \left[ \bigl(c_1\,\phi_1'(x) + c_2\,\phi_2'(x)\bigr)^2 + q(x)\,\bigl(c_1\,\phi_1(x) + c_2\,\phi_2(x)\bigr)^2 - 2 f(x)\,\bigl(c_1\,\phi_1(x) + c_2\,\phi_2(x)\bigr) \right]
\end{aligned}$$

we can see that the first contributes to $Q(a)$, $M_{11}$ and $F_1$, and the second contributes in turn to $M_{11}$, $M_{12}$, $M_{22}$, $F_1$ and $F_2$. Each of these intervals is called an element, which is where the finite element method gets its name, although of course it is more commonly applied in two and three dimensions than in the one dimensional introduction given here.

From the properties noted we can see that the matrix $M$ is tridiagonal. It is also clear from its definition that it is symmetric, because the indices $i$ and $j$ are interchangeable in the definition. Finally we can note that if the relation $q(x) \ge 0$ holds, as it does in many practical applications, the matrix $M$ is symmetric, positive definite, and tridiagonal. We can also see that to obtain the vector $c$ which minimises

$$I(c) = c^T M c - 2 F^T c + Q(a)$$

we simply differentiate with respect to the $c_i$ and set the derivatives to zero to obtain

$$M c = F$$

where the matrix $M$ has exactly the structure we discussed in connection with canonical splines, and the system can be solved by Cholesky decomposition, specialised to a tridiagonal matrix.
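A minimal sketch of Cholesky decomposition specialised to a symmetric, positive definite, tridiagonal matrix follows. The storage convention (one list for the diagonal, one for the off-diagonal) and the name `solve_spd_tridiagonal` are assumptions made for this illustration.

```python
import math


def solve_spd_tridiagonal(diag, off, rhs):
    """Solve M c = rhs, where M has diagonal diag and off-diagonal off.

    M is factorised as L L^T with L lower bidiagonal, followed by a
    forward and a backward substitution. M is assumed positive definite.
    """
    n = len(diag)
    l_diag = [0.0] * n         # diagonal entries of L
    l_sub = [0.0] * (n - 1)    # subdiagonal entries of L
    l_diag[0] = math.sqrt(diag[0])
    for i in range(1, n):
        l_sub[i - 1] = off[i - 1] / l_diag[i - 1]
        l_diag[i] = math.sqrt(diag[i] - l_sub[i - 1] ** 2)

    # Forward substitution: L y = rhs.
    y = [0.0] * n
    y[0] = rhs[0] / l_diag[0]
    for i in range(1, n):
        y[i] = (rhs[i] - l_sub[i - 1] * y[i - 1]) / l_diag[i]

    # Backward substitution: L^T c = y.
    c = [0.0] * n
    c[n - 1] = y[n - 1] / l_diag[n - 1]
    for i in range(n - 2, -1, -1):
        c[i] = (y[i] - l_sub[i] * c[i + 1]) / l_diag[i]
    return c


# Small check: the 3 x 3 matrix tridiag(-1, 2, -1) with right-hand side (1, 0, 1).
print(solve_spd_tridiagonal([2.0, 2.0, 2.0], [-1.0, -1.0], [1.0, 0.0, 1.0]))
```

The work and storage both grow linearly with the number of unknowns, which is the point of exploiting the tridiagonal structure.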

The solution of a problem by the finite element method therefore consists of two distinct phases:

1. The assembly phase, consisting of integration on each element to compute the contributions to the relevant $M_{ij}$ and $F_i$, and

2. solution of the matrix equation $M c = F$.

The integration on each element is essentially the same.

On the interval $(x_{i-1}, x_i)$ of width $h_i$,

$$\phi_{i-1}(x) = 1 - \frac{x - x_{i-1}}{h_i} = \frac{x_i - x}{h_i}$$

which is 1 at $x_{i-1}$ and 0 at $x_i$, and

$$\phi_i(x) = \frac{x - x_{i-1}}{h_i}$$

which is 0 at $x_{i-1}$ and 1 at $x_i$.


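The contribution to $M_{i-1,i-1}$ from this interval is

$$\int_{x_{i-1}}^{x_i} dx \left( \phi_{i-1}'(x)\,\phi_{i-1}'(x) + q(x)\,\phi_{i-1}(x)\,\phi_{i-1}(x) \right),$$

to $M_{i-1,i}$ is

$$\int_{x_{i-1}}^{x_i} dx \left( \phi_{i-1}'(x)\,\phi_i'(x) + q(x)\,\phi_{i-1}(x)\,\phi_i(x) \right),$$

to $M_{i,i}$ is

$$\int_{x_{i-1}}^{x_i} dx \left( \phi_i'(x)\,\phi_i'(x) + q(x)\,\phi_i(x)\,\phi_i(x) \right),$$

to $F_{i-1}$ is

$$\int_{x_{i-1}}^{x_i} dx\, \phi_{i-1}(x)\,f(x),$$

and to $F_i$ is

$$\int_{x_{i-1}}^{x_i} dx\, \phi_i(x)\,f(x).$$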

You already have a numerical integration technique, Gauss quadrature, which can be used to do each of the integrals:

$$\int_a^b dx\, f(x) \approx \sum_{k=1}^{N} w_k\, f(x_k)$$

If $q(x)$ happens to be a polynomial function, you can choose the number of integration points $N$ in each interval so that the integrals are exact. It is not likely that $q(x)$ will be a polynomial in many applications, but $N = 3$ would probably be sufficient for each interval, as if we want improved accuracy we are more likely to increase the number of intervals, making each smaller.
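The element integrals can be sketched with a three-point Gauss-Legendre rule as follows. The function name `element_contributions` and the choice of passing $q$ and $f$ as Python callables are assumptions for illustration; the quadrature points $\pm\sqrt{3/5}, 0$ and weights $5/9, 8/9, 5/9$ are the standard three-point rule on $[-1, 1]$.

```python
import math

# Three-point Gauss-Legendre rule on the reference interval [-1, 1].
GAUSS_POINTS = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
GAUSS_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def element_contributions(x_left, x_right, q, f):
    """Return (m_ll, m_lr, m_rr, f_l, f_r) for the element (x_left, x_right).

    The subscripts l and r refer to the hat functions attached to the
    left and right end points of the element.
    """
    h = x_right - x_left
    d_left, d_right = -1.0 / h, 1.0 / h     # constant slopes of the two hats
    m_ll = m_lr = m_rr = f_l = f_r = 0.0
    for t, w in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
        x = 0.5 * (x_left + x_right) + 0.5 * h * t   # map t into the element
        wx = 0.5 * h * w                             # weight scaled to the element
        p_left = (x_right - x) / h
        p_right = (x - x_left) / h
        m_ll += wx * (d_left * d_left + q(x) * p_left * p_left)
        m_lr += wx * (d_left * d_right + q(x) * p_left * p_right)
        m_rr += wx * (d_right * d_right + q(x) * p_right * p_right)
        f_l += wx * f(x) * p_left
        f_r += wx * f(x) * p_right
    return m_ll, m_lr, m_rr, f_l, f_r


# With q = f = 1 on an element of width h the exact values are
# m_ll = m_rr = 1/h + h/3, m_lr = -1/h + h/6 and f_l = f_r = h/2.
print(element_contributions(0.0, 0.5, lambda x: 1.0, lambda x: 1.0))
```

For constant $q$ the integrands are at most quadratic, so the three-point rule, which is exact for polynomials up to degree 5, reproduces these values to rounding error.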

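We need to look at the first interval in some detail, because of the boundary condition $\phi(0) = a$. The integral

$$\int_{x_0}^{x_1} dx \left[ \bigl(a\,\phi_0'(x) + c_1\,\phi_1'(x)\bigr)^2 + q(x)\,\bigl(a\,\phi_0(x) + c_1\,\phi_1(x)\bigr)^2 - 2 f(x)\,\bigl(a\,\phi_0(x) + c_1\,\phi_1(x)\bigr) \right]$$

contributes to $Q(a)$:

$$a^2 \int_{x_0}^{x_1} dx \left[ \bigl(\phi_0'(x)\bigr)^2 + q(x)\,\bigl(\phi_0(x)\bigr)^2 \right] - 2 a \int_{x_0}^{x_1} dx\, f(x)\,\phi_0(x),$$

to $M_{11}$:

$$\int_{x_0}^{x_1} dx \left[ \bigl(\phi_1'(x)\bigr)^2 + q(x)\,\bigl(\phi_1(x)\bigr)^2 \right],$$

and to $F_1$:

$$\int_{x_0}^{x_1} dx \left[ -a \bigl( \phi_0'(x)\,\phi_1'(x) + q(x)\,\phi_0(x)\,\phi_1(x) \bigr) + f(x)\,\phi_1(x) \right].$$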

The organisational aspects of the calculation are as follows:

1. Evaluate the integrals on $(x_0, x_1)$ and add the contributions to $M_{11}$ and $F_1$.

2. Evaluate the integrals on $(x_1, x_2)$, and add contributions to $M_{11}$, $M_{12}$, $M_{22}$, $F_1$ and $F_2$.

3. Evaluate the integrals on $(x_2, x_3)$, and add contributions to $M_{22}$, $M_{23}$, $M_{33}$, $F_2$ and $F_3$.

4. Continue through the intervals . . .

5. Evaluate the integrals on $(x_{n-1}, x_n)$, and add contributions to $M_{n-1,n-1}$, $M_{n-1,n}$, $M_{n,n}$, $F_{n-1}$ and $F_n$.

6. Finally, add the contribution $F_n = F_n + b$ from the boundary condition $\phi'(x_n) = b$.
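The six steps can be sketched as one short program. This is an illustration under stated assumptions, not the lecturer's own code: the mesh is taken to be uniform with $n$ elements, the name `solve_bvp` is invented here, three-point Gauss quadrature is used on every element, and the tridiagonal system is solved by plain elimination without pivoting (adequate when $M$ is positive definite).

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1] as (point, weight) pairs.
GAUSS = ((-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0))


def solve_bvp(q, f, a, b, n):
    """Nodal values phi(x_0), ..., phi(x_n) for -phi'' + q phi = f,
    phi(0) = a, phi'(1) = b, using linear elements on n equal intervals."""
    h = 1.0 / n
    diag = [0.0] * n         # M_11 ... M_nn
    off = [0.0] * (n - 1)    # M_12 ... M_{n-1,n}
    rhs = [0.0] * n          # F_1 ... F_n

    # Assembly phase: element k is the interval (x_k, x_{k+1}).
    for k in range(n):
        x_left, x_right = k * h, (k + 1) * h
        for t, w in GAUSS:
            x = 0.5 * (x_left + x_right) + 0.5 * h * t
            wx = 0.5 * h * w
            p_left, p_right = (x_right - x) / h, (x - x_left) / h
            m_ll = wx * (1.0 / h**2 + q(x) * p_left * p_left)
            m_lr = wx * (-1.0 / h**2 + q(x) * p_left * p_right)
            m_rr = wx * (1.0 / h**2 + q(x) * p_right * p_right)
            if k == 0:
                # The left node carries the known value a, so its
                # coupling to c_1 moves to the right-hand side F_1.
                rhs[0] -= a * m_lr
            else:
                diag[k - 1] += m_ll
                off[k - 1] += m_lr
                rhs[k - 1] += wx * f(x) * p_left
            diag[k] += m_rr
            rhs[k] += wx * f(x) * p_right
    rhs[n - 1] += b          # Neumann condition phi'(1) = b

    # Solution phase: tridiagonal elimination, then back substitution.
    for i in range(1, n):
        factor = off[i - 1] / diag[i - 1]
        diag[i] -= factor * off[i - 1]
        rhs[i] -= factor * rhs[i - 1]
    c = [0.0] * n
    c[n - 1] = rhs[n - 1] / diag[n - 1]
    for i in range(n - 2, -1, -1):
        c[i] = (rhs[i] - off[i] * c[i + 1]) / diag[i]
    return [a] + c


# Test case with a known solution: q = 1, f = 0, a = 1, b = 0 has
# phi(x) = cosh(1 - x) / cosh(1), so phi(1) = 1 / cosh(1).
phi = solve_bvp(lambda x: 1.0, lambda x: 0.0, a=1.0, b=0.0, n=20)
print(phi[-1], 1.0 / math.cosh(1.0))
```

The constant $Q(a)$ is never computed, since it does not affect the minimising coefficients.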

The procedure described above amounts to a toy finite element program, which illustrates some of the principles of production finite element systems. The production systems have the following features:

1. Higher order polynomials. These would certainly be needed if we were dealing with higher than second order differential equations. A well known example in one dimension is the Hermite cubic, which is similar to the cubic spline we have discussed, but not quite as smooth.

The function is a different cubic on each interval, and both the function and its first derivative are continuous at the node $x_i$, but not the second derivative. It can be written in terms of two basis functions per node $x_i$. The basis function $S_1(x)$ corresponding to the function value at $x_1$ satisfies the following equations:

$$S_1'(x_0) = S_1'(x_1) = S_1'(x_2) = 0$$

$$S_1(x_0) = S_1(x_2) = 0, \qquad S_1(x_1) = 1$$

and the basis function $S_{1d}(x)$ corresponding to the derivative at $x_1$ satisfies

$$S_{1d}(x_0) = S_{1d}(x_1) = S_{1d}(x_2) = 0$$

$$S_{1d}'(x_0) = S_{1d}'(x_2) = 0, \qquad S_{1d}'(x_1) = 1$$

and on the interval $(x_i, x_{i+1})$, the approximation to $\phi(x)$ can be written

$$S(x) = \phi_i\,S_i(x) + \phi_i'\,S_{id}(x) + \phi_{i+1}\,S_{i+1}(x) + \phi_{i+1}'\,S_{i+1,d}(x)$$

The advantage of using higher order elements is a faster rate of convergence.

$$| \phi(x) - \phi_{\text{approx}}(h) | \sim h \quad \text{for linear elements}$$

$$| \phi(x) - \phi_{\text{approx}}(h) | \sim h^3 \quad \text{for Hermite cubic elements}$$

The disadvantages, if you have to write the program, are higher degree polynomials in the integrals, which might require higher order Gaussian quadrature, and a more complicated matrix structure, although still sparse and clustered around the diagonal.
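The Hermite cubic representation on a single interval can be sketched as below. The explicit cubic formulas are the standard Hermite basis polynomials in the local variable $t = (x - x_i)/h$; the function name `hermite_cubic` and its argument order are assumptions for illustration.

```python
def hermite_cubic(x, x_left, x_right, phi_left, dphi_left, phi_right, dphi_right):
    """Evaluate the Hermite cubic on (x_left, x_right) from the end point
    values phi_left, phi_right and end point derivatives dphi_left, dphi_right."""
    h = x_right - x_left
    t = (x - x_left) / h
    s_left = 1.0 - 3.0 * t**2 + 2.0 * t**3       # value basis at the left node
    s_left_d = h * (t - 2.0 * t**2 + t**3)       # derivative basis at the left node
    s_right = 3.0 * t**2 - 2.0 * t**3            # value basis at the right node
    s_right_d = h * (t**3 - t**2)                # derivative basis at the right node
    return (phi_left * s_left + dphi_left * s_left_d
            + phi_right * s_right + dphi_right * s_right_d)


# Any cubic is reproduced exactly; here phi(x) = x^3 on the interval (1, 3).
print(hermite_cubic(2.0, 1.0, 3.0, 1.0, 3.0, 27.0, 27.0))
```

The factor $h$ in the two derivative basis functions comes from the chain rule, since differentiating with respect to $x$ rather than $t$ divides by $h$.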

2. Higher dimensions. Many production problems are two or three dimensional. In the case of two dimensional problems with straight line boundaries, the area can be subdivided into polygons, often triangles, with basis functions which are non-zero on neighbouring polygons and zero elsewhere.

A similar problem in two dimensions to the one dimensional one we have discussed above would be

$$-\nabla^2 \phi + q\,\phi = f,$$

or

$$-\left( \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} \right) + q(x, y)\,\phi(x, y) = f(x, y)$$

with either $\phi(x, y)$ specified on the boundary, a Dirichlet condition, or $\partial\phi/\partial n$ (the normal derivative) specified on the boundary, a Neumann condition.

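The corresponding functional is

$$I(\phi) = \int dy \int dx \left[ \nabla\phi \cdot \nabla\phi + q\,\phi^2 - 2 f\,\phi \right] = \int dy \int dx \left[ \left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2 + q(x, y)\,\phi^2(x, y) - 2 f(x, y)\,\phi(x, y) \right]$$

The whole area is divided into rectangles or triangles.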

On any triangle, the simplest (linear) approximation to $\phi(x, y)$ is

$$\phi(x, y) = a + b\,x + c\,y$$

where the constants $a$, $b$, $c$ can be related to the values $\phi_i$ at the vertices by

$$\begin{aligned}
\phi_1 &= a + b\,x_1 + c\,y_1 \\
\phi_2 &= a + b\,x_2 + c\,y_2 \\
\phi_3 &= a + b\,x_3 + c\,y_3
\end{aligned}$$

The approximation to $\phi(x, y)$ across the whole area is continuous along the seams between adjacent triangles.

Several things are now more difficult.

(a) Doing the integrals. Now you have to do two dimensional integrals over each triangle or rectangle. This will become even more demanding if you have to do volume integrals in a three dimensional problem.

(b) Keeping track of the non-zero elements in the matrix. $M_{ij}$ will be non-zero whenever vertex $i$ and vertex $j$ share a common triangle, and zero otherwise. Grid generation and ordering the vertices becomes an important part of the whole assembly process.

(c) Solving the system of linear equations. The matrix is still sparse, having many more zero elements than non-zero elements, but it is much harder to exploit the sparseness when the pattern is irregular. The ordering of the vertices determines the amount of infill during the matrix factorisation.

Curved boundaries require isoparametric elements, which essentially map polygons onto shapes with curved boundaries. Three dimensional problems require volumetric elements such as tetrahedra, or rectangular parallelepipeds, depending on the regularity of the shape, while some calculations are actually performed on the surfaces of three dimensional objects, such as aeroplanes, and require sophisticated mesh generation techniques.
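For the linear approximation on a single triangle, the three vertex equations for $a$, $b$, $c$ can be solved in closed form. The sketch below does this by eliminating $a$ and applying Cramer's rule to the remaining two equations; the name `triangle_coefficients` and the tuple layout of the vertices are assumptions for illustration.

```python
def triangle_coefficients(vertices, values):
    """Return (a, b, c) with phi(x, y) = a + b x + c y matching the three
    vertex values. vertices is ((x1, y1), (x2, y2), (x3, y3))."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    phi1, phi2, phi3 = values
    # Subtract the first equation from the other two to eliminate a.
    dx2, dy2, dphi2 = x2 - x1, y2 - y1, phi2 - phi1
    dx3, dy3, dphi3 = x3 - x1, y3 - y1, phi3 - phi1
    det = dx2 * dy3 - dx3 * dy2      # twice the signed area of the triangle
    if det == 0.0:
        raise ValueError("degenerate triangle: vertices are collinear")
    b = (dphi2 * dy3 - dphi3 * dy2) / det
    c = (dx2 * dphi3 - dx3 * dphi2) / det
    a = phi1 - b * x1 - c * y1
    return a, b, c


# Vertex values sampled from phi = 2 + 3x - y on the unit right triangle.
print(triangle_coefficients(((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)), (2.0, 5.0, 1.0)))
```

Along any edge the approximation depends only on the two vertex values at its ends, which is why it is continuous across the seams between adjacent triangles.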

3. Problems where the Ritz method does not work. The reason that we were able to pass easily from the differential equation to the minimisation of a functional was because the differential equation was self-adjoint, which means it can be written in the form

$$\frac{d}{dx}\bigl( p(x)\,\phi'(x) \bigr) + q(x)\,\phi(x) = f(x)$$

If we don't have that property, a technique which can be used on a differential equation

$$L\,\phi(x) = f(x)$$

where $L$ is a differential operator, is the Galerkin technique. We write as before

$$\phi(x) = \sum_{i=1}^{N} c_i\,\phi_i(x)$$

and then generate a matrix equation for the coefficients $c_i$, by multiplying by a set of test functions $v_j(x)$ and integrating to give

$$\Bigl( v_j(x),\; L \sum_{i=1}^{N} c_i\,\phi_i(x) \Bigr) = \bigl( v_j(x),\, f(x) \bigr) \quad \text{for } j = 1, \dots, N$$

where the scalar product is

$$\bigl( u(x),\, v(x) \bigr) = \int_0^1 dx\, u(x)\,v(x)$$

If the equation is self-adjoint, and the test functions $v_j(x)$ are the same as the basis functions $\phi_j(x)$, the Galerkin approach yields the same matrix equation to solve as does the Ritz method.
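As a worked check of this equivalence, take the one dimensional operator used earlier, $L = -d^2/dx^2 + q(x)$, with test functions $v_j = \phi_j$. The following is a sketch under the assumption that the second derivative is read formally and removed by integration by parts, as was done for the functional.

```latex
\begin{aligned}
(\phi_j, L\phi_i)
  &= \int_0^1 dx\, \phi_j(x)\,\bigl[-\phi_i''(x) + q(x)\,\phi_i(x)\bigr] \\
  &= -\Bigl[\phi_j(x)\,\phi_i'(x)\Bigr]_0^1
     + \int_0^1 dx\, \bigl[\phi_j'(x)\,\phi_i'(x) + q(x)\,\phi_j(x)\,\phi_i(x)\bigr] \\
  &= -\Bigl[\phi_j(x)\,\phi_i'(x)\Bigr]_0^1 + M_{ji}
\end{aligned}
```

The integral is exactly the matrix element $M_{ji}$ of the Ritz method. The boundary term vanishes at $x = 0$ because the test functions are zero there, and at $x = 1$ the derivative of the approximate solution is replaced by the prescribed value $b$, which contributes $b\,\phi_j(1)$ to the right-hand side. For hat functions this is non-zero only for $j = n$, matching the step $F_n = F_n + b$.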

4. Time dependent problems. A typical problem might be

$$\frac{\partial}{\partial t}\,\phi(x, t) = L\,\phi(x, t) + f(x, t)$$

where L is a spatial differential operator. These can be attacked by writing

$$\phi(x, t) = \sum_{j=1}^{N} c_j(t)\,\phi_j(x),$$

substituting into the differential equation, multiplying by a set of test functions $v_i(x)$, and integrating:

$$\Bigl( v_i(x),\; \sum_{j=1}^{N} c_j'(t)\,\phi_j(x) \Bigr) = \Bigl( v_i(x),\; L \sum_{j=1}^{N} c_j(t)\,\phi_j(x) \Bigr) + \bigl( v_i(x),\, f(x, t) \bigr), \quad i = 1, \dots, N$$

which we can write as a set of coupled first order ordinary differential equations

$$A\,c'(t) = B\,c(t) + F(t)$$

with

$$a_{ij} = \int dx\, v_i(x)\,\phi_j(x), \qquad b_{ij} = \int dx\, v_i(x)\,L\,\phi_j(x), \qquad F_i(t) = \int dx\, v_i(x)\,f(x, t)$$
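One way to advance the system $A\,c'(t) = B\,c(t) + F(t)$ in time is the backward Euler method, which replaces $c'(t)$ by a backward difference and so requires solving $(A - \Delta t\,B)\,c^{\text{new}} = A\,c^{\text{old}} + \Delta t\,F(t^{\text{new}})$ at every step. The lecture does not prescribe a time stepping scheme, so this choice, the function names, and the small dense solver used in place of a sparse one are all assumptions for illustration.

```python
def solve_dense(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(k + 1, n):
            factor = aug[r][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[r][j] -= factor * aug[k][j]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        tail = sum(aug[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (aug[k][n] - tail) / aug[k][k]
    return x


def backward_euler_step(A, B, F_next, c, dt):
    """One step of backward Euler for A c' = B c + F.

    F_next is the vector F evaluated at the new time level.
    """
    n = len(c)
    lhs = [[A[i][j] - dt * B[i][j] for j in range(n)] for i in range(n)]
    rhs = [sum(A[i][j] * c[j] for j in range(n)) + dt * F_next[i] for i in range(n)]
    return solve_dense(lhs, rhs)


# Two decoupled decaying modes: c_1' = -c_1 and c_2' = -2 c_2.
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[-1.0, 0.0], [0.0, -2.0]]
print(backward_euler_step(A, B, [0.0, 0.0], [1.0, 1.0], dt=0.1))
```

An implicit scheme is used in this sketch because systems arising from spatial discretisation are typically stiff; an explicit scheme would also be possible at the cost of a restriction on $\Delta t$.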

5. Eigenvalue problems. A typical eigenvalue problem might be to resolve the resonant frequencies of a bridge, such as the Millennium footbridge across the Thames in London or the Tacoma Narrows bridge in the USA, or buildings being constructed in earthquake zones. Essentially the time dependent behaviour of the structure is dependent on the eigenvalues of the spatial operator $L$, so we are looking for the solution of eigenvalue problems of the type

$$L\,\phi(x) = \lambda\,\phi(x)$$

where, as before, $L$ is a spatial differential operator. Again we can write

$$\phi(x) = \sum_{j=1}^{N} c_j\,\phi_j(x),$$

substitute into the eigenvalue equation, multiply by the test functions $v_i(x)$ and integrate:

$$\Bigl( v_i(x),\; L \sum_{j=1}^{N} c_j\,\phi_j(x) \Bigr) = \lambda\, \Bigl( v_i(x),\; \sum_{j=1}^{N} c_j\,\phi_j(x) \Bigr), \quad i = 1, \dots, N$$

which gives rise to the generalised eigenvalue problem

$$A\,c = \lambda\,B\,c$$

with

$$a_{ij} = \int dx\, v_i(x)\,L\,\phi_j(x), \qquad b_{ij} = \int dx\, v_i(x)\,\phi_j(x)$$
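A small instance of the generalised eigenvalue problem can be sketched as follows. The model problem is an assumption chosen for illustration: $L = -d^2/dx^2$ on $(0, 1)$ with $\phi(0) = \phi(1) = 0$, hat functions on a uniform mesh used as both basis and test functions, and integration by parts applied to $a_{ij}$. This gives $A = \frac{1}{h}\,\mathrm{tridiag}(-1, 2, -1)$ and $B = \frac{h}{6}\,\mathrm{tridiag}(1, 4, 1)$. The smallest eigenvalue is found by inverse iteration, which is one of several possible methods.

```python
import math


def smallest_eigenvalue(n, iterations=100):
    """Smallest lambda of A c = lambda B c for -phi'' = lambda phi with
    phi(0) = phi(1) = 0, using n equal linear elements."""
    h = 1.0 / n
    m = n - 1                              # number of interior nodes
    a_diag, a_off = 2.0 / h, -1.0 / h      # entries of A
    b_diag, b_off = 4.0 * h / 6.0, h / 6.0 # entries of B

    def tri_mult(d, o, v):
        """Product of the constant tridiagonal matrix (o, d, o) with v."""
        return [d * v[i]
                + (o * v[i - 1] if i > 0 else 0.0)
                + (o * v[i + 1] if i < m - 1 else 0.0)
                for i in range(m)]

    def tri_solve(d, o, r):
        """Solve the constant tridiagonal system (o, d, o) x = r."""
        dd = [d] * m
        r = r[:]
        for i in range(1, m):
            factor = o / dd[i - 1]
            dd[i] -= factor * o
            r[i] -= factor * r[i - 1]
        x = [0.0] * m
        x[m - 1] = r[m - 1] / dd[m - 1]
        for i in range(m - 2, -1, -1):
            x[i] = (r[i] - o * x[i + 1]) / dd[i]
        return x

    c = [1.0] * m
    lam = 0.0
    for _ in range(iterations):
        c = tri_solve(a_diag, a_off, tri_mult(b_diag, b_off, c))  # c <- A^{-1} B c
        norm = math.sqrt(sum(v * v for v in c))
        c = [v / norm for v in c]
        Ac = tri_mult(a_diag, a_off, c)
        Bc = tri_mult(b_diag, b_off, c)
        # Rayleigh quotient estimate of the eigenvalue.
        lam = sum(x * y for x, y in zip(c, Ac)) / sum(x * y for x, y in zip(c, Bc))
    return lam


# The exact smallest eigenvalue of the continuous problem is pi^2.
print(smallest_eigenvalue(20), math.pi ** 2)
```

The computed value lies slightly above $\pi^2$, as expected for a Ritz-type approximation, and approaches it as the mesh is refined.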

6. Sheer size of problems. Test problems with 15,000 unknowns can be routinely solved on workstations. Production problems with 300,000 unknowns are quite likely to arise in engineering practice.

References

UoW library has many reference books on finite elements. My personal favourite is

Strang, Gilbert, and Fix, George J. 1973. An Analysis of the Finite Element Method. Prentice-Hall, Englewood Cliffs.
