The finite element method

This is an edited section from my thesis “Solving Poisson’s equation with Dataflow computing”. It might be a bit more formal than most of my other posts.

Introduction

The finite element method is a method to solve boundary value problems. Boundary value problems pose the problem of finding a function u:ΩRu : \Omega \rightarrow \mathbb{R} that satisfies a given differential equation on Ω\Omega and given boundary conditions on the boundary Ω\partial \Omega of the domain. We will use Poisson’s problem on a two-dimensional domain with Dirichlet boundary conditions:

For a given fL2(Ω)f \in L^2(\Omega), uΩ:ΩRu_{\partial \Omega} : \partial \Omega \rightarrow \mathbb{R}, and Ω\Omega, find a function u:ΩRu : \Omega \rightarrow \mathbb{R} such that

The Δ\Delta represents a differential operator called the Laplace operator. For twice-differentiable functions g:ARg : A \rightarrow \mathbb{R} on a domain ARdA \subset \mathbb{R}^d, the Laplace operator is defined as
Δg=i=1d2gxi2\Delta g = \sum_{i = 1}^d \frac{\partial^2 g}{\partial x_i^2}

The finite element method is a numerical method. This means that the solution uu is approximated by a function uhuu^h \approx u. The function uhu^h is taken to be in some finite-dimensional approximation space VhV^h. The space VhV^h is spanned by a basis ψ0,ψ1,...,ψN1:ΩR\psi_0, \psi_1, ..., \psi_{N - 1} : \Omega \rightarrow \mathbb{R}. When the basis functions ψ0\psi_0, …, ψN1\psi_{N - 1} are fixed, uhu^h can be expressed by a vector u=(u0,u1,...uN1)RN\mathbf{u} = (u_0, u_1, ... u_{N - 1})^\top \in \mathbb{R}^N:
uh(ξ)=i=0N1uiψi(ξ)for ξΩu^h(\boldsymbol{\xi}) = \sum_{i = 0}^{N - 1} u_i \psi_i(\boldsymbol{\xi}) \quad \text{for $\boldsymbol{\xi} \in \Omega$}
The elements u0,u1,...,uN1Ru_0, u_1, ..., u_{N - 1} \in \mathbb{R} are known as degrees of freedom.

Conceptually, the finite element method can be separated into several stages:

  1. A method to convert the boundary value problem to a discrete system of symbolic equations. The most popular method is the Bubnov-Galerkin method, which is often simply called the Galerkin method.
  2. The discretization strategy, which describes how to obtain an actual system of equations for a given symbolic system of equations. This involves picking the basis functions, and evaluating or approximating the integrals in the symbolic system. Also, one or more methods of refinement are usually provided, which can be used to increase the number of basis functions and thus the dimension of VhV^h, so that uhVhu^h \in V^h can approximate uu better.
  3. The solver, which solves the system of equations. If the boundary value problem is linear, the system of equations will be linear too, and linear solvers like the conjugate gradient method, the generalized minimum-residual method, or the stabilized biconjugate gradient method are popular choices. Nonlinear systems are harder to solve, and require more advanced methods like the Newton-Rhapson method or the Picard method. Typical nonlinear solvers iteratively linearize the system, and solve the linearized system with a linear solver.

Galerkin method

The Galerkin method derives a system of symbolic equations for a given boundary value problem. The Galerkin method is based on the weak formulation of the boundary value problem. First, let us define the Sobolev space H1(Ω)H^1(\Omega) of functions f:ΩRf : \Omega \rightarrow \mathbb{R} for which the first partial derivatives are square-integrable. The weak formulation of a differential equation can be obtained by multiplying both sides by a test function vV={ vH1(Ω):vΩ=0 }v \in V = \{\ v \in H^1(\Omega) : v|_{\partial \Omega} = 0\ \}, integrating over Ω\Omega, and demanding equality regardless of the choice of vVv \in V. Applying this to αΔu+βu=f\alpha \Delta u + \beta u = f gives
vV:Ωα(Δu)v+βuv dΩ=Ωfv dΩ\forall v \in V: \int_\Omega \alpha(\Delta u)v + \beta uv\ \text{d}\Omega = \int_\Omega f v\ \text{d}\Omega

One can then apply integration by parts and use vΩ=0v|_{\partial \Omega} = 0 to make the left-hand side symmetric:
vV:Ωα(uv)+βuv dΩ=Ωfv dΩ\forall v \in V: \int_\Omega -\alpha (\nabla u \cdot \nabla v) + \beta u v\ \text{d}\Omega = \int_\Omega f v\ \text{d}\Omega

With this equation, the complete weak formulation becomes:

For a given f:ΩRf : \Omega \rightarrow \mathbb{R}, u0:ΩRu_0 : \partial \Omega \rightarrow \mathbb{R}, and Ω\Omega, find a function uH1(Ω)u \in H^1(\Omega) such that:
vV:Ωα(uv)+βuv dΩ=Ωfv dΩ\forall v \in V: \int_\Omega -\alpha (\nabla u \cdot \nabla v) + \beta u v\ \text{d}\Omega = \int_\Omega f v\ \text{d}\Omega
u=u0 on Ωu = u_0 \text{ on } \partial \Omega

The Galerkin method replaces VV by a finite-dimensional approximation VhV^h, spanned by basis functions {ψ0,ψ1,...,ψN1}\{ \psi_0, \psi_1, ..., \psi_{N - 1} \}, and seeks an approximate solution uhVhu^h \in V^h. Ignoring the boundary conditions for now, we get the system

i=0,1,...,N1:Ωα(uhψi)+βuhψi dΩ=Ωfψi dΩ\forall i = 0, 1, ..., N - 1: \int_\Omega -\alpha (\nabla u^h \cdot \nabla \psi_i) + \beta u^h \psi_i\ \text{d}\Omega = \int_\Omega f \psi_i\ \text{d}\Omega

So uhVhu^h \in V^h can be expressed as uh=j=0N1ujψju^h = \sum_{j = 0}^{N - 1} u_j \psi_j. Using this, we get
i=0,1,...,N1:Ωα((j=0N1ujψj)ψi)+β((j=0N1ujψj)ψi dΩ=Ωfψi dΩ\forall i = 0, 1, ..., N - 1: \int_\Omega -\alpha (\nabla (\sum_{j = 0}^{N - 1} u_j \psi_j) \cdot \nabla \psi_i) + \beta (\nabla (\sum_{j = 0}^{N - 1} u_j \psi_j) \psi_i\ \text{d}\Omega = \int_\Omega f \psi_i\ \text{d}\Omega

Bringing the summation outside the integral yields
i=0,1,...,N1:j=0N1(Ωα(ψiψj)+βψiψj dΩ)uj=Ωfψi dΩ\forall i = 0, 1, ..., N - 1: \sum_{j = 0}^{N - 1} \left( \int_\Omega -\alpha (\nabla \psi_i \cdot \nabla \psi_j) + \beta \psi_i \psi_j\ \text{d}\Omega \right) u_j = \int_\Omega f \psi_i\ \text{d}\Omega

Which can be written as a linear system Au=b\mathbf{A} \mathbf{u} = \mathbf{b} of size N×NN \times N, where the finite element matrix ARN×N\mathbf{A} \in \mathbb{R}^{N \times N} is defined as
Ai,j=Ωα(ψiψj)+βψiψj dΩ\mathbf{A}_{i, j} = \int_\Omega -\alpha (\nabla \psi_i \cdot \nabla \psi_j) + \beta \psi_i \psi_j\ \text{d}\Omega

and the right-hand side vector bRN\mathbf{b} \in \mathbb{R}^N is defined as
bi=Ωfψi dΩb_i = \int_\Omega f \psi_i\ \text{d}\Omega

Furthermore, we define the stiffness matrix S\mathbf{S} as
Si,j=Ωψiψj dΩ\mathbf{S}_{i, j} = \int_\Omega \nabla \psi_i \cdot \nabla \psi_j\ \text{d}\Omega

and the mass matrix M\mathbf{M} as
Mi,j=Ωψiψj dΩ\mathbf{M}_{i, j} = \int_\Omega \psi_i \psi_j\ \text{d}\Omega

We can now express the finite element matrix A\mathbf{A} in terms of the stiffness matrix S\mathbf{S} and mass matrix M\mathbf{M} as
A=αS+βM\mathbf{A} = -\alpha \mathbf{S} + \beta \mathbf{M}

The Dirichlet boundary conditions can be implemented by a Dirichlet lift. Each degree of freedom ui0u_{i_0}, ui1u_{i_1}, …, uim1u_{i_{m - 1}} which corresponds to a basis function which is nonzero on the boundary Ω\partial \Omega is prescribed in such a way that k=0m1uikψikuΩ\sum_{k = 0}^{m - 1} u_{i_k} \psi_{i_k} \approx u_{\partial \Omega} on Ω\partial \Omega. As such, the boundary condition is approximately satisfied. To pick u0,u1,...,um1u_0, u_1, ..., u_{m - 1} in such a way, one can use L2L^2 projection on the boundary, or make sure that uhu^h interpolates uΩu_{\partial \Omega} in a set of points x0\mathbf{x_0}, x1\mathbf{x_1}, … on the boundary.

Assembly

Suppose that the mass matrix M\mathbf{M} is to be assembled. One could separately evaluate the integrals Ωψiψj dΩ\int_\Omega \psi_i \psi_j\ \text{d}\Omega. However, the basis functions are usually defined on sections of the domain called elements. The domain can be partitioned into MM elements e0,e1,...,eM1e_0, e_1, ..., e_{M - 1} so that we have
Ωψiψj dΩ=k=0M1ekψiψj dΩ\int_\Omega \psi_i \psi_j\ \text{d}\Omega = \sum_{k = 0}^{M - 1} \int_{e_k} \psi_i \psi_j\ \text{d}\Omega
The usual approach is to loop over the elements and evaluate ekψiψj dΩ\int_{e_k} \psi_i \psi_j\ \text{d}\Omega for the basis functions ψi,ψj\psi_i, \psi_j which are nonzero on that element. Typically, the basis functions will have local support, and there will be a small number of nonzero basis functions on each element. For an element on which there are nn nonzero basis functions ψk0,ψk1,...,ψkn1\psi_{k_0}, \psi_{k_1}, ..., \psi_{k_{n - 1}}, an element matrix ERn×n\mathbf{E} \in \mathbb{R}^{n \times n} with entries Ei,j=ψkiψkj dΩ\mathbf{E}_{i, j} = \int \psi_{k_i} \psi_{k_j}\ \text{d}\Omega is assembled. After the element matrix is assembled, each entry Ei,j\mathbf{E}_{i, j} is added to Mki,kj\mathbf{M}_{k_i, k_j}.

Pseudocode for the assembly is as follows. Note that there are many details omitted (for example, how to evaluate the integral in the assembly of the element matrix).

let ψ[0], ψ[1], ..., ψ[N − 1] be the basis functions
let e[0], e[1], ..., e[M − 1] be the elements
let glob_mat be an N × N matrix with each element equal to zero
for k = 0 .. M − 1
	let n be the number of basis functions with support on e[k]
	let s[k, 0], s[k, 1], ..., s[k, n − 1] be the indices of the basis
		functions with support on e[k]
	let elem_mat be an n × n matrix with each element equal to zero
	
	// assemble element matrix elem_mat
	for i = 0 .. n − 1
		for j = 0 .. n − 1
			elem_mat[i, j] += integral of ψ[s[k, i]] ψ[s[k, j]] over e[k]
		end for
	end for
	
	// scatter element matrix elem_mat to global matrix glob_mat
	for i = 0 .. n − 1
		for j = 0 .. n − 1
			glob_mat[s[k, i], s[k, j]] += elem_mat[i, j]
		end for
	end for
end for