The proofs here are based on the lecture notes from a course in approximation theory [1]. I think most of the results and proofs are standard, so similar material can be found in most texts on this subject.
Suppose μ(x) is a nonnegative weight function and we want to compute the integral ∫f(x)μ(x)dx
for many different functions f.
Depending on what f and μ are, we might be able to evaluate the integral analytically (i.e. by integrating and obtaining an exact answer). However, in computational contexts it is often both more efficient and simpler to use a quadrature rule.
Definition 1: A quadrature rule is a collection of weights w1,w2,...,wn and positions x1,x2,...,xn such that $\sum_{k=1}^{n} w_k f(x_k) \approx \int f(x)\mu(x)\,dx$
Since the approximation given by the quadrature rule only depends on the values of f at x1,x2,...,xn, it will be better for smoother functions: if f does not fluctuate too wildly, its behavior is captured well by its values at a few positions.
In practice, we often require the quadrature rule to be exact for some set S of functions. Often, we take S to be the set of polynomials up to some maximum degree. We call these the exactness conditions. We can write the exactness conditions as a finite number of equations by taking a basis f1,f2,... of S. Since both the integral and the quadrature sum are linear, exactness on the basis implies exactness for any linear combination of f1,f2,..., which means that the quadrature rule is exact for any f∈S.
In the following, we will denote the space of polynomials with degree up to n as Pn.
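As a concrete illustration of definition 1 (my own example, not from the notes), the snippet below uses NumPy's built-in Gauss–Legendre rule, which corresponds to the weight μ(x) = 1 on [−1, 1]; the choice of f and the number of points are arbitrary.

```python
import numpy as np

# A quadrature rule: positions x_k and weights w_k such that
# sum_k w_k f(x_k) ≈ ∫ f(x) dx over [-1, 1]  (weight function μ(x) = 1).
nodes, weights = np.polynomial.legendre.leggauss(5)

f = np.exp
approx = np.sum(weights * f(nodes))
exact = np.exp(1.0) - np.exp(-1.0)   # exact value of ∫ e^x dx over [-1, 1]

print(approx, exact)                 # the two agree to many digits
```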
Quadrature rules, the easy way
It is not too complicated to derive quadrature rules that are exact for any polynomial p∈Pd for a given d. Given a function f and d+1 distinct points x0,x1,...,xd, we can use Lagrange interpolation to construct a polynomial fd of degree d that interpolates f at x0,x1,...,xd. That is, fd(xk)=f(xk) for k=0,1,...,d
The polynomial fd is defined as $f_d(x) = \sum_{k=0}^{d} f(x_k) l_k(x)$, where the lk are the Lagrange polynomials defined by $l_k(x) = \frac{(x - x_0)(x - x_1)\cdots(x - x_{k-1})(x - x_{k+1})\cdots(x - x_d)}{(x_k - x_0)(x_k - x_1)\cdots(x_k - x_{k-1})(x_k - x_{k+1})\cdots(x_k - x_d)}$
We can now integrate the interpolation polynomial fd of f as $\int f_d(x)\mu(x)\,dx = \int \sum_{k=0}^{d} f(x_k) l_k(x)\mu(x)\,dx$
Bringing the summation and the factors f(xk) outside of the integral, we see $\int f_d(x)\mu(x)\,dx = \sum_{k=0}^{d} f(x_k) \int l_k(x)\mu(x)\,dx$
So if we set $w_k = \int l_k(x)\mu(x)\,dx$, we have $\int f_d(x)\mu(x)\,dx = \sum_{k=0}^{d} w_k f(x_k)$
Now, since fd interpolates f, we expect that fd≈f, so that $\int f(x)\mu(x)\,dx \approx \int f_d(x)\mu(x)\,dx = \sum_{k=0}^{d} w_k f(x_k)$
So we have found a quadrature rule. Moreover, if f∈Pd then the interpolating polynomial fd is equal to f itself, so the rule is exact for any f∈Pd.
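A minimal sketch of this construction, assuming μ(x) = 1 on an interval [a, b] so that the integrals of the Lagrange polynomials can be computed exactly from their antiderivatives; the helper name interpolatory_weights is mine:

```python
import numpy as np
from numpy.polynomial import Polynomial

def interpolatory_weights(points, a=-1.0, b=1.0):
    """w_k = ∫_a^b l_k(x) dx for the Lagrange polynomials l_k of `points`
    (weight function μ(x) = 1; a general μ would need another integration method)."""
    weights = []
    for k, xk in enumerate(points):
        lk = Polynomial([1.0])
        for j, xj in enumerate(points):
            if j != k:
                lk = lk * Polynomial([-xj, 1.0]) / (xk - xj)   # factor (x - x_j) / (x_k - x_j)
        antiderivative = lk.integ()
        weights.append(antiderivative(b) - antiderivative(a))
    return np.array(weights)

# The resulting rule integrates any polynomial of degree <= d exactly:
pts = np.array([-0.9, -0.2, 0.3, 0.8])       # d + 1 = 4 arbitrary distinct points
w = interpolatory_weights(pts)
p = Polynomial([1.0, 2.0, -1.0, 0.5])        # a polynomial of degree d = 3
print(np.sum(w * p(pts)))                    # quadrature ...
print(p.integ()(1.0) - p.integ()(-1.0))      # ... equals the exact integral
```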
It is also possible to find the weights in a more computational way. If we want the quadrature rule to be exact for a function space of dimension d, we can pick a basis of d functions, pick d distinct positions x1,x2,...,xd, write out the exactness conditions, and solve the resulting linear system of equations for the weights.
For example, suppose that we want the quadrature rule to be exact for any p∈Pd. We pick d+1 basis functions b0,b1,...,bd (for example $b_k(x) = x^k$ would work) and distinct positions x0,x1,...,xd. Then the exactness requirements become $\sum_{j=0}^{d} w_j b_k(x_j) = \int b_k(x)\mu(x)\,dx$ for $k = 0,1,\dots,d$
which is a linear system of d+1 equations in the d+1 unknowns w0,w1,...,wd. In particular, if we use $b_k(x) = x^k$ the resulting system is a Vandermonde system.
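A sketch of this computational route, again assuming μ(x) = 1 on [a, b] so that the right-hand sides (the moments of the monomials) are known in closed form; the function name is mine:

```python
import numpy as np

def weights_from_vandermonde(points, a=-1.0, b=1.0):
    """Solve the exactness conditions sum_j w_j x_j^k = ∫_a^b x^k dx for k = 0,...,d
    (monomial basis b_k(x) = x^k, weight function μ(x) = 1)."""
    d = len(points) - 1
    V = np.vander(points, increasing=True).T                 # row k holds x_0^k, ..., x_d^k
    moments = np.array([(b**(k + 1) - a**(k + 1)) / (k + 1) for k in range(d + 1)])
    return np.linalg.solve(V, moments)

pts = np.array([-0.9, -0.2, 0.3, 0.8])
print(weights_from_vandermonde(pts))    # matches the Lagrange-based weights above
```

For larger d the Vandermonde matrix becomes ill-conditioned, so in practice a better-conditioned basis (or the Lagrange construction above) is preferable.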
For now, we have regarded the positions x0,x1,...,xd as given. It is usually a good choice to pick them as Chebyshev nodes, since this keeps the interpolating polynomial fd close to f: it avoids the wild oscillations near the endpoints (Runge's phenomenon) that, for example, equally spaced points can produce.
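For completeness, the Chebyshev nodes (of the first kind) on an interval [a, b] are given by the standard formula below; this is general knowledge rather than something from the notes.

```python
import numpy as np

def chebyshev_nodes(d, a=-1.0, b=1.0):
    """d + 1 Chebyshev nodes of the first kind, mapped from [-1, 1] to [a, b]."""
    k = np.arange(d + 1)
    t = np.cos((2 * k + 1) * np.pi / (2 * (d + 1)))   # nodes on [-1, 1]
    return 0.5 * (a + b) + 0.5 * (b - a) * t
```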
If we change our point of view and treat the positions x1,x2,...,xd as unknowns too, the exactness requirements become a nonlinear system in 2d unknowns. By analogy with a non-singular linear system, which has a unique solution when there are as many equations as unknowns, we might optimistically expect to be able to satisfy 2d exactness conditions (this is not guaranteed, since the system is nonlinear).
So, this would mean that for each n there exists a quadrature rule with n points that is exact for P2n−1. Such a quadrature rule uses roughly half the number of points of the quadrature rules derived via the method presented earlier, at the cost of giving up the freedom to choose the positions x1,x2,...,xn.
It is possible to derive quadrature rules like this using orthogonal polynomials.
Orthogonal polynomials
Consider the following inner product on a space of functions ⟨f,g⟩=∫f(x)g(x)μ(x)dx
Like before, μ(x) is a nonnegative weight function. This inner product makes the function space into an inner product space.
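Numerically, such an inner product can be approximated with any black-box integrator; here is a small sketch using scipy.integrate.quad, with μ(x) = 1 on [−1, 1] as a default choice (the helper name inner is mine):

```python
from scipy.integrate import quad

def inner(f, g, mu=lambda x: 1.0, a=-1.0, b=1.0):
    """⟨f, g⟩ = ∫_a^b f(x) g(x) μ(x) dx via adaptive quadrature."""
    value, _error = quad(lambda x: f(x) * g(x) * mu(x), a, b)
    return value

# x and x^2 are orthogonal for the symmetric weight μ(x) = 1 on [-1, 1]:
print(inner(lambda x: x, lambda x: x * x))   # ≈ 0
```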
The usual notions of orthogonality and orthonormality apply.
Definition 2: A sequence of polynomials p0,p1,... is orthogonal if
deg(pk)=k
⟨pi,pj⟩=0 whenever i≠j
Definition 3: A sequence of polynomials p0,p1,... is orthonormal if it is orthogonal and ⟨pk,pk⟩=1 for every k=0,1,...
We will use the following lemmas, which are easy to understand and prove.
Lemma 4: If p0,p1,... is a sequence of orthogonal polynomials and p is a polynomial with deg(p)<n, then ⟨p,pn⟩=0
Proof: The polynomials p0,p1,...,pn−1 form a basis of Pn−1. Since deg(p)<n, we have p∈Pn−1, so we can write $p(x) = \sum_{k=0}^{n-1} c_k p_k(x)$. It follows that $\langle p, p_n\rangle = \langle \sum_{k=0}^{n-1} c_k p_k, p_n\rangle = \sum_{k=0}^{n-1} c_k \langle p_k, p_n\rangle$. Since p0,p1,... is an orthogonal sequence of polynomials and k<n for every term, all terms in the summation are zero, so it follows that ⟨p,pn⟩=0.
□
The following lemma states that polynomials can be easily expressed as a linear combination of orthonormal polynomials.
Lemma 5: If p0,p1,... is a sequence of orthonormal polynomials and p is a polynomial with deg(p)=n, then $p(x) = \sum_{k=0}^{n} \langle p, p_k\rangle p_k(x)$
Proof: Since p0,p1,...,pn is a basis of Pn and p∈Pn, we can write p as $p(x) = \sum_{k=0}^{n} c_k p_k(x)$ for some c0,c1,...,cn∈R. Taking the inner product of p and pk, we see $\langle p, p_k\rangle = \langle \sum_{j=0}^{n} c_j p_j, p_k\rangle = \sum_{j=0}^{n} c_j \langle p_j, p_k\rangle = c_k$
So ck=⟨p,pk⟩. Substituting this in $p(x) = \sum_{k=0}^{n} c_k p_k(x)$ gives the required result.
□
A sequence of orthonormal polynomials can be computed by using the Gram-Schmidt method, but it is more efficient to use the following result.
Theorem 6: If p0,p1,... is a sequence of orthonormal polynomials, then $\langle p_n, x p_{n+1}\rangle\, p_{n+1}(x) = x p_n(x) - \langle p_n, x p_{n-1}\rangle\, p_{n-1}(x) - \langle p_n, x p_n\rangle\, p_n(x)$
Proof: From lemma 5 we have $x p_n(x) = \sum_{k=0}^{n+1} \langle x p_n, p_k\rangle p_k(x)$. Now $\langle x p_n, p_k\rangle = \int x p_n(x) p_k(x)\mu(x)\,dx = \langle p_n, x p_k\rangle$
So we have $x p_n = \sum_{k=0}^{n+1} \langle p_n, x p_k\rangle p_k$
By lemma 4, ⟨pn,xpk⟩ is zero when deg(xpk)=k+1<n, so k=n−1,n,n+1 are the only k≤n+1 for which ⟨pn,xpk⟩ can be nonzero. So we have $x p_n = \langle p_n, x p_{n-1}\rangle p_{n-1}(x) + \langle p_n, x p_n\rangle p_n(x) + \langle p_n, x p_{n+1}\rangle p_{n+1}(x)$, and rearranging gives the stated recurrence.
□
Using theorem 6, we can compute the orthonormal polynomial pn+1 from the previous polynomials as follows. First, we compute a polynomial qn+1(x) which is orthogonal to p0,p1,...,pn as $q_{n+1}(x) = x p_n(x) - \langle p_n, x p_{n-1}\rangle p_{n-1}(x) - \langle p_n, x p_n\rangle p_n(x)$
Then, we compute pn+1 by normalizing qn+1: $p_{n+1}(x) = \frac{q_{n+1}(x)}{\sqrt{\langle q_{n+1}, q_{n+1}\rangle}}$
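Putting theorem 6 to work, here is a small sketch that builds orthonormal polynomials for the concrete choice μ(x) = 1 on [−1, 1], where inner products of polynomials can be computed exactly from antiderivatives; the helper names are mine.

```python
import numpy as np
from numpy.polynomial import Polynomial

A, B = -1.0, 1.0            # support of μ; concretely μ(x) = 1 on [-1, 1]

def inner(p, q):
    """⟨p, q⟩ = ∫_A^B p(x) q(x) dx, computed exactly via the antiderivative."""
    antiderivative = (p * q).integ()
    return antiderivative(B) - antiderivative(A)

def orthonormal_polynomials(n):
    """The orthonormal polynomials p_0, ..., p_n via the recurrence of theorem 6."""
    p0 = Polynomial([1.0])
    polys = [p0 / np.sqrt(inner(p0, p0))]
    x = Polynomial([0.0, 1.0])
    for k in range(n):
        pk = polys[k]
        q = x * pk - inner(pk, x * pk) * pk                  # q_{k+1} from theorem 6
        if k > 0:
            q = q - inner(pk, x * polys[k - 1]) * polys[k - 1]
        polys.append(q / np.sqrt(inner(q, q)))               # normalize
    return polys

# For this weight the result is proportional to the Legendre polynomials,
# e.g. the degree-2 one is a multiple of 3x^2 - 1:
print(orthonormal_polynomials(2)[2])
```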
There is the following nice result about the roots of orthogonal polynomials.
Lemma 7: Let p0,p1,... be a sequence of orthogonal polynomials with respect to the inner product ⟨f,g⟩=∫f(x)g(x)μ(x)dx, and let the support of μ be contained in [a,b]. Then pn has n distinct real roots x1,x2,...,xn∈(a,b).
Proof: Let x1,x2,...,xm be the zeroes of pn that have odd multiplicity and lie in (a,b). Define q(x)=(x−x1)(x−x2)⋯(x−xm). Then q(x)pn(x) does not change sign on [a,b], so ⟨q,pn⟩≠0. If we had m<n, then deg(q)<n and lemma 4 would give ⟨q,pn⟩=0, a contradiction. Since m≤n (pn has degree n), it follows that m=n, so all n zeroes of pn are simple, real, and lie in (a,b).
□
Now we are ready to prove the following theorem.
Theorem 8: Let p0,p1,... be a sequence of orthogonal polynomials with respect to the inner product ⟨f,g⟩=∫f(x)g(x)μ(x)dx and let x1,x2,...,xn be the zeroes of pn. Define $w_k = \int l_k(x)\mu(x)\,dx$ for k=1,2,...,n, where l1,l2,...,ln are the Lagrange polynomials for x1,x2,...,xn. Then $\sum_{k=1}^{n} w_k f(x_k) = \int f(x)\mu(x)\,dx$
for f∈P2n−1.
Proof: Let fn be the polynomial of degree at most n−1 that interpolates f at x1,x2,...,xn. If we can show that ∫(f(x)−fn(x))μ(x)dx=0, it follows that ∫f(x)μ(x)dx=∫fn(x)μ(x)dx.
First, we note that fn(x) assumes the same values as f(x) at x=x1,x2,...,xn, so f(x)−fn(x) has zeroes at x1,x2,...,xn. Since x1,x2,...,xn are exactly the zeroes of pn, and these zeroes are simple, pn divides f(x)−fn(x): we can write f(x)−fn(x)=pn(x)q(x) for some polynomial q. Since f∈P2n−1 and deg(pn)=n, we have q∈Pn−1.
Now, we can write the integral as ∫(f(x)−fn(x))μ(x)dx=∫pn(x)q(x)μ(x)dx=⟨pn,q⟩. Since deg(q)<n we have ⟨pn,q⟩=0 by lemma 4. So we see that ∫f(x)μ(x)dx=∫fn(x)μ(x)dx. It was shown earlier that $\int f_n(x)\mu(x)\,dx = \sum_{k=1}^{n} w_k f(x_k)$, so it follows that $\int f(x)\mu(x)\,dx = \int f_n(x)\mu(x)\,dx = \sum_{k=1}^{n} w_k f(x_k)$
□
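To make theorem 8 concrete, the sketch below builds the n-point rule for μ(x) = 1 on [−1, 1]. Rather than re-deriving the orthogonal polynomial with the recurrence above, it takes NumPy's degree-n Legendre polynomial (which is orthogonal for this weight), finds its roots, and computes the weights from the exactness conditions; the variable names and test polynomial are mine.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.legendre import Legendre

n = 4
nodes = Legendre.basis(n).roots()        # n distinct real roots in (-1, 1), by lemma 7

# Weights from the exactness conditions on the monomial basis (a Vandermonde system).
V = np.vander(nodes, increasing=True).T
moments = np.array([(1.0 - (-1.0)**(k + 1)) / (k + 1) for k in range(n)])
weights = np.linalg.solve(V, moments)

# The rule is exact for polynomials of degree up to 2n - 1 = 7:
p = Polynomial(np.arange(1.0, 9.0))      # some polynomial of degree 7
print(np.sum(weights * p(nodes)))        # quadrature ...
print(p.integ()(1.0) - p.integ()(-1.0))  # ... matches the exact integral
```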
Sometimes, we want to compute the integral ∫p(x)q(x)μ(x)dx
for many different p and q. Using theorems 6 and 8 and a root-finding method, we can compute a different quadrature rule for every q. The inconvenient thing about this approach is that the quadrature points x1,x2,... will be different for every q.
In [2], it is proposed to instead pick one set of quadrature points and re-use these for every q. This means we have to use more quadrature points per q, but since we can use the same set of quadrature points we only have to evaluate p on this one set, which is more efficient in the end. It also means that we can use the 'simple' way of computing quadrature rules outlined in the section 'Quadrature rules, the easy way'.
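The scheme in [2] is more elaborate than this, but the basic idea of fixing one node set and reusing it can be sketched as follows (μ(x) = 1, one fixed p and several q; the setup and names are my own): one Gauss rule that is exact up to degree deg(p) + max deg(q) lets us evaluate p once and reuse those values for every q.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.legendre import leggauss

p = Polynomial([0.0, 1.0, 3.0, -2.0])                        # fixed p, degree 3
qs = [Polynomial([1.0, 2.0]), Polynomial([0.0, 0.0, 1.0])]   # several q, degree <= 2

# One fixed rule, exact up to degree 2 * 3 - 1 = 5 >= deg(p) + max deg(q).
nodes, weights = leggauss(3)
p_values = p(nodes)                      # evaluate p once, reuse for every q

for q in qs:
    approx = np.sum(weights * p_values * q(nodes))           # ∫ p(x) q(x) dx
    antiderivative = (p * q).integ()
    print(approx, antiderivative(1.0) - antiderivative(-1.0))  # the two agree
```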
References
[1] Approximation Theory. Walter Groenevelt. Lecture notes WI4415, TU Delft, 2016.
[2] Matrix-free weighted quadrature for a computationally efficient isogeometric k-method. Giancarlo Sangalli and Mattia Tani, 2017. https://arxiv.org/abs/1712.08565