In this article, I’ll prove the Euler-Lagrange equation and give some examples of applications. I’ll use some other theorems without proof: Fermat’s theorem, the fundamental lemma of the calculus of variations, the multivariate chain rule, and integration by parts.
Theorem: Consider the functional $I$ defined by
$$I(f) = \int_a^b L(f, f', x)\,dx$$
If $f$ is a differentiable function for which this functional has a local extremum, then it must satisfy the Euler-Lagrange equation:
$$\frac{\partial L}{\partial f} = \frac{d}{dx}\frac{\partial L}{\partial f'}$$
Proof: Define a mapping $h$ that maps a real number to a function as
$$h(\alpha) = f + \alpha\eta$$
where $\eta$ is any differentiable function with $\eta(a) = \eta(b) = 0$. Suppose that $f$ is a differentiable function for which $I$ has a local extremum. Then $\alpha \mapsto I(h(\alpha))$ has a local extremum at $\alpha = 0$, so by Fermat’s theorem we have
$$\left.\frac{d}{d\alpha} I(h(\alpha))\right|_{\alpha=0} = 0$$
Working out $\frac{d}{d\alpha} I(h(\alpha))$ gives
$$\frac{d}{d\alpha} I(h(\alpha)) = \frac{d}{d\alpha}\int_a^b L(h, h', x)\,dx = \int_a^b \frac{d}{d\alpha} L(h, h', x)\,dx$$
Technically, I have to show that switching the order of integration and differentiation is allowed, but I’ll omit this step for now.
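Using the multivariate chain rule, we see that the integrand equals $\frac{\partial L}{\partial h}\frac{dh}{d\alpha} + \frac{\partial L}{\partial h'}\frac{dh'}{d\alpha}$, so we have
$$\frac{d}{d\alpha} I(h(\alpha)) = \int_a^b \left(\frac{\partial L}{\partial h}\frac{dh}{d\alpha} + \frac{\partial L}{\partial h'}\frac{dh'}{d\alpha}\right)dx$$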
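Note that the term $\frac{\partial L}{\partial x}\frac{dx}{d\alpha}$ does not occur since $\frac{dx}{d\alpha} = 0$. Now we use that $\frac{dh}{d\alpha} = \eta$ and $\frac{dh'}{d\alpha} = \eta'$, and obtain
$$\frac{d}{d\alpha} I(h(\alpha)) = \int_a^b \left(\frac{\partial L}{\partial h}\eta(x) + \frac{\partial L}{\partial h'}\eta'(x)\right)dx$$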
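Now, consider the integral over only the second term. By using integration by parts, we see
$$\int_a^b \frac{\partial L}{\partial h'}\eta'(x)\,dx = \left.\frac{\partial L}{\partial h'}\right|_{x=b}\eta(b) - \left.\frac{\partial L}{\partial h'}\right|_{x=a}\eta(a) - \int_a^b \left(\frac{d}{dx}\frac{\partial L}{\partial h'}\right)\eta(x)\,dx$$
Using that $\eta(a) = \eta(b) = 0$, this simplifies to just
$$\int_a^b \frac{\partial L}{\partial h'}\eta'(x)\,dx = -\int_a^b \left(\frac{d}{dx}\frac{\partial L}{\partial h'}\right)\eta(x)\,dx$$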
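Substituting this back in the equation for $\frac{d}{d\alpha} I(h(\alpha))$ yields
$$\frac{d}{d\alpha} I(h(\alpha)) = \int_a^b \left(\frac{\partial L}{\partial h} - \frac{d}{dx}\frac{\partial L}{\partial h'}\right)\eta(x)\,dx$$
We have $h(0) = f$, so evaluating the whole thing at $\alpha = 0$, we get
$$\left.\frac{d}{d\alpha} I(h(\alpha))\right|_{\alpha=0} = \int_a^b \left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right)\eta(x)\,dx$$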
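Remembering that $\left.\frac{d}{d\alpha} I(h(\alpha))\right|_{\alpha=0} = 0$ by Fermat’s theorem, we have
$$\int_a^b \left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right)\eta(x)\,dx = 0$$
for any differentiable $\eta$ with $\eta(a) = \eta(b) = 0$. By the fundamental lemma of the calculus of variations, it follows that $\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0$, which yields
$$\frac{\partial L}{\partial f} = \frac{d}{dx}\frac{\partial L}{\partial f'}$$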
when rearranged. □
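The key step of the proof, that the derivative of $I(f + \alpha\eta)$ with respect to $\alpha$ vanishes at $\alpha = 0$ for an extremal $f$, can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the theorem: the integrand $L = (f')^2$ on $[0, 1]$, the perturbation $\eta(x) = \sin(\pi x)$, and the helper names `functional` and `first_variation` are all chosen for this example. For this integrand the Euler-Lagrange equation reduces to $f'' = 0$, so $f(x) = x$ should give a vanishing derivative, while $f(x) = x^2$ (same endpoints, but $f'' \neq 0$) should not.

```python
import math

def functional(f_prime, n=2000):
    """Midpoint-rule approximation of I(f) = integral over [0, 1] of (f'(x))^2."""
    dx = 1.0 / n
    return sum(f_prime((i + 0.5) * dx) ** 2 for i in range(n)) * dx

def first_variation(f_prime, eta_prime, eps=1e-5):
    """Central-difference estimate of d/d(alpha) I(f + alpha * eta) at alpha = 0."""
    def I(alpha):
        return functional(lambda x: f_prime(x) + alpha * eta_prime(x))
    return (I(eps) - I(-eps)) / (2 * eps)

def eta_prime(x):
    # Derivative of the assumed perturbation eta(x) = sin(pi x), which is 0 at x = 0 and x = 1.
    return math.pi * math.cos(math.pi * x)

# f(x) = x satisfies f'' = 0, so the first variation should be (numerically) zero.
line_variation = first_variation(lambda x: 1.0, eta_prime)

# f(x) = x^2 has the same endpoints but does not satisfy f'' = 0.
parabola_variation = first_variation(lambda x: 2.0 * x, eta_prime)

print(line_variation, parabola_variation)
```

For the parabola, the derivative can also be computed by hand as $2\int_0^1 2x \cdot \pi\cos(\pi x)\,dx = -8/\pi$, which the numerical estimate should reproduce closely.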
If the integrand L(f,f′,x) does not depend on f or on x, the Euler-Lagrange equation can be simplified. Let’s first consider the case where the integrand does not depend on f:
Corollary: Consider the functional $I$ defined by
$$I(f) = \int_a^b L(f', x)\,dx$$
If $f$ is a differentiable function for which this functional has a local extremum, then
$$\frac{\partial L}{\partial f'}$$
is constant.
Proof: Note that this is a special case of the Euler-Lagrange equation where we have $\frac{\partial L}{\partial f} = 0$. Substituting this in the Euler-Lagrange equation, we get $\frac{d}{dx}\frac{\partial L}{\partial f'} = 0$. It follows that $\frac{\partial L}{\partial f'}$ is constant. □
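As a quick illustration of this corollary, take the integrand $L(f', x) = \tfrac{1}{2}(f')^2 + x f'$. This integrand is a made-up example and is not used elsewhere in this article. The corollary says that $\frac{\partial L}{\partial f'}$ is constant, which gives a first-order equation that can be integrated directly:

$$\frac{\partial L}{\partial f'} = f' + x = c \quad\Longrightarrow\quad f'(x) = c - x \quad\Longrightarrow\quad f(x) = cx - \tfrac{1}{2}x^2 + d$$

Here $c$ and $d$ are constants that would be fixed by the boundary values $f(a)$ and $f(b)$. The full Euler-Lagrange equation gives the same result, since $\frac{d}{dx}(f' + x) = f'' + 1 = 0$.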
The case where L does not depend on x is similar, but less straightforward to prove.
Corollary: Consider the functional
$$I(f) = \int_a^b L(f, f')\,dx$$
If $f$ is a differentiable function for which this functional has a local extremum, then
$$\frac{\partial L}{\partial f'}f' - L$$
is constant.
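Proof: By the multivariate chain rule, we have
$$\frac{dL}{dx} = \frac{\partial L}{\partial f}\frac{df}{dx} + \frac{\partial L}{\partial f'}\frac{df'}{dx}$$
Substituting the Euler-Lagrange equation $\frac{\partial L}{\partial f} = \frac{d}{dx}\frac{\partial L}{\partial f'}$ in this expression yields
$$\frac{dL}{dx} = \left(\frac{d}{dx}\frac{\partial L}{\partial f'}\right)\frac{df}{dx} + \frac{\partial L}{\partial f'}\frac{df'}{dx}$$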
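Writing $f'$ for $\frac{df}{dx}$, we can use the product rule:
$$\left(\frac{d}{dx}\frac{\partial L}{\partial f'}\right)f' + \frac{\partial L}{\partial f'}\left(\frac{df'}{dx}\right) = \frac{d}{dx}\left(\frac{\partial L}{\partial f'}f'\right)$$
Substituting this, we obtain
$$\frac{dL}{dx} = \frac{d}{dx}\left(\frac{\partial L}{\partial f'}f'\right)$$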
After subtracting $\frac{dL}{dx}$ from both sides we have
$$\frac{d}{dx}\left(\frac{\partial L}{\partial f'}f' - L\right) = 0$$
This implies that
$$\frac{\partial L}{\partial f'}f' - L$$
is constant. □
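This corollary is easy to sanity-check numerically. The sketch below assumes an example integrand that does not appear elsewhere in this article, $L(f, f') = \tfrac{1}{2}(f')^2 - \tfrac{1}{2}f^2$, whose Euler-Lagrange equation is $f'' = -f$ and is therefore satisfied by $f(x) = \sin x$. The function names are made up for the illustration. Evaluating $\frac{\partial L}{\partial f'}f' - L$ along this solution should give the same value at every $x$.

```python
import math

def L(f, fp):
    """Assumed example integrand L(f, f') = f'^2 / 2 - f^2 / 2 (no explicit x-dependence)."""
    return 0.5 * fp ** 2 - 0.5 * f ** 2

def dL_dfp(f, fp, eps=1e-6):
    """Central-difference estimate of the partial derivative of L with respect to f'."""
    return (L(f, fp + eps) - L(f, fp - eps)) / (2 * eps)

def conserved_quantity(x):
    """Evaluate (dL/df') * f' - L along f(x) = sin(x), which solves f'' = -f."""
    f, fp = math.sin(x), math.cos(x)
    return dL_dfp(f, fp) * fp - L(f, fp)

values = [conserved_quantity(0.1 * k) for k in range(60)]
spread = max(values) - min(values)
print(values[0], spread)
```

By hand, the quantity works out to $\tfrac{1}{2}\left(\cos^2 x + \sin^2 x\right) = \tfrac{1}{2}$, so the spread between the sampled values should be at the level of rounding error.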
Examples
The Euler-Lagrange equation is a helpful tool, but it usually requires some work to arrive at a solution. Here, I show some well-known applications of the Euler-Lagrange equation.
The shortest path between two points is a straight line
This intuitively obvious statement is not trivial to prove. I’ll give a proof using the Euler-Lagrange equation. I’ll take as a given that the length of a curve $y(x)$ from $x = a$ to $x = b$ is given by
$$\int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx$$
An intuitive idea for why this holds can be obtained by considering that the length of a straight line segment is $\sqrt{\Delta x^2 + \Delta y^2}$, where $\Delta x$ and $\Delta y$ are the differences in the $x$-coordinate and $y$-coordinate. Letting $\Delta x \to 0$, we get $\sqrt{dx^2 + dy^2}$, and integrating yields the expression
$$\int_a^b \sqrt{dx^2 + dy^2} = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx$$
for the length of the differentiable curve $y$ from $(a, y(a))$ to $(b, y(b))$.
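Setting $I(y) = \int_a^b \sqrt{1 + (y')^2}\,dx$, the integrand does not depend on $y$, so the Euler-Lagrange equation now gives
$$\frac{d}{dx}\frac{\partial}{\partial y'}\sqrt{1 + (y')^2} = 0$$
Working out the derivatives gives
$$\frac{d}{dx}\frac{\partial}{\partial y'}\sqrt{1 + (y')^2} = \frac{d}{dx}\frac{y'}{\sqrt{1 + (y')^2}} = \frac{y''}{\left(\sqrt{1 + (y')^2}\right)^3} = 0$$
Since the denominator is always positive, it follows that $y'' = 0$, and $y$ must be of the form
$$y(x) = mx + c$$
for some constants $m$ and $c$, which is a straight line.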
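The result can be illustrated numerically by comparing the length of a straight line with the lengths of nearby curves through the same endpoints. The sketch below makes some choices of its own: the endpoints $(0, 0)$ and $(1, 1)$, the perturbation $\alpha\sin(\pi x)$, and the helper names `curve_length` and `perturbed_line` are assumptions for this example only. It approximates each curve by many short straight segments and sums their lengths.

```python
import math

def curve_length(y, a=0.0, b=1.0, n=10_000):
    """Approximate the length of the graph of y on [a, b] by summing short straight segments."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x0, x1 = a + i * dx, a + (i + 1) * dx
        total += math.hypot(x1 - x0, y(x1) - y(x0))
    return total

def perturbed_line(alpha):
    """Straight line from (0, 0) to (1, 1) plus a bump that vanishes at both endpoints."""
    return lambda x: x + alpha * math.sin(math.pi * x)

lengths = {alpha: curve_length(perturbed_line(alpha))
           for alpha in (-0.2, -0.1, 0.0, 0.1, 0.2)}

for alpha, length in lengths.items():
    print(f"alpha = {alpha:+.1f}: length = {length:.6f}")
```

The unperturbed line ($\alpha = 0$) has length $\sqrt{2}$, and every perturbed curve in this family comes out longer. This only tests one family of perturbations, so it is a plausibility check rather than a proof.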
Brachistochrone
A brachistochrone curve through two points A and B on a plane is defined as the curve that minimizes the time it takes for a point mass to slide from A to B, starting from a standstill and neglecting friction. Of course, this assumes that the height of B is less than the height of A.
Tackling this problem requires some physics. The potential energy, which is the energy that the particle gets from its height, is $mgh$, where $m$ is the mass of the particle, $g$ is the gravitational acceleration, and $h$ is the height of the particle. The kinetic energy, which the particle gets from its speed $v$, is $\frac{1}{2}mv^2$. Since we pretend there’s no friction and ignore other types of energy, the sum of these two energies is constant by the law of conservation of energy. If we assume that the mass of the particle $m$ is constant as well, we can divide by it and find
$$\frac{1}{2}v^2 + gh = c$$
for some constant c.
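Re-arranging, we find $v = \sqrt{2g(d - h)}$ with $d = \frac{c}{g}$. Since $g$ is a constant, the speed depends only on the height $h$ of the ball, which is an interesting result in itself. Because the ball starts from a standstill at A, $d$ is exactly the height of A. It is particularly convenient to measure the vertical coordinate downwards from A, that is, to set $y = d - h$, so that A lies at $y = 0$. Note that the $y$-axis points downwards in this case. We then obtain:
$$v = \sqrt{2gy}$$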
Now, as we used before, the length of an infinitesimal curve segment is $\sqrt{1 + (y')^2}\,dx$. The speed at $y$ is $\sqrt{2gy}$. The time that it takes the ball to roll over the segment is simply the length of the segment divided by the speed. So we can express the time it takes to roll from A to B as
$$\int_{x_A}^{x_B} \frac{\sqrt{1 + (y')^2}}{\sqrt{2gy}}\,dx$$
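To get a feel for this integral, it can be evaluated numerically for a simple curve. The sketch below assumes a straight ramp $y = kx$ from $(0, 0)$ to $(x_B, kx_B)$ with $y$ pointing downwards, takes $g = 9.81\ \mathrm{m/s^2}$, and uses a made-up helper `descent_time`. The integrand blows up at the starting point because the speed is zero there, so the code substitutes $x = x_B u^2$ before applying the midpoint rule. For a straight ramp the result can be compared with the elementary formula for sliding down an incline, $\sqrt{2x_B(1 + k^2)/(gk)}$.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2

def descent_time(y, y_prime, x_b, n=10_000):
    """Midpoint-rule estimate of the integral of sqrt(1 + y'^2) / sqrt(2 g y) over [0, x_b].

    The substitution x = x_b * u**2 tames the 1/sqrt(y) singularity at the
    starting point, where the speed is zero.
    """
    du = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        x = x_b * u * u
        dx_du = 2.0 * x_b * u
        integrand = math.sqrt(1.0 + y_prime(x) ** 2) / math.sqrt(2.0 * G * y(x))
        total += integrand * dx_du * du
    return total

# Hypothetical straight ramp y = k x from (0, 0) to (x_b, k x_b), y measured downwards.
k, x_b = 1.0, 1.0
numeric = descent_time(lambda x: k * x, lambda x: k, x_b)
closed_form = math.sqrt(2.0 * x_b * (1.0 + k * k) / (G * k))
print(numeric, closed_form)
```

The same function accepts any curve given by `y` and `y_prime`, as long as $y > 0$ for $x > 0$, so it can be used to compare the descent times of different candidate curves.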
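With this we finally have the integral to minimize, and we can use the Euler-Lagrange equation with $L = \frac{\sqrt{1 + (y')^2}}{\sqrt{2gy}}$. In fact, the integrand does not depend on $x$, so we can use the simplified version from the second corollary. From this, we gather that
$$\frac{\partial L}{\partial y'}y' - L = C$$
for some constant $C$. Now, first, compute $\frac{\partial L}{\partial y'}$:
$$\frac{\partial L}{\partial y'} = \frac{1}{\sqrt{2gy}}\cdot\frac{y'}{\sqrt{1 + (y')^2}}$$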
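Substituting this in the previous equation yields
$$\frac{1}{\sqrt{2gy}}\left(\frac{(y')^2}{\sqrt{1 + (y')^2}} - \sqrt{1 + (y')^2}\right) = C$$
Putting the two terms in parentheses over the common denominator $\sqrt{1 + (y')^2}$ leaves $(y')^2 - \left(1 + (y')^2\right) = -1$ in the numerator, so simplifying gives
$$\frac{1}{\sqrt{2gy}}\cdot\frac{-1}{\sqrt{1 + (y')^2}} = C$$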
$$\sqrt{y\left(1 + (y')^2\right)} = -\frac{1}{\sqrt{2g}\,C}$$
$$y\left(1 + (y')^2\right) = r^2$$
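with $r = \frac{1}{\sqrt{2g}\,C}$. Now, write $y' = \frac{dy}{dx}$. We can then rewrite to
$$y\left(dx^2 + dy^2\right) = r^2\,dx^2$$
Now, we want to write $x$ and $y$ as functions of a parameter $t$. Divide both sides by $dt^2$ to get
$$y\left(\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right) = r^2\left(\frac{dx}{dt}\right)^2$$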