Appendix A3 — Calculus of Variations and the Pontryagin Maximum Principle | Igor Moiseev
Part A3 of the "Geometry of Seeing" series

Appendix A3 — Calculus of Variations and the Pontryagin Maximum Principle

From Euler–Lagrange to the PMP, then Lie–Poisson reduction on $\mathfrak{se}(2)^{\ast}$. Why the equations $\dot h_1 = h_2 h_3$, $\dot h_2 = -h_1 h_3$, $\dot h_3 = 0$ that Part 2 §1 used as a starting point are exactly what you get when you do the optimal-control problem carefully on a Lie group.

By Igor Moiseev · 3 May 2026 · arXiv:0807.4731 · with Yu. L. Sachkov
Geometry of Seeing
  1. The Visual Cortex as a Contact Manifold (draft)
  2. Euler's Elastica and Jacobi Elliptic Functions (draft)
  3. Maxwell Strata: When Optimal Paths Fork
  4. The Open Problem: Exact Cut Time on SE(2)
Appendices — Theory Background
  1. A1. Lie Groups, Lie Algebras, and the Exponential Map of SE(2) (draft)
  2. A2. Distributions, Frobenius, and Contact Geometry (draft)
  3. A3. Calculus of Variations and the Pontryagin Maximum Principle ← you are here
  4. A4. Jacobi Elliptic Functions, Elliptic Integrals, and the AGM (draft)
  5. A5. The Sub-Riemannian Exponential Map of SE(2) (draft)
What this appendix is for
Part 2 §1 begins:
"The Pontryagin Maximum Principle introduces a covector $\lambda$ in the cotangent bundle... (After several lines of unjustified algebra)... the Hamiltonian equations on $\mathfrak{se}(2)^{\ast}$ read $\dot h_1 = h_2 h_3, \dot h_2 = -h_1 h_3, \dot h_3 = 0$."
This appendix supplies the missing derivation. It assumes Appendix A1 (Lie groups + $\mathfrak{se}(2)$) and A2 (distributions + Chow's theorem). The phase portraits and pendulum-period plots borrow directly from the physical-pendulum example in the elliptic project: the same Jacobi-elliptic period $4K(k^2)$ controls both that page's nonlinear pendulum and our SE(2) elastica problem.

The Euler–Lagrange equations in one paragraph

A Lagrangian $L(q, \dot q, t)$ on configuration space defines the action

\[S[\gamma] \;:=\; \int_{t_0}^{t_1} L(q(t), \dot q(t), t)\,dt.\]

Stationary paths under fixed-endpoint variations $\delta q(t_0) = \delta q(t_1) = 0$ satisfy

\[\boxed{\;\frac{d}{dt}\frac{\partial L}{\partial \dot q^i} \;=\; \frac{\partial L}{\partial q^i}\;}\qquad (i = 1, \ldots, n).\]

Derivation in three lines:

\[\delta S = \int_{t_0}^{t_1} \Bigl(\frac{\partial L}{\partial q^i}\delta q^i + \frac{\partial L}{\partial \dot q^i}\delta \dot q^i\Bigr) dt = \int_{t_0}^{t_1}\Bigl(\frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i}\Bigr)\delta q^i \,dt + \Bigl[\frac{\partial L}{\partial \dot q^i}\delta q^i\Bigr]_{t_0}^{t_1}.\]

The boundary term vanishes because $\delta q$ vanishes at the endpoints; for $\delta S = 0$ to hold for every variation $\delta q$, the integrand must vanish (the fundamental lemma of the calculus of variations), giving E-L.
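The E-L recipe is easy to mechanise. A minimal sympy sketch (the Lagrangian $L = \tfrac12 \dot q^2 - V(q)$ is an illustrative choice, not one used above):

```python
import sympy as sp

t = sp.symbols('t')
q = sp.Function('q')
V = sp.Function('V')

# Illustrative Lagrangian: L = ½ q̇² − V(q)
L = sp.Rational(1, 2) * q(t).diff(t)**2 - V(q(t))

# E-L: d/dt (∂L/∂q̇) − ∂L/∂q = 0
p = sp.diff(L, q(t).diff(t))            # conjugate momentum ∂L/∂q̇ = q̇
el = sp.diff(p, t) - sp.diff(L, q(t))   # left-hand side of E-L
print(el)                               # q̈ + dV/dq, i.e. Newton's second law
```

Note that sympy treats $q$ and $\dot q$ as independent when differentiating, exactly as the variational calculation does.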

Legendre transform → Hamiltonian

Define the conjugate momentum $p_i := \partial L / \partial \dot q^i$ and the Hamiltonian by Legendre transform

\[H(q, p, t) \;:=\; p_i \dot q^i - L(q, \dot q, t).\]

(Eliminating $\dot q$ in favour of $p$.) The E-L equations become Hamilton’s equations

\[\dot q^i \;=\; \frac{\partial H}{\partial p_i}, \qquad \dot p_i \;=\; -\frac{\partial H}{\partial q^i}.\]

A direct consequence: $H$ is conserved on solutions when it has no explicit $t$-dependence ($\dot H = \partial H / \partial t$ along solutions). This is what makes “energy is conserved” a theorem and not just a slogan.
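The conservation statement is easy to check numerically. A quick sketch using the pendulum Hamiltonian $H = \tfrac12 p^2 - \cos q$ (an illustrative choice; it has no explicit $t$-dependence, so $H$ should stay constant along solutions):

```python
import numpy as np
from scipy.integrate import solve_ivp

def hamilton(t, z):
    # Hamilton's equations for H = ½p² − cos q:
    # q̇ = ∂H/∂p = p,   ṗ = −∂H/∂q = −sin q
    q, p = z
    return [p, -np.sin(q)]

H = lambda q, p: 0.5 * p**2 - np.cos(q)

sol = solve_ivp(hamilton, [0, 50], [1.0, 0.0], rtol=1e-11, atol=1e-12)
q, p = sol.y
drift = np.max(np.abs(H(q, p) - H(1.0, 0.0)))
print(f"max |H(t) - H(0)| = {drift:.2e}")  # limited only by integrator tolerance
```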

Lagrangian $L = \tfrac12 \dot q^2$; straight line is the geodesic
Figure A3.1. The action $S[q] = \int_0^1 \tfrac12 \dot q^2 \,dt$ for a free particle going from $q = 0$ at $t = 0$ to $q = 1$ at $t = 1$. The straight-line solution $q(t) = t$ has $S = 1/2$ (the minimum). Perturb it by $\delta q(t) = \eta \sin(n \pi t)$ and the action becomes $S = \tfrac12 + \tfrac14 (\eta n \pi)^2$ — strictly larger for any $\eta \neq 0$. The right panel plots $S(\eta)$ as a parabola; the left panel shows the path being perturbed. Critical paths satisfy E-L, here $\ddot q = 0$, hence the straight line.
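Figure A3.1's parabola can be reproduced in a few lines (a minimal sketch; the grid size and the choice $n = 3$ are arbitrary):

```python
import numpy as np

def action(qdot, t):
    # S = ∫ ½ q̇² dt by the trapezoidal rule
    f = 0.5 * qdot**2
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))

t = np.linspace(0, 1, 20001)
n = 3
for eta in [0.0, 0.1, 0.2]:
    # q(t) = t + η sin(nπt), so q̇ = 1 + η nπ cos(nπt)
    qdot = 1.0 + eta * n * np.pi * np.cos(n * np.pi * t)
    S_num = action(qdot, t)
    S_formula = 0.5 + 0.25 * (eta * n * np.pi)**2
    print(f"eta={eta:.1f}: numeric S={S_num:.7f}, formula={S_formula:.7f}")
```

The cross term $\int \eta n\pi \cos(n\pi t)\,dt$ integrates to zero over $[0, 1]$, which is why the action is exactly quadratic in $\eta$.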

Optimal control on a manifold

We now generalise. The state is a point $g \in M$ on a smooth manifold; the control is $u(t) \in U \subseteq \mathbb R^k$; the dynamics are

\[\dot g \;=\; f(g, u),\]

and we minimise

\[J \;=\; \int_0^T L(g, u)\,dt, \qquad g(0), g(T) \text{ fixed}.\]

In our setting $M = \mathrm{SE}(2)$, $u = (u_1, u_2)$, $f(g, u) = u_1 X_1(g) + u_2 X_2(g)$, $L = \sqrt{u_1^2 + u_2^2}$, and $T$ is free (we minimise total length, so $T$ itself is the cost when $u_1^2 + u_2^2 = 1$).
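A quick numerical illustration of that last point (a sketch; the steering schedule $a(t) = \sin t$ is an arbitrary choice, not an optimal control): with $u$ on the unit circle, the integrand $\sqrt{u_1^2 + u_2^2}$ is identically 1, so $J = T$ whatever the trajectory does.

```python
import numpy as np
from scipy.integrate import solve_ivp

def se2_controlled(t, state):
    x, y, th = state
    a = np.sin(t)                  # arbitrary steering schedule (illustrative)
    u1, u2 = np.cos(a), np.sin(a)  # controls on the unit circle: u1² + u2² = 1
    # dynamics ġ = u1 X1(g) + u2 X2(g) in the (x, y, θ) chart
    return [u1 * np.cos(th), u1 * np.sin(th), u2]

T = 5.0
sol = solve_ivp(se2_controlled, [0, T], [0.0, 0.0, 0.0], rtol=1e-9, atol=1e-12)

# SR cost J = ∫₀ᵀ √(u1² + u2²) dt; with unit-circle controls the integrand
# is identically 1, so the quadrature collapses to the total time T
speed = np.ones_like(sol.t)
J = np.sum(0.5 * (speed[1:] + speed[:-1]) * np.diff(sol.t))
print(f"J = {J:.10f}, T = {T}")
```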

The Pontryagin Maximum Principle

Pontryagin Maximum Principle (Pontryagin et al. 1962)
If $(g^{\ast}, u^{\ast})$ is an optimal pair, there exist a constant $\nu \in \{0, 1\}$ and a curve $\lambda^{\ast} : [0, T] \to T^{\ast}M$ with $\lambda^{\ast}(t) \in T^{\ast}_{g^{\ast}(t)} M$, with $(\nu, \lambda^{\ast}(t)) \neq (0, 0)$ for every $t$, such that the Pontryagin Hamiltonian $$\mathcal H(g, \lambda, u, \nu) \;:=\; \langle \lambda, f(g, u)\rangle - \nu L(g, u)$$ is maximised over admissible $u$ at every $t$: $$\mathcal H(g^{\ast}(t), \lambda^{\ast}(t), u^{\ast}(t), \nu) \;=\; \max_{u \in U} \mathcal H(g^{\ast}(t), \lambda^{\ast}(t), u, \nu),$$ and $(g^{\ast}, \lambda^{\ast})$ obey the Hamilton equations of $\mathcal H$ in $T^{\ast}M$. Extremals with $\nu = 1$ are called normal; those with $\nu = 0$, abnormal.

The key new object is the costate $\lambda \in T^{\ast}M$. Heuristically, it is the Lagrange multiplier enforcing the dynamic constraint $\dot g = f$. In a coordinate chart $\lambda = \lambda_i \,dq^i$ and Hamilton’s equations read $\dot q^i = \partial \mathcal H / \partial \lambda_i$, $\dot \lambda_i = -\partial \mathcal H / \partial q^i$.

Maximising over $u$ for the SR problem

The sub-Riemannian length functional has cost $L = \sqrt{u_1^2 + u_2^2}$. A standard trick: parametrise by arc length, so that $u_1^2 + u_2^2 = 1$ throughout. The controls then live on the unit circle, and the cost becomes simply $T$ (total time = total arc length).

The Pontryagin Hamiltonian on $T^{\ast}\mathrm{SE}(2)$ with controls $u_1 X_1 + u_2 X_2$ is

\[\mathcal H = u_1 \langle \lambda, X_1\rangle + u_2 \langle \lambda, X_2\rangle - \nu.\]

Define $h_i := \langle \lambda, X_i\rangle$ (the contraction of the covector $\lambda$ with the left-invariant (LI) vector field $X_i$). Then $\mathcal H = u_1 h_1 + u_2 h_2 - \nu$ (the last term is constant in $u$).

Maximising $u_1 h_1 + u_2 h_2$ over $u_1^2 + u_2^2 \leq 1$:

\[\max_{u \in S^1} (u_1 h_1 + u_2 h_2) \;=\; \sqrt{h_1^2 + h_2^2},\]

attained at $u^{\ast} = (h_1, h_2) / \sqrt{h_1^2 + h_2^2}$.
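The maximisation can be checked by brute force over a sampled circle (a minimal sketch; the sample values of $h_1, h_2$ are arbitrary):

```python
import numpy as np

h1, h2 = 0.6, -0.8                             # arbitrary sample covector components
phi = np.linspace(0, 2 * np.pi, 100001)
vals = h1 * np.cos(phi) + h2 * np.sin(phi)     # u1 h1 + u2 h2 on the unit circle

norm = np.hypot(h1, h2)                        # √(h1² + h2²) = 1.0 for this sample
u_star = np.array([h1, h2]) / norm             # the claimed maximiser
print(f"grid max  = {vals.max():.8f}")
print(f"norm      = {norm:.8f}")
print(f"at u*     = {u_star @ np.array([h1, h2]):.8f}")  # u*·h = √(h1²+h2²)
```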

Substituting back, the maximised Hamiltonian is

\[\mathcal H^{\ast}(g, \lambda) \;=\; \sqrt{h_1^2 + h_2^2} - \nu \;=\; \sqrt{h_1^2 + h_2^2} - 1.\]

Standard rescaling: on the level set $\mathcal H^{\ast} = 0$, i.e. $h_1^2 + h_2^2 = 1$, the trajectories of the maximised SR Hamiltonian coincide (up to time reparametrisation) with those of $\tfrac12(h_1^2 + h_2^2)$; since the latter is conserved, $h_1^2 + h_2^2$ stays constant along its solutions and the squaring is harmless. We use the squared form:

\[\boxed{\;\mathcal H_n \;=\; \tfrac12 (h_1^2 + h_2^2).\;}\]

This is the “normal Hamiltonian” of Part 2 §1.

Lie–Poisson reduction on $\mathfrak{se}(2)^{\ast}$

Hamilton’s equations for $\mathcal H_n$ on $T^{\ast}\mathrm{SE}(2)$ are coupled equations in $(g, \lambda)$. But the LI vector fields give a global trivialisation:

\[T^{\ast}\mathrm{SE}(2) \;\xrightarrow{\sim}\; \mathrm{SE}(2) \times \mathfrak{se}(2)^{\ast}, \qquad \lambda \mapsto (g, \mu)\]

where $\mu = (h_1, h_2, h_3) := (\langle\lambda, X_1\rangle, \langle\lambda, X_2\rangle, \langle\lambda, X_3\rangle)$ in the basis dual to $\{E_1, E_2, E_3\}$. In this trivialisation the Hamiltonian flow on $T^{\ast}G$ for a left-invariant Hamiltonian (one depending only on $\mu$) decouples: the costate evolves autonomously on $\mathfrak{se}(2)^{\ast}$, and $g$ is recovered afterwards from the reconstruction equation.

For matrix Lie groups the Lie–Poisson equation in the $h_i$ coordinates reads

\[\dot h_i \;=\; -\sum_{j, k} c^k_{ij}\,h_k\,\frac{\partial H}{\partial h_j},\]

where $c^k_{ij}$ are the structure constants $[E_i, E_j] = c^k_{ij} E_k$. For $\mathfrak{se}(2)$ in the basis we computed in A1,

\[[E_1, E_2] = 0, \quad [E_3, E_1] = E_2, \quad [E_3, E_2] = -E_1,\]

and (with the LI-vector-field bracket sign convention used in Part 1, $[X_3, X_1] = X_2$) one obtains

\[\dot h_1 \;=\; h_2\,h_3, \qquad \dot h_2 \;=\; -h_1\,h_3, \qquad \dot h_3 \;=\; 0.\]

These are exactly the equations Part 2 §1 wrote down. Note $h_3$ is constant along this flow, while $h_1^2 + h_2^2$ is the Casimir of $\mathfrak{se}(2)^{\ast}$ — constant on every coadjoint orbit (the orbits are the cylinders $h_1^2 + h_2^2 = c$).

What the equations say

Set $\omega_0 := h_3$ (constant) and $c := h_1^2 + h_2^2$ (also constant since $\dot c = 2(h_1 \dot h_1 + h_2 \dot h_2) = 2(h_1 h_2 h_3 - h_2 h_1 h_3) = 0$). Parametrise

\[h_1 = \sqrt c \cos\phi, \qquad h_2 = \sqrt c \sin\phi.\]

Then $\dot h_1 = -\sqrt c \,\sin\phi\,\dot\phi = h_2 h_3 = \sqrt c \,\sin\phi \cdot \omega_0$, giving $\dot\phi = -\omega_0$. So $\phi(t) = \phi_0 - \omega_0 t$: the costate rotates uniformly around the circle $h_1^2 + h_2^2 = c$.

Differentiating once more gives the pendulum equation for the curvature (after rescaling),

\[\frac{d^2 \varphi}{ds^2} + \sin\varphi \;=\; 0,\]

with $\varphi$ a rescaled pendulum angle and $s$ a rescaled arc length. This is the same nonlinear-pendulum ODE that Part 2 §2 announced as “the equation for a nonlinear pendulum” — but now it is derived, not postulated.

$\ddot\varphi + \sin\varphi = 0$, energy $E = \tfrac12\dot\varphi^2 - \cos\varphi$
Figure A3.2. Phase portrait of the planar pendulum in coordinates $(\varphi, \dot\varphi)$ — same diagram that drives the elliptic project's physical-pendulum example. Three regimes correspond exactly to the three SE(2) elastica families of Part 2: libration ($-1 < E < 1$, blue closed orbits) — inflectional elastica; separatrix ($E = 1$, red curve) — Euler spiral; rotation ($E > 1$, green orbits) — non-inflectional elastica. The energy $E$ slider also controls the modulus $k = \sqrt{(E+1)/2}$ for libration; $k = 1$ at the separatrix; $k > 1$ in the rotation regime where the period becomes $2K(1/k^2)/k$. The period diverges at $E = 1$ (both sides) — this is the $K(k^2) \to \infty$ asymptotic of Appendix A4.
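The period claim in the caption can be verified directly (a sketch; the libration amplitude $\varphi_{\max} = 2.0$ is an arbitrary example). In the modulus-squared convention used throughout this series, $4K(k^2)$ is `4 * scipy.special.ellipk(k**2)`:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ellipk

phi_max = 2.0                        # libration amplitude (arbitrary, < π)
E = -np.cos(phi_max)                 # energy at the turning point, where φ̇ = 0
k = np.sqrt((E + 1) / 2)             # modulus; equals sin(φ_max / 2)

# exact period by quadrature:
#   T = 4 ∫₀^{φmax} dφ / √(2 (cos φ − cos φmax))
integrand = lambda phi: 1.0 / np.sqrt(2 * (np.cos(phi) - np.cos(phi_max)))
T_quad, _ = quad(integrand, 0, phi_max)
T_quad *= 4

T_elliptic = 4 * ellipk(k**2)        # 4K(k²), parameter convention
print(f"quadrature: {T_quad:.8f},  4K(k²): {T_elliptic:.8f}")
```

QUADPACK handles the inverse-square-root singularity at the turning point, so the two numbers agree to quadrature accuracy.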

The reconstruction equation

Once $\mu(t) = (h_1(t), h_2(t), h_3)$ is known, the SE(2) trajectory itself is obtained from

\[\dot g(t) \;=\; g(t)\,\xi(t), \qquad \xi(t) := u_1^{\ast}(t) E_1 + u_2^{\ast}(t) E_2,\]

with $u^{\ast}(t) = (h_1(t), h_2(t)) / \sqrt c$ from the maximisation. In the $(x, y, \theta)$ chart this is exactly Part 1’s Frenet–Serret integration:

\[\dot x = u_1^{\ast} \cos\theta, \qquad \dot y = u_1^{\ast} \sin\theta, \qquad \dot \theta = u_2^{\ast}.\]

Setting $u_1^{\ast} = 1$ (unit-speed normalisation), we get $u_2^{\ast} = \dot\theta = \kappa$, which is the curvature of the projected plane curve. The relation $\kappa = h_2 / \sqrt c$ ties the costate to the curvature directly: $\kappa$ inherits the $h_2$-pendulum dynamics and so becomes a Jacobi sn (libration), sech (separatrix), or dn (rotation) — Part 2 §2.

Left: costate on cylinder. Right: reconstructed plane curve.
Figure A3.3. Left: the costate $(h_1, h_2, h_3)$ winds on a cylinder $h_1^2 + h_2^2 = c$ at the constant rate $\dot\phi = -h_3$. Right: the plane projection of the resulting SE(2) geodesic, integrated by the reconstruction equation $\dot g = g\cdot\xi(t)$. Vary $h_3$ and the curve interpolates between near-circular (small $h_3$) and the elastica regime; vary $\sqrt c$ and the curve scales without changing shape — confirming Part 2's observation that only the dimensionless ratio $h_3 / \sqrt c$ determines which elastica family you land in.

Connection to the elliptic project

The phase portrait of Figure A3.2 is computationally identical to the one in the elliptic project’s physical-pendulum example — both compute level sets of $E = \tfrac12\dot\varphi^2 - \cos\varphi$ and trace closed orbits with period $4K(k^2)$. The SE(2) elastica problem is, structurally, the same ODE as a nonlinear pendulum: this appendix’s job has been to make that equivalence inevitable, by deriving the pendulum equation from the PMP on $\mathrm{SE}(2)$ rather than postulating it.

Once you have the pendulum, the period-and-amplitude analysis is the business of Appendix A4 (Jacobi elliptic functions) and the closed-form geodesic endpoints are the business of Appendix A5 (the SR exponential map).

Code

# Verify the Lie–Poisson equations on se(2)* numerically
# from a random initial costate (h1, h2, h3) and check that:
#   - h3(t) is constant
#   - h1²+h2² is constant
#   - h1(t) = sqrt(c) cos(-h3 t + φ0)
import numpy as np
from scipy.integrate import solve_ivp

def lie_poisson_se2(t, h):
    h1, h2, h3 = h
    return [h2*h3, -h1*h3, 0.0]

h0 = [0.6, 0.4, 0.7]
sol = solve_ivp(lie_poisson_se2, [0, 8], h0, rtol=1e-10, atol=1e-12,
                t_eval=np.linspace(0, 8, 400))

# Check Casimirs
c = sol.y[0]**2 + sol.y[1]**2
print(f"max |c - c0| / |c0| = {np.max(np.abs(c - c[0]))/c[0]:.2e}")  # ~ 1e-10
print(f"max |h3 - h30|       = {np.max(np.abs(sol.y[2] - h0[2])):.2e}")  # ~ 1e-12

# Match the closed form
phi0 = np.arctan2(h0[1], h0[0])
phi_t = phi0 - h0[2] * sol.t
h1_pred = np.sqrt(c[0]) * np.cos(phi_t)
print(f"max |h1 - prediction| = {np.max(np.abs(sol.y[0] - h1_pred)):.2e}")  # ~ 1e-10

# Reconstruct the SE(2) trajectory from the costate solution
# (same algorithm used by elliptic-core.js's integrateElastica routine)
def reconstruct_se2(h_t, t_arr):
    """Given h(t) sampled at t_arr (h_t has shape (3, N)), return (x, y, theta)."""
    c = h_t[0, 0]**2 + h_t[1, 0]**2   # h1² + h2² is constant, so evaluate at t = 0
    sqrt_c = np.sqrt(c)
    u1 = h_t[0] / sqrt_c              # optimal controls u* = (h1, h2) / √c
    u2 = h_t[1] / sqrt_c
    x, y, th = 0.0, 0.0, 0.0
    out = [(x, y, th)]
    for i in range(len(t_arr) - 1):   # midpoint rule for the heading angle
        dt = t_arr[i+1] - t_arr[i]
        thmid = th + 0.5 * u2[i] * dt
        x  += u1[i] * np.cos(thmid) * dt
        y  += u1[i] * np.sin(thmid) * dt
        th += u2[i] * dt
        out.append((x, y, th))
    return np.array(out)

traj = reconstruct_se2(sol.y, sol.t)
print(f"endpoint: x={traj[-1, 0]:.4f}, y={traj[-1, 1]:.4f}, theta={traj[-1, 2]:.4f}")

What we covered, and what comes next

The Pontryagin Maximum Principle takes an optimal-control problem and produces a Hamiltonian system on $T^{\ast}M$. For the SE(2) sub-Riemannian length problem, maximisation over $u_1, u_2$ on the unit circle gives the normal Hamiltonian $\mathcal H_n = \tfrac12(h_1^2 + h_2^2)$. Lie–Poisson reduction on $\mathfrak{se}(2)^{\ast}$ collapses the $T^{\ast}\mathrm{SE}(2)$ flow to the costate equations $\dot h_1 = h_2 h_3, \dot h_2 = -h_1 h_3, \dot h_3 = 0$ — exactly the equations Part 2 §1 wrote down. The substitution $h_1 = \sqrt c \cos\phi, h_2 = \sqrt c \sin\phi$ turns the costate into a uniformly-rotating phase, and differentiating once more gives the nonlinear pendulum equation for the curvature.

Appendix A4 will solve the pendulum equation in closed form using Jacobi elliptic functions and the AGM, recovering the period $4K(k^2)$ and the explicit $\kappa(s) = 2k\,\mathrm{sn}(s\mid k^2)$ formula of Part 2.
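As a preview of the AGM half of A4, here is a minimal sketch of the classical identity $K(m) = \pi / (2\,\mathrm{agm}(1, \sqrt{1 - m}))$, checked against scipy:

```python
import numpy as np
from scipy.special import ellipk

def agm(a, b, tol=1e-15, max_iter=60):
    # arithmetic–geometric mean: iterate (a, b) → ((a+b)/2, √(ab));
    # convergence is quadratic, so a handful of iterations suffices
    for _ in range(max_iter):
        if abs(a - b) <= tol:
            break
        a, b = (a + b) / 2, np.sqrt(a * b)
    return a

def K_agm(m):
    # complete elliptic integral of the first kind via the AGM
    return np.pi / (2 * agm(1.0, np.sqrt(1.0 - m)))

for m in [0.1, 0.5, 0.9]:
    print(f"m={m}: AGM {K_agm(m):.12f}  scipy {ellipk(m):.12f}")
```

Note both sides use the parameter convention ($m = k^2$), matching `scipy.special.ellipk` and the $K(k^2)$ notation of this series.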

References

  1. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, E. F. Mishchenko (1962). The Mathematical Theory of Optimal Processes. Wiley–Interscience. The original PMP source.
  2. A. A. Agrachev, Yu. L. Sachkov (2004). Control Theory from the Geometric Viewpoint. Springer. Chapter 12 derives the SR geodesic equations on Lie groups by exactly this route.
  3. J. E. Marsden, T. S. Ratiu (1999). Introduction to Mechanics and Symmetry. Springer. Chapters 13–14 for Lie–Poisson reduction.
  4. V. I. Arnold (1989). Mathematical Methods of Classical Mechanics. Springer GTM 60. Appendix 2 has Lie–Poisson dynamics.
  5. Yu. L. Sachkov (2010). "Conjugate and cut time in the sub-Riemannian problem on the group of motions of a plane." ESAIM: COCV 16: 1018–1039. Uses the costate ODE derived here.
  6. Elliptic project — Physical Pendulum. Phase-portrait diagram identical to Figure A3.2; same $K(k^2)$ period.