The Illusion That Started Everything
Look at the figure below. You see a bright white equilateral triangle floating above three black discs. The triangle has sharp edges and appears slightly brighter than the background. None of that is actually printed on the page.
Kanizsa (1955) — the white triangle does not exist
This is the Kanizsa triangle (Kanizsa 1955), the canonical example of modal completion — the visual system’s tendency to infer the presence of a bounding contour from a set of locally consistent cues. The phenomenon is not subtle: the perceived edges are sharp, the interior surface appears measurably lighter than the background (Cornsweet 1970), and the effect survives rotation, scaling, and partial occlusion.
The question is: what is the computational rule that produces this completion? A satisfying answer did not arrive until 2003.
Orientation Columns in V1
The primary visual cortex (V1, also called area 17 or the striate cortex) is the first cortical stage of visual processing. Hubel and Wiesel (1959, 1962) received the Nobel Prize for discovering that neurons in V1 respond selectively to oriented edges: a neuron fires most vigorously when a bar of light oriented at a particular angle passes through a small region of the visual field.
Each V1 neuron can be characterised by three numbers: \(\text{position }(x, y) \in \mathbb{R}^2 \quad\text{and}\quad \text{preferred orientation }\theta \in [0, \pi).\)
Neurons with the same preferred orientation cluster in orientation columns running perpendicular to the cortical surface. Mapped over a patch of cortex, the preferred orientations rotate continuously, completing a full $\pi$-rotation over a distance of about 1 mm. The resulting pattern of orientation preferences is called an orientation map, and it exhibits a characteristic structure of pinwheels: point singularities around which the preferred orientation rotates by $\pm\pi$ (one half-turn, since orientations are defined mod $\pi$).
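To make the pinwheel winding concrete, here is a minimal sketch. It assumes a toy pinwheel $\theta(x,y) = \tfrac12\,\mathrm{atan2}(y, x) \bmod \pi$, which is a textbook idealisation and not a fit to any recorded map; the function names are made up for this illustration. The code walks once around the singularity and adds up the change in preferred orientation.

```python
import math

def pinwheel_theta(x, y, sign=1.0):
    """Toy pinwheel (an assumption, not measured data): theta = sign * atan2(y, x) / 2, mod pi."""
    return (sign * 0.5 * math.atan2(y, x)) % math.pi

def orientation_winding(theta_fn, radius=1.0, n=360):
    """Total change of preferred orientation along one loop around the origin."""
    total = 0.0
    prev = theta_fn(radius, 0.0)
    for i in range(1, n + 1):
        phi = 2.0 * math.pi * i / n
        cur = theta_fn(radius * math.cos(phi), radius * math.sin(phi))
        # Orientations live mod pi, so take the smallest signed step in [-pi/2, pi/2).
        step = (cur - prev + math.pi / 2) % math.pi - math.pi / 2
        total += step
        prev = cur
    return total

winding_pos = orientation_winding(pinwheel_theta)
winding_neg = orientation_winding(lambda x, y: pinwheel_theta(x, y, sign=-1.0))
print(winding_pos / math.pi, winding_neg / math.pi)
```

In this toy model the two chiralities wind by $+\pi$ and $-\pi$ respectively: one loop around the singularity visits every orientation exactly once.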
The figure below shows a schematic orientation map: each short line segment represents one cortical location $(x,y)$, oriented at the local preferred angle $\theta(x,y)$. Colours encode the orientation angle on a $[0, \pi)$ hue wheel.
A key discovery from Bosking et al. (1997) is that the long-range horizontal connections in V1 (axons that reach across several millimetres of cortex) connect neurons with the same preferred orientation that lie roughly along the direction of that orientation. In other words, a neuron at $(x,y)$ preferring angle $\theta$ connects strongly to neurons at $(x', y')$ where the displacement $(x'-x, y'-y)$ is approximately parallel to $\theta$.
This is the empirical association field (Field, Hayes & Hess 1993): two oriented line elements are perceptually grouped together if they lie along a smooth curve that is tangent to both elements.
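The connectivity rule described above (displacement roughly parallel to the shared preferred orientation) can be turned into a simple score. The sketch below is only an illustration: the name `coaxial_alignment` and the choice of $|\cos|$ as the score are assumptions made here, not a model taken from Bosking et al. or Field et al.

```python
import math

def coaxial_alignment(theta, dx, dy):
    """|cos| of the angle between the displacement (dx, dy) and the orientation theta.

    Returns 1.0 when the second neuron lies along the preferred orientation
    (co-axial) and 0.0 when it lies exactly to the side (flanking).
    """
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("displacement must be non-zero")
    return abs(dx * math.cos(theta) + dy * math.sin(theta)) / r

# Horizontal edge (theta = 0): a neighbour to the right is co-axial,
# a neighbour directly above is flanking.
ahead = coaxial_alignment(0.0, 1.0, 0.0)
beside = coaxial_alignment(0.0, 0.0, 1.0)
diagonal = coaxial_alignment(math.pi / 4, 2.0, 2.0)
print(ahead, beside, diagonal)
```

Because the absolute value is taken, the score is unchanged when the displacement is reversed or $\theta$ is shifted by $\pi$, which matches the fact that a preferred orientation is a line and not an arrow.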
Petitot’s Insight: V1 Is a Contact Bundle
Jean Petitot (1999, 2003) made the key observation: the data $(x, y, \theta)$ encoding every neuron in V1 — together with the horizontal connectivity pattern just described — is not a 2D image map. It is a 3-dimensional contact manifold.
Specifically: the set of triples $(x, y, \theta) \in \mathbb{R}^2 \times S^1$ is the total space of a circle bundle over the retinal plane. It carries a contact structure $\xi = \ker(\sin\theta\,dx - \cos\theta\,dy)$: the constraint that a curve $(x(t), y(t), \theta(t))$ can only move horizontally (forward in direction $\theta$, or rotate in place) — it cannot slide sideways.
This constraint is satisfied by any curve whose position $(x,y)$ moves in the direction the neuron prefers: \(\dot x = u_1 \cos\theta, \quad \dot y = u_1 \sin\theta, \quad \dot\theta = u_2.\)
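A short numerical sketch of this control system may help. It integrates the three equations with a midpoint step and checks that the contact form $\sin\theta\,dx - \cos\theta\,dy$ vanishes along the discrete path. The function names, the step size, and the constant controls are all choices made for this example.

```python
import math

def integrate_horizontal(g0, controls, dt):
    """Integrate x' = u1 cos(theta), y' = u1 sin(theta), theta' = u2 with midpoint steps."""
    x, y, theta = g0
    path = [(x, y, theta)]
    for u1, u2 in controls:
        theta_mid = theta + 0.5 * dt * u2  # heading at the middle of the step
        x += dt * u1 * math.cos(theta_mid)
        y += dt * u1 * math.sin(theta_mid)
        theta += dt * u2
        path.append((x, y, theta))
    return path

def contact_form(theta, dx, dy):
    """Evaluate sin(theta) dx - cos(theta) dy on a planar velocity (dx, dy)."""
    return math.sin(theta) * dx - math.cos(theta) * dy

# Hypothetical test drive: unit forward speed, constant turning rate 0.5
# (a circular arc of radius 2), 200 steps of size 0.01.
path = integrate_horizontal((0.0, 0.0, 0.0), [(1.0, 0.5)] * 200, 0.01)

# Residual of the contact form on each discrete step, using the midpoint heading.
max_residual = max(
    abs(contact_form(0.5 * (a[2] + b[2]), b[0] - a[0], b[1] - a[1]))
    for a, b in zip(path, path[1:])
)
print(path[-1], max_residual)
```

By contrast, a purely sideways velocity such as `contact_form(0.0, 0.0, 1.0)` evaluates to $-1$, so it violates the constraint.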
The bundle $(\mathbb{R}^2 \times S^1, \xi)$ is canonically isomorphic (as a contact manifold) to the unit cotangent bundle $ST^*\mathbb{R}^2$, and its symmetry group is SE(2) — the Lie group of orientation-preserving rigid motions of the plane.
The Lie Group SE(2)
What SE(2) is, concretely
A point of $\mathrm{SE}(2)$ is a triple $(x, y, \theta)$: a position $(x, y) \in \mathbb{R}^2$ together with an orientation $\theta \in S^1$. This is exactly the data describing one V1 neuron in Figure 2 — its retinal position and its preferred edge orientation. As a $3\times3$ matrix the configuration $g$ acts on the plane as a rotation followed by a translation:
\[g \;=\; \begin{pmatrix} \cos\theta & -\sin\theta & x \\ \sin\theta & \cos\theta & y \\ 0 & 0 & 1 \end{pmatrix}.\]
The two motions that are directly available
The horizontal connectivity of V1 (Bosking et al. 1997) lets the cortex move its neural locus in only two ways:
- $X_1$ — slide the locus along the current preferred orientation: the position changes, the orientation stays the same.
- $X_2$ — rotate the preferred orientation in place: the position stays the same, the orientation changes.
In $(x, y, \theta)$ coordinates these are the two left-invariant vector fields
\[X_1 \;=\; \cos\theta\,\partial_x + \sin\theta\,\partial_y, \qquad X_2 \;=\; \partial_\theta.\]
The third direction, sliding the locus perpendicular to the current orientation, is forbidden:
\[X_3 \;:=\; -\sin\theta\,\partial_x + \cos\theta\,\partial_y \;\notin\; \mathrm{span}\{X_1, X_2\}.\]
V1 has no “move my edge sideways” connection at this level.
How sideways motion still happens — the Lie bracket
The cortex can nevertheless reach a perpendicularly-displaced neuron, by a short loop of the two moves it does have. The figure below shows the canonical four-step manoeuvre
\[+\varepsilon X_1 \;\to\; +\varepsilon X_2 \;\to\; -\varepsilon X_1 \;\to\; -\varepsilon X_2 .\]
Each leg is small (length $\varepsilon$). The net effect is to leave the orientation unchanged but to displace the position by an amount of order $\varepsilon^2$ in the perpendicular direction:
\[\Phi^{X_2}_{-\varepsilon} \!\circ\! \Phi^{X_1}_{-\varepsilon} \!\circ\! \Phi^{X_2}_{\varepsilon} \!\circ\! \Phi^{X_1}_{\varepsilon}\;(g_0) \;=\; g_0 \,-\, \varepsilon^2\, X_3 \,+\, O(\varepsilon^3).\]
The infinitesimal limit of this loop is exactly the Lie bracket of $X_1$ and $X_2$, which equals $X_3$ up to sign:
\[[X_1, X_2] \;=\; -X_3 \;=\; \sin\theta\,\partial_x - \cos\theta\,\partial_y.\]
Why the green field stroke is tangent at the endpoints but not in between. The green stroke at the moving point is $\theta_{\mathrm{field}}(x(s), y(s))$, the orientation that the cortex assigns to the neuron sitting at the planar position $(x(s), y(s))$. The blue arrow $T$ is the orientation that the geodesic happens to have at this $s$. These are two different things:
- The orientation field $\theta_{\mathrm{field}}$ is a fixed scalar function on the $(x, y)$ retinal plane — it labels each cortical column with its preferred orientation.
- The geodesic's tangent $\theta(s)$ is the third coordinate of a curve in the 3-D contact bundle $\mathrm{SE}(2)$. It is determined by Pontryagin's maximum principle (i.e. by the SR Hamiltonian flow, which gives $\kappa(s) = d\theta/ds$ as one of the elastica functions) — not by what the field happens to read at $(x(s), y(s))$.
Maxwell-pair mode. When the dropdown is set to the Maxwell pair, two frames advance in lock-step along the two members of the pair $\gamma_A, \gamma_B$ (frame labels become $T_A, N_A$ and $T_B, N_B$). Both close back to the source after the same arc length $L = 4K(k_c^2)/\omega$, but their final tangents differ by exactly $\pi$: $\theta_A(L) - \theta_B(L) = \pi \pmod{2\pi}$. That is the whole point of the pair. The two geodesics are reversed-heading partners: they leave the source in opposite directions and traverse $\sigma$-mirrored figure-8 loops. Because a V1 neuron's preferred orientation is a line, identified mod $\pi$, the two endpoint headings $\theta_{\mathrm{src}}$ and $\theta_{\mathrm{src}} + \pi$ represent the same neuron. The pair therefore gives two genuinely distinct closed SR-shortest paths from the source neuron to itself, with identical arc length: a Maxwell pair on $\mathrm{SE}(2)$ (Sachkov, ESAIM:COCV 2008, Fig. 34).
Hörmander condition ⇒ V1 is a contact manifold
Because $[X_1, X_2]$, which equals $X_3$ up to sign, is not in $\mathrm{span}\{X_1, X_2\}$ but the three together $\{X_1,\, X_2,\, [X_1, X_2]\}$ span the full tangent space $T_g\,\mathrm{SE}(2) \cong \mathbb{R}^3$ at every $g$, the 2-plane field
\[\mathcal{H} \;:=\; \mathrm{span}\{X_1, X_2\}\]
satisfies the Hörmander (bracket-generating) condition. By the Chow–Rashevskii theorem (Rashevskii 1938, Chow 1939) any two configurations in $\mathrm{SE}(2)$ can therefore be joined by a horizontal path, that is, a curve whose velocity lies in $\mathcal{H}$ at every point. Geometrically, $\mathcal{H}$ is a contact structure on $\mathrm{SE}(2)$, and Petitot’s key insight is that the V1 cortex is this contact manifold.
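The bracket-generating property can be probed numerically. The sketch below composes the exact flows of $X_1$ and $X_2$ in the four-step loop (slide, rotate, slide back, rotate back) and splits the net displacement into a forward and a sideways part. It also checks that $X_1, X_2, X_3$ are linearly independent. The start point, $\varepsilon$, and all function names are arbitrary choices for this illustration.

```python
import math

def flow_x1(g, t):
    """Exact flow of X1: slide along the current heading for time t."""
    x, y, th = g
    return (x + t * math.cos(th), y + t * math.sin(th), th)

def flow_x2(g, t):
    """Exact flow of X2: rotate in place for time t."""
    x, y, th = g
    return (x, y, th + t)

def commutator_loop(g0, eps):
    """Apply +eps X1, +eps X2, -eps X1, -eps X2 in that order."""
    g = flow_x1(g0, eps)
    g = flow_x2(g, eps)
    g = flow_x1(g, -eps)
    return flow_x2(g, -eps)

def frame_det(th):
    """Determinant of the rows X1, X2, X3 written in (x, y, theta) components."""
    a = (math.cos(th), math.sin(th), 0.0)
    b = (0.0, 0.0, 1.0)
    c = (-math.sin(th), math.cos(th), 0.0)
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

g0 = (0.3, -0.2, 0.7)  # arbitrary start configuration
eps = 1e-3
g1 = commutator_loop(g0, eps)
dx, dy, dth = g1[0] - g0[0], g1[1] - g0[1], g1[2] - g0[2]

# Components of the net planar displacement in the moving frame at g0.
forward = dx * math.cos(g0[2]) + dy * math.sin(g0[2])    # along X1
sideways = -dx * math.sin(g0[2]) + dy * math.cos(g0[2])  # along X3
print(forward, sideways / eps**2, dth, frame_det(g0[2]))
```

The heading returns to its start value, the forward component is of order $\varepsilon^3$, and the sideways component has magnitude $\varepsilon^2$ to leading order. With the loop taken in this order it comes out along $-X_3$; running the loop with $X_2$ first flips the sign.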
The Sub-Riemannian Metric and the Minimisation Problem
We equip the horizontal distribution $\mathcal{H}$ with the left-invariant inner product: \(\langle u_1 X_1 + u_2 X_2,\; u_1 X_1 + u_2 X_2 \rangle = u_1^2 + u_2^2.\)
This defines a sub-Riemannian (SR) metric on SE(2): the length of a horizontal curve is $\int_0^T \sqrt{u_1^2 + u_2^2}\,dt$, and the SR distance between two points is the infimum of lengths over all horizontal paths.
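As a rough illustration of how this length behaves, the following sketch evaluates the integral for piecewise-constant controls and compares it with the Euclidean length of the projected plane curve. The helper names and the three sample manoeuvres are assumptions made for the example.

```python
import math

def sr_length(controls, dt):
    """Riemann sum of sqrt(u1^2 + u2^2) dt over piecewise-constant controls."""
    return sum(math.hypot(u1, u2) for u1, u2 in controls) * dt

def planar_length(controls, dt):
    """Euclidean length of the projected curve: only forward motion moves (x, y)."""
    return sum(abs(u1) for u1, _ in controls) * dt

n = 1000
straight = [(1.0, 0.0)] * n   # slide forward for total time 1
turn = [(0.0, 1.0)] * n       # rotate in place through pi/2
arc = [(1.0, 1.0)] * n        # unit-radius arc for total time 1

len_straight = sr_length(straight, 1.0 / n)
len_turn = sr_length(turn, (math.pi / 2) / n)
len_arc = sr_length(arc, 1.0 / n)
print(len_straight, len_turn, len_arc)
print(planar_length(turn, (math.pi / 2) / n), planar_length(arc, 1.0 / n))
```

Rotating in place has zero planar length but SR length $\pi/2$, and the unit-radius arc costs $\sqrt{2}$ for one unit of planar length: turning is paid for on the same footing as moving forward.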
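The visual completion problem now takes a precise form: given two oriented line elements $(x_0,\theta_0)$ and $(x_1,\theta_1)$, find a horizontal curve joining them that minimises the SR length $\int_0^T \sqrt{u_1^2 + u_2^2}\,dt$.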
The spatial projection $(x(t), y(t))$ of the solution is the perceptually completed contour that the visual system infers between two oriented line elements $(x_0,\theta_0)$ and $(x_1,\theta_1)$.
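Imposing unit forward speed $u_1 = 1$ makes the parameter the Euclidean arc length $s$ of the projected curve, and $\kappa = u_2/u_1 = \dot\theta$ becomes its signed curvature. The squared-speed cost $\int (u_1^2 + u_2^2)\,dt$ then reads $\int_0^L (1 + \kappa^2)\,ds = L + \int_0^L \kappa^2(s)\,ds$, so for a projected curve of fixed total length $L$ the problem reduces to minimising \(\int_0^L \kappa^2(s)\,ds.\) This is the Euler elastica functional: the total squared bending energy of the projected curve.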
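A quick worked example, added here for illustration with an arbitrarily chosen radius $R$ and turning angle $\varphi$: take a circular arc traversed at unit forward speed. Then $\kappa = 1/R$ is constant and $L = R\varphi$, so the bending energy and the SR length (the length integral above with $u_1 = 1$, $u_2 = \kappa$) are
\[\int_0^L \kappa^2\,ds \;=\; \frac{L}{R^2} \;=\; \frac{\varphi}{R}, \qquad \int_0^L \sqrt{1 + \kappa^2}\,ds \;=\; L\sqrt{1 + \tfrac{1}{R^2}} \;=\; \varphi\sqrt{R^2 + 1}.\]
For $R = 2$ and $\varphi = 1$ this gives a planar length $L = 2$, a bending energy of $1/2$, and an SR length of $\sqrt{5} \approx 2.236$. Tighter arcs (smaller $R$) through the same angle cost more bending energy, which is why the completed contour prefers gentle curves.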
From Petitot’s Model to the Open Problem
Petitot’s paper established the model. The subsequent mathematical programme — carried out primarily by Sachkov and collaborators — aimed to characterise the global structure of this SR geometry: which geodesics are actually optimal (globally length-minimising), and up to what arc length?
In every statement that follows, “length” means the sub-Riemannian length, the only length the visual cortex’s contact-bundle geometry actually defines. For a horizontal curve $\gamma : [0, T] \to \mathrm{SE}(2)$ with controls $(u_1, u_2)$ along the horizontal frame $\{X_1, X_2\}$,
\[L_{\mathrm{SR}}(\gamma) \;=\; \int_{0}^{T} \sqrt{u_1^{2} + u_2^{2}}\,dt.\]
Under the standard unit-speed parametrisation $u_1^{2} + u_2^{2} = 1$, this collapses to $L_{\mathrm{SR}}(\gamma) = T$, the SR arc-length parameter. Crucially, Euclidean arc length on the projected plane curve $(x(s), y(s))$ is not what gets minimised; only horizontal motion (forward + rotation) costs anything, and “sliding sideways” is forbidden, not free.
This requires understanding two geometric loci, both defined with respect to $L_{\mathrm{SR}}$:
- The conjugate locus: points beyond which a geodesic is no longer locally length-minimising. These are the points where infinitesimally nearby geodesics issuing from the same initial point reconverge, i.e. where the SR exponential map ceases to be a local diffeomorphism.
- The cut locus (or Maxwell locus): points beyond which the geodesic is no longer globally length-minimising, because at least one other horizontal curve from the same start point reaches the same end point with equal or smaller $L_{\mathrm{SR}}$. At a Maxwell point, two distinct globally-optimal geodesics meet with exactly the same SR length.
For a sub-Riemannian manifold as symmetric as SE(2), the cut and Maxwell loci coincide (this is part of what Sachkov and I proved in arXiv:0807.4731). The first point on the cut locus along a given geodesic is the cut time $t_\mathrm{cut}$, and it equals the first time the exponential map is no longer injective.
The beautiful — and still incomplete — result is:
The four parts of this series develop the full story:
- This article: the model, the geometry, the problem statement.
- Part 2 (Euler’s Elastica): deriving the pendulum equation and solving it with Jacobi elliptic functions $\mathrm{sn}(s\mid k^2)$, $\mathrm{cn}(s\mid k^2)$, $\mathrm{dn}(s\mid k^2)$.
- Part 3 (Maxwell Strata): characterising the locus where two geodesics of equal length meet; the role of the discrete symmetry group of SE(2); proof that the first Maxwell time is $4K(k^2)/\omega_0$.
- Part 4 (The Open Problem): what is proved, what is conjectured, and the remaining analytic difficulty near the boundary of the abnormal set.
References
- J. Petitot (2003). "The neurogeometry of pinwheels as a sub-Riemannian contact structure." Journal of Physiology–Paris 97(2–3): 265–309.
- I. Moiseev & Yu. L. Sachkov (2010). "Maxwell strata in sub-Riemannian problem on the group of motions of a plane." ESAIM: COCV 16(2): 380–399. arXiv:0807.4731
- G. Citti & A. Sarti (2006). "A cortical based model of perceptual completion in the roto-translation space." J. Math. Imaging Vision 24(3): 307–326.
- D. H. Hubel & T. N. Wiesel (1962). "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex." J. Physiology 160: 106–154.
- D. J. Field, A. Hayes & R. F. Hess (1993). "Contour integration by the human visual system: Evidence for a local 'association field'." Vision Research 33(2): 173–193.
- W. H. Bosking et al. (1997). "Orientation selectivity and the arrangement of horizontal connections in tree shrew striate cortex." J. Neuroscience 17(6): 2112–2127.
- G. Kanizsa (1955). "Margini quasi-percettivi in campi con stimolazione omogenea." Rivista di Psicologia 49: 7–30.
- Yu. L. Sachkov (2011). "Cut locus and optimal synthesis in the sub-Riemannian problem on the group of motions of a plane." ESAIM: COCV 17(2): 293–321. arXiv:0903.0727