GPU Asteroid Swarm
via Keplerian Ellipses & Elliptic Integrals

Over a million catalogued small bodies orbit the Sun. Their first-order motion is governed by Keplerian ellipses, and every geometric quantity on those ellipses — arc length, perimeter, mean-anomaly mapping — reduces to an incomplete elliptic integral. When we want to propagate the whole population across thousands of epochs at once, the workload becomes massively parallel. This page uses the elliptic package to run that pipeline on GPU.

Part of moiseevigor/elliptic · Data: JPL SBDB, MPC · Validation: JPL Horizons
What this page is. A throughput benchmark for batched Keplerian / osculating-ellipse propagation, with elliptic integrals driving the geometric work. What it is not. A full n-body integrator, a chaos-aware long-term solver, or a substitute for JPL Horizons / SPICE when high-precision ephemerides are required.

The Dataset — Real Small Bodies

Before we benchmark anything we need a population to propagate. Public small-body catalogs list over a million asteroids with measured orbital elements; we pull the currently known, bound population ($e < 1$) from the JPL Small-Body Database Query API and keep only the six Keplerian elements plus the epoch:

$$\{\,a,\ e,\ i,\ \Omega,\ \omega,\ M_0,\ t_0\,\}.$$

$a$ is the semi-major axis, $e$ the eccentricity, $i$ the inclination, $\Omega$ the longitude of the ascending node, $\omega$ the argument of perihelion, $M_0$ the mean anomaly at epoch $t_0$. Given those seven numbers, the state of every body at any time $t$ follows from Kepler's equation — the only thing that is ever body-specific is solving it:

$$M(t) = M_0 + n\,(t - t_0), \qquad M = E - e\,\sin E,\qquad n=\sqrt{GM_\odot/a^3}.$$
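That solve is a short Newton loop on Kepler's equation; a minimal NumPy sketch (illustrative helper, not the package's API):

```python
import numpy as np

def solve_kepler(M, e, iters=12):
    """Solve Kepler's equation M = E - e*sin(E) for E (fixed-iteration Newton)."""
    E = M + e * np.sin(M)  # good starting guess for moderate eccentricity
    for _ in range(iters):
        E = E - (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
    return E

M = np.array([1.0, 2.0, 3.0])
E = solve_kepler(M, 0.2)
# residual of Kepler's equation is at machine precision after 12 iterations
print(np.abs(E - 0.2 * np.sin(E) - M).max())
```

A fixed iteration count (rather than a convergence test) is what makes the same loop batch cleanly on GPU later: every body-epoch does identical work.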

Figure 1 plots that population in the $(a,\,e)$ plane, with the inner planets marked on the x-axis for scale. Figure 2 shows the marginal distributions in eccentricity and inclination.

Figure 1. Every point is one asteroid from a stratified subset of the SBDB catalog. Planets are anchored on the top axis to give the $a$-scale a physical reference: the main belt sits between Mars (1.52 AU) and Jupiter (5.20 AU). Inside the belt the Kirkwood gaps (2.50, 2.82, 2.95, 3.27 AU) stand out, and so do the Hilda cluster near 3.96 AU and the Jupiter Trojans near 5.2 AU. Near-Earth and Mars-crosser populations sit to the left of the belt. Tap the button in the top-right to expand any figure on this page to full-screen.
Figure 2. Marginal histograms of eccentricity (left) and inclination (right). The vast majority of main-belt bodies have $e \lesssim 0.25$ and $i \lesssim 15°$; the long tails are NEAs, Mars-crossers and high-inclination families.

GPU Benchmark — Bodies × Epochs

With a population in hand we can measure throughput. The total work is $N_\text{bodies}\!\times\!N_\text{epochs}$ body-epochs. A modest 100k bodies × 2048 epochs is already 200 M body-epochs; at 1 M bodies and hour-resolution over a decade, we cross $10^{10}$. On CPU these sizes run in tens of seconds to minutes; on GPU they collapse to hundreds of milliseconds.
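The two sizes quoted above, in plain arithmetic (8,766 hours per Julian year):

```python
# 100k bodies at 2048 epochs
small = 100_000 * 2048
print(small)                     # 204,800,000 body-epochs (~2e8)

# 1M bodies at hour resolution over a decade
decade_hours = 10 * 8766         # 10 Julian years of hourly epochs
large = 1_000_000 * decade_hours
print(large)                     # 87,660,000,000 body-epochs (~8.8e10)
```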

Each body-epoch inside the kernel does three things: (i) a fixed-iteration Kepler-equation Newton solve, (ii) a rotation from the orbital plane to the inertial frame, and (iii) an incomplete elliptic integral of the second kind $E(E_{\!k}\!\mid e^2)$ for arc length from perihelion (the same integral we'll visualise in §4). The elliptic integral is supplied by elliptic.elliptic12, which is backend-agnostic — the same call dispatches to NumPy, to a CUDA PyTorch tensor, or to JAX.

Python · PyTorch CUDA — Batched propagation + arc length
import torch
from elliptic import elliptic12

dev = 'cuda'
a, e, inc, Om, w, M0, t0 = [torch.as_tensor(col, device=dev) for col in cols]  # catalog element columns
t = torch.linspace(0, 3650, 2048, device=dev)

n = torch.sqrt(0.0002959122 / a**3)              # k^2 = GM_sun [AU^3/day^2]
M = M0[:, None] + n[:, None] * (t[None, :] - t0[:, None])
M = (M + torch.pi) % (2 * torch.pi) - torch.pi   # wrap to [-pi, pi)

E = M + e[:, None] * torch.sin(M)
for _ in range(12):                 # Newton, fixed iters → tracing-safe
    E = E - (E - e[:, None]*torch.sin(E) - M) / (1 - e[:, None]*torch.cos(E))

# Incomplete E(E_k | e^2) — arc length from perihelion
_, E_inc, _ = elliptic12(E, (e**2)[:, None].expand_as(E))
L = a[:, None] * E_inc         # (N_bodies, N_epochs)
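Stage (ii), the rotation from the orbital plane to the inertial frame, is elided from the snippet above. A minimal NumPy sketch (function name and layout are illustrative, not the package's API): the in-plane position follows directly from the eccentric anomaly, and the composite rotation $R_z(\Omega)\,R_x(i)\,R_z(\omega)$ is written out component-wise.

```python
import numpy as np

def perifocal_to_inertial(a, e, inc, Om, w, E):
    """Rotate the in-plane position at eccentric anomaly E into the
    inertial frame via Rz(Om) @ Rx(inc) @ Rz(w), component-wise."""
    xp = a * (np.cos(E) - e)                   # toward perihelion
    yp = a * np.sqrt(1.0 - e**2) * np.sin(E)   # in-plane, 90 deg ahead
    cO, sO = np.cos(Om), np.sin(Om)
    ci, si = np.cos(inc), np.sin(inc)
    cw, sw = np.cos(w), np.sin(w)
    x = (cO*cw - sO*ci*sw) * xp - (cO*sw + sO*ci*cw) * yp
    y = (sO*cw + cO*ci*sw) * xp - (sO*sw - cO*ci*cw) * yp
    z = si*sw * xp + si*cw * yp
    return np.stack([x, y, z], axis=-1)
```

Writing out the nine matrix entries instead of chaining three `(3, 3)` matmuls keeps the operation fully broadcastable over a `(N_bodies, N_epochs)` grid of anomalies.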
Figure 3. Throughput in millions of body-epochs per second. On CPU the pipeline is memory-bound by the elliptic-integral pass; on CUDA the arithmetic intensity collapses the total into the tens-of-ms range and saturates quickly with problem size.
Figure 4. GPU speedup over the NumPy CPU baseline as a function of total problem size. The dashed lines mark 10× and 30×. Above roughly $10^6$ body-epochs the fixed launch/dispatch cost is amortized and the curve flattens.
Figure 5. Runtime split by stage at the largest body count recorded in the benchmark (blue: Kepler propagation; orange: incomplete-elliptic-integral arc length). The elliptic-integral pass is the heavier of the two stages on all CPU backends, which is exactly why moving elliptic12 onto CUDA dominates the end-to-end speedup.
backend · N bodies · N epochs · propagate [ms] · arc length [ms] · total [ms] · throughput [M b·ep/s] · speedup

The Swarm in 3D

Numbers are one thing; seeing the population move is another. The viewer below runs the exact pipeline the Python benchmark measures — Kepler's equation solved for every body, every frame — just with a smaller subset sized for a phone or laptop GPU. The visible structures (belt concentration, Hildas, Trojans at Jupiter's $L_4$/$L_5$ points) are real population features, not artefacts of the renderer.

Drag to rotate, scroll to zoom. Zoom in close enough and the planet spheres gain name labels. The Time rate slider controls how many days of simulation advance per real-time second; Reference orbits toggles the dashed planetary orbit rings; Planets hides them entirely.

Figure 6. Live Keplerian propagation of a stratified 5000-body subset. Planetary positions (Mercury…Jupiter) use J2000 mean elements in the same inertial frame, so their rings line up with the asteroid swarm by construction. Default time rate is 100 days per second, tunable up to 3000 d/s — useful for watching Trojans librate around Jupiter's $L_4$/$L_5$ points.

Arc Length & Perimeter via Elliptic Integrals

The benchmark above spent most of its time inside one integral. Let's look at what that integral is and what it produces. For a Keplerian orbit with semi-major axis $a$ and eccentricity $e$, the arc length from perihelion to eccentric anomaly $E$ is

$$L(E) \;=\; a\;E\!\bigl(E\,\bigl|\,e^2\bigr)\;=\; a\!\int_0^{E}\!\!\sqrt{1 - e^2\sin^2\theta}\,\mathrm{d}\theta,$$

and the full orbital perimeter — the closed-form version of that same integral evaluated at $E = 2\pi$ — is

$$P \;=\; 4a\,E(e^2).$$

Both are obtained from one call to elliptic12 (or equivalently carlsonRF/carlsonRD). The panel below compares the exact elliptic result (red) against the common circular proxy $aE$ (grey dashed), which would give $2\pi a$ as the perimeter — adequate near $e=0$, wildly wrong at cometary eccentricities. Try the high-eccentricity NEA option to see the two curves diverge.
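The same two quantities can be cross-checked against SciPy, whose `ellipe`/`ellipeinc` share the $m = e^2$ parameter convention used here (an independent sketch, not the elliptic package's API):

```python
import numpy as np
from scipy.special import ellipe, ellipeinc

a = 2.5                                   # illustrative semi-major axis [AU]
for e in (0.05, 0.25, 0.60, 0.95):
    P = 4.0 * a * ellipe(e**2)            # exact perimeter, m = e^2
    rel = 1.0 - P / (2.0 * np.pi * a)     # shortfall vs. the circular 2*pi*a
    print(f"e={e:.2f}  P={P:.4f} AU  below circular by {rel:.1%}")

# consistency: the incomplete integral over a full turn recovers the perimeter
assert np.isclose(a * ellipeinc(2.0 * np.pi, 0.36), 4.0 * a * ellipe(0.36))
```

At $e = 0$ the two agree exactly ($4E(0) = 2\pi$); the shortfall grows monotonically with eccentricity, which is the divergence the panel below visualises.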

Figure 7. Left: Keplerian ellipse (Sun at one focus) with the arc from perihelion traced in red up to the current eccentric anomaly $E$. Right: normalised arc length $L(E)/a = E(E\mid e^2)$ (red, solid) versus the circular proxy $E$ (grey, dashed); the y-axis is fixed at $[0,\,2\pi]$ so the curve shape — controlled entirely by $e$ — is directly comparable across asteroids. The dashed horizontal marks the orbit perimeter $P/a = 4E(e^2) < 2\pi$, which falls further below $2\pi$ as eccentricity grows. The exact arc length $L$ and the circular approximation are shown in AU in the corner, alongside the relative error at the current $E$.

Validation — How Far Can Pure Kepler Take You?

Before you use any of this, it's worth asking how far a pure-Kepler propagator can be trusted. Over short horizons the osculating-ellipse solution tracks the true ephemeris to within fractions of an AU. Over decades, planetary perturbations accumulate; over centuries they dominate. The table below compares the pure-Kepler prediction against JPL Horizons on a small set of named asteroids for a single test date.

Object · ΔT [yr] · pure-Kepler r [AU] · Horizons r [AU] · |Δr| [AU]

Green: $|\Delta r| < 10^{-4}$ AU (geometry-quality agreement).  Amber: $10^{-4} \le |\Delta r| < 10^{-2}$ AU (secular drift starting to matter).  Red: $|\Delta r| \ge 10^{-2}$ AU (planetary perturbations dominate).

The lesson is neither that Keplerian propagation is useless nor that it replaces Horizons — it is that for geometric questions (arc length, perimeter, phase coverage, statistical population properties), a pure-ellipse pipeline is the right tool, and the throughput gain from GPU is enormous.

Reproduce

Every figure and table on this page is produced by scripts in the repository. Pick the stack you already have — Python or MATLAB / Octave — and follow the five steps in order. Steps 1–2 are shared (they produce the JSON files the page reads); steps 3–4 are where the two stacks diverge; step 5 just serves the page locally.

You need a recent Python 3 (for the data-preparation scripts, even in the MATLAB path) and, for GPU timings, either a CUDA-capable PyTorch install (Python) or MATLAB's Parallel Computing Toolbox with gpuArray support (MATLAB). Everything also runs CPU-only — the GPU columns are simply left empty.

1. Install the package and GPU deps

Clone the repo, then install the Python package (editable) plus your preferred tensor backend.

git clone https://github.com/moiseevigor/elliptic && cd elliptic
pip install -e .
# optional GPU backend — pick one that matches your CUDA version
pip install torch                       # CUDA-enabled build from pytorch.org
pip install -U "jax[cuda12]"             # alternative: JAX

2. Fetch the catalog and build a browser subset

Download real orbital elements from JPL SBDB and create the compact JSON the 3-D viewer on this page consumes.

python3 scripts/prepare_asteroid_dataset.py --max 200000
python3 scripts/sample_asteroid_subset.py    --n   10000

3. Run the GPU benchmark

Times NumPy CPU, PyTorch CPU and PyTorch CUDA on a sweep of body/epoch sizes; writes benchmark_results.json.

python3 python/examples/gpu_asteroid_benchmark.py \
    --bodies 10000,100000,1000000 \
    --epochs 128,512,2048

4. Validate against JPL Horizons

Queries Horizons for a handful of named asteroids and compares them to the pure-Kepler prediction for the given date.

python3 python/examples/gpu_asteroid_validation.py --date 2026-01-01

5. Serve the page locally

The page reads the JSON outputs from the previous steps. Any static web server will do.

cd examples/gpu-asteroid-swarm && python3 -m http.server 8080
# then open http://localhost:8080/

1. Clone and add the MATLAB paths

The MATLAB sources live under matlab/. Add them once per session; the commands below also work on GNU Octave 8+.

% from a shell
git clone https://github.com/moiseevigor/elliptic && cd elliptic

% inside MATLAB / Octave
addpath('matlab/src');
addpath('matlab/examples');

2. Fetch the catalog (Python helper, one-off)

The data scripts are Python. Run them once; MATLAB then reads the resulting .npz and the subset JSON.

% from a shell
python3 scripts/prepare_asteroid_dataset.py --max 200000
python3 scripts/sample_asteroid_subset.py    --n   10000

3. Run the CPU vs gpuArray benchmark

Writes benchmark_results_matlab.json next to the Python output. Falls back to synthetic elements if the .npz is missing.

gpu_asteroid_benchmark('bodies', [1e4 1e5 1e6], ...
                       'epochs', [128 512 2048]);

4. Validate against JPL Horizons

Same dataset, same comparison format as the Python validator.

gpu_asteroid_validation();

5. Serve the page locally

The page itself is static HTML/JS and works the same regardless of which stack populated the JSON files.

% from a shell, inside the repo
cd examples/gpu-asteroid-swarm
python3 -m http.server 8080
% then open http://localhost:8080/

All code is open-source under the repository license. Data courtesy of the JPL Solar System Dynamics group (SBDB, Horizons) and the Minor Planet Center (MPCORB).