Determinism And Reproducibility

vibeSpatial defaults to maximum throughput, not maximum reproducibility.

Intent

Define when GPU-oriented operations may be nondeterministic, what deterministic mode guarantees, and how future kernels should implement reproducible reductions without pretending cross-device bitwise equality exists.

Request Signals

  • determinism

  • reproducibility

  • bitwise identical

  • reduction order

  • atomics

  • stable output

Open First

  • docs/architecture/determinism.md

  • src/vibespatial/runtime/determinism.py

  • scripts/check_determinism.py

  • docs/architecture/runtime.md

  • docs/architecture/precision.md

Verify

  • uv run pytest tests/test_determinism_policy.py -q

  • uv run python scripts/check_determinism.py --rows 512 --groups 32 --repeats 100

  • uv run python scripts/check_docs.py --check

Risks

  • Claiming cross-device reproducibility would be false; different architectures and driver stacks may choose different legal implementations.

  • Hidden nondeterminism in reductions, scans, or floating atomics makes debugging and scientific verification much harder.

  • Forcing deterministic order everywhere would waste performance on operations that are not actually sensitivity-bound.

Canonical Rule

  • Default mode is performance-first.

  • Deterministic mode is explicit and opt-in.

  • Deterministic mode guarantees bitwise-identical output only for the same input, same device architecture, and same driver/runtime stack.

  • Cross-device reproducibility is not guaranteed.

The runtime flag is:

  • VIBESPATIAL_DETERMINISM=default|deterministic

What Changes In Deterministic Mode

For affected kernels, deterministic mode requires:

  • stable output order

  • fixed reduction order

  • fixed scan order

  • no floating-point atomics as the final accumulation mechanism

  • explicit restore-order after compaction or partitioning

Preferred implementation patterns:

  • CCCL or CUB stable sort before group reduction

  • fixed tree-reduction shapes

  • sorted-key gather or reduce-by-key for grouped aggregates

  • staged ambiguity compaction followed by deterministic restore

Operations Affected

The main affected categories are:

  • metric reductions

    • area totals

    • length totals

    • grouped sums and counts

  • structured constructive work

    • dissolve

    • overlay area totals

    • any grouped union pipeline that reduces or restores rows

  • query aggregation surfaces

    • spatial join count/sum style kernels

Pure geometry-local coarse work is usually not determinism-sensitive unless it changes emit order.

Performance Budget

Deterministic mode is allowed to cost more.

Repo policy:

  • up to 2x overhead is acceptable for reduction-heavy metric and constructive kernels

  • up to 1.5x overhead is acceptable for order-sensitive coarse or predicate paths

Each affected kernel or pipeline should publish its measured overhead once a GPU implementation exists.

Current Baseline

The current dissolve baseline is CPU-hosted and already stable in row order, so the reproducibility probe in scripts/check_determinism.py is mainly proving the contract and artifact shape today:

  • repeated dissolve + area output hashing

  • same-device bitwise check

  • default vs deterministic elapsed comparison

That is the proof surface future GPU reductions should keep green as they land.