Determinism And Reproducibility Policy

Context

GPU-first geometry systems eventually hit nondeterministic arithmetic surfaces:

  • floating-point reductions

  • scan-like accumulation

  • compaction and restore order

  • floating atomics

Those are acceptable in the default fast path, but not acceptable for every user. Scientific, debugging, compliance, and regression-triage workflows need a clear determinism contract before more GPU reductions land.

The repo already had pieces of this implicitly:

  • stable sorting in dissolve planning

  • explicit row restoration in staged GPU-friendly designs

  • precision and robustness policies that distinguish correctness from speed

What was missing was a single policy defining when reproducibility is promised and what the guarantee actually means.

Decision

Adopt a two-mode determinism policy:

  • default

    • performance-first

    • allows faster reduction and scan implementations

    • does not promise bitwise-identical output for reduction-sensitive GPU work

  • deterministic

    • explicit opt-in via VIBESPATIAL_DETERMINISM=deterministic

    • requires stable output order plus fixed reduction and scan order for affected kernels

    • forbids floating-point atomics as the final accumulation mechanism

    • guarantees bitwise-identical output only for same input, same device architecture, and same driver/runtime stack

Cross-device bitwise reproducibility is explicitly rejected as a contract.

Consequences

  • determinism becomes a named runtime policy instead of an implementation rumor

  • future GPU reductions must declare whether they honor deterministic mode and what overhead they incur

  • kernel authors now have an explicit preferred implementation shape: stable sort, fixed-order reduce, deterministic restore

  • debugging and CI now have a single probe command for the baseline dissolve + area reproducibility path

Alternatives Considered

  • promise cross-device reproducibility

    • rejected as not technically defensible

  • force deterministic order in all modes

    • rejected because it would waste throughput on unaffected operations

  • leave determinism undocumented until more GPU reductions land

    • rejected because the performance-first default needs an explicit counter-mode

Acceptance Notes

The decision adds:

  • src/vibespatial/runtime/determinism.py

  • docs/architecture/determinism.md

  • scripts/check_determinism.py

The current proof surface is a dissolve + area aggregation probe repeated many times on the same input. Today that path is CPU-hosted and already stable, so the measured overhead is effectively the control baseline for future GPU work.

The important part is not the current overhead number. The important part is that future GPU reductions now have:

  • a named mode switch

  • a same-device bitwise contract

  • a repeatable verification command