Dual Precision Dispatch¶

Context¶

Consumer GPUs often have dramatically worse fp64 throughput than fp32, while GeoPandas and Shapely semantics expect double-precision-like accuracy for real world coordinates. The repo needs a precision strategy before kernel work expands because the choice affects buffer layout, kernel signatures, and later robustness contracts.

Decision¶

Store authoritative coordinates in fp64, then choose compute precision at dispatch time.

auto chooses by device profile and kernel class
fp32 forces staged fp32 execution with centering and compensation
fp64 forces native fp64 execution

On consumer-style GPUs:

coarse and metric kernels may use staged fp32
predicate kernels may use staged fp32 only with selective fp64 refinement
constructive kernels stay on fp64 until later robustness work proves a safe alternative

On datacenter-style GPUs with favorable fp64 throughput:

default broadly to native fp64

Consequences¶

Owned buffers keep one authoritative coordinate representation.
Kernel APIs need a shared precision plan contract instead of ad hoc precision booleans.
Later kernel work can use consumer GPU throughput without forcing canonical fp32 storage.
Robustness work remains a separate decision for exact predicates and overlay.

Alternatives Considered¶

fp64 everywhere
fp32 everywhere
canonical dual fp32/fp64 buffer storage
a single global precision switch with no kernel-class distinctions

Acceptance Notes¶

The first landed policy exposes PrecisionPlan selection and documents staged fp32 versus native fp64 defaults. Numerical corpus validation and exact predicate policy remain follow-up work for later work.