Runtime Model¶
vibeSpatial is GPU-first, not GPU-optional.
Intent¶
Define runtime-selection rules, fallback visibility requirements, and the first files to inspect when execution behavior changes.
Request Signals¶
runtime
gpu
cuda
fallback
execution mode
kernel
cccl
diagnostics
Open First¶
docs/architecture/runtime.md
src/vibespatial/runtime/_runtime.py
src/geopandas/__init__.py
src/vibespatial/api/__init__.py
Verify¶
uv run pytest
Risks¶
Silent CPU fallback hides unsupported GPU behavior.
Runtime-selection changes can desync the GeoPandas shim from the runtime layer.
Kernel-oriented changes can look correct locally while breaking the upstream contract.
Core Rules¶
Design APIs around bulk device execution and parallel kernels.
Prefer cuda-python for runtime control and kernel launch plumbing.
Prefer CCCL for reusable data-parallel building blocks.
Runtime availability means a real CUDA device is present, not just that the Python package imports successfully.
CPU execution exists to preserve correctness and debuggability, not to define the architecture.
Canonical geometry storage should stay fp64; compute precision may dispatch separately from storage precision.
Null and empty geometries are distinct states and must stay distinct through buffer layout and kernel outputs.
Predicate and constructive kernels must declare a robustness guarantee, not just a precision mode.
Deterministic reproducibility is opt-in; default mode stays performance-first.
auto dispatch must use per-kernel crossover thresholds, not one global size gate.
Generic runtime probing must not claim GPU execution for auto by itself; the actual switch to GPU happens only inside kernel-specific dispatch planning.
Adaptive planning may re-evaluate at chunk boundaries, but not mid-kernel.
Repo-owned GeoSeries and GeoDataFrame methods must carry explicit dispatch registrations.
Repo-owned kernel modules must register at least one kernel variant before they are allowed to land.
Phase 9 bounds execution is the first live cuda-python kernel and keeps family-specialized CPU and GPU variants side by side so dispatch can stay performance-driven instead of one-size-fits-all.
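The per-kernel crossover rule above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `KernelPlan` type, the `plan_kernel` function, the `CROSSOVER_ROWS` table, and the threshold values are invented for illustration and are not taken from the vibeSpatial codebase. The sketch only shows the shape of the rule: in auto mode the CPU/GPU decision is made per kernel, and device availability alone never selects the GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelPlan:
    kernel: str
    target: str  # "cpu" or "gpu"
    reason: str  # human-readable explanation, useful for diagnostics


# Hypothetical per-kernel crossover thresholds in rows. The numbers are
# invented; real thresholds would come from benchmarks for each kernel.
CROSSOVER_ROWS = {"bounds": 10_000, "point_in_polygon": 1_000}


def plan_kernel(kernel: str, n_rows: int, mode: str, gpu_available: bool) -> KernelPlan:
    """Pick an execution target for one kernel invocation."""
    if mode == "cpu":
        return KernelPlan(kernel, "cpu", "cpu mode requested")
    if mode == "gpu":
        if not gpu_available:
            # Explicit gpu mode fails loudly instead of running on the host.
            raise RuntimeError(f"gpu mode requested but no CUDA device for {kernel!r}")
        return KernelPlan(kernel, "gpu", "gpu mode requested")
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    # auto: a present device is necessary but not sufficient.
    if not gpu_available:
        return KernelPlan(kernel, "cpu", "no CUDA device")
    threshold = CROSSOVER_ROWS.get(kernel)
    if threshold is None:
        return KernelPlan(kernel, "cpu", "no crossover threshold registered")
    if n_rows >= threshold:
        return KernelPlan(kernel, "gpu", f"{n_rows} rows >= threshold {threshold}")
    return KernelPlan(kernel, "cpu", f"{n_rows} rows < threshold {threshold}")
```

With these invented thresholds, the same 5,000-row input would stay on the CPU for the bounds kernel and move to the GPU for the point-in-polygon kernel, which is the behavior a single global size gate cannot express.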
Fallback¶
auto mode may fall back to CPU when GPU execution is unavailable.
Explicit gpu mode must fail loudly if the required GPU path is unsupported.
Fallback events should be observable. Silent host execution is not acceptable.
New fallback surfaces should be paired with tests or diagnostics.
Host-to-device and device-to-host transfers that the user did not request must remain visible.
Device-to-host transfers belong only in explicit materialization surfaces such as to_pandas, to_numpy, values, and __repr__.
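One way to make the fallback rules concrete is the sketch below. It is an assumption-laden illustration, not the project's mechanism: `FallbackEvent`, `FALLBACK_LOG`, and `run_with_fallback` are hypothetical names, and signalling an unsupported GPU path with `NotImplementedError` is a choice made here for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class FallbackEvent:
    surface: str  # which operation fell back
    reason: str   # why the GPU path was not used


# Hypothetical in-memory record of fallbacks; a real system might emit
# warnings or structured diagnostics instead.
FALLBACK_LOG: List[FallbackEvent] = []


def run_with_fallback(
    surface: str,
    mode: str,
    gpu_impl: Callable[[], T],
    cpu_impl: Callable[[], T],
) -> T:
    """Run gpu_impl when allowed, recording any fallback to cpu_impl."""
    if mode not in ("auto", "cpu", "gpu"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "cpu":
        return cpu_impl()
    try:
        return gpu_impl()
    except NotImplementedError as exc:
        if mode == "gpu":
            # Explicit gpu mode: surface the failure, never run on the host.
            raise
        # auto mode: fall back, but leave an observable trace.
        FALLBACK_LOG.append(FallbackEvent(surface, str(exc)))
        return cpu_impl()
```

The design point is that the host path is reachable only through a branch that either was requested (`cpu`) or records an event (`auto`), so a test can assert on the log to detect an unexpected fallback.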
Session Execution Mode Override¶
The session-wide execution mode follows the determinism.py pattern:
VIBESPATIAL_EXECUTION_MODE env var (auto, cpu, gpu).
set_execution_mode() programmatic override (takes priority over env var).
get_requested_mode() reads: explicit override > env var > auto default.
CPU mode causes early returns in IO (_try_gpu_read_file, WKB decode/encode), DeviceGeometryArray operations (to_crs, dwithin, _binary_predicate, clip_by_rect), binary predicates, and geoseries_from_owned.
Setting the mode invalidates the adaptive runtime snapshot cache.
All entry points call get_requested_mode() to determine dispatch; internal GPU-only helpers are safe because their callers gate on mode first.
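The resolution order described above (explicit override, then environment variable, then the auto default) could look roughly like the following. The function names match the ones this section mentions, but the bodies are a guess at the pattern rather than the actual implementation: accepting `None` to clear the override, lowercasing the variable, and ignoring unrecognized values are all assumptions, and the snapshot-cache invalidation is omitted.

```python
import os
from typing import Optional

_VALID_MODES = ("auto", "cpu", "gpu")
_ENV_VAR = "VIBESPATIAL_EXECUTION_MODE"

# Module-level override; None means "no explicit override set".
_mode_override: Optional[str] = None


def set_execution_mode(mode: Optional[str]) -> None:
    """Set (or, with None, clear) the programmatic override. Assumed signature."""
    global _mode_override
    if mode is not None and mode not in _VALID_MODES:
        raise ValueError(f"mode must be one of {_VALID_MODES}, got {mode!r}")
    _mode_override = mode


def get_requested_mode() -> str:
    """Resolve the mode: explicit override > env var > 'auto'."""
    if _mode_override is not None:
        return _mode_override
    env_value = os.environ.get(_ENV_VAR, "").strip().lower()
    if env_value in _VALID_MODES:
        return env_value
    # Assumption: unset or unrecognized values fall through to the default.
    return "auto"
```

An entry point would then branch on `get_requested_mode()` before touching any GPU-only helper, which is what keeps those helpers safe to call without their own mode checks.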
Provenance Rewrite Override¶
The provenance rewrite system (ADR-0039) follows the same pattern:
VIBESPATIAL_PROVENANCE_REWRITES env var (default: enabled; 0/false/no/off to disable).
set_provenance_rewrites(bool | None) programmatic override (takes priority over env var; None clears override back to default).
provenance_rewrites_enabled() reads: explicit override > env var > True.
Gated at five sites: attempt_provenance_rewrite() in provenance.py (covers R1 and all consumption-time binary predicate rules), the R5/R6 branches in geometry_array.py:buffer(), the R7 branch in geometry_array.py:simplify(), and the R2 branch in sjoin.py:_geom_predicate_query().
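The tri-state toggle above differs from the execution-mode override mainly in how the environment variable is parsed. A minimal sketch follows, reusing the function names from this section but with assumed bodies; in particular, treating the disable tokens case-insensitively and treating every other value as enabled are assumptions.

```python
import os
from typing import Optional

_ENV_VAR = "VIBESPATIAL_PROVENANCE_REWRITES"
_DISABLE_TOKENS = frozenset({"0", "false", "no", "off"})

# None means "no override"; True/False is an explicit programmatic choice.
_rewrites_override: Optional[bool] = None


def set_provenance_rewrites(enabled: Optional[bool]) -> None:
    """Set the override; passing None clears it."""
    global _rewrites_override
    _rewrites_override = enabled


def provenance_rewrites_enabled() -> bool:
    """Resolve the toggle: explicit override > env var > True."""
    if _rewrites_override is not None:
        return _rewrites_override
    raw = os.environ.get(_ENV_VAR)
    if raw is None:
        return True
    # Assumption: comparison is case-insensitive and ignores whitespace.
    return raw.strip().lower() not in _DISABLE_TOKENS
```

Each gated site would then start with a check such as `if not provenance_rewrites_enabled(): return None` (or skip its rewrite branch) before attempting any rule.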
Index-Array Boundary Model (ADR-0036)¶
Spatial kernels produce only index arrays (np.ndarray with integer dtype).
Attribute assembly is always pandas on host. GPU VRAM is reserved for geometry.
The boundary is enforced by SpatialJoinIndices (frozen dataclass in spatial_query_types.py) and __debug__-gated dtype assertions at kernel return points in spatial_query_utils.py and spatial_nearest.py.
sjoin._frame_join is structured into three delineated blocks: geometry extraction, attribute reindexing (geometry-free), and geometry reassembly. The outer-join geometry path is isolated in _reassemble_outer_geometry.
Overlay's _overlay_intersection delegates attribute merging to _assemble_intersection_attributes, which receives only index arrays and attribute-only DataFrames. When both operands have owned geometry backing, _overlay_intersection dispatches through binary_constructive_owned at buffer level (via OwnedGeometryArray.take), bypassing Shapely materialization. It falls back to the standard GeoSeries path on NotImplementedError (e.g. GeometryCollection results). _overlay_difference also dispatches through binary_constructive_owned when owned backing is available: selective right materialization (only unique participating rows via take), grouped union on host, then GPU-accelerated difference. _overlay_symmetric_diff, _overlay_union, and _overlay_identity inherit this via delegation.
I/O paths (io_geoparquet.py) keep Arrow tables through geometry decode and defer .to_pandas() to the GeoDataFrame construction boundary.
Contract tests in tests/test_index_array_boundary.py validate the boundary invariants across spatial query, sjoin, overlay, dissolve, and clip.
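As a rough picture of the boundary type, the sketch below shows a frozen dataclass that carries only integer index arrays and checks their dtype under `__debug__`. Only the class name comes from this section; the field names `left` and `right`, the shape check, and placing the assertions in `__post_init__` are assumptions made for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class SpatialJoinIndices:
    """Index-only result of a spatial kernel (field names are hypothetical)."""

    left: np.ndarray   # row positions into the left frame
    right: np.ndarray  # row positions into the right frame

    def __post_init__(self) -> None:
        # Checks run only in debug builds; `python -O` strips them.
        if __debug__:
            for name in ("left", "right"):
                arr = getattr(self, name)
                assert isinstance(arr, np.ndarray), f"{name} must be np.ndarray"
                assert np.issubdtype(arr.dtype, np.integer), (
                    f"{name} must have an integer dtype, got {arr.dtype}"
                )
            assert self.left.shape == self.right.shape, "index arrays must align"


# A kernel would return matched row pairs as positions, never geometry
# or attribute data.
pairs = SpatialJoinIndices(
    left=np.array([0, 0, 2], dtype=np.intp),
    right=np.array([1, 3, 0], dtype=np.intp),
)
```

Under this model, host-side code would reindex attribute-only frames by position using `pairs.left` and `pairs.right`, so attribute columns never need to be copied to the device.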
Compatibility¶
GeoPandas behavior is measured with vendored upstream tests.
Upstream parity matters more than mirroring GeoPandas internals.
Rebuild abstractions only when the test contract or performance data demands them.