CUDA Kernels¶

Kernel inventory¶

vibeSpatial’s GPU kernels live in src/vibespatial/kernels/. Each kernel is an NVRTC source compiled at runtime.

Kernel	Module	Operations
Bounds	`kernels/core.py`	Geometry bounds, total bounds, Morton keys
Predicates	`kernels/predicates.py`	Point-in-polygon, point-within-bounds
Segment intersection	`segment_primitives.py`	Extraction (count-scatter), candidate generation (sort-sweep + scatter), classification (Shewchuk adaptive)
Overlay	`overlay_gpu.py`	Half-edge graph, split events, face extraction
Stroke	`stroke_kernels.py`	Point buffer, offset curve
Make valid	`make_valid_gpu.py`	GPU polygon repair
Binary constructive	`constructive/binary_constructive.py`	Intersection, union, difference, symmetric difference (PIP + overlay dispatch)
Envelope	`constructive/envelope.py`	Bounding-box polygon (5-vertex closed ring)
Exterior ring	`constructive/exterior.py`	Polygon exterior ring extraction (LineString output)
Normalize	`constructive/normalize.py`	Ring rotation (lex-min vertex), linestring reversal, multi-part sorting
Simplify	`constructive/simplify.py`	Visvalingam-Whyatt vertex elimination
Validity / Simplicity	`constructive/validity.py`	is_valid (OGC: ring closure, min coords, self-intersection, hole containment, ring crossing/overlap, multi-touch), is_simple (self-intersection)
Clip by rect	`constructive/clip_rect.py`	Bounds-filtered rectangle clip (point, line, polygon families via Sutherland-Hodgman / Liang-Barsky GPU kernels)
Distance metrics	`spatial/distance_metrics.py`	Hausdorff (min-of-max brute force), discrete Frechet (DP coupling matrix)
Polygon intersection	`kernels/constructive/polygon_intersection.py`	Element-wise Sutherland-Hodgman polygon clipping (count-scatter, device-resident output)
Polygon difference	`kernels/constructive/polygon_difference.py`	Element-wise polygon difference via overlay topology pipeline (face selection)
Segmented union	`kernels/constructive/segmented_union.py`	Per-group polygon union via binary-tree reduction of overlay_union_owned

Adding a new kernel¶

Use the scaffold script:

uv run python scripts/generate_kernel_scaffold.py my_kernel_name

This creates:

src/vibespatial/kernels/my_kernel_name.py – kernel source + compilation
tests/test_my_kernel_name.py – Shapely oracle test fixture
Manifest entry for precompilation

Precision compliance¶

Every kernel must support dual-precision dispatch per ADR-0002. The PrecisionPlan selects fp32 or fp64 based on device capability. Kernels receive a precision_mode parameter and must use the appropriate types.

Precompilation¶

NVRTC kernels can be precompiled to reduce first-launch latency:

VIBESPATIAL_PRECOMPILE=1 uv run python -c "import vibespatial"