Constructive Result Unification Execution Plan¶
Intent¶
Turn the overlay/clip/dissolve architecture discussion into an execution plan with tracked milestones, explicit deletion targets, fusion guidance, and verification gates.
This plan is for doing the larger refactor now, not for incremental patching of the current wrapper stack. The goal is to make the constructive result boundary correct once and then build planner-selected acceleration on top of that stable shape.
Request Signals¶
constructive result
NativeTabularResult
overlay refactor
clip refactor
dissolve refactor
device-native result boundary
execution plan
milestone plan
fusion plan
workload family
true operation fast path
overlay architecture
delete constructive wrappers
Open First¶
docs/dev/constructive-result-unification-execution-plan.md
docs/decisions/0042-device-native-result-boundary.md
docs/decisions/0016-overlay-reconstruction-plan.md
docs/architecture/overlay-reconstruction.md
docs/architecture/clip-fast-paths.md
docs/architecture/dissolve.md
docs/architecture/fusion.md
docs/architecture/runtime.md
src/vibespatial/api/_native_results.py
src/vibespatial/api/tools/overlay.py
src/vibespatial/api/tools/clip.py
src/vibespatial/overlay/dissolve.py
src/vibespatial/runtime/fusion.py
Verify¶
uv run python scripts/check_docs.py --check
uv run pytest tests/test_overlay_api.py tests/test_clip_rect.py tests/test_dissolve_pipeline.py tests/test_gpu_dissolve.py -q
uv run pytest tests/upstream/geopandas/tests/test_overlay.py -k geometry_not_named_geometry
uv run pytest tests/upstream/geopandas/tests/test_dissolve.py -k dissolve_multi_agg
uv run pytest tests/upstream/geopandas/tests/test_pandas_methods.py -k groupby_metadata
uv run python scripts/health.py --tier contract --check
uv run python scripts/health.py --tier gpu --check
uv run python scripts/benchmark_pipelines.py --suite full --repeat 1 --gpu-sparkline
Risks¶
Replacing wrapper types without hardening the export boundary first can just move the same metadata bugs into a new file.
Reworking clip and dissolve separately from overlay can recreate the current divergence under new type names.
Overfusing generic overlay into a mega-kernel would hide reusable structures, erase diagnostics, and likely make correctness work harder rather than easier.
Leaving GeoDataFrame composition in place anywhere on the hot path would preserve the main architectural failure mode even if native result types are renamed.
Planner-selected fast paths can regress semantics if they are not forced back through the same canonical result contract.
The push can devolve into “make tests pass” unless each milestone also deletes obsolete result wrappers and reduces host-side assembly.
End-to-end speed can still regress if GPU coverage rises only through helper stages while public constructive workflows remain host-dominated.
Mission¶
Make NativeTabularResult the canonical constructive result across overlay,
clip, and dissolve.
The refactor is successful only if all of the following are true:
overlay, clip, and dissolve native paths converge on one result boundary
GeoPandas export becomes one explicit terminal boundary instead of repeated intermediate composition
planner-selected workload families become the main mechanism for “true operation” fast paths
fused execution happens inside those planner-selected families, not as one generic overlay mega-kernel
the current metadata/export failures disappear because the contract is simpler, not because more compatibility glue was layered on top
This plan treats the current constructive wrapper pyramid as transitional debt:
PairwiseConstructiveResult
LeftConstructiveResult
SymmetricDifferenceConstructiveResult
PairwiseConstructiveFragment
LeftConstructiveFragment
ConcatConstructiveResult
GroupedConstructiveResult
ClipNativeResult
These types may survive temporarily during migration, but the plan is not complete until they are deleted or reduced to thin compatibility shims with no architectural significance.
Target Shape¶
The target architecture is:
one canonical constructive result: NativeTabularResult
one geometry-only carrier: GeometryNativeResult
one explicit compatibility boundary: to_geodataframe(), to_arrow(), to_parquet(), to_feather()
many execution families selected by a planner before heavy work starts
The planner should select among workload families such as:
clip_rewrite
broadcast_right_intersection
broadcast_right_difference
containment_bypass + batched_sh_clip + remainder_overlay
coverage_union
grouped_union
generic_reconstruction
The planner output should describe:
operation
workload shape
geometry family mix
topology class
semantic flags such as keep_geom_type
result shape
execution family
fusion opportunities
Within a selected family, fused execution is encouraged where the stages are ephemeral and device-local. Fused execution is not the goal by itself; it is a mechanism for reducing launch count, intermediate traffic, and host re-entry once the planner has already identified the true operation shape.
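The planner output fields listed above can be pictured as one small immutable record that is filled in before any heavy work starts. The following is a minimal sketch only: `ExecutionFamily`, `ConstructivePlanSketch`, and the example field values (such as `many_left_one_right`) are hypothetical names chosen for illustration, not the project's actual classes or labels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Tuple


class ExecutionFamily(Enum):
    """Workload families named in this plan (the enum itself is hypothetical)."""
    CLIP_REWRITE = "clip_rewrite"
    BROADCAST_RIGHT_INTERSECTION = "broadcast_right_intersection"
    BROADCAST_RIGHT_DIFFERENCE = "broadcast_right_difference"
    COVERAGE_UNION = "coverage_union"
    GROUPED_UNION = "grouped_union"
    GENERIC_RECONSTRUCTION = "generic_reconstruction"


@dataclass(frozen=True)
class ConstructivePlanSketch:
    """Hypothetical record of what a planner decides before heavy work starts."""
    operation: str                          # e.g. "intersection"
    workload_shape: str                     # assumed label, e.g. "many_left_one_right"
    geometry_family_mix: Tuple[str, ...]
    topology_class: str
    semantic_flags: FrozenSet[str]          # e.g. {"keep_geom_type"}
    result_shape: str
    execution_family: ExecutionFamily
    fusion_opportunities: Tuple[str, ...] = ()


plan = ConstructivePlanSketch(
    operation="intersection",
    workload_shape="many_left_one_right",
    geometry_family_mix=("polygon",),
    topology_class="simple",
    semantic_flags=frozenset({"keep_geom_type"}),
    result_shape="pairwise",
    execution_family=ExecutionFamily.BROADCAST_RIGHT_INTERSECTION,
    fusion_opportunities=("containment_bypass", "batched_sh_clip"),
)
```

Freezing the record reflects the principle that the execution family is chosen once, up front, and is not revised mid-kernel.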
The canonical result contract must own:
primary geometry column
secondary geometry columns
active geometry name
CRS
column order
attrs / provenance that survive export
No constructive path should build its final answer by concatenating
GeoDataFrames or by repeatedly calling to_geodataframe() on intermediate
parts.
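To make the ownership list concrete, here is a minimal sketch of the metadata a canonical result would have to carry and validate. `ResultContractSketch` is a hypothetical stand-in written for this document; it is not the real `NativeTabularResult`, and its validation rules are assumptions about what an authoritative contract would check.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResultContractSketch:
    """Hypothetical stand-in for the metadata a canonical result must own."""
    attribute_columns: Tuple[str, ...]
    active_geometry: str
    secondary_geometries: Tuple[str, ...] = ()
    crs: Optional[str] = None
    column_order: Tuple[str, ...] = ()
    attrs: dict = field(default_factory=dict)

    def __post_init__(self):
        geometry_names = (self.active_geometry, *self.secondary_geometries)
        # Attribute columns may not reuse a geometry column name.
        clash = set(self.attribute_columns) & set(geometry_names)
        if clash:
            raise ValueError(f"attribute/geometry name overlap: {sorted(clash)}")
        if len(set(geometry_names)) != len(geometry_names):
            raise ValueError("duplicate geometry column names")
        # The final order must mention every column exactly once.
        expected = set(self.attribute_columns) | set(geometry_names)
        if set(self.column_order) != expected or len(self.column_order) != len(expected):
            raise ValueError("column_order must list every column exactly once")


contract = ResultContractSketch(
    attribute_columns=("name", "zone"),
    active_geometry="geom",
    secondary_geometries=("centroid",),
    crs="EPSG:4326",
    column_order=("name", "geom", "zone", "centroid"),
)
```

Because the active geometry name, CRS, and column order live on the result itself, an export step has nothing to reconstruct from intermediate frames.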
Non-Goals¶
This push is not about:
preserving the current constructive wrapper taxonomy
building a whole-program lazy graph runtime
writing one universal fused overlay kernel
changing public APIs for its own sake
accepting host-side composition because “there are no users yet”
making the tests green by papering over export bugs with more pandas surgery
This push is also not done when:
overlay still composes results through intermediate GeoDataFrames
clip or dissolve still use a different native result model
the planner still reasons in terms of public surface names instead of workload families
fused paths exist but export still falls back through the old host boundary
Working Principles¶
Make the result boundary authoritative before optimizing around it.
Delete wrappers instead of shuffling them into more files.
Keep one explicit compatibility boundary and make it correct.
Choose execution family before heavy work starts; do not re-plan mid-kernel.
Use fused stage clusters inside a workload family, not across every algorithmic boundary.
Persist reusable structures such as indexes, sorted half-edges, and grouped offsets when the pipeline needs them again.
Keep stable-order semantics explicit anywhere GeoPandas parity depends on deterministic row ordering.
Treat NativeTabularResult as the public internal contract for constructive work. Everything else lowers into it.
Verify every milestone with upstream contract tests, health gates, and the mandatory end-to-end profile.
Tracking¶
[x] M0. Baseline, contract freeze, and deletion inventory
[x] M1. Shared result core and canonical contract extraction
[x] M2. Explicit export boundary rewrite
[x] M3. Overlay cutover to canonical native tabular results
[x] M4. Clip cutover to canonical native tabular results
[x] M5. Dissolve cutover to canonical native tabular results
[x] M6. Planner-selected workload families
[x] M7. Fused stage clusters and specialized kernel family acceleration
[x] M8. Wrapper deletion, convergence, and final verification
Milestone M0: Baseline, Contract Freeze, And Deletion Inventory¶
Goal¶
Freeze the target architecture, capture before-state evidence, and make the wrapper deletion target explicit before implementation starts.
Primary Surfaces¶
docs/dev/constructive-result-unification-execution-plan.md
docs/decisions/0042-device-native-result-boundary.md
src/vibespatial/api/_native_results.py
scripts/health.py
benchmark and upstream overlay/dissolve verification rails
Checklist¶
[x] Capture the current overlay, clip, and dissolve native result families.
[x] Write down which wrapper types are transitional and must be deleted.
[x] Capture the current upstream failure set for: geometry_not_named_geometry, dissolve_multi_agg, and groupby_metadata.
[x] Capture current contract and gpu health outputs as the before-state.
[x] Capture the current full end-to-end pipeline profile on the target GPU.
[x] Confirm that the canonical target is NativeTabularResult, not a new sibling abstraction.
[x] Confirm that fused execution will be planner-selected by workload family, not implemented as a single universal overlay kernel.
Baseline Snapshot Captured 2026-04-14¶
Target machine visibility: GPU 0 is NVIDIA GeForce RTX 4090; /dev/nvidia0, /dev/nvidiactl, /dev/nvidia-uvm, /dev/nvidia-uvm-tools, and /dev/nvidia-modeset are visible; CUDA_VISIBLE_DEVICES is unset.
Property dashboard before-state: 6/6 clean, total distance 0.00.
Current constructive wrapper inventory: _native_results.py currently defines PairwiseConstructiveResult, LeftConstructiveResult, SymmetricDifferenceConstructiveResult, PairwiseConstructiveFragment, LeftConstructiveFragment, ConcatConstructiveResult, and GroupedConstructiveResult; api/tools/clip.py defines ClipNativeResult.
Current hot-file sizes: api/tools/overlay.py 5786 lines, api/_native_results.py 2584 lines, api/tools/clip.py 3024 lines, overlay/dissolve.py 2956 lines.
Focused upstream failure slice: test_geometry_not_named_geometry currently fails for [union-True], [intersection-True], and [identity-True]; test_dissolve_multi_agg fails on Pandas4Warning; test_groupby_metadata fails for all four [geometry|geom] x [None|EPSG:4326] cases.
Export-seam diagnosis from the focused slice: overlay failures route through ConcatConstructiveResult / to_geodataframe() and GeoDataFrame.crs; dissolve and groupby failures route through _materialize_attribute_geometry_frame() and its reindex(..., copy=False) frame rebuild path.
Contract health before-state: Repo Health — contract: FAIL; required surfaces passing 7/8; required red is overlay at 236/260; optional red is performance_rails at 25/26.
GPU health before-state: Repo Health — gpu: FAIL; property summary 6/6 clean, distance 0.00; GPU acceleration 20.08% (4349 GPU dispatches / 21654 total dispatches); CPU dispatches 16717; fallback dispatches 16.
Full upstream GPU-sensitive sweep inside the GPU tier: 1954 passed, 423 skipped, 14 xfailed, 8 failed in 114.61s. The eight real failures are the same focused export/metadata slice captured above.
1M pipeline baseline summary: join-heavy total 110.6ms, planner-selected runtime gpu, actual runtime hybrid; dissolve_groups at 60.15ms is the dominant CPU stage. constructive total 65.9ms, actual runtime hybrid; write_output at 59.07ms dominates. predicate-heavy total 104.8ms, actual runtime gpu; read_geojson at 76.28ms dominates. zero-transfer total 40.57ms, actual runtime gpu; read_input at 19.02ms and write_output at 11.88ms dominate.
10k full shootout baseline summary: benchmark_results/2026-04-14-constructive-result-unification-m0/shootout_suite_10k.json captures the full directory run at scale=10k, repeat=3, warmup=true. All 10/10 workflows passed and fingerprint-matched; 4/10 are already at GeoPandas parity or better and 6/10 are still below parity.
10k shootout parity leaders: site_suitability 2.847x (654.34ms GeoPandas vs 228.00ms vibeSpatial), redevelopment_screening 1.783x (685.98ms vs 399.89ms), network_service_area 1.374x (91.55ms vs 66.28ms), and nearby_buildings 1.021x (97.37ms vs 95.08ms).
10k shootout below-parity workloads: flood_exposure 0.994x (36.12ms vs 36.26ms), transit_service_gap 0.937x (230.01ms vs 244.38ms), parcel_zoning 0.856x (63.64ms vs 75.01ms), corridor_flood_priority 0.795x (158.52ms vs 199.24ms), vegetation_corridor 0.679x (292.94ms vs 431.79ms), and accessibility_redevelopment 0.265x (215.91ms vs 826.97ms).
M0 decisions locked: the canonical target is NativeTabularResult; acceleration work should be planner-selected by workload family with fused stage clusters inside those families, not a single generic overlay mega-kernel.
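As a worked check of the GPU health figure in the snapshot, the sketch below recomputes the acceleration percentage from the two quoted dispatch counts. It assumes the reported percentage is simply GPU dispatches divided by total dispatches; `acceleration_percent` is a hypothetical helper, not part of scripts/health.py.

```python
def acceleration_percent(gpu_dispatches: int, total_dispatches: int) -> float:
    """Share of dispatches that ran on the GPU, as a percentage."""
    return 100.0 * gpu_dispatches / total_dispatches


# M0 GPU health before-state: 4349 GPU dispatches out of 21654 total.
baseline = acceleration_percent(4349, 21654)
print(f"{baseline:.2f}%")  # prints 20.08%
```

Under that assumption the quoted 20.08% follows directly from the counts, which makes later snapshots easy to compare on the same basis.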
Exit Criteria¶
the target contract is written down unambiguously
the deletion inventory is explicit
the before-state is documented well enough to measure structural progress
Milestone M2: Explicit Export Boundary Rewrite¶
Goal¶
Rewrite the GeoPandas export boundary once so geometry-name restoration, CRS, column order, and multi-geometry handling stop depending on brittle pandas rebuild tricks.
Primary Surfaces¶
src/vibespatial/api/_native_results.py
src/vibespatial/api/geodataframe.py
Checklist¶
[x] Rewrite _materialize_attribute_geometry_frame() around the canonical contract instead of post-hoc frame surgery.
[x] Remove reliance on reindex(..., copy=False) plus manual class mutation for correctness.
[x] Make active-geometry restoration explicit and test it directly.
[x] Preserve secondary geometry columns without silently clobbering the active geometry state.
[x] Add direct regression tests for: geometry-not-named-geometry, CRS access, groupby metadata, and column order.
[x] Confirm Arrow / Feather / GeoParquet export still works from the new boundary without forcing GeoDataFrame materialization first.
Exit Criteria¶
the metadata/export seam is singular and trustworthy
the known upstream metadata failures are either already green or clearly isolated above the export layer
Completion Notes Captured 2026-04-15¶
Rewrote _materialize_attribute_geometry_frame() in src/vibespatial/api/_native_result_core.py to build the final ordered payload directly from attributes plus explicit geometry columns, instead of rebuilding a frame through reindex(..., copy=False).
Tightened the core contract so attribute columns cannot overlap primary or secondary geometry names, and removed the remaining Pandas4Warning from loader-backed column renames in the shared core.
Reworked constructive wrapper lowering in src/vibespatial/api/_native_results.py so projected frames split into true attributes plus explicit secondary geometry columns before Arrow storage. The lowering path now preserves legacy overwrite semantics where pairwise overlay intentionally clobbers a temporary geometry column, while left-row-preserving exports keep secondary geometry columns unless the final active geometry name would collide.
Fixed two boundary-specific structural bugs exposed by the rewrite: NativeAttributeTable.concat() now preserves row counts for zero-attribute tables, and projected / left-preserving output-order reconstruction now normalizes integer-labeled columns against the actual Arrow-backed schema instead of reintroducing stale raw labels.
Symmetric-difference lowering now computes one final merged column order after concat, so the active geometry column lands in the correct final position instead of being interleaved between attribute blocks.
Added direct regression coverage in tests/test_native_result_core.py for the no-Pandas4Warning export boundary and zero-column attribute concat, and updated tests/test_overlay_api.py so the no-materialization assertions track the new pairwise / left lowering choke points instead of the old attribute helper.
Focused upstream regressions are green: test_geometry_not_named_geometry, test_dissolve_multi_agg, test_groupby_metadata, upstream clip donut / keep-geom-type setup via symmetric difference, and upstream Arrow column-order preservation.
Contract and GPU ratchets are clean against the committed baselines: contract baseline comparison returns exit_code 0 with no regressions; GPU baseline comparison also returns exit_code 0 with no regressions. Property distance stayed at 0.0 (6/6 clean).
Mandatory 1M profile after the rewrite shows no new host-side stall: join-heavy remains dominated by dissolve_groups (44.92ms), constructive by write_output (61.84ms), predicate-heavy by read_geojson (77.71ms), and zero-transfer by read_input (17.36ms) plus write_output (14.31ms).
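The idea of building the final ordered payload directly, rather than building a frame and reordering it afterwards, can be sketched with plain dictionaries. `build_ordered_payload` is a hypothetical helper for illustration; plain lists stand in for Arrow-backed attribute arrays and geometry arrays, and the real `_materialize_attribute_geometry_frame()` is not shown here.

```python
def build_ordered_payload(attributes, geometry_columns, column_order):
    """Assemble the final column mapping in one pass, in the requested order."""
    # Attribute names may not collide with geometry names.
    overlap = set(attributes) & set(geometry_columns)
    if overlap:
        raise ValueError(f"attribute/geometry name overlap: {sorted(overlap, key=str)}")
    sources = {**attributes, **geometry_columns}
    # The requested order must name every column exactly once.
    if set(column_order) != set(sources) or len(column_order) != len(sources):
        raise ValueError("column_order must name every column exactly once")
    # Dicts preserve insertion order, so this is already the final layout.
    return {name: sources[name] for name in column_order}


payload = build_ordered_payload(
    attributes={"name": ["a", "b"], 0: [10, 20]},  # integer label kept as-is
    geometry_columns={"geom": ["POINT (0 0)", "POINT (1 1)"]},
    column_order=["name", "geom", 0],
)
print(list(payload))  # prints ['name', 'geom', 0]
```

Because the order is applied while the payload is assembled, there is no later reindex step that could rename integer labels or move the active geometry column.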
Milestone M3: Overlay Cutover To Canonical Native Tabular Results¶
Goal¶
Make overlay native paths return NativeTabularResult directly and stop using
constructive fragments and wrapper composition as the primary execution model.
Primary Surfaces¶
src/vibespatial/api/tools/overlay.py
src/vibespatial/overlay/
src/vibespatial/api/_native_results.py
tests/test_overlay_api.py
Checklist¶
[x] Make intersection lower directly to NativeTabularResult.
[x] Make difference lower directly to NativeTabularResult.
[x] Make identity compose canonical native tabular results, not GeoDataFrame-producing fragments.
[x] Make symmetric difference compose canonical native tabular results, not wrapper objects.
[x] Make union compose canonical native tabular results, not fragment trees.
[x] Reduce overlay() to validation, policy, planning, dispatch logging, and one final export boundary.
[x] Keep no-materialization write paths green for Arrow / Feather / GeoParquet.
[x] Remove overlay hot-path dependence on the constructive wrapper classes.
Exit Criteria¶
all overlay native entrypoints return NativeTabularResult
overlay no longer relies on repeated to_geodataframe() composition
overlay export-without-materialization tests still pass
Completion Notes Captured 2026-04-15¶
Cut overlay over to direct NativeTabularResult return values in src/vibespatial/api/tools/overlay.py. Intersection and difference now lower immediately through direct native-tabular builders, and identity / symmetric-difference / union compose canonical native tabular results instead of building fragment trees.
Added direct shared builders in src/vibespatial/api/_native_results.py for pairwise and left-row-preserving constructive results, plus a canonical native-tabular symmetric-difference combiner and a shared rename helper that preserves legacy geometry-name collision semantics.
Removed overlay hot-path dependence on PairwiseConstructiveFragment, LeftConstructiveFragment, ConcatConstructiveResult, and SymmetricDifferenceConstructiveResult. Those legacy wrapper families remain only as migration support for other surfaces.
Fixed two export-boundary edge cases exposed by the cutover: primary-geometry renames now drop colliding secondary geometry columns instead of constructing an invalid native result, and host-side geometry concat now normalizes fragment CRS metadata to the caller-selected output CRS before concatenation.
Updated tests/test_overlay_api.py so the overlay-native assertions check direct native-tabular construction, verify that legacy wrapper lowering is not part of the hot path, and keep Arrow / Feather / GeoParquet no-materialization writes covered.
Focused verification is green: tests/test_overlay_api.py targeted M3 slice, tests/test_native_result_core.py, upstream test_geometry_not_named_geometry, upstream test_crs_mismatch[union], upstream test_crs_mismatch[symmetric_difference], and the upstream overlay keep-geometry / geometry-name slice.
Contract and GPU ratchets are clean against the committed baselines. Contract improved the overlay surface from 236/260 to 244/260 required passing while keeping the ratchet process exit clean; GPU health is PASS at 20.10% acceleration with properties still 6/6 clean and total property distance 0.0.
Mandatory 1M profile after the cutover shows no new host-side stall: join-heavy is still dominated by dissolve_groups (60.30ms), constructive by write_output (57.91ms), predicate-heavy by read_geojson (75.90ms), and zero-transfer by read_input (17.81ms) plus write_output (14.81ms).
Milestone M4: Clip Cutover To Canonical Native Tabular Results¶
Goal¶
Bring clip onto the same constructive result model while preserving its clip-specific semantic cleanup and device fast paths.
Primary Surfaces¶
src/vibespatial/api/tools/clip.py
src/vibespatial/constructive/clip_rect.py
tests/test_clip_rect.py
Checklist¶
[x] Replace ClipNativeResult as the architectural center with direct NativeTabularResult construction.
[x] Preserve clip row ordering and ordered-row restoration in the canonical result path.
[x] Preserve keep_geom_type semantics before the export boundary.
[x] Preserve GeoSeries and GeoDataFrame source behavior under the same shared result contract.
[x] Keep rectangle fast paths device-native where the workload family allows it.
[x] Keep no-materialization Arrow / Feather / GeoParquet export paths green.
Exit Criteria¶
clip uses the same native result model as overlay
clip-specific cleanup no longer requires a bespoke result wrapper family
Completion Notes Captured 2026-04-15¶
Cut clip over to direct NativeTabularResult return values in src/vibespatial/api/tools/clip.py. evaluate_geopandas_clip_native() and the public clip() entrypoint now run through the canonical native-tabular boundary, with ClipNativeResult retained only as compatibility / explicit materializer support instead of the architectural center.
Added shared clip lowering in src/vibespatial/api/_native_results.py via _clip_constructive_parts_to_native_tabular_result(). The direct path now preserves ordered row restoration, duplicate-index source ordering, device fast paths, and the canonical GeoSeries / GeoDataFrame export split.
Fixed three clip-boundary regressions exposed by the cutover: duplicate-label source indexes now project attributes positionally instead of reindexing by label; zero-area boundary-touch polygons stay preserved instead of being dropped as nonpositive-area noise; device-backed clip outputs are restored to DeviceGeometryArray after native cleanup instead of silently downgrading to host-backed GeometryArray.
Tightened the shared result core in src/vibespatial/api/_native_result_core.py so Arrow-backed attribute tables preserve logical column labels across native concat / take / rename paths. This removed the remaining integer-column drift at the shared export boundary and restored GeoPandas-visible column order for clip results.
Updated clip boundary and public tests to assert the direct native-tabular contract, route materialization through the explicit _clip_native_tabular_to_spatial() boundary, and keep no-materialization GeoParquet coverage green.
Focused verification is green: tests/test_index_array_boundary.py, tests/test_clip_public_api.py, tests/test_clip_rect.py, the upstream clip regression slice covering clip_with_polygon, clip_empty_mask, clip_multipoly_keep_slivers, and clip_single_multipoly_no_extra_geoms, plus the targeted upstream clip keep-sliver / polygon-mask slice.
Repo health remains ratchet-clean after the cutover: property dashboard is still 6/6 clean with total distance 0.0; contract --check still exits clean against the committed baseline with clip now 43/43 passing and overlay remaining the only required red at 244/260; gpu --check is PASS with 20.08% acceleration (4366 / 21747 GPU dispatches, 16 fallbacks).
Mandatory 1M profile after the cutover shows no new host-side stall: join-heavy is still dominated by dissolve_groups (53.17ms), constructive by write_output (61.14ms), predicate-heavy by read_geojson (76.11ms), and zero-transfer by read_input (19.29ms) plus write_output (14.67ms).
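The duplicate-label regression noted above is easiest to see with a toy example. The sketch below uses plain Python lists rather than the project's attribute tables, and every name in it is illustrative: it shows why a label lookup becomes ambiguous once an index label repeats, while a positional take does not.

```python
source_index = ["a", "a", "b"]                  # duplicate label "a"
source_attr = ["first-a", "second-a", "only-b"]
kept_rows = [1, 2]                              # positions of rows that survived the clip

# Label-based lookup cannot tell the two "a" rows apart.
kept_labels = [source_index[pos] for pos in kept_rows]
by_label = [
    [value for label, value in zip(source_index, source_attr) if label == wanted]
    for wanted in kept_labels
]
# by_label == [['first-a', 'second-a'], ['only-b']]: the first kept row is ambiguous

# A positional take uses the surviving row positions directly.
by_position = [source_attr[pos] for pos in kept_rows]
# by_position == ['second-a', 'only-b']
```

The positional form returns exactly one attribute value per surviving row, which is what a clip result needs regardless of how the source index is labeled.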
Milestone M5: Dissolve Cutover To Canonical Native Tabular Results¶
Goal¶
Bring grouped constructive work onto the same canonical result boundary without reopening Python-group-iteration architecture.
Primary Surfaces¶
src/vibespatial/overlay/dissolve.py
tests/test_dissolve_pipeline.py
tests/test_gpu_dissolve.py
upstream dissolve and pandas metadata tests
Checklist¶
[x] Replace GroupedConstructiveResult as the primary constructive boundary with direct NativeTabularResult output.
[x] Make grouped union and grouped attribute aggregation lower into the same canonical result contract.
[x] Make LazyDissolvedFrame.to_native_result() return the canonical native tabular result.
[x] Preserve stable in-group row order and deterministic group ordering.
[x] Make dissolve_multi_agg and groupby metadata tests pass through the new shared export boundary.
[x] Keep grouped work staged so future GPU grouped-union work can still map onto CCCL-friendly primitives.
Completion Notes Captured 2026-04-15¶
overlay/dissolve.py now emits NativeTabularResult directly for grouped constructive work. evaluate_geopandas_dissolve_native(), _grouped_constructive_result(), and LazyDissolvedFrame.to_native_result() all target the canonical native tabular boundary instead of GroupedConstructiveResult.
_native_results.py now exposes a direct grouped builder so grouped union and grouped attribute aggregation lower through the same shared constructive contract as overlay and clip. bench/pipeline.py was cut over to the same builder for the direct grouped-dissolve path.
Stable grouped ordering stayed intact through the cutover, and the focused upstream dissolve metadata slice is green again: test_dissolve_multi_agg passed, and all four groupby_metadata cases passed through the shared export boundary.
Closing M5 surfaced a real Arrow/WKB point-boundary bug while running the GPU health gate: strict-native Parquet reads were collapsing partial-NaN point coordinates to POINT EMPTY. The fix landed in both io/wkb.py and io/pylibcudf.py, and tests/test_io_arrow.py now locks the expected behavior with a direct partial-NaN WKB regression test.
Verification after the cutover stayed clean at the repo-health level: property dashboard remained 6/6 clean with distance 0.00; contract --check stayed baseline-clean with arrow_parquet at 133/133, clip at 43/43, unchanged required overlay debt at 244/260, and the existing optional performance_rails debt at 25/26.
GPU health returned to PASS after the WKB fix: 1962 passed, 423 skipped, 14 xfailed in the strict-native upstream sweep; overall GPU acceleration remained 20.08% (4366 GPU / 21747 dispatches) with 16 observed fallbacks and no property regression.
The mandatory 1M profile did not introduce a new host-side stall: join-heavy remains dominated by dissolve_groups at 45.06ms, constructive by write_output at 57.96ms, predicate-heavy by read_geojson at 75.99ms, and zero-transfer by read_input / write_output at 19.37ms / 12.31ms.
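The partial-NaN bug described in these notes can be illustrated with a small standard-library WKB round trip. This is a sketch, not the code in io/wkb.py or io/pylibcudf.py: `encode_wkb_point` and `decode_wkb_point` are hypothetical helpers, and the sketch assumes the intended behaviour is that only an all-NaN coordinate pair (a common WKB encoding of an empty point) is treated as POINT EMPTY.

```python
import math
import struct


def encode_wkb_point(x: float, y: float) -> bytes:
    """Little-endian WKB for a 2D point: byte order, geometry type 1, then x and y."""
    return struct.pack("<BIdd", 1, 1, x, y)


def decode_wkb_point(wkb: bytes):
    """Return (x, y), or None for an empty point.

    Assumption for this sketch: only an all-NaN pair means POINT EMPTY;
    a partially NaN pair is kept as a real coordinate pair.
    """
    fmt = "<Idd" if wkb[0] == 1 else ">Idd"
    geom_type, x, y = struct.unpack(fmt, wkb[1:21])
    if geom_type != 1:
        raise ValueError("not a 2D WKB point")
    if math.isnan(x) and math.isnan(y):
        return None
    return (x, y)


empty = decode_wkb_point(encode_wkb_point(math.nan, math.nan))
partial = decode_wkb_point(encode_wkb_point(1.5, math.nan))
```

A decoder that tested `isnan(x) or isnan(y)` instead would collapse the second point to empty, which is the kind of behaviour the regression test is described as guarding against.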
Exit Criteria¶
dissolve uses the same result model as overlay and clip
grouped constructive export no longer depends on a bespoke wrapper type
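Stable in-group row order and persisted group offsets fit together naturally. The sketch below is a host-side illustration using only the standard library; `stable_group_layout` is a hypothetical helper and does not reflect how overlay/dissolve.py or any GPU path actually computes its layout.

```python
from itertools import accumulate


def stable_group_layout(keys):
    """Return (group keys, row order, offsets) for grouped work.

    sorted() is stable, so rows with equal keys keep their input order.
    """
    row_order = sorted(range(len(keys)), key=lambda i: keys[i])
    group_keys = []
    counts = []
    for i in row_order:
        if group_keys and group_keys[-1] == keys[i]:
            counts[-1] += 1
        else:
            group_keys.append(keys[i])
            counts.append(1)
    # offsets[g]:offsets[g + 1] is the slice of row_order belonging to group g.
    offsets = [0, *accumulate(counts)]
    return group_keys, row_order, offsets


group_keys, row_order, offsets = stable_group_layout(["b", "a", "b", "a", "c"])
# group_keys == ['a', 'b', 'c']
# row_order  == [1, 3, 0, 2, 4]
# offsets    == [0, 2, 4, 5]
```

The sort-then-segment shape is the same one that segmented device primitives consume, which is one reason to keep offsets as a persisted structure rather than recomputing them per stage.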
Milestone M6: Planner-Selected Workload Families¶
Goal¶
Add a planner that chooses the true execution family for constructive work before heavy execution starts, instead of only dispatching by public API surface name.
Primary Surfaces¶
src/vibespatial/api/tools/overlay.py
src/vibespatial/overlay/strategies.py
src/vibespatial/overlay/gpu.py
src/vibespatial/runtime/fusion.py
planner-facing runtime metadata
Checklist¶
[x] Introduce a constructive planning object that describes: operation, workload shape, topology class, semantics flags, result shape, execution family, and fusion opportunities.
[x] Teach overlay planning to distinguish at least: clip_rewrite, broadcast_right_intersection, broadcast_right_difference, coverage_union, grouped_union, and generic_reconstruction.
[x] Record selected execution family in dispatch telemetry.
[x] Ensure every execution family still returns the canonical native tabular result.
[x] Keep planning decisions at the public boundary or chunk boundary, not mid-kernel.
[x] Preserve explicit CPU fallback visibility when no valid GPU family exists.
Completion Notes Captured 2026-04-15¶
src/vibespatial/overlay/strategies.py now owns a real constructive planning object with operation, workload_shape, topology_class, semantics_flags, result_shape, execution_family, and a staged fusion_plan.
The overlay planner now distinguishes the target workload families explicitly: clip_rewrite, broadcast_right_intersection, broadcast_right_difference, coverage_union, grouped_union, and generic_reconstruction.
src/vibespatial/overlay/gpu.py and src/vibespatial/api/tools/overlay.py both record the same planner-selected telemetry detail, including execution_family, topology_class, result_shape, semantics, and fusion_stages.
Planning now happens at the public boundary or the owned chunk boundary before heavy work starts; the runtime no longer has to infer the execution family from scattered mid-pipeline branches.
Focused M6 verification is green: tests/test_spatial_overlay.py and the clip-rewrite dispatch assertion in tests/test_overlay_api.py pass, property distance stayed 0.0, and contract / GPU health remained baseline-clean.
Exit Criteria¶
constructive planning is about workload families, not only public methods
specialized fast paths are selected by planner evidence, not by ad hoc API branching
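Selecting a family from evidence, rather than from the public method name alone, might look like the sketch below. Everything here is hypothetical: `WorkloadEvidence`, its fields, the selection rules, and the `runtime=` telemetry key are assumptions made for illustration and do not describe the planner in src/vibespatial/overlay/strategies.py.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WorkloadEvidence:
    """Hypothetical inputs a planner might inspect before heavy work starts."""
    operation: str            # "intersection", "difference", "union", ...
    right_row_count: int
    right_is_rectangle: bool
    gpu_available: bool


def select_execution_family(evidence: WorkloadEvidence) -> Tuple[str, str]:
    """Return (execution_family, runtime). The rules are illustrative only."""
    if not evidence.gpu_available:
        # Keep the fallback explicit instead of silently degrading.
        return "generic_reconstruction", "cpu_fallback"
    if evidence.operation == "intersection" and evidence.right_is_rectangle:
        return "clip_rewrite", "gpu"
    if evidence.right_row_count == 1 and evidence.operation == "intersection":
        return "broadcast_right_intersection", "gpu"
    if evidence.right_row_count == 1 and evidence.operation == "difference":
        return "broadcast_right_difference", "gpu"
    return "generic_reconstruction", "gpu"


def telemetry_detail(evidence: WorkloadEvidence) -> str:
    """Format the decision so the selected family is visible in dispatch logs."""
    family, runtime = select_execution_family(evidence)
    return f"execution_family={family} runtime={runtime}"


detail = telemetry_detail(WorkloadEvidence("intersection", 1, False, True))
# detail == "execution_family=broadcast_right_intersection runtime=gpu"
```

The point of the shape is that the same public operation can land in different families, and that the choice, including a CPU fallback, is recorded rather than inferred later.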
Milestone M7: Fused Stage Clusters And Specialized Kernel Family Acceleration¶
Goal¶
Use the planner-selected execution family to apply fused stage clusters and specialized kernels where they improve throughput without erasing explicit algorithm boundaries.
Primary Surfaces¶
src/vibespatial/runtime/fusion.py
src/vibespatial/overlay/reconstruction.py
src/vibespatial/overlay/gpu.py
specialized clip / bypass / grouped-union kernels
Checklist¶
[x] Identify ephemeral stage clusters that can be fused safely inside each workload family.
[x] Keep reusable structures such as indexes, group offsets, and stable-sorted edge order persisted rather than fused away.
[x] Extend existing specialized families such as containment bypass and batched SH clip where the planner can prove the narrower shape.
[x] Add fused device-local tagging / filtering / compaction steps where lower-dimensional cleanup can remain native.
[x] Avoid implementing a single monolithic “generic overlay” fused kernel.
[x] Add benchmarks and telemetry that identify which execution family and fused stage cluster actually ran.
Completion Notes Captured 2026-04-15¶
The overlay planner now emits staged FusionPlan objects per execution family instead of treating fusion as a later side note. Broadcast-right intersection records a fused chain for containment_bypass -> batched_sh_clip -> row_isolated_overlay, grouped set operations persist candidate_pairs / group_offsets and then fuse the segmented_union -> row_isolated_overlay chain, and generic reconstruction keeps a minimal staged shape.
src/vibespatial/overlay/gpu.py now routes the existing specialized GPU families through those planner-selected execution families, so containment bypass, batched SH clip, grouped right-neighbour union, and the generic row-isolated reconstruction path are all selected by explicit planner evidence instead of ad hoc how / shape checks.
Telemetry now exposes the fused stage cluster that actually ran through the fusion_stages= detail field, and the mandatory full profile still shows the generic reconstruction pipeline as staged rather than collapsed into one opaque kernel.
Full 1M profile after the family/fusion work stayed healthy on the target RTX 4090: join-heavy is still dominated by dissolve_groups (62.40ms), constructive by write_output (62.77ms), predicate-heavy by read_geojson (78.27ms), and zero-transfer by read_input / write_output (85.37ms / 84.07ms).
Exit Criteria¶
fused execution is real for selected workload families
generic overlay still remains a staged reconstruction pipeline
performance gains come from planner-selected true-operation paths, not from hiding the algorithm inside one opaque kernel
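The split between structures that are persisted and stages that are fused can be sketched as a small per-family table. `FusionPlanSketch` and `FUSION_BY_FAMILY` are hypothetical; the stage names come from the completion notes above, but the mapping of the grouped chain to the `grouped_union` key and the exact `fusion_stages=` string format are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FusionPlanSketch:
    """Hypothetical shape: structures to persist plus stages fused together."""
    persisted: Tuple[str, ...]
    fused_stages: Tuple[str, ...]

    def telemetry_detail(self) -> str:
        # Assumed format for the fusion_stages= detail field.
        return "fusion_stages=" + "->".join(self.fused_stages)


FUSION_BY_FAMILY = {
    "broadcast_right_intersection": FusionPlanSketch(
        persisted=(),
        fused_stages=("containment_bypass", "batched_sh_clip", "row_isolated_overlay"),
    ),
    "grouped_union": FusionPlanSketch(
        persisted=("candidate_pairs", "group_offsets"),
        fused_stages=("segmented_union", "row_isolated_overlay"),
    ),
}

grouped_detail = FUSION_BY_FAMILY["grouped_union"].telemetry_detail()
# grouped_detail == "fusion_stages=segmented_union->row_isolated_overlay"
```

Keeping `persisted` separate from `fused_stages` mirrors the rule that reusable structures stay materialized while only ephemeral, device-local stages are fused.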
Milestone M8: Wrapper Deletion, Convergence, And Final Verification¶
Goal¶
Delete obsolete wrapper types, collapse the remaining compatibility debt, and prove the unified architecture through contract, health, and profiling rails.
Primary Surfaces¶
src/vibespatial/api/_native_results.py
overlay / clip / dissolve public adapters
test and health rails
docs and intake index
Checklist¶
[x] Delete or fully demote the obsolete constructive wrapper classes.
[x] Simplify to_native_tabular_result() so it reflects the new steady state rather than migration glue.
[x] Remove dead overlay / clip / dissolve conversion helpers and host composition fallbacks that no longer serve a real boundary.
[x] Update architecture docs to describe the new steady state instead of the migration path.
[x] Re-run upstream contract surfaces and ensure the steady-state reds are gone or explicitly baselined for product reasons rather than architecture debt.
[x] Re-run contract and gpu health ratchets.
[x] Run the mandatory full end-to-end pipeline profile and record the stage summary for the target machine.
Completion Notes Captured 2026-04-15¶
src/vibespatial/api/_native_results.py now lowers pairwise and left-row-preserving constructive results through direct shared builders (_pairwise_constructive_to_native_tabular_result() and _left_constructive_to_native_tabular_result()) instead of building fragment wrapper objects first.
The obsolete fragment / concat migration layer was removed from the steady state: PairwiseConstructiveFragment, LeftConstructiveFragment, ConcatConstructiveResult, and SymmetricDifferenceConstructiveResult no longer define the architecture. Thin compatibility helper functions remain only so the no-fragment regression tests can keep proving those symbols are not on the hot path.
src/vibespatial/api/tools/overlay.py now calls the direct builders explicitly for intersection, difference, identity, symmetric difference, and union, so overlay no longer relies on wrapper-shaped lowering even internally.
to_native_tabular_result() now reflects the actual steady state: NativeTabularResult, GeometryNativeResult, GroupedConstructiveResult, ClipNativeResult, and relation-join export results.
Architecture docs were updated to describe the steady state in docs/architecture/overlay-reconstruction.md and docs/architecture/fusion.md.
Final verification snapshot: focused M8 regression slices passed, property dashboard stayed 6/6 clean with distance 0.00, contract --check is now fully green for required surfaces with overlay at 255/261 and the known optional performance_rails debt still at 25/26, gpu --check is PASS at 20.08% acceleration (4380 / 21778 GPU dispatches, 16 fallbacks), and the contract baseline file was updated to lock in the improved overlay and Arrow/Parquet counts.
The required overlay debt was burned down completely after the final keep-geom-type / exact-host cleanup pass, so the remaining red in repo health is product-oriented optional performance_rails, not constructive boundary architecture debt.
The public-API 10k shootout improved from 4/10 parity-or-better workloads at M0 to 6/10 parity-or-better after the final overlay cleanup, with transit_service_gap and vegetation_corridor moving above GeoPandas parity.
Exit Criteria¶
overlay, clip, and dissolve all converge on NativeTabularResult
one explicit constructive export boundary remains
wrapper debt is removed rather than renamed
planner-selected workload families and fused stage clusters are live
contract, health, and end-to-end profile evidence support the new shape