Spatial Query And Join Assembly¶
Context¶
o17.4.3 needs to turn the Phase-3 coarse filters and Phase-4 predicate engine
into GeoPandas-compatible sindex.query, sjoin, dwithin, and nearest
results without losing pandas join semantics.
Decision¶
Keep pandas/DataFrame join assembly in the vendored GeoPandas helpers and land a repo-owned spatial query engine as the future GPU/performance seam.
The owned engine is responsible for:
flat spatial index candidate generation
exact predicate refine on compacted candidate rows
distance-expanded candidate generation for
dwithinbounded nearest reduction when
max_distanceis provided
The vendored layer remains responsible for:
join
howsemanticssuffix and overlapping-column handling
index restoration
geometry-column retention
Unsupported geometry inputs continue to fall back to the original STRtree path.
Current host benchmarks show the owned query path is not yet competitive with
STRtree, so the vendored sindex adapter remains on STRtree for default host
execution today.
Consequences¶
targeted upstream
sindexandsjointests keep assembly semantics stablejoin/result semantics stay aligned with GeoPandas instead of being reimplemented
future GPU/CCCL work can replace candidate generation and bounded nearest internals without changing the public assembly behavior
Alternatives Considered¶
Reimplement full pandas/DataFrame join assembly in repo-owned code. Rejected because it would duplicate a lot of mature GeoPandas semantics.
Keep all query logic on STRtree and only patch
sjoin. Rejected because it would leave no owned query engine for GPU-oriented work.Attempt an all-in nearest implementation now. Rejected because unbounded nearest still benefits from the existing STRtree path, while bounded nearest is the GPU-friendly path worth specializing first.
Acceptance Notes¶
This decision lands the owned query engine plus vendored assembly integration and
validates them against targeted upstream sindex and sjoin families. Unbounded
nearest still uses STRtree today; bounded nearest now has an owned staged path.