Staged Binary Predicate Refine Pipeline

Context

o17.4.2 must provide exact binary-predicate refine kernels that later sindex, sjoin, and overlay work can reuse. The repo already has coarse bounds filters, exactness policy, and a first point-predicate pipeline. The missing piece is a shared exact-refine engine for aligned binary predicates.

Decision

Adopt a shared staged exact-refine engine for binary predicates:

  • choose a coarse bounding-box relation per predicate family

  • compact only the rows that survive the coarse relation

  • run the exact predicate on the compacted candidate set

  • scatter exact results back to the full output shape

Route the vendored GeoPandas GeometryArray binary predicate path through this engine boundary for the supported predicate set, while keeping current host execution on direct vectorized Shapely until performance data justifies moving the adapter onto the staged refine path.

Consequences

  • exact predicate semantics now have one stable internal API

  • targeted upstream GeoPandas predicate tests exercise repo-owned routing

  • the future GPU implementation can swap in CCCL-based compaction and device-side exact kernels without changing the GeoPandas adapter contract

  • current host workloads avoid a measured regression from coarse-filter staging

Alternatives Considered

  • Leave the vendored GeoPandas path on direct Shapely calls and build a separate internal engine later. Rejected because it would duplicate semantics and defer adapter integration.

  • Build one monolithic raw CUDA kernel per predicate. Rejected because it would hide compaction boundaries and fight the repo’s CCCL-first direction.

  • Use direct Shapely for every row forever. Rejected because it gives no reusable GPU path and no place to hang exact-refine diagnostics.

Acceptance Notes

This decision lands CPU variants only. Exact binary-predicate semantics, vendored GeoPandas routing, targeted upstream validation, and the staged engine contract are in place now. GPU predicate variants remain future work on top of this pipeline.