Geometry Buffer Schema¶
Use separated fp64 coordinate buffers plus hierarchical offsets as the owned geometry core.
Intent¶
Define the concrete buffer schema for the six primary geometry families before adapter and kernel work begins.
Request Signals¶
geometry buffer
offsets
geoarrow layout
owned array
coordinate buffers
multipart schema
Open First¶
docs/architecture/geometry-buffers.md
docs/architecture/mixed-geometries.md
docs/architecture/precision.md
src/vibespatial/geometry/buffers.py
src/vibespatial/geometry/owned.py
src/vibespatial/kernels/core/geometry_analysis.py
docs/decisions/0008-owned-geometry-buffer-schema.md
Verify¶
uv run pytest tests/test_geometry_buffers.py
uv run python scripts/check_docs.py --check
Risks¶
Choosing an eager object-like layout now would force expensive rewrites once kernels want contiguous payloads.
Overfitting the schema to one geometry family would make mixed arrays and multipart kernels awkward later.
Mixing canonical storage concerns with execution-local staging would blur residency and precision boundaries.
Canonical Rule¶
Canonical owned storage uses separated x and y coordinate buffers.
Canonical coordinate precision is fp64.
Nulls use a validity bitmap.
Empties use valid rows with zero-length spans.
Multipart structure is represented with prefix-offset buffers, not nested Python objects.
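The rules above can be sketched with plain typed buffers. This is a minimal illustration, not the layout code in src/vibespatial/geometry/owned.py: the variable names, the helper functions, and the least-significant-bit-first bitmap order are assumptions made for the example, and the standard-library array module stands in for whatever buffer type the project uses.

```python
from array import array

# Three LineString rows: a 2-vertex line, a null row, and a valid empty row.
x = array("d", [0.0, 1.0])                   # fp64 x coordinates
y = array("d", [0.0, 2.0])                   # fp64 y coordinates, same length as x
geometry_offsets = array("i", [0, 2, 2, 2])  # prefix offsets, length N + 1
validity = bytearray([0b101])                # assumed LSB-first: bit i set => row i valid


def is_valid(row: int) -> bool:
    """Read one bit from the validity bitmap (assumed least-significant bit first)."""
    return bool((validity[row // 8] >> (row % 8)) & 1)


def is_empty(row: int) -> bool:
    """A row is empty when it is valid and its span has zero length."""
    return is_valid(row) and geometry_offsets[row] == geometry_offsets[row + 1]
```

Rows 1 and 2 have identical offsets; only the bitmap distinguishes the null row from the empty one.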
Family Schemas¶
Point¶
geometry_offsets: row -> coordinate
valid non-empty rows own exactly one coordinate pair
empty rows own zero coordinate pairs
LineString¶
geometry_offsets: row -> coordinate
payload slice: x[start:end], y[start:end]
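As a small sketch of the payload slice, assuming two LineString rows stored in stdlib typed arrays (the linestring_coords helper is hypothetical and exists only for this example):

```python
from array import array

# Two LineString rows sharing one pair of coordinate buffers.
x = array("d", [0.0, 1.0, 2.0, 10.0, 11.0])
y = array("d", [0.0, 1.0, 0.0, 5.0, 6.0])
geometry_offsets = array("i", [0, 3, 5])  # row 0 -> [0, 3), row 1 -> [3, 5)


def linestring_coords(row: int) -> list:
    """Return the (x, y) pairs of one row by slicing both coordinate buffers."""
    start, end = geometry_offsets[row], geometry_offsets[row + 1]
    return list(zip(x[start:end], y[start:end]))
```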
Polygon¶
geometry_offsets: row -> ring
ring_offsets: ring -> coordinate
payload hierarchy: row -> ring -> coordinate
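The two-level hierarchy can be walked as follows. This is an illustrative sketch with made-up data; the polygon_rings helper is hypothetical, and treating the first ring of a row as the exterior is an assumption of the example rather than something this schema states.

```python
from array import array

# Row 0: a square with one hole (two rings). Row 1: a triangle (one ring).
x = array("d", [0.0, 4.0, 4.0, 0.0, 0.0,  1.0, 2.0, 2.0, 1.0, 1.0,  10.0, 11.0, 10.0, 10.0])
y = array("d", [0.0, 0.0, 4.0, 4.0, 0.0,  1.0, 1.0, 2.0, 2.0, 1.0,  10.0, 10.0, 11.0, 10.0])
ring_offsets = array("i", [0, 5, 10, 14])  # ring -> coordinate
geometry_offsets = array("i", [0, 2, 3])   # row -> ring


def polygon_rings(row: int) -> list:
    """Return one list of (x, y) pairs per ring belonging to the given row."""
    rings = []
    for ring in range(geometry_offsets[row], geometry_offsets[row + 1]):
        start, end = ring_offsets[ring], ring_offsets[ring + 1]
        rings.append(list(zip(x[start:end], y[start:end])))
    return rings
```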
MultiPoint¶
geometry_offsets: row -> coordinate
same physical shape as LineString, different semantics
MultiLineString¶
geometry_offsets: row -> part
part_offsets: part -> coordinate
payload hierarchy: row -> line part -> coordinate
MultiPolygon¶
geometry_offsets: row -> polygon part
part_offsets: polygon part -> ring
ring_offsets: ring -> coordinate
payload hierarchy: row -> polygon part -> ring -> coordinate
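Because every level is a prefix-offset array, the levels compose: the full coordinate span of a row can be found by pushing the pair (row, row + 1) through each offset buffer in turn. The sketch below assumes small hand-written offsets, and coordinate_span is a hypothetical helper rather than an existing function in the codebase.

```python
from array import array

# Two MultiPolygon rows: row 0 has two polygon parts, row 1 has one.
geometry_offsets = array("i", [0, 2, 3])      # row -> polygon part
part_offsets = array("i", [0, 1, 3, 4])       # polygon part -> ring
ring_offsets = array("i", [0, 4, 8, 12, 16])  # ring -> coordinate


def coordinate_span(row: int, levels) -> tuple:
    """Compose prefix offsets, outermost level first, into a coordinate [start, end)."""
    start, end = row, row + 1
    for offsets in levels:
        start, end = offsets[start], offsets[end]
    return start, end


multipolygon_levels = (geometry_offsets, part_offsets, ring_offsets)
```

The same helper covers the shallower families by passing fewer levels, for example a two-level tuple for MultiLineString or Polygon.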
Mixed-Geometry Integration¶
The mixed-array contract from o17.2.12 remains canonical.
Mixed arrays should store a coarse family tag plus a family-relative row offset.
Family payload buffers must stay reusable without copying coordinate payloads during sort-partition execution.
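A rough sketch of the tag-plus-offset idea follows. The tag values, buffer names, and the sort_partition helper are all hypothetical and are not taken from the mixed-array contract; the point is only that partitioning manipulates index buffers and leaves coordinate payloads where they are.

```python
from array import array

# Hypothetical family tag values; the real contract may number them differently.
POINT, LINESTRING, POLYGON = 0, 1, 2

# Mixed array with four rows: Point, LineString, Point, Polygon.
family_tags = array("b", [POINT, LINESTRING, POINT, POLYGON])
# Row index inside each family's own owned buffers.
family_rows = array("i", [0, 0, 1, 0])


def sort_partition(tags) -> array:
    """Return a stable permutation of mixed rows grouped by family tag."""
    return array("i", sorted(range(len(tags)), key=tags.__getitem__))


permutation = sort_partition(family_tags)

# Gather family-relative rows through the permutation; no coordinates are copied.
point_rows = [family_rows[i] for i in permutation if family_tags[i] == POINT]
```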
Adapter Surface¶
The owned-array bootstrap surface currently supports:
Shapely geometry sequences -> owned arrays
WKB sequences -> owned arrays
GeoArrow-style buffer views -> owned arrays
owned arrays -> Shapely
owned arrays -> WKB
owned arrays -> GeoArrow-style buffer views
The current GeoArrow path is a typed buffer-view contract, not a full pyarrow
extension-array integration. That narrower surface is enough for CPU
validation, buffer inspection, and later IO work to target.
Offset Rules¶
Offsets are prefix arrays with length N + 1.
Empty valid geometries use equal adjacent offsets.
Nullness is never encoded by offset shape.
int32 is the default offset dtype for Phase 2; revisit only when measured scale requires wider offsets.
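One way to build such a prefix array from per-row lengths is a running sum with an explicit int32 range check. This is a sketch under assumptions: build_offsets is a hypothetical helper, and array("i") is used as a stand-in for an int32 buffer (it is 4 bytes wide on mainstream platforms, but the C standard does not guarantee that).

```python
from array import array
from itertools import accumulate

INT32_MAX = 2**31 - 1


def build_offsets(lengths) -> array:
    """Turn N per-row lengths into an N + 1 prefix-offset array.

    Empty rows contribute length 0, which yields equal adjacent offsets.
    Null rows would also contribute 0 here; nullness lives in the validity bitmap.
    """
    offsets = [0, *accumulate(lengths)]
    if offsets[-1] > INT32_MAX:
        raise OverflowError("total span exceeds int32; wider offsets are needed")
    return array("i", offsets)


offsets = build_offsets([3, 0, 2])  # middle row is empty
```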
Execution Boundaries¶
Centered fp32 work buffers from the precision policy are execution-local artifacts, not canonical storage.
Permutation buffers for mixed execution are execution-local artifacts, not canonical storage.
Residency attaches to the owned buffer object as a whole, not separately to every offset array.
Buffer-boundary diagnostics belong to the owned array object and must survive transfers and explicit fallback decisions.
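To illustrate the first boundary, the sketch below derives a centered fp32 work buffer from a canonical fp64 buffer without modifying it. The centered_fp32 helper and the choice of the bounds midpoint as the center are assumptions made for this example; the actual centering rule is defined by the precision policy, not here.

```python
from array import array

# Canonical storage: fp64, never rewritten by execution code.
x = array("d", [500000.125, 500000.375, 500001.5])


def centered_fp32(coords):
    """Return (center, work), where work holds fp32 offsets from an assumed midpoint."""
    center = (min(coords) + max(coords)) / 2.0
    work = array("f", (c - center for c in coords))
    return center, work


center, x_work = centered_fp32(x)
```

The work buffer is a separate, disposable allocation, so dropping it after a kernel runs leaves the canonical fp64 payload untouched.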