vibespatial.io.wkt_gpu

GPU WKT reader – structural analysis, coordinate extraction, and assembly.

GPU-accelerated WKT parser. Given a device-resident byte array containing one or more WKT geometries (one per line), this module performs:

  1. Line splitting – detect newline boundaries to delimit individual geometry strings.

  2. Parenthesis depth – reuse gpu_parse.bracket_depth with open_chars="(", close_chars=")".

  3. Geometry type classification – a custom NVRTC kernel scans the start of each geometry string and emits a family tag (POINT=0, LINESTRING=1, POLYGON=2, MULTIPOINT=3, MULTILINESTRING=4, MULTIPOLYGON=5) plus an EMPTY flag. Handles case-insensitive matching and EWKT SRID=NNNN; prefixes.

  4. Coordinate extraction – locate coordinate regions, extract numeric values via gpu_parse primitives, and build per-geometry offset arrays.

  5. OwnedGeometryArray assembly – pack device-resident coordinates and offsets into the standard columnar geometry representation.

All parsing stages run on the GPU; coordinate data is not copied to the host until the caller explicitly requests it.
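The line-splitting step (step 1) can be sketched on the CPU in plain Python. This is only an illustration of the idea, not the module's code: the helper name `geometry_starts` is hypothetical, and skipping blank lines is an assumption made here. On the GPU the same result would come from an element-wise comparison followed by flatnonzero.

```python
def geometry_starts(data):
    """Return the start byte offset of each non-empty line in `data` (bytes)."""
    # Positions of every newline byte (0x0A).
    newlines = [i for i, b in enumerate(data) if b == 0x0A]
    # A line can start at byte 0 and immediately after each newline.
    candidates = [0] + [i + 1 for i in newlines]
    # Drop candidates that fall past the end of the buffer or that point at
    # another newline; this is how a trailing newline (and, by assumption,
    # a blank line) is ignored in this sketch.
    return [s for s in candidates if s < len(data) and data[s] != 0x0A]


starts = geometry_starts(b"POINT(1 2)\nLINESTRING(0 0, 1 1)\nPOLYGON EMPTY\n")
print(starts)  # [0, 11, 32]
```

The trailing newline produces a candidate offset equal to the buffer length, which is filtered out, so three geometries are reported rather than four.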

Tier classification (ADR-0033):
  • Line splitting: Tier 2 (CuPy element-wise + flatnonzero)

  • Parenthesis depth: delegates to gpu_parse.bracket_depth (Tier 1)

  • Type classification: Tier 1 (custom NVRTC – text-specific prefix matching)

  • Coordinate region finding: Tier 1 (custom NVRTC – paren-start scan)

  • Number extraction: delegates to gpu_parse primitives (Tier 1/2)

  • Per-geometry counting: Tier 1 (custom NVRTC – span-local counting)

  • Ring counting: Tier 1 (custom NVRTC – depth-aware paren counting)

  • Offset building: Tier 2 (CuPy cumsum) or CCCL exclusive_sum

  • Assembly: follows geojson_gpu.py patterns
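The offset-building stage amounts to an exclusive prefix sum over per-geometry counts. The following is a minimal CPU sketch of that idea; the helper name `exclusive_sum_offsets` is hypothetical, and the choice of an offsets array with n + 1 entries is an assumption rather than something this page states.

```python
from itertools import accumulate


def exclusive_sum_offsets(counts):
    """Turn per-geometry element counts into an offsets list of length n + 1."""
    # offsets[i] is the sum of all counts before geometry i, so geometry i
    # owns the half-open range offsets[i]:offsets[i + 1].
    return [0] + list(accumulate(counts))


coord_counts = [1, 2, 0, 4]  # e.g. a point, a 2-vertex line, an EMPTY, a ring
offsets = exclusive_sum_offsets(coord_counts)
print(offsets)  # [0, 1, 3, 3, 7]
```

An EMPTY geometry contributes a zero count, so its start and end offsets coincide.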

Precision (ADR-0002):

Structural and counting kernels are integer-only byte classification. No floating-point coordinate computation occurs in those kernels, so no PrecisionPlan is needed (same rationale as gpu_parse/structural.py). Coordinate parsing delegates to gpu_parse.parse_ascii_floats which always produces fp64 – storage precision is always fp64 per ADR-0002.

Attributes

Classes

WktStructuralResult

Result of WKT structural analysis.

Functions

wkt_structural_analysis(→ WktStructuralResult)

Perform structural analysis and geometry type detection on WKT input.

read_wkt_gpu(...)

Parse WKT bytes on GPU and return device-resident geometry.

Module Contents

vibespatial.io.wkt_gpu.cp = None
vibespatial.io.wkt_gpu.KERNEL_PARAM_I64
class vibespatial.io.wkt_gpu.WktStructuralResult

Result of WKT structural analysis.

All arrays are device-resident CuPy arrays except n_geometries which is a Python int.

Attributes

d_depth : cp.ndarray

Per-byte parenthesis depth, int32, shape (n_bytes,).

d_geom_starts : cp.ndarray

Start byte offset of each geometry, int64, shape (n_geometries,).

d_family_tags : cp.ndarray

Geometry family tag per geometry, int8, shape (n_geometries,). Values: 0=POINT, 1=LINESTRING, 2=POLYGON, 3=MULTIPOINT, 4=MULTILINESTRING, 5=MULTIPOLYGON, -2=unknown/unsupported.

d_empty_flags : cp.ndarray

Per-geometry EMPTY flag, uint8, shape (n_geometries,). 1 if the geometry uses the EMPTY keyword, 0 otherwise.

n_geometries : int

Number of geometries detected.

d_depth: cupy.ndarray
d_geom_starts: cupy.ndarray
d_family_tags: cupy.ndarray
d_empty_flags: cupy.ndarray
n_geometries: int
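As an illustration of how the tag and flag arrays might be read once copied to the host (for example via `.get().tolist()`), here is a small sketch. The mapping follows the tag values documented above; the names `FAMILY_NAMES` and `describe` are hypothetical and not part of this module.

```python
# Tag values as documented for d_family_tags; -2 marks unknown/unsupported.
FAMILY_NAMES = {
    0: "POINT",
    1: "LINESTRING",
    2: "POLYGON",
    3: "MULTIPOINT",
    4: "MULTILINESTRING",
    5: "MULTIPOLYGON",
    -2: "UNKNOWN",
}


def describe(family_tags, empty_flags):
    """Return one human-readable label per geometry from host-side lists."""
    labels = []
    for tag, empty in zip(family_tags, empty_flags):
        name = FAMILY_NAMES.get(int(tag), "UNKNOWN")
        labels.append(name + " EMPTY" if empty else name)
    return labels


labels = describe([0, 1, 2], [0, 0, 1])
print(labels)  # ['POINT', 'LINESTRING', 'POLYGON EMPTY']
```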
vibespatial.io.wkt_gpu.wkt_structural_analysis(d_bytes: cupy.ndarray) → WktStructuralResult

Perform structural analysis and geometry type detection on WKT input.

Given a device-resident byte array containing one or more WKT geometry strings separated by newlines, this function:

  1. Detects line boundaries (newline positions) to delimit geometries.

  2. Computes per-byte parenthesis depth using bracket_depth.

  3. Classifies each geometry by type keyword and detects EMPTY.

The input may contain:

  • Standard WKT: POINT(1 2)

  • EWKT with SRID prefix: SRID=4326;POINT(1 2)

  • Mixed case: Point(1 2), LINESTRING(...)

  • 3D/M suffixes: POINT Z(1 2 3), POINTZ(1 2 3)

  • Empty geometries: POINT EMPTY
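The accepted input forms listed above can be illustrated with a CPU reference classifier. This is a sketch under assumptions, not the NVRTC kernel: the name `classify` and the regular expression are hypothetical, and the handling of fused suffixes such as POINTZ is one plausible reading of the list.

```python
import re

FAMILY_TAGS = {
    "POINT": 0, "LINESTRING": 1, "POLYGON": 2,
    "MULTIPOINT": 3, "MULTILINESTRING": 4, "MULTIPOLYGON": 5,
}
UNKNOWN_TAG = -2

# Optional EWKT prefix (SRID=NNNN;) followed by the type keyword.
_PREFIX = re.compile(rb"\s*(?:SRID=\d+;)?\s*([A-Za-z]+)", re.IGNORECASE)


def classify(line):
    """Return (family_tag, empty_flag) for one WKT line given as bytes."""
    m = _PREFIX.match(line)
    if m is None:
        return UNKNOWN_TAG, 0
    keyword = m.group(1).upper().decode("ascii")
    tag = UNKNOWN_TAG
    # Try the bare keyword first, then fused dimension suffixes (POINTZ, ...).
    for suffix in ("", "ZM", "Z", "M"):
        base = keyword[: len(keyword) - len(suffix)]
        if keyword.endswith(suffix) and base in FAMILY_TAGS:
            tag = FAMILY_TAGS[base]
            break
    # EMPTY must be the last word before any opening parenthesis.
    words = line[m.end():].upper().split(b"(")[0].split()
    empty = 1 if words[-1:] == [b"EMPTY"] else 0
    return tag, empty


print(classify(b"SRID=4326;Point(1 2)"))  # (0, 0)
print(classify(b"POLYGON EMPTY"))         # (2, 1)
```

Unsupported keywords such as GEOMETRYCOLLECTION fall through to the -2 tag, matching the documented unknown/unsupported value.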

Parameters

d_bytes : cp.ndarray

Device-resident uint8 array of WKT text bytes, shape (n,). Multiple geometries are separated by newline characters (\n, 0x0A). Trailing newlines are handled gracefully.

Returns

WktStructuralResult

Dataclass containing all structural analysis outputs on device.

Notes

WKT has no string quoting, so bracket_depth receives an all-zeros quote-parity array. This causes the depth kernel to treat every parenthesis as structural.

The parenthesis depth array uses the same convention as the GeoJSON bracket depth:

  • Depth 0: outside all geometry parentheses

  • Depth 1: inside the outermost (...)

  • Depth 2+: nested rings, coordinate groups, etc.
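The depth convention above can be illustrated with a sequential CPU sketch. The function name `paren_depth` is hypothetical, and one detail is an assumption: here each byte records the depth after that byte has been processed (an inclusive running sum), so an opening parenthesis already carries the inner depth and a closing one carries the outer depth. The actual kernel may assign the parenthesis bytes differently.

```python
def paren_depth(data):
    """Return the per-byte parenthesis depth of `data` (bytes) as a list of ints."""
    depth = 0
    out = []
    for b in data:
        if b == ord("("):
            depth += 1
        elif b == ord(")"):
            depth -= 1
        # Record the running depth after handling this byte.
        out.append(depth)
    return out


depths = paren_depth(b"POLYGON((0 0))")
print(depths)  # [0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 2, 1, 0]
```

Because well-formed WKT has balanced parentheses on every line, the depth returns to 0 before each newline, so no per-line reset is needed in this sketch.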

Examples

>>> import cupy as cp
>>> wkt = b"POINT(1 2)\nLINESTRING(0 0, 1 1)\nPOLYGON EMPTY"
>>> d_bytes = cp.frombuffer(wkt, dtype=cp.uint8)
>>> result = wkt_structural_analysis(d_bytes)
>>> result.n_geometries
3
>>> result.d_family_tags.get()
array([0, 1, 2], dtype=int8)
>>> result.d_empty_flags.get()
array([0, 0, 1], dtype=uint8)
vibespatial.io.wkt_gpu.read_wkt_gpu(d_bytes: cupy.ndarray) → vibespatial.geometry.owned.OwnedGeometryArray

Parse WKT bytes on GPU and return device-resident geometry.

Given a device-resident byte array containing one or more WKT geometry strings separated by newlines, this function performs full GPU-accelerated parsing: structural analysis, coordinate extraction, and assembly into an OwnedGeometryArray.

Supported geometry types:

  • POINT, LINESTRING, POLYGON (full support)

  • MULTIPOINT, MULTILINESTRING, MULTIPOLYGON (stretch goal)

  • EMPTY variants of all types

Parameters

d_bytes : cp.ndarray

Device-resident uint8 array of WKT text bytes, shape (n,). Multiple geometries are separated by newline characters (\n, 0x0A).

Returns

OwnedGeometryArray

Device-resident geometry array. Coordinates are always fp64. Structural metadata (offsets, validity) is materialized on both host and device per the standard _build_device_*_owned pattern.

Raises

ValueError

If the input contains only unsupported geometry types (e.g., GEOMETRYCOLLECTION) or cannot be parsed.

Notes

Precision (ADR-0002):

All coordinates are parsed and stored as fp64. The structural analysis and counting kernels are integer-only byte classification – no PrecisionPlan is needed for those stages.

Tier classification (ADR-0033):

Uses Tier 1 (custom NVRTC) for geometry-specific scanning and Tier 2 (CuPy) for element-wise operations. Number parsing delegates to the gpu_parse primitives.

Examples

>>> import cupy as cp
>>> wkt = b"POINT(1 2)\nLINESTRING(0 0, 1 1, 2 0)"
>>> d_bytes = cp.frombuffer(wkt, dtype=cp.uint8)
>>> owned = read_wkt_gpu(d_bytes)
>>> owned.row_count
2
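For cross-checking GPU output on small inputs, a pure-Python reference parser can be handy. The sketch below is hypothetical and deliberately limited: it assumes 2D POINT and LINESTRING lines only, and it returns flat x/y lists plus an offsets list rather than the real OwnedGeometryArray layout, which is not described on this page. Python floats are IEEE 754 doubles, so the parsed values are fp64.

```python
import re

# Plain decimal or scientific-notation numbers.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def reference_parse(wkt):
    """Parse 2D POINT/LINESTRING lines into (xs, ys, offsets)."""
    xs, ys, offsets = [], [], [0]
    for line in wkt.splitlines():
        # The coordinate region is whatever sits between the outer parentheses;
        # lines without parentheses (e.g. EMPTY) contribute no coordinates.
        body = line[line.find("(") + 1 : line.rfind(")")] if "(" in line else ""
        values = [float(tok) for tok in _NUMBER.findall(body)]
        xs.extend(values[0::2])  # even positions are x
        ys.extend(values[1::2])  # odd positions are y
        offsets.append(len(xs))
    return xs, ys, offsets


xs, ys, offsets = reference_parse("POINT(1 2)\nLINESTRING(0 0, 1 1, 2 0)")
print(offsets)  # [0, 1, 4]
```

Geometry i owns the coordinate range offsets[i]:offsets[i + 1], so the point holds one vertex and the linestring three.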