vibespatial.io.kml_gpu

GPU KML reader – structural analysis, coordinate extraction, and assembly.

GPU-accelerated KML parser. Given a device-resident byte array containing a KML document, this module performs:

  1. XML comment masking – detect <!-- ... --> comment boundaries and produce a per-byte mask that suppresses tag matches inside comments.

  2. Tag boundary detection – use gpu_parse.pattern_match() to find specific KML structural tags (<coordinates>, <Placemark>, <Point>, <LineString>, <Polygon>, <MultiGeometry>, <outerBoundaryIs>, <innerBoundaryIs>).

  3. Coordinate region detection – pair <coordinates>/</coordinates> tags to identify byte ranges containing coordinate content.

  4. Placemark boundary detection – pair <Placemark>/</Placemark> tags to group geometry by feature.

  5. Geometry type detection – within each Placemark, detect the geometry type tag to classify as Point, LineString, Polygon, or MultiGeometry.

  6. Dimensionality detection – count commas vs spaces in coordinate regions to determine 2D (lon,lat) or 3D (lon,lat,alt) format.

  7. Coordinate extraction – extract numeric values via gpu_parse primitives and de-interleave into x (lon) and y (lat) arrays.

  8. OwnedGeometryArray assembly – build device-resident geometry with proper offset arrays for Point, LineString, and Polygon types.

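All parsing stages run on the GPU. Coordinate data stays device-resident until the caller explicitly requests results; the exceptions are the structural metadata and Placemark attributes noted under read_kml_gpu and KmlGpuResult, which are also materialized on the host.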
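Step 6 can be illustrated on the CPU. The sketch below is a NumPy stand-in, not the module's NVRTC kernel: it counts commas and whitespace-delimited tuples in one coordinate region and infers the dimension from their ratio. The function name `detect_dim` and the exact decision rule are assumptions made for illustration.

```python
import numpy as np

def detect_dim(region: bytes) -> int:
    """Infer 2 (lon,lat) or 3 (lon,lat,alt) from byte counts alone."""
    buf = np.frombuffer(region, dtype=np.uint8)
    is_ws = np.isin(buf, np.frombuffer(b" \t\r\n", dtype=np.uint8))
    n_commas = int(np.count_nonzero(buf == ord(",")))
    # A tuple starts at every non-whitespace byte that follows whitespace
    # (or sits at the very start of the region).
    prev_ws = np.concatenate(([True], is_ws[:-1]))
    n_tuples = int(np.count_nonzero(~is_ws & prev_ws))
    if n_tuples == 0:
        return 0  # empty region: no dimension to report
    # 2D tuples carry one comma each, 3D tuples carry two.
    return n_commas // n_tuples + 1

dim2 = detect_dim(b"-122.08,37.42 -122.09,37.43")
dim3 = detect_dim(b"\n  -122.08,37.42,0\n  -122.09,37.43,0\n")
```

Because only byte classes are counted, no number has to be parsed to decide the dimension, which is why this stage can stay integer-only.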

KML coordinate format: lon,lat[,alt] lon,lat[,alt] ...

  • Components within a tuple: COMMA separated

  • Tuples: SPACE or NEWLINE separated

  • Longitude is FIRST (KML convention; this matches the x, y order of WKT)
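As a minimal host-side illustration of this format (plain Python and NumPy, not the gpu_parse primitives the module delegates to), the hypothetical helper `parse_kml_coordinates` below splits tuples on whitespace and components on commas, keeping longitude and latitude and ignoring any altitude.

```python
import numpy as np

def parse_kml_coordinates(text: str):
    """Split 'lon,lat[,alt] ...' text into fp64 x (lon) and y (lat) arrays."""
    tuples = text.split()            # tuples are whitespace separated
    x = np.empty(len(tuples), dtype=np.float64)
    y = np.empty(len(tuples), dtype=np.float64)
    for i, tup in enumerate(tuples):
        parts = tup.split(",")       # components are comma separated
        x[i] = float(parts[0])       # longitude comes first
        y[i] = float(parts[1])       # latitude second; altitude is ignored
    return x, y

x, y = parse_kml_coordinates("-122.08,37.42,0\n-122.09,37.43,0")
```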

Tier classification (ADR-0033):
  • Comment masking: Tier 1 (custom NVRTC – XML-specific comment detection)

  • Tag matching: delegates to gpu_parse.pattern_match (Tier 1)

  • Tag pairing + filtering: Tier 2 (CuPy element-wise + flatnonzero)

  • Geometry type assignment: Tier 1 (custom NVRTC – per-Placemark tag scan)

  • Comma/space counting: Tier 1 (custom NVRTC – per-region counting)

  • Region-to-Placemark assignment: Tier 1 (custom NVRTC – binary search)

  • Number extraction: delegates to gpu_parse primitives (Tier 1/2)

  • Offset building: Tier 2 (CuPy) / CCCL exclusive_sum

  • Assembly: follows wkt_gpu.py / geojson_gpu.py patterns
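Two of the stages above, region-to-Placemark assignment by binary search and offset building by exclusive sum, have direct NumPy analogues. The sketch below uses made-up byte offsets and counts and runs on the host; it assumes every coordinate region lies inside some Placemark and does not reproduce the module's NVRTC or CCCL code.

```python
import numpy as np

# Hypothetical byte offsets: three Placemarks, four coordinate regions.
placemark_starts = np.array([10, 200, 500], dtype=np.int64)
coord_starts = np.array([40, 230, 300, 540], dtype=np.int64)
coords_per_region = np.array([1, 4, 5, 2], dtype=np.int64)

# Binary search: index of the last Placemark that starts at or before
# each region. A region ahead of the first Placemark would yield -1.
region_to_placemark = (
    np.searchsorted(placemark_starts, coord_starts, side="right") - 1
)

# Exclusive prefix sum: offsets[i] is where region i's coordinates begin,
# and offsets[-1] is the total coordinate count.
offsets = np.zeros(len(coords_per_region) + 1, dtype=np.int64)
np.cumsum(coords_per_region, out=offsets[1:])
```

Here the second and third regions both map to Placemark 1, which is the situation a Polygon with an inner ring would produce.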

Precision (ADR-0002):

Structural and counting kernels are integer-only byte classification. No floating-point coordinate computation occurs in those kernels, so no PrecisionPlan is needed (same rationale as gpu_parse/structural.py). Coordinate parsing delegates to gpu_parse.parse_ascii_floats which always produces fp64 – storage precision is always fp64 per ADR-0002.

Attributes

Classes

KmlGpuResult

Result of GPU KML reading.

KmlStructuralResult

Result of KML structural analysis.

Functions

kml_structural_analysis(→ KmlStructuralResult)

Perform structural analysis on a KML document.

read_kml_gpu(→ KmlGpuResult)

Parse KML bytes on GPU and return device-resident geometry.

Module Contents

vibespatial.io.kml_gpu.cp = None
vibespatial.io.kml_gpu.KERNEL_PARAM_I64
vibespatial.io.kml_gpu.KML_FAMILY_POINT: int = 0
vibespatial.io.kml_gpu.KML_FAMILY_LINESTRING: int = 1
vibespatial.io.kml_gpu.KML_FAMILY_POLYGON: int = 2
vibespatial.io.kml_gpu.KML_FAMILY_MULTI: int = 6
vibespatial.io.kml_gpu.KML_FAMILY_UNKNOWN: int = -2
class vibespatial.io.kml_gpu.KmlGpuResult

Result of GPU KML reading.

Attributes

geometry : OwnedGeometryArray

Device-resident geometry array.

n_placemarks : int

Number of Placemarks (features) read.

attributes : dict[str, list[str | None]] or None

Extracted Placemark attributes (name, description) as host-resident string lists. None when no attributes found.

geometry: vibespatial.geometry.owned.OwnedGeometryArray
n_placemarks: int
attributes: dict[str, list[str | None]] | None = None
class vibespatial.io.kml_gpu.KmlStructuralResult

Result of KML structural analysis.

All fields are device-resident CuPy arrays except n_placemarks, which is a Python int.

Attributes

d_coord_starts : cp.ndarray

Byte offset of the first content byte after each <coordinates> tag, int64, shape (n_coord_regions,). This is the position immediately after the > of the opening tag.

d_coord_ends : cp.ndarray

Byte offset of the < of each </coordinates> tag, int64, shape (n_coord_regions,). Content bytes are in the half-open range [d_coord_starts[i], d_coord_ends[i]).

d_placemark_starts : cp.ndarray

Byte offset of each <Placemark> tag, int64, shape (n_placemarks,).

d_placemark_ends : cp.ndarray

Byte offset one past the > of each </Placemark> tag, int64, shape (n_placemarks,).

d_family_tags : cp.ndarray

Geometry family tag per Placemark, int8, shape (n_placemarks,). Values: 0=Point, 1=LineString, 2=Polygon, 6=MultiGeometry, -2=unknown/none.

n_placemarks : int

Number of Placemarks detected.

d_coord_starts: cupy.ndarray
d_coord_ends: cupy.ndarray
d_placemark_starts: cupy.ndarray
d_placemark_ends: cupy.ndarray
d_family_tags: cupy.ndarray
n_placemarks: int
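The half-open convention for d_coord_starts and d_coord_ends means each region can be sliced directly, with no off-by-one adjustment. The sketch below demonstrates this on the host with NumPy stand-ins for the device arrays; the offsets are computed with `bytes.find` purely for illustration, not by the module.

```python
import numpy as np

kml = (b"<Placemark><Point><coordinates>-122.08,37.42,0"
       b"</coordinates></Point></Placemark>")
host_bytes = np.frombuffer(kml, dtype=np.uint8)

# Stand-ins for d_coord_starts / d_coord_ends after a copy to the host.
open_tag = b"<coordinates>"
coord_starts = np.array([kml.find(open_tag) + len(open_tag)], dtype=np.int64)
coord_ends = np.array([kml.find(b"</coordinates>")], dtype=np.int64)

# Half-open slice: begins just after '>' and stops just before '<'.
contents = [
    host_bytes[s:e].tobytes().decode("ascii")
    for s, e in zip(coord_starts, coord_ends)
]
```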
vibespatial.io.kml_gpu.kml_structural_analysis(d_bytes: cupy.ndarray) → KmlStructuralResult

Perform structural analysis on a KML document.

Given a device-resident byte array containing a KML document, detects:

  1. XML comment regions (suppressed from all subsequent matching).

  2. <coordinates> / </coordinates> region boundaries.

  3. <Placemark> / </Placemark> feature boundaries.

  4. Geometry type per Placemark (Point, LineString, Polygon, MultiGeometry).

Handles KML namespace prefixes: tags like <kml:coordinates> and <kml:Placemark> are matched alongside their unprefixed variants. Also handles opening tags with attributes (e.g., <Placemark id="1">).

Parameters

d_bytes : cp.ndarray

Device-resident uint8 array of KML file bytes, shape (n,).

Returns

KmlStructuralResult

Dataclass containing all structural analysis outputs on device.

Notes

This function uses gpu_parse.pattern_match() for tag detection rather than building a full XML parser. This is sufficient because KML has a fixed, well-known tag vocabulary and we only need to locate a small set of specific tags.

XML comments (<!-- ... -->) are detected and masked so that tags inside comments are not matched. CDATA sections are not specifically handled since coordinate content in KML does not use CDATA.
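To make the masking idea concrete, here is a sequential host-side sketch that builds a per-byte mask and uses it to drop tag matches inside comments. It is a reference for the behaviour described above, not the module's kernel; `comment_mask` is a hypothetical name, and treating an unterminated comment as running to the end of the input is an assumption.

```python
import numpy as np

def comment_mask(data: bytes) -> np.ndarray:
    """Return a bool array that is True for bytes inside <!-- ... -->."""
    mask = np.zeros(len(data), dtype=bool)
    pos = 0
    while True:
        start = data.find(b"<!--", pos)
        if start < 0:
            break
        close = data.find(b"-->", start + 4)
        # Assumption: an unterminated comment masks the rest of the input.
        end = len(data) if close < 0 else close + 3
        mask[start:end] = True
        pos = end
    return mask

doc = b"<a><!-- <Placemark> --><Placemark/></a>"
mask = comment_mask(doc)
# Candidate tag positions, then the subset that survives the mask.
hits = [i for i in range(len(doc)) if doc.startswith(b"<Placemark", i)]
kept = [i for i in hits if not mask[i]]
```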

The coordinate region boundaries identify the raw text between <coordinates> and </coordinates> tags. The actual coordinate parsing (splitting lon,lat[,alt] tuples) is handled downstream by a separate coordinate extraction step.
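Since <coordinates> elements do not nest, the i-th opening tag can be paired with the i-th closing tag. The sketch below shows that pairing on the host with regular expressions standing in for gpu_parse.pattern_match(); the patterns (including the `kml:` prefix and optional attributes) and the helper name `coordinate_regions` are illustrative assumptions.

```python
import re

OPEN = re.compile(rb"<(?:kml:)?coordinates(?:\s[^>]*)?>")
CLOSE = re.compile(rb"</(?:kml:)?coordinates\s*>")

def coordinate_regions(data: bytes):
    """Return half-open (start, end) byte ranges of coordinate content."""
    starts = [m.end() for m in OPEN.finditer(data)]    # just past '>'
    ends = [m.start() for m in CLOSE.finditer(data)]   # at '<' of the close tag
    if len(starts) != len(ends):
        raise ValueError("unbalanced <coordinates> tags")
    return list(zip(starts, ends))

kml = (b"<Point><coordinates>1,2</coordinates></Point>"
       b"<LineString><kml:coordinates>3,4 5,6</kml:coordinates></LineString>")
regions = coordinate_regions(kml)
texts = [kml[s:e] for s, e in regions]
```

A real document would apply the comment mask first so that commented-out tags never enter either list.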

Examples

>>> import cupy as cp
>>> from vibespatial.io.kml_gpu import kml_structural_analysis
>>> kml = b'''<kml><Document>
...   <Placemark><Point><coordinates>-122.08,37.42,0</coordinates></Point></Placemark>
...   <Placemark><LineString><coordinates>-122.08,37.42 -122.09,37.43</coordinates></LineString></Placemark>
... </Document></kml>'''
>>> d_bytes = cp.frombuffer(kml, dtype=cp.uint8)
>>> result = kml_structural_analysis(d_bytes)
>>> result.n_placemarks
2
vibespatial.io.kml_gpu.read_kml_gpu(d_bytes: cupy.ndarray) → KmlGpuResult

Parse KML bytes on GPU and return device-resident geometry.

Given a device-resident byte array containing a KML document, this function performs full GPU-accelerated parsing: structural analysis, coordinate extraction, and assembly into an OwnedGeometryArray.

Supported geometry types:

  • Point (full support)

  • LineString (full support)

  • Polygon (full support, including inner rings/holes)

KML coordinate convention: longitude is FIRST (x=lon, y=lat). Altitude (3D) is detected and silently dropped.
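Dropping altitude amounts to a strided de-interleave of the flat parsed values. The NumPy sketch below assumes the numbers of one region have already been parsed into a flat fp64 array and that its dimension is known; the variable names are hypothetical and the arrays live on the host.

```python
import numpy as np

# Flat parse output for two 3D tuples: lon, lat, alt, lon, lat, alt.
flat = np.array([-122.08, 37.42, 0.0, -122.09, 37.43, 15.0], dtype=np.float64)
dim = 3  # 2 for lon,lat input; 3 for lon,lat,alt

x = flat[0::dim]  # every dim-th value starting at 0: longitude
y = flat[1::dim]  # every dim-th value starting at 1: latitude
# Values at index 2, 5, ... (altitude) are never read.
```

With `dim = 2` the same two slices cover every value, so one code path can serve both formats.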

Parameters

d_bytes : cp.ndarray

Device-resident uint8 array of KML file bytes, shape (n,).

Returns

KmlGpuResult

Frozen dataclass with geometry (OwnedGeometryArray), n_placemarks (int), and attributes (dict or None). Coordinates are always fp64. Structural metadata (offsets, validity) is materialized on both host and device per the standard _build_device_*_owned pattern.

Notes

Precision (ADR-0002):

All coordinates are parsed and stored as fp64. The structural analysis and counting kernels are integer-only byte classification – no PrecisionPlan is needed for those stages.

Tier classification (ADR-0033):

Uses Tier 1 (custom NVRTC) for geometry-specific tag scanning and Tier 2 (CuPy) for element-wise operations. Number parsing delegates to the gpu_parse primitives.

Examples

>>> import cupy as cp
>>> from vibespatial.io.kml_gpu import read_kml_gpu
>>> kml = b'''<kml><Document>
...   <Placemark><Point><coordinates>-122.08,37.42,0</coordinates></Point></Placemark>
... </Document></kml>'''
>>> d_bytes = cp.frombuffer(kml, dtype=cp.uint8)
>>> result = read_kml_gpu(d_bytes)
>>> result.geometry.row_count
1